Adult
Multivariate Classification Available

Donated: Apr 30, 1996
Subject Area: Social Science
License: Creative Commons Attribution 4.0 International

Abstract

Predict whether income exceeds $50K/yr based on census data. Also known as the "Census Income" dataset.

Instances: 48,842
Features: 14
Data Type: Multivariate
Missing Values: Yes

Purpose

Additional Information

Extraction was done by Barry Becker from the 1994 Census database. A set of reasonably clean records was extracted using the following conditions:

((AAGE>16) && (AGI>100) && (AFNLWGT>1) && (HRSWK>0))

The prediction task is to determine whether a person's income is over $50,000 a year.
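The extraction conditions above can be read as a simple row filter. The following is a minimal sketch, not the original extraction code: it assumes each raw census record is a mapping keyed by the variable names quoted in the conditions (AAGE, AGI, AFNLWGT, HRSWK), and the sample records are invented purely for illustration.

```python
from typing import Mapping


def passes_extraction_filter(record: Mapping[str, float]) -> bool:
    """Return True if a record satisfies all four stated conditions."""
    return (
        record["AAGE"] > 16
        and record["AGI"] > 100
        and record["AFNLWGT"] > 1
        and record["HRSWK"] > 0
    )


# Hypothetical raw records, invented for illustration only.
records = [
    {"AAGE": 39, "AGI": 45000, "AFNLWGT": 77516, "HRSWK": 40},  # kept
    {"AAGE": 16, "AGI": 12000, "AFNLWGT": 50000, "HRSWK": 20},  # dropped: AAGE is not > 16
    {"AAGE": 52, "AGI": 80, "AFNLWGT": 30000, "HRSWK": 35},     # dropped: AGI is not > 100
    {"AAGE": 30, "AGI": 30000, "AFNLWGT": 20000, "HRSWK": 0},   # dropped: HRSWK is not > 0
]

kept = [r for r in records if passes_extraction_filter(r)]
```

All four comparisons are strict, so a record sitting exactly on a threshold (for example AAGE equal to 16) is excluded.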

| Name | Role | Type | Description | Missing |
|---|---|---|---|---|
| Age | Feature | Integer | N/A | No |
| Workclass | Feature | Categorical | Private, Self-emp-not-inc, Self-emp-inc, Federal-gov, Local-gov, State-gov, Without-pay, Never-worked | Yes |
| fnlwgt | Feature | Integer | | No |
| Education | Feature | Categorical | Bachelors, Some-college, 11th, HS-grad, Prof-school, Assoc-acdm, Assoc-voc, 9th, 7th-8th, 12th, Masters, 1st-4th, 10th, Doctorate, 5th-6th, Preschool | No |
| Education Num | Feature | Integer | | No |
| Marital Status | Feature | Categorical | Married-civ-spouse, Divorced, Never-married, Separated, Widowed, Married-spouse-absent, Married-AF-spouse | No |
| Occupation | Feature | Categorical | Tech-support, Craft-repair, Other-service, Sales, Exec-managerial, Prof-specialty, Handlers-cleaners, Machine-op-inspct, Adm-clerical, Farming-fishing, Transport-moving, Priv-house-serv, Protective-serv, Armed-Forces | Yes |
| Relationship | Feature | Categorical | Wife, Own-child, Husband, Not-in-family, Other-relative, Unmarried | No |
| Race | Feature | Categorical | White, Asian-Pac-Islander, Amer-Indian-Eskimo, Other, Black | No |
| Sex | Feature | Binary | Female, Male | No |
| Capital Gain | Feature | Integer | Capital gains recorded | No |
| Capital Loss | Feature | Integer | Capital losses recorded | No |
| Hours Per Week | Feature | Integer | Hours worked per week | No |
| Native Country | Feature | Categorical | United-States, Cambodia, England, Puerto-Rico, Canada, Germany, Outlying-US(Guam-USVI-etc), India, Japan, Greece, South, China, Cuba, Iran, Honduras, Philippines, Italy, Poland, Jamaica, Vietnam, Mexico, Portugal, Ireland, France, Dominican-Republic, Laos, Ecuador, Taiwan, Haiti, Columbia, Hungary, Guatemala, Nicaragua, Scotland, Thailand, Yugoslavia, El-Salvador, Trinidad&Tobago, Peru, Hong, Holand-Netherlands | No |
| Income | Target | Binary | >50K, <=50K | No |
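As a rough illustration of how the variables listed above map onto a record, the sketch below parses comma-separated rows into typed dictionaries using only the Python standard library. Several things here are assumptions rather than facts stated on this page: the snake_case column names are hypothetical spellings of the names above, the rows are assumed to list the 14 features in that order followed by the income label, missing values are assumed to be written as `?`, and the two sample rows are illustrative rather than quoted from the data files.

```python
import csv
import io

# Hypothetical snake_case spellings of the variable names listed above.
COLUMNS = [
    "age", "workclass", "fnlwgt", "education", "education_num",
    "marital_status", "occupation", "relationship", "race", "sex",
    "capital_gain", "capital_loss", "hours_per_week", "native_country",
    "income",
]
INTEGER_COLUMNS = {
    "age", "fnlwgt", "education_num",
    "capital_gain", "capital_loss", "hours_per_week",
}

# Illustrative rows only; the second one uses the assumed "?" missing marker.
SAMPLE = """\
39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K
52, ?, 120000, HS-grad, 9, Married-civ-spouse, ?, Husband, White, Male, 0, 0, 45, United-States, >50K
"""


def parse_rows(text, missing_marker="?"):
    """Parse comma-separated rows into dicts with typed values and a 0/1 label."""
    rows = []
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    for fields in reader:
        if len(fields) != len(COLUMNS):
            continue  # skip blank or malformed lines
        row = {}
        for name, value in zip(COLUMNS, fields):
            value = value.strip()
            if value == missing_marker:
                row[name] = None  # represent a missing value explicitly
            elif name in INTEGER_COLUMNS:
                row[name] = int(value)
            else:
                row[name] = value
        # Binary target: 1 for ">50K", 0 otherwise.
        income = row["income"] or ""
        row["label"] = 1 if income.rstrip(".") == ">50K" else 0
        rows.append(row)
    return rows


rows = parse_rows(SAMPLE)
```

Keeping missing values as `None` instead of silently dropping the row leaves the choice between imputation and deletion to the downstream analysis.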

adult.data (CSV, N/A) • Data
adult.test (MD, N/A) • Documentation
adult.names (NAMES, 5.10 KB) • Documentation
old.adult.names (NAMES, 4.20 KB) • Documentation
Index (INDEX, 0.14 KB) • Documentation
