Multivariate
Classification
Available
Adult
Donated Apr 30, 1996
Social Science
Creative Commons Attribution 4.0 International
Abstract
Predict whether income exceeds $50K/yr based on census data. Also known as "Census Income" dataset.
Instances: 48,842
Features: 14
Multivariate
Yes
Purpose
Additional Information
Extraction was done by Barry Becker from the 1994 Census database. A set of reasonably clean records was extracted using the following conditions: ((AAGE>16) && (AGI>100) && (AFNLWGT>1) && (HRSWK>0)). The prediction task is to determine whether a person's income is over $50,000 a year.
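As a rough illustration, the extraction conditions quoted above can be written as a simple predicate. This is only a sketch: the function name and the sample values are hypothetical, and reading AAGE, AGI, AFNLWGT and HRSWK as age, adjusted gross income, final weight and hours worked per week is an assumption that the description itself does not spell out.

```python
def is_clean_record(aage, agi, afnlwgt, hrswk):
    """Return True when a record passes the quoted extraction conditions.

    Assumed meanings (not stated in the description): aage = age,
    agi = adjusted gross income, afnlwgt = final weight,
    hrswk = hours worked per week.
    """
    return aage > 16 and agi > 100 and afnlwgt > 1 and hrswk > 0


# Hypothetical records, invented purely to exercise the predicate.
candidates = [
    {"aage": 39, "agi": 30000, "afnlwgt": 77516, "hrswk": 40},  # passes
    {"aage": 16, "agi": 30000, "afnlwgt": 77516, "hrswk": 40},  # age is not > 16
    {"aage": 45, "agi": 30000, "afnlwgt": 77516, "hrswk": 0},   # no hours worked
]

kept = [record for record in candidates if is_clean_record(**record)]
print(len(kept), "of", len(candidates), "records kept")
```

All four comparisons are strict, so a record with an age of exactly 16 is excluded.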
| Name | Role | Type | Description | Missing |
|---|---|---|---|---|
| Age | Feature | Integer | N/A | No |
| Workclass | Feature | Categorical | Private, Self-emp-not-inc, Self-emp-inc, Federal-gov, Local-gov, State-gov, Without-pay, Never-worked | No |
| fnlwgt | Feature | Integer | | No |
| Education | Feature | Categorical | Bachelors, Some-college, 11th, HS-grad, Prof-school, Assoc-acdm, Assoc-voc, 9th, 7th-8th, 12th, Masters, 1st-4th, 10th, Doctorate, 5th-6th, Preschool | No |
| Education Num | Feature | Integer | | No |
| Marital Status | Feature | Categorical | Married-civ-spouse, Divorced, Never-married, Separated, Widowed, Married-spouse-absent, Married-AF-spouse | No |
| Occupation | Feature | Categorical | Tech-support, Craft-repair, Other-service, Sales, Exec-managerial, Prof-specialty, Handlers-cleaners, Machine-op-inspct, Adm-clerical, Farming-fishing, Transport-moving, Priv-house-serv, Protective-serv, Armed-Forces | No |
| Relationship | Feature | Categorical | Wife, Own-child, Husband, Not-in-family, Other-relative, Unmarried | No |
| Race | Feature | Categorical | White, Asian-Pac-Islander, Amer-Indian-Eskimo, Other, Black | No |
| Sex | Feature | Binary | Female, Male | No |
| Capital Gain | Feature | Integer | Capital gains recorded | No |
| Capital Loss | Feature | Integer | Capital losses recorded | No |
| Hours Per Week | Feature | Integer | Hours worked per week | No |
| Native Country | Feature | Categorical | United-States, Cambodia, England, Puerto-Rico, Canada, Germany, Outlying-US(Guam-USVI-etc), India, Japan, Greece, South, China, Cuba, Iran, Honduras, Philippines, Italy, Poland, Jamaica, Vietnam, Mexico, Portugal, Ireland, France, Dominican-Republic, Laos, Ecuador, Taiwan, Haiti, Columbia, Hungary, Guatemala, Nicaragua, Scotland, Thailand, Yugoslavia, El-Salvador, Trinidad&Tobago, Peru, Hong, Holand-Netherlands | No |
| Income | Target | Binary | >50K, <=50K | No |
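The variable table can be turned into a small parsing schema. The sketch below assumes the records are stored as comma-separated lines with fields in the same order as the table; that file layout, the snake_case column names, the `parse_row` helper and the sample row are all illustrative assumptions, not something this page specifies.

```python
import csv
import io

# Column order follows the variable table; the snake_case names are
# hypothetical labels chosen for this sketch.
COLUMNS = [
    "age", "workclass", "fnlwgt", "education", "education_num",
    "marital_status", "occupation", "relationship", "race", "sex",
    "capital_gain", "capital_loss", "hours_per_week", "native_country",
    "income",
]

# Variables the table lists with type Integer.
INTEGER_COLUMNS = {
    "age", "fnlwgt", "education_num",
    "capital_gain", "capital_loss", "hours_per_week",
}


def parse_row(fields):
    """Convert one list of raw string fields into a typed record."""
    record = {}
    for name, raw in zip(COLUMNS, fields):
        value = raw.strip()
        record[name] = int(value) if name in INTEGER_COLUMNS else value
    # Binary target: 1 for ">50K", 0 for "<=50K".
    record["income_over_50k"] = 1 if record["income"] == ">50K" else 0
    return record


# Illustrative row in the assumed comma-separated layout.
SAMPLE = (
    "39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, "
    "Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K\n"
)

records = [parse_row(fields) for fields in csv.reader(io.StringIO(SAMPLE))]
print(records[0]["age"], records[0]["occupation"], records[0]["income_over_50k"])
```

Keeping the original `income` string next to the derived 0/1 flag makes it easy to check the encoding against the category values listed in the table.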
Papers Citing this Dataset
256 papers found
| Title | Authors | Venue | Year |
|---|---|---|---|
| Data Management for Causal Algorithmic Fairness | Babak Salimi, Bill Howe, Dan Suciu | ArXiv | 2019 |
| Differentially Private Objective Perturbation: Beyond Smoothness and Convexity | Seth Neel, Aaron Roth, Giuseppe Vietri, Zhiwei Wu | ArXiv | 2019 |
| A Novel Dynamic KCi - Slice Publishing Prototype for Retaining Privacy and Utility of Multiple Sensitive Attributes | N.V.S. Raju, M. Seetaramanath, P. Rao | International Journal of Information Technology and Computer Science | 2019 |
| Privacy-preserving Distributed Machine Learning via Local Randomization and ADMM Perturbation | Xin Wang, Hideaki Ishii, Linkang Du, Peng Cheng, Jiming Chen | ArXiv | 2019 |
| A tree-based radial basis function method for noisy parallel surrogate optimization | Chenchao Shou, Matthew West | ArXiv | 2019 |