Adult

The Adult dataset from the UCI ML repository. It is a census dataset that records personal characteristics of each person and whether their income is above or below a threshold.

Configurations and tasks

| Configuration | Task | Description |
|---|---|---|
| encoding | | Encoding dictionary showing original values of encoded features. |
| income | Binary classification | Classify the person's income as over or under the threshold. |
| income-no race | Binary classification | As income, but the race feature is removed. |
| race | Multiclass classification | Predict the race of the individual. |

Usage

from datasets import load_dataset

dataset = load_dataset("mstz/adult", "income")["train"]

Features

The target feature changes according to the selected configuration and is always the last column in the dataset.
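Because the target is always the last column, features and target can be separated without hard-coding the target name. The following is a minimal sketch of that idea in plain Python; the helper name `split_features_target` and the three-column sample rows are hypothetical and made up for illustration, not part of the dataset or of any library.

```python
def split_features_target(column_names, rows):
    """Split rows into features and target, treating the last column as the target."""
    # Unpack every name but the last into feature_names; the last one is the target.
    *feature_names, target_name = column_names
    features = [{name: row[name] for name in feature_names} for row in rows]
    targets = [row[target_name] for row in rows]
    return feature_names, target_name, features, targets


# Made-up sample with only three columns, for illustration.
columns = ["age", "is_male", "over_threshold"]
rows = [
    {"age": 39, "is_male": True, "over_threshold": 0},
    {"age": 52, "is_male": False, "over_threshold": 1},
]

feature_names, target_name, X, y = split_features_target(columns, rows)
print(target_name, y)
```

With the loaded dataset, the same helper could presumably be called with `dataset.column_names` and `list(dataset)`, since iterating a `datasets.Dataset` yields one dictionary per row.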

| Feature | Type | Description |
|---|---|---|
| age | [int64] | Age of the person. |
| capital_gain | [float64] | Capital gained by the person. |
| capital_loss | [float64] | Capital lost by the person. |
| education | [int8] | Education level: the higher the value, the more educated the person. |
| final_weight | [int64] | |
| hours_worked_per_week | [int64] | Hours worked per week. |
| marital_status | [string] | Marital status of the person. |
| native_country | [string] | Native country of the person. |
| occupation | [string] | Job of the person. |
| race | [string] | Race of the person. |
| relationship | [string] | |
| is_male | [bool] | True if the person is a man, False if a woman. |
| workclass | [string] | Type of job of the person. |
| over_threshold | [int8] | 1 if income >= $50K, 0 otherwise. |
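Since the feature list mixes numeric, boolean, and string columns, most models need the string columns encoded first. The sketch below shows one possible approach (one-hot encoding) in plain Python, assuming the `income` configuration so that `race` is a feature and `over_threshold` is the target. The column groups are copied from the feature list above; the helper names and every value in the sample rows are made up for illustration and may not match the actual category strings in the dataset.

```python
# Column groups, by the types given in the feature list.
NUMERIC_FEATURES = ["age", "capital_gain", "capital_loss", "education",
                    "final_weight", "hours_worked_per_week"]
BOOLEAN_FEATURES = ["is_male"]
CATEGORICAL_FEATURES = ["marital_status", "native_country", "occupation",
                        "race", "relationship", "workclass"]


def build_vocabulary(rows, column):
    """Return the sorted distinct values observed in one string column."""
    return sorted({row[column] for row in rows})


def encode_row(row, vocabularies):
    """Encode one row as a flat list of floats: numeric, boolean, then one-hot."""
    vector = [float(row[name]) for name in NUMERIC_FEATURES]
    vector += [1.0 if row[name] else 0.0 for name in BOOLEAN_FEATURES]
    for name in CATEGORICAL_FEATURES:
        vector += [1.0 if row[name] == value else 0.0
                   for value in vocabularies[name]]
    return vector


# Made-up rows; real category strings may differ.
sample_rows = [
    {"age": 39, "capital_gain": 0.0, "capital_loss": 0.0, "education": 13,
     "final_weight": 77516, "hours_worked_per_week": 40,
     "marital_status": "single", "native_country": "country_a",
     "occupation": "clerk", "race": "group_a", "relationship": "none",
     "is_male": True, "workclass": "public", "over_threshold": 0},
    {"age": 50, "capital_gain": 0.0, "capital_loss": 0.0, "education": 13,
     "final_weight": 83311, "hours_worked_per_week": 13,
     "marital_status": "married", "native_country": "country_a",
     "occupation": "manager", "race": "group_a", "relationship": "spouse",
     "is_male": True, "workclass": "self_employed", "over_threshold": 1},
]

vocabularies = {name: build_vocabulary(sample_rows, name)
                for name in CATEGORICAL_FEATURES}
encoded = [encode_row(row, vocabularies) for row in sample_rows]
labels = [row["over_threshold"] for row in sample_rows]
```

Vocabularies should be built from the training rows only, so that the encoding stays fixed when new rows are encoded later. This sketch does not handle values missing from the vocabulary; such a value simply encodes as all zeros for that column.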