Dataset:
mstz/compas
The COMPAS dataset for recidivism prediction. The dataset is known to have racial bias issues; see the ProPublica article on the topic.
Configuration | Task | Description |
---|---|---|
encoding | | Encoding dictionary showing original values of encoded features. |
two-years-recidividity | Binary classification | Will the defendant be a violent recidivist? |
two-years-recidividity-no-race | Binary classification | As above, but the race feature is removed. |
priors-prediction | Regression | How many prior crimes has the defendant committed? |
priors-prediction-no-race | Regression | As above, but the race feature is removed. |
race | Multiclass classification | What is the race of the defendant? |
from datasets import load_dataset

dataset = load_dataset("mstz/compas", "two-years-recidividity")["train"]
Feature | Type | Description |
---|---|---|
sex | int64 | |
age | int64 | |
race | int64 | |
number_of_juvenile_fellonies | int64 | |
decile_score | int64 | Criminality score |
number_of_juvenile_misdemeanors | int64 | |
number_of_other_juvenile_offenses | int64 | |
number_of_prior_offenses | int64 | |
days_before_screening_arrest | int64 | |
is_recidivous | int64 | |
days_in_custody | int64 | Days spent in custody |
is_violent_recidivous | int64 | |
violence_decile_score | int64 | Criminality score for violent crimes |
two_years_recidivous | int64 | |
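
As a rough illustration of how rows with the schema above might be prepared for a classifier, the sketch below separates the label from the features and optionally drops `race`, mirroring what the `-no-race` configurations describe. It assumes `two_years_recidivous` is the target column of the `two-years-recidividity` configuration. The helper `split_features_and_target` is hypothetical, and the sample values are made-up placeholders rather than real records.

```python
# Column names come from the feature table above; the sample values are
# made-up placeholders, not real records from the dataset.
TARGET = "two_years_recidivous"  # assumed target column for this configuration


def split_features_and_target(row, target=TARGET, drop=()):
    """Return (features, label) for one row, optionally dropping columns."""
    excluded = {target, *drop}
    features = {name: value for name, value in row.items() if name not in excluded}
    return features, row[target]


sample_row = {
    "sex": 0,
    "age": 30,
    "race": 1,
    "number_of_prior_offenses": 2,
    "decile_score": 4,
    "two_years_recidivous": 0,
}

# Keep every feature, then repeat without the race column.
features, label = split_features_and_target(sample_row)
features_no_race, _ = split_features_and_target(sample_row, drop=("race",))
```

Columns such as `is_recidivous` and `is_violent_recidivous` may be closely related to the target. Whether to keep them as features is a modeling decision this sketch leaves open; they can be passed through `drop` in the same way.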