fair_forge.datasets

Functions

grouping_by_prefix(*, columns, prefixes)

Create slices for feature grouping based on column prefixes.

load_adult(group, *[, group_in_features, ...])

Load the Adult dataset with specified group information.

load_dummy_dataset(seed)

Load a dummy dataset for testing purposes, based on a mixture of two 2D Gaussians.

load_ethicml_toy([group_in_features])

Load the EthicML toy dataset.

Classes

GroupDataset(data, target, groups, name, ...)

A dataset containing features, labels, and groups.

class fair_forge.datasets.GroupDataset(data: ndarray[tuple[Any, ...], dtype[float32]], target: ndarray[tuple[Any, ...], dtype[int32]], groups: ndarray[tuple[Any, ...], dtype[int32]], name: str, feature_grouping: list[slice], feature_names: list[str])

Bases: NamedTuple

A dataset containing features, labels, and groups.

data: ndarray[tuple[Any, ...], dtype[float32]]

Features of the dataset.

feature_grouping: list[slice]

Slices indicating groups of features.

feature_names: list[str]

Names of the features in the dataset.

groups: ndarray[tuple[Any, ...], dtype[int32]]

Groups of the dataset.

name: str

Name of the dataset.

target: ndarray[tuple[Any, ...], dtype[int32]]

Labels of the dataset.
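
To illustrate how the fields fit together, here is a minimal sketch built around a stand-in NamedTuple that mirrors the documented fields. `GroupDatasetSketch` and the toy values are hypothetical and not part of fair_forge. The sketch assumes that the rows of `data`, `target`, and `groups` align sample by sample, and that each slice in `feature_grouping` indexes the columns of `data` (and the entries of `feature_names`).

```python
from typing import NamedTuple

import numpy as np


class GroupDatasetSketch(NamedTuple):
    """Hypothetical stand-in mirroring the documented GroupDataset fields."""

    data: np.ndarray
    target: np.ndarray
    groups: np.ndarray
    name: str
    feature_grouping: list[slice]
    feature_names: list[str]


# Three samples, three feature columns; the last two columns are assumed
# to be a one-hot encoding of a single categorical feature.
toy = GroupDatasetSketch(
    data=np.array([[0.5, 1, 0], [1.5, 0, 1], [2.5, 1, 0]], dtype=np.float32),
    target=np.array([0, 1, 1], dtype=np.int32),
    groups=np.array([0, 0, 1], dtype=np.int32),
    name="toy",
    feature_grouping=[slice(1, 3)],
    feature_names=["age", "colour_red", "colour_blue"],
)

# Each slice selects the columns (and names) that belong to one feature group.
for grouping in toy.feature_grouping:
    block = toy.data[:, grouping]
    names = toy.feature_names[grouping]
```

Because the class is a NamedTuple, the fields can also be unpacked positionally in the order given in the signature.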

fair_forge.datasets.grouping_by_prefix(*, columns: list[str], prefixes: list[str]) → list[slice]

Create slices for feature grouping based on column prefixes.
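
The following is a rough sketch of what prefix-based grouping could look like. It is not the library's implementation. `sketch_grouping_by_prefix` is a hypothetical name, and the behaviour shown (one slice per prefix, contiguous columns required, unmatched prefixes skipped) is an assumption made for illustration.

```python
def sketch_grouping_by_prefix(*, columns: list[str], prefixes: list[str]) -> list[slice]:
    """Hypothetical sketch: one slice per prefix, covering its matching columns.

    Assumes the columns sharing a prefix are contiguous.
    """
    slices = []
    for prefix in prefixes:
        matches = [i for i, col in enumerate(columns) if col.startswith(prefix)]
        if not matches:
            continue  # assumption: prefixes with no matching column are skipped
        start, stop = matches[0], matches[-1] + 1
        if matches != list(range(start, stop)):
            raise ValueError(f"columns with prefix {prefix!r} are not contiguous")
        slices.append(slice(start, stop))
    return slices


columns = ["age", "race_A", "race_B", "race_C", "sex_F", "sex_M"]
groups = sketch_grouping_by_prefix(columns=columns, prefixes=["race", "sex"])
# groups == [slice(1, 4), slice(4, 6)]
```

Slices of this kind are convenient for one-hot encoded features, since `columns[s]` or `data[:, s]` recovers all columns that originate from the same categorical feature.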

fair_forge.datasets.load_adult(group: AdultGroup, *, group_in_features: bool = False, binarize_nationality: bool = False, binarize_race: bool = False) → GroupDataset

Load the Adult dataset with specified group information.

Parameters:

group – The AdultGroup value that determines the groups of the dataset.

Returns:

A GroupDataset object containing the Adult dataset.

fair_forge.datasets.load_dummy_dataset(seed: int) → GroupDataset

Load a dummy dataset for testing purposes, based on a mixture of two 2D Gaussians.

The groups are random.

Parameters:

seed – Random seed for reproducibility.
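
As a rough illustration of the idea, the sketch below draws points from two 2D Gaussians and assigns group labels at random, with a seed controlling every draw. It is not the library's code. The function name `sketch_dummy_data`, the component means, the sample count, and the choice to use the mixture component as the label are all assumptions.

```python
import numpy as np


def sketch_dummy_data(seed: int, n_samples: int = 100):
    """Hypothetical sketch: two 2D Gaussian blobs with random group labels."""
    rng = np.random.default_rng(seed)
    # Assumed: the mixture component doubles as the class label.
    target = rng.integers(0, 2, size=n_samples).astype(np.int32)
    # Assumed means: component 0 is centred at (-1, -1), component 1 at (1, 1).
    means = np.array([[-1.0, -1.0], [1.0, 1.0]])
    data = (means[target] + rng.normal(size=(n_samples, 2))).astype(np.float32)
    # Groups are drawn independently of the features and labels.
    groups = rng.integers(0, 2, size=n_samples).astype(np.int32)
    return data, target, groups


# The same seed reproduces the same draws.
data_a, target_a, groups_a = sketch_dummy_data(seed=0)
data_b, target_b, groups_b = sketch_dummy_data(seed=0)
```

Passing a fixed seed makes test fixtures deterministic, which is the purpose of the `seed` parameter described above.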

fair_forge.datasets.load_ethicml_toy(group_in_features: bool = False) → GroupDataset

Load the EthicML toy dataset.