Preprocess algorithms#
Pre-process algorithms take the training data and transform it.
Preprocess base#
Abstract base class of all pre-processing algorithms in the framework.
- class PreAlgorithm(*args, **kwargs)#
Bases:
ethicml.algorithms.algorithm_base.Algorithm, Protocol
Abstract Base Class for all algorithms that do pre-processing.
- abstract fit(train)#
Fit transformer on the given data.
- Parameters
train (ethicml.utility.data_structures.DataTuple) – training data
- Returns
a tuple of the fitted transformer and the pre-processed training data
- Return type
Tuple[ethicml.algorithms.preprocess.pre_algorithm._PA, ethicml.utility.data_structures.DataTuple]
- abstract property name: str#
Name of the algorithm.
- property out_size: int#
The number of features to generate.
- abstract run(train, test)#
Generate fair features with the given data.
- Parameters
train (ethicml.utility.data_structures.DataTuple) – training data
test (ethicml.utility.data_structures.TestTuple) – test data
- Returns
a tuple of the pre-processed training data and the test data
- Return type
Tuple[ethicml.utility.data_structures.DataTuple, ethicml.utility.data_structures.TestTuple]
- run_test(train, test)#
Run with reduced training set so that it finishes quicker.
- Parameters
train – training data
test – test data
- Return type
Tuple[ethicml.utility.data_structures.DataTuple, ethicml.utility.data_structures.TestTuple]
- abstract transform(data)#
Generate fair features with the given data.
- Parameters
data (ethicml.algorithms.preprocess.pre_algorithm.T) – data to transform
- Returns
the transformed data, of the same type as the input
- Return type
ethicml.algorithms.preprocess.pre_algorithm.T
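The relationship between fit, transform and run can be illustrated with a toy example. The sketch below is not EthicML code: ToyData, ToyPreAlgorithm and GroupCentering are hypothetical names, and the assumption that run is equivalent to fit on the training data followed by transform on the test data is inferred from the return types listed above.

```python
from dataclasses import dataclass
from typing import Dict, List, Protocol, Tuple


@dataclass(frozen=True)
class ToyData:
    """Hypothetical stand-in for a DataTuple: features x and sensitive attribute s."""

    x: List[float]
    s: List[int]


class ToyPreAlgorithm(Protocol):
    """Hypothetical protocol that mirrors the method names documented above."""

    def fit(self, train: ToyData) -> Tuple["ToyPreAlgorithm", ToyData]: ...

    def transform(self, data: ToyData) -> ToyData: ...

    def run(self, train: ToyData, test: ToyData) -> Tuple[ToyData, ToyData]: ...


class GroupCentering:
    """Toy transformer: subtract each group's training mean from its features."""

    def __init__(self) -> None:
        self.means: Dict[int, float] = {}

    def fit(self, train: ToyData) -> Tuple["GroupCentering", ToyData]:
        # Learn one mean per sensitive group from the training data only.
        for group in set(train.s):
            values = [x for x, s in zip(train.x, train.s) if s == group]
            self.means[group] = sum(values) / len(values)
        # Return the fitted transformer together with the transformed training data.
        return self, self.transform(train)

    def transform(self, data: ToyData) -> ToyData:
        # Groups never seen during fit are left unchanged.
        centred = [x - self.means.get(s, 0.0) for x, s in zip(data.x, data.s)]
        return ToyData(x=centred, s=data.s)

    def run(self, train: ToyData, test: ToyData) -> Tuple[ToyData, ToyData]:
        # Assumed contract: fit on train, then apply the same transform to test.
        _, new_train = self.fit(train)
        return new_train, self.transform(test)


algo: ToyPreAlgorithm = GroupCentering()
train = ToyData(x=[1.0, 3.0, 10.0, 20.0], s=[0, 0, 1, 1])
test = ToyData(x=[2.0, 15.0], s=[0, 1])
new_train, new_test = algo.run(train, test)
```

Because the test data is transformed with statistics learned from the training data only, no information leaks from test to train.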
- class PreAlgorithmAsync(*args, **kwargs)#
Bases:
ethicml.algorithms.algorithm_base.SubprocessAlgorithmMixin, ethicml.algorithms.preprocess.pre_algorithm.PreAlgorithm, Protocol
Pre-Algorithm that can be run blocking and asynchronously.
- fit(train)#
Generate fair features with the given data asynchronously.
- Parameters
train (ethicml.utility.data_structures.DataTuple) – training data
- Returns
a tuple of the fitted transformer and the pre-processed training data
- Return type
Tuple[ethicml.algorithms.preprocess.pre_algorithm.PreAlgorithm, ethicml.utility.data_structures.DataTuple]
- abstract property name: str#
Name of the algorithm.
- property out_size: int#
The number of features to generate.
- run(train, test)#
Generate fair features with the given data asynchronously.
- Parameters
train (ethicml.utility.data_structures.DataTuple) – training data
test (ethicml.utility.data_structures.TestTuple) – test data
- Returns
a tuple of the pre-processed training data and the test data
- Return type
Tuple[ethicml.utility.data_structures.DataTuple, ethicml.utility.data_structures.TestTuple]
- run_test(train, test)#
Run with reduced training set so that it finishes quicker.
- Parameters
train – training data
test – test data
- Return type
Tuple[ethicml.utility.data_structures.DataTuple, ethicml.utility.data_structures.TestTuple]
- transform(data)#
Generate fair features with the given data asynchronously.
- Parameters
data (ethicml.algorithms.preprocess.pre_algorithm.T) – data to transform
- Returns
the transformed data, of the same type as the input
- Return type
ethicml.algorithms.preprocess.pre_algorithm.T
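One common way to offer both a blocking and an asynchronous entry point around a subprocess is to write the asynchronous version once and wrap it. The sketch below shows that general pattern with the standard library only. It is not how SubprocessAlgorithmMixin is implemented: the function names, the JSON exchange over stdin/stdout and the min-max scaling in the child process are all assumptions made for illustration.

```python
import asyncio
import json
import sys
from typing import List

# Child script (hypothetical): read a JSON list from stdin, min-max scale it,
# and write the result back as JSON on stdout.
_CHILD = (
    "import json, sys\n"
    "xs = json.load(sys.stdin)\n"
    "lo, hi = min(xs), max(xs)\n"
    "span = (hi - lo) or 1.0\n"
    "json.dump([(x - lo) / span for x in xs], sys.stdout)\n"
)


async def transform_async(xs: List[float]) -> List[float]:
    """Run the child script in a separate Python process without blocking the loop."""
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", _CHILD,
        stdin=asyncio.subprocess.PIPE,
        stdout=asyncio.subprocess.PIPE,
    )
    out, _ = await proc.communicate(json.dumps(xs).encode())
    if proc.returncode != 0:
        raise RuntimeError(f"child process exited with code {proc.returncode}")
    return json.loads(out)


def transform_blocking(xs: List[float]) -> List[float]:
    """Blocking wrapper: start an event loop and wait for the async version."""
    return asyncio.run(transform_async(xs))


scaled = transform_blocking([0.0, 5.0, 10.0])
```

Callers that already run an event loop would await transform_async directly, while scripts can call transform_blocking.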
- class PreAlgorithmDC(seed=888)#
Bases:
ethicml.algorithms.preprocess.pre_algorithm.PreAlgorithm
PreAlgorithm dataclass base class.
- Parameters
seed (int) –
- abstract fit(train)#
Fit transformer on the given data.
- Parameters
train (ethicml.utility.data_structures.DataTuple) – training data
- Returns
a tuple of the fitted transformer and the pre-processed training data
- Return type
Tuple[ethicml.algorithms.preprocess.pre_algorithm._PA, ethicml.utility.data_structures.DataTuple]
- abstract property name: str#
Name of the algorithm.
- property out_size: int#
The number of features to generate.
- abstract run(train, test)#
Generate fair features with the given data.
- Parameters
train (ethicml.utility.data_structures.DataTuple) – training data
test (ethicml.utility.data_structures.TestTuple) – test data
- Returns
a tuple of the pre-processed training data and the test data
- Return type
Tuple[ethicml.utility.data_structures.DataTuple, ethicml.utility.data_structures.TestTuple]
- run_test(train, test)#
Run with reduced training set so that it finishes quicker.
- Parameters
train – training data
test – test data
- Return type
Tuple[ethicml.utility.data_structures.DataTuple, ethicml.utility.data_structures.TestTuple]
- abstract transform(data)#
Generate fair features with the given data.
- Parameters
data (ethicml.algorithms.preprocess.pre_algorithm.T) – data to transform
- Returns
the transformed data, of the same type as the input
- Return type
ethicml.algorithms.preprocess.pre_algorithm.T
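A dataclass base class lets every subclass inherit the seed field and add its own hyperparameters as further fields. The sketch below illustrates the idea with hypothetical names (ToyBaseDC, ToyJitter). Only the seed parameter and its default of 888 come from the signature above; the rest is an assumption.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class ToyBaseDC:
    """Hypothetical dataclass base: every subclass inherits the seed field."""

    seed: int = 888


@dataclass
class ToyJitter(ToyBaseDC):
    """Toy transformer that adds seeded Gaussian noise to each feature."""

    scale: float = 0.1

    def transform(self, xs: List[float]) -> List[float]:
        # A fresh generator per call, so repeated calls give identical output.
        rng = random.Random(self.seed)
        return [x + rng.gauss(0.0, self.scale) for x in xs]


default = ToyJitter()
first = default.transform([1.0, 2.0])
second = ToyJitter(seed=888).transform([1.0, 2.0])
```

Two instances built with the same seed produce the same output, which is the main reason for carrying the seed as a field.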
Beutel#
Beutel’s algorithm.
- class Beutel(fairness='DP', *, dir='.', enc_size=(40,), adv_size=(40,), pred_size=(40,), enc_activation='Sigmoid()', adv_activation='Sigmoid()', batch_size=64, y_loss='BCELoss()', s_loss='BCELoss()', epochs=50, adv_weight=1.0, validation_pcnt=0.1, seed=888)#
Bases:
ethicml.algorithms.preprocess.pre_algorithm.PreAlgorithmAsync
Beutel’s adversarially learned fair representations.
- Parameters
fairness (Literal['DP', 'EqOp', 'EqOd']) –
dir (Union[str, pathlib.Path]) –
enc_size (Sequence[int]) –
adv_size (Sequence[int]) –
pred_size (Sequence[int]) –
enc_activation (str) –
adv_activation (str) –
batch_size (int) –
y_loss (str) –
s_loss (str) –
epochs (int) –
adv_weight (float) –
validation_pcnt (float) –
seed (int) –
- fit(train)#
Generate fair features with the given data asynchronously.
- Parameters
train (ethicml.utility.data_structures.DataTuple) – training data
- Returns
a tuple of the fitted transformer and the pre-processed training data
- Return type
Tuple[ethicml.algorithms.preprocess.pre_algorithm.PreAlgorithm, ethicml.utility.data_structures.DataTuple]
- property name: str#
Name of the algorithm.
- property out_size: int#
The number of features to generate.
- run(train, test)#
Generate fair features with the given data asynchronously.
- Parameters
train (ethicml.utility.data_structures.DataTuple) – training data
test (ethicml.utility.data_structures.TestTuple) – test data
- Returns
a tuple of the pre-processed training data and the test data
- Return type
Tuple[ethicml.utility.data_structures.DataTuple, ethicml.utility.data_structures.TestTuple]
- run_test(train, test)#
Run with reduced training set so that it finishes quicker.
- Parameters
train – training data
test – test data
- Return type
Tuple[ethicml.utility.data_structures.DataTuple, ethicml.utility.data_structures.TestTuple]
- transform(data)#
Generate fair features with the given data asynchronously.
- Parameters
data (ethicml.algorithms.preprocess.pre_algorithm.T) – data to transform
- Returns
the transformed data, of the same type as the input
- Return type
ethicml.algorithms.preprocess.pre_algorithm.T
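The fairness parameter accepts 'DP', 'EqOp' and 'EqOd'. Assuming these abbreviate demographic parity, equality of opportunity and equalised odds, the sketch below shows what each criterion compares for binary predictions and a binary sensitive attribute. The function fairness_gap is hypothetical and only measures the criteria. It is not part of Beutel's training procedure or of EthicML.

```python
from typing import List, Optional, Sequence


def _rate(preds: Sequence[int], keep: Sequence[bool]) -> float:
    """Fraction of positive predictions among the selected samples."""
    chosen = [p for p, k in zip(preds, keep) if k]
    return sum(chosen) / len(chosen)  # assumes every selection is non-empty


def fairness_gap(
    preds: Sequence[int],
    labels: Sequence[int],
    groups: Sequence[int],
    criterion: str = "DP",
) -> float:
    """Largest gap in positive prediction rate between groups 0 and 1."""
    label_values: List[Optional[int]]
    if criterion == "DP":
        label_values = [None]  # compare over all samples
    elif criterion == "EqOp":
        label_values = [1]  # compare only where the true label is positive
    elif criterion == "EqOd":
        label_values = [0, 1]  # compare separately for each true label
    else:
        raise ValueError(f"unknown criterion: {criterion}")

    gaps = []
    for value in label_values:
        rates = []
        for group in (0, 1):
            keep = [
                s == group and (value is None or y == value)
                for y, s in zip(labels, groups)
            ]
            rates.append(_rate(preds, keep))
        gaps.append(abs(rates[0] - rates[1]))
    return max(gaps)


preds = [1, 1, 0, 0, 1, 0, 0, 0]
labels = [1, 1, 0, 0, 1, 1, 0, 0]
groups = [0, 0, 0, 0, 1, 1, 1, 1]
dp_gap = fairness_gap(preds, labels, groups, "DP")
eqop_gap = fairness_gap(preds, labels, groups, "EqOp")
eqod_gap = fairness_gap(preds, labels, groups, "EqOd")
```

A gap of zero means the chosen criterion is satisfied exactly on the given sample.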
Calders#
Kamiran & Calders 2012, massaging.
- class Calders(*, preferable_class, disadvantaged_group, seed=888)#
Bases:
ethicml.algorithms.preprocess.pre_algorithm.PreAlgorithm
Massaging algorithm from Kamiran & Calders 2012.
- Parameters
preferable_class (int) –
disadvantaged_group (int) –
seed (int) –
- fit(train)#
Fit transformer on the given data.
- Parameters
train (ethicml.utility.data_structures.DataTuple) – training data
- Returns
a tuple of the fitted transformer and the pre-processed training data
- Return type
Tuple[ethicml.algorithms.preprocess.pre_algorithm.PreAlgorithm, ethicml.utility.data_structures.DataTuple]
- property name: str#
Name of the algorithm.
- property out_size: int#
The number of features to generate.
- run(train, test)#
Generate fair features with the given data.
- Parameters
train (ethicml.utility.data_structures.DataTuple) – training data
test (ethicml.utility.data_structures.TestTuple) – test data
- Returns
a tuple of the pre-processed training data and the test data
- Return type
Tuple[ethicml.utility.data_structures.DataTuple, ethicml.utility.data_structures.TestTuple]
- run_test(train, test)#
Run with reduced training set so that it finishes quicker.
- Parameters
train – training data
test – test data
- Return type
Tuple[ethicml.utility.data_structures.DataTuple, ethicml.utility.data_structures.TestTuple]
- transform(data)#
Generate fair features with the given data.
- Parameters
data (ethicml.algorithms.preprocess.pre_algorithm.T) – data to transform
- Returns
the transformed data, of the same type as the input
- Return type
ethicml.algorithms.preprocess.pre_algorithm.T
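Massaging relabels a small number of training samples so that both groups end up with the same rate of the preferable class. The sketch below is a minimal illustration of that idea, not the EthicML implementation. It assumes binary 0/1 labels and takes ranker scores as an input (higher means more likely to belong to the preferable class), whereas the original method trains a ranker to obtain them. The function name massage and its scores argument are hypothetical.

```python
from typing import List, Sequence


def massage(
    labels: Sequence[int],
    groups: Sequence[int],
    scores: Sequence[float],
    preferable_class: int = 1,
    disadvantaged_group: int = 0,
) -> List[int]:
    """Return relabelled copies of the labels with balanced group rates."""
    other_class = 1 - preferable_class  # assumption: labels are binary 0/1
    dis = [i for i, g in enumerate(groups) if g == disadvantaged_group]
    adv = [i for i, g in enumerate(groups) if g != disadvantaged_group]
    pos_dis = sum(labels[i] == preferable_class for i in dis)
    pos_adv = sum(labels[i] == preferable_class for i in adv)

    # Number of label swaps M that equalises the two rates:
    # (pos_dis + M) / |dis| == (pos_adv - M) / |adv|, rounded up.
    numerator = pos_adv * len(dis) - pos_dis * len(adv)
    n_swaps = max(0, -(-numerator // (len(dis) + len(adv))))

    # Promote the highest-scored disadvantaged samples without the preferable class.
    promote = sorted(
        (i for i in dis if labels[i] != preferable_class),
        key=lambda i: scores[i],
        reverse=True,
    )[:n_swaps]
    # Demote the lowest-scored samples of the other group with the preferable class.
    demote = sorted(
        (i for i in adv if labels[i] == preferable_class),
        key=lambda i: scores[i],
    )[:n_swaps]

    new_labels = list(labels)
    for i in promote:
        new_labels[i] = preferable_class
    for i in demote:
        new_labels[i] = other_class
    return new_labels


labels = [1, 0, 0, 0, 1, 1, 1, 0]
groups = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.9, 0.6, 0.2, 0.1, 0.95, 0.8, 0.55, 0.3]
massaged = massage(labels, groups, scores)
```

In this example one promotion and one demotion are enough: both groups end up with two of four samples in the preferable class, and only the samples closest to the decision boundary are changed.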
Upsampling#
Simple upsampler that makes subgroups the same size as the majority group.
- class Upsampler(strategy='uniform', seed=888)#
Bases:
ethicml.algorithms.preprocess.pre_algorithm.PreAlgorithm
Upsampler algorithm.
Given a datatuple, create a larger datatuple such that the subgroups have a balanced number of samples.
- Parameters
strategy (Literal['uniform', 'preferential', 'naive']) –
seed (int) –
- fit(train)#
Fit transformer on the given data.
- Parameters
train (ethicml.utility.data_structures.DataTuple) – training data
- Returns
a tuple of the fitted transformer and the pre-processed training data
- Return type
Tuple[ethicml.algorithms.preprocess.upsampler.Upsampler, ethicml.utility.data_structures.DataTuple]
- property name: str#
Name of the algorithm.
- property out_size: int#
The number of features to generate.
- run(train, test)#
Generate fair features with the given data.
- Parameters
train (ethicml.utility.data_structures.DataTuple) – training data
test (ethicml.utility.data_structures.TestTuple) – test data
- Returns
a tuple of the pre-processed training data and the test data
- Return type
Tuple[ethicml.utility.data_structures.DataTuple, ethicml.utility.data_structures.TestTuple]
- run_test(train, test)#
Run with reduced training set so that it finishes quicker.
- Parameters
train – training data
test – test data
- Return type
Tuple[ethicml.utility.data_structures.DataTuple, ethicml.utility.data_structures.TestTuple]
- transform(data)#
Generate fair features with the given data.
- Parameters
data (ethicml.algorithms.preprocess.pre_algorithm.T) – data to transform
- Returns
the transformed data, of the same type as the input
- Return type
ethicml.algorithms.preprocess.pre_algorithm.T
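Upsampling to the size of the largest subgroup can be sketched with the standard library. The example below is illustrative only and does not reproduce any of the three strategies listed above. It assumes that a subgroup is a combination of sensitive attribute and class label, and that extra samples are drawn uniformly with replacement. The function upsample and the row layout are hypothetical.

```python
import random
from collections import defaultdict
from typing import Callable, Dict, Hashable, List, Tuple

Row = Tuple[float, int, int]  # hypothetical layout: (feature, s, y)


def upsample(rows: List[Row], key: Callable[[Row], Hashable], seed: int = 888) -> List[Row]:
    """Grow every subgroup to the size of the largest one."""
    rng = random.Random(seed)
    buckets: Dict[Hashable, List[Row]] = defaultdict(list)
    for row in rows:
        buckets[key(row)].append(row)
    target = max(len(members) for members in buckets.values())

    result: List[Row] = []
    for members in buckets.values():
        result.extend(members)  # keep every original sample
        # Draw the missing samples uniformly, with replacement.
        result.extend(rng.choices(members, k=target - len(members)))
    return result


rows = [
    (0.1, 0, 0), (0.2, 0, 0), (0.3, 0, 0),
    (0.4, 1, 0),
    (0.5, 0, 1),
    (0.6, 1, 1), (0.7, 1, 1),
]
balanced = upsample(rows, key=lambda row: (row[1], row[2]))
```

Every original sample is kept, so the output is always at least as large as the input.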
VFAE#
Variational Fair Auto-Encoder by Louizos et al.
- class VFAE(dataset, *, dir='.', supervised=True, epochs=10, batch_size=32, fairness='DI', latent_dims=50, z1_enc_size=None, z2_enc_size=None, z1_dec_size=None, seed=888)#
Bases:
ethicml.algorithms.preprocess.pre_algorithm.PreAlgorithmAsync
VFAE Object - see implementation file for details.
- Parameters
dataset (str) –
dir (Union[str, pathlib.Path]) –
supervised (bool) –
epochs (int) –
batch_size (int) –
fairness (str) –
latent_dims (int) –
z1_enc_size (Optional[List[int]]) –
z2_enc_size (Optional[List[int]]) –
z1_dec_size (Optional[List[int]]) –
seed (int) –
- fit(train)#
Generate fair features with the given data asynchronously.
- Parameters
train (ethicml.utility.data_structures.DataTuple) – training data
- Returns
a tuple of the fitted transformer and the pre-processed training data
- Return type
Tuple[ethicml.algorithms.preprocess.pre_algorithm.PreAlgorithm, ethicml.utility.data_structures.DataTuple]
- property name: str#
Name of the algorithm.
- property out_size: int#
The number of features to generate.
- run(train, test)#
Generate fair features with the given data asynchronously.
- Parameters
train (ethicml.utility.data_structures.DataTuple) – training data
test (ethicml.utility.data_structures.TestTuple) – test data
- Returns
a tuple of the pre-processed training data and the test data
- Return type
Tuple[ethicml.utility.data_structures.DataTuple, ethicml.utility.data_structures.TestTuple]
- run_test(train, test)#
Run with reduced training set so that it finishes quicker.
- Parameters
train – training data
test – test data
- Return type
Tuple[ethicml.utility.data_structures.DataTuple, ethicml.utility.data_structures.TestTuple]
- transform(data)#
Generate fair features with the given data asynchronously.
- Parameters
data (ethicml.algorithms.preprocess.pre_algorithm.T) – data to transform
- Returns
the transformed data, of the same type as the input
- Return type
ethicml.algorithms.preprocess.pre_algorithm.T
Zemel#
Zemel’s Learned Fair Representations.
- class Zemel(*, dir='.', threshold=0.5, clusters=2, Ax=0.01, Ay=0.1, Az=0.5, max_iter=5000, maxfun=5000, epsilon=1e-05, seed=888)#
Bases:
ethicml.algorithms.preprocess.pre_algorithm.PreAlgorithmAsync
AIF360 implementation of Zemel’s LFR.
- Parameters
dir (Union[str, pathlib.Path]) –
threshold (float) –
clusters (int) –
Ax (float) –
Ay (float) –
Az (float) –
max_iter (int) –
maxfun (int) –
epsilon (float) –
seed (int) –
- fit(train)#
Fit transformer on the given data.
- Parameters
train (ethicml.utility.data_structures.DataTuple) – training data
- Returns
a tuple of the fitted transformer and the pre-processed training data
- Return type
Tuple[ethicml.algorithms.preprocess.pre_algorithm.PreAlgorithm, ethicml.utility.data_structures.DataTuple]
- property name: str#
Name of the algorithm.
- property out_size: int#
The number of features to generate.
- run(train, test)#
Generate fair features with the given data.
- Parameters
train (ethicml.utility.data_structures.DataTuple) – training data
test (ethicml.utility.data_structures.TestTuple) – test data
- Returns
a tuple of the pre-processed training data and the test data
- Return type
Tuple[ethicml.utility.data_structures.DataTuple, ethicml.utility.data_structures.TestTuple]
- run_test(train, test)#
Run with reduced training set so that it finishes quicker.
- Parameters
train – training data
test – test data
- Return type
Tuple[ethicml.utility.data_structures.DataTuple, ethicml.utility.data_structures.TestTuple]
- transform(data)#
Generate fair features with the given data asynchronously.
- Parameters
data (ethicml.algorithms.preprocess.pre_algorithm.T) – data to transform
- Returns
the transformed data, of the same type as the input
- Return type
ethicml.algorithms.preprocess.pre_algorithm.T
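For orientation, the Ax, Ay and Az parameters appear to correspond to the three weights in the Learned Fair Representations objective of Zemel et al. The formula below follows the paper's notation. That this implementation maps the parameters to the terms in exactly this way is an assumption based on the parameter names.

```latex
% L_x: reconstruction error of the input features
% L_y: prediction error on the class label
% L_z: statistical parity penalty on the learned representation
L = A_x \, L_x + A_y \, L_y + A_z \, L_z
```

With the defaults in the signature above (Ax=0.01, Ay=0.1, Az=0.5), the parity term would carry the largest weight and the reconstruction term the smallest.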