Inprocess algorithms#

In-process algorithms take training data and make predictions.

Inprocess base#

Abstract Base Class of all algorithms in the framework.

class InAlgorithm(*args, **kwargs)#

Bases: ethicml.algorithms.algorithm_base.Algorithm, Protocol

Abstract Base Class for algorithms that run in the middle of the pipeline.

abstract fit(train)#

Fit Algorithm on the given data.

Parameters

train (ethicml.utility.data_structures.DataTuple) – training data
self (ethicml.algorithms.inprocess.in_algorithm._I) –

Returns

self, but trained.

Return type

ethicml.algorithms.inprocess.in_algorithm._I

property hyperparameters: Dict[str, Union[str, int, float]]#: Return the hyperparameters as a dictionary.

abstract property name: str#: Name of the algorithm.

abstract predict(test)#

Make predictions on the given data.

Parameters: test (ethicml.utility.data_structures.TestTuple) – data to evaluate on
Returns: Prediction
Return type: ethicml.utility.data_structures.Prediction

run(train, test)#

Run Algorithm on the given data.

Parameters

train (ethicml.utility.data_structures.DataTuple) – training data
test (ethicml.utility.data_structures.TestTuple) – test data

Returns

Prediction

Return type

ethicml.utility.data_structures.Prediction

run_test(train, test)#

Run with reduced training set so that it finishes quicker.

Parameters

train (ethicml.utility.data_structures.DataTuple) –
test (ethicml.utility.data_structures.TestTuple) –

Return type

ethicml.utility.data_structures.Prediction
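
The relationship between fit, predict, run and run_test can be sketched with a toy class. This is a minimal sketch rather than EthicML code: ToyThreshold, the plain-list data, and the max_rows truncation are hypothetical stand-ins for an InAlgorithm subclass, for DataTuple and TestTuple, and for whatever reduction the library applies. It also assumes that run is equivalent to fit followed by predict.

```python
from typing import Dict, List, Tuple, Union

Row = Tuple[float, int]  # (feature, label): hypothetical stand-in for a DataTuple row


class ToyThreshold:
    """Toy classifier: predicts 1 when the feature exceeds the training mean."""

    def __init__(self, seed: int = 888) -> None:
        self.seed = seed
        self.threshold = 0.0

    @property
    def name(self) -> str:
        return "ToyThreshold"

    @property
    def hyperparameters(self) -> Dict[str, Union[str, int, float]]:
        return {"seed": self.seed}

    def fit(self, train: List[Row]) -> "ToyThreshold":
        # Learn a single number from the training data and return self,
        # so that calls can be chained: model.fit(train).predict(test).
        self.threshold = sum(x for x, _ in train) / len(train)
        return self

    def predict(self, test: List[float]) -> List[int]:
        return [int(x > self.threshold) for x in test]

    def run(self, train: List[Row], test: List[float]) -> List[int]:
        # Assumption: run is fit followed by predict.
        return self.fit(train).predict(test)

    def run_test(self, train: List[Row], test: List[float], max_rows: int = 2) -> List[int]:
        # Assumption: the "reduced training set" is simply the first few rows.
        return self.run(train[:max_rows], test)


train = [(1.0, 0), (2.0, 0), (6.0, 1), (7.0, 1)]
test = [1.5, 3.0, 6.5]
model = ToyThreshold()
full = model.run(train, test)        # threshold is the mean of all four rows
quick = model.run_test(train, test)  # threshold is the mean of the first two rows
```

Because fit returns the fitted object, the two-step form model.fit(train).predict(test) and the one-step form model.run(train, test) give the same predictions in this sketch.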

class InAlgorithmAsync(*args, **kwargs)#

Bases: ethicml.algorithms.algorithm_base.SubprocessAlgorithmMixin, ethicml.algorithms.inprocess.in_algorithm.InAlgorithm, Protocol

In-Algorithm that uses a subprocess to run.

fit(train)#

Fit algorithm on the given data asynchronously.

Parameters

train (ethicml.utility.data_structures.DataTuple) – training data
self (ethicml.algorithms.inprocess.in_algorithm._IA) –

Returns

self, but trained.

Return type

ethicml.algorithms.inprocess.in_algorithm._IA

property hyperparameters: Dict[str, Union[str, int, float]]#: Return the hyperparameters as a dictionary.

abstract property name: str#: Name of the algorithm.

predict(test)#

Make predictions on the given data asynchronously.

Parameters

test (ethicml.utility.data_structures.TestTuple) – test data

Returns

Prediction

Return type

ethicml.utility.data_structures.Prediction

run(train, test)#

Run Algorithm on the given data asynchronously.

Parameters

train (ethicml.utility.data_structures.DataTuple) – training data
test (ethicml.utility.data_structures.TestTuple) – test data

Returns

Prediction

Return type

ethicml.utility.data_structures.Prediction

run_test(train, test)#

Run with reduced training set so that it finishes quicker.

Parameters

train (ethicml.utility.data_structures.DataTuple) –
test (ethicml.utility.data_structures.TestTuple) –

Return type

ethicml.utility.data_structures.Prediction

class InAlgorithmDC(seed=888)#

Bases: ethicml.algorithms.inprocess.in_algorithm.InAlgorithm

InAlgorithm dataclass base class.

Parameters: seed (int) –
Return type: None

abstract fit(train)#

Fit Algorithm on the given data.

Parameters

train (ethicml.utility.data_structures.DataTuple) – training data
self (ethicml.algorithms.inprocess.in_algorithm._I) –

Returns

self, but trained.

Return type

ethicml.algorithms.inprocess.in_algorithm._I

property hyperparameters: Dict[str, Union[str, int, float]]#: Return the hyperparameters as a dictionary.

abstract property name: str#: Name of the algorithm.

abstract predict(test)#

Make predictions on the given data.

Parameters: test (ethicml.utility.data_structures.TestTuple) – data to evaluate on
Returns: Prediction
Return type: ethicml.utility.data_structures.Prediction

run(train, test)#

Run Algorithm on the given data.

Parameters

train (ethicml.utility.data_structures.DataTuple) – training data
test (ethicml.utility.data_structures.TestTuple) – test data

Returns

Prediction

Return type

ethicml.utility.data_structures.Prediction

run_test(train, test)#

Run with reduced training set so that it finishes quicker.

Parameters

train (ethicml.utility.data_structures.DataTuple) –
test (ethicml.utility.data_structures.TestTuple) –

Return type

ethicml.utility.data_structures.Prediction

Agarwal#

Implementation of Agarwal model.

class Agarwal(*, dir='.', fairness='DP', classifier='LR', eps=0.1, iters=50, C=None, kernel=None, seed=888)#

Bases: ethicml.algorithms.inprocess.in_algorithm.InAlgorithmAsync

Agarwal class.

A wrapper around the Exponentiated Gradient method documented here.

Initialize the Agarwal algorithm.

Parameters

dir (Union[str, pathlib.Path]) – Directory to store the model.
fairness (Literal['DP', 'EqOp', 'EqOd']) – Type of fairness to enforce.
classifier (Literal['LR', 'SVM']) – Type of classifier to use.
eps (float) – Epsilon fo.
iters (int) – Number of iterations for the DP algorithm.
C (Optional[float]) – C parameter for the SVM algorithm.
kernel (Optional[Literal['linear', 'rbf', 'poly', 'sigmoid']]) – Kernel type for the SVM algorithm.
seed (int) – Random seed.
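
The fairness options presumably name the usual group-fairness criteria: Demographic Parity (DP), Equality of Opportunity (EqOp) and Equalised Odds (EqOd). That expansion is an assumption here. A minimal sketch of the first two as measurable gaps, using hypothetical helper names and plain lists in place of the library's data structures:

```python
from typing import Sequence


def positive_rate(preds: Sequence[int]) -> float:
    """Fraction of predictions that are positive (1)."""
    return sum(preds) / len(preds)


def demographic_parity_gap(preds: Sequence[int], s: Sequence[int]) -> float:
    """Absolute difference in positive rate between groups s=0 and s=1."""
    group0 = [p for p, g in zip(preds, s) if g == 0]
    group1 = [p for p, g in zip(preds, s) if g == 1]
    return abs(positive_rate(group0) - positive_rate(group1))


def equal_opportunity_gap(
    preds: Sequence[int], y: Sequence[int], s: Sequence[int]
) -> float:
    """Absolute difference in true positive rate between groups s=0 and s=1."""
    tp0 = [p for p, t, g in zip(preds, y, s) if g == 0 and t == 1]
    tp1 = [p for p, t, g in zip(preds, y, s) if g == 1 and t == 1]
    return abs(positive_rate(tp0) - positive_rate(tp1))


preds = [1, 1, 0, 0, 1, 0, 0, 0]  # predicted labels
y = [1, 0, 1, 0, 1, 1, 0, 0]      # true labels
s = [0, 0, 0, 0, 1, 1, 1, 1]      # sensitive attribute

dp_gap = demographic_parity_gap(preds, s)
eo_gap = equal_opportunity_gap(preds, y, s)
```

In this toy data the two groups have equal true positive rates but different overall positive rates, so the predictions satisfy Equality of Opportunity while violating Demographic Parity. Equalised Odds would additionally compare false positive rates.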

fit(train)#

Fit algorithm on the given data asynchronously.

Parameters

train (ethicml.utility.data_structures.DataTuple) – training data
self (ethicml.algorithms.inprocess.in_algorithm._IA) –

Returns

self, but trained.

Return type

ethicml.algorithms.inprocess.in_algorithm._IA

property hyperparameters: Dict[str, Union[str, int, float]]#: Return the hyperparameters as a dictionary.

property name: str#: Name of the algorithm.

predict(test)#

Make predictions on the given data asynchronously.

Parameters

test (ethicml.utility.data_structures.TestTuple) – test data

Returns

Prediction

Return type

ethicml.utility.data_structures.Prediction

run(train, test)#

Run Algorithm on the given data asynchronously.

Parameters

train (ethicml.utility.data_structures.DataTuple) – training data
test (ethicml.utility.data_structures.TestTuple) – test data

Returns

Prediction

Return type

ethicml.utility.data_structures.Prediction

run_test(train, test)#

Run with reduced training set so that it finishes quicker.

Parameters

train (ethicml.utility.data_structures.DataTuple) –
test (ethicml.utility.data_structures.TestTuple) –

Return type

ethicml.utility.data_structures.Prediction

Distributionally-robust optimization#

Fairness without Demographics.

class DRO(*, dir='.', eta=0.5, epochs=10, batch_size=32, network_size=None, seed=888)#

Bases: ethicml.algorithms.inprocess.in_algorithm.InAlgorithmAsync

Implementation of https://arxiv.org/abs/1806.08010 .

Initialize the Distributionally Robust Optimization method.

Parameters

dir (Union[str, pathlib.Path]) – Directory to store the model.
eta (float) – Tolerance.
epochs (int) – The number of epochs to train for.
batch_size (int) – The batch size.
network_size (Optional[List[int]]) – The size of the network.
seed (int) – The seed for the random number generator.
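
The paper linked above optimises a worst-case risk whose dual form penalises only the per-sample losses that exceed a threshold. A minimal sketch of that penalty follows. It assumes eta plays the role of the threshold, which is one reading of "Tolerance" and not something the parameter list confirms. The function name dro_objective is hypothetical and this is not the library's training loop.

```python
from typing import Sequence


def dro_objective(losses: Sequence[float], eta: float) -> float:
    """Mean squared excess of the per-sample losses over the threshold eta."""
    # Samples whose loss is already below eta contribute nothing, so the
    # objective concentrates on the examples the model currently handles worst.
    excess = [max(loss - eta, 0.0) for loss in losses]
    return sum(e * e for e in excess) / len(excess)


per_sample_losses = [0.2, 0.7, 1.5]
plain_mean = sum(per_sample_losses) / len(per_sample_losses)
robust = dro_objective(per_sample_losses, eta=0.5)
```

Under this reading, a larger eta ignores more of the well-fitted samples, and once eta is above every loss the objective is zero.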

fit(train)#

Fit algorithm on the given data asynchronously.

Parameters

train (ethicml.utility.data_structures.DataTuple) – training data
self (ethicml.algorithms.inprocess.in_algorithm._IA) –

Returns

self, but trained.

Return type

ethicml.algorithms.inprocess.in_algorithm._IA

property hyperparameters: Dict[str, Union[str, int, float]]#: Return the hyperparameters as a dictionary.

property name: str#: Name of the algorithm.

predict(test)#

Make predictions on the given data asynchronously.

Parameters

test (ethicml.utility.data_structures.TestTuple) – test data

Returns

Prediction

Return type

ethicml.utility.data_structures.Prediction

run(train, test)#

Run Algorithm on the given data asynchronously.

Parameters

train (ethicml.utility.data_structures.DataTuple) – training data
test (ethicml.utility.data_structures.TestTuple) – test data

Returns

Prediction

Return type

ethicml.utility.data_structures.Prediction

run_test(train, test)#

Run with reduced training set so that it finishes quicker.

Parameters

train (ethicml.utility.data_structures.DataTuple) –
test (ethicml.utility.data_structures.TestTuple) –

Return type

ethicml.utility.data_structures.Prediction

Base for installed model#

Installable model.

This is a kind of complicated model, but it’s incredibly useful. Say you find a paper from a few years ago with code. It’s not unreasonable that there might be dependency clashes, Python clashes, clashes galore. This approach downloads a model, runs it in its own venv and makes everyone happy.

class InstalledModel(name, dir_name, top_dir, url=None, executable=None, seed=888, use_poetry=False)#

Bases: ethicml.algorithms.algorithm_base.SubprocessAlgorithmMixin, ethicml.algorithms.inprocess.in_algorithm.InAlgorithm

The model that does the magic.

Download code from given URL and create Pip environment with Pipfile found in the code.

Parameters

name (str) – name of the model
dir_name (str) – where to download the code to (can be chosen freely)
top_dir (str) – top directory of the repository where the Pipfile can be found (this is usually simply the last part of the repository URL)
is_fairness_algo – if True, this object corresponds to an algorithm enforcing fairness
url (Optional[str]) – (optional) URL of the repository
executable (Optional[str]) – (optional) path to a Python executable
seed (int) – Random seed to use for reproducibility
use_poetry (bool) – if True, will try to use poetry instead of pipenv

abstract fit(train)#

Fit Algorithm on the given data.

Parameters

train (ethicml.utility.data_structures.DataTuple) – training data
self (ethicml.algorithms.inprocess.in_algorithm._I) –

Returns

self, but trained.

Return type

ethicml.algorithms.inprocess.in_algorithm._I

property hyperparameters: Dict[str, Union[str, int, float]]#: Return the hyperparameters as a dictionary.

property name: str#: Name of the algorithm.

abstract predict(test)#

Make predictions on the given data.

Parameters: test (ethicml.utility.data_structures.TestTuple) – data to evaluate on
Returns: Prediction
Return type: ethicml.utility.data_structures.Prediction

remove()#

Removes the directory that we created in _clone_directory().

Return type: None

run(train, test)#

Run Algorithm on the given data.

Parameters

train (ethicml.utility.data_structures.DataTuple) – training data
test (ethicml.utility.data_structures.TestTuple) – test data

Returns

Prediction

Return type

ethicml.utility.data_structures.Prediction

run_test(train, test)#

Run with reduced training set so that it finishes quicker.

Parameters

train (ethicml.utility.data_structures.DataTuple) –
test (ethicml.utility.data_structures.TestTuple) –

Return type

ethicml.utility.data_structures.Prediction
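
The download-then-isolate idea described for InstalledModel can be sketched as a list of shell commands. This is a rough sketch under stated assumptions, not the library's implementation: setup_commands is a hypothetical helper, the repository URL is a placeholder, and the commands are only built here, not executed. In practice each would be passed to subprocess.run with the cloned directory as the working directory.

```python
from typing import List, Optional


def setup_commands(
    dir_name: str,
    top_dir: str,
    url: Optional[str] = None,
    use_poetry: bool = False,
) -> List[List[str]]:
    """Build the commands that clone a repository and install its environment."""
    commands: List[List[str]] = []
    if url is not None:
        # Clone into <dir_name>/<top_dir>, where the Pipfile is expected to live.
        commands.append(["git", "clone", url, f"{dir_name}/{top_dir}"])
    # Install dependencies into an environment separate from the caller's.
    installer = "poetry" if use_poetry else "pipenv"
    commands.append([installer, "install"])
    return commands


# Placeholder URL for illustration only.
pipenv_cmds = setup_commands("downloads", "repo", url="https://example.com/some/repo.git")
poetry_cmds = setup_commands("downloads", "repo", use_poetry=True)
```

Keeping the model's dependencies in their own environment is what lets an old codebase with conflicting requirements run next to the rest of the pipeline.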

Kamiran#

Kamiran and Calders 2012.

class Kamiran(*, classifier='LR', C=None, kernel=None, seed=888)#

Bases: ethicml.algorithms.inprocess.in_algorithm.InAlgorithm

An implementation of the Reweighing method from Kamiran and Calders 2012.

Each sample is assigned an instance weight based on the joint probability of S and Y, which is used during training of a classifier.

Reweighing.

Parameters

classifier (Literal['LR', 'SVM']) – The classifier to use.
C (Optional[float]) – The C parameter for the classifier.
kernel (Optional[Literal['linear', 'rbf', 'poly', 'sigmoid']]) – The kernel to use for the classifier if SVM selected.
seed (int) – The random number generator seed to use for the classifier.

fit(train)#

Fit Algorithm on the given data.

Parameters: train (ethicml.utility.data_structures.DataTuple) – training data
Returns: self, but trained.
Return type: ethicml.algorithms.inprocess.in_algorithm.InAlgorithm

property hyperparameters: Dict[str, Union[str, int, float]]#: Return the hyperparameters as a dictionary.

property name: str#: Name of the algorithm.

predict(test)#

Make predictions on the given data.

Parameters: test (ethicml.utility.data_structures.TestTuple) – data to evaluate on
Returns: Prediction
Return type: ethicml.utility.data_structures.Prediction

run(train, test)#

Run Algorithm on the given data.

Parameters

train (ethicml.utility.data_structures.DataTuple) – training data
test (ethicml.utility.data_structures.TestTuple) – test data

Returns

Prediction

Return type

ethicml.utility.data_structures.Prediction

run_test(train, test)#

Run with reduced training set so that it finishes quicker.

Parameters

train (ethicml.utility.data_structures.DataTuple) –
test (ethicml.utility.data_structures.TestTuple) –

Return type

ethicml.utility.data_structures.Prediction

compute_instance_weights(train, balance_groups=False, upweight=False)#

Compute weights for all samples.

Parameters

train (ethicml.utility.data_structures.DataTuple) – The training data.
balance_groups (bool) – Whether to balance the groups. When False, the groups are balanced as in Kamiran and Calders 2012. When True, the groups are numerically balanced.
upweight (bool) – If balance_groups is True, whether to upweight the groups, or to downweight them. Downweighting is done by multiplying the weights by the inverse of the group size and is more numerically stable for small group sizes.

Returns

A dataframe with the instance weights for each sample in the training data.

Return type

pandas.DataFrame
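
The reweighing scheme of Kamiran and Calders assigns each sample the weight P(S=s) P(Y=y) / P(S=s, Y=y), so that S and Y look statistically independent under the weighted distribution. A minimal sketch of that formula, corresponding to the balance_groups=False case described above. The function name reweighing_weights is hypothetical, it takes plain lists instead of a DataTuple, and it returns a list where the library returns a DataFrame.

```python
from collections import Counter
from typing import List, Sequence


def reweighing_weights(s: Sequence[int], y: Sequence[int]) -> List[float]:
    """Weight each sample by P(s) * P(y) / P(s, y), estimated from counts."""
    n = len(s)
    count_s = Counter(s)
    count_y = Counter(y)
    count_sy = Counter(zip(s, y))
    # (n_s / n) * (n_y / n) / (n_sy / n) simplifies to n_s * n_y / (n * n_sy).
    return [
        count_s[si] * count_y[yi] / (n * count_sy[(si, yi)])
        for si, yi in zip(s, y)
    ]


s = [0, 0, 0, 1, 1, 1, 1, 1]  # sensitive attribute
y = [1, 0, 0, 1, 1, 1, 1, 0]  # class label
weights = reweighing_weights(s, y)
```

Combinations of S and Y that are rarer than independence would predict, such as (s=0, y=1) in this toy data, receive weights above 1, and over-represented combinations receive weights below 1.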

Kamishima#

Wrapper for calling Kamishima model.

class Kamishima(*, eta=1.0)#

Bases: ethicml.algorithms.inprocess.installed_model.InstalledModel

Model that calls Kamishima’s code.

Based on Algo-Fairness https://github.com/algofairness/fairness-comparison/blob/master/fairness/algorithms/kamishima/KamishimaAlgorithm.py

Initialize Kamishima model.

Parameters: eta (float) – Tolerance.

fit(train)#

Fit Algorithm on the given data.

Parameters: train (ethicml.utility.data_structures.DataTuple) – training data
Returns: self, but trained.
Return type: ethicml.algorithms.inprocess.kamishima.Kamishima

property hyperparameters: Dict[str, Union[str, int, float]]#: Return the hyperparameters as a dictionary.

property name: str#: Name of the algorithm.

predict(test)#

Make predictions on the given data.

Parameters: test (ethicml.utility.data_structures.TestTuple) – data to evaluate on
Returns: Prediction
Return type: ethicml.utility.data_structures.Prediction

remove()#

Removes the directory that we created in _clone_directory().

Return type: None

run(train, test)#

Run Algorithm on the given data.

Parameters

train (ethicml.utility.data_structures.DataTuple) – training data
test (ethicml.utility.data_structures.TestTuple) – test data

Returns

Prediction

Return type

ethicml.utility.data_structures.Prediction

run_test(train, test)#

Run with reduced training set so that it finishes quicker.

Parameters

train (ethicml.utility.data_structures.DataTuple) –
test (ethicml.utility.data_structures.TestTuple) –

Return type

ethicml.utility.data_structures.Prediction

Logistic regression#

Wrapper around scikit-learn Logistic Regression.

class LR(C=<factory>, seed=888)#

Bases: ethicml.algorithms.inprocess.in_algorithm.InAlgorithm

Logistic regression with hard predictions.

This is a wrapper around scikit-learn’s LogisticRegression, the documentation for which is available here.

Parameters

C (float) – The regularization parameter.
seed (int) – The seed for the random number generator.

Return type

None

fit(train)#

Fit Algorithm on the given data.

Parameters: train (ethicml.utility.data_structures.DataTuple) – training data
Returns: self, but trained.
Return type: ethicml.algorithms.inprocess.in_algorithm.InAlgorithm

property hyperparameters: Dict[str, Union[str, int, float]]#: Return the hyperparameters as a dictionary.

property name: str#: Name of the algorithm.

predict(test)#

Make predictions on the given data.

Parameters: test (ethicml.utility.data_structures.TestTuple) – data to evaluate on
Returns: Prediction
Return type: ethicml.utility.data_structures.Prediction

run(train, test)#

Run Algorithm on the given data.

Parameters

train (ethicml.utility.data_structures.DataTuple) – training data
test (ethicml.utility.data_structures.TestTuple) – test data

Returns

Prediction

Return type

ethicml.utility.data_structures.Prediction

run_test(train, test)#

Run with reduced training set so that it finishes quicker.

Parameters

train (ethicml.utility.data_structures.DataTuple) –
test (ethicml.utility.data_structures.TestTuple) –

Return type

ethicml.utility.data_structures.Prediction

class LRCV(n_splits=3, seed=888)#

Bases: ethicml.algorithms.inprocess.in_algorithm.InAlgorithm

Kind of a cheap hack for now, but gives a proper cross-validated LR.

Parameters

n_splits (int) – The number of splits for the cross-validation.
seed (int) – The seed for the random number generator.

Return type

None

fit(train)#

Fit Algorithm on the given data.

Parameters: train (ethicml.utility.data_structures.DataTuple) – training data
Returns: self, but trained.
Return type: ethicml.algorithms.inprocess.in_algorithm.InAlgorithm

property hyperparameters: Dict[str, Union[str, int, float]]#: Return the hyperparameters as a dictionary.

property name: str#: Name of the algorithm.

predict(test)#

Make predictions on the given data.

Parameters: test (ethicml.utility.data_structures.TestTuple) – data to evaluate on
Returns: Prediction
Return type: ethicml.utility.data_structures.Prediction

run(train, test)#

Run Algorithm on the given data.

Parameters

train (ethicml.utility.data_structures.DataTuple) – training data
test (ethicml.utility.data_structures.TestTuple) – test data

Returns

Prediction

Return type

ethicml.utility.data_structures.Prediction

run_test(train, test)#

Run with reduced training set so that it finishes quicker.

Parameters

train (ethicml.utility.data_structures.DataTuple) –
test (ethicml.utility.data_structures.TestTuple) –

Return type

ethicml.utility.data_structures.Prediction
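
Cross-validation with n_splits folds holds out each fold once and trains on the rest. The sketch below shows only the splitting step, without shuffling. How LRCV shuffles the data or which regularisation values it searches is not stated here, and k_fold_indices is a hypothetical helper, not part of the library.

```python
from typing import List, Tuple


def k_fold_indices(n_samples: int, n_splits: int = 3) -> List[Tuple[List[int], List[int]]]:
    """Return (train_indices, held_out_indices) pairs for unshuffled k-fold."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [
        n_samples // n_splits + (1 if i < n_samples % n_splits else 0)
        for i in range(n_splits)
    ]
    folds = []
    start = 0
    for size in fold_sizes:
        held_out = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, held_out))
        start += size
    return folds


folds = k_fold_indices(7, n_splits=3)
held_out_sets = [held_out for _, held_out in folds]
```

Every sample is held out exactly once, so each candidate setting is scored on data it was not trained on.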

class LRProb(C=<factory>, seed=888)#

Bases: ethicml.algorithms.inprocess.in_algorithm.InAlgorithm

Logistic regression with soft output.

Parameters

C (float) – The regularization parameter.
seed (int) – The seed for the random number generator.

Return type

None

fit(train)#

Fit Algorithm on the given data.

Parameters: train (ethicml.utility.data_structures.DataTuple) – training data
Returns: self, but trained.
Return type: ethicml.algorithms.inprocess.in_algorithm.InAlgorithm

property hyperparameters: Dict[str, Union[str, int, float]]#: Return the hyperparameters as a dictionary.

property name: str#: Name of the algorithm.

predict(test)#

Make predictions on the given data.

Parameters: test (ethicml.utility.data_structures.TestTuple) – data to evaluate on
Returns: Prediction
Return type: ethicml.utility.data_structures.Prediction

run(train, test)#

Run Algorithm on the given data.

Parameters

train (ethicml.utility.data_structures.DataTuple) – training data
test (ethicml.utility.data_structures.TestTuple) – test data

Returns

Prediction

Return type

ethicml.utility.data_structures.SoftPrediction

run_test(train, test)#

Run with reduced training set so that it finishes quicker.

Parameters

train (ethicml.utility.data_structures.DataTuple) –
test (ethicml.utility.data_structures.TestTuple) –

Return type

ethicml.utility.data_structures.Prediction
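
The difference between the hard predictions of LR and the soft output of LRProb can be illustrated with the logistic function. This is a minimal sketch, not the wrapped scikit-learn code. The function names are hypothetical, the scores stand in for a fitted model's linear outputs, and the 0.5 threshold is an assumption about how hard labels are derived.

```python
import math
from typing import List, Sequence


def sigmoid(z: float) -> float:
    """Logistic function mapping a real-valued score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def soft_predictions(scores: Sequence[float]) -> List[float]:
    """Soft output: the probability of the positive class for each sample."""
    return [sigmoid(z) for z in scores]


def hard_predictions(scores: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Hard output: the probabilities thresholded into class labels."""
    return [int(p >= threshold) for p in soft_predictions(scores)]


scores = [-2.0, 0.0, 3.0]  # hypothetical linear scores w.x + b
soft = soft_predictions(scores)
hard = hard_predictions(scores)
```

Soft outputs keep the model's confidence, which matters when a later step needs probabilities and not just labels.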

Majority#

Simply returns the majority label from the train set.

class Majority(seed=888)#

Bases: ethicml.algorithms.inprocess.in_algorithm.InAlgorithmDC

Simply returns the majority label from the train set.

Parameters: seed (int) –
Return type: None

fit(train)#

Fit Algorithm on the given data.

Parameters: train (ethicml.utility.data_structures.DataTuple) – training data
Returns: self, but trained.
Return type: ethicml.algorithms.inprocess.in_algorithm.InAlgorithmDC

property hyperparameters: Dict[str, Union[str, int, float]]#: Return the hyperparameters as a dictionary.

property name: str#: Name of the algorithm.

predict(test)#

Make predictions on the given data.

Parameters: test (ethicml.utility.data_structures.TestTuple) – data to evaluate on
Returns: Prediction
Return type: ethicml.utility.data_structures.Prediction

run(train, test)#

Run Algorithm on the given data.

Parameters

train (ethicml.utility.data_structures.DataTuple) – training data
test (ethicml.utility.data_structures.TestTuple) – test data

Returns

Prediction

Return type

ethicml.utility.data_structures.Prediction

run_test(train, test)#

Run with reduced training set so that it finishes quicker.

Parameters

train (ethicml.utility.data_structures.DataTuple) –
test (ethicml.utility.data_structures.TestTuple) –

Return type

ethicml.utility.data_structures.Prediction
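
A majority-label baseline is small enough to sketch in full. MajoritySketch below is a hypothetical stand-in that works on plain lists of labels. It is not the library class, which operates on DataTuple and TestTuple.

```python
from collections import Counter
from typing import List, Sequence


class MajoritySketch:
    """Predict the most frequent training label for every test sample."""

    def fit(self, labels: Sequence[int]) -> "MajoritySketch":
        # most_common(1) returns a list holding the (label, count) pair
        # with the highest count.
        self.label_ = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, n_samples: int) -> List[int]:
        # The features are ignored entirely: only the sample count matters.
        return [self.label_] * n_samples


baseline = MajoritySketch().fit([1, 0, 1, 1, 0])
predictions = baseline.predict(3)
```

A baseline like this gives the accuracy any learned model has to beat, and because its predictions are constant they cannot differ between groups.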

Manual methods#

Manually specified (i.e. not learned) models.

class Corels(seed=888)#

Bases: ethicml.algorithms.inprocess.in_algorithm.InAlgorithmDC

CORELS (Certifiably Optimal RulE ListS) algorithm for the COMPAS dataset.

This algorithm uses if-statements to make predictions. It only works on COMPAS with s as sex.

From this paper: https://arxiv.org/abs/1704.01701

Parameters: seed (int) –
Return type: None

fit(train)#

Fit Algorithm on the given data.

Parameters: train (ethicml.utility.data_structures.DataTuple) – training data
Returns: self, but trained.
Return type: ethicml.algorithms.inprocess.in_algorithm.InAlgorithmDC

property hyperparameters: Dict[str, Union[str, int, float]]#: Return the hyperparameters as a dictionary.

property name: str#: Name of the algorithm.

predict(test)#

Make predictions on the given data.

Parameters: test (ethicml.utility.data_structures.TestTuple) – data to evaluate on
Returns: Prediction
Return type: ethicml.utility.data_structures.Prediction

run(train, test)#

Run Algorithm on the given data.

Parameters

train (ethicml.utility.data_structures.DataTuple) – training data
test (ethicml.utility.data_structures.TestTuple) – test data

Returns

Prediction

Return type

ethicml.utility.data_structures.Prediction

run_test(train, test)#

Run with reduced training set so that it finishes quicker.

Parameters

train (ethicml.utility.data_structures.DataTuple) –
test (ethicml.utility.data_structures.TestTuple) –

Return type

ethicml.utility.data_structures.Prediction

MLP#

Wrapper for SKLearn implementation of MLP.

class MLP(*, hidden_layer_sizes=None, activation=None, seed=888)#

Bases: ethicml.algorithms.inprocess.in_algorithm.InAlgorithm

Multi-layer Perceptron.

This is a wrapper around the SKLearn implementation of the MLP. Documentation: https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html

Multi-Layer Perceptron.

Parameters

hidden_layer_sizes (Optional[Tuple[int, ...]]) – The number of neurons in each hidden layer.
activation (Optional[Literal['identity', 'logistic', 'tanh', 'relu']]) – The activation function to use.
seed (int) – The seed for the random number generator.

fit(train)#

Fit Algorithm on the given data.

Parameters: train (ethicml.utility.data_structures.DataTuple) – training data
Returns: self, but trained.
Return type: ethicml.algorithms.inprocess.in_algorithm.InAlgorithm

property hyperparameters: Dict[str, Union[str, int, float]]#: Return the hyperparameters as a dictionary.

property name: str#: Name of the algorithm.

predict(test)#

Make predictions on the given data.

Parameters: test (ethicml.utility.data_structures.TestTuple) – data to evaluate on
Returns: Prediction
Return type: ethicml.utility.data_structures.Prediction

run(train, test)#

Run Algorithm on the given data.

Parameters

train (ethicml.utility.data_structures.DataTuple) – training data
test (ethicml.utility.data_structures.TestTuple) – test data

Returns

Prediction

Return type

ethicml.utility.data_structures.Prediction

run_test(train, test)#

Run with reduced training set so that it finishes quicker.

Parameters

train (ethicml.utility.data_structures.DataTuple) –
test (ethicml.utility.data_structures.TestTuple) –

Return type

ethicml.utility.data_structures.Prediction

Oracle#

How would a perfect predictor perform?

class DPOracle(seed=888)#

Bases: ethicml.algorithms.inprocess.in_algorithm.InAlgorithmDC

A perfect Demographic Parity Predictor.

Can only be used if test is a DataTuple, rather than the usual TestTuple. This model isn’t intended for general use, but can be useful if you want to either do a sanity check, or report potential values.

Parameters: seed (int) –
Return type: None

fit(train)#

Fit Algorithm on the given data.

Parameters: train (ethicml.utility.data_structures.DataTuple) – training data
Returns: self, but trained.
Return type: ethicml.algorithms.inprocess.in_algorithm.InAlgorithm

property hyperparameters: Dict[str, Union[str, int, float]]#: Return the hyperparameters as a dictionary.

property name: str#: Name of the algorithm.

predict(test)#

Make predictions on the given data.

Parameters: test (ethicml.utility.data_structures.TestTuple) – data to evaluate on
Returns: Prediction
Return type: ethicml.utility.data_structures.Prediction

run(train, test)#

Run Algorithm on the given data.

Parameters

train (ethicml.utility.data_structures.DataTuple) – training data
test (ethicml.utility.data_structures.TestTuple) – test data

Returns

Prediction

Return type

ethicml.utility.data_structures.Prediction

run_test(train, test)#

Run with reduced training set so that it finishes quicker.

Parameters

train (ethicml.utility.data_structures.DataTuple) –
test (ethicml.utility.data_structures.TestTuple) –

Return type

ethicml.utility.data_structures.Prediction

class Oracle(seed=888)#

Bases: ethicml.algorithms.inprocess.in_algorithm.InAlgorithmDC

A perfect predictor.

Parameters: seed (int) –
Return type: None

fit(train)#

Fit Algorithm on the given data.

Parameters: train (ethicml.utility.data_structures.DataTuple) – training data
Returns: self, but trained.
Return type: ethicml.algorithms.inprocess.in_algorithm.InAlgorithm

property hyperparameters: Dict[str, Union[str, int, float]]#: Return the hyperparameters as a dictionary.

property name: str#: Name of the algorithm.

predict(test)#

Make predictions on the given data.

Parameters: test (ethicml.utility.data_structures.TestTuple) – data to evaluate on
Returns: Prediction
Return type: ethicml.utility.data_structures.Prediction

run(train, test)#

Run Algorithm on the given data.

Parameters

train (ethicml.utility.data_structures.DataTuple) – training data
test (ethicml.utility.data_structures.TestTuple) – test data

Returns

Prediction

Return type

ethicml.utility.data_structures.Prediction

run_test(train, test)#

Run with reduced training set so that it finishes quicker.

Parameters

train (ethicml.utility.data_structures.DataTuple) –
test (ethicml.utility.data_structures.TestTuple) –

Return type

ethicml.utility.data_structures.Prediction
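
A perfect predictor can only exist if the test data carries its labels, which is presumably why these oracle models need a labelled DataTuple at prediction time. A minimal sketch of the idea, where LabelledData, oracle_predict and accuracy are hypothetical names and not library code:

```python
from typing import List, NamedTuple, Sequence


class LabelledData(NamedTuple):
    """Hypothetical stand-in for a DataTuple: features plus true labels."""

    x: List[float]
    y: List[int]


def oracle_predict(test: LabelledData) -> List[int]:
    # The oracle ignores the features and copies the true labels.
    return list(test.y)


def accuracy(preds: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of predictions that match the labels."""
    return sum(p == t for p, t in zip(preds, labels)) / len(labels)


test_data = LabelledData(x=[0.1, 0.9, 0.4], y=[0, 1, 1])
oracle_acc = accuracy(oracle_predict(test_data), test_data.y)
```

The resulting scores show the best value a metric could reach on this test set, which is useful as a sanity check or as an upper bound when reporting results.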

SVM#

Wrapper for SKLearn implementation of SVM.

class SVM(*, C=None, kernel=None, seed=888)#

Bases: ethicml.algorithms.inprocess.in_algorithm.InAlgorithm

A wrapper around the scikit-learn Support Vector Classifier (SVC) model.

Documentation for the underlying classifier can be found here.

Parameters

C (Optional[float]) –
kernel (Optional[Literal['linear', 'rbf', 'poly', 'sigmoid']]) –
seed (int) –

fit(train)#

Fit Algorithm on the given data.

Parameters: train (ethicml.utility.data_structures.DataTuple) – training data
Returns: self, but trained.
Return type: ethicml.algorithms.inprocess.in_algorithm.InAlgorithm

property hyperparameters: Dict[str, Union[str, int, float]]#: Return the hyperparameters as a dictionary.

property name: str#: Name of the algorithm.

predict(test)#

Make predictions on the given data.

Parameters: test (ethicml.utility.data_structures.TestTuple) – data to evaluate on
Returns: Prediction
Return type: ethicml.utility.data_structures.Prediction

run(train, test)#

Run Algorithm on the given data.

Parameters

train (ethicml.utility.data_structures.DataTuple) – training data
test (ethicml.utility.data_structures.TestTuple) – test data

Returns

Prediction

Return type

ethicml.utility.data_structures.Prediction

run_test(train, test)#

Run with reduced training set so that it finishes quicker.

Parameters

train (ethicml.utility.data_structures.DataTuple) –
test (ethicml.utility.data_structures.TestTuple) –

Return type

ethicml.utility.data_structures.Prediction

Zafar#

Algorithms by Zafar et al. for Demographic Parity.

class ZafarAccuracy(*, gamma=0.5)#

Bases: ethicml.algorithms.inprocess.zafar._ZafarAlgorithmBase

Zafar with fairness.

Download code from given URL and create Pip environment with Pipfile found in the code.

Parameters

gamma (float) –

fit(train)#

Fit Algorithm on the given data.

Parameters: train (ethicml.utility.data_structures.DataTuple) – training data
Returns: self, but trained.
Return type: ethicml.algorithms.inprocess.zafar._ZafarAlgorithmBase

property hyperparameters: Dict[str, Union[str, int, float]]#: Return the hyperparameters as a dictionary.

property name: str#: Name of the algorithm.

predict(test)#

Make predictions on the given data.

Parameters: test (ethicml.utility.data_structures.TestTuple) – data to evaluate on
Returns: Prediction
Return type: ethicml.utility.data_structures.Prediction

remove()#

Removes the directory that we created in _clone_directory().

Return type: None

run(train, test)#

Run Algorithm on the given data.

Parameters

train (ethicml.utility.data_structures.DataTuple) – training data
test (ethicml.utility.data_structures.TestTuple) – test data

Returns

Prediction

Return type

ethicml.utility.data_structures.Prediction

run_test(train, test)#

Run with reduced training set so that it finishes quicker.

Parameters

train (ethicml.utility.data_structures.DataTuple) –
test (ethicml.utility.data_structures.TestTuple) –

Return type

ethicml.utility.data_structures.Prediction

class ZafarBaseline#

Bases: ethicml.algorithms.inprocess.zafar._ZafarAlgorithmBase

Zafar without fairness.

Download code from given URL and create Pip environment with Pipfile found in the code.

Return type

None

fit(train)#

Fit Algorithm on the given data.

Parameters: train (ethicml.utility.data_structures.DataTuple) – training data
Returns: self, but trained.
Return type: ethicml.algorithms.inprocess.zafar._ZafarAlgorithmBase

property hyperparameters: Dict[str, Union[str, int, float]]#: Return the hyperparameters as a dictionary.

property name: str#: Name of the algorithm.

predict(test)#

Make predictions on the given data.

Parameters: test (ethicml.utility.data_structures.TestTuple) – data to evaluate on
Returns: Prediction
Return type: ethicml.utility.data_structures.Prediction

remove()#

Removes the directory that we created in _clone_directory().

Return type: None

run(train, test)#

Run Algorithm on the given data.

Parameters

train (ethicml.utility.data_structures.DataTuple) – training data
test (ethicml.utility.data_structures.TestTuple) – test data

Returns

Prediction

Return type

ethicml.utility.data_structures.Prediction

run_test(train, test)#

Run with reduced training set so that it finishes quicker.

Parameters

train (ethicml.utility.data_structures.DataTuple) –
test (ethicml.utility.data_structures.TestTuple) –

Return type

ethicml.utility.data_structures.Prediction
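The four methods above fit together in a simple way: fit() returns the trained object, predict() evaluates it, run() can be read as fit followed by predict, and run_test() does the same on a smaller training set. The following is a minimal sketch of that relationship, not the library's implementation. ToyData, MajorityClassifier, and the max_rows parameter are hypothetical names, and truncating to the first rows is only one assumed way of reducing the training set.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToyData:
    """Hypothetical stand-in for DataTuple/TestTuple: one feature and one label per row."""

    x: List[float]
    y: List[int]


class MajorityClassifier:
    """Toy algorithm that always predicts the most common training label."""

    def fit(self, train: ToyData) -> "MajorityClassifier":
        ones = sum(train.y)
        self.label_ = 1 if 2 * ones >= len(train.y) else 0
        return self  # mirrors "self, but trained"

    def predict(self, test: ToyData) -> List[int]:
        return [self.label_ for _ in test.x]

    def run(self, train: ToyData, test: ToyData) -> List[int]:
        # Assumed composition: fit on train, then predict on test.
        return self.fit(train).predict(test)

    def run_test(self, train: ToyData, test: ToyData, max_rows: int = 3) -> List[int]:
        # Assumed behaviour: same as run(), but on a truncated training set.
        small = ToyData(x=train.x[:max_rows], y=train.y[:max_rows])
        return self.run(small, test)


train = ToyData(x=[0.1, 0.2, 0.3, 0.4, 0.5], y=[0, 0, 1, 1, 1])
test = ToyData(x=[0.6, 0.7], y=[1, 0])

full_preds = MajorityClassifier().run(train, test)        # [1, 1]
quick_preds = MajorityClassifier().run_test(train, test)  # [0, 0]
```

The two results differ here because the truncated training set has a different majority label, which is a reminder that run_test() is suited to smoke tests rather than to reporting results.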

class ZafarEqOdds(*, tau=5.0, mu=1.2, eps=0.0001)#

Bases: ethicml.algorithms.inprocess.zafar.ZafarEqOpp

Zafar for Equalised Odds.

Download the code from the given URL and create a Pip environment from the Pipfile found in the code.

Parameters

name – name of the model
dir_name – where to download the code to (can be chosen freely)
top_dir – top directory of the repository where the Pipfile can be found (this is usually simply the last part of the repository URL)
is_fairness_algo – if True, this object corresponds to an algorithm enforcing fairness
url – (optional) URL of the repository
executable – (optional) path to a Python executable
seed – Random seed to use for reproducibility
use_poetry – if True, will try to use poetry instead of pipenv
tau (float) –
mu (float) –
eps (float) –
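The bare * in the signature ZafarEqOdds(*, tau=5.0, mu=1.2, eps=0.0001) means that tau, mu, and eps are keyword-only. The sketch below reproduces just that calling convention with a hypothetical stand-in class, EqOddsStandIn, that copies the documented defaults and does nothing else; it makes no claim about what the parameters do inside the real algorithm.

```python
class EqOddsStandIn:
    """Hypothetical stand-in that copies only the documented signature."""

    def __init__(self, *, tau: float = 5.0, mu: float = 1.2, eps: float = 0.0001) -> None:
        self.tau = tau
        self.mu = mu
        self.eps = eps


# Defaults apply when nothing is passed.
default_model = EqOddsStandIn()

# Any subset can be overridden, but only by name.
custom_model = EqOddsStandIn(tau=2.0, eps=1e-3)

try:
    EqOddsStandIn(2.0)  # positional arguments are rejected by the bare "*"
    positional_ok = True
except TypeError:
    positional_ok = False
```

Keyword-only parameters make call sites self-describing, which helps when several floats with similar magnitudes are passed together.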

fit(train)#

Fit Algorithm on the given data.

Parameters: train (ethicml.utility.data_structures.DataTuple) – training data
Returns: self, but trained.
Return type: ethicml.algorithms.inprocess.zafar._ZafarAlgorithmBase

property hyperparameters: Dict[str, Union[str, int, float]]#: Return a dictionary of hyperparameters.

property name: str#: Name of the algorithm.

predict(test)#

Make predictions on the given data.

Parameters: test (ethicml.utility.data_structures.TestTuple) – data to evaluate on
Returns: Prediction
Return type: ethicml.utility.data_structures.Prediction

remove()#

Removes the directory that we created in _clone_directory().

Return type: None

run(train, test)#

Run Algorithm on the given data.

Parameters

train (ethicml.utility.data_structures.DataTuple) – training data
test (ethicml.utility.data_structures.TestTuple) – test data

Returns

Prediction

Return type

ethicml.utility.data_structures.Prediction

run_test(train, test)#

Run with reduced training set so that it finishes quicker.

Parameters

train (ethicml.utility.data_structures.DataTuple) – training data
test (ethicml.utility.data_structures.TestTuple) – test data

Return type

ethicml.utility.data_structures.Prediction
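Equalised odds asks that the true positive rate and the false positive rate be the same across groups defined by a sensitive attribute; equality of opportunity asks this only of the true positive rate. The sketch below computes both gaps for a binary sensitive attribute from plain lists. The function names rates and odds_gaps and the toy data are assumptions made for illustration, and the code is independent of how the library evaluates predictions.

```python
from typing import Dict, List, Tuple


def rates(y_true: List[int], y_pred: List[int]) -> Tuple[float, float]:
    """Return (true positive rate, false positive rate) for binary labels.

    Assumes the inputs contain at least one positive and one negative label.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    pos = sum(y_true)
    neg = len(y_true) - pos
    return tp / pos, fp / neg


def odds_gaps(y_true: List[int], y_pred: List[int], s: List[int]) -> Dict[str, float]:
    """Absolute TPR and FPR differences between groups s == 0 and s == 1."""
    per_group = {}
    for group in (0, 1):
        idx = [i for i, value in enumerate(s) if value == group]
        per_group[group] = rates([y_true[i] for i in idx], [y_pred[i] for i in idx])
    (tpr0, fpr0), (tpr1, fpr1) = per_group[0], per_group[1]
    return {"tpr_gap": abs(tpr0 - tpr1), "fpr_gap": abs(fpr0 - fpr1)}


s      = [0, 0, 0, 0, 1, 1, 1, 1]
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]

gaps = odds_gaps(y_true, y_pred, s)
# Group 0: TPR 0.5, FPR 0.0; group 1: TPR 1.0, FPR 0.5.
# gaps == {"tpr_gap": 0.5, "fpr_gap": 0.5}
```

A classifier satisfying equalised odds would drive both gaps towards zero, while one satisfying only equality of opportunity would constrain tpr_gap alone.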

class ZafarEqOpp(*, tau=5.0, mu=1.2, eps=0.0001)#

Bases: ethicml.algorithms.inprocess.zafar._ZafarAlgorithmBase

Zafar for Equality of Opportunity.

Download the code from the given URL and create a Pip environment from the Pipfile found in the code.

Parameters

name – name of the model
dir_name – where to download the code to (can be chosen freely)
top_dir – top directory of the repository where the Pipfile can be found (this is usually simply the last part of the repository URL)
is_fairness_algo – if True, this object corresponds to an algorithm enforcing fairness
url – (optional) URL of the repository
executable – (optional) path to a Python executable
seed – Random seed to use for reproducibility
use_poetry – if True, will try to use poetry instead of pipenv
tau (float) –
mu (float) –
eps (float) –

fit(train)#

Fit Algorithm on the given data.

Parameters: train (ethicml.utility.data_structures.DataTuple) – training data
Returns: self, but trained.
Return type: ethicml.algorithms.inprocess.zafar._ZafarAlgorithmBase

property hyperparameters: Dict[str, Union[str, int, float]]#: Return a dictionary of hyperparameters.

property name: str#: Name of the algorithm.
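The hyperparameters property is typed as Dict[str, Union[str, int, float]], so its value can be logged or turned into a label for a results table. The sketch below illustrates that shape only. EqOppStandIn, describe, and the key names "tau", "mu", and "eps" are assumptions for illustration; the keys the real class reports are not stated in this reference.

```python
from typing import Dict, Union

HyperParams = Dict[str, Union[str, int, float]]


class EqOppStandIn:
    """Hypothetical stand-in exposing a hyperparameters property."""

    def __init__(self, *, tau: float = 5.0, mu: float = 1.2, eps: float = 0.0001) -> None:
        # Key names are assumed; they simply mirror the constructor arguments.
        self._hyperparameters: HyperParams = {"tau": tau, "mu": mu, "eps": eps}

    @property
    def hyperparameters(self) -> HyperParams:
        # Return a copy so callers cannot mutate the stored values.
        return dict(self._hyperparameters)


def describe(name: str, hp: HyperParams) -> str:
    """Build a stable, sorted label such as 'Model(a=1, b=2)'."""
    parts = ", ".join(f"{key}={value}" for key, value in sorted(hp.items()))
    return f"{name}({parts})"


label = describe("EqOppStandIn", EqOppStandIn(tau=1.0).hyperparameters)
# label == "EqOppStandIn(eps=0.0001, mu=1.2, tau=1.0)"
```

Sorting the keys keeps the label stable across runs, which makes it safe to use as a grouping key when comparing configurations.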

predict(test)#

Make predictions on the given data.

Parameters: test (ethicml.utility.data_structures.TestTuple) – data to evaluate on
Returns: Prediction
Return type: ethicml.utility.data_structures.Prediction

remove()#

Removes the directory that we created in _clone_directory().

Return type: None

run(train, test)#

Run Algorithm on the given data.

Parameters

train (ethicml.utility.data_structures.DataTuple) – training data
test (ethicml.utility.data_structures.TestTuple) – test data

Returns

Prediction

Return type

ethicml.utility.data_structures.Prediction

run_test(train, test)#

Run with reduced training set so that it finishes quicker.

Parameters

train (ethicml.utility.data_structures.DataTuple) – training data
test (ethicml.utility.data_structures.TestTuple) – test data

Return type

ethicml.utility.data_structures.Prediction

class ZafarFairness(*, C=0.001)#

Bases: ethicml.algorithms.inprocess.zafar._ZafarAlgorithmBase

Zafar with fairness.

Download the code from the given URL and create a Pip environment from the Pipfile found in the code.

Parameters

name – name of the model
dir_name – where to download the code to (can be chosen freely)
top_dir – top directory of the repository where the Pipfile can be found (this is usually simply the last part of the repository URL)
is_fairness_algo – if True, this object corresponds to an algorithm enforcing fairness
url – (optional) URL of the repository
executable – (optional) path to a Python executable
seed – Random seed to use for reproducibility
use_poetry – if True, will try to use poetry instead of pipenv
C (float) –

fit(train)#

Fit Algorithm on the given data.

Parameters: train (ethicml.utility.data_structures.DataTuple) – training data
Returns: self, but trained.
Return type: ethicml.algorithms.inprocess.zafar._ZafarAlgorithmBase