Model evaluation#

Runs the given metrics on the given algorithms for the given datasets.

Functions:

evaluate_models

Evaluate all the given models for all the given datasets and compute all the given metrics.

evaluate_models_async

Evaluate all the given models for all the given datasets and compute all the given metrics.

load_results

Load results from a CSV file that was created by evaluate_models.

run_metrics

Run all the given metrics on the given predictions and return the results.

evaluate_models(datasets, preprocess_models=(), inprocess_models=(), metrics=(), per_sens_metrics=(), repeats=1, test_mode=False, delete_prev=False, splitter=None, topic=None, fair_pipeline=True, scaler=None, dataset_based_results=True)#

Evaluate all the given models for all the given datasets and compute all the given metrics.

Parameters
  • datasets (List[ethicml.data.dataset.Dataset]) – list of dataset objects

  • scaler (Optional[ethicml.preprocessing.scaling.ScalerType]) – sklearn-style scaler to use on the continuous features of the dataset.

  • preprocess_models (Sequence[ethicml.algorithms.preprocess.pre_algorithm.PreAlgorithm]) – list of preprocess model objects

  • inprocess_models (Sequence[ethicml.algorithms.inprocess.in_algorithm.InAlgorithm]) – list of inprocess model objects

  • metrics (Sequence[ethicml.metrics.metric.Metric]) – list of metric objects

  • per_sens_metrics (Sequence[ethicml.metrics.metric.Metric]) – list of metric objects that will be evaluated per sensitive attribute

  • repeats (int) – number of repeats to perform for the experiments

  • test_mode (bool) – if True, only use a small subset of the data so that the models run faster

  • delete_prev (bool) – if True, delete previously saved results in the output directory (False by default)

  • splitter (Optional[ethicml.preprocessing.train_test_split.DataSplitter]) – (optional) custom train-test splitter

  • topic (Optional[str]) – (optional) a string that identifies the run; the string is prepended to the filename

  • fair_pipeline (bool) – if True, run fair inprocess algorithms on the output of preprocessing

  • dataset_based_results (bool) – if True, use the name of the sensitive variable in the returned results. If False, refer to the sensitive variable as S.

Return type

ethicml.utility.data_structures.Results
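
A minimal usage sketch. The dataset loader adult, the in-process model LR, and the metrics Accuracy and ProbPos are assumed to be available from EthicML's top-level namespace; substitute your own datasets, models and metrics:

    import ethicml as em

    # One dataset, one in-process model, accuracy overall and
    # positive rate per sensitive group, averaged over 3 repeats.
    results = em.evaluate_models(
        datasets=[em.adult()],
        inprocess_models=[em.LR()],
        metrics=[em.Accuracy()],
        per_sens_metrics=[em.ProbPos()],
        repeats=3,
    )
    print(results)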

evaluate_models_async(*, datasets, preprocess_models=(), inprocess_models=(), metrics=(), per_sens_metrics=(), repeats=1, test_mode=False, delete_prev=False, splitter=None, topic=None, fair_pipeline=True, num_cpus=1, scaler=None)#

Evaluate all the given models for all the given datasets and compute all the given metrics.

Parameters
  • datasets (List[ethicml.data.dataset.Dataset]) – list of dataset objects

  • scaler (Optional[ethicml.preprocessing.scaling.ScalerType]) – sklearn-style scaler to use on the continuous features of the dataset.

  • preprocess_models (Sequence[ethicml.algorithms.preprocess.pre_algorithm.PreAlgorithm]) – list of preprocess model objects

  • inprocess_models (Sequence[ethicml.algorithms.inprocess.in_algorithm.InAlgorithm]) – list of inprocess model objects

  • metrics (Sequence[ethicml.metrics.metric.Metric]) – list of metric objects

  • per_sens_metrics (Sequence[ethicml.metrics.metric.Metric]) – list of metric objects that will be evaluated per sensitive attribute

  • repeats (int) – number of repeats to perform for the experiments

  • test_mode (bool) – if True, only use a small subset of the data so that the models run faster

  • delete_prev (bool) – if True, delete previously saved results in the output directory (False by default)

  • splitter (Optional[ethicml.preprocessing.train_test_split.DataSplitter]) – (optional) custom train-test splitter

  • topic (Optional[str]) – (optional) a string that identifies the run; the string is prepended to the filename

  • fair_pipeline (bool) – if True, run fair inprocess algorithms on the output of preprocessing

  • num_cpus (int) – number of CPUs to use

Return type

ethicml.utility.data_structures.Results
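
A sketch of a parallel run. All arguments are keyword-only (note the * in the signature), and this assumes evaluate_models_async is a coroutine, so it must be awaited, here via asyncio.run; em.adult, em.LR and em.Accuracy are assumed names as above:

    import asyncio

    import ethicml as em

    async def main():
        # Same evaluation as the synchronous version, but spread
        # over four worker processes.
        return await em.evaluate_models_async(
            datasets=[em.adult()],
            inprocess_models=[em.LR()],
            metrics=[em.Accuracy()],
            num_cpus=4,
        )

    results = asyncio.run(main())
    print(results)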

load_results(dataset_name, transform_name, topic=None, outdir=PosixPath('results'))#

Load results from a CSV file that was created by evaluate_models.

Parameters
  • dataset_name (str) – name of the dataset of the results

  • transform_name (str) – name of the transformation that was used for the results

  • topic (Optional[str]) – (optional) topic string of the results

  • outdir (pathlib.Path) – directory where the results are stored

Returns

Results object if the file exists; None otherwise

Return type

Optional[ethicml.utility.data_structures.Results]
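
A sketch of reloading saved results. The dataset and transformation names must match those used when evaluate_models wrote the CSV; the strings below are illustrative placeholders, not guaranteed values:

    from pathlib import Path

    import ethicml as em

    # "Adult Sex" / "no_transform" are placeholders; use the names
    # from your own evaluate_models run.
    results = em.load_results(
        dataset_name="Adult Sex",
        transform_name="no_transform",
        outdir=Path("results"),
    )
    if results is None:
        print("no results file found")
    else:
        print(results)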

run_metrics(predictions, actual, metrics=(), per_sens_metrics=(), diffs_and_ratios=True, use_sens_name=True)#

Run all the given metrics on the given predictions and return the results.

Parameters
  • predictions (ethicml.utility.data_structures.Prediction) – the predictions to evaluate

  • actual (ethicml.utility.data_structures.DataTuple) – the ground-truth data, including the sensitive attribute

  • metrics (Sequence[ethicml.metrics.metric.Metric]) – list of metric objects

  • per_sens_metrics (Sequence[ethicml.metrics.metric.Metric]) – list of metric objects that will be evaluated per sensitive attribute

  • diffs_and_ratios (bool) – if True, also compute differences and ratios of the per-sensitive-attribute results

  • use_sens_name (bool) – if True, use the name of the sensitive variable in the returned results; if False, refer to it as S

Return type

Dict[str, float]
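
A sketch of scoring a single model run. This assumes em.adult().load() returns a DataTuple, em.train_test_split splits it into train and test sets, and in-process algorithms expose a run method returning a Prediction; all of these are assumed parts of EthicML's public API:

    import ethicml as em

    # Load a dataset, split it, train one model, score its predictions.
    data = em.adult().load()                 # DataTuple: features, sens attr, labels
    train, test = em.train_test_split(data)  # default random split
    preds = em.LR().run(train, test)         # Prediction object

    scores = em.run_metrics(
        predictions=preds,
        actual=test,
        metrics=[em.Accuracy()],
        per_sens_metrics=[em.ProbPos()],
    )
    print(scores)  # mapping from metric name to score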