Model evaluation#

Runs the given metrics on the given algorithms for the given datasets.

Functions:

evaluate_models

Evaluate all the given models for all the given datasets and compute all the given metrics.

evaluate_models_async

Evaluate all the given models for all the given datasets and compute all the given metrics.

load_results

Load results from a CSV file that was created by evaluate_models.

run_metrics

Run all the given metrics on the given predictions and return the results.

evaluate_models(datasets, preprocess_models=(), inprocess_models=(), metrics=(), per_sens_metrics=(), repeats=1, test_mode=False, delete_prev=False, splitter=None, topic=None, fair_pipeline=True, scaler=None, dataset_based_results=True)#

Evaluate all the given models for all the given datasets and compute all the given metrics.

Parameters
  • datasets (List[ethicml.data.dataset.Dataset]) – list of dataset objects

  • scaler (Optional[ethicml.preprocessing.scaling.ScalerType]) – sklearn-style scaler to use on the continuous features of the dataset.

  • preprocess_models (Sequence[ethicml.algorithms.preprocess.pre_algorithm.PreAlgorithm]) – list of preprocess model objects

  • inprocess_models (Sequence[ethicml.algorithms.inprocess.in_algorithm.InAlgorithm]) – list of inprocess model objects

  • metrics (Sequence[ethicml.metrics.metric.Metric]) – list of metric objects

  • per_sens_metrics (Sequence[ethicml.metrics.metric.Metric]) – list of metric objects that will be evaluated per sensitive attribute

  • repeats (int) – number of repeats to perform for the experiments

  • test_mode (bool) – if True, only use a small subset of the data so that the models run faster

  • delete_prev (bool) – if True, delete previously saved results in the output directory (False by default)

  • splitter (Optional[ethicml.preprocessing.train_test_split.DataSplitter]) – (optional) custom train-test splitter

  • topic (Optional[str]) – (optional) a string that identifies the run; the string is prepended to the filename

  • fair_pipeline (bool) – if True, run fair inprocess algorithms on the output of preprocessing

  • dataset_based_results (bool) – if True, use the name of the sensitive variable in the returned results. If False, refer to the sensitive variable as S.

Return type

ethicml.utility.data_structures.Results
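
A minimal usage sketch. The dataset loader adult, the in-process model LR, and the metrics Accuracy and ProbPos are assumed to be available from EthicML's top-level namespace; substitute your own datasets, models and metrics:

    import ethicml as em

    # One dataset, one in-process model, accuracy overall and
    # positive rate per sensitive group, averaged over 3 repeats.
    results = em.evaluate_models(
        datasets=[em.adult()],
        inprocess_models=[em.LR()],
        metrics=[em.Accuracy()],
        per_sens_metrics=[em.ProbPos()],
        repeats=3,
    )
    print(results)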

evaluate_models_async(*, datasets, preprocess_models=(), inprocess_models=(), metrics=(), per_sens_metrics=(), repeats=1, test_mode=False, delete_prev=False, splitter=None, topic=None, fair_pipeline=True, num_cpus=1, scaler=None)#

Evaluate all the given models for all the given datasets and compute all the given metrics.

Parameters
  • datasets (List[ethicml.data.dataset.Dataset]) – list of dataset objects

  • scaler (Optional[ethicml.preprocessing.scaling.ScalerType]) – sklearn-style scaler to use on the continuous features of the dataset.

  • preprocess_models (Sequence[ethicml.algorithms.preprocess.pre_algorithm.PreAlgorithm]) – list of preprocess model objects

  • inprocess_models (Sequence[ethicml.algorithms.inprocess.in_algorithm.InAlgorithm]) – list of inprocess model objects

  • metrics (Sequence[ethicml.metrics.metric.Metric]) – list of metric objects

  • per_sens_metrics (Sequence[ethicml.metrics.metric.Metric]) – list of metric objects that will be evaluated per sensitive attribute

  • repeats (int) – number of repeats to perform for the experiments

  • test_mode (bool) – if True, only use a small subset of the data so that the models run faster

  • delete_prev (bool) – if True, delete previously saved results in the output directory (False by default)

  • splitter (Optional[ethicml.preprocessing.train_test_split.DataSplitter]) – (optional) custom train-test splitter

  • topic (Optional[str]) – (optional) a string that identifies the run; the string is prepended to the filename

  • fair_pipeline (bool) – if True, run fair inprocess algorithms on the output of preprocessing

  • num_cpus (int) – number of CPUs to use

Return type

ethicml.utility.data_structures.Results
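
A sketch of a parallel run. All arguments are keyword-only (note the * in the signature), and this assumes evaluate_models_async is a coroutine, so it must be awaited, here via asyncio.run; em.adult, em.LR and em.Accuracy are assumed names as above:

    import asyncio

    import ethicml as em

    async def main():
        # Same evaluation as the synchronous version, but spread
        # over four worker processes.
        return await em.evaluate_models_async(
            datasets=[em.adult()],
            inprocess_models=[em.LR()],
            metrics=[em.Accuracy()],
            num_cpus=4,
        )

    results = asyncio.run(main())
    print(results)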

load_results(dataset_name, transform_name, topic=None, outdir=PosixPath('results'))#

Load results from a CSV file that was created by evaluate_models.

Parameters
  • dataset_name (str) – name of the dataset of the results

  • transform_name (str) – name of the transformation that was used for the results

  • topic (Optional[str]) – (optional) topic string of the results

  • outdir (pathlib.Path) – directory where the results are stored

Returns

Results object if the file exists; None otherwise

Return type

Optional[ethicml.utility.data_structures.Results]
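
A sketch of reloading saved results. The dataset and transformation names must match those used when evaluate_models wrote the CSV; the strings below are illustrative placeholders, not guaranteed values:

    from pathlib import Path

    import ethicml as em

    # "Adult Sex" / "no_transform" are placeholders; use the names
    # from your own evaluate_models run.
    results = em.load_results(
        dataset_name="Adult Sex",
        transform_name="no_transform",
        outdir=Path("results"),
    )
    if results is None:
        print("no results file found")
    else:
        print(results)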

run_metrics(predictions, actual, metrics=(), per_sens_metrics=(), diffs_and_ratios=True, use_sens_name=True)#

Run all the given metrics on the given predictions and return the results.

Parameters
  • predictions (ethicml.utility.data_structures.Prediction) – the predictions to evaluate

  • actual (ethicml.utility.data_structures.DataTuple) – the ground-truth data, including the sensitive attribute

  • metrics (Sequence[ethicml.metrics.metric.Metric]) – list of metric objects

  • per_sens_metrics (Sequence[ethicml.metrics.metric.Metric]) – list of metric objects that will be evaluated per sensitive attribute

  • diffs_and_ratios (bool) – if True, also compute differences and ratios of the per-sensitive-attribute results

  • use_sens_name (bool) – if True, use the name of the sensitive variable in the returned results; if False, refer to it as S

Return type

Dict[str, float]
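
A sketch of scoring a single model run. This assumes em.adult().load() returns a DataTuple, em.train_test_split splits it into train and test sets, and in-process algorithms expose a run method returning a Prediction; all of these are assumed parts of EthicML's public API:

    import ethicml as em

    # Load a dataset, split it, train one model, score its predictions.
    data = em.adult().load()                 # DataTuple: features, sens attr, labels
    train, test = em.train_test_split(data)  # default random split
    preds = em.LR().run(train, test)         # Prediction object

    scores = em.run_metrics(
        predictions=preds,
        actual=test,
        metrics=[em.Accuracy()],
        per_sens_metrics=[em.ProbPos()],
    )
    print(scores)  # mapping from metric name to score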