Model evaluation
Runs given metrics on given algorithms for given datasets.
evaluate_models | Evaluate all the given models for all the given datasets and compute all the given metrics. |
evaluate_models_async | Evaluate all the given models for all the given datasets and compute all the given metrics. |
load_results | Load results from a CSV file that was created by evaluate_models. |
run_metrics | Run all the given metrics on the given predictions and return the results. |
- evaluate_models(datasets, preprocess_models=(), inprocess_models=(), metrics=(), per_sens_metrics=(), repeats=1, test_mode=False, delete_prev=False, splitter=None, topic=None, fair_pipeline=True, scaler=None, dataset_based_results=True)
Evaluate all the given models for all the given datasets and compute all the given metrics.
- Parameters
datasets (List[]) – list of dataset objects
scaler (Optional[ethicml.preprocessing.scaling.ScalerType]) – scaler to use on the continuous features of the dataset.
preprocess_models (Sequence[ethicml.algorithms.preprocess.pre_algorithm.PreAlgorithm]) – list of preprocess model objects
inprocess_models (Sequence[ethicml.algorithms.inprocess.in_algorithm.InAlgorithm]) – list of inprocess model objects
metrics (Sequence[ethicml.metrics.metric.Metric]) – list of metric objects
per_sens_metrics (Sequence[ethicml.metrics.metric.Metric]) – list of metric objects that will be evaluated per sensitive attribute
repeats (int) – number of repeats to perform for the experiments
test_mode (bool) – if True, only use a small subset of the data so that the models run faster
delete_prev (bool) – if True, delete any previously saved results in the output directory. Defaults to False.
splitter (Optional[ethicml.preprocessing.train_test_split.DataSplitter]) – (optional) custom train-test splitter
topic (Optional[str]) – (optional) a string that identifies the run; the string is prepended to the filename
fair_pipeline (bool) – if True, run fair inprocess algorithms on the output of preprocessing
dataset_based_results (bool) – if True, use the name of the sensitive variable in the returned results. If False, refer to the sensitive variable as S.
- Return type
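The way these parameters combine can be sketched without the library itself. The snippet below is a minimal illustration, not EthicML's implementation: `plan_runs`, the `"no_transform"` label for the unpreprocessed baseline, and the `(name, is_fair)` model tuples are all hypothetical stand-ins. It also assumes that with `fair_pipeline=False` fairness-aware inprocess models are skipped on preprocessed data, which is one reading of the parameter description above.

```python
from itertools import product
from typing import List, Sequence, Tuple

# Hypothetical stand-in: (model name, whether it is a fairness-aware algorithm).
Model = Tuple[str, bool]
Run = Tuple[str, int, str, str]  # (dataset, repeat index, transform, model)


def plan_runs(
    datasets: Sequence[str],
    preprocess_models: Sequence[str] = (),
    inprocess_models: Sequence[Model] = (),
    repeats: int = 1,
    fair_pipeline: bool = True,
) -> List[Run]:
    """Enumerate the (dataset, repeat, transform, model) combinations to run."""
    # Assumption: the untransformed data is always evaluated as a baseline.
    transforms = ["no_transform", *preprocess_models]
    runs: List[Run] = []
    for dataset, repeat, transform in product(datasets, range(repeats), transforms):
        preprocessed = transform != "no_transform"
        for name, is_fair in inprocess_models:
            if is_fair and preprocessed and not fair_pipeline:
                continue  # assumed behaviour: skip fair models on preprocessed data
            runs.append((dataset, repeat, transform, name))
    return runs


models = [("svm", False), ("fair_model", True)]
all_runs = plan_runs(["adult"], ["upsampler"], models, repeats=2)
fewer_runs = plan_runs(["adult"], ["upsampler"], models, repeats=2, fair_pipeline=False)
print(len(all_runs), len(fewer_runs))
```

Under these assumptions the number of runs grows as datasets × repeats × (preprocess models + 1) × inprocess models, which is why `test_mode` is useful while setting up an experiment.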
- evaluate_models_async(*, datasets, preprocess_models=(), inprocess_models=(), metrics=(), per_sens_metrics=(), repeats=1, test_mode=False, delete_prev=False, splitter=None, topic=None, fair_pipeline=True, num_cpus=1, scaler=None)
Evaluate all the given models for all the given datasets and compute all the given metrics.
- Parameters
datasets (List[]) – list of dataset objects
scaler (Optional[ethicml.preprocessing.scaling.ScalerType]) – Sklearn-style scaler to be used on the continuous features.
preprocess_models (Sequence[ethicml.algorithms.preprocess.pre_algorithm.PreAlgorithm]) – list of preprocess model objects
inprocess_models (Sequence[ethicml.algorithms.inprocess.in_algorithm.InAlgorithm]) – list of inprocess model objects
metrics (Sequence[ethicml.metrics.metric.Metric]) – list of metric objects
per_sens_metrics (Sequence[ethicml.metrics.metric.Metric]) – list of metric objects that will be evaluated per sensitive attribute
repeats (int) – number of repeats to perform for the experiments
test_mode (bool) – if True, only use a small subset of the data so that the models run faster
delete_prev (bool) – if True, delete any previously saved results in the output directory. Defaults to False.
splitter (Optional[ethicml.preprocessing.train_test_split.DataSplitter]) – (optional) custom train-test splitter
topic (Optional[str]) – (optional) a string that identifies the run; the string is prepended to the filename
fair_pipeline (bool) – if True, run fair inprocess algorithms on the output of preprocessing
num_cpus (int) – number of CPUs to use
- Return type
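As a rough illustration of what a `num_cpus`-style limit can mean for asynchronous evaluation, the sketch below caps the number of jobs in flight with an `asyncio.Semaphore`. It is not how `evaluate_models_async` is implemented; `run_one` and `run_all` are hypothetical names, and the short sleep stands in for training and evaluating one model.

```python
import asyncio
from typing import List, Sequence


async def run_one(job: str, limiter: asyncio.Semaphore) -> str:
    """Hypothetical job: pretend to train and evaluate one model."""
    async with limiter:  # wait until one of the num_cpus slots is free
        await asyncio.sleep(0.01)  # placeholder for the real work
    return f"{job}: done"


async def run_all(jobs: Sequence[str], num_cpus: int = 1) -> List[str]:
    """Run all jobs, with at most num_cpus of them active at once."""
    limiter = asyncio.Semaphore(num_cpus)
    # gather returns results in the order the jobs were submitted
    results = await asyncio.gather(*(run_one(job, limiter) for job in jobs))
    return list(results)


results = asyncio.run(run_all(["svm", "lr", "fair_model"], num_cpus=2))
print(results)
```

A semaphore only bounds how many coroutines proceed at once; real CPU-bound training would additionally need separate processes to run in parallel.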
- load_results(dataset_name, transform_name, topic=None, outdir=PosixPath('results'))
Load results from a CSV file that was created by evaluate_models.
- Parameters
dataset_name (str) – name of the dataset of the results
transform_name (str) – name of the transformation that was used for the results
topic (Optional[str]) – (optional) topic string of the results
outdir (pathlib.Path) – directory where the results are stored
- Returns
DataFrame if the file exists; None otherwise
- Return type
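The load-or-`None` behaviour described above can be sketched as follows. This is an illustration only: the filename pattern (topic, dataset name and transform name joined by underscores) is an assumption based on the note that the topic is prepended to the filename, `results_path` and `load_rows` are hypothetical helpers, and a list of row dictionaries from the standard-library `csv` module stands in for the DataFrame.

```python
import csv
import tempfile
from pathlib import Path
from typing import Dict, List, Optional


def results_path(
    dataset_name: str,
    transform_name: str,
    topic: Optional[str] = None,
    outdir: Path = Path("results"),
) -> Path:
    """Build the CSV path; the naming scheme here is an assumption."""
    stem = f"{dataset_name}_{transform_name}"
    if topic is not None:
        stem = f"{topic}_{stem}"  # the topic is prepended to the filename
    return outdir / f"{stem}.csv"


def load_rows(
    dataset_name: str,
    transform_name: str,
    topic: Optional[str] = None,
    outdir: Path = Path("results"),
) -> Optional[List[Dict[str, str]]]:
    """Return the stored rows if the file exists; None otherwise."""
    path = results_path(dataset_name, transform_name, topic, outdir)
    if not path.is_file():
        return None
    with path.open(newline="") as handle:
        return list(csv.DictReader(handle))


with tempfile.TemporaryDirectory() as tmp:
    outdir = Path(tmp)
    path = results_path("adult", "no_transform", topic="demo", outdir=outdir)
    path.write_text("model,Accuracy\nsvm,0.8\n")
    found = load_rows("adult", "no_transform", topic="demo", outdir=outdir)
    missing = load_rows("adult", "upsampler", topic="demo", outdir=outdir)

print(found)
print(missing)
```

Returning `None` for a missing file lets callers distinguish "not yet computed" from "computed but empty" without catching an exception.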
- run_metrics(predictions, actual, metrics=(), per_sens_metrics=(), diffs_and_ratios=True, use_sens_name=True)
Run all the given metrics on the given predictions and return the results.
- Parameters
predictions (ethicml.utility.data_structures.Prediction) – Prediction object holding the model's predictions
actual (ethicml.utility.data_structures.DataTuple) – DataTuple with the labels
metrics (Sequence[ethicml.metrics.metric.Metric]) – list of metrics
per_sens_metrics (Sequence[ethicml.metrics.metric.Metric]) – list of metrics that are computed per sensitive attribute
diffs_and_ratios (bool) – if True, compute diffs and ratios per sensitive attribute
use_sens_name (bool) – if True, use the name of the sensitive variable in the returned results. If False, refer to the sensitive variable as S.
- Return type
Dict[str, float]
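To make the shape of the returned dictionary more concrete, here is a minimal standard-library sketch of the idea: overall metrics, the same metrics per sensitive group, and optional differences and ratios between two groups. It is not the library's implementation. `run_metrics_sketch`, the plain-list inputs, the metric dictionaries and the key format (`Accuracy_sex_0`, `Accuracy_sex_diff`, and so on) are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Sequence

MetricFn = Callable[[Sequence[int], Sequence[int]], float]


def accuracy(preds: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def run_metrics_sketch(
    preds: Sequence[int],
    labels: Sequence[int],
    sens: Sequence[int],
    metrics: Dict[str, MetricFn],
    per_sens_metrics: Dict[str, MetricFn],
    diffs_and_ratios: bool = True,
    sens_name: str = "sex",
    use_sens_name: bool = True,
) -> Dict[str, float]:
    # Metrics computed once over all samples.
    results: Dict[str, float] = {name: fn(preds, labels) for name, fn in metrics.items()}
    prefix = sens_name if use_sens_name else "S"  # hypothetical key format
    groups: List[int] = sorted(set(sens))
    for name, fn in per_sens_metrics.items():
        per_group: Dict[int, float] = {}
        for group in groups:
            idx = [i for i, s in enumerate(sens) if s == group]
            per_group[group] = fn([preds[i] for i in idx], [labels[i] for i in idx])
            results[f"{name}_{prefix}_{group}"] = per_group[group]
        # Differences and ratios are only defined here for exactly two groups.
        if diffs_and_ratios and len(groups) == 2:
            a, b = per_group[groups[0]], per_group[groups[1]]
            results[f"{name}_{prefix}_diff"] = abs(a - b)
            results[f"{name}_{prefix}_ratio"] = min(a, b) / max(a, b) if max(a, b) > 0 else float("nan")
    return results


preds = [1, 0, 1, 1]
labels = [1, 0, 0, 1]
sens = [0, 0, 1, 1]
named = run_metrics_sketch(preds, labels, sens, {"Accuracy": accuracy}, {"Accuracy": accuracy})
generic = run_metrics_sketch(
    preds, labels, sens, {}, {"Accuracy": accuracy}, diffs_and_ratios=False, use_sens_name=False
)
print(named)
print(generic)
```

Keeping everything in one flat mapping of string keys to floats makes each run easy to store as a single row in a results table.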