ethicml.run#
Module for evaluators which apply algorithms over datasets and obtain metrics.
Classes:
CVResults: Stores the results of a cross validation experiment (see CrossValidator).
CrossValidator: A simple approach to Cross Validation.
Functions:
arrange_in_parallel(): Arrange the given algorithms to run (embarrassingly) parallel.
evaluate_models(): Evaluate all the given models for all the given datasets and compute all the given metrics.
load_results(): Load results from a CSV file that was created by evaluate_models().
run_in_parallel(): Run the given algorithms (embarrassingly) parallel.
- class CVResults(results, model)#
Bases: object
Stores the results of a cross validation experiment (see CrossValidator).
This object isn't meant to be iterated over directly. Instead, use the raw_storage property to access the results across all folds, or the mean_storage property to access the average results for each parameter setting.

    import ethicml as em
    from ethicml import data, metrics, models
    from ethicml.run import CrossValidator

    train, test = em.train_test_split(data.Compas().load())
    hyperparams = {"C": [1, 1e-1, 1e-2, 1e-3, 1e-4, 1e-5, 1e-6]}

    cv = CrossValidator(models.LR, hyperparams, folds=3)

    primary = metrics.Accuracy()
    fair_measure = metrics.AbsCV()
    cv_results = cv.run(train, measures=[primary, fair_measure])
    best_result = cv_results.get_best_in_top_k(primary, fair_measure, top_k=3)

    print(f"Best C: {best_result.params['C']}")
    print(f"Best Accuracy: {best_result.scores['Accuracy']}")
    print(f"Best CV Score: {best_result.scores['CV absolute']}")
    print(cv_results.mean_storage)
    print(cv_results.raw_storage)
- Parameters:
results (list[ResultTuple]) –
model (type[InAlgorithm]) –
- best(measure)#
Return a model initialised with the best hyper-parameters.
The best hyper-parameters are those that perform optimally on average across folds for a given metric.
- Parameters:
measure (Metric) –
- Return type:
InAlgorithm
- best_hyper_params(measure)#
Get hyper-parameters that return the ‘best’ result for the metric of interest.
- Parameters:
measure (Metric) –
- Return type:
dict[str, Any]
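A minimal sketch of how best() and best_hyper_params() might be used, continuing the cv_results from the class-level example above; the standard InAlgorithm.run(train, test) interface is assumed for the returned model:

    # Hyper-parameters that scored best on average across folds:
    best_params = cv_results.best_hyper_params(primary)

    # A fresh model instance initialised with those hyper-parameters
    # (assumed to follow the usual InAlgorithm interface):
    best_model = cv_results.best(primary)
    predictions = best_model.run(train, test)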
- get_best_in_top_k(primary, secondary, top_k)#
Get best result in top-K entries.
First sort the results according to the primary metric; then, from the top K of those, take the best according to the secondary metric.
- Parameters:
primary (Metric) –
secondary (Metric) –
top_k (int) –
- class CrossValidator(model, hyperparams, folds=3, max_parallel=0)#
Bases: object
A simple approach to Cross Validation.
The CrossValidator object is used to run cross-validation on a model. Results are returned in a CVResults object.

    import ethicml as em
    from ethicml import data, metrics, models
    from ethicml.run import CrossValidator

    train, test = em.train_test_split(data.Compas().load())
    hyperparams = {"C": [1, 1e-1, 1e-2, 1e-3, 1e-4, 1e-5, 1e-6]}

    lr_cv = CrossValidator(models.LR, hyperparams, folds=3)

    primary = metrics.Accuracy()
    fair_measure = metrics.AbsCV()
    cv_results = lr_cv.run(train, measures=[primary, fair_measure])
- Parameters:
model (type[InAlgorithm]) – the class (not an instance) of the model for cross validation
hyperparams (Mapping[str, Sequence[Any]]) – a dictionary where the keys are the names of hyperparameters and the values are lists of possible values for the hyperparameters
folds (int) – the number of folds
max_parallel (int) – the maximum number of parallel processes; if set to 0, use the default which is the number of available CPUs
- run(train, measures=None)#
Run the cross validation experiments.
- Parameters:
train (DataTuple) –
measures (list[Metric] | None) – (Default: None)
- arrange_in_parallel(algos, data, seeds, num_jobs=None)#
Arrange the given algorithms to run (embarrassingly) parallel.
- Parameters:
algos (Sequence[Algorithm[_RT]]) – List of tuples consisting of a run_async function of an algorithm and a name.
data (Sequence[TrainValPair]) – List of pairs of data tuples (train and test).
seeds (list[int]) – List of random seeds.
num_jobs (int | None) – Number of parallel jobs. None means as many as available CPUs. (Default: None)
- Returns:
list of the results
- Return type:
list[list[_RT]]
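A minimal sketch of a call, assuming pair_a and pair_b are TrainValPair objects built elsewhere and that the instantiated models satisfy the expected algorithm interface:

    from ethicml import models
    from ethicml.run import arrange_in_parallel

    # pair_a and pair_b are assumed TrainValPair objects built elsewhere.
    results = arrange_in_parallel(
        algos=[models.LR(), models.SVM()],
        data=[pair_a, pair_b],
        seeds=[0, 1],
        num_jobs=2,
    )
    # results: one inner list per algorithm, with one entry per data pair.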
- evaluate_models(datasets, *, preprocess_models=(), inprocess_models=(), metrics=(), per_sens_metrics=(), repeats=1, test_mode=False, delete_previous=True, splitter=None, topic=None, fair_pipeline=True, num_jobs=None, scaler=None, repeat_on='both')#
Evaluate all the given models for all the given datasets and compute all the given metrics.
- Parameters:
datasets (list[Dataset]) – List of dataset objects.
preprocess_models (Sequence[PreAlgorithm]) – List of preprocess model objects. (Default: ())
inprocess_models (Sequence[InAlgorithm]) – List of inprocess model objects. (Default: ())
metrics (Sequence[Metric]) – List of metric objects. (Default: ())
per_sens_metrics (Sequence[Metric]) – List of metric objects that will be evaluated per sensitive attribute. (Default: ())
repeats (int) – Number of repeats to perform for the experiments. (Default: 1)
test_mode (bool) – If True, only use a small subset of the data so that the models run faster. (Default: False)
delete_previous (bool) – True by default. If True, delete previous results in the directory.
splitter (DataSplitter | None) – Custom train-test splitter. (Default: None)
topic (str | None) – A string that identifies the run; the string is prepended to the filename. (Default: None)
fair_pipeline (bool) – if True, run fair inprocess algorithms on the output of preprocessing. (Default: True)
num_jobs (int | None) – Number of parallel jobs; if None, the number of CPUs is used. (Default: None)
scaler (ScalerType | None) – Sklearn-style scaler to be used on the continuous features. (Default: None)
repeat_on (Literal['data', 'model', 'both']) – Should the data seed or the model seed be varied for each run? Or should they both be the same? (Default: "both")
- Returns:
A Results object.
- Return type:
Results
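A minimal sketch of a typical call; the particular dataset, model, and metrics are illustrative, and metrics.ProbPos() is assumed to be usable as a per-sensitive-attribute metric:

    from ethicml import data, metrics, models
    from ethicml.run import evaluate_models

    results = evaluate_models(
        datasets=[data.Compas()],
        inprocess_models=[models.LR()],
        metrics=[metrics.Accuracy()],
        per_sens_metrics=[metrics.ProbPos()],  # computed per sensitive group
        repeats=3,
    )
    print(results)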
- load_results(dataset_name, transform_name, topic=None, outdir=PosixPath('results'))#
Load results from a CSV file that was created by evaluate_models().
- Parameters:
dataset_name (str) – name of the dataset of the results
transform_name (str) – name of the transformation that was used for the results
topic (str | None) – (optional) topic string of the results (Default: None)
outdir (Path) – directory where the results are stored (Default: Path(“.”) / “results”)
- Returns:
DataFrame if the file exists; None otherwise
- Return type:
Results | None
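A minimal sketch of loading previously saved results; the transform name "no_transform" is an assumption about how runs without preprocessing are labelled, and should match whatever evaluate_models() actually wrote:

    from ethicml.run import load_results

    # "no_transform" is an assumed transform name for unpreprocessed runs.
    results = load_results("Compas", "no_transform")
    if results is not None:
        print(results)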
- run_in_parallel(algos, *, data, seeds, num_jobs=None)#
Run the given algorithms (embarrassingly) parallel.
- Parameters:
algos (Sequence[InAlgorithm] | Sequence[PreAlgorithm]) – List of algorithms.
data (Sequence[TrainValPair]) – List of pairs of data tuples (train and test).
seeds (list[int]) – List of seeds to use when running the model.
num_jobs (int | None) – How many jobs can run in parallel at most. If None, use the number of CPUs. (Default: None)
- Returns:
list of the results
- Return type:
list[list[Prediction]] | list[list[tuple[DataTuple, DataTuple]]]
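A minimal sketch; pairs is assumed to be a pre-built sequence of TrainValPair objects, e.g. produced by a data splitter:

    from ethicml import models
    from ethicml.run import run_in_parallel

    # `pairs` is an assumed, pre-built sequence of TrainValPair objects.
    predictions = run_in_parallel(
        [models.LR(), models.SVM()],
        data=pairs,
        seeds=[0, 1],
        num_jobs=2,
    )
    # For in-process models, predictions[i][j] is the Prediction of
    # algorithm i on data pair j.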