Utils#
This module contains useful utilities that don’t yet belong anywhere else.
Classes:
Activation | Base class for decision functions.
DataTuple | A tuple of dataframes for the features, the sensitive attribute and the class labels.
Heaviside | Decision function that accepts predictions with a score of 50% or above.
Prediction | Prediction of an algorithm.
ResultsAggregator | Aggregate results.
SoftPrediction | Prediction of an algorithm that makes soft predictions.
TestTuple | A tuple of dataframes for the features and the sensitive attribute.
TrainTestPair | 2-Tuple of train and test data.
Functions:
aggregate_results | Aggregate results over the repeats.
concat_dt | Concatenate the data tuples in the given list.
concat_tt | Concatenate the test tuples in the given list.
filter_and_map_results | Filter entries and change the index with a mapping.
filter_results | Filter the entries based on the given values.
make_results | Initialise Results object.
map_over_results_index | Change the values of the index with a transformation function.
shuffle_df | Shuffle a given dataframe.
undo_one_hot | Undo one-hot encoding.
- class Activation#
Bases:
abc.ABC
Base class for decision functions.
- abstract apply(soft_output)#
Apply the decision function to a soft prediction.
- Parameters
soft_output (numpy.ndarray) – soft prediction (i.e. a probability or logits)
- Returns
decision
- Return type
numpy.ndarray
- abstract get_name()#
Name of activation function.
- Return type
str
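As a rough illustration of the interface described above, the sketch below defines a local stand-in for the abstract base class and one concrete subclass. `ActivationSketch` and `Threshold` are hypothetical names used only here; they are not part of the module, and a real subclass would presumably inherit from `Activation` itself.

```python
from abc import ABC, abstractmethod

import numpy as np


class ActivationSketch(ABC):
    """Local stand-in mirroring the documented Activation interface."""

    @abstractmethod
    def apply(self, soft_output: np.ndarray) -> np.ndarray:
        """Turn soft predictions into decisions."""

    @abstractmethod
    def get_name(self) -> str:
        """Return the name of the decision function."""


class Threshold(ActivationSketch):
    """Hypothetical decision function with a configurable cutoff."""

    def __init__(self, cutoff: float) -> None:
        self.cutoff = cutoff

    def apply(self, soft_output: np.ndarray) -> np.ndarray:
        # Accept every score at or above the cutoff.
        return (soft_output >= self.cutoff).astype(int)

    def get_name(self) -> str:
        return f"Threshold({self.cutoff})"


decisions = Threshold(0.7).apply(np.array([0.2, 0.7, 0.9]))
```

Because both methods are abstract, instantiating the base class directly raises a `TypeError`; only subclasses that implement `apply` and `get_name` can be created.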
- class DataTuple(x, s, y, name=None)#
Bases:
ethicml.utility.data_structures.TestTuple
A tuple of dataframes for the features, the sensitive attribute and the class labels.
- Parameters
x (pd.DataFrame) – input features
s (pd.DataFrame) – sensitive attributes
y (pd.DataFrame) – class labels
name (Optional[str]) – optional name of the dataset
Make a DataTuple.
- __len__()#
Override the __len__ magic method.
- Return type
int
- apply_to_joined_df(mapper)#
Concatenate the dataframes in the DataTuple and then apply a function to it.
- classmethod from_npz(data_path)#
Load data tuple from npz file.
- Parameters
data_path (pathlib.Path) –
- Return type
ethicml.utility.data_structures.DataTuple
- get_subset(num=500)#
Get the first elements of the dataset.
- Parameters
num (int) – how many samples to take for subset
- Returns
subset of training data
- Return type
ethicml.utility.data_structures.DataTuple
- property name: Optional[str]#
Getter for name property.
- remove_y()#
Convert the DataTuple instance to a TestTuple instance.
- Return type
ethicml.utility.data_structures.TestTuple
- replace(*, x=None, s=None, name=None, y=None)#
Create a copy of the DataTuple but change the given values.
- Parameters
x (Optional[pd.DataFrame]) –
s (Optional[pd.DataFrame]) –
name (Optional[str]) –
y (Optional[pd.DataFrame]) –
- Return type
ethicml.utility.data_structures.DataTuple
- property s: pandas.DataFrame#
Getter for property s.
- to_npz(data_path)#
Save DataTuple as an npz file.
- Parameters
data_path (pathlib.Path) –
- Return type
None
- property x: pandas.DataFrame#
Getter for property x.
- property y: pandas.DataFrame#
Getter for property y.
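To make the shape of the three dataframes concrete, here is a small sketch that uses plain pandas rather than the DataTuple class. It assumes that "concatenate the dataframes" in apply_to_joined_df means a column-wise join and that get_subset takes the first `num` rows of each frame; the column names and the filtering step are made up for illustration.

```python
import pandas as pd

# One row per sample; the three frames share the same index.
x = pd.DataFrame({"age": [25, 40, 31], "income": [30.0, 52.5, 41.0]})
s = pd.DataFrame({"sex": [0, 1, 0]})
y = pd.DataFrame({"hired": [1, 0, 1]})

# Assumed behaviour of apply_to_joined_df: join column-wise, then apply a function.
joined = pd.concat([x, s, y], axis="columns")
# Hypothetical mapper: keep samples whose sensitive attribute equals 0.
filtered = joined[joined["sex"] == 0]

# Assumed behaviour of get_subset: the first `num` rows of every frame.
num = 2
x_sub, s_sub, y_sub = x.iloc[:num], s.iloc[:num], y.iloc[:num]
```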
- class Heaviside#
Bases:
ethicml.utility.activation.Activation
Decision function that accepts predictions with a score of 50% or above.
- apply(soft_output)#
Apply the decision function to each element of an ndarray.
- Parameters
soft_output (numpy.ndarray) –
- Return type
numpy.ndarray
- get_name()#
Getter for name of decision function.
- Return type
str
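The rule described above ("50% or above") can be sketched with NumPy alone. This is an illustration of the thresholding idea, assuming the inputs are probabilities in [0, 1]; `heaviside_sketch` is a hypothetical helper, not the class's actual implementation.

```python
import numpy as np


def heaviside_sketch(soft_output: np.ndarray) -> np.ndarray:
    """Return 1 where the score is at least 0.5, otherwise 0."""
    return np.where(soft_output >= 0.5, 1, 0)


scores = np.array([0.1, 0.5, 0.93])
hard = heaviside_sketch(scores)
```

Note that a score of exactly 0.5 is accepted under this reading of "or above".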
- class Prediction(hard, info=None)#
Bases:
object
Prediction of an algorithm.
Make a prediction object.
- Parameters
hard (pd.Series) –
info (Optional[Dict[str, float]]) –
- __len__()#
Length of the predictions object.
- Return type
int
- static from_npz(npz_path)#
Load prediction from npz file.
- Parameters
npz_path (pathlib.Path) –
- Return type
ethicml.utility.data_structures.Prediction
- property hard: pd.Series#
Hard predictions (e.g. 0 and 1).
- property info: Dict[str, float]#
Additional info about the prediction.
- to_npz(npz_path)#
Save prediction as npz file.
- Parameters
npz_path (pathlib.Path) –
- Return type
None
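The to_npz and from_npz pair suggests a save-and-reload round trip. The sketch below shows one way such a round trip could look with `numpy.savez` and `numpy.load`; the archive layout (a `hard` array plus one scalar array per info entry) is an assumption made for this example and may differ from what the class actually writes.

```python
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd

hard = pd.Series([0, 1, 1, 0], name="hard")
info = {"runtime": 0.25}  # hypothetical extra information

with tempfile.TemporaryDirectory() as tmp:
    npz_path = Path(tmp) / "prediction.npz"
    # Store the predictions plus each info entry as a separate array.
    np.savez(npz_path, hard=hard.to_numpy(), **{k: np.array(v) for k, v in info.items()})
    # Load everything back before the temporary directory is removed.
    with np.load(npz_path) as data:
        restored = pd.Series(data["hard"], name="hard")
        restored_info = {k: float(data[k]) for k in data.files if k != "hard"}
```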
- class ResultsAggregator(initial=None)#
Bases:
object
Aggregate results.
Initialise the results aggregator object.
- Parameters
initial (Optional[pd.DataFrame]) –
- append_df(data_frame, prepend=False)#
Append (or prepend) a DataFrame to this object.
- Parameters
data_frame (pandas.DataFrame) –
prepend (bool) –
- Return type
None
- append_from_csv(csv_file, prepend=False)#
Append results from a CSV file.
- Parameters
csv_file (pathlib.Path) –
prepend (bool) –
- Return type
bool
- property results: ethicml.utility.data_structures.Results#
Results object over which this class is aggregating.
- save_as_csv(file_path)#
Save the results to a CSV file.
- Parameters
file_path (pathlib.Path) –
- Return type
None
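The difference between appending and prepending can be sketched with `pandas.concat`. Unlike append_df, which returns None and so presumably updates the aggregator in place, the hypothetical helper below returns a new frame; the column names are invented for the example.

```python
import pandas as pd

first = pd.DataFrame({"model": ["SVM"], "Accuracy": [0.81]})
second = pd.DataFrame({"model": ["LR"], "Accuracy": [0.79]})


def append_df_sketch(current: pd.DataFrame, new: pd.DataFrame, prepend: bool = False) -> pd.DataFrame:
    """Return the combined frame, with `new` placed first if prepend is True."""
    order = [new, current] if prepend else [current, new]
    return pd.concat(order, axis="index", ignore_index=True)


appended = append_df_sketch(first, second)
prepended = append_df_sketch(first, second, prepend=True)
```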
- class SoftPrediction(soft, info=None)#
Bases:
ethicml.utility.data_structures.Prediction
Prediction of an algorithm that makes soft predictions.
Make a soft prediction object.
- Parameters
soft (pd.Series) –
info (Optional[Dict[str, float]]) –
- __len__()#
Length of the predictions object.
- Return type
int
- static from_npz(npz_path)#
Load prediction from npz file.
- Parameters
npz_path (pathlib.Path) –
- Return type
ethicml.utility.data_structures.Prediction
- property hard: pd.Series#
Hard predictions (e.g. 0 and 1).
- property info: Dict[str, float]#
Additional info about the prediction.
- property soft: pd.Series#
Soft predictions (e.g. 0.2 and 0.8).
- to_npz(npz_path)#
Save prediction as npz file.
- Parameters
npz_path (pathlib.Path) –
- Return type
None
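The class exposes both soft and hard predictions, but this page does not say how the hard values are obtained. One plausible reading, sketched below as an assumption, is a 0.5 threshold on the soft scores.

```python
import pandas as pd

soft = pd.Series([0.2, 0.8, 0.5])
# Assumed rule: scores of 0.5 or above become the positive class.
hard = soft.ge(0.5).astype(int)
```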
- class TestTuple(x, s, name=None)#
Bases:
object
A tuple of dataframes for the features and the sensitive attribute.
Make a TestTuple.
- Parameters
x (pd.DataFrame) –
s (pd.DataFrame) –
name (Optional[str]) –
- classmethod from_npz(data_path)#
Load test tuple from npz file.
- Parameters
data_path (pathlib.Path) –
- Return type
ethicml.utility.data_structures.TestTuple
- property name: Optional[str]#
Getter for name property.
- replace(*, x=None, s=None, name=None)#
Create a copy of the TestTuple but change the given values.
- Parameters
x (Optional[pd.DataFrame]) –
s (Optional[pd.DataFrame]) –
name (Optional[str]) –
- Return type
ethicml.utility.data_structures.TestTuple
- property s: pandas.DataFrame#
Getter for property s.
- to_npz(data_path)#
Save TestTuple as an npz file.
- Parameters
data_path (pathlib.Path) –
- Return type
None
- property x: pandas.DataFrame#
Getter for property x.
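The copy-with-changes behaviour of replace resembles `dataclasses.replace` from the standard library. The sketch below uses a hypothetical frozen dataclass, `FeatureTupleSketch`, as a stand-in; it only illustrates the idea that the original object stays untouched while the copy carries the new value.

```python
from dataclasses import dataclass, replace
from typing import Optional

import pandas as pd


@dataclass(frozen=True)
class FeatureTupleSketch:
    """Hypothetical stand-in holding features and sensitive attributes."""

    x: pd.DataFrame
    s: pd.DataFrame
    name: Optional[str] = None


original = FeatureTupleSketch(
    x=pd.DataFrame({"age": [25, 40]}),
    s=pd.DataFrame({"sex": [0, 1]}),
    name="toy",
)
# Only the name changes; x and s are carried over unchanged.
renamed = replace(original, name="toy-renamed")
```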
- class TrainTestPair(train, test)#
Bases:
NamedTuple
2-Tuple of train and test data.
Create new instance of TrainTestPair(train, test)
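- Parameters
train (ethicml.utility.data_structures.DataTuple) –
test (ethicml.utility.data_structures.TestTuple) –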
- __len__()#
Return len(self).
- count(value, /)#
Return number of occurrences of value.
- index(value, start=0, stop=9223372036854775807, /)#
Return first index of value.
Raises ValueError if the value is not present.
- test: ethicml.utility.data_structures.TestTuple#
Alias for field number 1
- train: ethicml.utility.data_structures.DataTuple#
Alias for field number 0
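Since the class is a NamedTuple, it supports both attribute access and ordinary tuple operations. The sketch below uses a hypothetical `PairSketch` with strings standing in for the DataTuple and TestTuple fields.

```python
from typing import NamedTuple


class PairSketch(NamedTuple):
    """Hypothetical stand-in with the same field order as TrainTestPair."""

    train: str  # stands in for a DataTuple
    test: str  # stands in for a TestTuple


pair = PairSketch(train="train-data", test="test-data")
# Tuple unpacking follows the field order: train first, then test.
train, test = pair
```

Field 0 is `train` and field 1 is `test`, so `pair[0]` and `pair.train` refer to the same object.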
- aggregate_results(results, metrics, aggregator=('mean', 'std'))#
Aggregate results over the repeats.
- Parameters
results (ethicml.utility.data_structures.Results) –
metrics (List[str]) –
aggregator (Union[str, Tuple[str, ...]]) –
- Return type
pandas.DataFrame
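Aggregating over repeats can be sketched as a pandas group-by over every index level except the one that identifies the repeat. The index names below follow the `dataset`, `scaler`, `transform` and `model` levels mentioned on this page; the fifth level name, `split_id`, and the metric values are assumptions for illustration.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [
        ("toy", "None", "no_transform", "SVM", "0"),
        ("toy", "None", "no_transform", "SVM", "1"),
    ],
    names=["dataset", "scaler", "transform", "model", "split_id"],
)
results = pd.DataFrame({"Accuracy": [0.8, 0.9]}, index=index)

# Group by everything except the repeat identifier, then aggregate each metric.
group_levels = ["dataset", "scaler", "transform", "model"]
aggregated = results.groupby(level=group_levels)[["Accuracy"]].agg(["mean", "std"])
```

The result has one row per (dataset, scaler, transform, model) combination and one column per (metric, aggregator) pair.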
- concat_dt(datatup_list, axis='index', ignore_index=False)#
Concatenate the data tuples in the given list.
- Parameters
datatup_list (Sequence[ethicml.utility.data_structures.DataTuple]) –
axis (Literal['columns', 'index']) –
ignore_index (bool) –
- Return type
ethicml.utility.data_structures.DataTuple
- concat_tt(datatup_list, axis='index', ignore_index=False)#
Concatenate the test tuples in the given list.
- Parameters
datatup_list (List[ethicml.utility.data_structures.TestTuple]) –
axis (Literal['columns', 'index']) –
ignore_index (bool) –
- Return type
ethicml.utility.data_structures.TestTuple
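The `axis` and `ignore_index` parameters of the two concatenation functions carry the same names as those of `pandas.concat`. Assuming they are passed through to it for each dataframe in the tuple, their effect on a single frame looks like this:

```python
import pandas as pd

x_a = pd.DataFrame({"age": [25, 40]})
x_b = pd.DataFrame({"age": [31]})

# axis="index" stacks samples; each frame keeps its own index values.
stacked = pd.concat([x_a, x_b], axis="index")
# ignore_index=True renumbers the rows from 0.
renumbered = pd.concat([x_a, x_b], axis="index", ignore_index=True)
```

With the default `ignore_index=False`, duplicate index values can appear in the result, which matters if later code looks rows up by index.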
- filter_and_map_results(results, mapping)#
Filter entries and change the index with a mapping.
- Parameters
results (ethicml.utility.data_structures.Results) –
mapping (Mapping[str, str]) –
- Return type
ethicml.utility.data_structures.Results
- filter_results(results, values, index='model')#
Filter the entries based on the given values.
- Parameters
results (ethicml.utility.data_structures.Results) –
values (Iterable) –
index (Literal['dataset', 'scaler', 'transform', 'model']) –
- Return type
ethicml.utility.data_structures.Results
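Both filtering functions above operate on the index of a Results object. The sketch below imitates them on a plain DataFrame with a MultiIndex, assuming exact matching on one index level; the real functions may match differently (for example by substring), and the model names and scores are made up.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [
        ("toy", "None", "no_transform", "SVM"),
        ("toy", "None", "no_transform", "LR"),
        ("toy", "None", "no_transform", "Majority"),
    ],
    names=["dataset", "scaler", "transform", "model"],
)
results = pd.DataFrame({"Accuracy": [0.81, 0.79, 0.6]}, index=index)

# Filter: keep rows whose "model" level is one of the given values.
wanted = ["SVM", "LR"]
filtered = results[results.index.get_level_values("model").isin(wanted)]

# Filter and map: keep only the mapped models, then rename them.
mapping = {"SVM": "Support Vector Machine"}
kept = results[results.index.get_level_values("model").isin(list(mapping))]
mapped = kept.rename(index=mapping, level="model")
```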
- make_results(data_frame=None)#
Initialise Results object.
You should always use this function instead of using the “constructor” directly, because this function checks whether the columns are correct.
- Parameters
data_frame (Union[None, pd.DataFrame, Path]) –
- Return type
Results
- map_over_results_index(results, mapper)#
Change the values of the index with a transformation function.
- Parameters
results (ethicml.utility.data_structures.Results) –
mapper (Callable[[Tuple[str, str, str, str, str]], Tuple[str, str, str, str, str]]) –
- Return type
ethicml.utility.data_structures.Results
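The mapper receives each index entry as a 5-tuple of strings and returns a new 5-tuple. A minimal sketch on a plain DataFrame follows; the level names (in particular `split_id`) and the upper-casing transformation are assumptions chosen for the example.

```python
import pandas as pd

names = ["dataset", "scaler", "transform", "model", "split_id"]
index = pd.MultiIndex.from_tuples(
    [("toy", "None", "no_transform", "SVM", "0")], names=names
)
results = pd.DataFrame({"Accuracy": [0.81]}, index=index)


def mapper(key):
    """Hypothetical transformation: upper-case the dataset name."""
    dataset, scaler, transform, model, split_id = key
    return (dataset.upper(), scaler, transform, model, split_id)


# Apply the mapper to every index tuple and rebuild the index on a copy.
remapped = results.copy()
remapped.index = pd.MultiIndex.from_tuples(
    [mapper(key) for key in results.index], names=names
)
```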
- shuffle_df(df, random_state)#
Shuffle a given dataframe.
- Parameters
df (pandas.DataFrame) –
random_state (int) –
- Return type
pandas.DataFrame
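A common way to shuffle a dataframe reproducibly is to sample all rows with a fixed seed. The sketch below takes that approach; whether shuffle_df also resets the index, as done here, is an assumption.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3, 4, 5]})


def shuffle_sketch(frame: pd.DataFrame, random_state: int) -> pd.DataFrame:
    """Return all rows in a seeded random order with a fresh index."""
    return frame.sample(frac=1.0, random_state=random_state).reset_index(drop=True)


once = shuffle_sketch(df, random_state=42)
again = shuffle_sketch(df, random_state=42)
```

Using the same `random_state` twice gives the same row order, which keeps experiments repeatable.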
- undo_one_hot(df, new_column_name=None)#
Undo one-hot encoding.
- Parameters
df (pandas.DataFrame) –
new_column_name (Optional[str]) –
- Return type
Union[pandas.Series, pandas.DataFrame]
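Undoing a one-hot encoding amounts to finding, for each row, the column that holds the 1. The sketch below does this with `DataFrame.idxmax` and returns a Series or a single-column DataFrame depending on whether a column name is given, which is consistent with the documented return type; the helper name and the example columns are hypothetical.

```python
from typing import Optional, Union

import pandas as pd

one_hot = pd.DataFrame(
    {"colour_red": [1, 0, 0], "colour_blue": [0, 1, 0], "colour_green": [0, 0, 1]}
)


def undo_one_hot_sketch(
    df: pd.DataFrame, new_column_name: Optional[str] = None
) -> Union[pd.Series, pd.DataFrame]:
    """Map each row to the name of the column holding its largest value."""
    labels = df.idxmax(axis="columns")
    if new_column_name is None:
        return labels
    return labels.to_frame(name=new_column_name)


as_series = undo_one_hot_sketch(one_hot)
as_frame = undo_one_hot_sketch(one_hot, new_column_name="colour")
```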