ranzen.torch

Classes:

ApproxStratBatchSampler

Approximate Stratified Batch Sampler.

BatchSamplerBase

CosineLRWithLinearWarmup

Sets the learning rate of each parameter group to follow a linear warmup schedule between lr_start and base_lr followed by a cosine annealing schedule between base_lr and lr_min.

CosineWarmup

CrossEntropyLoss

This criterion computes the cross entropy loss between input and target.

DcModule

Event

Emulates torch.cuda.Event, but supports running on a CPU too.

ExponentialWarmup

GreedyCoreSetSampler

Constructs batches from 'oversampled' batches through greedy core-set approximation.

LinearWarmup

LinearWarmupLR

Applies a linear warmup schedule to the learning rate, increasing/decreasing it by a fixed step-size from lr_start to the base value over the specified number of warmup steps.

ReductionType

An enum for the type of reduction to apply to a batch of losses.

Scheduler

SequentialBatchSampler

Infinitely samples elements sequentially, always in the same order.

SizedDataset

StratifiedBatchSampler

Samples an equal proportion of elements from [0, ..., len(group_ids)-1].

Subset

Subset of a dataset at specified indices.

TrainTestSplit

TrainingMode

An enum for the training mode.

WarmupScheduler

WeightedBatchSampler

Implements a batch-sampler version of torch.utils.data.WeightedRandomSampler.

Functions:

batched_randint

Batched version of torch.randint().

batchwise_pdist

Compute pair-wise distance in batches.

count_parameters

Count all parameters (those requiring gradients) in the given model.

cross_entropy_loss

This criterion computes the cross entropy loss between input and target.

inf_generator

Get DataLoaders in a single infinite loop.

prop_random_split

Splits a dataset based on proportions rather than on absolute sizes.

stratified_split_indices

Splits the data into train/test sets conditional on super- and sub-class labels.

to_item

Safely extracts the (int, float or bool) value from a single-element (scalar) tensor.

to_numpy

Safely casts a tensor to a numpy ndarray of dtype dtype if dtype is specified.

torch_eps

Retrieves the epsilon value (the smallest representable number such that 1.0 + eps != 1.0) for the given dtype or the dtype of the given tensor.

class ApproxStratBatchSampler(class_labels, subgroup_labels, *, num_samples_per_group=None, num_samples_per_class=None, training_mode=TrainingMode.step, generator=None)

Bases: BatchSamplerBase

Approximate Stratified Batch Sampler.

Essentially, we’re doing: \(x\sim P(x|s,y)\) where \(s\sim \text{uniform}(S|y)\). That is, we iterate over all classes y and uniformly sample a subgroup s, and then we sample a datapoint from that s-y combination.

You have to either specify num_samples_per_group or num_samples_per_class (but not both).

If num_samples_per_group is given, this faithfully implements the \(\pi\) function. This means that for those classes which have “full s-support” (all subgroups are present), we don’t sample a subgroup but iterate over each subgroup one-by-one. We take num_samples_per_group samples from each s-y combination.

On the other hand, if num_samples_per_class is given, then classes with full s-support are not given special treatment. We always sample as many subgroups as are specified in num_samples_per_class and then take a single datapoint from each of these s-y combinations.

Parameters:
  • class_labels (Sequence[int]) – List-like object with the class labels.

  • subgroup_labels (Sequence[int]) – List-like object with the subgroup labels.

  • num_samples_per_group (int | None) – How many samples to take per s-y group. Cannot be specified together with num_samples_per_class.

  • num_samples_per_class (int | None) – How many samples to take per y class. Cannot be specified together with num_samples_per_group.

  • training_mode (TrainingMode) – Iteration-based vs epoch-based.

  • generator (Generator | None) – Torch generator for random numbers.

Raises:

ValueError – If not exactly one of num_samples_per_group and num_samples_per_class is specified.
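Example (a minimal sketch; the labels, train_data, and the DataLoader wiring below are illustrative assumptions, not part of this class):
>>> class_labels = [0, 0, 0, 0, 1, 1, 1, 1]
>>> subgroup_labels = [0, 0, 1, 1, 0, 0, 0, 1]
>>> batch_sampler = ApproxStratBatchSampler(class_labels, subgroup_labels, num_samples_per_class=2)
>>> train_loader = DataLoader(train_data, batch_sampler=batch_sampler)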

class BatchSamplerBase(epoch_length=None)

Bases: Sampler[list[int]]

Parameters:

epoch_length (int | None)

class CosineLRWithLinearWarmup(optimizer, *, warmup_iters, lr_start=0.0, total_iters, lr_min=0.0, last_epoch=-1)

Bases: _LRScheduler

Sets the learning rate of each parameter group to follow a linear warmup schedule between lr_start and base_lr followed by a cosine annealing schedule between base_lr and lr_min.

Parameters:
  • optimizer (Optimizer) – Optimizer whose parameter groups are to be scheduled.

  • warmup_iters (int | float) – Maximum number of iterations for linear warmup. Float values are interpreted as a fraction of total_iters.

  • lr_start (float) – Learning rate at the beginning of linear warmup.

  • total_iters (int) – Total number of iterations.

  • lr_min (float) – Minimum learning rate permitted with cosine annealing.

  • last_epoch (int) – The index of the last epoch.

Raises:

AttributeError – If warmup_iters is a float and not in the range [0, 1].

get_lr()

Get the learning rate of each parameter group.

Returns:

The learning rate for each parameter group in the optimizer.

Return type:

list[float]

property scheduler: LinearWarmupLR | CosineAnnealingLR

The scheduler currently in use, as determined by the current step. If the current step exceeds the number of warmup iterations, then the cosine scheduler will be returned, else the linear-warmup scheduler will be.

Returns:

The learning-rate scheduler currently in use.

step(epoch=None)

Update the learning rates using the currently-used scheduler.

Parameters:

epoch (int | None)

Return type:

None
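
Example (a minimal sketch; model and the surrounding training loop are assumptions): warmup_iters=0.1 is interpreted as a fraction of total_iters, i.e. 100 warmup steps here.
>>> optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
>>> scheduler = CosineLRWithLinearWarmup(optimizer, warmup_iters=0.1, total_iters=1000)
>>> for _ in range(1000):
>>>     optimizer.step()
>>>     scheduler.step()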

class CosineWarmup(start_val: T, end_val: T, warmup_steps: int)

Bases: WarmupScheduler[T]

Parameters:
  • start_val (T)

  • end_val (T)

  • warmup_steps (int)

class CrossEntropyLoss(*, class_weight=None, ignore_index=-100, reduction=ReductionType.mean, label_smoothing=0.0)

Bases: Module

This criterion computes the cross entropy loss between input and target.

It is useful when training a classification problem with C classes. If provided, the optional argument weight should be a 1D Tensor assigning weight to each of the classes. This is particularly useful when you have an unbalanced training set.

The input is expected to contain raw, unnormalized scores for each class. input has to be a Tensor of size \((C)\) for unbatched input, \((minibatch, C)\) or \((minibatch, C, d_1, d_2, ..., d_K)\) with \(K \geq 1\) for the K-dimensional case. The last being useful for higher dimension inputs, such as computing cross entropy loss per-pixel for 2D images.

The target that this criterion expects should contain either:

  • Class indices in the range \([0, C)\) where \(C\) is the number of classes; if ignore_index is specified, this loss also accepts this class index (this index may not necessarily be in the class range). The unreduced (i.e. with reduction set to 'none') loss for this case can be described as:

    \[\ell(x, y) = L = \{l_1,\dots,l_N\}^\top, \quad l_n = - w_{y_n} \log \frac{\exp(x_{n,y_n})}{\sum_{c=1}^C \exp(x_{n,c})} \cdot \mathbb{1}\{y_n \not= \text{ignore\_index}\}\]

    where \(x\) is the input, \(y\) is the target, \(w\) is the weight, \(C\) is the number of classes, and \(N\) spans the minibatch dimension as well as \(d_1, ..., d_k\) for the K-dimensional case. If reduction is not 'none' (default 'mean'), then

    \[\begin{split}\ell(x, y) = \begin{cases} \sum_{n=1}^N \frac{1}{\sum_{n=1}^N w_{y_n} \cdot \mathbb{1}\{y_n \not= \text{ignore\_index}\}} l_n, & \text{if reduction} = \text{`mean';}\\ \sum_{n=1}^N l_n, & \text{if reduction} = \text{`sum'.} \end{cases}\end{split}\]

    Note that this case is equivalent to the combination of torch.nn.LogSoftmax and torch.nn.NLLLoss.

  • Probabilities for each class; useful when labels beyond a single class per minibatch item are required, such as for blended labels, label smoothing, etc. The unreduced (i.e. with reduction set to 'none') loss for this case can be described as:

    \[\ell(x, y) = L = \{l_1,\dots,l_N\}^\top, \quad l_n = - \sum_{c=1}^C w_c \log \frac{\exp(x_{n,c})}{\sum_{i=1}^C \exp(x_{n,i})} y_{n,c}\]

    where \(x\) is the input, \(y\) is the target, \(w\) is the weight, \(C\) is the number of classes, and \(N\) spans the minibatch dimension as well as \(d_1, ..., d_k\) for the K-dimensional case. If reduction is not 'none' (default 'mean'), then

    \[\begin{split}\ell(x, y) = \begin{cases} \frac{\sum_{n=1}^N l_n}{N}, & \text{if reduction} = \text{`mean';}\\ \sum_{n=1}^N l_n, & \text{if reduction} = \text{`sum'.} \end{cases}\end{split}\]

Note

The performance of this criterion is generally better when target contains class indices, as this allows for optimized computation. Consider providing target as class probabilities only when a single class label per minibatch item is too restrictive.

Parameters:
  • class_weight (Tensor | None) – A manual rescaling weight given to each class. If given, has to be a Tensor of size C.

  • ignore_index (int) – Specifies a target value that is ignored and does not contribute to the input gradient. Note that ignore_index is only applicable when the target contains class indices.

  • reduction (ReductionType | str) – Specifies the reduction to apply to the output.

  • label_smoothing (float) – A float in [0.0, 1.0]. Specifies the amount of smoothing when computing the loss, where 0.0 means no smoothing. The targets become a mixture of the original ground truth and a uniform distribution as described in Rethinking the Inception Architecture for Computer Vision. Default: \(0.0\).

Example:
>>> # Example of target with class indices
>>> loss = CrossEntropyLoss()
>>> input = torch.randn(3, 5, requires_grad=True)
>>> target = torch.empty(3, dtype=torch.long).random_(5)
>>> output = loss(input, target)
>>> output.backward()
>>>
>>> # Example of target with class probabilities
>>> input = torch.randn(3, 5, requires_grad=True)
>>> target = torch.randn(3, 5).softmax(dim=1)
>>> output = loss(input, target)
>>> output.backward()
forward(input, *, target, instance_weight=None, reduction=None)

Computes the cross entropy loss between input and target.

Parameters:
  • input (Tensor) – Predicted unnormalized scores (often referred to as logits).

  • target (Tensor) – Ground truth class indices or class probabilities.

  • instance_weight (Tensor | None) – A manual rescaling weight given to each sample. If given, it has to be a Tensor of size N.

  • reduction (ReductionType | str | None) – Overrides reduction.

Returns:

The (reduced) cross-entropy between input and target.

Return type:

Tensor

class DcModule(*args: Any, **kwargs: Any)

Bases: Module

Parameters:
  • args (Any)

  • kwargs (Any)

Return type:

Self

class Event

Bases: object

Emulates torch.cuda.Event, but supports running on a CPU too.

Example:
>>> from ranzen.torch import Event
>>> with Event() as event:
>>>     y = some_nn_module(x)
>>> print(event.time)
class ExponentialWarmup(start_val: T, end_val: T, warmup_steps: int)

Bases: WarmupScheduler[T]

Parameters:
  • start_val (T)

  • end_val (T)

  • warmup_steps (int)

class GreedyCoreSetSampler(embeddings, *, batch_size, oversampling_factor, generator=None)

Bases: BatchSamplerBase

Constructs batches from ‘oversampled’ batches through greedy core-set approximation.

Said approximation takes the form of the furthest-first traversal (FFT) algorithm.

Parameters:
  • embeddings (Tensor) – Embedded dataset from which to sample the core-sets; the order of the embeddings, v, must match the order of the dataset (i.e. f(x_i) = v_i for embedding function f and inputs x).

  • batch_size (int) – Budget for the core-set.

  • oversampling_factor (int) – How many times larger than the budget the batch to be sampled from should be.

  • generator (Generator | None) – Pseudo-random-number generator to use for shuffling the dataset.
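
Example (a minimal sketch; embed_fn, train_data, and the DataLoader wiring are assumptions, not part of this class):
>>> embeddings = embed_fn(train_data)  # (N, D) tensor, ordered like the dataset
>>> batch_sampler = GreedyCoreSetSampler(embeddings, batch_size=32, oversampling_factor=4)
>>> train_loader = DataLoader(train_data, batch_sampler=batch_sampler)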

class LinearWarmup(start_val: T, end_val: T, warmup_steps: int)

Bases: WarmupScheduler[T]

Parameters:
  • start_val (T)

  • end_val (T)

  • warmup_steps (int)

class LinearWarmupLR(optimizer, *, warmup_iters, lr_start=0, last_epoch=-1)

Bases: _LRScheduler

Applies a linear warmup schedule to the learning rate, increasing/decreasing it by a fixed step-size from lr_start to the base value over the specified number of warmup steps.

Parameters:
  • optimizer (Optimizer)

  • warmup_iters (int)

  • lr_start (float)

  • last_epoch (int)

get_lr()

Get the learning rate of each parameter group.

Returns:

The learning rate for each parameter group in the optimizer.

Return type:

list[float]

class ReductionType(value)

Bases: Enum

An enum for the type of reduction to apply to a batch of losses.

batch_mean = 4

compute the mean over the batch (first) dimension, the sum over the remaining dimensions.

mean = 1

compute the mean over all dimensions.

none = 2

no reduction.

sum = 3

compute the sum over all dimensions.
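
As a small sketch of how batch_mean differs from mean for an unreduced loss tensor of shape (batch, d): per the definitions above, batch_mean sums over the non-batch dimensions and then averages over the batch dimension.
>>> losses = torch.ones(4, 3)   # a batch of 4 unreduced losses, each of size 3
>>> losses.sum(dim=1).mean()    # batch_mean-style reduction
tensor(3.)
>>> losses.mean()               # mean-style reduction
tensor(1.)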

class Scheduler(start_val: T)

Bases: Generic[T]

Parameters:

start_val (T)

step()

Update the scheduled value.

Return type:

None

class SequentialBatchSampler(data_source, *, batch_size, training_mode=TrainingMode.step, shuffle=True, drop_last=False, generator=None)

Bases: BatchSamplerBase

Infinitely samples elements sequentially, always in the same order.

This is useful for enabling iteration-based training. Note that unlike torch’s SequentialSampler which is an ordinary sampler that yields independent sample indexes, this is a BatchSampler, requiring slightly different treatment when used with a DataLoader.

Parameters:
  • data_source (Sized) – Object of the same size as the data to be sampled from.

  • batch_size (int) – How many samples per batch to load.

  • training_mode (TrainingMode | str) – The training mode to use (epoch vs. step).

  • shuffle (bool) – Set to True to have the data reshuffled at every epoch.

  • drop_last (bool) – Set to True to drop the last incomplete batch.

  • generator (Generator | None) – Pseudo-random-number generator to use for shuffling the dataset.

Example:
>>> batch_sampler = SequentialBatchSampler(data_source=train_data, batch_size=100, shuffle=True)
>>> train_loader = DataLoader(train_data, batch_sampler=batch_sampler, shuffle=False, drop_last=False) # drop_last and shuffle need to be False
>>> train_loader_iter = iter(train_loader)
>>> for _ in range(train_iters):
>>>     batch = next(train_loader_iter)
class SizedDataset(*args, **kwargs)

Bases: Protocol[T_co]

class StratifiedBatchSampler(group_ids, *, num_samples_per_group, multipliers=None, base_sampler=BaseSampler.sequential, training_mode=TrainingMode.step, replacement=True, shuffle=False, drop_last=True, generator=None)

Bases: BatchSamplerBase

Samples an equal proportion of elements from [0, ..., len(group_ids)-1].

To drop certain groups, set their multiplier to 0.

Parameters:
  • group_ids (Sequence[int]) – A sequence of group IDs, not necessarily contiguous.

  • num_samples_per_group (int) – Number of samples to draw per group. Note that if a multiplier is > 1 then effectively more samples will be drawn for that group.

  • multipliers (dict[int, int] | None) – An optional dictionary that maps group IDs to multipliers. If a group's multiplier is n > 1, that group will be sampled at n times the rate of the other groups. If a multiplier is 0, the group will be skipped.

  • base_sampler (BaseSampler | str) – The base sampling strategy to use (sequential vs. random).

  • training_mode (TrainingMode | str) – The training mode to use (epoch vs. step).

  • replacement (bool) – if True, samples are drawn with replacement. If not, they are drawn without replacement, which means that when a sample index is drawn for a row, it cannot be drawn again for that row.

  • shuffle (bool) – Whether to shuffle the subsets of the data after each pass (only applicable when the base_sampler is set to sequential).

  • drop_last (bool) – Set to True to drop the last (on a per-group basis) incomplete batch.

  • generator (Generator | None) – Pseudo-random-number generator to use for shuffling the dataset.

Raises:

ValueError – If num_samples_per_group is non-positive, if replacement is not a bool, or if there are not enough samples in a group to sample num_samples_per_group.

Example:
>>> list(StratifiedBatchSampler([0, 0, 0, 0, 1, 1, 2], num_samples_per_group=10, replacement=True))
[3, 5, 6, 3, 5, 6, 0, 5, 6]
>>> list(StratifiedBatchSampler([0, 0, 0, 0, 1, 1, 2], num_samples_per_group=10, replacement=True, multipliers={2: 2}))
[3, 4, 6, 6, 3, 5, 6, 6, 1, 5, 6, 6]
>>> list(StratifiedBatchSampler([0, 0, 0, 0, 1, 1, 1, 2, 2], num_samples_per_group=7, replacement=False))
[2, 6, 7, 0, 5, 8]
class Subset(dataset, indices)

Bases: Generic[D]

Subset of a dataset at specified indices.

Parameters:
  • dataset (D) – The whole Dataset.

  • indices (Sequence[int]) – Indices in the whole set selected for subset.
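
Example (a minimal sketch; dataset is an assumption):
>>> first_half = Subset(dataset, indices=list(range(len(dataset) // 2)))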

class TrainTestSplit(train: _S, test: _S)

Bases: Generic[_S]

Parameters:
  • train (_S)

  • test (_S)

class TrainingMode(value)

Bases: Enum

An enum for the training mode.

epoch = 1

epoch-based training

step = 2

step-based training

class WarmupScheduler(start_val: T, end_val: T, warmup_steps: int)

Bases: Scheduler[T]

Parameters:
  • start_val (T)

  • end_val (T)

  • warmup_steps (int)

step()

Update the scheduled value.

Return type:

None
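
The concrete warmup schedulers above (LinearWarmup, CosineWarmup, ExponentialWarmup) share this interface; they presumably move the scheduled value from start_val to end_val over warmup_steps calls to step(). A minimal sketch of stepping one of them (how the scheduled value is consumed, e.g. as a loss weight, is left as an assumption):
>>> warmup = LinearWarmup(start_val=0.0, end_val=1.0, warmup_steps=100)
>>> for _ in range(100):
>>>     warmup.step()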

class WeightedBatchSampler(weights, *, batch_size, replacement=True, generator=None)

Bases: BatchSamplerBase

Implements a batch-sampler version of torch.utils.data.WeightedRandomSampler.

Parameters:
  • weights (Sequence[float] | Tensor) – A sequence or tensor of weights, not necessarily summing to one.

  • batch_size (int) – Number of samples to draw per batch/iteration.

  • replacement (bool) – If True, samples are drawn with replacement. If not, they are drawn without replacement, which means that when a sample index is drawn for a row, it cannot be drawn again for that row.

  • generator (Generator | None) – Pseudo-random-number generator to use for randomly sampling indexes.

Raises:

ValueError – If batch_size is non-positive or is greater than the number of weights when replacement=False.

classmethod from_labels(labels, *, batch_size, replacement=True, generator=None)

Instantiate a WeightedBatchSampler from a sequence or tensor of ints, with the weights computed using the inverse frequencies of the values in labels.

Parameters:
  • labels (Sequence[int] | Tensor) – Labels from which to compute the sample weights; should be of length equal to the size of the associated dataset being indexed.

  • batch_size (int) – Number of samples to draw per batch/iteration.

  • replacement (bool) – If True, samples are drawn with replacement. If not, they are drawn without replacement, which means that when a sample index is drawn for a row, it cannot be drawn again for that row.

  • generator (Generator | None) – Pseudo-random-number generator to use for randomly sampling indexes.

Returns:

A WeightedBatchSampler instance with weights computed using the inverse frequencies of the values in labels.

Raises:

ValueError – If labels is a tensor and does not have dtype torch.long.

Return type:

Self
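
Example (a minimal sketch; dataset and the DataLoader wiring are assumptions, not part of this class):
>>> labels = torch.tensor([0, 0, 0, 0, 1, 1, 2], dtype=torch.long)
>>> batch_sampler = WeightedBatchSampler.from_labels(labels, batch_size=4)
>>> train_loader = DataLoader(dataset, batch_sampler=batch_sampler)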

batched_randint(high, *, size=None, generator=None)

Batched version of torch.randint().

Randomly samples an integer from the domain \([0, h_i]\) for each sample \(h_i \in high\).

torch.randint() requires high to be an integer and thus prohibits different samples within a batch from having different sampling domains, something which is necessary in order to vectorise, for example, sampling from groups of different sizes or sampling objects with different offsets. This function addresses that limitation using inverse transform sampling.

Parameters:
  • high (Tensor) – A batch of tensors encoding the maximum integer value the corresponding random samples may take.

  • size (int | Sequence[int] | None) – An integer or sequence of integers defining the shape of the output tensor for each upper bound specified in high. The overall size of the sampled tensor will be size(high) + size. If None, the output size is simply size(high).

  • generator (Generator | None) – Pseudo-random-number generator to use for sampling.

Returns:

A tensor of random-sampled integers upper-bounded by the values in high.

Return type:

Tensor
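
Example (a minimal sketch; the output shapes follow the description of size above):
>>> high = torch.tensor([3, 7, 20])
>>> batched_randint(high).shape
torch.Size([3])
>>> batched_randint(high, size=2).shape
torch.Size([3, 2])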

batchwise_pdist(x, chunk_size=1000, p_norm=2.0)

Compute pair-wise distance in batches.

This is sometimes necessary because if you compute pdist directly, it doesn’t fit into memory.

Parameters:
  • x (Tensor) – Tensor of shape (N, F) where F is the number of features.

  • chunk_size (int) – Size of the chunks that are used to compute the pair-wise distance. Larger chunk size is probably faster, but may not fit into memory.

  • p_norm (float) – Which norm to use for the distance. Euclidean norm by default.

Returns:

All pair-wise distances in a tensor of shape (N, N).

Return type:

Tensor
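
Example (a minimal sketch): computing the full pair-wise distance matrix for 10,000 points in chunks of 1,000.
>>> x = torch.randn(10_000, 128)
>>> dists = batchwise_pdist(x, chunk_size=1000)
>>> dists.shape
torch.Size([10000, 10000])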

count_parameters(model)

Count all parameters (those requiring gradients) in the given model.

Parameters:

model (Module)

Return type:

int
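
A minimal sketch, assuming the count is over individual parameter elements with requires_grad=True (the usual convention): a torch.nn.Linear(10, 1) layer would then give 10 weight elements plus 1 bias element.
>>> count_parameters(torch.nn.Linear(10, 1))
11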

cross_entropy_loss(input, *, target, instance_weight=None, reduction=ReductionType.mean, ignore_index=-100, class_weight=None, label_smoothing=0.0)

This criterion computes the cross entropy loss between input and target.

See CrossEntropyLoss for details.

Parameters:
  • input (Tensor) – Predicted unnormalized scores (often referred to as logits).

  • target (Tensor) – Ground truth class indices or class probabilities.

  • instance_weight (Tensor | None) – A manual rescaling weight given to each sample. If given, it has to be a Tensor of size N.

  • reduction (ReductionType | str) – Specifies the reduction to apply to the output.

  • ignore_index (int) – Specifies a target value that is ignored and does not contribute to the input gradient. Note that ignore_index is only applicable when the target contains class indices.

  • class_weight (Tensor | None) – A manual rescaling weight given to each class. If given, has to be a Tensor of size C.

  • label_smoothing (float) – A float in [0.0, 1.0]. Specifies the amount of smoothing when computing the loss, where 0.0 means no smoothing. The targets become a mixture of the original ground truth and a uniform distribution as described in Rethinking the Inception Architecture for Computer Vision. Default: \(0.0\).

Returns:

The (reduced) cross-entropy between input and target.

Raises:

ValueError – If ‘input’ and ‘target’ have incompatible sizes.

Example:
>>> # Example of target with class indices
>>> input = torch.randn(3, 5, requires_grad=True)
>>> target = torch.randint(5, (3,), dtype=torch.int64)
>>> loss = F.cross_entropy(input, target)
>>> loss.backward()
>>>
>>> # Example of target with class probabilities
>>> input = torch.randn(3, 5, requires_grad=True)
>>> target = torch.randn(3, 5).softmax(dim=1)
>>> loss = F.cross_entropy(input, target)
>>> loss.backward()
Return type:

Tensor

inf_generator(iterable)

Get DataLoaders in a single infinite loop.

Example:
>>> for i, (x, y) in enumerate(inf_generator(train_loader)):
>>>     ...

Parameters:

iterable (Iterable[T]) – An iterable that will be looped infinitely.

Yields:

Elements from the given iterable.

Raises:

RuntimeError – If the given iterable is empty.

Return type:

Iterator[T]

prop_random_split(dataset_or_size, *, props, as_indices=False, seed=None, reproducible=False)

Splits a dataset based on proportions rather than on absolute sizes.

Parameters:
  • dataset_or_size (D | int) – Dataset or size (length) of the dataset to split.

  • props (Sequence[float] | float) – The fractional size of each subset into which to randomly split the data. Elements must be non-negative and sum to 1 or less; if less, the size of the final split will be computed by complement.

  • as_indices (bool) – If True, the raw indices are returned instead of subsets constructed from them when dataset_or_size is a dataset. When dataset_or_size corresponds to the length of a dataset, this argument has no effect and the function always returns the split indices.

  • seed (int | None) – Seed of the PRNG used for determining the random splits.

  • reproducible (bool) – If True, use a generator which is reproducible across machines, operating systems, and Python versions.

Returns:

Random subsets of the data of the requested proportions.

Raises:

ValueError – If the dataset does not have a __len__ method or sum(props) > 1.

Return type:

list[Subset[D]] | list[list[int]]
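
Example (a minimal sketch; dataset and the seed value are assumptions): an 80/10/10 split of a dataset, and an index-only split given just a length (a single float proportion implies the second split is the complement).
>>> train, val, test = prop_random_split(dataset, props=(0.8, 0.1, 0.1), seed=42)
>>> train_inds, test_inds = prop_random_split(100, props=0.8)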

stratified_split_indices(labels, *, default_train_prop, train_props=None, seed=None)

Splits the data into train/test sets conditional on super- and sub-class labels.

Parameters:
  • labels (Tensor | ndarray[Any, dtype[int64]] | Sequence[int]) – Tensor, array or sequence encoding the label associated with each sample.

  • default_train_prop (float) – Proportion of samples from a given group to sample for the training set, for those y-s combinations not specified in train_props.

  • train_props (dict[int, float] | None) – Proportion of each group to sample for the training set. If None then the function reduces to a simple random split of the data.

  • seed (int | None) – PRNG seed to use for sampling.

Returns:

Train-test split.

Raises:

ValueError – If a value in train_props is not in the range [0, 1] or if a key in train_props is not present in labels.

Return type:

TrainTestSplit[list[int]]
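
Example (a minimal sketch): group 1 is split 80/20, while all other groups fall back to default_train_prop.
>>> labels = [0, 0, 0, 0, 1, 1, 1, 1, 1]
>>> split = stratified_split_indices(labels, default_train_prop=0.5, train_props={1: 0.8}, seed=0)
>>> train_inds, test_inds = split.train, split.test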

to_item(tensor, /)

Safely extracts the (int, float or bool) value from a single-element (scalar) tensor.

Parameters:

tensor (Tensor) – Tensor to extract the item from.

Returns:

The value of the scalar tensor as a built-in int, float, or bool.

Return type:

int | float | bool

to_numpy(tensor, *, dtype=None)

Safely casts a tensor to a numpy ndarray of dtype dtype if dtype is specified.

Parameters:
  • tensor (Tensor) – Tensor to be cast to a ndarray.

  • dtype (DT | None) – dtype of the cast-to ndarray. If None then the dtype will be inherited from tensor.

Returns:

tensor cast to an ndarray of dtype dtype if dtype is specified.

Return type:

ndarray[Any, dtype[DT]] | ndarray[Any, dtype[_ScalarType_co]]

torch_eps(tensor_or_dtype, /)

Retrieves the epsilon value (the smallest representable number such that 1.0 + eps != 1.0) for the given dtype or the dtype of the given tensor.

Parameters:

tensor_or_dtype (Tensor | dtype) – Tensor or torch.dtype instance to retrieve the epsilon value for.

Returns:

Epsilon value for the given dtype if tensor_or_dtype is an instance of torch.dtype else the epsilon value of the dtype of the given tensor.

Return type:

float
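
Example: the epsilon for single precision is 2^-23 and for double precision is 2^-52.
>>> torch_eps(torch.float32)
1.1920928955078125e-07
>>> torch_eps(torch.zeros(1, dtype=torch.float64))
2.220446049250313e-16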