πŸ“– API#

ddp_max_oracle(max_oracle,Β losses[,Β src_device])

Take any existing maximization oracle and apply it to multiple devices using a gather-scatter implementation within the distributed data parallel (DDP) framework.

l2_centered_isotonic_regression(losses,Β spectrum)

Solution to the isotonic regression problem when using the centered l2 loss.

neg_entropy_centered_isotonic_regression(...)

Solution to the isotonic regression problem when using the centered negative entropy loss.

make_esrm_spectrum(batch_size,Β risk_param)

Create a spectrum based on the exponential spectral risk measure (ESRM) for n samples.

make_extremile_spectrum(batch_size,Β n_draws)

Create a spectrum based on the extremile for n samples.

make_spectral_risk_measure(spectrum[,Β ...])

Create a function which computes the sample weights from a vector of losses when using a spectral risk measure ambiguity set.

make_superquantile_spectrum(batch_size,Β ...)

Create a spectrum based on the superquantile (or conditional value-at-risk) for n samples.

spectral_risk_measure_maximization_oracle(...)

Maximization oracle to compute the sample weights based on a particular spectral risk measure objective.

Create risk measure#

deshift.make_spectral_risk_measure(spectrum: ndarray, penalty: str = 'chi2', shift_cost: float = 0.0)#

Create a function which computes the sample weights from a vector of losses when using a spectral risk measure ambiguity set.

Parameters:
  • spectrum – a NumPy array containing the spectrum weights, which should have length equal to the batch size.

  • penalty – either β€˜chi2’ or β€˜kl’ indicating which f-divergence to use as the dual regularizer.

  • shift_cost – the non-negative dual regularization parameter.


Returns:

compute_sample_weight

a function that maps n losses to a vector of n weights on each training example.
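
When shift_cost is zero, the maximizer has a simple closed form: assign the (sorted) spectrum weights to the losses in sorted order, so the largest weight falls on the largest loss. A minimal sketch of such a weight-function factory (illustrative only, not deshift's implementation):

```python
import numpy as np

def make_weight_fn(spectrum):
    """Illustrative factory for the unregularized (shift_cost == 0) case:
    the optimal weights are the spectrum permuted to match the order of
    the losses."""
    def compute_sample_weight(losses):
        weights = np.empty_like(spectrum)
        # place the largest spectrum weight on the largest loss, and so on
        weights[np.argsort(losses)] = np.sort(spectrum)
        return weights
    return compute_sample_weight
```

With a positive shift_cost, the weights are instead pulled toward uniform by the chosen f-divergence penalty, which is what the dual maximization oracles solve.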

deshift.spectral_risk_measure_maximization_oracle(spectrum: ndarray, shift_cost: float, penalty: str, losses: ndarray)#

Maximization oracle to compute the sample weights based on a particular spectral risk measure objective.

Parameters:
  • spectrum – a NumPy array containing the spectrum weights, which should have length equal to the batch size.

  • shift_cost – a non-negative dual regularization parameter.

  • penalty – either 'chi2' or 'kl' indicating which f-divergence to use as the dual regularizer.

  • losses – a NumPy array containing the loss incurred by the model on each example in the batch.

Returns:

sample_weight

a vector of n weights on each training example.

Dual maximization oracles#

Pool adjacent violator algorithm#

deshift.l2_centered_isotonic_regression(losses: ndarray, spectrum: ndarray)#

Solution to the isotonic regression problem when using the centered l2 loss.

Parameters:
  • spectrum – a NumPy array containing the spectrum weights, which should have length equal to the batch size.

  • losses – a NumPy array containing the loss on each example in the batch. These are the labels for isotonic regression.

Returns:

sample_weight

a set of n weights on each training example in the batch.
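
The pool-adjacent-violators mechanics behind this oracle can be sketched on the plain (uncentered) nondecreasing l2 problem; the centered variant used here modifies the objective with the spectrum and shift cost, but pools adjacent blocks in the same way. Function name is illustrative:

```python
import numpy as np

def pava_l2(y):
    """Pool-adjacent-violators sketch for nondecreasing L2 isotonic
    regression: scan left to right, and whenever a new point violates
    monotonicity, merge it with the previous block and refit the block
    to its mean."""
    blocks = []  # each entry is [block_mean, block_count]
    for v in y:
        blocks.append([float(v), 1])
        # merge while the last two block means are out of order
        while len(blocks) > 1 and blocks[-2][0] >= blocks[-1][0]:
            m2, c2 = blocks.pop()
            m1, c1 = blocks.pop()
            blocks.append([(m1 * c1 + m2 * c2) / (c1 + c2), c1 + c2])
    return np.concatenate([np.full(c, m) for m, c in blocks])
```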

deshift.neg_entropy_centered_isotonic_regression(losses: ndarray, spectrum: ndarray)#

Solution to the isotonic regression problem when using the centered negative entropy loss.

Parameters:
  • spectrum – a NumPy array containing the spectrum weights, which should have length equal to the batch size.

  • losses – a NumPy array containing the loss on each example in the batch. These are the labels for isotonic regression.

Returns:

sample_weight

a set of n weights on each training example in the batch.

Spectrums#

Extremile#

deshift.make_extremile_spectrum(batch_size: int, n_draws: float)#

Create a spectrum based on the extremile for n samples.

The spectrum is chosen so that the expectation of the loss vector under this spectrum equals the expected maximum of n_draws elements drawn independently and uniformly at random from the loss vector.

See [Daouia et al. (2019)](https://www.tandfonline.com/doi/full/10.1080/01621459.2018.1498348) for more information.

Parameters:
  • batch_size – the batch size.

  • n_draws – the number of independent draws from the loss vector. It can be fractional.

Returns:

spectrum

a sorted vector of n weights on each training example.
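
The construction can be sketched directly: the probability that the maximum of n_draws independent uniform draws lands at or below the i-th smallest loss is (i/n)**n_draws, so the weight on the i-th order statistic is the successive difference of that CDF. An illustrative sketch, assuming this standard parameterization:

```python
import numpy as np

def extremile_spectrum(batch_size, n_draws):
    # CDF of the max of n_draws uniform draws over {1, ..., n},
    # differenced cell by cell to get a weight per order statistic
    i = np.arange(1, batch_size + 1)
    return (i / batch_size) ** n_draws - ((i - 1) / batch_size) ** n_draws
```

With n_draws = 1 this reduces to the uniform weights 1/n; larger n_draws shifts mass toward the largest losses.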

Superquantile#

deshift.make_superquantile_spectrum(batch_size: int, tail_prob: float)#

Create a spectrum based on the superquantile (or conditional value-at-risk) for n samples.

Parameters:
  • batch_size – the batch size.

  • tail_prob – the proportion of largest elements to keep in the loss computation, i.e. k/n for the top-k loss.

Returns:

spectrum

a sorted vector of n weights on each training example.

Exponential spectral risk measure#

deshift.make_esrm_spectrum(batch_size: int, risk_param: float)#

Create a spectrum based on the exponential spectral risk measure (ESRM) for n samples.

See [Cotter and Dowd (2006)](https://www.sciencedirect.com/science/article/pii/S0378426606001373) for more information.

Parameters:
  • batch_size – the batch size.

  • risk_param – the R parameter from Cotter and Dowd (2006).

Returns:

spectrum

a sorted vector of n weights on each training example.
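
Cotter and Dowd's exponential spectrum is phi(u) = R * exp(-R * (1 - u)) / (1 - exp(-R)), which integrates to one over [0, 1]; integrating it over each of the n cells gives discrete weights. A sketch assuming deshift uses this parameterization:

```python
import numpy as np

def esrm_spectrum(batch_size, risk_param):
    # antiderivative of phi evaluated at the cell boundaries i/n,
    # differenced to get one weight per cell
    i = np.arange(batch_size + 1)
    cdf = (np.exp(-risk_param * (1 - i / batch_size))
           - np.exp(-risk_param)) / (1 - np.exp(-risk_param))
    return np.diff(cdf)
```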

Distributed computations#

deshift.ddp_max_oracle(max_oracle, losses, src_device=0)#

Take any existing maximization oracle and apply it to multiple devices using a gather-scatter implementation within the distributed data parallel (DDP) framework. Assumes that the process rank is discoverable, e.g. the job is run using torchrun.

Parameters:
  • max_oracle – a function that consumes n (full-batch size) loss values and returns n weights (where n == micro_size * n_gpus)

  • losses – a PyTorch tensor of micro_size losses

Returns:

weights

a vector of weights of size len(losses) indicating the weight on each example
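
The gather-scatter pattern can be illustrated in a single process, without torch.distributed: concatenate the per-device micro-batches of losses (the gather), run the oracle once on the full batch on the source device, then split the weight vector back into per-device slices (the scatter). A simulation sketch with hypothetical helper names:

```python
import numpy as np

def simulated_ddp_max_oracle(max_oracle, per_device_losses):
    """Single-process simulation of the gather-scatter pattern used by
    ddp_max_oracle: gather micro-batches, apply the oracle to the full
    batch, and scatter each device's slice of the weights back."""
    sizes = [len(l) for l in per_device_losses]
    full_losses = np.concatenate(per_device_losses)        # all-gather
    weights = max_oracle(full_losses)                      # oracle on src
    return np.split(weights, np.cumsum(sizes)[:-1])        # scatter
```

In the real DDP setting the concatenation and split are replaced by collective communication across ranks, but the oracle itself only ever sees the full loss vector.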