Utilities

The utilities below serve technical purposes and are not required for basic usage of promptzl; only the most essential ones are listed here. A noteworthy function is calibrate(), which calibrates the predicted probabilities and sometimes leads to better performance. Calibration is described in more detail in Calibration.

class promptzl.utils.LLM4ClassificationOutput(predictions: torch.Tensor | pandas.DataFrame | polars.DataFrame | List[str | int] | numpy.ndarray | None = None, distribution: torch.Tensor | pandas.DataFrame | polars.DataFrame | List[List[float]] | numpy.ndarray | None = None, logits: torch.Tensor | pandas.DataFrame | polars.DataFrame | List[List[float]] | numpy.ndarray | None = None)

Bases: object

Class for Organizing Output.

predictions (Optional[Union[Tensor, pd.DataFrame, pl.DataFrame, List[Union[int, str]], np.ndarray]])

Predictions (i.e. predicted label for each instance).

distribution (Optional[Union[Tensor, pd.DataFrame, pl.DataFrame, List[List[float]], np.ndarray]])

Distribution of predictions (i.e. probabilities for each label).

logits (Optional[Union[Tensor, pd.DataFrame, pl.DataFrame, List[List[float]], np.ndarray]])

Logits for the label words in the verbalizer. The order refers to the flattened verbalizer indices.
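
To illustrate what "flattened verbalizer indices" may look like in practice, the following sketch flattens a nested verbalizer and records which label each logits column would belong to. The verbalizer contents and the names flat_words and label_of_column are hypothetical, and the sketch assumes flattening proceeds label by label, then word by word; it is not taken from the promptzl source.

```python
# Hypothetical verbalizer: one list of label words per label.
verbalizer = [["bad", "terrible"], ["good", "great", "fine"]]

# Flatten label by label (assumed order), keeping track of the owning label.
flat_words = [word for words in verbalizer for word in words]
label_of_column = [
    label for label, words in enumerate(verbalizer) for _ in words
]

# Under this assumption, column i of logits corresponds to flat_words[i].
for column, (word, label) in enumerate(zip(flat_words, label_of_column)):
    print(f"logits[:, {column}] -> label {label}, word {word!r}")
```

Under this reading, logits has one column per label word, whereas distribution has one column per label.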

promptzl.utils.calibrate(probs: torch.Tensor) → torch.Tensor

Calibrates Probabilities

Addresses the calibration issue (Zhao et al., 2021; Hu et al., 2022), in which some tokens are more likely to be predicted than others, by calibrating the probabilities accordingly.

A contextualized prior is computed and then used to calibrate the probabilities. A detailed description is found in Calibration.

Parameters:

probs (torch.Tensor) – The probabilities to be calibrated.

Returns:

The calibrated probabilities.

Return type:

torch.Tensor
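
The general idea of dividing by a prior and renormalizing can be sketched as follows. This is a minimal illustration, not the promptzl implementation: the function name calibrate_sketch is hypothetical, and it assumes the contextualized prior is estimated as the mean probability of each label over the batch.

```python
import torch


def calibrate_sketch(probs: torch.Tensor) -> torch.Tensor:
    """Hypothetical stand-in for calibration; not promptzl's own code."""
    # Assumption: the prior is the mean probability per label over the batch.
    prior = probs.mean(dim=0, keepdim=True)
    # Divide by the prior so labels that are favored overall are down-weighted.
    adjusted = probs / (prior + 1e-12)
    # Renormalize so that each row is a probability distribution again.
    return adjusted / adjusted.sum(dim=-1, keepdim=True)


# Three instances, two labels; label 0 is favored across the whole batch.
probs = torch.tensor([[0.9, 0.1], [0.7, 0.3], [0.8, 0.2]])
calibrated = calibrate_sketch(probs)
print(calibrated)
```

In this toy batch, the second instance is assigned to label 1 after calibration because its probability for label 1 is above the batch average, even though label 0 had the higher raw probability.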