pseudo_obs

copulae.core.misc.pseudo_obs(data, ties='average')

Compute the pseudo-observations for the given data matrix. Each column is ranked separately and the ranks are scaled to lie strictly between 0 and 1.

Parameters
  • data ((N, D) ndarray) – Random variates to be converted to pseudo-observations

  • ties (str, optional) –

    The method used to assign ranks to tied elements. The options are ‘average’, ‘min’, ‘max’, ‘dense’ and ‘ordinal’.

    average

    The average of the ranks that would have been assigned to all the tied values is assigned to each value.

    min

    The minimum of the ranks that would have been assigned to all the tied values is assigned to each value. (This is also referred to as “competition” ranking.)

    max

    The maximum of the ranks that would have been assigned to all the tied values is assigned to each value.

    dense

    Like min, but the next highest element is assigned the rank immediately after those assigned to the tied elements.

    ordinal

    All values are given a distinct rank, corresponding to the order in which the values occur in data.
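The tie-handling options can be compared on a small sample. The sketch below assumes that the ties values behave like the method argument of scipy.stats.rankdata, whose option names and descriptions they share; it calls SciPy directly rather than copulae, so treat it as an illustration of the ranking rules only.

```python
import numpy as np
from scipy.stats import rankdata

# A small sample with one tie: the value 2.0 appears twice.
x = np.array([1.0, 2.0, 2.0, 3.0])

# Rank the same sample under each tie-handling method.
# Assumption: copulae's `ties` values map onto scipy's `method` values.
ranks = {
    method: rankdata(x, method=method)
    for method in ("average", "min", "max", "dense", "ordinal")
}

for method, r in ranks.items():
    print(f"{method:>8}: {r.tolist()}")
```

Only the two tied entries change between methods: they receive 2.5 each under average, 2 each under min and dense, 3 each under max, and 2 then 3 under ordinal. The untied maximum is ranked 4 everywhere except under dense, where it is ranked 3.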

Returns

Matrix or vector of the same dimension as data, containing the pseudo-observations.

Return type

numpy.ndarray, or pandas.DataFrame when data is a DataFrame
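As a rough picture of what the function computes, the following sketch ranks each column and divides by N + 1. The helper name pseudo_obs_sketch is hypothetical, and the N + 1 scaling is an assumption inferred from the example output in the Examples section (the smallest of 100 values maps to 0.01, roughly 1/101); it is not the library's source.

```python
import numpy as np
from scipy.stats import rankdata


def pseudo_obs_sketch(data, ties="average"):
    """Hypothetical re-implementation: column-wise ranks divided by N + 1."""
    arr = np.asarray(data, dtype=float)
    n = arr.shape[0]
    # Rank along axis 0 so each column is ranked independently,
    # then scale so every value lies strictly between 0 and 1.
    return rankdata(arr, method=ties, axis=0) / (n + 1)


sample = np.array([[0.3, 10.0],
                   [0.1, 30.0],
                   [0.2, 20.0]])
u = pseudo_obs_sketch(sample)
print(u)
```

With N = 3 the ranks 1, 2 and 3 become 0.25, 0.5 and 0.75, so no pseudo-observation touches 0 or 1. This matters when the values are later passed to a copula density that is unbounded at the edges of the unit cube.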

Examples

>>> from copulae import pseudo_obs
>>> from copulae.datasets import load_marginal_data
>>> import numpy as np
>>> data = load_marginal_data()
>>> data.head(3)
    STUDENT      NORM       EXP
0 -0.485878  2.646041  0.393322
1 -1.088878  2.906977  0.253731
2 -0.462133  3.166951  0.480696
>>> pseudo_obs(data).head(3)  # pseudo-obs is a DataFrame because input is a DataFrame
    STUDENT      NORM       EXP
0  0.325225  0.188604  0.557814
1  0.151616  0.399533  0.409530
2  0.336221  0.656115  0.626458
>>> np.random.seed(1)
>>> rand = np.random.normal(size=(100, 3))
>>> rand[:3].round(3)
array([[ 1.624, -0.612, -0.528],
       [-1.073,  0.865, -2.302],
       [ 1.745, -0.761,  0.319]])
>>> pseudo_obs(rand)[:3].round(3)  # otherwise returns numpy arrays
array([[0.921, 0.208, 0.248],
       [0.168, 0.792, 0.01 ],
       [0.941, 0.178, 0.584]])
>>> pseudo_obs(rand.tolist())[:3].round(3)
array([[0.921, 0.208, 0.248],
       [0.168, 0.792, 0.01 ],
       [0.941, 0.178, 0.584]])