pseudo_obs

copulae.core.misc.pseudo_obs(data, ties='average')

Compute the pseudo-observations for the given data matrix. Each column is ranked separately and the ranks are scaled to lie strictly between 0 and 1.

Parameters

data ((N, D) ndarray) – Random variates to be converted to pseudo-observations. A pandas DataFrame or a nested list is also accepted.
ties (str, optional) –
The method used to assign ranks to tied elements. The options are ‘average’ (the default), ‘min’, ‘max’, ‘dense’ and ‘ordinal’.

average
The average of the ranks that would have been assigned to all the tied values is assigned to each value.

min
The minimum of the ranks that would have been assigned to all the tied values is assigned to each value. (This is also referred to as “competition” ranking.)

max
The maximum of the ranks that would have been assigned to all the tied values is assigned to each value.

dense
Like min, but the rank of the next highest element is assigned the rank immediately after those assigned to the tied elements.

ordinal
All values are given a distinct rank, corresponding to the order in which the values occur in data.
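The option names match the `method` argument of `scipy.stats.rankdata`. Assuming the semantics are the same (an assumption, not something this page states), the following sketch shows how each option ranks a small vector containing one tie. It calls SciPy directly rather than copulae, and the names `values` and `ranks` are illustrative only.

```python
from scipy.stats import rankdata

# Small sample with one tie: the two 2s compete for ranks 2 and 3.
values = [0, 2, 3, 2]

# Rank the same sample under every tie-breaking method.
ranks = {
    method: rankdata(values, method=method).tolist()
    for method in ("average", "min", "max", "dense", "ordinal")
}

for method, r in ranks.items():
    print(f"{method:>8}: {r}")
# average: [1.0, 2.5, 4.0, 2.5]
#     min: [1, 2, 4, 2]
#     max: [1, 3, 4, 3]
#   dense: [1, 2, 3, 2]
# ordinal: [1, 2, 4, 3]
```

Only ‘ordinal’ gives every element a distinct rank. The other methods assign the same rank to tied values, so tied inputs yield tied pseudo-observations.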

Returns

Matrix or vector of the same dimension as data, containing the pseudo-observations.

Return type

numpy.ndarray or pandas.DataFrame

Examples

>>> from copulae import pseudo_obs
>>> from copulae.datasets import load_marginal_data
>>> import numpy as np
>>> data = load_marginal_data()
>>> data.head(3)
    STUDENT      NORM       EXP
0 -0.485878  2.646041  0.393322
1 -1.088878  2.906977  0.253731
2 -0.462133  3.166951  0.480696
>>> pseudo_obs(data).head(3)  # pseudo-obs is a DataFrame because input is a DataFrame
    STUDENT      NORM       EXP
0  0.325225  0.188604  0.557814
1  0.151616  0.399533  0.409530
2  0.336221  0.656115  0.626458
>>> np.random.seed(1)
>>> rand = np.random.normal(size=(100, 3))
>>> rand[:3].round(3)
array([[ 1.624, -0.612, -0.528],
       [-1.073,  0.865, -2.302],
       [ 1.745, -0.761,  0.319]])
>>> pseudo_obs(rand)[:3].round(3)  # otherwise returns numpy arrays
array([[0.921, 0.208, 0.248],
       [0.168, 0.792, 0.01 ],
       [0.941, 0.178, 0.584]])
>>> pseudo_obs(rand.tolist())[:3].round(3)
array([[0.921, 0.208, 0.248],
       [0.168, 0.792, 0.01 ],
       [0.941, 0.178, 0.584]])
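
The outputs above are consistent with column-wise ranks divided by N + 1. With N = 100 rows, the smallest value in a column maps to 1/101 ≈ 0.01. The sketch below reproduces that scaling with NumPy and SciPy, assuming this is the formula used. The function name `pseudo_obs_sketch` is hypothetical and not part of copulae, and unlike the library function it always returns a NumPy array.

```python
import numpy as np
from scipy.stats import rankdata


def pseudo_obs_sketch(data, ties="average"):
    """Hypothetical stand-in: column-wise ranks scaled by 1 / (N + 1)."""
    arr = np.asarray(data, dtype=float)
    n = arr.shape[0]  # number of observations (rows)
    # axis=0 ranks each column independently (requires SciPy >= 1.4).
    return rankdata(arr, method=ties, axis=0) / (n + 1)


x = np.array([[0.3, 10.0],
              [0.1, 30.0],
              [0.2, 20.0]])
u = pseudo_obs_sketch(x)
print(u)
# [[0.75 0.25]
#  [0.25 0.75]
#  [0.5  0.5 ]]
```

Dividing by N + 1 rather than N keeps every value strictly inside (0, 1). This matters when the result is passed to a copula density that is unbounded at 0 or 1.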
