pvanalytics.quality.outliers.hampel#

pvanalytics.quality.outliers.hampel(data, window=5, max_deviation=3.0, scale=None)#

Identify outliers by the Hampel identifier.

The Hampel identifier is computed according to 1.

Parameters
  • data (Series) – The data in which to find outliers.

  • window (int or offset, default 5) – The size of the rolling window used to compute the Hampel identifier.

  • max_deviation (float, default 3.0) – Any value with a Hampel identifier > max_deviation standard deviations from the median is considered an outlier.

  • scale (float, optional) – Scale factor used to estimate the standard deviation as \(MAD / scale\). If scale=None (default), then the scale factor is taken to be scipy.stats.norm.ppf(3/4.) (approx. 0.6745), and \(MAD / scale\) approximates the standard deviation of Gaussian distributed data.

Returns

True for each value that is an outlier according to its Hampel identifier.

Return type

Series

References

1

Pearson, R.K., Neuvo, Y., Astola, J. et al. Generalized Hampel Filters. EURASIP J. Adv. Signal Process. 2016, 87 (2016). https://doi.org/10.1186/s13634-016-0383-6

Examples using pvanalytics.quality.outliers.hampel#

Hampel Outlier Detection

Hampel Outlier Detection