pliers.diagnostics.mahalanobis_distances

pliers.diagnostics.mahalanobis_distances(df, axis=0)

Returns a pandas Series of Mahalanobis distances, one for each sample along the given axis.

Note: does not work well when the number of observations is less than the number of dimensions. In that case it will either return NaN values in the result or (in the extreme case) fail with a singular-matrix LinAlgError.
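The limitation in the note can be illustrated with NumPy alone. The sketch below assumes the textbook definition of the Mahalanobis distance, which relies on the inverse of the sample covariance matrix; this page does not state how pliers computes it internally. The array X and its shape are made up for illustration.

```python
import numpy as np

# Hypothetical data: 3 observations in 5 dimensions (fewer rows than columns).
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 5))

# Sample covariance across the 5 dimensions (columns are variables).
cov = np.cov(X, rowvar=False)

# With n observations the covariance has rank at most n - 1 in exact
# arithmetic, so a 5 x 5 covariance built from 3 rows cannot be full rank
# and has no well-defined inverse.
rank = np.linalg.matrix_rank(cov)
print(cov.shape, rank)
```

Because the matrix is rank deficient, inverting it either raises numpy.linalg.LinAlgError or yields numerically meaningless values, which is consistent with the behaviour the note describes.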

Parameters
  • df – pandas DataFrame with columns to run diagnostics on

  • axis – 0 to find outlier rows, 1 to find outlier columns
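As a rough illustration of what the returned Series represents, here is a minimal sketch of the textbook computation. The function mahalanobis_sketch and the example DataFrame are hypothetical and are not the pliers implementation, which may differ in details such as error handling or the name of the returned Series.

```python
import numpy as np
import pandas as pd


def mahalanobis_sketch(df, axis=0):
    """Illustrative only: textbook Mahalanobis distance for each sample."""
    # axis=0 treats rows as samples; axis=1 treats columns as samples.
    data = df if axis == 0 else df.T
    values = data.to_numpy(dtype=float)

    # Center each dimension on its mean.
    centered = values - values.mean(axis=0)

    # Inverse of the sample covariance; np.linalg.inv raises LinAlgError
    # when the covariance matrix is exactly singular.
    inv_cov = np.linalg.inv(np.cov(values, rowvar=False))

    # Quadratic form (x - mean)^T S^-1 (x - mean), evaluated for each sample.
    squared = np.einsum("ij,jk,ik->i", centered, inv_cov, centered)
    return pd.Series(np.sqrt(squared), index=data.index)


# Made-up example: six observations, two dimensions.
df = pd.DataFrame({
    "x": [1.0, 2.0, 3.0, 4.0, 5.0, 20.0],
    "y": [2.0, 1.0, 4.0, 3.0, 6.0, 5.0],
})
dist = mahalanobis_sketch(df)
print(dist)
```

Larger values mark samples that lie farther from the multivariate mean once correlations between dimensions are taken into account. With the real function, the equivalent call would be mahalanobis_distances(df, axis=0) to look for outlier rows.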