Local Outlier Factor

Local outlier Factor (LoF) is another density based approach to identify outliers in a dataset. The LoF is applicable to identify outliers in a dataset, which has a mixture of data distributions.

lof
Figure showing the dense and sparse distribution of points [Source: Google images]
The above figure shows two different distributions, a dense cluster of points and a sparse distribution of points.  In such datasets, for each specific distribution within a dataset, we should perform outlier detection locally, i.e., points within one distribution should not affect outlier detection in another cluster. The LoF algorithm follows the same intuition and calculates anomaly score for  each point within a distribution as:

  1. For each data point X , let D^k(X) represent distance of point X to its k^{th} neighbor, and L_{k}(X) represent set of points within D^k(X)
  2. Compute reachability distance for each data point, X  as                                     R_{k}(X, Y) = max(dist(X,Y), D^k(Y))
  3. Compute Average reachability distance AR_{k}(X) of data point X as                         AR_{k}(X) = MEAN_{Y \in L_{k}(X)} R_{k}(X, Y)
  4. In the final step, LOF score for each point, X is calculated as:                                                            LOF_{k}(X) = MEAN_{Y \in L_{k}(X)} \frac{AR_{k}(X)}{AR_{k}(Y)}

To find the best value of k , it is always good to follow ensemble approach, i.e., use a range of k values to calculate LOF scores and then use a specific method to combine the outlier scores.

 

References:

  1. Book: Outlier Analysis by Charu Aggarwal
  2.  Wikipedia
  3. Google Images
Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s