Questions tagged [anomaly-detection]

In data mining, anomaly detection (also outlier detection) is the identification of items, events or observations which do not conform to an expected pattern or other items in a dataset.

324 questions
21
votes
2 answers

Dataflow anomaly analysis warnings from PMD

I am using Eclipse with the PMD Plug-in (4.0.0.v20130510-1000) and get a lot of those violations: Found 'DD'-anomaly for variable 'freq' (lines '187'-'189'). Found 'DU'-anomaly for variable 'freq' (lines '189'-'333'). In this SO answer, it says that…
brimborium
  • 8,880
  • 9
  • 46
  • 73
9
votes
1 answer

Working Example Of Luminol Anomaly Detection And Correlation Library By LinkedIn

GitHub link to the Luminol library: https://github.com/linkedin/luminol Can anyone explain, with sample code, how to use this module to find anomalies in a data set? I want to use this module to find anomalies in my time series data. P.S.:…
Ashish
  • 3,587
  • 11
  • 37
6
votes
2 answers

Conversion of IsolationForest decision score to probability algorithm

I am looking to create a generic function to convert the output decision_scores of sklearn's IsolationForest into true probabilities [0.0, 1.0]. I am aware of, and have read, the original paper and I understand mathematically that the output of that…
6
votes
1 answer

One Class SVM algorithm taking too long

The data below shows part of my dataset, which is used to detect anomalies: describe_file data_numbers index 0 gkivdotqvj 7309.0 0 1 hpwgzodlky 2731.0 1 2 dgaecubawx 0.0 2 3 NaN …
E199504
  • 393
  • 1
  • 8
6
votes
3 answers

What is the difference between Real-time Anomaly Detection and Anomaly Detection?

Hence the following question: What is a clear definition of Real-time Anomaly Detection? I am investigating the field of Anomaly Detection, and in many papers the approach is described as Real-time, while in many others it is simply called Anomaly…
GYBE
  • 687
  • 3
  • 17
6
votes
1 answer

Isolation Forest in Python

I am currently working on detecting outliers in my dataset using Isolation Forest in Python, and I did not completely understand the example and explanation given in the scikit-learn documentation. Is it possible to use Isolation Forest to detect outliers…
Nnn
  • 171
  • 2
  • 9
6
votes
1 answer

How to monitor messages rate in Kafka topics?

How can I get alerted when there is a message rate in some topic higher or lower than usual?
marosbfm
  • 171
  • 7
5
votes
0 answers

Why is my LSTM model repeating the previous values?

I built a simple LSTM model in Keras as below: model = Sequential() model.add(keras.layers.LSTM(hidden_nodes, input_dim=num_features, input_length=window, consume_less="mem")) model.add(keras.layers.Dense(num_features,…
Alessandro
  • 702
  • 1
  • 9
  • 31
5
votes
1 answer

What is the range of Scikit-Learn's IsolationForest decision_function scores?

Scikit-Learn's IsolationForest class has a method decision_function that returns the anomaly scores of the input samples. However, the documentation does not state what the possible range of these scores is, and only states that "the lower [the…
DataMan
  • 1,894
  • 3
  • 14
  • 29
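
For readers comparing the two scoring methods mentioned in the question above, here is a minimal sketch, not a definitive answer. It assumes scikit-learn is installed and uses synthetic data invented purely for illustration. It relies on the documented relationship that `decision_function` equals `score_samples` minus the fitted `offset_`, and that `score_samples` is the negative of the anomaly score defined in the original Isolation Forest paper.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic data (an assumption for illustration): 200 Gaussian points,
# with the first five shifted far away to act as obvious outliers.
rng = np.random.RandomState(0)
X = rng.normal(size=(200, 2))
X[:5] += 6.0

clf = IsolationForest(n_estimators=100, contamination="auto", random_state=0)
clf.fit(X)

# score_samples: negative of the paper's anomaly score; lower means more abnormal.
raw = clf.score_samples(X)

# decision_function: the same values shifted by offset_, so that negative
# values correspond to predicted outliers.
shifted = clf.decision_function(X)

# Flipping the sign recovers the paper-style score, where values closer
# to 1 indicate stronger anomalies.
paper_score = -raw

print("offset_:", clf.offset_)
print("decision_function min/max:", shifted.min(), shifted.max())
print("paper-style score min/max:", paper_score.min(), paper_score.max())
```

With `contamination="auto"` the offset is fixed at -0.5, so the shifted scores fall between -0.5 and 0.5. The paper-style score is bounded but is not a calibrated probability, so treat any mapping to [0.0, 1.0] as a ranking rather than a likelihood.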
4
votes
1 answer

How to detect anomalies in time series data (specifically) with trend and seasonality present in it?

I want to detect the outliers in "time series data" that contains trend and seasonality components. I want to leave out the peaks which are seasonal and consider only the other peaks, labelling them as outliers. As I am new to time series…
Raja Sahe S
  • 407
  • 1
  • 6
  • 12
4
votes
1 answer

Is there a way to calculate feature importance at observation level in isolation forest?

I am using Isolation Forest in R to perform anomaly detection on multivariate data. I tried calculating the anomaly scores along with the contribution of each individual metric to that score. I am able to get the anomaly score but am facing a problem…
4
votes
1 answer

How to train an IsolationForest model so as to give the minimum number of false positives?

When using Isolation Forest for anomaly detection, should we train the model with only normal data or with a mix of both normal and outlier data? Also, what is the best algorithm for anomaly detection on multivariate data? I want minimum…
4
votes
1 answer

Implementation of Excess-Mass or Mass-Volume curves

I am looking for an implementation of Excess-Mass or Mass-Volume curves which are used for the evaluation of unsupervised anomaly detection algorithms. I'd prefer an implementation in Python but I could re-write it from any other language. Thank…
4
votes
1 answer

Isolation Forest

I'm currently working on identifying outliers in my data set using the IsolationForest method in Python, but don't completely understand the example on…
bosbraves
  • 65
  • 4
4
votes
1 answer

Creating anomaly detection using machine learning

I'm very impressed by the new X-Pack ML of the Elastic Stack. It seems their technique learns data patterns over time and can predict anomalies in multiple domains. Zoomed in: I was wondering what approach and network topology could be used, in…
Shlomi Schwartz
  • 11,238
  • 25
  • 93
  • 155