Suppose I have a list of events. For example A, D, T, H, U, A, B, F, H, ...
What I need is to find frequent patterns that occur in the complete sequence. Traditional algorithms such as Apriori or FP-Growth do not apply here because they require a database of separate itemsets (transactions), and I cannot break this stream into smaller sets.
Any idea which algorithm would work for me?
EDIT
For example, for the sequence A, D, T, H, U, A, D, T, H, T, H, U, A, H, T, H
and with min_support = 2
The frequent patterns (contiguous runs of events that occur at least min_support times) will be:
Of length 1 --> [A, D, T, H, U]
Of length 2 --> [AD, DT, TH, HU, UA, HT]
Of length 3 --> [ADT, DTH, THU, HUA, HTH]
Of length 4 --> [ADTH, THUA]
No frequent patterns of length 5 or more
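
To make the expected output precise, here is a minimal brute-force sketch of the counting I have in mind. It assumes a pattern is a contiguous run of events and that overlapping occurrences are counted separately. The function name frequent_contiguous_patterns is just a placeholder of mine, not taken from any library, and this is only meant to pin down the definition, not to be the efficient algorithm I am asking for.

```python
from collections import Counter


def frequent_contiguous_patterns(events, min_support):
    """Map each length n to the sorted contiguous runs seen >= min_support times.

    Assumes every event is a string, so a run can be joined into one label.
    """
    result = {}
    n = 1
    while n <= len(events):
        # Slide a window of width n over the sequence and count each run.
        counts = Counter(
            tuple(events[i:i + n]) for i in range(len(events) - n + 1)
        )
        frequent = sorted(
            "".join(run) for run, c in counts.items() if c >= min_support
        )
        if not frequent:
            # Every run of length n + 1 contains a run of length n that occurs
            # at least as often, so no longer run can be frequent either.
            break
        result[n] = frequent
        n += 1
    return result


events = "A D T H U A D T H T H U A H T H".split()
for length, patterns in frequent_contiguous_patterns(events, 2).items():
    print(length, patterns)
# 1 ['A', 'D', 'H', 'T', 'U']
# 2 ['AD', 'DT', 'HT', 'HU', 'TH', 'UA']
# 3 ['ADT', 'DTH', 'HTH', 'HUA', 'THU']
# 4 ['ADTH', 'THUA']
```

Running it on the sequence above gives the patterns listed, with HTH included at length 3 because H, T, H appears at positions 9-11 and 14-16 (1-based). Each pass costs time roughly proportional to the sequence length times n, so this is fine for a small example but probably too slow for a long event stream.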