
I could not find a straight answer to this on Google, so here goes...

Is there any "algorithm" that collects the unique digits/numbers from a long random number? (By "long" I do not mean the data type.)

For example, x = 6487657876579876867656768476876876117681761871687268726

I want all the possible unique numbers/digits that I can get from x. I am not looking for code. I'm looking for an "established algorithm" for doing this type of work, or something similar. Any paper/journal/book link would be useful.

EDIT: If I had asked for "algorithms" that search for an item among many items, the answer would be: BFS, DFS, trees, graphs, etc. Similarly, my question is not about coding, nor about any specific programming language; it's about finding an algorithm (if there is one) that collects unique numbers from a random number. As "Muckle_ewe" said in his comment, 123 could be 1, 2, 3, 12, 23, 123 but not 13. This is exactly what my algorithm requirement is.

Please don't show me code. I'm expecting a reference/name/link to an established algorithm that does this kind of simple work, or something similar. Of course anyone can build an algorithm for this simple work, even me. But I'm looking for a well-recognized, established algorithm that I can use as a reference.

2nd EDIT: A small change to the requirement: I don't actually need all the substrings, as I thought I would. I found that a suffix tree is good for finding all the substrings, so it is not exactly what I'm looking for, but it is close. Editing Muckle_ewe's comment: 123 could be 1, 2, 3, 12, 23 but not 13 or 123. So I only need the 1-digit and 2-digit numbers from a long random number (x). This is my algorithm requirement, not the old one.

---Thanks.

Giliweed
  • One method would be to convert to a string, find all substrings and enter them into a set, which would remove duplicates. Most programming languages should have these built in, so it should take only a few lines of code. This assumes that you wouldn't be skipping any digits, i.e. 123 could be 1, 2, 3, 12, 23, 123 but not 13 – Muckle_ewe Aug 30 '14 at 14:29
  • Pep's (now deleted) answer solved the problem as stated. As he said, please clarify the question if that's not what you wanted. – j_random_hacker Aug 30 '14 at 14:30
  • You need all unique substrings? Construct a DFA. – wildplasser Aug 30 '14 at 15:45
  • What you're asking is equivalent to finding all substrings of a string - your string just happens to be numeric. There is an answer at [this question](http://stackoverflow.com/questions/2560262/generate-all-unique-substrings-for-given-string), and I've marked this question as a duplicate of that one. – Don Roby Aug 30 '14 at 17:15
  • @DonRoby, your link is useful. Maybe the suffix tree is what I am looking for, but I'm still not sure because I really don't know what a suffix tree is... I need to study it. Hopefully this is the one that will help. Thanks for the link. – Giliweed Aug 30 '14 at 20:03
  • Maybe you just need to get a set of **distinct** random numbers? In that case, you could use [Reservoir sampling](http://en.wikipedia.org/wiki/Reservoir_sampling). – Vlad Aug 31 '14 at 12:17

1 Answer


IMHO, an algorithm is bound to a language, which you don't mention in your question, so I'll take the easiest one I know, Ruby. I treat your long number as a string to make things easier.

    x = "6487657876579876867656768476876876117681761871687268726"
    x.split("").uniq.join # gives "64875912"

Now, I could do this more elaborately, bypassing the built-in methods Ruby has, but what would be the point? Reinventing the wheel? Thinking I'm better than the Ruby devs? If I want, I can look up the C code they used to implement their Ruby methods.

EDIT after OP's EDIT

Hmm, after your edit I finally see what you are getting at. See also this Quora question; there's an explanation there of the suffix tree traversal algorithm.

I'm just a simple programmer, so I like to keep things simple. A simple algorithm would be: start from the first letter and continue to the last, store every substring that starts at that letter, and finally sort the stored substrings and remove duplicates. I guess there are faster algorithms around, and I don't have a name for this one; I made it up myself. In Ruby this would be implemented like this. I use a short string to keep it... uh... simple. The two maps produce arrays within an array, so I flatten them first before sorting and removing the duplicates.

    x = "BANANA"
    (0...x.length).map { |i| (i...x.length).map { |j| x[i..j] } }.flatten.sort.uniq
    # gives ["A", "AN", "ANA", "ANAN", "ANANA", "B", "BA", "BAN", "BANA", "BANAN", "BANANA", "N", "NA", "NAN", "NANA"]
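
For the revised requirement in the question's second edit (only the 1-digit and 2-digit numbers), the same idea can be narrowed to short windows. What follows is only a minimal sketch, not an established or named algorithm: `unique_chunks` and its `max_len` parameter are hypothetical names made up for illustration. It slides a window of each allowed length over the digits and lets a `Set` discard the duplicates.

```ruby
require 'set'

# Hypothetical helper (not part of the original answer): collect every
# distinct run of consecutive digits whose length is 1..max_len.
def unique_chunks(digits, max_len = 2)
  seen = Set.new
  (1..max_len).each do |len|
    # each_cons yields every window of `len` adjacent characters,
    # so "123" gives 12 and 23 for len = 2, but never 13.
    digits.chars.each_cons(len) { |window| seen << window.join }
  end
  seen
end

x = "6487657876579876867656768476876876117681761871687268726"
chunks = unique_chunks(x)
puts chunks.select { |c| c.size == 1 }.sort.join(" ")
puts chunks.select { |c| c.size == 2 }.sort.join(" ")
```

Because there are only 10 possible one-digit strings and 100 possible two-digit strings, the set never holds more than 110 entries, however long x is.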
peter