Given a list of numbers that are regex patterns, sort them by the last 4 numbers in that numeric value.
I have a list of regex (phone number) patterns and I am trying to sort them by the last 4 characters. Here's a sample list of phone numbers:
8062
\+13066598273
4083100
408320[0-3]
408320[4-6]
752[234569]
\+13066598305
8059
I would like to re-order these numbers by the last 4 numbers so that I would end up with a list like this:
4083100
408320[0-3]
408320[4-6]
752[234569]
8059
8062
\+13066598273
\+13066598305
Now if my patterns were nothing but numbers, I could sort them easily in either SQL or my MVC C# project. In SQL, I could use ORDER BY RIGHT(pattern, 4), or in C# MVC, I could sort my IQueryable list with patterns = patterns.OrderByDescending(s => s.Substring(...etc...)).
Patterns are a bit more difficult. The brackets count as characters, so sorting by the last 4 characters doesn't work as well.
Are there any built-in utilities in C#, MVC, or SQL that allow me to convert regex patterns to the largest possible match?
- Given a regex pattern, return the largest possible matching regex that matches my condition. For example, if I had the pattern 4[12]00[1-3], I'd have 6 possible results: 41001, 41002, 41003, 42001, 42002, 42003. Then I can get the biggest number possible, and use that for ordering in my original regex list.
- The regex pattern doesn't contain anything like * or +, special characters that may cause infinite combinations.
- There may be a C# library that does exactly what I ask for, sorting regex pattern strings.
EDIT:
I've accepted Diego's answer, but it took me a bit of time to wrap my head around it. For other readers who want to know what it's doing, this is what I think Diego is doing:
- Make a range of ints, starting at 9999, all the way back to 0. [9999], [9998], [9997]...[0].
- Replace the regex-ish part of the strings with a single character. Example, "500[1-5]" would become "500X", or "20[1-9]00[89]" would become "20X00X", so on, so forth.
Get the length of the "last" 4 characters + regex characters.
var len = lastNChars + pattern.Length - Regex.Replace(pattern, @"\[[^\]]+\]", "X").Length;
So for the pattern 20[1-9]00[89], the above formula translates to "len = 4 + 13 - 6", or 11.
Using the len variable from above, get a substring that represents the "last" 4 numbers of the phone number, even with regex characters. The original string = "20[1-9]00[89]", while the new substring = "[1-9]00[89]" (the 20 is gone now)
- Enumerate through and compare array values to the substring regex pattern. [9999] doesn't match regex pattern, [9998] doesn't match...[9997] doesn't match...aha! 9009 matches! The first match I get is going to the biggest possible regex match, which is what I want.
- So each regex pattern has been converted to its largest possible matching pattern. Now we can use C#/LINQ/other in-built methods that can sort by those sub-regex matches for us!
Thank god I'm dealing with only numbers. If I were trying to do sort regex patterns that were actually words/had alpha characters, that would be way harder, and that array would be hella bigger (I think).