3

Ok let me explain clearly what i want to achieve

It will be an object which will contain the below data - like an sql server table

BigInt parameter1
BigInt parameter2 
string parameter3

these parameter1 and parameter2 both will compose the index (like primary key in sql-server table)

So this object will have like 500000 records like the above And i will make fast look ups from this object like

return parameter3 where parameter1 <= value and value <= parameter2

What can be used for this ?

So far i tried these and they are slow

DataView.RowFilter = super slow
static Dictionary<Int64, KeyValuePair<Int64, string>> = slower than database query
Database query = where parameter1 & parameter2 composes primary key = slow since i need to make over 500000 query.

I also searched many questions at stackoverflow and none of them targeting between operator at integer keys. They are all multiple string key.

C# 4.0

Charles
  • 48,924
  • 13
  • 96
  • 136
MonsterMMORPG
  • 20,310
  • 69
  • 183
  • 306
  • Why not use an embedded DB engine? – David Heffernan Dec 23 '12 at 12:48
  • @DavidHeffernan yes it is what i am using. But it is really really slow when compared to objects in ram memory. But could not find a proper way yet. For example if it was single key, a dictionary would be thousands time faster than querying database when you are doing 500000 query. – MonsterMMORPG Dec 23 '12 at 12:50
  • A good embedded DB will hold everything in RAM? Which DB are you using? – David Heffernan Dec 23 '12 at 12:52
  • I am using sql-server. Oh i suppose you mean something else ? – MonsterMMORPG Dec 23 '12 at 12:54
  • Have you thought of some 3rd party indexing software? http://sphinxsearch.com/ must be superfast for your purposes. Either indexing your sql-server or other sources. – Max Yakimets Dec 23 '12 at 12:56
  • @MaxYakimets thanks gonna check that. Aren't there any build-in feature at C# netframework 4.0 ? – MonsterMMORPG Dec 23 '12 at 12:58
  • Yeah, I mean an in-memory embedded db – David Heffernan Dec 23 '12 at 13:11
  • @DavidHeffernan are there any in-memory embedded db at netframework ? or which one is most easily implementable thanks for answer – MonsterMMORPG Dec 23 '12 at 13:57
  • How static is your data? Do you require frequent updates? – Sergey Kalinichenko Dec 23 '12 at 14:01
  • You may want to check this thread http://stackoverflow.com/questions/9298430/is-it-possible-to-accelerate-dynamic-linq-queries-using-gpu and the discussion in comments. Your sql-server may still be faster than you might have expected. Stored procedures may reduce the overhead of parsing sql queries. – Max Yakimets Dec 23 '12 at 14:01
  • I'm quite sure there are such things. I've no experience myself. A websearch yielded this http://stackoverflow.com/questions/684595/embedded-database-for-net Whether embedded db gives the perf you need, I don't know – David Heffernan Dec 23 '12 at 14:04
  • if you are considering using static Dictionary.. can i assume your data already loaded in memory or..? i've been in the same scenario but with far more complexity (+1000000 iterations) and can tell you it takes fraction of a second if data loaded into memory and queried from there – Rached N. Dec 23 '12 at 20:45
  • I tried static dictionary but it is too slow. Because it does not have 2 keys composed indexes like sql-server. And yes i loaded it into the memory. The issue here is i am not looking string matches. I am looking between values of integers. If i could replicate what sql-server does on c# that would be super fast. – MonsterMMORPG Dec 23 '12 at 21:35
  • I was thinking about `SortedSet.GetViewBetween(T lower, T upper)` but then stumbled upon http://stackoverflow.com/questions/9850975/why-sortedsett-getviewbetween-isnt-olog-n and doubted about the overall performance. The Comparer to be used with the `SortedSet` must order items by `parameter1` and if it is equal - order by `parameter2`. – Max Yakimets Dec 24 '12 at 09:00
  • Well at SQL server i highly suppose that it is doing any ordering. It is directly looking indexes. I don't know what kind of system going at behind the scenes but it is super fast :) So i highly doubt that SortedSet.GetViewBetween would do the job ^^ – MonsterMMORPG Dec 24 '12 at 12:31
  • @MaxYakimets That wouldn't work anyway, because the ends of the interval in question are essentially in different collections. – Sergey Kalinichenko Dec 24 '12 at 15:16
  • @MonsterMMORPG I was researching your question a little, and stumbled upon [this unexpected discovery](http://stackoverflow.com/q/14023988/335858), leading me to a follow-up question: how random (or not random) is your data? It looks like an impact can be quite severe. – Sergey Kalinichenko Dec 24 '12 at 17:24
  • I think totally random. I am using IP-Country database from maxmind.com – MonsterMMORPG Dec 24 '12 at 17:58
  • @dasblinkenlight here a screenshot http://img14.imageshack.us/img14/64/resulss.png – MonsterMMORPG Dec 24 '12 at 18:00
  • @MonsterMMORPG Are all strings two upper-case character codes then? Also, do ranges overlap? – Sergey Kalinichenko Dec 24 '12 at 18:03
  • @dasblinkenlight I don't think so ranges overlap. Also yes they are all upper-case. – MonsterMMORPG Dec 24 '12 at 18:06
  • @MonsterMMORPG If ranges do not overlap, then you can sort them by the first item, and do a binary search to either get a single item, or get no items at all. This can be accomplished very, very quickly. – Sergey Kalinichenko Dec 24 '12 at 18:11
  • @dasblinkenlight how will you do that ? – MonsterMMORPG Dec 24 '12 at 18:13
  • @MonsterMMORPG I think you should update your question emphasizing the nature of your data: ordered, non-overlapping, immutable "IP-country database from maxmind.com". BTW why have you rejected C# API for GeoIp.dat? – Max Yakimets Dec 25 '12 at 09:06
  • @MaxYakimets well i really don't like using any third party application at my game. Though at my local i can use it i suppose. I will check it :) But still my question is a general question. What if we encounter with same problem and we had not other option ? – MonsterMMORPG Dec 25 '12 at 14:14
  • @MonsterMMORPG Check my answer – Max Yakimets Dec 25 '12 at 15:29

2 Answers2

1

I don't think [that] ranges overlap.

This simplifies the problem a great deal: rather than performing a two-dimensional search, you can sort your list, and perform a one-dimensional binary search, like this:

var data = new List<Tuple<long,long,string>>(TotalCount);
var cmp = new TupleComparer();
data.Sort(cmp);
long item = ... // The item to be searched
var pos = data.BinarySearch(Tuple.Create(item, long.MinValue, String.Empty), cmp);
// It appears that your data has only non-empty strings, so it is guaranteed that
// pos is going to be negative, because Item3, the last tie-breaker, will be smaller
// than anything that you may have in the table
pos = ~pos;
if (pos != data.Count && data[pos].Item1 <= item && data[pos].Item2 >= item) {
    Console.WriteLine("Found: '{0}'", data[pos].Item3);
} else {
    Console.WriteLine("Not found");
}

Here is the TupleComparer class:

class TupleComparer : IComparer<Tuple<long,long,string>> {
    public int Compare(Tuple<long,long,string> x, Tuple<long,long,string> y) {
        var res = x.Item1.CompareTo(y.Item1);
        if (res != 0) return res;
        res = x.Item2.CompareTo(y.Item2);
        return (res != 0) ? res : String.CompareOrdinal(x.Item3, y.Item3);
    }
}
Sergey Kalinichenko
  • 675,664
  • 71
  • 998
  • 1,399
1

Quick and dirty sketch:

public class GeoIp
{
    private class GeoIpRecord
    {
        public long StartIp;
        public long EndIp;
        public string Iso;
    }

    private class GeoIpRecordComparer: IComparer<GeoIpRecord>
    {
        public int Compare(GeoIpRecord x, GeoIpRecord y)
        {
            return x.StartIp.CompareTo(y.StartIp);
        }
    }

    private List<GeoIpRecord> geoIp;
    private IComparer<GeoIpRecord> comparer;

    public GeoIp()
    {
        this.geoIp = new List<GeoIpRecord>(500000)
            {
                new GeoIpRecord { StartIp = 1, EndIp = 2, Iso = "One" },
                new GeoIpRecord { StartIp = 3, EndIp = 5, Iso = "Three" },
                new GeoIpRecord { StartIp = 6, EndIp = 6, Iso = "Six" },
                new GeoIpRecord { StartIp = 7, EndIp = 10, Iso = "Seven" },
                new GeoIpRecord { StartIp = 15, EndIp = 16, Iso = "Fifteen" },
            };
        this.comparer = new GeoIpRecordComparer();
    }

    public string GetIso(long ipValue)
    {
        int index = this.geoIp.BinarySearch(new GeoIpRecord() { StartIp = ipValue }, this.comparer);

        if (index < 0)
        {
            index = ~index - 1;
            if (index < 0)
            {
                return string.Empty;
            }
        }

        GeoIpRecord record = this.geoIp[index];

        if (record.EndIp >= ipValue)
        {
            return record.Iso;
        }
        else
        {
            return string.Empty;
        }
    }
}

And the code that confirms the solution:

GeoIp geoIp = new GeoIp();
var iso1 = geoIp.GetIso(1); // One
var iso2 = geoIp.GetIso(2); // One
var iso3 = geoIp.GetIso(3); // Three
var iso4 = geoIp.GetIso(4); // Three
var iso5 = geoIp.GetIso(5); // Three
var iso6 = geoIp.GetIso(6); // Six
var iso7 = geoIp.GetIso(7); // Seven
var iso11 = geoIp.GetIso(11); //
var iso15 = geoIp.GetIso(15); // Fifteen
var iso17 = geoIp.GetIso(17); //

The List has to be filled with an ordered data.

List.BinarySearch Method (T, IComparer)

Max Yakimets
  • 1,187
  • 7
  • 12