
We have an API (ASP.NET Web API 2, to be specific) which needs to traverse a very large collection of entities in some algorithmic fashion in order to reach the desired result.

This collection is the same for every query performed. It forms the data "source" for every call.

Properties of the collection:

  • Ordered
  • Large (hundreds of thousands of objects)
  • Read-only
  • Must be independently traversable by concurrent threads from the application pool
  • The items in the collection are objects consisting of only a few value types.

Reading this collection from disk or a database on every API request adds a lot of overhead. So we hold it in a static field, almost as a singleton. This seems to work because static objects are shared between all requests to the application and have the same lifetime as the application domain. As a result, API requests are almost instant.
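A minimal sketch of the static approach described above, under stated assumptions: the `Entity` type, the `EntityStore` class, and the placeholder `Load` method are hypothetical names invented for illustration, and the generated data stands in for the real read from disk or database. It uses `Lazy<T>` so the collection is loaded once per application domain and exposed through a read-only view.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

// Hypothetical item type: a small immutable object made of a few value types.
public sealed class Entity
{
    public Entity(int id, double weight)
    {
        Id = id;
        Weight = weight;
    }

    public int Id { get; }
    public double Weight { get; }
}

// Hypothetical holder for the shared, read-only collection.
public static class EntityStore
{
    // ExecutionAndPublication ensures the loader runs at most once,
    // even if several requests touch the store at the same time.
    private static readonly Lazy<IReadOnlyList<Entity>> Cache =
        new Lazy<IReadOnlyList<Entity>>(Load, LazyThreadSafetyMode.ExecutionAndPublication);

    // Every request sees the same ordered, read-only list.
    public static IReadOnlyList<Entity> Items => Cache.Value;

    // Placeholder loader (assumption): a real application would read
    // the entities from disk or the database here.
    private static IReadOnlyList<Entity> Load()
    {
        return Enumerable.Range(0, 200000)
            .Select(i => new Entity(i, i * 0.5))
            .ToList()
            .AsReadOnly();
    }
}
```

A controller action would then read `EntityStore.Items` directly. Because nothing writes to the list after it is loaded and each `foreach` gets its own enumerator, concurrent requests can traverse it independently without locking.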

Is there perhaps a better pattern, practice or framework for such a problem?

Dave New
  • Have you tried using asynchronous actions? – Mahesh Kava May 20 '15 at 06:40
  • Another approach would be to use some sort of cache, specifically ApplicationState, to store it for the whole application lifecycle. – kamil-mrzyglod May 20 '15 at 06:40
  • http://blogs.telerik.com/fiddler/posts/12-11-05/understanding-http-304-responses – Mahesh Kava May 20 '15 at 06:46
  • Depending on the type of architecture you have set up, you could perhaps store your collection in Redis. This would offload the memory usage to Redis, freeing your application from managing a large dataset in memory. You could set a TTL on the Redis store or remove the objects when your application restarts. – Praval 'Shaun' Tirubeni May 20 '15 at 07:26
  • @Praval'Shaun'Tirubeni: But wouldn't reading a huge collection from Redis be an overhead in performance (in comparison to having it in-memory)? – Dave New May 20 '15 at 07:40
  • @davenewza. You are correct. My bad. Depending on your algorithm, what if you used the strategy design pattern? So for a specific type of parameters, you know that you will only be returning a subset of the data. Keep that data in memory and, for other infrequent requests, do a lookup? – Praval 'Shaun' Tirubeni May 20 '15 at 07:59

2 Answers


If you really never need to refresh your data, your solution is ok.

But only until you have to scale to more than one web server. Once you reach that point, you'll probably need to make sure that all the web servers have the same data, and, depending on your environment and architecture, this could be tricky. Perhaps if you ever reach that point, you will ask another question on SO...

MatteoSp

You can use cache servers like Redis or Memcached.

This SO question compares them, and also explains what they are and how they differ from your current implementation: Memcached vs. Redis?

And, of course, here you have the official web sites for each:

There is even a third competitor: Hazelcast.

JotaBe