
My website has a very granular permissions system for every client, which makes the permission data fairly large.

To keep a cap on database queries I have been loading bitmasks from a MySQL database at the start of a user's session and then saving them as session data, so it looks something like this. This has allowed me to make one (albeit complex JOIN) query per user session without creating a huge session file.

"permissions" => array(
    "type 1" => 'bitfield'
    "type 2" => 'bitfield'
     "type 3" => array(
         entity id = 'bitfield'
         entity id = 'bitfield')
     "type 4" => array(
         entity id = 'bitfield'
         entity id = 'bitfield')
)
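
For reference, checking a permission against one of these bitfields is just a bitwise AND against a flag constant. A rough sketch of the kind of check I do now (constant and function names are illustrative, not my real code):

<?php
// Illustrative flag constants for one permission type (not my real values)
define('PERM_EDIT',   1); // 1 << 0
define('PERM_DELETE', 2); // 1 << 1
define('PERM_ADMIN',  4); // 1 << 2

// Check a global permission type ("type 1" / "type 2") stored as a session bitfield
function canGlobal($type, $flag)
{
    $bitfield = isset($_SESSION['permissions'][$type]) ? $_SESSION['permissions'][$type] : 0;
    return ($bitfield & $flag) === $flag;
}

// Check a per-entity permission type ("type 3" / "type 4")
function canOnEntity($type, $entityId, $flag)
{
    $bitfield = isset($_SESSION['permissions'][$type][$entityId]) ? $_SESSION['permissions'][$type][$entityId] : 0;
    return ($bitfield & $flag) === $flag;
}

// e.g. canGlobal('type 1', PERM_EDIT); canOnEntity('type 3', 42, PERM_DELETE);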

Permissions are entirely group based, so every person in a given group would have this replicated in their session data.

However, bitmasks are starting to be a pain to use and I'm looking to move to an ACL. The reason I didn't use an ACL in the first place, however, was to minimise database usage.

So now I am going to have an entirely database/cache-driven ACL without any bitmasks. However, storing huge arrays of permissions in user session data doesn't seem ideal (do you agree?).

I think the way to go is to use a flat-file cache to store group permissions. Would the easiest way to do this be a file per group? Would this change when there are 4,000+ groups, each with 4 permission types? (2 permission types are global with around 40 permissions combined; 2 types are local permissions with around 40 permissions combined per entity, and each type has maybe 3 or 4 lots of 20 permissions!) Edit: for clarity, this means 160-200 permission entries per group.
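
To make that concrete, this is roughly what I have in mind for a file-per-group cache (paths and helper names are just placeholders, not code I already have):

<?php
// Rough sketch of a file-per-group permissions cache (paths/names are placeholders)
define('PERM_CACHE_DIR', '/var/cache/permissions');

function loadGroupPermissions($groupId)
{
    $file = PERM_CACHE_DIR . '/group_' . (int) $groupId . '.cache';

    if (is_readable($file)) {
        return unserialize(file_get_contents($file));
    }

    // Cache miss: fall back to the database and rebuild the group's file
    $permissions = fetchGroupPermissionsFromDb($groupId); // hypothetical DB helper
    file_put_contents($file, serialize($permissions), LOCK_EX);

    return $permissions;
}

With 4,000+ groups that means 4,000+ small cache files per server, which is part of what worries me.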

This seems like it would be a fairly huge cache! Would it be better to just accept heavy database usage on every page load? This kind of data size is what made bitmasks far easier, but they are simply not flexible enough anymore.

This is made harder by the fact that files are served by 2 different servers (sessions are sticky, so saving bitfields to session data wasn't a problem), so any cache would have to be synced between the servers. The DB is on a separate server connected by a private network with a supposedly 1-gigabit connection.

Can any solutions be suggested? I think a quick-access cache such as memcached holding this much data would just blow my memory usage out of the water. I am tempted to just use lots of database queries, but I think that may put too heavy a strain on the DB server.

Fairly large question, I hope it's clear. If any of it needs clarification, let me know. Any solutions will be greatly appreciated!

Chris

Chris B
  • You might want to take a look at [`memcached`](http://memcached.org/). It has a lot of [benefits over straight-up session](http://stackoverflow.com/questions/1197859/session-vs-file-vs-memcache-for-a-cache-in-php). – voithos Aug 30 '12 at 22:50
  • Thanks voithos, a useful link! I have considered memcached, but thought the memory usage might be too great. But as someone says on that question, the memory usage may not be as big as I expect so maybe I should give it a whirl in a test environment. Interesting point about using underutilized servers as a memcache pool. I will try and find some guides on how to set this up! – Chris B Aug 30 '12 at 23:08

1 Answer


I don't think that a data structure with ~40 entries stored in the session is particularly huge. So really, what this comes down to design-wise is how best to get this information into the application with good performance.

If you are beginning to run up against performance problems and your infrastructure budget allows for it, you might look at migrating this solution to more of a service-oriented architecture that could be shared across any number of servers. I am personally a huge believer in this sort of architecture, as it can really help you in dealing with problems of scale.

You could expose "permissions" as a service that could be consumed by this application (or perhaps other applications you may need to develop in the future). It might look something like this:

  • In-memory caching layer (memcached or similar), which is where the application makes its initial call to look up permission information by group. If the data does not exist here, the next layer services the request (see the sketch after this list).
  • RESTful API. After a cache miss, you make a simple GET request for the permissions on your group. This layer calls into the database to get the information. It would also handle populating the cache on a miss, and invalidating and repopulating the cache when a client POSTs new permissions data or PUTs an update to an existing set of permissions.
  • DB layer, accessed only by the RESTful service. Perhaps MySQL, perhaps a NoSQL technology if you have more complex non-relational data structures.
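
As a rough illustration of the application-side flow (the host, key format, and service URL below are placeholders, not a specific recommendation):

<?php
// Illustrative lookup: check the in-memory cache first, fall back to the REST service
function getGroupPermissions($groupId)
{
    $memcached = new Memcached();
    $memcached->addServer('cache.internal', 11211); // placeholder cache host

    $key = 'group_permissions_' . (int) $groupId;
    $permissions = $memcached->get($key);

    if ($permissions !== false) {
        return $permissions; // cache hit - no DB or REST traffic
    }

    // Cache miss: GET the group's permissions from the REST service (placeholder URL)
    $json = file_get_contents('http://permissions.internal/groups/' . (int) $groupId);
    $permissions = json_decode($json, true);

    // The service could repopulate the cache itself; writing back here with a TTL also works
    $memcached->set($key, $permissions, 3600);

    return $permissions;
}

The important design point is that the application servers only ever talk to the cache and the REST service, never to the permissions database directly.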

For your service you could probably have a very small database server (because the database itself should be queried infrequently once the cache is populated), a memory caching server with sufficient memory to meet your storage needs (or possibly a small cluster of servers if redundancy is needed), and a relatively small server to handle the REST API (this should be infrequently accessed as well once the cache is populated). The good thing is that there are several memcached or similar services out there that you can use relatively cheaply (like Amazon ElastiCache). Really, the in-memory cache would take the brunt of the traffic from the application servers, so you would not need to scale up the REST server or DB server much at all as your traffic grows.

Hope this helps in your thought process a little.

Mike Brant
  • That certainly sounds like a very good solution, and definitely something I will look to for the future. However, at the moment I'm not sure it would be economical (although I would have to look into costs further). It's certainly a structure I will bear in mind if I develop an interim solution, with a view to porting it to something like this in the future. I explained the entries in a confusing way; it's actually more like 160-200 entries per group. What would you say is the maximum number of entries before using user sessions becomes inefficient? I assumed so many large nested arrays would be slow. – Chris B Aug 30 '12 at 22:47
  • @ChrisB The reality is that whether you pull the data from a service, from a flat file or wherever, it will end up residing in memory in your application (in session or in your user permissions object or whatever). And the more memory your app uses, the fewer requests your server can serve efficiently, based on its particular specs. You could simply use DB storage for the permissions in your case, and as long as you take advantage of your query cache, you should be able to get good performance. – Mike Brant Aug 30 '12 at 23:01
  • I have decided to utilize this kind of design, using either memcached or CouchBase to power my in-memory cache. I didn't realise how easy it was to set up a memcache 'cloud'. Looking into server statistics, I can easily use at least 200MB of RAM from each of my current servers without damaging the performance of any of the applications. With 3 servers this will give me 600MB of RAM for a cache, which should be enough if only used for permissions. If not, I could probably do with another webserver soon, which would probably give me about 1GB of RAM minimum to use for cache! – Chris B Aug 31 '12 at 18:24
  • @ChrisB I am glad that you were able to come up with a suitable solution. Memcached is greatness, and like you said very easy to use. At my work over the last few years, I have basically gone through the process of converting almost all of our web services to a cloud-based (AWS) service-oriented architecture like I described, and I find it much easier to deal with scaling concerns, as well as to integrate new application features/components, as you can just leverage existing services. – Mike Brant Aug 31 '12 at 18:29
  • Also worth noting: looking into an implementation of this solution has made me make the ACL I have been working on far more efficient. With a bit of work I reckon I could bring the in-memory cache requirement below 400MB (from around 700MB), making 600MB to 1GB ample even for spikes. – Chris B Aug 31 '12 at 18:35