
I have designed a JSON representation of a mailbox so that I can look up mails easily, for example mailjson[UID].Body.

However, after looking at AngularJS and Ember (templating MVC JS frameworks), it seems that the JSON should be of the format:

[
  {
    "id": 1,
    "body": "Blah blah blah..."
  },
  {
    "id": 2,
    "body": "More blah foo blah"
  },
  {
    "id": 3,
    "body": "Hopefully you understand this example"
  }
]

And then there is some findAll(id) function that iterates through the JSON to grab the item with the id one wants. So now I'm wondering: does my JSON design have merit? Am I doing it wrong? Why don't people use the dict lookup design I'm using with my JSON?
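To make the comparison concrete, here is a minimal sketch of the two lookups. The names `mailArray`, `mailDict` and `findById` are made up for illustration; they are not part of AngularJS or Ember, and `findById` only stands in for whatever finder a framework provides.

```javascript
// Array layout: each mail carries its own id.
const mailArray = [
  { id: 1, body: "Blah blah blah..." },
  { id: 2, body: "More blah foo blah" },
  { id: 3, body: "Hopefully you understand this example" }
];

// Keyed layout: the UID is the property name.
const mailDict = {
  "1": { body: "Blah blah blah..." },
  "2": { body: "More blah foo blah" },
  "3": { body: "Hopefully you understand this example" }
};

// Hypothetical helper: a linear scan over the array, O(n) per lookup.
function findById(mails, id) {
  return mails.find((mail) => mail.id === id);
}

// Array layout needs the scan.
const fromArray = findById(mailArray, 2);

// Keyed layout is a direct property access; the number 2 is coerced to "2".
const fromDict = mailDict[2];
```

Both lookups return the same mail body; the difference is only in how much work each lookup does and in how the data has to be laid out in the file.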

I would be grateful for any other tips on making sure I have a good data structure design.

hendry
  • It sounds like you are basically asking for the (dis)advantages of hash tables vs lists/arrays. It all depends on what you are mainly doing with your data. – Felix Kling Dec 11 '13 at 07:33
  • I'm doing this for generating mail archives https://github.com/kaihendry/imap2json – hendry Dec 11 '13 at 07:35
  • That is the wrong answer :-) What he means is it depends on what you DO with it, not how you store it. What algorithm you use for *processing* the data. If you know you are going to access the n-th element in a storage container, go for arrays; if you know you are going to have some string key to find a piece of information inside a large storage container, go for hashes. One is ordered (numbered) storage, the other one is unordered. If you are going to retrieve your emails by ID, and there are no (large) gaps in the ID numbering, arrays are better. – Mörre Dec 11 '13 at 08:48
  • UIDs in my mail.json example are consecutive, but they will have gaps since mail is deleted or moved to different mailboxes over time. So you often see a mail.json that looks like 2,4,5,6,10,11,15,16,17. Gaps are frequent, but I wouldn't say they are large. – hendry Dec 12 '13 at 02:45

1 Answer


The best practice for storing a big table in JSON is to use an array.

The reason is that parsing a JSON array into an in-memory array carries no speed penalty for building a map. If you need an in-memory index on several fields for quick access, you can build it during or after loading. But if you store the JSON the way you did, you cannot load it quickly without building a map, because the JSON parser always has to build that huge map of ids dictated by your structure.
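As a rough sketch of building the index after loading: the JSON text, the `loadedMails` variable and the `byId` index below are illustrative assumptions, with UIDs that have gaps as described in the comments.

```javascript
// Stored form: a plain array, here with gaps in the ids (2, 4, 5).
const storedJson = JSON.stringify([
  { id: 2, body: "Blah blah blah..." },
  { id: 4, body: "More blah foo blah" },
  { id: 5, body: "Hopefully you understand this example" }
]);

// Step 1: parse into an array. No keyed structure is built at this point.
const loadedMails = JSON.parse(storedJson);

// Step 2: build the index only if and when it is needed.
// A Map keeps numeric ids as numbers instead of turning them into strings.
const byId = new Map();
for (const mail of loadedMails) {
  byId.set(mail.id, mail);
}

// Lookups by id are now direct, and the array order is still available.
const mailFour = byId.get(4);
const hasThree = byId.has(3); // id 3 is one of the gaps
```

An index on another field could be built in the same loop, which is the flexibility the array layout leaves open.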

The structure of your data in memory does not have to be (and should not be) the same as the structure of its storage on disk, because there is no way to serialize or deserialize JavaScript's internal map structure. If that were possible, you would serialize and store the indexes, just as MS SQL Server stores tables and indexes.
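One way to read this point, sketched under the assumption that the in-memory index is a JavaScript `Map`: `JSON.stringify` does not write out a `Map`'s entries, so the index is rebuilt on load and flattened back to an array on save. The helper names `toStorage` and `fromStorage` are hypothetical.

```javascript
// In-memory form: an index keyed by id.
const mailIndex = new Map([
  [2, { id: 2, body: "Blah blah blah..." }],
  [5, { id: 5, body: "More blah foo blah" }]
]);

// A Map has no enumerable own properties, so its entries are lost here.
const naive = JSON.stringify(mailIndex);

// Hypothetical helper: flatten the index to the array layout for storage.
function toStorage(index) {
  return JSON.stringify(Array.from(index.values()));
}

// Hypothetical helper: parse the array and rebuild the index from it.
function fromStorage(text) {
  return new Map(JSON.parse(text).map((mail) => [mail.id, mail]));
}

const onDisk = toStorage(mailIndex);
const restored = fromStorage(onDisk);
```

The stored text stays a simple array, and the keyed structure exists only in memory where it is useful.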

But if the framework you are using forces you to have the same structure in memory and on disk, then I support your choice of using the id as the key in one big object, because it is then easier to pass the id in JSON requests to and from the server to act on an email item or update its state in the UI, assuming that both server and browser keep a significant list of email items in memory.

alpav