While working on a hobby project of mine, I'm trying to find the most efficient solution to the following issue. Before I describe it: I was not sure whether to post this on https://math.stackexchange.com/ or here on https://stackoverflow.com/. So if I'm in the wrong place, sorry about that! ;)

Issue description:

I have a file containing elements, and each element can be represented by coordinates. The coordinates consist of two unsigned integers, for example: 0,2 6,3 or 1,157348. The coordinates of each element are unique within the given file. What I would like to achieve is a mapping of these coordinates so I can access them quickly. A very straightforward solution would be to create a collection that stores each pair of coordinates. For example:

$collection = [0 => [0,2],1 => [6,3], 2 => [1,157348], ...];

The problem with this solution is (I assume) the vast number of elements the file can contain. In my case it is very likely that the file contains hundreds of thousands of elements, leaving me with a very large mapping collection. This could cause memory issues and possibly crash the application.

I think storing the coordinates of an element as a paired array is inefficient. I think a more memory-efficient solution would be to encode the coordinates of an element as a single value that can be decoded again to get the real element coordinates.
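Rather than assuming, the memory cost of each layout can be measured. The following is a rough sketch, assuming PHP 7 or later: it compares nested pairs, one string per element, and one integer per element. The helper name `measureBytes` and the element count are made up for illustration, and the exact figures will vary by PHP version and platform.

```php
<?php
/**
 * Hypothetical helper: bytes allocated while the collection
 * returned by $build is alive.
 */
function measureBytes(callable $build): int
{
    $before = memory_get_usage();
    $data = $build();                       // keep the collection alive while measuring
    $used = memory_get_usage() - $before;
    unset($data);                           // release it before the next measurement
    return $used;
}

$n = 50000; // arbitrary element count, for illustration only

// Layout 1: one nested array per element.
$pairBytes = measureBytes(function () use ($n): array {
    $c = [];
    for ($i = 0; $i < $n; $i++) {
        $c[] = [$i, $i + 1];
    }
    return $c;
});

// Layout 2: one "x;y" string per element.
$stringBytes = measureBytes(function () use ($n): array {
    $c = [];
    for ($i = 0; $i < $n; $i++) {
        $c[] = $i . ';' . ($i + 1);
    }
    return $c;
});

// Layout 3: one integer per element (any integer costs the same).
$intBytes = measureBytes(function () use ($n): array {
    $c = [];
    for ($i = 0; $i < $n; $i++) {
        $c[] = $i;
    }
    return $c;
});

printf("pairs: %d, strings: %d, ints: %d bytes\n", $pairBytes, $stringBytes, $intBytes);
```

Measuring with a realistic element count on the target PHP version is more reliable than reasoning about byte sizes by hand, because every PHP value carries bookkeeping overhead beyond its raw payload.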

One solution would be to store the coordinates as strings. For example:

$collection = [0 => '0;2', 1 => '6;3', 2 => '1;157348', ...];

This leaves me with a one-dimensional array, which should be more memory efficient than the multi-dimensional solution above. Which is awesome. But this solution would require some extra steps:

Writing:

  1. Encoding the integers as a string, separated by a token (;).

Reading:

  1. Decoding the string by splitting it on the token (;).
  2. Casting the coordinate values back to integers.
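The writing and reading steps above could look roughly like this. It is a minimal sketch: the function names `encodePair` and `decodePair` are hypothetical, and it assumes the stored strings are always well-formed `x;y` values.

```php
<?php
/** Hypothetical helper: join two coordinates into "x;y". */
function encodePair(int $x, int $y): string
{
    return $x . ';' . $y;
}

/** Hypothetical helper: split "x;y" and cast both parts back to integers. */
function decodePair(string $encoded): array
{
    [$x, $y] = explode(';', $encoded, 2);
    return [(int) $x, (int) $y];
}

// Writing: build the one-dimensional collection.
$collection = [encodePair(0, 2), encodePair(6, 3), encodePair(1, 157348)];

// Reading: recover the coordinates of the third element.
[$x, $y] = decodePair($collection[2]);
```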

It would be nice if these steps could be avoided. And I don't think storing the coordinates as floats is a good idea, because of the (possibly) very large decimal part.

So I was thinking: I can't imagine I'm the first person to encounter this kind of problem. Is there a way (like an algorithm) to convert two integer values into a single integer value that can be converted back to the original values by computation?
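One technique of this kind is bit packing: put one coordinate in the high 32 bits of an integer and the other in the low 32 bits. The sketch below is only an illustration under two assumptions that the question does not confirm: PHP runs with 64-bit integers, and each coordinate fits in 32 bits. The names `packPair`, `unpackPair`, and `COORD_MAX` are made up.

```php
<?php
const COORD_MAX = 0xFFFFFFFF; // largest value that fits in 32 bits (assumed limit)

/** Hypothetical helper: pack two 32-bit coordinates into one integer. */
function packPair(int $x, int $y): int
{
    if (PHP_INT_SIZE < 8) {
        throw new RuntimeException('This sketch needs 64-bit integers.');
    }
    if ($x < 0 || $x > COORD_MAX || $y < 0 || $y > COORD_MAX) {
        throw new InvalidArgumentException('Coordinates must fit in 32 bits.');
    }
    return ($x << 32) | $y; // x in the high half, y in the low half
}

/** Hypothetical helper: recover both coordinates from a packed integer. */
function unpackPair(int $packed): array
{
    // The mask also undoes sign extension when the top bit of x is set.
    return [($packed >> 32) & COORD_MAX, $packed & COORD_MAX];
}

$collection = [packPair(0, 2), packPair(6, 3), packPair(1, 157348)];
[$x, $y] = unpackPair($collection[2]);
```

Packing and unpacking are a shift and a mask each, so the extra CPU cost per element is small. The packed value can be negative when the first coordinate is 2^31 or larger, but it still decodes to the original pair.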

The way I see it, this would have the following advantages:

  • No int -> string or string -> int conversion needed.
  • Storing the value as an integer is more efficient than storing it as a string.

A disadvantage would be:

  • Computing a single value from the coordinates and back would require more CPU time, slowing down the application.

I would prefer a slower solution over a more memory-hungry one, because I expect (putting it simply) that a memory-inefficient solution increases the risk of crashing the application, whereas a more CPU-intensive solution just increases the time the system takes to process the file.

So does anybody have a clue whether such an algorithm exists? Or if you have an idea for another way of solving this problem, I would be happy to hear it.

If you think that my assumptions are incorrect, please correct me in a constructive manner ;-)

Beach Chicken
  • Converting int to string isn't necessarily memory efficient. Well, in a few cases it is. In an ASCII-encoded string, a character is 1 byte. An integer is 4 bytes. `"99999"` is 5 bytes (6 if there's a terminating `\0` character like in C) while `99999` is 4 bytes. – Cid Sep 08 '18 at 12:28
  • Have you considered adding this data to a database table? You could then use indices that improve the speed of operations – Mario Sep 08 '18 at 12:34

1 Answer

Do you need to have the entire collection loaded in PHP? If you don't, you can use Redis (https://redis.io/) to store the collection and just query for the values that you need. It would help if you provided some more details about what you use the values for.

symques
  • Thanks for all the suggestions, I will try to review them and come back to you with my chosen method. I'm trying to create some kind of JSON parser. Note: I know it's reinventing the wheel. But the process is very interesting and full of things to learn from. – Beach Chicken Sep 11 '18 at 17:58