10

(There are some questions about time-efficient sparse arrays but I am looking for memory efficiency.)

I need the equivalent of a List<T> or Map<Integer,T> which

  1. Can grow on demand just by setting a key larger than any encountered before. (Can assume keys are nonnegative.)
  2. Is about as memory-efficient as an ArrayList<T> in the case that most of the indices are not null, i.e. when the actual data is not very sparse.
  3. When the indices are sparse, consumes space proportional to the number of non-null indices.
  4. Uses less memory than HashMap<Integer,T> (as this autoboxes the keys and probably does not take advantage of the scalar key type).
  5. Can get or set an element in amortized log(N) time where N is the number of entries: need not be linear time, binary search would be acceptable.
  6. Implemented in a nonviral open-source pure Java library (preferably in Maven Central).

Does anyone know of such a utility class?

I would have expected Commons Collections to have one but it did not seem to.

I came across org.apache.commons.math.util.OpenIntToFieldHashMap which looks almost right except the value type is a FieldElement which seems gratuitous; I just want T extends Object. It looks like it would be easy to edit its source code to be more generic, though I would rather use a binary dependency if one is available.
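To make the idea concrete, here is a minimal sketch of what a generic-valued, int-keyed open-addressing map could look like. It is not the Commons Math source: the class name `IntObjectOpenMap`, the hash mix, the 0.5 load factor, and the omission of removal are all assumptions chosen for brevity.

```java
/** Hypothetical sketch of an int-keyed hash map with linear probing (no removal support). */
final class IntObjectOpenMap<T> {
    private int[] keys = new int[16];          // capacity is always a power of two
    private Object[] values = new Object[16];
    private boolean[] used = new boolean[16];  // marks occupied slots, so null values are allowed
    private int size = 0;

    /** Returns the slot holding key, or the first free slot on its probe path. */
    private int find(int key) {
        int mask = keys.length - 1;
        int h = key * 0x9E3779B9;              // multiplicative mix (an assumed choice)
        int i = (h ^ (h >>> 16)) & mask;
        while (used[i] && keys[i] != key) {
            i = (i + 1) & mask;                // linear probing
        }
        return i;
    }

    /** Returns the value for key, or null if the key is absent. */
    @SuppressWarnings("unchecked")
    public T get(int key) {
        int i = find(key);
        return used[i] ? (T) values[i] : null;
    }

    /** Associates value with key, growing the table to keep the load at or below 0.5. */
    public void put(int key, T value) {
        if ((size + 1) * 2 > keys.length) {
            resize();
        }
        int i = find(key);
        if (!used[i]) {
            used[i] = true;
            keys[i] = key;
            size++;
        }
        values[i] = value;
    }

    public int size() {
        return size;
    }

    /** Doubles the capacity and reinserts every occupied slot. */
    private void resize() {
        int[] oldKeys = keys;
        Object[] oldValues = values;
        boolean[] oldUsed = used;
        int capacity = oldKeys.length * 2;
        keys = new int[capacity];
        values = new Object[capacity];
        used = new boolean[capacity];
        for (int j = 0; j < oldKeys.length; j++) {
            if (oldUsed[j]) {
                int i = find(oldKeys[j]);
                used[i] = true;
                keys[i] = oldKeys[j];
                values[i] = oldValues[j];
            }
        }
    }
}
```

Keys are stored unboxed in an `int[]`, which is the saving over `HashMap<Integer,T>`; the price of the low load factor is that at least half of the slots stay empty.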

Jesse Glick

5 Answers

6

I would try the Trove collections; there is TIntObjectMap, which may suit your purposes.

Jack
  • That looks good. I tried adapting `OpenIntToFieldHashMap` to a generic value type, which seems to have worked with ~10min work, but it only performs marginally better than `TIntObjectMap`. – Jesse Glick Sep 27 '12 at 17:06
5

I would look at Android's SparseArray implementation for inspiration. You can view the source by downloading AOSP's source code here http://source.android.com/source/downloading.html
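As a rough illustration of the general approach (not Android's actual code), a sparse array can keep its keys in a sorted `int[]` with a parallel value array and locate entries by binary search. The class name `SortedSparseArray`, the initial capacity, and the growth factor below are assumptions made for this sketch.

```java
import java.util.Arrays;

/** Hypothetical sketch: an int-keyed sparse array backed by parallel sorted arrays. */
final class SortedSparseArray<T> {
    private int[] keys = new int[8];           // sorted ascending in positions [0, size)
    private Object[] values = new Object[8];   // values[i] belongs to keys[i]
    private int size = 0;

    /** Returns the value stored at key, or null if the key is absent. */
    @SuppressWarnings("unchecked")
    public T get(int key) {
        int i = Arrays.binarySearch(keys, 0, size, key);
        return i >= 0 ? (T) values[i] : null;
    }

    /** Stores value at key, inserting a new entry in sorted position if needed. */
    public void put(int key, T value) {
        int i = Arrays.binarySearch(keys, 0, size, key);
        if (i >= 0) {
            values[i] = value;                 // key already present: overwrite
            return;
        }
        int at = -(i + 1);                     // insertion point reported by binarySearch
        if (size == keys.length) {
            int capacity = size + (size >> 1) + 1;   // grow by roughly 1.5x
            keys = Arrays.copyOf(keys, capacity);
            values = Arrays.copyOf(values, capacity);
        }
        // Shift the tail one slot to the right to make room for the new entry.
        System.arraycopy(keys, at, keys, at + 1, size - at);
        System.arraycopy(values, at, values, at + 1, size - at);
        keys[at] = key;
        values[at] = value;
        size++;
    }

    public int size() {
        return size;
    }
}
```

In this sketch lookups take O(log n), and memory is one `int` plus one reference per stored entry. Inserting a new key that is not the largest so far shifts the tail of both arrays, so a single insertion can cost O(n) in the worst case; appending keys in ascending order avoids the shift.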

  • https://code.google.com/p/android-source-browsing/source/browse/core/java/android/util/SparseArray.java?spec=svn.platform--frameworks--base.58aff7debfdab8ca99dd6bfcfa0c7bebdf2d303b&repo=platform--frameworks--base&r=58aff7debfdab8ca99dd6bfcfa0c7bebdf2d303b does look to be appropriate: amortized run time is undocumented, but from inspection I am guessing it is logarithmic, and it is under ASL 2.0, which is fine. Unfortunately it is not in Central that I know of, and I would want it decoupled from unrelated stuff like Android Bluetooth support, which is all in the same source root. – Jesse Glick Jul 26 '13 at 14:24
  • 1
    Here's a self contained version that uses all the necessary code from android https://github.com/frostwire/frostwire-jlibtorrent/blob/b4b3f9a90d7a1dade864d7e3eaa88b616f200a9a/src/com/frostwire/jlibtorrent/SparseArray.java – Gubatron Oct 28 '14 at 01:44
  • You might be looking more precisely for `SparseIntArray` where you avoid costs of boxing/unboxing the indexes https://developer.android.com/reference/android/util/SparseIntArray And yes source code is available and easy+friendly license if you want to extract it from the google code base and adapt it. – Yann TM Dec 11 '19 at 19:47
1

I have saved my test case as jglick/inthashmap. The results:

HashMap size: 1017504
TIntObjectMap size: 853216
IntHashMap size: 846984
OpenIntObjectHashMap size: 760472
Madhawa Priyashantha
Jesse Glick
  • 1
    Where do I find IntHashMap? – oleh Mar 01 '13 at 12:43
  • @oleh probably apache commons (?) – Karussell Apr 05 '13 at 19:17
  • 1
    Sorry, `IntHashMap` was my adaptation of `OpenIntToFieldHashMap` from Commons Math. Since it was barely better than `TIntObjectMap` I dismissed this approach. – Jesse Glick Jul 26 '13 at 14:17
  • 1
    @JesseGlick see http://java.dzone.com/articles/time-memory-tradeoff-example and https://gist.github.com/leventov/bc14ea790b4d3cfd238d#file-memory-txt – leventov Aug 17 '14 at 23:52
  • @leventov interesting. Addresses a different set of questions than I was asking here but a good source to investigate potential implementations. – Jesse Glick Aug 18 '14 at 13:39
  • Why different? It shows the average relative memory overuse of "int -> int" maps in libraries, which correlates well with "int -> obj" because specializations are homogeneous within all libs. – leventov Aug 18 '14 at 13:44
  • Well, you are also measuring access speed which I was not considering relevant (so long as it is logarithmic); and my memory comparison is to a non-sparse `ArrayList`, which would be half the size of the “theoretical minimum” in the more general case you were considering. – Jesse Glick Aug 19 '14 at 15:24
  • In your answer only different hash table implementations are compared. I referenced another comparison, which includes all the impls you tested, and more. – leventov Aug 21 '14 at 17:39
1

I suggest using OpenIntObjectHashMap from the Colt library. Link

abhi
  • Thanks for the tip. It does indeed have moderately but significantly lower space consumption than the alternatives. I have included this in my revised test case. – Jesse Glick May 14 '14 at 23:34
0

Late to this question, but there is IntMap in libgdx which uses cuckoo hashing. If anything it would be interesting to compare with the others.

NateS