I'm writing indexing policies for my collection and trying to figure out the right "Precision" for a String Hash index, e.g.:

collection.IndexingPolicy.IncludedPaths.Add(
    new IncludedPath
    {
        Path = "/customId/?",
        Indexes = new Collection<Index>
        {
            new HashIndex(DataType.String) { Precision = 20 }
        }
    });

There will be around 10,000 different customId values, so what is the right "Precision"? What if the count grows beyond 100,000,000 IDs?

Vej

1 Answer

There will be around 10,000 different customId, so what is the right "Precision"? What if it gets more than 100,000,000 ids?

As Andrew Liu said in this thread, the indexing precision for a hash index indicates the number of bytes the property value is hashed into.

As we know, 1 byte = 8 bits, which can hold 2^8 = 256 values; 2 bytes can hold 2^16 = 65,536 values, and so forth. You can do a similar calculation to choose the indexing precision based on the number of distinct customId values you expect across your documents.
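The byte arithmetic above can be sketched as a small helper. This is only an illustration: `PrecisionEstimate` and `BytesForDistinctValues` are hypothetical names, not part of the DocumentDB SDK, and the result is just the smallest byte count whose value space covers the expected number of distinct values. Hash values can still collide, so treat the result as a lower bound rather than a recommended setting.

```csharp
using System;

public static class PrecisionEstimate
{
    // Hypothetical helper (not an SDK API): returns the smallest number of
    // bytes n such that 256^n >= distinctValues.
    public static int BytesForDistinctValues(long distinctValues)
    {
        if (distinctValues < 1)
            throw new ArgumentOutOfRangeException(nameof(distinctValues));

        int bytes = 1;
        // Repeated division by 256 avoids overflow for very large inputs.
        long remaining = (distinctValues - 1) / 256;
        while (remaining > 0)
        {
            remaining /= 256;
            bytes++;
        }
        return bytes;
    }

    public static void Main()
    {
        // 256 < 10,000 <= 65,536 (2^16), so 2 bytes
        Console.WriteLine(BytesForDistinctValues(10000));
        // 2^24 < 100,000,000 <= 2^32, so 4 bytes
        Console.WriteLine(BytesForDistinctValues(100000000));
    }
}
```

Under this calculation, 10,000 distinct IDs fit in 2 bytes and 100,000,000 fit in 4 bytes. Choosing a value somewhat above this minimum is a judgment call that trades extra index storage for fewer hash collisions.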

Besides, you could refer to the Index precision section in this article and weigh the tradeoff between index storage overhead and query performance when specifying the index precision.

Fei Han
  • I had already read [that document](https://docs.microsoft.com/en-us/azure/documentdb/documentdb-indexing-policies), but hadn't found [Andrew Liu's answer](http://stackoverflow.com/questions/32732858/documentdb-guid-index-precision). Thank you! It's very helpful. – Vej May 05 '17 at 11:43