Rowkey for time-series container of GridDB based on gsCurrentTime()

Question

I have input from a wide variety of sensors, each of which only ever produces one or two rows of input, so creating a new container per sensor makes little sense. The data arrives in an order that must not be lost, so I considered numbering the input rows sequentially as they come in. I then wanted to also capture the spacing between the inputs. After first adjusting the IDs so that they were no longer sequential, I am now considering timestamps as rowkeys, assigned at the moment the data is written into a row. I've found mentions, in regard to other databases, that this can cause problems, because the row then contains information that is not strictly part of the data itself.
So essentially the rowkey is set by: gsSetRowFieldByTimestamp(row, 0, gsCurrentTime()); Would using this time function to supply the rowkey for a time series be appropriate? Are there any foreseeable issues, besides the possibly obvious one that this effectively limits insertion to the resolution of gsCurrentTime()?

Please post what you have tried and how it didn't result in what you wanted. — user3629249, Jul 09 '20 at 01:39
It's part of the question already, isn't it? First it was a simple sequential order of IDs like 1,2,3,4. Then I realised this would make the data points seem equally spaced when graphed, so I increased the ID according to the time between input events: 1,2,300,1522,1523 and so on. That works, but it also creates somewhat meaningless numbers as the only reference back to the data, so looking up data in the database was difficult. This is how I came to just using the insertion time. — Frostbite, Jul 09 '20 at 02:45
That question contains no code. So it does NOT show what you have tried, nor does it show an 'expected output' or an 'actual output'. Therefore, the question does not meet the [mcve] idea, and we can neither reproduce the problem nor help you debug it. — user3629249, Jul 09 '20 at 22:29

score 1 · Accepted Answer · answered Jul 15 '20 at 14:13

First, even if a sensor only has a few columns, I believe the data schema should still be one container per device. Yes, it seems wasteful, but it is the GridDB way. GridDB needs multiple containers in order to partition data among its nodes when clustering is used. Using multi-query will negate any performance issues on the read side of your application.

Now, if you insist on using a single container, it is important to note that your data collector must be single-threaded to avoid theoretical rowkey collisions. And yes, use gsCurrentTime(), or TimestampUtils.current() in Java.
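
Even with a single-threaded collector, two rows written within the same clock tick would get the same rowkey. A minimal sketch of one way to guard against that is below. It assumes the timestamp is an int64_t count of milliseconds (which is how I understand the value returned by gsCurrentTime()), and next_row_key_ms is a hypothetical helper of my own, not part of the GridDB API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the GridDB API.
 *
 * now_ms      : current time in milliseconds, e.g. the value returned by
 *               gsCurrentTime() (assumed to be an int64_t millisecond count).
 * last_key_ms : the previously issued key; owned by the single writer thread.
 *
 * Returns a key that is strictly greater than every key issued before,
 * so two rows arriving in the same millisecond never share a rowkey. */
int64_t next_row_key_ms(int64_t now_ms, int64_t *last_key_ms)
{
    assert(last_key_ms != NULL);

    int64_t key = now_ms;
    if (key <= *last_key_ms) {
        /* Same tick as the previous row, or the clock stepped backwards:
         * bump by one millisecond to keep keys unique and ordered. */
        key = *last_key_ms + 1;
    }
    *last_key_ms = key;
    return key;
}
```

With that in place the rowkey could be set with something like gsSetRowFieldByTimestamp(row, 0, next_row_key_ms(gsCurrentTime(), &last_key)), where last_key is a variable kept by the collector. The trade-off is that a bumped key is slightly later than the real arrival time, so the spacing between rows in a burst is only approximate.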
