ignite-user mailing list archives

From Welly Tambunan <if05...@gmail.com>
Subject Re: Storing Time Series data efficiently on Ignite
Date Thu, 03 Dec 2015 13:51:33 GMT
Hi Denis,

Thanks a lot. This is really useful. I will experiment with this approach.

Cheers

On Thu, Dec 3, 2015 at 7:59 PM, Denis Magda <dmagda@gridgain.com> wrote:

> Welly,
>
> Please see below
>
> On 12/3/2015 3:33 PM, Welly Tambunan wrote:
>
> Hi Denis,
>
> Thanks for your clear explanation.
>
> Our data structure is something like this one <sensorid: UUID, time: Long,
> value: Double>
>
> When I put a data point with the composite key <sensorid, time>, is
> there any guarantee that the data will be stored close together on the
> same node?
>
> Yes. To force this behavior you should use so-called affinity
> collocation:
> https://apacheignite.readme.io/docs/affinity-collocation
>
> In your case you can mark sensorId as the affinity key. This forces all
> of a sensor's data (and keys) to be stored in the partition that its
> sensorId maps to.
>
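> import org.apache.ignite.cache.affinity.AffinityKeyMapped;
>
> class SampleKey {
>     // Entries with the same sensorId are mapped to the same partition.
>     @AffinityKeyMapped
>     int sensorId;
>
>     long sampleTime;
> }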
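
As a rough illustration of what affinity collocation means, the sketch below models partition assignment in plain Java, with no Ignite dependency. It assumes a simplified scheme (hash of the affinity field modulo a fixed partition count); Ignite's real affinity function is more involved. The names `AffinitySketch` and `partitionFor` and the partition count of 1024 are hypothetical. The only point is that the partition depends on sensorId and not on sampleTime.

```java
final class AffinitySketch {
    // Hypothetical partition count, chosen for illustration only.
    static final int PARTITIONS = 1024;

    // Simplified stand-in for a compound key whose affinity field is sensorId.
    static final class SampleKey {
        final int sensorId;     // plays the role of the @AffinityKeyMapped field
        final long sampleTime;

        SampleKey(int sensorId, long sampleTime) {
            this.sensorId = sensorId;
            this.sampleTime = sampleTime;
        }
    }

    // Only the affinity field takes part in the mapping; sampleTime is ignored.
    // This is an assumed toy scheme, not Ignite's actual affinity function.
    static int partitionFor(SampleKey key) {
        return Math.floorMod(Integer.hashCode(key.sensorId), PARTITIONS);
    }
}
```

Two keys for the same sensor at different times land in the same partition under this model, which is the property the annotation is meant to give you.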
>
> For updates, however, we need to be able to update a range, i.e.
> replace the range <startIdx, endIdx, List[DataPoint]> (where DataPoint =>
> <time, value>),
> so that it clears that range and inserts the new data points into it.
>
> If the result set of such a query is not large, you can split the update
> into two steps:
> - use an SQL query to retrieve the keys of the entries that should be
> updated or removed according to your 'WHERE' clause above;
> - use removeAll or putAll to delete or update the values for those keys.
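
The two steps above can be sketched in plain Java, with a `java.util.Map` standing in for the cache and a loop standing in for the SQL query. Everything here is an assumption made for illustration: the class `RangeReplaceSketch`, the helpers `selectKeys` and `replaceRange`, and the use of `Double` values. Against a real cache, step 1 would be an SQL query and step 2 would be the removeAll and putAll calls mentioned above. Note that the two steps are separate operations, so this sketch is not atomic.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

final class RangeReplaceSketch {
    // Compound key with value equality so it can be used as a map key.
    static final class SampleKey {
        final int sensorId;
        final long sampleTime;

        SampleKey(int sensorId, long sampleTime) {
            this.sensorId = sensorId;
            this.sampleTime = sampleTime;
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof SampleKey)) return false;
            SampleKey k = (SampleKey) o;
            return sensorId == k.sensorId && sampleTime == k.sampleTime;
        }

        @Override public int hashCode() {
            return Objects.hash(sensorId, sampleTime);
        }
    }

    // Step 1: stand-in for the SQL query. Collects the keys of one sensor
    // whose sampleTime lies in [start, end], both bounds inclusive.
    static Set<SampleKey> selectKeys(Map<SampleKey, Double> cache,
                                     int sensorId, long start, long end) {
        Set<SampleKey> keys = new HashSet<>();
        for (SampleKey k : cache.keySet()) {
            if (k.sensorId == sensorId && k.sampleTime >= start && k.sampleTime <= end) {
                keys.add(k);
            }
        }
        return keys;
    }

    // Step 2: remove the selected keys, then insert the replacement points.
    static void replaceRange(Map<SampleKey, Double> cache, int sensorId,
                             long start, long end, Map<Long, Double> newPoints) {
        cache.keySet().removeAll(selectKeys(cache, sensorId, start, end));
        Map<SampleKey, Double> batch = new HashMap<>();
        for (Map.Entry<Long, Double> e : newPoints.entrySet()) {
            batch.put(new SampleKey(sensorId, e.getKey()), e.getValue());
        }
        cache.putAll(batch);
    }
}
```

Only the keys inside the range are touched; samples outside the range and samples of other sensors stay in place, which is the partial update asked about.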
>
>
> So we need a two-step process for that? Select the keys and then
> delete them?
>
> If the question is about range queries like the one above, then yes:
> when the SQL result set is not large, follow this approach.
> Otherwise it is always possible to come up with another solution.
>
> Regards,
> Denis
>
>
> Thanks
>
>
>
>
>
> On Thu, Dec 3, 2015 at 7:00 PM, Denis Magda <dmagda@gridgain.com> wrote:
>
>> Hi Welly,
>>
>> Ignite is a good fit for your task.
>>
>> First, if I understand you correctly, there will be many time series
>> for a given sensor ID.
>> If so, I would use a compound key like (sensorId, time) for all
>> cache-related operations.
>>
>> For example, you may want to use classes like these.
>>
>> class SampleKey {
>>     int sensorId;
>>     long sampleTime;
>> }
>>
>> class Sample {
>>     int sensorId;
>>     long sampleTime;
>>     byte[] data1;
>>     byte[] data2;
>>     // etc.
>> }
>>
>> And use them this way
>>
>> cache.put(new SampleKey(1, time), sample);
>> cache.get(new SampleKey(2, time));
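
The put/get pair above relies on two SampleKey instances built from the same values being treated as the same key. In a plain `java.util.Map` that requires `equals` and `hashCode`; how Ignite itself compares keys depends on its marshaller and version, so treat the following only as a plain-Java sketch of the idea. The `HashMap` stands in for the cache, and the class `CompoundKeySketch`, the helper `putThenGet`, the constructor, and the `String` value type are hypothetical additions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

final class CompoundKeySketch {
    // Compound (sensorId, sampleTime) key with value-based equality.
    static final class SampleKey {
        final int sensorId;
        final long sampleTime;

        SampleKey(int sensorId, long sampleTime) {
            this.sensorId = sensorId;
            this.sampleTime = sampleTime;
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof SampleKey)) return false;
            SampleKey k = (SampleKey) o;
            return sensorId == k.sensorId && sampleTime == k.sampleTime;
        }

        @Override public int hashCode() {
            return Objects.hash(sensorId, sampleTime);
        }
    }

    // Stores one value, then reads it back through a freshly built equal key.
    static String putThenGet(int sensorId, long time, String sample) {
        Map<SampleKey, String> cache = new HashMap<>(); // stand-in for the cache
        cache.put(new SampleKey(sensorId, time), sample);
        return cache.get(new SampleKey(sensorId, time));
    }
}
```

Without the two overrides, the lookup in `putThenGet` would use a different object identity than the one stored and the map would return null.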
>>
>> Second, to retrieve samples by sensor ID, time (or time range), or
>> other parameters, you can use the Ignite SQL engine, which is designed
>> for exactly this kind of use case. [1]
>> However, if you are going to use an object field in a 'SELECT' or 'WHERE'
>> clause, you have to annotate it properly or declare it using
>> CacheTypeMetadata. [2]
>>
>> Third, when you need to update a data series, you can remove the old one
>> and insert the new one, which should have the new time.
>> To perform the remove I would suggest the following:
>> - select the sensor ID and sampleTime of all the entries to delete;
>> - call cache.removeAll, passing SampleKeys created from the
>> data retrieved by the SQL query above.
>>
>> Moreover, you can use an eviction or expiry policy for cases
>> when old data must be removed from the cache automatically.
>> Refer to these articles for more info: [3], [4]
>>
>> Finally, Ignite has a bunch of cache- and SQL-related examples. They are
>> located in the "datagrid" folder of the "examples" module.
>> Have a look at them; you will probably come up with a better
>> Ignite-based solution than the one I suggested above, because you
>> definitely know the details of your case better ;)
>>
>> [1] https://apacheignite.readme.io/docs/sql-queries
>> [2] https://apacheignite.readme.io/docs/sql-queries#configuring-sql-indexes-by-annotations
>> [3] https://apacheignite.readme.io/docs/evictions
>> [4] https://apacheignite.readme.io/docs/expiry-policies
>>
>>
>> Regards,
>> Denis
>>
>>
>> On 12/3/2015 12:50 PM, Welly Tambunan wrote:
>>
>> Hi Igniters,
>>
>> Currently we are trying to assess the possibility of using Ignite in our
>> architecture.
>>
>> We have a case where we want to store time series data in memory. We
>> will have a lot of sensor data, so I think I will use the sensor ID as
>> the key to retrieve the time series from the cache.
>>
>> However, I can't find any sorted list structure in Ignite to store our
>> time series. The index can be a long (for time), so the series will need
>> to be sorted by index.
>>
>>
>> We also have a query for getting a range of indexes, e.g. "give me all
>> series from start idx to end idx".
>> For updates we also need to be able to update a range with a new data
>> series.
>>
>> We don't want to re-upload the data to the cache over and over again
>> every time there's an update on the range. We want to be able to
>> update the cache partially, based on the range.
>>
>> Is there any way we can achieve this in Ignite?
>>
>> Any suggestions or references would be really appreciated.
>>
>> Cheers
>>
>> --
>> Welly Tambunan
>> Triplelands
>>
>> http://weltam.wordpress.com
>> http://www.triplelands.com <http://www.triplelands.com/blog/>
>>
>>
>>
>
>
> --
> Welly Tambunan
> Triplelands
>
> http://weltam.wordpress.com
> http://www.triplelands.com <http://www.triplelands.com/blog/>
>
>
>


-- 
Welly Tambunan
Triplelands

http://weltam.wordpress.com
http://www.triplelands.com <http://www.triplelands.com/blog/>
