ignite-user mailing list archives

From svonn <sveng...@posteo.de>
Subject Student Blog about Apache Ignite & Questions how to efficiently handle data
Date Mon, 04 Dec 2017 16:07:55 GMT
Hello!

I noticed that this community is pretty active, so there might be some
people that are interested in this:

For a university project, we're trying to compare different stream
processing engines. I decided to use Apache Ignite.
Since our professorship hasn't really worked with most of those engines,
we're supposed to write a blog about our progress - mostly focused on
the stuff that doesn't work.
So if you're interested in following a newbie struggling with Apache Ignite,
you might like this:

https://wordpress.com/post/streambench.wordpress.com/1310

I am always super happy about recommendations, tips, and experience I can
get from others, so don't hesitate with feedback!

__________________________


Now, to the question part:

The following data is to be streamed to Ignite:

Measurement: A measurement is basically the biggest entity, containing both
AccelerationPoints and GpsPoints.
The key sent by our producer consists of the deviceId concatenated with a
measurementId. This key is unique and works as the primary key.
GpsPoint: A GpsPoint belongs to a Measurement and has the very same key,
deviceId:MeasurementId. Its value (serialized as byte[] in Kafka; I currently
convert it to a binary object for Ignite) contains a timestamp, which,
combined with the key provided by Kafka, would probably be enough for a
primary key.
AccelerationPoint: Similar to a GpsPoint, except that we get about 200 of
those per GpsPoint - they need to be interpolated later.

The issue now is that Kafka Connect provides the same deviceId:MeasurementId
key schema for all three entities - therefore I'm either running into key
conflicts or I'm forced to overwrite the data already in the cache.
How do I deal with those key issues? 
I was thinking about either adding the timestamp from the value to the key
or just adding each new entity to a list of entities with the same key. For
both of those approaches, I don't really know how to do it properly since it has to
happen somewhere between consuming the data from Kafka and writing it to
Ignite.
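
To make the first idea concrete, here is a minimal sketch of what I mean by
adding the timestamp to the key. It is plain Java with no Ignite or Kafka
dependency: PointKey and toPointKey are hypothetical names I made up, the
HashMap only stands in for a cache, and I'm assuming the timestamp is a long
and that the deviceId itself contains no colon. It also assumes that no two
points of the same type in one measurement share a timestamp, which I haven't
verified for the AccelerationPoints.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class CompositeKeySketch {

    /** Hypothetical composite key: the two Kafka key parts plus the value's timestamp. */
    static final class PointKey {
        final String deviceId;
        final String measurementId;
        final long timestamp;

        PointKey(String deviceId, String measurementId, long timestamp) {
            this.deviceId = deviceId;
            this.measurementId = measurementId;
            this.timestamp = timestamp;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof PointKey)) return false;
            PointKey other = (PointKey) o;
            return timestamp == other.timestamp
                && deviceId.equals(other.deviceId)
                && measurementId.equals(other.measurementId);
        }

        @Override
        public int hashCode() {
            return Objects.hash(deviceId, measurementId, timestamp);
        }

        @Override
        public String toString() {
            return deviceId + ":" + measurementId + "@" + timestamp;
        }
    }

    /** Splits "deviceId:measurementId" at the first colon and attaches the timestamp. */
    static PointKey toPointKey(String kafkaKey, long timestamp) {
        int sep = kafkaKey.indexOf(':');
        if (sep < 0) {
            throw new IllegalArgumentException(
                "expected deviceId:measurementId, got " + kafkaKey);
        }
        return new PointKey(
            kafkaKey.substring(0, sep), kafkaKey.substring(sep + 1), timestamp);
    }

    public static void main(String[] args) {
        // Stand-in for a per-entity cache; the String values are placeholders.
        Map<PointKey, String> gpsPoints = new HashMap<>();
        gpsPoints.put(toPointKey("dev1:m42", 1000L), "gps point at t=1000");
        gpsPoints.put(toPointKey("dev1:m42", 2000L), "gps point at t=2000");

        // Same Kafka key, different timestamps: both entries are kept.
        System.out.println(gpsPoints.size());
    }
}
```

Keeping one such map (or cache) per entity type would also stop GpsPoints and
AccelerationPoints from colliding with each other, but the conversion would
still have to run somewhere between the Kafka consumer and the cache write.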

Any help would be appreciated!

Best regards,
Sven 




--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
