storm-user mailing list archives

From Adam Lewis <m...@adamlewis.com>
Subject Re: Svend's blog - several questions
Date Wed, 05 Feb 2014 19:37:34 GMT
To your first two questions:


Q1: Do the db keys come only from groupBy?

Yes, that is how MapStates get their keys.


Q2: Can you do groupBy on multiple keys, like .groupBy("name").groupBy("id")?

Yes, you can specify several fields in a single groupBy,
e.g. myStream.groupBy(new Fields("name", "id"))
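
To make the shape of those keys concrete, here is a rough plain-Java sketch. It does not use Storm at all: InMemoryBackingMap and GroupByKeyDemo are hypothetical names made up for illustration, and the "name"/"id" values are invented. The assumption is that each key handed to the state is a list holding one value per grouped field, and that multiGet answers with one result per key, in the same order, with null where nothing is stored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a map-state backend; this is NOT a Storm class.
class InMemoryBackingMap {
    private final Map<List<Object>, Long> store = new HashMap<>();

    // Returns one result per key, in the same order as the keys.
    // A key with no stored value yields null at that same position.
    List<Long> multiGet(List<List<Object>> keys) {
        List<Long> results = new ArrayList<>(keys.size());
        for (List<Object> key : keys) {
            results.add(store.get(key)); // HashMap.get returns null if absent
        }
        return results;
    }

    // Stores values.get(i) under keys.get(i).
    void multiPut(List<List<Object>> keys, List<Long> values) {
        for (int i = 0; i < keys.size(); i++) {
            store.put(keys.get(i), values.get(i));
        }
    }
}

class GroupByKeyDemo {
    public static void main(String[] args) {
        InMemoryBackingMap db = new InMemoryBackingMap();

        // Grouping on ("name", "id") gives two-element keys (assumed values).
        List<Object> aliceKey = Arrays.<Object>asList("alice", 1);
        List<Object> bobKey = Arrays.<Object>asList("bob", 2);

        // Only alice has a stored value.
        db.multiPut(Arrays.asList(aliceKey), Arrays.asList(10L));

        // bob comes first in the request, so the null comes first in the result.
        List<Long> results = db.multiGet(Arrays.asList(bobKey, aliceKey));
        System.out.println(results); // prints [null, 10]
    }
}
```

Because each result is added while walking the keys in order, the null lands at the position of the missing key rather than at the end of the list.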




On Wed, Feb 5, 2014 at 1:13 PM, Adrian Mocanu <amocanu@verticalscope.com> wrote:

>  Thanks
>
> Looking forward to a reply!
>
>
>
> *From:* P. Taylor Goetz [mailto:ptgoetz@gmail.com]
> *Sent:* February-05-14 12:39 PM
> *To:* user@storm.incubator.apache.org
> *Subject:* Re: Svend's blog - several questions
>
>
>
> Hi Adrian,
>
>
>
> I'll apologize up-front for not answering your questions now, but I'll try
> to follow up later when I have a little more bandwidth.
>
>
>
> In the meantime, check out the storm documentation on the new Storm
> website: http://storm.incubator.apache.org, which includes the latest
> javadoc for the 0.9.x development line.
>
>
>
> Specifically, look for the documentation for trident, which should answer
> Q7/Q8.
>
>
>
> Again, I'll try to address your other questions when I have more time, if
> someone else doesn't address them first.
>
>
>
> - Taylor
>
>
>
> On Feb 5, 2014, at 12:22 PM, Adrian Mocanu <amocanu@verticalscope.com>
> wrote:
>
>
>
>   I've read Svend's blog [
> http://svendvanderveken.wordpress.com/2013/07/30/scalable-real-time-state-update-with-storm/]
> multiple times and I have a few questions.
>
>
>
>
>
> "Because we did a groupBy on one tuple field, each List contains here one
> single String: the correlationId. Note that the list we return must have
> exactly the same size as the list of keys, so that Storm knows what period
> corresponds to what key. So for any key that does not exist in DB, we
> simply put a null in the resulting list."
>
>
>
> Q1: Do the db keys come only from groupBy?
>
> Q2: Can you do groupBy on multiple keys, like .groupBy("name").groupBy("id")?
>
> Q3: When we add null we keep the size of the results list the same as the
> keys list, but I don't understand how we make sure that key(3) points to
> the correct result(3).
>
> After all, we're adding nulls at the end of the result list, not
> intermittently. I.e., if key(1) does not have an entry in the db and the
> key size is 5, we add null to the last position in results, not to
> results(1). This doesn't preserve consistency/order, so key(1) now gives
> result(1), which is not null as it should be. Is the code incorrect, or is
> the explanation on Svend's blog incorrect?
>
>
>
>
>
> Moving on,
>
> "Once this is loaded Storm will present the tuples having the same
> correlation ID one by one to our reducer, the PeriodBuilder"
>
>
>
> Q4: Does Trident/Storm call the reducer after calling multiGet and before
> calling multiPut?
>
> Q5: What params (and their types) are passed to the reducer and what
> parameters should it emit so they can go into multiGet?
>
>
>
> Q6: The first time the program is run, the database is empty and multiGet
> will return nothing. Does the reducer need to make sure it inserts a value
> the first time, as opposed to updating one? I do see that the reducer
> (TimelineUpdater) checks for nulls, and I'm guessing this is the reason
> why it does so.
>
>
>
>
>
> Q7:
>
> Can someone explain what these mean:
>
> .each  (I've seen this used even consecutively: .each(..).each(..) )
>
> .newStream
>
> .newValuesStream
>
> .persistAggregate
>
>
>
> I am unable to find javadocs with documentation for the method signatures.
>
> These java docs don't help much:
> http://nathanmarz.github.io/storm/doc/storm/trident/Stream.html
>
>
>
>
>
> Q8:
>
> Storm has ack/fail; does Trident handle that automatically?
>
>
>
>
>
> Q9: Has anyone tried Spark? http://spark.incubator.apache.org/streaming/
>
> I'm wondering if anyone has tried it because I'm thinking of ditching
> storm and moving to that.
>
> It seems much much much better documented.
>
>
>
>
>
> Lots of questions I know. Thanks for reading!
>
>
>
> -Adrian
>
>
>
