incubator-giraph-user mailing list archives

From Gianmarco De Francisci Morales <g...@apache.org>
Subject Re: Use Giraph to simulate Storm ?
Date Wed, 04 Jan 2012 11:31:47 GMT
For these use cases there is S4, another incubating Apache project.
I think that the superstep synch overhead would be a performance killer in
many cases.

Cheers,
--
Gianmarco



On Wed, Jan 4, 2012 at 03:18, Avery Ching <aching@apache.org> wrote:

> Definitely keep us up to date with your progress.  Don't hesitate to file
> and/or fix JIRAs =) (https://issues.apache.org/jira/browse/GIRAPH).
>
> Avery
>
>
> On 1/3/12 6:13 PM, prasenjit mukherjee wrote:
>
>> I will be using giraph/hadoop for other use cases anyways, and I don't
>> want to install/maintain Storm just for the real-time streaming use
>> case.
>>
>> I am also thinking of adding real-time logs to HBase and having Giraph
>> pick up the incremental feeds from HBase based on timestamp.
>>
>>
>> On 1/4/12, Avery Ching <aching@apache.org> wrote:
>>
>>> Interesting idea.  You could actually implement the code to load the new
>>> input data in preSuperstep().  If the input data is resilient (e.g.
>>> stored on HDFS), then the system would inherit Giraph's reliability
>>> guarantees.  Implementing an external trigger to stop the application
>>> wouldn't be too difficult (e.g. dump a file stamp or something and
>>> check for it every n supersteps).  Still, as I'm not that familiar with
>>> Storm, what would be the advantages of this over Storm?
>>>
>>> Avery
>>>
>>> On 1/3/12 5:30 PM, prasenjit mukherjee wrote:
>>>
>>>> As Jake mentioned, you can have continuous processing by making the
>>>> mappers in Giraph stop based on an external condition (i.e. only
>>>> when specifically asked to do so), and one can call voteToHalt() only
>>>> if that condition is satisfied.
>>>>
>>>> Additionally, the VertexInputSource can be modified to read from a
>>>> continuous input (like ActiveMQ or even a port), potentially outside
>>>> of HDFS.
>>>>
>>>>
>>>> On 1/3/12, Sebastian Schelter <ssc@apache.org> wrote:
>>>>
>>>>> Hi Prasen,
>>>>>
>>>>> Storm is supposed to process a continuous stream of data while Giraph
>>>>> is a parallel batch processing platform. I think these are inherently
>>>>> different systems and one cannot easily be transformed into the other.
>>>>>
>>>>> -sebastian
>>>>>
>>>>> On 03.01.2012 17:51, prasenjit mukherjee wrote:
>>>>>
>>>>>> I have a use case which maps perfectly onto the open source
>>>>>> implementation of Storm (by the Twitter team). I think Giraph can be
>>>>>> easily modified to have an implementation simulating Storm's use
>>>>>> cases. Just curious if anybody has had similar thoughts.
>>>>>>
>>>>>> -Prasen
>>>>>>
>>>>>
>>>
>
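
The external stop trigger suggested in the thread (drop a marker file, then check for it every n supersteps) could look roughly like the sketch below. It is plain Java and deliberately does not use any Giraph API: the class name, the method names, the marker file name "STOP" and the simulated superstep loop are all hypothetical, and in a real job the marker would more likely live on HDFS than on a local path.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch: none of these names come from Giraph.
public class StopMarkerSketch {

    /** True only on every checkEvery-th superstep, and only if the marker file exists. */
    public static boolean shouldStop(long superstep, int checkEvery, Path marker) {
        return superstep % checkEvery == 0 && Files.exists(marker);
    }

    /**
     * Simulates a superstep loop and returns the superstep at which it stopped.
     * The marker is created at dropMarkerAt to stand in for an operator
     * touching the file from outside the job.
     */
    public static long runUntilStopped(int checkEvery, Path marker,
                                       long dropMarkerAt, long maxSupersteps)
            throws IOException {
        for (long s = 0; s < maxSupersteps; s++) {
            if (s == dropMarkerAt) {
                Files.createFile(marker);
            }
            if (shouldStop(s, checkEvery, marker)) {
                return s;
            }
            // A real job would load the incremental input and compute here.
        }
        return maxSupersteps;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("stop-marker");
        Path marker = dir.resolve("STOP");
        long stoppedAt = runUntilStopped(5, marker, 7, 100);
        System.out.println("stopped at superstep " + stoppedAt);
        Files.deleteIfExists(marker);
        Files.deleteIfExists(dir);
    }
}
```

Checking only every n supersteps keeps file system lookups cheap, at the cost of the job running up to n - 1 extra supersteps after the marker appears.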
