incubator-giraph-user mailing list archives

From Avery Ching <ach...@apache.org>
Subject Re: stand alone implementation
Date Thu, 29 Dec 2011 07:26:23 GMT
Gavan,

My comments are inlined.

Avery

On 12/28/11 11:48 PM, Gavan Hood wrote:
>
> Thanks Avery,
>
> I am asking questions up front ahead of jumping into the code.
>
> I am looking at scalability from embedded devices up to the cloud.
>
> The map slot approach hints that performance would be good on 
> multi-core machines compared to alternative graph approaches. Is that 
> a reasonable assumption?
>
This type of approach will utilize multiple cores, but there is 
probably some overhead from the Task Tracker and Job Tracker that could 
be avoided with some optimizations.
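
As a rough illustration of matching workers to map slots on one node, 
here is a minimal sketch. The helper name is hypothetical and is not part 
of Giraph or Hadoop. It assumes that one map task is set aside for the 
Giraph master, so the remaining slots go to workers; check that against 
your Giraph version. The property named in the code is the per-TaskTracker 
map slot limit in Hadoop 0.20/1.x.

```python
# Hypothetical sizing helper; not part of Giraph or Hadoop.

# Hadoop 0.20/1.x property that caps concurrent map tasks per TaskTracker.
MAP_SLOTS_PROPERTY = "mapred.tasktracker.map.tasks.maximum"


def workers_for_single_node(map_slots: int, reserved_for_master: int = 1) -> int:
    """Return how many Giraph workers fit into the given map slots.

    Assumption: `reserved_for_master` map tasks are held back for the
    master, and every remaining slot can host one worker.
    """
    if map_slots <= reserved_for_master:
        raise ValueError("need more map slots than tasks reserved for the master")
    return map_slots - reserved_for_master


if __name__ == "__main__":
    slots = 4  # e.g. a quad-core machine configured with four map slots
    workers = workers_for_single_node(slots)
    print(f"{MAP_SLOTS_PROPERTY}={slots}: request {workers} workers")
```

With four map slots this suggests requesting three workers. Adding slots 
beyond the number of cores is unlikely to help.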

> Do you have any idea of the performance trade off on a single core 
> machine / laptop?
>
A single machine avoids the network I/O.  This is a good thing.  But 
it's limited to the speed/memory of the single machine rather than 
utilizing lots of machines.

> Is the single machine support just for debugging, or could you build 
> an application upon it?
>
You could do this, but remember that we have not optimized for this 
case.  That being said, there is no reason we can't tweak a couple of 
things to improve this.
>
> Could you consider the above question for embedded systems (Android 
> devices, iPhone, etc.)?
>
> Is it PC-and-up technology, or can it be configured for reasonable 
> support on these devices?
>
> I realise this applies to Hadoop as much as Giraph.
>
Yes, a lot of what I said would apply to Hadoop as well.
>
> Perhaps the answer is in your response of not requiring Hadoop to run, 
> does this mean there is an alternative or generic persistence model?
>
Giraph is a graph processing framework, not a persistent storage 
system.  You can store your data any way you like (e.g. hard drive, flash 
drive, etc.)
>
> If the embedded implementation is a problem, what is required to 
> generate a back end for this size of device? Has there been any 
> thought on this side?
>
I haven't thought much about using Giraph on embedded devices.  I 
certainly wouldn't want to run graph processing applications on my 
phone.  Think about what that would do to my battery life =).

> Regards
>
> Gavan
>
> *From:*Avery Ching [mailto:aching@apache.org]
> *Sent:* Thursday, 29 December 2011 1:07 AM
> *To:* giraph-user@incubator.apache.org
> *Subject:* Re: stand alone implementation
>
> Hi Gavan,
>
> Giraph can run on a single machine as well as multiple machines, just 
> like Hadoop.  For example, our test suite can be run with or without a 
> running Hadoop instance.
>
> If you want to take advantage of multiple cores though, you might want 
> to try running Hadoop with multiple map slots on the single node and 
> then using the appropriate number of workers.
>
> Hope that helps,
>
> Avery
>
> On 12/28/11 2:41 PM, Gavan Hood wrote:
>
> Hi all,
>
> I know the focus of Giraph is multiple machines, etc.
>
> What if I want to scale down to a single PC with multiple CPUs, and 
> even down to embedded systems?
>
> Are this project and Hadoop able to scale down as well as up?
>
> Regards
>
> Gavan
>

