incubator-giraph-dev mailing list archives

From "Brian Femiano (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (GIRAPH-153) HBase/Accumulo Input and Output formats
Date Wed, 07 Mar 2012 22:02:58 GMT

    [ https://issues.apache.org/jira/browse/GIRAPH-153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13224762#comment-13224762 ]

Brian Femiano commented on GIRAPH-153:
--------------------------------------

More on the algorithms: the timings below were taken using the LineRecordReader over flat text files in HDFS.
Reading from HBase adds about 20 minutes of setup overhead.
My graph nodes take the form of 

0000001 -1 -1 .... 000006, 0000007 
where 0000001 is the node guid, and 000006 and 0000007 are children linked to that guid. A child
can belong to more than one guid, hence the formation of a graph instead of a simple tree.
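
For anyone curious how a line like that might be pulled apart, here is a minimal sketch. It assumes whitespace-separated fields where -1 marks unused slots and the remaining fields (possibly comma-terminated) are child guids; note the "...." in the example above is an elision in this email, not literal input. The class and method names are hypothetical, not from the attached formats:

import java.util.ArrayList;
import java.util.List;

/** Hypothetical parser for one adjacency line: "<guid> -1 -1 ... <child>, <child>". */
public final class NodeLineParser {

    public static final class Node {
        public final String guid;            // kept as String to preserve leading zeros
        public final List<String> childGuids;
        Node(String guid, List<String> childGuids) {
            this.guid = guid;
            this.childGuids = childGuids;
        }
    }

    public static Node parse(String line) {
        String[] fields = line.trim().split("\\s+");
        String guid = fields[0];
        List<String> children = new ArrayList<String>();
        for (int i = 1; i < fields.length; i++) {
            String f = fields[i].replace(",", "");
            if (f.matches("\\d+")) {         // keep only guid-like fields; skips -1 placeholders
                children.add(f);
            }
        }
        return new Node(guid, children);
    }
}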

I have tested these against 10- and 19-node clusters of m1.4xlarge instances on EC2: 70GB of RAM and
8 virtual cores per machine, with GigE backplane bandwidth.

Datasets included 98mil nodes, with 1050 possible hops from a given root to any leaf.

The 98mil dataset runs in about 1 hour using 143 workers (144 map tasks, minus 1 for the master). Adding
more workers with more concurrent map slots does not lead to better performance. In fact, running
with over 150 workers causes FS open-descriptor limit issues, even after increasing the
ulimit for all users on each machine and bouncing the cluster. The algorithm and dataset exhibit
the best performance on as few workers as necessary to instantiate all 98mil nodes.
10 seems to be the sweet spot. There's just enough heap per JVM for all 79 workers to load
the graph in a reasonable amount of time, without incurring the stack overhead of having
too many workers to open a communication channel to. I understand there is 'num_flush_thread'
and other configuration parameters designed to control this, but I haven't experimented with
them yet. This is on Fedora 8 AMI images with 7GB of heap per worker JVM.
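
For reference, this is roughly how the worker count and heap would be wired up, assuming the GiraphJob API from the 0.1-incubating era (setWorkerConfiguration and mapred.child.java.opts are the usual knobs); the job name and the omitted vertex/format classes are illustrative:

import org.apache.giraph.graph.GiraphJob;
import org.apache.hadoop.conf.Configuration;

public class LaunchExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // 7GB heap per worker JVM, as described above.
        conf.set("mapred.child.java.opts", "-Xmx7g");

        GiraphJob job = new GiraphJob(conf, "connected-roots-98mil");
        // 143 workers (the 144th map task is the master); require all to check in.
        job.setWorkerConfiguration(143, 143, 100.0f);
        // Vertex class and vertex input/output format classes would be set here.
        job.run(true);
    }
}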

Running even a small sample (1mil nodes) over 160+ workers leads to similar results: "Unable
to create new native thread" or "Too many open files".
Throttling back the number of workers makes the symptoms go away. I've been following GIRAPH-45,
which I believe is related.

I'm happy to expand on these issues for anyone interested. 

I also worked on a variant algorithm that passes each node in the graph a copy of every spanning
tree from each root that the node can be traced back to. It does not adhere to strict connected-component
rules; it's mentioned here just for reference. Naturally this led to a much higher volume of message
traffic and "GC overhead limit exceeded" issues. The 98mil graph would produce around 11 billion
messages in the first 5 supersteps. Concatenating the spanning trees together into one message
to limit the overall number of messages led to OOM issues. I don't have a requirement to
perform a BSP algorithm this nasty, but it was an interesting stress test.
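
A rough sketch of why the message volume explodes in that variant: each vertex forwards every root-to-here path it receives to every child, so paths multiply each superstep. This is illustrative only, written against the 0.1-era Vertex API (signatures varied across early releases), and the "ROOT" vertex-value convention is a hypothetical stand-in, not the actual code:

import java.io.IOException;
import java.util.Iterator;
import org.apache.giraph.graph.Vertex;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;

// Illustrative sketch of the spanning-tree fan-out, not the actual variant code.
public class SpanningTreeFanoutVertex
        extends Vertex<LongWritable, Text, Text, Text> {

    @Override
    public void compute(Iterator<Text> msgIterator) throws IOException {
        if (getSuperstep() == 0
                && getVertexValue().toString().equals("ROOT")) { // hypothetical root flag
            // Each root starts one path containing just itself.
            sendMsgToAllEdges(new Text(getVertexId().toString()));
        }
        while (msgIterator.hasNext()) {
            // Append this vertex to every incoming root-to-here path and forward it.
            // Every path fans out to every child, so messages multiply each superstep.
            String path = msgIterator.next().toString();
            sendMsgToAllEdges(new Text(path + "," + getVertexId().toString()));
        }
        voteToHalt();
    }
}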

                
> HBase/Accumulo Input and Output formats
> ---------------------------------------
>
>                 Key: GIRAPH-153
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-153
>             Project: Giraph
>          Issue Type: New Feature
>          Components: bsp
>    Affects Versions: 0.1.0
>         Environment: Single host OSX 10.6.8 2.2Ghz Intel i7, 8GB
>            Reporter: Brian Femiano
>         Attachments: AccumuloRootMarker.java, AccumuloRootMarkerInputFormat.java, AccumuloRootMarkerOutputFormat.java,
> AccumuloVertexInputFormat.java, AccumuloVertexOutputFormat.java, ComputeIsRoot.java, DistributedCacheHelper.java,
> HBaseVertexInputFormat.java, HBaseVertexOutputFormat.java, IdentifyAndMarkRoots.java, SetLongWritable.java,
> SetTextWritable.java, TableRootMarker.java, TableRootMarkerInputFormat.java, TableRootMarkerOutputFormat.java
>
>
> Four abstract classes that wrap their respective delegate input/output formats for
> easy hooks into vertex input format subclasses. I've included some sample programs that
> show two very simple graph algorithms. I have a graph generator that builds out a very
> simple directed structure, starting with a few 'root' nodes.
> Root nodes are defined as nodes which are not listed as a child anywhere in the graph.

> Algorithm 1) AccumuloRootMarker.java --> Accumulo as read/write source. Every vertex
> starts thinking it's a root. At superstep 0, send a message down to each
> child as a non-root notification. After superstep 1, only root nodes will have never
> been messaged.
> Algorithm 2) TableRootMarker --> HBase as read/write source. Expands on A1 by bundling
> the notification logic followed by root node propagation. Once we've marked the appropriate
> nodes as roots, tell every child which roots it can be traced back to via one or more spanning
> trees. This will take N + 2 supersteps, where N is the maximum number of hops from any root
> to any leaf, plus 2 supersteps for the initial root flagging.
> I've included all relevant code plus DistributedCacheHelper.java for recursive cache
> file and archive searches. It is more Hadoop-centric than Giraph, but these jobs use it, so
> I figured why not commit it here.
> These have been tested through the local JobRunner, pseudo-distributed on the aforementioned
> hardware, and fully distributed on EC2. More details in the comments.
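
For readers following along, the root-marking idea in Algorithm 1 can be sketched roughly as below. This is a hedged illustration against the 0.1-era Vertex API with illustrative type choices (a boolean vertex value as the root flag), not the attached AccumuloRootMarker code:

import java.io.IOException;
import java.util.Iterator;
import org.apache.giraph.graph.Vertex;
import org.apache.hadoop.io.BooleanWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;

// Illustrative sketch of the Algorithm 1 idea, not the attached code.
public class RootMarkerSketchVertex
        extends Vertex<LongWritable, BooleanWritable, Text, Text> {

    private static final Text NOT_ROOT = new Text("not-root");

    @Override
    public void compute(Iterator<Text> msgIterator) throws IOException {
        if (getSuperstep() == 0) {
            // Every vertex starts thinking it's a root and tells its children otherwise.
            setVertexValue(new BooleanWritable(true));
            sendMsgToAllEdges(NOT_ROOT);
        } else if (msgIterator.hasNext()) {
            // Superstep 1: any vertex that received a message is somebody's child.
            setVertexValue(new BooleanWritable(false));
        }
        voteToHalt();
    }
}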

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
