giraph-user mailing list archives

From Suijian Zhou <suijian.z...@gmail.com>
Subject Re: To process a BIG input graph in giraph.
Date Wed, 05 Mar 2014 16:31:20 GMT
Hi, Experts,
  Could anybody remind me how to load multiple input files in a Giraph
command line? Neither of the following works; each loads only the first
input file:
-vip /user/hadoop/input/ttt.txt   /user/hadoop/input/ttt2.txt
or
-vip /user/hadoop/input/ttt.txt  -vip /user/hadoop/input/ttt2.txt

  Best Regards,
  Suijian




2014-03-01 16:12 GMT-06:00 Suijian Zhou <suijian.zhou@gmail.com>:

> Hi,
>   I'm trying to process a very big input file (~70GB) with Giraph.
> I'm running the Giraph program on a 40-node Linux cluster, but the program
> gets stuck after reading in a small fraction of the input file.
> Although each node has 16GB of memory, it looks like only one node reads the
> input file, which is on HDFS, into its memory. As the input file is so big,
> is there a way to scatter it across all the nodes so that each node
> reads in a fraction of the file and then starts processing the graph? Would
> it help to split the single big input file into many smaller files and
> let each node read in one of them (keeping the overall structure of the
> graph, of course)? Thanks!
>
>   Best Regards,
>   Suijian
>
>
