giraph-user mailing list archives

From Han JU <ju.han.fe...@gmail.com>
Subject Questions on input/output format
Date Wed, 15 May 2013 10:27:50 GMT
Hi,

Some questions:

  - My input file is a text file with edges: node1 node2 edgeValue. I
figured out that I should use TextEdgeInputFormat and
TextVertexValueInputFormat. But how do these two fit together?
Should I prepare another file that contains only the node information for
VertexValueInputFormat?

  - If the input file is a sequence file, how should I implement a
SequenceEdgeInputFormat or SequenceVertexInputFormat? Or do they already
exist?

  - For the output, after the calculation terminates, every vertex needs
to output many lines. This could be big (for one dataset the output size
is 400GB). I found only TextVertexOutputFormat, but it seems to output a
single line per vertex. How can I write multiple lines per vertex?
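
To make the edge-file layout from the first question concrete, here is a
minimal sketch of parsing one "node1 node2 edgeValue" line. It assumes
whitespace-separated fields, long node ids and a double edge value; the
message does not state these types. EdgeLineSketch, Edge and parse are
hypothetical names for illustration only, and the code does not use any
Giraph class.

```java
import java.util.regex.Pattern;

public class EdgeLineSketch {
    // Hypothetical holder for one parsed edge; not a Giraph class.
    static final class Edge {
        final long source;
        final long target;
        final double value;

        Edge(long source, long target, double value) {
            this.source = source;
            this.target = target;
            this.value = value;
        }
    }

    // Assumption: fields are separated by one or more whitespace characters.
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    // Parses a single "node1 node2 edgeValue" line into an Edge.
    static Edge parse(String line) {
        String[] fields = WHITESPACE.split(line.trim());
        if (fields.length != 3) {
            throw new IllegalArgumentException("expected 3 fields: " + line);
        }
        return new Edge(
                Long.parseLong(fields[0]),      // assumed long source id
                Long.parseLong(fields[1]),      // assumed long target id
                Double.parseDouble(fields[2])); // assumed double edge value
    }

    public static void main(String[] args) {
        Edge e = parse("1 2 0.5");
        System.out.println(e.source + " -> " + e.target + " (" + e.value + ")");
    }
}
```

An edge input format would apply the same per-line split inside its record
reader; the id and value types above would need to match the ones the
computation actually uses.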

Thanks a lot!

-- 
*JU Han*

Software Engineer Intern @ KXEN Inc.
UTC   -  Université de Technologie de Compiègne
*GI06 - Fouille de Données et Décisionnel*

+33 0619608888
