giraph-user mailing list archives

From Ameet Kini
Subject sorted output
Date Fri, 22 Mar 2013 00:58:51 GMT
Is it possible to save the final output sorted by vertex id? My
vertex ids are of type long, and I am using
SequenceFileOutputFormat, where the key of the sequence file is the
vertex id. If the vertices were written in sorted
order, I could even switch to Hadoop's MapFileOutputFormat,
which expects sorted keys. I understand that with multiple
workers there won't be a total order on the keys, and that's fine, as
long as each worker writes its own output sorted by vertex id.

I was looking at the code, and it looks like the call to writeVertex is
made in BspServiceWorker.saveVertices. There seems to be no way to
control the order in which vertices are written, but I may be missing
something. Any pointers or examples would help.
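
To make the request concrete, here is a minimal sketch of the behavior
being asked for: each worker buffers its vertices and emits them in
ascending vertex id order. It is plain Java and does not use any Giraph
or Hadoop API. The names SortedVertexBuffer and Sink are hypothetical,
and whether Giraph offers a hook where such a buffer could sit is
exactly the open question above.

```java
import java.util.Map;
import java.util.TreeMap;

/**
 * Hypothetical per-worker buffer (not part of Giraph or Hadoop).
 * Collects vertices in arbitrary order and emits them sorted by id.
 */
class SortedVertexBuffer {

    /** Hypothetical stand-in for whatever finally writes a record. */
    interface Sink {
        void write(long vertexId, String value);
    }

    // TreeMap iterates its entries in ascending key order.
    private final TreeMap<Long, String> buffer = new TreeMap<>();

    /** Buffer a vertex instead of writing it immediately. */
    void add(long vertexId, String value) {
        buffer.put(vertexId, value);
    }

    /** Number of vertices currently buffered. */
    int size() {
        return buffer.size();
    }

    /** Emit all buffered vertices in ascending id order, then clear. */
    void flush(Sink sink) {
        for (Map.Entry<Long, String> entry : buffer.entrySet()) {
            sink.write(entry.getKey(), entry.getValue());
        }
        buffer.clear();
    }
}
```

A worker would call add() for each vertex and flush() once at the end,
so the sink sees keys in the order a sorted-key format expects. The
trade-off is that every vertex value on the worker is held in memory
until the flush.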

