giraph-dev mailing list archives

From "Claudio Martella (JIRA)" <j...@apache.org>
Subject [jira] [Created] (GIRAPH-625) DiskBackedMessageStore can merge fileStores in the background.
Date Fri, 12 Apr 2013 14:42:16 GMT
Claudio Martella created GIRAPH-625:
---------------------------------------

             Summary: DiskBackedMessageStore can merge fileStores in the background.
                 Key: GIRAPH-625
                 URL: https://issues.apache.org/jira/browse/GIRAPH-625
             Project: Giraph
          Issue Type: Improvement
            Reporter: Claudio Martella


If the total number of messages is large compared to the number of messages the
DiskBackedMessageStore keeps in memory, the store can end up with a large number of files.
Reading the messages for each vertex then requires linearly scanning multiple files at the
same time, which causes a lot of seeks by the disk head.

While the vertices are being computed and the messages for the next superstep flow in, we
can keep the number of fileStores low by merging them in a background thread. The procedure
is similar to compaction in NoSQL stores and to the merge step of the shuffle & sort in M/R.
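
As a rough illustration of the idea, here is a minimal sketch of a k-way merge of sorted runs
executed on a background thread. It is not Giraph code: the class BackgroundRunMerger, the
Entry type, and the in-memory lists standing in for on-disk fileStores are all hypothetical,
and it assumes each run is already sorted by vertex id.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch; not the DiskBackedMessageStore API. */
public class BackgroundRunMerger {

  /** One (vertexId, message) pair. A "run" is a list of these sorted by vertexId. */
  public static final class Entry {
    public final long vertexId;
    public final String message;

    public Entry(long vertexId, String message) {
      this.vertexId = vertexId;
      this.message = message;
    }
  }

  /** Read position inside one sorted run (stands in for a sequential file reader). */
  private static final class Cursor {
    final List<Entry> run;
    int pos = 0;

    Cursor(List<Entry> run) {
      this.run = run;
    }

    Entry head() {
      return run.get(pos);
    }

    /** Moves to the next entry; returns false once the run is exhausted. */
    boolean advance() {
      return ++pos < run.size();
    }
  }

  /** K-way merge: each input run is read once, front to back. */
  public static List<Entry> merge(List<List<Entry>> runs) {
    PriorityQueue<Cursor> heap =
        new PriorityQueue<>(Comparator.comparingLong((Cursor c) -> c.head().vertexId));
    for (List<Entry> run : runs) {
      if (!run.isEmpty()) {
        heap.add(new Cursor(run));
      }
    }
    List<Entry> merged = new ArrayList<>();
    while (!heap.isEmpty()) {
      // The cursor is removed before it advances, so the heap never sees a stale key.
      Cursor smallest = heap.poll();
      merged.add(smallest.head());
      if (smallest.advance()) {
        heap.add(smallest);
      }
    }
    return merged;
  }

  /** Submits the merge to a background thread so the caller can keep computing. */
  public static Future<List<Entry>> mergeInBackground(
      ExecutorService pool, List<List<Entry>> runs) {
    Callable<List<Entry>> task = () -> merge(runs);
    return pool.submit(task);
  }

  public static void main(String[] args) throws Exception {
    List<List<Entry>> runs = new ArrayList<>();
    runs.add(Arrays.asList(new Entry(1, "a"), new Entry(4, "b"), new Entry(7, "c")));
    runs.add(Arrays.asList(new Entry(2, "d"), new Entry(4, "e")));
    runs.add(Arrays.asList(new Entry(3, "f")));

    ExecutorService pool = Executors.newSingleThreadExecutor();
    try {
      Future<List<Entry>> pending = mergeInBackground(pool, runs);
      // Vertex computation would continue here while the merge runs.
      for (Entry e : pending.get()) {
        System.out.println(e.vertexId + " " + e.message);
      }
    } finally {
      pool.shutdown();
    }
  }
}
```

In this sketch three runs collapse into one, so a later per-vertex read would touch a single
sorted run instead of three. A real implementation would stream from and to files rather than
hold everything in memory, and would have to coordinate with writers still adding new runs.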

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
