giraph-dev mailing list archives

From "Gustavo Salazar Torres (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (GIRAPH-462) Multithreading breaks out-of-core graph
Date Tue, 22 Jan 2013 13:38:13 GMT

    [ https://issues.apache.org/jira/browse/GIRAPH-462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559625#comment-13559625 ]

Gustavo Salazar Torres commented on GIRAPH-462:
-----------------------------------------------

What if a publish/subscribe model were used instead of this pull model? Instead of calling the
getPartition() method directly, workers would send subscribe events to another object, let's call
it PartitionCoordinator, and would then receive a publish event from PartitionCoordinator when a
partition becomes available.
Workers would have to block until they receive the publish event.
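
To make the idea concrete, here is a minimal sketch of such a coordinator. Everything in it is an assumption for illustration: PartitionCoordinator, subscribe() and publish() are hypothetical names that do not exist in Giraph, and a plain String stands in for a partition. A worker subscribes to a partition id and blocks on the returned future until the coordinator publishes that partition.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical coordinator for illustration only; not part of Giraph's API.
class PartitionCoordinator<P> {
  // One future per partition id, shared by every subscriber of that id.
  private final Map<Integer, CompletableFuture<P>> pending =
      new ConcurrentHashMap<>();

  /** Subscribe: register interest in a partition and get a handle to wait on. */
  CompletableFuture<P> subscribe(int partitionId) {
    return pending.computeIfAbsent(partitionId, id -> new CompletableFuture<>());
  }

  /** Publish: announce that the partition is available, waking all subscribers. */
  void publish(int partitionId, P partition) {
    subscribe(partitionId).complete(partition);
  }
}

class PartitionCoordinatorDemo {
  public static void main(String[] args) throws InterruptedException {
    PartitionCoordinator<String> coordinator = new PartitionCoordinator<>();
    Thread worker = new Thread(() -> {
      // The worker blocks here until the publish event arrives.
      String partition = coordinator.subscribe(7).join();
      System.out.println("worker received " + partition);
    });
    worker.start();
    coordinator.publish(7, "partition-7");
    worker.join();
  }
}
```

The sketch leaves out eviction: once published, a partition stays available to later subscribers, so a real out-of-core store would still need a way to retire entries when a partition is offloaded.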
                
> Multithreading breaks out-of-core graph
> ---------------------------------------
>
>                 Key: GIRAPH-462
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-462
>             Project: Giraph
>          Issue Type: Bug
>            Reporter: Alessandro Presta
>            Priority: Critical
>
> [~cmartella] pointed out this issue: when using multithreaded computation in conjunction
> with out-of-core graph, we run into a race condition. The compute threads share the same
> DiskBackedPartitionStore, whose getPartition() method is not meant to be thread-safe. When two
> threads request two out-of-core partitions concurrently, they both try to load them into the
> same slot.
> The result is that we can lose the reference to one of the two partitions (which will
> not be written back to disk) and we can hit a NullPointerException when both threads
> are trying to offload the currently loaded partition to disk.
> I ran this test to confirm the issue:
> https://gist.github.com/4429628
> All tests pass except the one that uses both out-of-core graph and multiple compute threads.
> The error is the following:
> https://gist.github.com/4429650
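
The race described in the quoted report can be illustrated with a toy single-slot store. This is a sketch under stated assumptions, not the DiskBackedPartitionStore code: SingleSlotStore and its members are hypothetical, a HashMap stands in for the disk, and a String stands in for a partition. It shows the offload-then-load sequence that is unsafe when interleaved, and the simplest way to serialize it.

```java
import java.util.HashMap;
import java.util.Map;

// Toy single-slot store for illustration; not Giraph's DiskBackedPartitionStore.
class SingleSlotStore {
  private final Map<Integer, String> disk = new HashMap<>(); // stand-in for disk
  private Integer loadedId;        // id of the partition in the in-memory slot
  private String loadedPartition;  // contents of the in-memory slot

  SingleSlotStore(Map<Integer, String> initial) {
    disk.putAll(initial);
  }

  // Without `synchronized`, two threads can interleave the offload and load
  // steps below: one can overwrite the slot the other just filled, so a
  // partition is never written back to disk.
  synchronized String getPartition(int id) {
    if (loadedId != null && loadedId == id) {
      return loadedPartition;               // already in memory
    }
    if (!disk.containsKey(id)) {
      throw new IllegalArgumentException("unknown partition " + id);
    }
    if (loadedId != null) {
      disk.put(loadedId, loadedPartition);  // offload the current partition
    }
    loadedPartition = disk.remove(id);      // load the requested partition
    loadedId = id;
    return loadedPartition;
  }

  /** Partitions on disk plus the one in memory; stays constant if none is lost. */
  synchronized int partitionCount() {
    return disk.size() + (loadedId == null ? 0 : 1);
  }
}
```

Serializing getPartition() only closes the race shown here. A real fix would also have to handle a thread that is still using a partition when another thread evicts it.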

