giraph-dev mailing list archives

From "ramesh krishnan m (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (GIRAPH-462) Multithreading breaks out-of-core graph
Date Sat, 14 May 2016 20:20:12 GMT

    [ https://issues.apache.org/jira/browse/GIRAPH-462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15283660#comment-15283660 ]

ramesh krishnan m commented on GIRAPH-462:
------------------------------------------

Is this issue fixed? I am still getting this error in the latest release.

Exception logs:

2016-05-14 19:10:55,733 ERROR [ooc-io-0] org.apache.giraph.utils.LogStacktraceCallable: Execution of callable failed
java.lang.RuntimeException: java.io.EOFException
	at org.apache.giraph.ooc.OutOfCoreIOCallable.call(OutOfCoreIOCallable.java:76)
	at org.apache.giraph.ooc.OutOfCoreIOCallable.call(OutOfCoreIOCallable.java:30)
	at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.EOFException
	at java.io.DataInputStream.readInt(DataInputStream.java:392)
	at org.apache.hadoop.io.IntWritable.readFields(IntWritable.java:47)
	at org.apache.giraph.ooc.data.DiskBackedPartitionStore.readOutEdges(DiskBackedPartitionStore.java:286)
	at org.apache.giraph.ooc.data.DiskBackedPartitionStore.loadInMemoryPartitionData(DiskBackedPartitionStore.java:329)
	at org.apache.giraph.ooc.data.OutOfCoreDataManager.loadPartitionData(OutOfCoreDataManager.java:195)
	at org.apache.giraph.ooc.data.DiskBackedPartitionStore.loadPartitionData(DiskBackedPartitionStore.java:360)
	at org.apache.giraph.ooc.io.LoadPartitionIOCommand.execute(LoadPartitionIOCommand.java:64)
	at org.apache.giraph.ooc.OutOfCoreIOCallable.call(OutOfCoreIOCallable.java:72)
	... 6 more
2016-05-14 19:10:55,737 INFO [ooc-io-0] org.apache.giraph.ooc.OutOfCoreIOCallableFactory: afterExecute: an out-of-core thread terminated unexpectedly with java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.io.EOFException
2016-05-14 19:10:55,739 INFO [checkpoint-vertices-7] org.apache.giraph.ooc.FixedOutOfCoreEngine: getNextPartition: waiting until a partition becomes available!
2016-05-14 19:10:56,426 ERROR [checkpoint-vertices-6] org.apache.giraph.utils.LogStacktraceCallable: Execution of callable failed
java.lang.RuntimeException: Job Failed due to a failure in an out-of-core IO thread
	at org.apache.giraph.ooc.FixedOutOfCoreEngine.getNextPartition(FixedOutOfCoreEngine.java:81)
	at org.apache.giraph.ooc.data.DiskBackedPartitionStore.getNextPartition(DiskBackedPartitionStore.java:187)
	at org.apache.giraph.worker.BspServiceWorker$3$1.call(BspServiceWorker.java:1398)
	at org.apache.giraph.worker.BspServiceWorker$3$1.call(BspServiceWorker.java:1392)
	at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

> Multithreading breaks out-of-core graph
> ---------------------------------------
>
>                 Key: GIRAPH-462
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-462
>             Project: Giraph
>          Issue Type: Bug
>            Reporter: Alessandro Presta
>            Priority: Critical
>         Attachments: GIRAPH-461.patch
>
>
> [~cmartella] pointed out this issue: when using multithreaded computation in conjunction with out-of-core graph, we run into a race condition. The compute threads share the same DiskBackedPartitionStore, whose getPartition() method is not meant to be thread-safe. When two threads request two out-of-core partitions concurrently, they both try to load them into the same slot.
> The result is that we can lose the reference to one of the two partitions (which will not be written back to disk), and we can hit a NullPointerException when both threads try to offload the currently loaded partition to disk.
> I ran this test to confirm the issue:
> https://gist.github.com/4429628
> All tests pass except the one that uses both out-of-core graph and multiple compute threads.
> The error is the following:
> https://gist.github.com/4429650
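
For readers unfamiliar with the race described in the quoted issue, here is a minimal sketch of the general idea, not Giraph's actual code. SingleSlotStore, its in-memory "disk" map, and the stress() helper are all hypothetical names made up for illustration. The sketch assumes a store with one in-memory slot and makes the offload-then-load swap atomic with synchronized, so two threads cannot both write into the slot at once and drop a partition.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical single-slot store for illustration; NOT Giraph's DiskBackedPartitionStore. */
class SingleSlotStore {
  // Stand-in for partitions that currently live on disk.
  private final Map<Integer, String> disk = new HashMap<>();
  private Integer loadedId;   // id of the partition in the in-memory slot, or null
  private String loadedData;  // data of the partition in the slot

  SingleSlotStore(int partitions) {
    for (int i = 0; i < partitions; i++) {
      disk.put(i, "partition-" + i);
    }
  }

  /**
   * Offloads the current partition and loads the requested one as a single
   * atomic step. Without synchronized, two threads could interleave here and
   * overwrite the slot before the previous partition is written back.
   */
  synchronized String getPartition(int id) {
    if (loadedId != null && loadedId == id) {
      return loadedData;
    }
    if (loadedId != null) {
      disk.put(loadedId, loadedData);  // write the evicted partition back
    }
    loadedData = disk.remove(id);
    loadedId = id;
    return loadedData;
  }

  /** Number of partitions still reachable (on "disk" plus the loaded one). */
  synchronized int partitionCount() {
    return disk.size() + (loadedId == null ? 0 : 1);
  }

  /** Hammers one store from several threads and returns the surviving partition count. */
  static int stress(int partitions, int threads, int iterations) {
    SingleSlotStore store = new SingleSlotStore(partitions);
    AtomicInteger mismatches = new AtomicInteger();
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    for (int t = 0; t < threads; t++) {
      final int offset = t;
      pool.execute(() -> {
        for (int i = 0; i < iterations; i++) {
          int id = (i + offset) % partitions;
          if (!("partition-" + id).equals(store.getPartition(id))) {
            mismatches.incrementAndGet();
          }
        }
      });
    }
    pool.shutdown();
    try {
      if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
        throw new IllegalStateException("stress run timed out");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    if (mismatches.get() > 0) {
      throw new IllegalStateException("a thread received the wrong partition");
    }
    return store.partitionCount();
  }

  public static void main(String[] args) {
    System.out.println("partitions after stress run: " + stress(8, 4, 10_000));
  }
}
```

Note that this only protects the swap itself. A real store would also have to keep a partition from being evicted while a compute thread is still using it, which this sketch does not attempt.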



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
