giraph-dev mailing list archives

From "Eugene Koontz (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (GIRAPH-601) Exception when running pagerank benchmark: SendVertexRequest cannot be cast to MasterRequest
Date Fri, 29 Mar 2013 23:17:15 GMT

    [ https://issues.apache.org/jira/browse/GIRAPH-601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13617859#comment-13617859 ]

Eugene Koontz commented on GIRAPH-601:
--------------------------------------

Debug logging added by instrumentation.patch (see attachments) shows that we are in fact
correctly respecting the splitMasterWorker setting: the master runs in its own separate
task (taskPartition 0), and each of the other five tasks runs as a worker only, as expected:

{code}
application_1364578380737_0019/container_1364578380737_0019_01_000002/syslog
29:2013-03-29 15:50:07,620 INFO [main] org.apache.giraph.graph.GraphTaskManager: setup: Log level remains at info
30:2013-03-29 15:50:07,639 INFO [main] org.apache.giraph.graph.GraphTaskManager: Distributed cache is empty. Assuming fatjar.
31:2013-03-29 15:50:07,639 INFO [main] org.apache.giraph.graph.GraphTaskManager: setup: classpath @ /tmp/hadoop-yarn/staging/ekoontz/.staging/job_1364578380737_0019/job.jar for job org.apache.giraph.benchmark.PageRankBenchmark
32:2013-03-29 15:50:07,639 DEBUG [main] org.apache.giraph.graph.GraphTaskManager: splitMasterWorker: true
34:2013-03-29 15:50:07,640 DEBUG [main] org.apache.giraph.graph.GraphTaskManager: taskPartition: 0
35:2013-03-29 15:50:07,640 DEBUG [main] org.apache.giraph.graph.GraphTaskManager: splitMasterWorker is true.
36:2013-03-29 15:50:07,640 DEBUG [main] org.apache.giraph.graph.GraphTaskManager: splitMasterWorker is true and zkAlreadyProvided=true.
37:2013-03-29 15:50:07,640 DEBUG [main] org.apache.giraph.graph.GraphTaskManager: splitMasterWorker is true and taskPartition (0) is less than masterCount (1), so MASTER_ONLY.
38:2013-03-29 15:50:07,640 INFO [main] org.apache.giraph.graph.GraphTaskManager: setup: Starting up BspServiceMaster (master thread)...
61:2013-03-29 15:50:07,709 INFO [main] org.apache.giraph.graph.GraphTaskManager: map: No need to do anything when not a worker
62:2013-03-29 15:50:07,709 INFO [main] org.apache.giraph.graph.GraphTaskManager: cleanup: Starting for MASTER_ONLY

application_1364578380737_0019/container_1364578380737_0019_01_000003/syslog
29:2013-03-29 15:50:09,090 INFO [main] org.apache.giraph.graph.GraphTaskManager: setup: Log level remains at info
30:2013-03-29 15:50:09,110 INFO [main] org.apache.giraph.graph.GraphTaskManager: Distributed cache is empty. Assuming fatjar.
31:2013-03-29 15:50:09,110 INFO [main] org.apache.giraph.graph.GraphTaskManager: setup: classpath @ /tmp/hadoop-yarn/staging/ekoontz/.staging/job_1364578380737_0019/job.jar for job org.apache.giraph.benchmark.PageRankBenchmark
32:2013-03-29 15:50:09,110 DEBUG [main] org.apache.giraph.graph.GraphTaskManager: splitMasterWorker: true
34:2013-03-29 15:50:09,110 DEBUG [main] org.apache.giraph.graph.GraphTaskManager: taskPartition: 1
35:2013-03-29 15:50:09,110 DEBUG [main] org.apache.giraph.graph.GraphTaskManager: splitMasterWorker is true.
36:2013-03-29 15:50:09,110 DEBUG [main] org.apache.giraph.graph.GraphTaskManager: splitMasterWorker is true and zkAlreadyProvided=true.
37:2013-03-29 15:50:09,111 DEBUG [main] org.apache.giraph.graph.GraphTaskManager: splitMasterWorker is true and taskPartition (1) is NOT less than masterCount (1), so WORKER_ONLY.
38:2013-03-29 15:50:09,111 INFO [main] org.apache.giraph.graph.GraphTaskManager: setup: Starting up BspServiceWorker...
66:2013-03-29 15:50:09,323 INFO [main] org.apache.giraph.graph.GraphTaskManager: setup: Registering health of this worker...

application_1364578380737_0019/container_1364578380737_0019_01_000004/syslog
29:2013-03-29 15:50:10,222 INFO [main] org.apache.giraph.graph.GraphTaskManager: setup: Log level remains at info
30:2013-03-29 15:50:10,241 INFO [main] org.apache.giraph.graph.GraphTaskManager: Distributed cache is empty. Assuming fatjar.
31:2013-03-29 15:50:10,242 INFO [main] org.apache.giraph.graph.GraphTaskManager: setup: classpath @ /tmp/hadoop-yarn/staging/ekoontz/.staging/job_1364578380737_0019/job.jar for job org.apache.giraph.benchmark.PageRankBenchmark
32:2013-03-29 15:50:10,242 DEBUG [main] org.apache.giraph.graph.GraphTaskManager: splitMasterWorker: true
34:2013-03-29 15:50:10,242 DEBUG [main] org.apache.giraph.graph.GraphTaskManager: taskPartition: 2
35:2013-03-29 15:50:10,242 DEBUG [main] org.apache.giraph.graph.GraphTaskManager: splitMasterWorker is true.
36:2013-03-29 15:50:10,242 DEBUG [main] org.apache.giraph.graph.GraphTaskManager: splitMasterWorker is true and zkAlreadyProvided=true.
37:2013-03-29 15:50:10,242 DEBUG [main] org.apache.giraph.graph.GraphTaskManager: splitMasterWorker is true and taskPartition (2) is NOT less than masterCount (1), so WORKER_ONLY.
38:2013-03-29 15:50:10,242 INFO [main] org.apache.giraph.graph.GraphTaskManager: setup: Starting up BspServiceWorker...
66:2013-03-29 15:50:10,444 INFO [main] org.apache.giraph.graph.GraphTaskManager: setup: Registering health of this worker...

application_1364578380737_0019/container_1364578380737_0019_01_000005/syslog
29:2013-03-29 15:50:11,289 INFO [main] org.apache.giraph.graph.GraphTaskManager: setup: Log level remains at info
30:2013-03-29 15:50:11,305 INFO [main] org.apache.giraph.graph.GraphTaskManager: Distributed cache is empty. Assuming fatjar.
31:2013-03-29 15:50:11,305 INFO [main] org.apache.giraph.graph.GraphTaskManager: setup: classpath @ /tmp/hadoop-yarn/staging/ekoontz/.staging/job_1364578380737_0019/job.jar for job org.apache.giraph.benchmark.PageRankBenchmark
32:2013-03-29 15:50:11,305 DEBUG [main] org.apache.giraph.graph.GraphTaskManager: splitMasterWorker: true
34:2013-03-29 15:50:11,305 DEBUG [main] org.apache.giraph.graph.GraphTaskManager: taskPartition: 3
35:2013-03-29 15:50:11,305 DEBUG [main] org.apache.giraph.graph.GraphTaskManager: splitMasterWorker is true.
36:2013-03-29 15:50:11,305 DEBUG [main] org.apache.giraph.graph.GraphTaskManager: splitMasterWorker is true and zkAlreadyProvided=true.
37:2013-03-29 15:50:11,305 DEBUG [main] org.apache.giraph.graph.GraphTaskManager: splitMasterWorker is true and taskPartition (3) is NOT less than masterCount (1), so WORKER_ONLY.
38:2013-03-29 15:50:11,305 INFO [main] org.apache.giraph.graph.GraphTaskManager: setup: Starting up BspServiceWorker...
66:2013-03-29 15:50:11,466 INFO [main] org.apache.giraph.graph.GraphTaskManager: setup: Registering health of this worker...

application_1364578380737_0019/container_1364578380737_0019_01_000006/syslog
29:2013-03-29 15:50:11,910 INFO [main] org.apache.giraph.graph.GraphTaskManager: setup: Log level remains at info
30:2013-03-29 15:50:11,925 INFO [main] org.apache.giraph.graph.GraphTaskManager: Distributed cache is empty. Assuming fatjar.
31:2013-03-29 15:50:11,925 INFO [main] org.apache.giraph.graph.GraphTaskManager: setup: classpath @ /tmp/hadoop-yarn/staging/ekoontz/.staging/job_1364578380737_0019/job.jar for job org.apache.giraph.benchmark.PageRankBenchmark
32:2013-03-29 15:50:11,926 DEBUG [main] org.apache.giraph.graph.GraphTaskManager: splitMasterWorker: true
34:2013-03-29 15:50:11,926 DEBUG [main] org.apache.giraph.graph.GraphTaskManager: taskPartition: 4
35:2013-03-29 15:50:11,926 DEBUG [main] org.apache.giraph.graph.GraphTaskManager: splitMasterWorker is true.
36:2013-03-29 15:50:11,926 DEBUG [main] org.apache.giraph.graph.GraphTaskManager: splitMasterWorker is true and zkAlreadyProvided=true.
37:2013-03-29 15:50:11,926 DEBUG [main] org.apache.giraph.graph.GraphTaskManager: splitMasterWorker is true and taskPartition (4) is NOT less than masterCount (1), so WORKER_ONLY.
38:2013-03-29 15:50:11,926 INFO [main] org.apache.giraph.graph.GraphTaskManager: setup: Starting up BspServiceWorker...
66:2013-03-29 15:50:12,069 INFO [main] org.apache.giraph.graph.GraphTaskManager: setup: Registering health of this worker...

application_1364578380737_0019/container_1364578380737_0019_01_000007/syslog
29:2013-03-29 15:50:12,513 INFO [main] org.apache.giraph.graph.GraphTaskManager: setup: Log level remains at info
30:2013-03-29 15:50:12,524 INFO [main] org.apache.giraph.graph.GraphTaskManager: Distributed cache is empty. Assuming fatjar.
31:2013-03-29 15:50:12,525 INFO [main] org.apache.giraph.graph.GraphTaskManager: setup: classpath @ /tmp/hadoop-yarn/staging/ekoontz/.staging/job_1364578380737_0019/job.jar for job org.apache.giraph.benchmark.PageRankBenchmark
32:2013-03-29 15:50:12,525 DEBUG [main] org.apache.giraph.graph.GraphTaskManager: splitMasterWorker: true
34:2013-03-29 15:50:12,525 DEBUG [main] org.apache.giraph.graph.GraphTaskManager: taskPartition: 5
35:2013-03-29 15:50:12,525 DEBUG [main] org.apache.giraph.graph.GraphTaskManager: splitMasterWorker is true.
36:2013-03-29 15:50:12,525 DEBUG [main] org.apache.giraph.graph.GraphTaskManager: splitMasterWorker is true and zkAlreadyProvided=true.
37:2013-03-29 15:50:12,525 DEBUG [main] org.apache.giraph.graph.GraphTaskManager: splitMasterWorker is true and taskPartition (5) is NOT less than masterCount (1), so WORKER_ONLY.
38:2013-03-29 15:50:12,525 INFO [main] org.apache.giraph.graph.GraphTaskManager: setup: Starting up BspServiceWorker...
66:2013-03-29 15:50:12,638 INFO [main] org.apache.giraph.graph.GraphTaskManager: setup: Registering health of this worker...

{code}

So it's good to know that splitMasterWorker works as expected.
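
For readers skimming the logs, the role decision they record can be summarized in a few lines of Java. This is only a sketch reconstructed from the debug messages above, not the actual GraphTaskManager code: the class, enum, and method names are hypothetical, and it covers only the case the logs show (splitMasterWorker=true, zkAlreadyProvided=true).

```java
public class RoleSketch {
    /** Role names as they appear in the debug messages. */
    enum Role { MASTER_ONLY, WORKER_ONLY }

    /**
     * Hypothetical helper: the rule implied by the log lines
     * "taskPartition (N) is [NOT] less than masterCount (M)".
     * Only valid when splitMasterWorker is true, as in the logs.
     */
    static Role decideWhenSplit(int taskPartition, int masterCount) {
        return taskPartition < masterCount ? Role.MASTER_ONLY : Role.WORKER_ONLY;
    }

    public static void main(String[] args) {
        int masterCount = 1; // value reported in the logs
        // Six containers in the logs: taskPartition 0 through 5.
        for (int taskPartition = 0; taskPartition <= 5; taskPartition++) {
            System.out.println("taskPartition " + taskPartition
                + " -> " + decideWhenSplit(taskPartition, masterCount));
        }
    }
}
```

Run as written, this assigns MASTER_ONLY to taskPartition 0 and WORKER_ONLY to taskPartitions 1 through 5, matching the six containers above.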
                
> Exception when running pagerank benchmark: SendVertexRequest cannot be cast to MasterRequest
> --------------------------------------------------------------------------------------------
>
>                 Key: GIRAPH-601
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-601
>             Project: Giraph
>          Issue Type: Bug
>            Reporter: Eugene Koontz
>         Attachments: instrumentation.patch
>
>
> Building Giraph with:
> {code}
> mvn -DskipTests  -Phadoop_2.0.3 clean compile
> {code}
> Running pagerank like this:
> {code}
>  $HADOOP_RUNTIME/bin/hadoop jar $JAR \
>          org.apache.giraph.benchmark.PageRankBenchmark \
> 	  -e 10 -s 10 -v -V 10 -w 6
> {code}
> I see this in /tmp/userlogs/application_1364578380737_0003/container_1364578380737_0003_01_000002/:
> {code}
> 2013-03-29 10:58:06,371 DEBUG [org.apache.giraph.master.MasterThread] org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList: Got finished worker list = [Eugenes-MacBook-Pro.local_1, Eugenes-MacBook-Pro.local_3], size = 2, worker list = [Worker(hostname=Eugenes-MacBook-Pro.local, MRtaskID=2, port=30002), Worker(hostname=Eugenes-MacBook-Pro.local, MRtaskID=1, port=30001), Worker(hostname=Eugenes-MacBook-Pro.local, MRtaskID=4, port=30004), Worker(hostname=Eugenes-MacBook-Pro.local, MRtaskID=3, port=30003), Worker(hostname=Eugenes-MacBook-Pro.local, MRtaskID=5, port=30005), Worker(hostname=Eugenes-MacBook-Pro.local, MRtaskID=0, port=30010)], size = 6 from /_hadoopBsp/job_1364578380737_0003/_vertexInputSplitDoneDir
> 2013-03-29 10:58:06,373 WARN [netty-server-exec-3] org.apache.giraph.comm.netty.handler.RequestServerHandler: exceptionCaught: Channel failed with remote address /172.16.175.1:56236
> java.lang.ClassCastException: org.apache.giraph.comm.requests.SendVertexRequest cannot be cast to org.apache.giraph.comm.requests.MasterRequest
> 	at org.apache.giraph.comm.netty.handler.MasterRequestServerHandler.processRequest(MasterRequestServerHandler.java:27)
> 	at org.apache.giraph.comm.netty.handler.RequestServerHandler.messageReceived(RequestServerHandler.java:106)
> 	at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
> 	at org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:71)
> 	at org.jboss.netty.handler.execution.ChannelUpstreamEventRunnable.doRun(ChannelUpstreamEventRunnable.java:45)
> 	at org.jboss.netty.handler.execution.ChannelEventRunnable.run(ChannelEventRunnable.java:69)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
> 	at java.lang.Thread.run(Thread.java:680)
> {code}
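
As a side note on the stack trace quoted above: it is the kind of failure Java produces when a handler downcasts an incoming request to a type that the request does not implement. The sketch below reproduces that failure mode with stand-in types. The interfaces and classes here are hypothetical placeholders that only borrow the names from the trace; they are not Giraph's real classes, and the sketch says nothing about why the request reached the master's handler in the first place.

```java
public class CastSketch {
    /** Stand-in for a generic request type (hypothetical). */
    interface Request {}

    /** Stand-in for a request the master knows how to process (hypothetical). */
    interface MasterRequest extends Request {
        void doRequest();
    }

    /** Stand-in for a worker-bound request; it is NOT a MasterRequest. */
    static class SendVertexRequest implements Request {}

    /** Mimics a handler that assumes every incoming request is a MasterRequest. */
    static void processRequest(Request request) {
        // Unchecked downcast: throws ClassCastException for any other request type.
        MasterRequest masterRequest = (MasterRequest) request;
        masterRequest.doRequest();
    }

    /** Returns true if processing the request fails with a ClassCastException. */
    static boolean failsWithClassCast(Request request) {
        try {
            processRequest(request);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(failsWithClassCast(new SendVertexRequest()));
        MasterRequest ok = () -> { };
        System.out.println(failsWithClassCast(ok));
    }
}
```

The first call reports a failed cast and the second does not, which is why the interesting question for this issue is how a SendVertexRequest ended up being delivered to the master's request handler.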

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
