hama-user mailing list archives

From Mahesh Babu <jmb...@gmail.com>
Subject Hama graph BSPJobClient Job failing - not able to identify the reason.
Date Fri, 06 Sep 2013 09:35:16 GMT
Hi,

When I run a Hama job in pseudo-distributed mode (single node), I get the
following error on stdout:
>>>>>>>>>>>>
attempt_201309061315_0005_000000_0: 13/09/06 14:01:39 DEBUG
fs.FSInputChecker: DFSClient readChunk got seqno 593 offsetInBlock 38862848
lastPacketInBlock false packetLen 66052
attempt_201309061315_0005_000000_0: 13/09/06 14:01:39 DEBUG
fs.FSInputChecker: DFSClient readChunk got seqno 594 offsetInBlock 38928384
lastPacketInBlock false packetLen 66052
attempt_201309061315_0005_000000_0: 13/09/06 14:01:40 DEBUG
fs.FSInputChecker: DFSClient readChunk got seqno 595 offsetInBlock 38993920
lastPacketInBlock false packetLen 66052
attempt_201309061315_0005_000000_0: 13/09/06 14:01:40 DEBUG
fs.FSInputChecker: DFSClient readC
*13/09/06 14:03:29 INFO bsp.BSPJobClient: Job failed.*
<<<<<<<<<<<<


*hama-ubuntu-bspmaster-ubuntu.log*
>>>>>>>>>>>>
2013-09-06 14:03:21,422 DEBUG org.apache.hama.bsp.Counters: Adding
SUPERSTEP_SUM
2013-09-06 14:03:23,423 DEBUG org.apache.hama.bsp.Counters: Adding
SUPERSTEP_SUM
2013-09-06 14:03:25,424 DEBUG org.apache.hama.bsp.Counters: Adding
SUPERSTEP_SUM
*2013-09-06 14:03:25,425 INFO org.apache.hama.bsp.JobInProgress: Taskid
'attempt_201309061315_0005_000000_0' has failed.
2013-09-06 14:03:25,425 INFO org.apache.hama.bsp.TaskInProgress: Task
'task_201309061315_0005_000000' has failed.
*2013-09-06 14:03:25,425 DEBUG org.apache.hama.bsp.JobInProgress: Removing
/tmp/hadoop-ubuntu/bsp/local/bspMaster/job_201309061315_0005.xml and
/tmp/hadoop-ubuntu/bsp/local/bspMaster/job_201309061315_0005.jar getJobFile
= hdfs://localhost:9000/tmp/hadoop-ubuntu*/bsp/system/submit_714o6m/job.xml
2013-09-06 14:03:25,434 INFO org.apache.hama.bsp.JobInProgress: Job failed.
2013-09-06 14:03:25,434 DEBUG org.apache.hama.bsp.JobInProgress: Removing
null and null getJobFile =
hdfs://localhost:9000/tmp/hadoop-ubuntu/bsp/system/submit_714o6m/job.xml
*<<<<<<<<<<<<<

*hama-ubuntu-groom-ubuntu.log*
>>>>>>>>>>>>
2013-09-06 14:03:14,660 DEBUG org.apache.hama.bsp.GroomServer: checking
task: attempt_201309061315_0005_000000_0 starttime =1378456254247 lastping
= 1378456334727 run state = RUNNING monitorPeriod = 10000 check = false
2013-09-06 14:03:24,660 DEBUG org.apache.hama.bsp.GroomServer: checking
task: attempt_201309061315_0005_000000_0 starttime =1378456254247 lastping
= 1378456334727 run state = RUNNING monitorPeriod = 10000 check = true
2013-09-06 14:03:24,660 INFO org.apache.hama.bsp.GroomServer: adding purge
task: attempt_201309061315_0005_000000_0
2013-09-06 14:03:24,660 DEBUG org.apache.hama.bsp.GroomServer: Got 1
oblivious tasks
2013-09-06 14:03:24,661 DEBUG org.apache.hama.bsp.GroomServer: Purging task
org.apache.hama.bsp.GroomServer$TaskInProgress@2e0cd499
*2013-09-06 14:03:24,661 INFO org.apache.hama.bsp.GroomServer: About to
purge task: attempt_201309061315_0005_000000_0
2013-09-06 14:03:24,661 DEBUG org.apache.hama.bsp.GroomServer: Killing
process for attempt_201309061315_0005_000000_0
2013-09-06 14:03:25,436 DEBUG org.apache.hama.bsp.GroomServer: Got Response
from BSPMaster with 1 actions
2013-09-06 14:03:25,437 INFO org.apache.hama.bsp.GroomServer: Kill 1 tasks.
*<<<<<<<<<<<<<

*attempt_201309061315_0005_000000_0.log*
>>>>>>>>>>>>
13/09/06 14:02:06 DEBUG ipc.RPC: Call: ping 2
13/09/06 14:02:07 DEBUG fs.FSInputChecker: DFSClient readChunk got seqno
633 offsetInBlock 41484288 lastPacketInBlock false packetLen 66052
13/09/06 14:02:14 DEBUG bsp.BSPTask: Pinging at time 1378456334726
13/09/06 14:02:14 DEBUG ipc.Client: IPC Client (47) connection to localhost/
127.0.0.1:49551 from ubuntu sending #24
13/09/06 14:02:14 DEBUG ipc.Client: IPC Client (47) connection to localhost/
127.0.0.1:49551 from ubuntu got value #24
13/09/06 14:02:14 DEBUG ipc.RPC: Call: ping 2
13/09/06 14:02:37 DEBUG bsp.BSPTask: Pinging at time 1378456357688
13/09/06 14:02:37 DEBUG ipc.Client: The ping interval is60000ms.
13/09/06 14:02:38 DEBUG ipc.Client: Use SIMPLE authentication for protocol
BSPPeerProtocol
13/09/06 14:02:39 DEBUG ipc.Client: Connecting to localhost/127.0.0.1:49551
13/09/06 14:02:56 DEBUG ipc.Client: The ping interval is60000ms.
13/09/06 14:02:56 DEBUG ipc.Client: Use SIMPLE authentication for protocol
ClientProtocol
13/09/06 14:02:57 DEBUG ipc.Client: Connecting to localhost/127.0.0.1:9000
13/09/06 14:02:58 DEBUG ipc.Client: IPC Client (47) connection to localhost/
127.0.0.1:49551 from ubuntu: closed
13/09/06 14:02:59 DEBUG ipc.Client: IPC Client (47) connection to localhost/
127.0.0.1:49551 from ubuntu: stopped, remaining connections 1
<<<<<<<<<<<<
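
The timestamps in the groom and task logs above can be lined up with a little arithmetic. The following is only a rough sketch for reading the logs, not a diagnosis: it assumes all three logs use the same local clock, and it anchors that clock on the task log line that prints 14:02:14 for the ping at 1378456334726 (so results are accurate to about one second). The helper names are made up for illustration and are not part of Hama.

```python
from datetime import datetime, timedelta

# Values copied from the logs in this message.
LAST_PING_MS = 1378456334727                      # "lastping" in the groom log
TASK_PINGS_MS = [1378456334726, 1378456357688]    # "Pinging at time" in the task log

# Assumption: the task log line stamped 14:02:14 corresponds to the ping at
# 1378456334726, which ties the local wall clock to epoch milliseconds.
ANCHOR_LOCAL = datetime.strptime("14:02:14", "%H:%M:%S")
ANCHOR_MS = 1378456334726


def local_to_epoch_ms(hms_millis: str) -> int:
    """Convert a same-day 'HH:MM:SS,mmm' log time to approximate epoch ms."""
    t = datetime.strptime(hms_millis, "%H:%M:%S,%f")
    # Integer division by one millisecond avoids floating-point rounding.
    return ANCHOR_MS + (t - ANCHOR_LOCAL) // timedelta(milliseconds=1)


def silence_ms(check_time: str) -> int:
    """Milliseconds between the groom's recorded lastping and a groom check."""
    return local_to_epoch_ms(check_time) - LAST_PING_MS


if __name__ == "__main__":
    # The two "checking task" lines in the groom log.
    for check in ("14:03:14,660", "14:03:24,660"):
        print(check, silence_ms(check), "ms since lastping")
    # The second ping the task logged, relative to what the groom recorded.
    print("second task ping is", TASK_PINGS_MS[1] - LAST_PING_MS, "ms after lastping")
```

Under these assumptions, the groom check with check = false falls about 60.7 s after the recorded lastping and the one with check = true about 70.7 s after it. The task log shows a second ping at 1378456357688, about 23 s after that lastping, yet the lastping value in the groom log does not change.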

Any idea why the job is failing? There are no exceptions or failures in any
of the logs, even with logging set to DEBUG.

Thanks,
Mahesh Babu
