giraph-user mailing list archives

From Vishal Patel <write2vis...@gmail.com>
Subject Re: Trying to run the Connected Components example.
Date Tue, 07 Aug 2012 17:22:43 GMT
To add: I ran it with workers=3. The job completed successfully, and the output now
has 6667 lines.

4 mappers:
Mapper 000 (on master)
Mapper 001 (on slave) - wrote out 3333 lines
Mapper 002 (on slave) - wrote out 0 lines
Mapper 003 (on slave) - wrote out 3334 lines
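
A quick consistency check on the counts above, as a minimal sketch. The per-mapper line counts are the ones reported in this message and the 10,000-vertex total comes from the earlier message in the thread; the variable names are illustrative, and the check assumes the output format writes one line per vertex.

```python
# Lines written per mapper with workers=3, as reported above.
lines_written = {"001": 3333, "002": 0, "003": 3334}

# The input graph has 10,000 vertices (stated earlier in the thread).
expected_vertices = 10_000

total = sum(lines_written.values())
missing = expected_vertices - total
print(f"total={total} missing={missing}")
```

Under that assumption, 3,333 vertices are still unaccounted for, and one part file is empty again, which is the same pattern seen with workers=2.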

Vishal


On Tue, Aug 7, 2012 at 10:04 AM, Vishal Patel <write2vishal@gmail.com> wrote:

> Yes, I can run up to 18 mappers. mapred.tasktracker.map.tasks is set to 4
> on the master and slave (there are only 2 machines). The
> mapred.tasktracker.map.tasks.maximum is higher.
>
> So when I set workers=1, Giraph started 2 map tasks: 1 completed, 1 failed.
>
> STATUS: setup: Connected to Zookeeper service master:22181
>
> java.lang.Throwable: Child Error
> 	at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
> Caused by: java.io.IOException: Task process exit with nonzero status of 1.
> 	at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)
>
> Task attempt_201208070927_0001_m_000000_1 failed to report status for 600 seconds. Killing!
>
> The second mapper also hit the same error after 5 minutes:
>
> Task attempt_201208070927_0001_m_000001_0 failed to report status for 601 seconds. Killing!
>
> Next, I tried workers=2. Giraph started 3 mappers, there were no errors, and the job
> showed as "Completed" on Hadoop's web interface. However, the result was incorrect:
>
> /giraph_out/two/part-m-00001 had 0 lines
> /giraph_out/two/part-m-00002 had 5000 lines
>
> Here is the command-line output:
> 12/08/07 09:46:25 INFO mapred.JobClient: Running job: job_201208070927_0003
> 12/08/07 09:46:26 INFO mapred.JobClient:  map 0% reduce 0%
> 12/08/07 09:46:42 INFO mapred.JobClient:  map 100% reduce 0%
> 12/08/07 09:46:47 INFO mapred.JobClient: Job complete: job_201208070927_0003
> 12/08/07 09:46:47 INFO mapred.JobClient: Counters: 41
> 12/08/07 09:46:47 INFO mapred.JobClient:   Job Counters
> 12/08/07 09:46:47 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=25701
> 12/08/07 09:46:47 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
> 12/08/07 09:46:47 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
> 12/08/07 09:46:47 INFO mapred.JobClient:     Launched map tasks=3
> 12/08/07 09:46:47 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
> 12/08/07 09:46:47 INFO mapred.JobClient:   Giraph Timers
> 12/08/07 09:46:47 INFO mapred.JobClient:     Total (milliseconds)=4203
> 12/08/07 09:46:47 INFO mapred.JobClient:     Superstep 3 (milliseconds)=25
> 12/08/07 09:46:47 INFO mapred.JobClient:     Vertex input superstep (milliseconds)=426
> 12/08/07 09:46:47 INFO mapred.JobClient:     Superstep 4 (milliseconds)=69
> 12/08/07 09:46:47 INFO mapred.JobClient:     Setup (milliseconds)=3076
> 12/08/07 09:46:47 INFO mapred.JobClient:     Superstep 7 (milliseconds)=17
> 12/08/07 09:46:47 INFO mapred.JobClient:     Shutdown (milliseconds)=93
> 12/08/07 09:46:47 INFO mapred.JobClient:     Superstep 0 (milliseconds)=197
> 12/08/07 09:46:47 INFO mapred.JobClient:     Superstep 8 (milliseconds)=62
> 12/08/07 09:46:47 INFO mapred.JobClient:     Superstep 9 (milliseconds)=21
> 12/08/07 09:46:47 INFO mapred.JobClient:     Superstep 6 (milliseconds)=59
> 12/08/07 09:46:47 INFO mapred.JobClient:     Superstep 5 (milliseconds)=19
> 12/08/07 09:46:47 INFO mapred.JobClient:     Superstep 2 (milliseconds)=78
> 12/08/07 09:46:47 INFO mapred.JobClient:     Superstep 1 (milliseconds)=57
> 12/08/07 09:46:47 INFO mapred.JobClient:   Giraph Stats
> 12/08/07 09:46:47 INFO mapred.JobClient:     Aggregate edges=10005
> 12/08/07 09:46:47 INFO mapred.JobClient:     Superstep=10
> 12/08/07 09:46:47 INFO mapred.JobClient:     Last checkpointed superstep=8
> 12/08/07 09:46:47 INFO mapred.JobClient:     Current workers=2
> 12/08/07 09:46:47 INFO mapred.JobClient:     Current master task partition=0
> 12/08/07 09:46:47 INFO mapred.JobClient:     Sent messages=0
> 12/08/07 09:46:47 INFO mapred.JobClient:     Aggregate finished vertices=5000
> 12/08/07 09:46:47 INFO mapred.JobClient:     Aggregate vertices=5000
> 12/08/07 09:46:47 INFO mapred.JobClient:   File Output Format Counters
> 12/08/07 09:46:47 INFO mapred.JobClient:     Bytes Written=0
> 12/08/07 09:46:47 INFO mapred.JobClient:   FileSystemCounters
> 12/08/07 09:46:47 INFO mapred.JobClient:     FILE_BYTES_READ=236
> 12/08/07 09:46:47 INFO mapred.JobClient:     HDFS_BYTES_READ=146766
> 12/08/07 09:46:47 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=66618
> 12/08/07 09:46:47 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=655800
> 12/08/07 09:46:47 INFO mapred.JobClient:   File Input Format Counters
> 12/08/07 09:46:47 INFO mapred.JobClient:     Bytes Read=0
> 12/08/07 09:46:47 INFO mapred.JobClient:   Map-Reduce Framework
> 12/08/07 09:46:47 INFO mapred.JobClient:     Map input records=3
> 12/08/07 09:46:47 INFO mapred.JobClient:     Physical memory (bytes) snapshot=364912640
> 12/08/07 09:46:47 INFO mapred.JobClient:     Spilled Records=0
> 12/08/07 09:46:47 INFO mapred.JobClient:     CPU time spent (ms)=3840
> 12/08/07 09:46:47 INFO mapred.JobClient:     Total committed heap usage (bytes)=602996736
> 12/08/07 09:46:47 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=9814392832
> 12/08/07 09:46:47 INFO mapred.JobClient:     Map output records=0
> 12/08/07 09:46:47 INFO mapred.JobClient:     SPLIT_RAW_BYTES=132
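
The counters in the output above can be compared against the known graph size with a few lines of code. The following is a minimal sketch, not part of Giraph or Hadoop: the regular expression, the `parse_counters` helper and the sample string (four lines copied from the log above) are illustrative assumptions about how one might read `name=value` counter lines.

```python
import re

# Matches JobClient counter lines of the form "... INFO mapred.JobClient:   Name=123".
COUNTER = re.compile(r"INFO mapred\.JobClient:\s+(?P<name>[^=]+)=(?P<value>-?\d+)\s*$")

def parse_counters(log_text):
    """Return a dict of counter name -> integer value from JobClient output."""
    counters = {}
    for line in log_text.splitlines():
        match = COUNTER.search(line)
        if match:
            counters[match.group("name").strip()] = int(match.group("value"))
    return counters

# Four counter lines copied from the workers=2 run above.
sample = """\
12/08/07 09:46:47 INFO mapred.JobClient:     Aggregate edges=10005
12/08/07 09:46:47 INFO mapred.JobClient:     Aggregate finished vertices=5000
12/08/07 09:46:47 INFO mapred.JobClient:     Aggregate vertices=5000
12/08/07 09:46:47 INFO mapred.JobClient:     Current workers=2
"""

counters = parse_counters(sample)
expected_vertices = 10_000  # graph size stated in the thread
shortfall = expected_vertices - counters["Aggregate vertices"]
print(f"vertices seen={counters['Aggregate vertices']} shortfall={shortfall}")
```

The job itself reports only 5,000 vertices, which may indicate that half of the graph was never part of the computation, rather than being lost when the output was written.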
>
> From the web interface:
>
> Mapper 000 (went to master), status: MASTER_ZOOKEEPER_ONLY - 2 finished out of 2 on superstep 9
> Mapper 001 (went to slave), status: finishSuperstep: (all workers done) WORKER_ONLY - Attempt=0, Superstep=10
> Mapper 002 (went to master), status: finishSuperstep: (all workers done) WORKER_ONLY - Attempt=0, Superstep=10
>
> Here is the last 8 KB of the syslog for Mapper 2 (Mapper 3 was similar):
>
> 2012-08-07 09:46:34,644 WARN org.apache.giraph.graph.BspService: process: Unknown and unprocessed event (path=/_hadoopBsp/job_201208070927_0003/_applicationAttemptsDir/0/_superstepDir/6/_superstepFinished, type=NodeDeleted, state=SyncConnected)
> 2012-08-07 09:46:34,646 INFO org.apache.giraph.graph.BspServiceWorker: registerHealth: Created my health node for attempt=0, superstep=8 with /_hadoopBsp/job_201208070927_0003/_applicationAttemptsDir/0/_superstepDir/8/_workerHealthyDir/slave_1 and workerInfo= Worker(hostname=slave <http://pointblank.corpdom.com>, MRpartition=1, port=30001)
> 2012-08-07 09:46:34,650 INFO org.apache.giraph.graph.BspService: process: partitionAssignmentsReadyChanged (partitions are assigned)
> 2012-08-07 09:46:34,651 INFO org.apache.giraph.graph.BspServiceWorker: startSuperstep: Ready for computation on superstep 8 since worker selection and vertex range assignments are done in /_hadoopBsp/job_201208070927_0003/_applicationAttemptsDir/0/_superstepDir/8/_partitionAssignments
> 2012-08-07 09:46:34,651 INFO org.apache.giraph.graph.BspServiceWorker: getAggregatorValues: no aggregators in /_hadoopBsp/job_201208070927_0003/_applicationAttemptsDir/0/_superstepDir/7/_mergedAggregatorDir on superstep 8
> 2012-08-07 09:46:34,652 INFO org.apache.giraph.graph.BspServiceWorker: exchangeVertexPartitions: Nothing to exchange, exiting early
> 2012-08-07 09:46:34,674 INFO org.apache.giraph.graph.BspServiceWorker: storeCheckpoint: Finished metadata (_bsp/_checkpoints/job_201208070927_0003/8.slave_1.metadata) and vertices (_bsp/_checkpoints/job_201208070927_0003/8.slave_1.vertices).
> 2012-08-07 09:46:34,679 INFO org.apache.giraph.comm.BasicRPCCommunications: flush: starting for superstep 8 totalMem = 191.6875M, maxMem = 191.6875M, freeMem = 164.97296M
> 2012-08-07 09:46:34,679 INFO org.apache.giraph.comm.BasicRPCCommunications: flush: ended for superstep 8 totalMem = 191.6875M, maxMem = 191.6875M, freeMem = 164.97285M
> 2012-08-07 09:46:34,679 INFO org.apache.giraph.graph.BspServiceWorker: finishSuperstep: Superstep 8 totalMem = 191.6875M, maxMem = 191.6875M, freeMem = 164.97285M
> 2012-08-07 09:46:34,689 INFO org.apache.giraph.graph.BspService: process: superstepFinished signaled
> 2012-08-07 09:46:34,690 INFO org.apache.giraph.graph.BspServiceWorker: finishSuperstep: Completed superstep 8 with global stats (vtx=5000,finVtx=5000,edges=10005,msgCount=20)
> 2012-08-07 09:46:34,690 INFO org.apache.giraph.comm.BasicRPCCommunications: prepareSuperstep: Superstep 9 totalMem = 191.6875M, maxMem = 191.6875M, freeMem = 164.97285M
> 2012-08-07 09:46:34,696 INFO org.apache.giraph.graph.BspServiceWorker: registerHealth: Created my health node for attempt=0, superstep=9 with /_hadoopBsp/job_201208070927_0003/_applicationAttemptsDir/0/_superstepDir/9/_workerHealthyDir/slave_1 and workerInfo= Worker(hostname=slave <http://pointblank.corpdom.com>, MRpartition=1, port=30001)
> 2012-08-07 09:46:34,701 WARN org.apache.giraph.graph.BspService: process: Unknown and unprocessed event (path=/_hadoopBsp/job_201208070927_0003/_applicationAttemptsDir/0/_superstepDir/7/_partitionAssignments, type=NodeDeleted, state=SyncConnected)
> 2012-08-07 09:46:34,705 WARN org.apache.giraph.graph.BspService: process: Unknown and unprocessed event (path=/_hadoopBsp/job_201208070927_0003/_applicationAttemptsDir/0/_superstepDir/7/_superstepFinished, type=NodeDeleted, state=SyncConnected)
> 2012-08-07 09:46:34,714 INFO org.apache.giraph.graph.BspService: process: partitionAssignmentsReadyChanged (partitions are assigned)
> 2012-08-07 09:46:34,715 INFO org.apache.giraph.graph.BspServiceWorker: startSuperstep: Ready for computation on superstep 9 since worker selection and vertex range assignments are done in /_hadoopBsp/job_201208070927_0003/_applicationAttemptsDir/0/_superstepDir/9/_partitionAssignments
> 2012-08-07 09:46:34,716 INFO org.apache.giraph.graph.BspServiceWorker: getAggregatorValues: no aggregators in /_hadoopBsp/job_201208070927_0003/_applicationAttemptsDir/0/_superstepDir/8/_mergedAggregatorDir on superstep 9
> 2012-08-07 09:46:34,716 INFO org.apache.giraph.graph.BspServiceWorker: exchangeVertexPartitions: Nothing to exchange, exiting early
> 2012-08-07 09:46:34,716 INFO org.apache.giraph.comm.BasicRPCCommunications: flush: starting for superstep 9 totalMem = 191.6875M, maxMem = 191.6875M, freeMem = 164.47278M
> 2012-08-07 09:46:34,716 INFO org.apache.giraph.comm.BasicRPCCommunications: flush: ended for superstep 9 totalMem = 191.6875M, maxMem = 191.6875M, freeMem = 164.47269M
> 2012-08-07 09:46:34,716 INFO org.apache.giraph.graph.BspServiceWorker: finishSuperstep: Superstep 9 totalMem = 191.6875M, maxMem = 191.6875M, freeMem = 164.47269M
> 2012-08-07 09:46:34,721 INFO org.apache.giraph.graph.BspService: process: superstepFinished signaled
> 2012-08-07 09:46:34,722 INFO org.apache.giraph.graph.BspServiceWorker: finishSuperstep: Completed superstep 9 with global stats (vtx=5000,finVtx=5000,edges=10005,msgCount=0)
> 2012-08-07 09:46:34,722 INFO org.apache.giraph.graph.GraphMapper: map: BSP application done (global vertices marked done)
> 2012-08-07 09:46:34,722 INFO org.apache.giraph.graph.GraphMapper: cleanup: Starting for WORKER_ONLY
> 2012-08-07 09:46:34,722 WARN org.apache.giraph.graph.BspService: process: Unknown and unprocessed event (path=/_hadoopBsp/job_201208070927_0003/_applicationAttemptsDir/0/_superstepDir/8/_partitionAssignments, type=NodeDeleted, state=SyncConnected)
> 2012-08-07 09:46:34,726 WARN org.apache.giraph.graph.BspService: process: Unknown and unprocessed event (path=/_hadoopBsp/job_201208070927_0003/_applicationAttemptsDir/0/_superstepDir/8/_superstepFinished, type=NodeDeleted, state=SyncConnected)
> 2012-08-07 09:46:34,729 INFO org.apache.giraph.graph.BspServiceWorker: cleanup: Notifying master its okay to cleanup with /_hadoopBsp/job_201208070927_0003/_cleanedUpDir/1_worker
> 2012-08-07 09:46:34,731 INFO org.apache.zookeeper.ZooKeeper: Session: 0x138fe1c4699004a closed
> 2012-08-07 09:46:34,731 INFO org.apache.giraph.comm.BasicRPCCommunications: close: shutting down RPC server
> 2012-08-07 09:46:34,731 INFO org.apache.hadoop.ipc.Server: Stopping server on 30011
> 2012-08-07 09:46:34,731 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
> 2012-08-07 09:46:34,731 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 30011: exiting
> 2012-08-07 09:46:34,731 INFO org.apache.hadoop.ipc.metrics.RpcInstrumentation: shut down
> 2012-08-07 09:46:34,731 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 30011: exiting
> 2012-08-07 09:46:34,731 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
> 2012-08-07 09:46:34,731 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 30011
> 2012-08-07 09:46:34,731 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 30011: exiting
> 2012-08-07 09:46:34,731 INFO org.apache.giraph.zk.ZooKeeperManager: createZooKeeperClosedStamp: Creating my filestamp _bsp/_defaultZkManagerDir/job_201208070927_0003/_task/1.COMPUTATION_DONE
> 2012-08-07 09:46:34,736 INFO org.apache.hadoop.mapred.Task: Task:attempt_201208070927_0003_m_000001_0 is done. And is in the process of commiting
> 2012-08-07 09:46:35,837 INFO org.apache.hadoop.mapred.Task: Task attempt_201208070927_0003_m_000001_0 is allowed to commit now
> 2012-08-07 09:46:35,848 INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Saved output of task 'attempt_201208070927_0003_m_000001_0' to hdfs:/user/vpatel/giraph_out/two
> 2012-08-07 09:46:37,776 INFO org.apache.hadoop.mapred.Task: Task 'attempt_201208070927_0003_m_000001_0' done.
> 2012-08-07 09:46:37,779 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
> 2012-08-07 09:46:37,812 INFO org.apache.hadoop.io.nativeio.NativeIO: Initialized cache for UID to User mapping with a cache timeout of 14400 seconds.
> 2012-08-07 09:46:37,813 INFO org.apache.hadoop.io.nativeio.NativeIO: Got UserName vpatel for UID 10020 from the native implementation
>
>
> I have attached the network file to this email. It has 10,000 lines corresponding to
> the 10,000 nodes in the adjacency list format (tab separated).
>
>
>
> Here is jps from master:
> 5178 TaskTracker
> 4662 DataNode
> 4491 NameNode
> 34115 RunJar
> 4865 SecondaryNameNode
> 8385 Jps
> 29410 QuorumPeerMain
> 4991 JobTracker
>
> jps from slave
> 48621 TaskTracker
> 48464 DataNode
> 51391 Jps
>
>
> Thank you again for your help,
>
> Vishal
>
>
>
>
> On Tue, Aug 7, 2012 at 12:07 AM, Sebastian Schelter <ssc@apache.org> wrote:
>
>> Can you check what the mappers were doing via the web interface of
>> Hadoop? Can you run 4 mappers at once?
>>
>>
>>
>> On 07.08.2012 01:46, Vishal Patel wrote:
>> > I'm seeing a strange behavior that I can't explain.
>> >
>> >
>> > hadoop jar giraph-0.1-jar-with-dependencies.jar
>> > org.apache.giraph.GiraphRunner
>> > org.apache.giraph.examples.ConnectedComponentsVertex --inputFormat
>> > org.apache.giraph.examples.IntIntNullIntTextInputFormat --inputPath
>> > /user/vpatel/graph_in/elist.txt --outputFormat
>> > org.apache.giraph.examples.VertexWithComponentTextOutputFormat --outputPath
>> > hdfs:///user/vpatel/giraph_out/1 --workers 4 --combiner
>> > org.apache.giraph.examples.MinimumIntCombiner
>> > Warning: $HADOOP_HOME is deprecated.
>> >
>> > 12/08/06 16:16:40 INFO mapred.JobClient: Running job: job_201208031459_0591
>> > 12/08/06 16:16:41 INFO mapred.JobClient:  map 0% reduce 0%
>> > 12/08/06 16:16:59 INFO mapred.JobClient:  map 20% reduce 0%
>> > 12/08/06 16:17:05 INFO mapred.JobClient:  map 40% reduce 0%
>> > 12/08/06 16:17:08 INFO mapred.JobClient:  map 100% reduce 0%
>> > 12/08/06 16:17:11 INFO mapred.JobClient:  map 80% reduce 0%
>> > 12/08/06 16:17:16 INFO mapred.JobClient: Task Id :
>> > attempt_201208031459_0591_m_000000_0, Status : FAILED
>> > *java.lang.Throwable: Child Error
>> >         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
>> > Caused by: java.io.IOException: Task process exit with nonzero status of 1.
>> >         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)
>> > *
>> >
>> > I either get the above error, which I can avoid by decreasing the number of
>> > workers (based on a previous post on the mailing list), or I get incomplete output.
>> >
>> > When I specify fewer workers (say 2), or on the occasions when I don't get
>> > the above error, the result is missing for one part in HDFS.
>> > For example, with workers=2 I got two parts: one had 5,000 of the 10k nodes
>> > and the other was blank. This also happened with workers=4, 5, etc.
>> >
>> > There are no errors in the log.
>> >
>> > Just to be clear, the input format is an adjacency list.
>> > For example, if a -> b, a -> c and b -> d, then the file is
>> > a b c
>> > b a d
>> > c a
>> > d b
>> >
>> > since the graph is undirected. Any idea what could be wrong?
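
The adjacency-list layout described above can be produced from an edge list with a short script. This is a minimal sketch under stated assumptions: `edges_to_adjacency` is a hypothetical helper, the vertex names are the toy ones from the message, and the separator is a single space here, whereas the real input file in the thread is described as tab separated with integer ids.

```python
from collections import defaultdict

def edges_to_adjacency(edges):
    """Build undirected adjacency lists from (u, v) pairs."""
    adjacency = defaultdict(set)
    for u, v in edges:
        # Record the edge in both directions because the graph is undirected.
        adjacency[u].add(v)
        adjacency[v].add(u)
    # Sort vertices and neighbours so the output is deterministic.
    return {vertex: sorted(neighbours) for vertex, neighbours in sorted(adjacency.items())}

# Toy edges from the message: a -> b, a -> c, b -> d.
edges = [("a", "b"), ("a", "c"), ("b", "d")]
adjacency = edges_to_adjacency(edges)

# One line per vertex: the vertex followed by its neighbours.
lines = [" ".join([vertex, *neighbours]) for vertex, neighbours in adjacency.items()]
print("\n".join(lines))
```

Note that a vertex with no edges never appears in an edge list, so it would have to be added separately if the input is expected to contain one line per vertex.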
>> >
>> > Here is the log when I do workers=1
>> >
>> > Finally loaded a total of *(v=10000, e=19996)*
>> > 2012-08-06 16:39:13,902 INFO org.apache.giraph.graph.BspService:
>> > process: inputSplitsAllDoneChanged (all vertices sent from input
>> > splits)
>> > 2012-08-06 16:39:13,904 INFO
>> > org.apache.giraph.comm.BasicRPCCommunications: flush: starting for
>> > superstep -1 totalMem = 191.6875M, maxMem = 191.6875M, freeMem =
>> > 164.6044M
>> > 2012-08-06 16:39:13,906 INFO
>> > org.apache.giraph.comm.BasicRPCCommunications: flush: ended for
>> > superstep -1 totalMem = 191.6875M, maxMem = 191.6875M, freeMem =
>> > 164.60431M
>> > 2012-08-06 16:39:13,906 INFO org.apache.giraph.graph.BspServiceWorker:
>> > finishSuperstep: Superstep -1 totalMem = 191.6875M, maxMem =
>> > 191.6875M, freeMem = 164.60431M
>> > 2012-08-06 16:39:13,922 INFO org.apache.giraph.graph.BspService:
>> > process: superstepFinished signaled
>> > 2012-08-06 16:39:13,924 INFO org.apache.giraph.graph.BspServiceWorker:
>> > finishSuperstep: Completed superstep -1 with global stats
>> > (vtx=0,finVtx=0,edges=0,msgCount=0)
>> > 2012-08-06 16:39:13,924 INFO org.apache.giraph.graph.GraphMapper:
>> > cleanup: Starting for WORKER_ONLY
>> > 2012-08-06 16:39:13,925 INFO org.apache.giraph.graph.BspServiceWorker:
>> > processEvent: Job state changed, checking to see if it needs to
>> > restart
>> > 2012-08-06 16:39:13,926 INFO org.apache.giraph.graph.BspService:
>> > getJobState: Job state already exists
>> > (/_hadoopBsp/job_201208031459_0621/_masterJobState)
>> > 2012-08-06 16:39:13,929 INFO org.apache.giraph.graph.BspServiceWorker:
>> > cleanup: Notifying master its okay to cleanup with
>> > /_hadoopBsp/job_201208031459_0621/_cleanedUpDir/1_worker
>> > 2012-08-06 16:39:13,930 INFO org.apache.zookeeper.ZooKeeper: Session:
>> > 0x138fe1c4699003d closed
>> > 2012-08-06 16:39:13,930 INFO
>> > org.apache.giraph.comm.BasicRPCCommunications: close: shutting down
>> > RPC server
>> > 2012-08-06 16:39:13,930 INFO org.apache.zookeeper.ClientCnxn:
>> > EventThread shut down
>> > 2012-08-06 16:39:13,930 INFO org.apache.hadoop.ipc.Server: Stopping
>> > server on 30003
>> > 2012-08-06 16:39:13,930 INFO org.apache.hadoop.ipc.Server: IPC Server
>> > handler 0 on 30003: exiting
>> > 2012-08-06 16:39:13,930 INFO org.apache.hadoop.ipc.Server: Stopping
>> > IPC Server listener on 30003
>> > 2012-08-06 16:39:13,930 INFO
>> > org.apache.hadoop.ipc.metrics.RpcInstrumentation: shut down
>> > 2012-08-06 16:39:13,931 INFO org.apache.hadoop.ipc.Server: Stopping
>> > IPC Server Responder
>> > 2012-08-06 16:39:13,931 INFO org.apache.hadoop.ipc.Server: IPC Server
>> > handler 1 on 30003: exiting
>> > 2012-08-06 16:39:13,931 INFO org.apache.giraph.zk.ZooKeeperManager:
>> > createZooKeeperClosedStamp: Creating my filestamp
>> > _bsp/_defaultZkManagerDir/job_201208031459_0621/_task/1.COMPUTATION_DONE
>> > 2012-08-06 16:39:13,934 INFO org.apache.hadoop.mapred.Task:
>> > Task:attempt_201208031459_0621_m_000001_0 is done. And is in the
>> > process of commiting
>> > 2012-08-06 16:39:15,026 INFO org.apache.hadoop.mapred.Task: Task
>> > attempt_201208031459_0621_m_000001_0 is allowed to commit now
>> > 2012-08-06 16:39:15,036 INFO
>> > org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Saved
>> > output of task 'attempt_201208031459_0621_m_000001_0' to
>> > hdfs:/user/vpatel/giraph_out/one
>> > 2012-08-06 16:39:16,068 INFO org.apache.hadoop.mapred.Task: Task
>> > 'attempt_201208031459_0621_m_000001_0' done.
>> > 2012-08-06 16:39:16,087 INFO
>> > org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs'
>> > truncater with mapRetainSize=-1 and reduceRetainSize=-1
>> > 2012-08-06 16:39:16,117 INFO org.apache.hadoop.io.nativeio.NativeIO:
>> > Initialized cache for UID to User mapping with a cache timeout of
>> > 14400 seconds.
>> > 2012-08-06 16:39:16,118 INFO org.apache.hadoop.io.nativeio.NativeIO:
>> > Got UserName vpatel for UID 10020 from the native implementation
>> >
>> >
>> >
>> >
>> > On Mon, Aug 6, 2012 at 3:05 PM, Sebastian Schelter <ssc@apache.org> wrote:
>> >
>> >> The job expects the input data in adjacency list format, each line
>> >> should look like:
>> >>
>> >> vertex neighbor1 neighbor2 ....
>> >>
>> >> --sebastian
>> >>
>> >>
>> >> On 07.08.2012 00:02, Vishal Patel wrote:
>> >>> Thanks Sebastian, it runs fine now. However, the output comes back as
>> >>>
>> >>> 0       0
>> >>> 1       1
>> >>> 2       2
>> >>> 3       3
>> >>> 4       4
>> >>> 5       5
>> >>> 6       6
>> >>> ..
>> >>>
>> >>> I have an unsorted edge file with just int values.
>> >>> http://www.ics.uci.edu/~vishalrp/public/testg.txt
>> >>>
>> >>> My test graph (head below) has 10,000 nodes (from 0 to 9999) and 9998
>> >>> edges. There are 4 connected components in the graph.
>> >>>
>> >>> 0       5800
>> >>> 0       5981
>> >>> 1       1239
>> >>> 1       2989
>> >>> 1       3961
>> >>> 2       5417
>> >>> 2       7350
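
For an edge list like the one above, a reference answer can be computed outside Giraph and compared with the job output. The following is a minimal sketch using union-find, which is a swapped-in technique and not how Giraph computes components. The `component_labels` helper and the toy graph are illustrative assumptions, and labelling each component by its smallest vertex id is a convention chosen here.

```python
def component_labels(num_vertices, edges):
    """Label every vertex 0..num_vertices-1 with the smallest id in its component."""
    parent = list(range(num_vertices))

    def find(x):
        # Walk to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            # Keep the smaller id as the root, so each root is its component's minimum.
            if root_u < root_v:
                parent[root_v] = root_u
            else:
                parent[root_u] = root_v
    return [find(vertex) for vertex in range(num_vertices)]

# Toy graph (not the real data): components {0, 1, 2}, {3, 4} and {5}.
labels = component_labels(6, [(0, 1), (1, 2), (3, 4)])
num_components = len(set(labels))
print(labels, num_components)
```

Run on the real edge file with `num_vertices=10000`, the number of distinct labels should match the 4 components mentioned above if the file is as described.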
>> >>>
>> >>> What am I doing wrong? Also, in general, does the graph have to have int
>> >>> values for nodes? Or can I have strings?
>> >>>
>> >>> Appreciate your help!
>> >>>
>> >>> Vishal
>> >>>
>> >>>
>> >>>
>> >>>
>> >>> On Mon, Aug 6, 2012 at 2:22 PM, Sebastian Schelter <ssc@apache.org> wrote:
>> >>>
>> >>>> You cannot run the vertex class directly. Instead you can use
>> >>>> GiraphRunner, e.g.
>> >>>>
>> >>>> hadoop jar giraph-jar-with-dependencies.jar
>> >>>> org.apache.giraph.GiraphRunner
>> >>>> org.apache.giraph.examples.ConnectedComponentsVertex --inputFormat
>> >>>> org.apache.giraph.examples.IntIntNullIntTextInputFormat --inputPath
>> >>>> hdfs:///path/to/input --outputFormat
>> >>>> org.apache.giraph.examples.VertexWithComponentTextOutputFormat
>> >>>> --outputPath hdfs:///path/to/output --workers numWorkers --combiner
>> >>>> org.apache.giraph.examples.MinimumIntCombiner
>> >>>>
>> >>>> --sebastian
>> >>>>
>> >>>>
>> >>>> 2012/8/6 Vishal Patel <write2vishal@gmail.com>:
>> >>>>> Hi, I am trying to run the connected-components example. I have Giraph
>> >>>>> installed, and all the tests pass on a 3-node cluster running hadoop-1.0.3.
>> >>>>>
>> >>>>> The main method is missing in the ConnectedComponentsVertex class:
>> >>>>>
>> >>>>> cd target/classes
>> >>>>> hadoop jar ../giraph-0.1-jar-with-dependencies.jar
>> >>>>> org.apache.giraph.examples.ConnectedComponentsVertex
>> >>>>>
>> >>>>> Exception in thread "main" java.lang.NoSuchMethodException:
>> >>>>> org.apache.giraph.examples.ConnectedComponentsVertex.main([Ljava.lang.String;)
>> >>>>>         at java.lang.Class.getMethod(Class.java:1622)
>> >>>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:150)
>> >>>>>
>> >>>>> Can someone please help me with running this example?
>> >>>>>
>> >>>>> Vishal
>> >>>>>
>> >>>>
>> >>>
>> >>
>> >>
>> >
>>
>>
>
