giraph-user mailing list archives

From Vishal Patel <write2vis...@gmail.com>
Subject Re: Trying to run the Connected Components example.
Date Tue, 07 Aug 2012 17:04:51 GMT
Yes, I can run up to 18 mappers. mapred.tasktracker.map.tasks is set to 4 on
the master and the slave (there are only 2 machines), and
mapred.tasktracker.map.tasks.maximum is higher.
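
The task counts reported below (workers=1 launched 2 map tasks, workers=2
launched 3) are consistent with Giraph starting one extra map task besides the
workers. A rough sketch of the slot check, assuming that "workers + 1" rule
holds and using the 18 concurrent mappers mentioned above; the function names
are only illustrative and are not part of Giraph or Hadoop:

```python
def map_tasks_needed(workers: int, extra_tasks: int = 1) -> int:
    """Map tasks a job launches: the workers plus the extra task(s).

    extra_tasks=1 is an assumption taken from the counts seen in this thread.
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return workers + extra_tasks


def fits_in_cluster(workers: int, concurrent_map_slots: int) -> bool:
    """True if every map task of the job can hold a slot at the same time."""
    return map_tasks_needed(workers) <= concurrent_map_slots


for w in (1, 2, 4):
    print(w, map_tasks_needed(w), fits_in_cluster(w, concurrent_map_slots=18))
```

By this check, none of the worker counts tried here should run out of slots on
an 18-slot cluster.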

So when I set workers=1, Giraph started 2 map tasks: 1 completed and 1 failed.

STATUS: setup: Connected to Zookeeper service master:22181

java.lang.Throwable: Child Error
	at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
Caused by: java.io.IOException: Task process exit with nonzero status of 1.
	at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)

Task attempt_201208070927_0001_m_000000_1 failed to report status for
600 seconds. Killing!

The second mapper also hit the same error after 5 minutes:

Task attempt_201208070927_0001_m_000001_0 failed to report status for
601 seconds. Killing!




Next, I tried workers=2. Giraph started 3 mappers with no errors, and
the job was shown as "Completed" in Hadoop's web interface. However, the
solution was incorrect:

/giraph_out/two/part-m-00001 had 0 lines


/giraph_out/two/part-m-00002 had 5000 lines
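
One way to see exactly which vertices are missing is to compute the expected
labels locally and compare them with the part files. The sketch below assumes
the output lines are "vertex<TAB>component" (as in the earlier output in this
thread) and that each vertex should end up labeled with the smallest vertex ID
in its component. Both are assumptions here, and the helper names are made up
for illustration:

```python
def expected_components(adjacency_lines):
    """Union-find over lines of the form 'vertex neighbor1 neighbor2 ...'.

    Returns {vertex: smallest vertex id in its component}.
    """
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the smaller id as the root, so the root is the label.
            parent[max(ra, rb)] = min(ra, rb)

    for line in adjacency_lines:
        ids = [int(tok) for tok in line.split()]
        if not ids:
            continue
        find(ids[0])  # register vertices that have no neighbors too
        for nb in ids[1:]:
            union(ids[0], nb)
    return {v: find(v) for v in list(parent)}


def compare(expected, output_lines):
    """Return (vertices missing from the output, vertices with a wrong label)."""
    got = {}
    for line in output_lines:
        parts = line.split()
        if len(parts) == 2:
            got[int(parts[0])] = int(parts[1])
    missing = sorted(set(expected) - set(got))
    wrong = sorted(v for v in got if v in expected and got[v] != expected[v])
    return missing, wrong


# Tiny example: the a/b/c/d graph from this thread as ids 0..3, plus a pair 4-5.
adj = ["0 1 2", "1 0 3", "2 0", "3 1", "4 5", "5 4"]
exp = expected_components(adj)
print(compare(exp, ["0\t0", "1\t0", "2\t0", "4\t4"]))
```

For the real job, the same two functions could be fed the lines of the input
file and the concatenated part-m-* files (for example, fetched with
`hadoop fs -cat`).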


Here is the command-line output:
12/08/07 09:46:25 INFO mapred.JobClient: Running job: job_201208070927_0003
12/08/07 09:46:26 INFO mapred.JobClient:  map 0% reduce 0%


12/08/07 09:46:42 INFO mapred.JobClient:  map 100% reduce 0%
12/08/07 09:46:47 INFO mapred.JobClient: Job complete: job_201208070927_0003
12/08/07 09:46:47 INFO mapred.JobClient: Counters: 41
12/08/07 09:46:47 INFO mapred.JobClient:   Job Counters


12/08/07 09:46:47 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=25701
12/08/07 09:46:47 INFO mapred.JobClient:     Total time spent by all
reduces waiting after reserving slots (ms)=0
12/08/07 09:46:47 INFO mapred.JobClient:     Total time spent by all
maps waiting after reserving slots (ms)=0


12/08/07 09:46:47 INFO mapred.JobClient:     Launched map tasks=3
12/08/07 09:46:47 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
12/08/07 09:46:47 INFO mapred.JobClient:   Giraph Timers
12/08/07 09:46:47 INFO mapred.JobClient:     Total (milliseconds)=4203


12/08/07 09:46:47 INFO mapred.JobClient:     Superstep 3 (milliseconds)=25
12/08/07 09:46:47 INFO mapred.JobClient:     Vertex input superstep
(milliseconds)=426
12/08/07 09:46:47 INFO mapred.JobClient:     Superstep 4 (milliseconds)=69


12/08/07 09:46:47 INFO mapred.JobClient:     Setup (milliseconds)=3076
12/08/07 09:46:47 INFO mapred.JobClient:     Superstep 7 (milliseconds)=17
12/08/07 09:46:47 INFO mapred.JobClient:     Shutdown (milliseconds)=93


12/08/07 09:46:47 INFO mapred.JobClient:     Superstep 0 (milliseconds)=197
12/08/07 09:46:47 INFO mapred.JobClient:     Superstep 8 (milliseconds)=62
12/08/07 09:46:47 INFO mapred.JobClient:     Superstep 9 (milliseconds)=21


12/08/07 09:46:47 INFO mapred.JobClient:     Superstep 6 (milliseconds)=59
12/08/07 09:46:47 INFO mapred.JobClient:     Superstep 5 (milliseconds)=19
12/08/07 09:46:47 INFO mapred.JobClient:     Superstep 2 (milliseconds)=78


12/08/07 09:46:47 INFO mapred.JobClient:     Superstep 1 (milliseconds)=57
12/08/07 09:46:47 INFO mapred.JobClient:   Giraph Stats
12/08/07 09:46:47 INFO mapred.JobClient:     Aggregate edges=10005
12/08/07 09:46:47 INFO mapred.JobClient:     Superstep=10


12/08/07 09:46:47 INFO mapred.JobClient:     Last checkpointed superstep=8
12/08/07 09:46:47 INFO mapred.JobClient:     Current workers=2
12/08/07 09:46:47 INFO mapred.JobClient:     Current master task partition=0


12/08/07 09:46:47 INFO mapred.JobClient:     Sent messages=0
12/08/07 09:46:47 INFO mapred.JobClient:     Aggregate finished vertices=5000
12/08/07 09:46:47 INFO mapred.JobClient:     Aggregate vertices=5000
12/08/07 09:46:47 INFO mapred.JobClient:   File Output Format Counters


12/08/07 09:46:47 INFO mapred.JobClient:     Bytes Written=0
12/08/07 09:46:47 INFO mapred.JobClient:   FileSystemCounters
12/08/07 09:46:47 INFO mapred.JobClient:     FILE_BYTES_READ=236
12/08/07 09:46:47 INFO mapred.JobClient:     HDFS_BYTES_READ=146766


12/08/07 09:46:47 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=66618
12/08/07 09:46:47 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=655800
12/08/07 09:46:47 INFO mapred.JobClient:   File Input Format Counters

12/08/07 09:46:47 INFO mapred.JobClient:     Bytes Read=0

12/08/07 09:46:47 INFO mapred.JobClient:   Map-Reduce Framework
12/08/07 09:46:47 INFO mapred.JobClient:     Map input records=3
12/08/07 09:46:47 INFO mapred.JobClient:     Physical memory (bytes)
snapshot=364912640


12/08/07 09:46:47 INFO mapred.JobClient:     Spilled Records=0
12/08/07 09:46:47 INFO mapred.JobClient:     CPU time spent (ms)=3840
12/08/07 09:46:47 INFO mapred.JobClient:     Total committed heap
usage (bytes)=602996736


12/08/07 09:46:47 INFO mapred.JobClient:     Virtual memory (bytes)
snapshot=9814392832
12/08/07 09:46:47 INFO mapred.JobClient:     Map output records=0
12/08/07 09:46:47 INFO mapred.JobClient:     SPLIT_RAW_BYTES=132




From the web interface:

Mapper 000 (went to master), status: MASTER_ZOOKEEPER_ONLY - 2
finished out of 2 on superstep 9
Mapper 001 (went to slave), status: finishSuperstep: (all workers
done) WORKER_ONLY - Attempt=0, Superstep=10


Mapper 002 (went to master), status: finishSuperstep: (all workers
done) WORKER_ONLY - Attempt=0, Superstep=10


Here is the last 8 KB of the syslog for Mapper 2 (Mapper 3 was similar):

2012-08-07 09:46:34,644 WARN org.apache.giraph.graph.BspService:
process: Unknown and unprocessed event
(path=/_hadoopBsp/job_201208070927_0003/_applicationAttemptsDir/0/_superstepDir/6/_superstepFinished,
type=NodeDeleted, state=SyncConnected)
2012-08-07 09:46:34,646 INFO org.apache.giraph.graph.BspServiceWorker:
registerHealth: Created my health node for attempt=0, superstep=8 with
/_hadoopBsp/job_201208070927_0003/_applicationAttemptsDir/0/_superstepDir/8/_workerHealthyDir/slave_1
and workerInfo= Worker(hostname=slave <http://pointblank.corpdom.com>,
MRpartition=1, port=30001)
2012-08-07 09:46:34,650 INFO org.apache.giraph.graph.BspService:
process: partitionAssignmentsReadyChanged (partitions are assigned)
2012-08-07 09:46:34,651 INFO org.apache.giraph.graph.BspServiceWorker:
startSuperstep: Ready for computation on superstep 8 since worker
selection and vertex range assignments are done in
/_hadoopBsp/job_201208070927_0003/_applicationAttemptsDir/0/_superstepDir/8/_partitionAssignments
2012-08-07 09:46:34,651 INFO org.apache.giraph.graph.BspServiceWorker:
getAggregatorValues: no aggregators in
/_hadoopBsp/job_201208070927_0003/_applicationAttemptsDir/0/_superstepDir/7/_mergedAggregatorDir
on superstep 8
2012-08-07 09:46:34,652 INFO org.apache.giraph.graph.BspServiceWorker:
exchangeVertexPartitions: Nothing to exchange, exiting early
2012-08-07 09:46:34,674 INFO org.apache.giraph.graph.BspServiceWorker:
storeCheckpoint: Finished metadata
(_bsp/_checkpoints/job_201208070927_0003/8.slave_1.metadata) and
vertices (_bsp/_checkpoints/job_201208070927_0003/8.slave_1.vertices).
2012-08-07 09:46:34,679 INFO
org.apache.giraph.comm.BasicRPCCommunications: flush: starting for
superstep 8 totalMem = 191.6875M, maxMem = 191.6875M, freeMem =
164.97296M
2012-08-07 09:46:34,679 INFO
org.apache.giraph.comm.BasicRPCCommunications: flush: ended for
superstep 8 totalMem = 191.6875M, maxMem = 191.6875M, freeMem =
164.97285M
2012-08-07 09:46:34,679 INFO org.apache.giraph.graph.BspServiceWorker:
finishSuperstep: Superstep 8 totalMem = 191.6875M, maxMem = 191.6875M,
freeMem = 164.97285M
2012-08-07 09:46:34,689 INFO org.apache.giraph.graph.BspService:
process: superstepFinished signaled
2012-08-07 09:46:34,690 INFO org.apache.giraph.graph.BspServiceWorker:
finishSuperstep: Completed superstep 8 with global stats
(vtx=5000,finVtx=5000,edges=10005,msgCount=20)
2012-08-07 09:46:34,690 INFO
org.apache.giraph.comm.BasicRPCCommunications: prepareSuperstep:
Superstep 9 totalMem = 191.6875M, maxMem = 191.6875M, freeMem =
164.97285M
2012-08-07 09:46:34,696 INFO org.apache.giraph.graph.BspServiceWorker:
registerHealth: Created my health node for attempt=0, superstep=9 with
/_hadoopBsp/job_201208070927_0003/_applicationAttemptsDir/0/_superstepDir/9/_workerHealthyDir/slave_1
and workerInfo= Worker(hostname=slave <http://pointblank.corpdom.com>,
MRpartition=1, port=30001)
2012-08-07 09:46:34,701 WARN org.apache.giraph.graph.BspService:
process: Unknown and unprocessed event
(path=/_hadoopBsp/job_201208070927_0003/_applicationAttemptsDir/0/_superstepDir/7/_partitionAssignments,
type=NodeDeleted, state=SyncConnected)
2012-08-07 09:46:34,705 WARN org.apache.giraph.graph.BspService:
process: Unknown and unprocessed event
(path=/_hadoopBsp/job_201208070927_0003/_applicationAttemptsDir/0/_superstepDir/7/_superstepFinished,
type=NodeDeleted, state=SyncConnected)
2012-08-07 09:46:34,714 INFO org.apache.giraph.graph.BspService:
process: partitionAssignmentsReadyChanged (partitions are assigned)
2012-08-07 09:46:34,715 INFO org.apache.giraph.graph.BspServiceWorker:
startSuperstep: Ready for computation on superstep 9 since worker
selection and vertex range assignments are done in
/_hadoopBsp/job_201208070927_0003/_applicationAttemptsDir/0/_superstepDir/9/_partitionAssignments
2012-08-07 09:46:34,716 INFO org.apache.giraph.graph.BspServiceWorker:
getAggregatorValues: no aggregators in
/_hadoopBsp/job_201208070927_0003/_applicationAttemptsDir/0/_superstepDir/8/_mergedAggregatorDir
on superstep 9
2012-08-07 09:46:34,716 INFO org.apache.giraph.graph.BspServiceWorker:
exchangeVertexPartitions: Nothing to exchange, exiting early
2012-08-07 09:46:34,716 INFO
org.apache.giraph.comm.BasicRPCCommunications: flush: starting for
superstep 9 totalMem = 191.6875M, maxMem = 191.6875M, freeMem =
164.47278M
2012-08-07 09:46:34,716 INFO
org.apache.giraph.comm.BasicRPCCommunications: flush: ended for
superstep 9 totalMem = 191.6875M, maxMem = 191.6875M, freeMem =
164.47269M
2012-08-07 09:46:34,716 INFO org.apache.giraph.graph.BspServiceWorker:
finishSuperstep: Superstep 9 totalMem = 191.6875M, maxMem = 191.6875M,
freeMem = 164.47269M
2012-08-07 09:46:34,721 INFO org.apache.giraph.graph.BspService:
process: superstepFinished signaled
2012-08-07 09:46:34,722 INFO org.apache.giraph.graph.BspServiceWorker:
finishSuperstep: Completed superstep 9 with global stats
(vtx=5000,finVtx=5000,edges=10005,msgCount=0)
2012-08-07 09:46:34,722 INFO org.apache.giraph.graph.GraphMapper: map:
BSP application done (global vertices marked done)
2012-08-07 09:46:34,722 INFO org.apache.giraph.graph.GraphMapper:
cleanup: Starting for WORKER_ONLY
2012-08-07 09:46:34,722 WARN org.apache.giraph.graph.BspService:
process: Unknown and unprocessed event
(path=/_hadoopBsp/job_201208070927_0003/_applicationAttemptsDir/0/_superstepDir/8/_partitionAssignments,
type=NodeDeleted, state=SyncConnected)
2012-08-07 09:46:34,726 WARN org.apache.giraph.graph.BspService:
process: Unknown and unprocessed event
(path=/_hadoopBsp/job_201208070927_0003/_applicationAttemptsDir/0/_superstepDir/8/_superstepFinished,
type=NodeDeleted, state=SyncConnected)
2012-08-07 09:46:34,729 INFO org.apache.giraph.graph.BspServiceWorker:
cleanup: Notifying master its okay to cleanup with
/_hadoopBsp/job_201208070927_0003/_cleanedUpDir/1_worker
2012-08-07 09:46:34,731 INFO org.apache.zookeeper.ZooKeeper: Session:
0x138fe1c4699004a closed
2012-08-07 09:46:34,731 INFO
org.apache.giraph.comm.BasicRPCCommunications: close: shutting down
RPC server
2012-08-07 09:46:34,731 INFO org.apache.hadoop.ipc.Server: Stopping
server on 30011
2012-08-07 09:46:34,731 INFO org.apache.zookeeper.ClientCnxn:
EventThread shut down
2012-08-07 09:46:34,731 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 2 on 30011: exiting
2012-08-07 09:46:34,731 INFO
org.apache.hadoop.ipc.metrics.RpcInstrumentation: shut down
2012-08-07 09:46:34,731 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 0 on 30011: exiting
2012-08-07 09:46:34,731 INFO org.apache.hadoop.ipc.Server: Stopping
IPC Server Responder
2012-08-07 09:46:34,731 INFO org.apache.hadoop.ipc.Server: Stopping
IPC Server listener on 30011
2012-08-07 09:46:34,731 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 1 on 30011: exiting
2012-08-07 09:46:34,731 INFO org.apache.giraph.zk.ZooKeeperManager:
createZooKeeperClosedStamp: Creating my filestamp
_bsp/_defaultZkManagerDir/job_201208070927_0003/_task/1.COMPUTATION_DONE
2012-08-07 09:46:34,736 INFO org.apache.hadoop.mapred.Task:
Task:attempt_201208070927_0003_m_000001_0 is done. And is in the
process of commiting
2012-08-07 09:46:35,837 INFO org.apache.hadoop.mapred.Task: Task
attempt_201208070927_0003_m_000001_0 is allowed to commit now
2012-08-07 09:46:35,848 INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Saved
output of task 'attempt_201208070927_0003_m_000001_0' to
hdfs:/user/vpatel/giraph_out/two
2012-08-07 09:46:37,776 INFO org.apache.hadoop.mapred.Task: Task
'attempt_201208070927_0003_m_000001_0' done.
2012-08-07 09:46:37,779 INFO
org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs'
truncater with mapRetainSize=-1 and reduceRetainSize=-1
2012-08-07 09:46:37,812 INFO org.apache.hadoop.io.nativeio.NativeIO:
Initialized cache for UID to User mapping with a cache timeout of
14400 seconds.
2012-08-07 09:46:37,813 INFO org.apache.hadoop.io.nativeio.NativeIO:
Got UserName vpatel for UID 10020 from the native implementation


I have attached the network file to this email. It has 10,000 lines, one
for each of the 10,000 nodes, in adjacency list format (tab
separated).
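
For reference, a minimal sketch of how an edge list like the earlier testg.txt
(one "u v" pair per line) could be turned into this format. It assumes
whitespace-separated integer IDs and an undirected graph, so each edge is
written on both endpoints' lines; the function name is only illustrative and
is not part of Giraph:

```python
from collections import defaultdict


def edges_to_adjacency(edge_lines):
    """Turn 'u v' edge lines into tab-separated 'vertex nb1 nb2 ...' lines."""
    neighbors = defaultdict(set)
    for line in edge_lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        u, v = int(parts[0]), int(parts[1])
        neighbors[u].add(v)
        neighbors[v].add(u)  # undirected: record the edge in both directions
    return [
        "\t".join(str(x) for x in [vertex, *sorted(neighbors[vertex])])
        for vertex in sorted(neighbors)
    ]


# a -> b, a -> c, b -> d from the example in this thread, with a..d as 0..3
for row in edges_to_adjacency(["0 1", "0 2", "1 3"]):
    print(row)
```

A vertex that appears in no edge never shows up in an edge list, so it would
have to be added to the output separately.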

Here is jps from master:
5178 TaskTracker
4662 DataNode
4491 NameNode
34115 RunJar
4865 SecondaryNameNode
8385 Jps
29410 QuorumPeerMain
4991 JobTracker

jps from slave:
48621 TaskTracker
48464 DataNode
51391 Jps


Thank you again for your help,

Vishal



On Tue, Aug 7, 2012 at 12:07 AM, Sebastian Schelter <ssc@apache.org> wrote:

> Can you check what the mappers were doing via the web interface of
> Hadoop? Can you run 4 mappers at once?
>
>
>
> On 07.08.2012 01:46, Vishal Patel wrote:
> > I'm seeing a strange behavior that I can't explain.
> >
> >
> > hadoop jar giraph-0.1-jar-with-dependencies.jar
> > org.apache.giraph.GiraphRunner
> > org.apache.giraph.examples.ConnectedComponentsVertex --inputFormat
> > org.apache.giraph.examples.IntIntNullIntTextInputFormat --inputPath
> > /user/vpatel/graph_in/elist.txt --outputFormat
> > org.apache.giraph.examples.VertexWithComponentTextOutputFormat
> > --outputPath hdfs:///user/vpatel/giraph_out/1 --workers 4 --combiner
> > org.apache.giraph.examples.MinimumIntCombiner
> > Warning: $HADOOP_HOME is deprecated.
> >
> > 12/08/06 16:16:40 INFO mapred.JobClient: Running job: job_201208031459_0591
> > 12/08/06 16:16:41 INFO mapred.JobClient:  map 0% reduce 0%
> > 12/08/06 16:16:59 INFO mapred.JobClient:  map 20% reduce 0%
> > 12/08/06 16:17:05 INFO mapred.JobClient:  map 40% reduce 0%
> > 12/08/06 16:17:08 INFO mapred.JobClient:  map 100% reduce 0%
> > 12/08/06 16:17:11 INFO mapred.JobClient:  map 80% reduce 0%
> > 12/08/06 16:17:16 INFO mapred.JobClient: Task Id :
> > attempt_201208031459_0591_m_000000_0, Status : FAILED
> > *java.lang.Throwable: Child Error
> >         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
> > Caused by: java.io.IOException: Task process exit with nonzero status of 1.
> >         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)
> > *
> >
> > I either get the above error, which I can avoid by decreasing the number
> > of workers (based on a previous post on the mailing list),
> > or, when I specify fewer workers (say 2), I sometimes don't get the error
> > but the result is missing for one part in HDFS.
> > For example, with workers=2 I got two parts: one had 5,000 of the
> > 10k nodes and the other was blank. This happens with workers=4, 5,
> > etc. as well.
> >
> > There are no errors in the log.
> >
> > Just to be clear, the input format is an adjacency list. Since the graph
> > is undirected, if a -> b, a -> c and b -> d, then the file contains:
> > a b c
> > b a d
> > c a
> > d b
> >
> > Any idea what could be wrong?
> >
> > Here is the log when I do workers=1
> >
> > Finally loaded a total of *(v=10000, e=19996)*
> > 2012-08-06 16:39:13,902 INFO org.apache.giraph.graph.BspService:
> > process: inputSplitsAllDoneChanged (all vertices sent from input
> > splits)
> > 2012-08-06 16:39:13,904 INFO
> > org.apache.giraph.comm.BasicRPCCommunications: flush: starting for
> > superstep -1 totalMem = 191.6875M, maxMem = 191.6875M, freeMem =
> > 164.6044M
> > 2012-08-06 16:39:13,906 INFO
> > org.apache.giraph.comm.BasicRPCCommunications: flush: ended for
> > superstep -1 totalMem = 191.6875M, maxMem = 191.6875M, freeMem =
> > 164.60431M
> > 2012-08-06 16:39:13,906 INFO org.apache.giraph.graph.BspServiceWorker:
> > finishSuperstep: Superstep -1 totalMem = 191.6875M, maxMem =
> > 191.6875M, freeMem = 164.60431M
> > 2012-08-06 16:39:13,922 INFO org.apache.giraph.graph.BspService:
> > process: superstepFinished signaled
> > 2012-08-06 16:39:13,924 INFO org.apache.giraph.graph.BspServiceWorker:
> > finishSuperstep: Completed superstep -1 with global stats
> > (vtx=0,finVtx=0,edges=0,msgCount=0)
> > 2012-08-06 16:39:13,924 INFO org.apache.giraph.graph.GraphMapper:
> > cleanup: Starting for WORKER_ONLY
> > 2012-08-06 16:39:13,925 INFO org.apache.giraph.graph.BspServiceWorker:
> > processEvent: Job state changed, checking to see if it needs to
> > restart
> > 2012-08-06 16:39:13,926 INFO org.apache.giraph.graph.BspService:
> > getJobState: Job state already exists
> > (/_hadoopBsp/job_201208031459_0621/_masterJobState)
> > 2012-08-06 16:39:13,929 INFO org.apache.giraph.graph.BspServiceWorker:
> > cleanup: Notifying master its okay to cleanup with
> > /_hadoopBsp/job_201208031459_0621/_cleanedUpDir/1_worker
> > 2012-08-06 16:39:13,930 INFO org.apache.zookeeper.ZooKeeper: Session:
> > 0x138fe1c4699003d closed
> > 2012-08-06 16:39:13,930 INFO
> > org.apache.giraph.comm.BasicRPCCommunications: close: shutting down
> > RPC server
> > 2012-08-06 16:39:13,930 INFO org.apache.zookeeper.ClientCnxn:
> > EventThread shut down
> > 2012-08-06 16:39:13,930 INFO org.apache.hadoop.ipc.Server: Stopping
> > server on 30003
> > 2012-08-06 16:39:13,930 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 0 on 30003: exiting
> > 2012-08-06 16:39:13,930 INFO org.apache.hadoop.ipc.Server: Stopping
> > IPC Server listener on 30003
> > 2012-08-06 16:39:13,930 INFO
> > org.apache.hadoop.ipc.metrics.RpcInstrumentation: shut down
> > 2012-08-06 16:39:13,931 INFO org.apache.hadoop.ipc.Server: Stopping
> > IPC Server Responder
> > 2012-08-06 16:39:13,931 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 1 on 30003: exiting
> > 2012-08-06 16:39:13,931 INFO org.apache.giraph.zk.ZooKeeperManager:
> > createZooKeeperClosedStamp: Creating my filestamp
> > _bsp/_defaultZkManagerDir/job_201208031459_0621/_task/1.COMPUTATION_DONE
> > 2012-08-06 16:39:13,934 INFO org.apache.hadoop.mapred.Task:
> > Task:attempt_201208031459_0621_m_000001_0 is done. And is in the
> > process of commiting
> > 2012-08-06 16:39:15,026 INFO org.apache.hadoop.mapred.Task: Task
> > attempt_201208031459_0621_m_000001_0 is allowed to commit now
> > 2012-08-06 16:39:15,036 INFO
> > org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Saved
> > output of task 'attempt_201208031459_0621_m_000001_0' to
> > hdfs:/user/vpatel/giraph_out/one
> > 2012-08-06 16:39:16,068 INFO org.apache.hadoop.mapred.Task: Task
> > 'attempt_201208031459_0621_m_000001_0' done.
> > 2012-08-06 16:39:16,087 INFO
> > org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs'
> > truncater with mapRetainSize=-1 and reduceRetainSize=-1
> > 2012-08-06 16:39:16,117 INFO org.apache.hadoop.io.nativeio.NativeIO:
> > Initialized cache for UID to User mapping with a cache timeout of
> > 14400 seconds.
> > 2012-08-06 16:39:16,118 INFO org.apache.hadoop.io.nativeio.NativeIO:
> > Got UserName vpatel for UID 10020 from the native implementation
> >
> >
> >
> >
> > On Mon, Aug 6, 2012 at 3:05 PM, Sebastian Schelter <ssc@apache.org> wrote:
> >
> >> The job expects the input data in adjacency list format, each line
> >> should look like:
> >>
> >> vertex neighbor1 neighbor2 ....
> >>
> >> --sebastian
> >>
> >>
> >> On 07.08.2012 00:02, Vishal Patel wrote:
> >>> Thanks Sebastian, it runs fine now. However, the output comes back as
> >>>
> >>> 0       0
> >>> 1       1
> >>> 2       2
> >>> 3       3
> >>> 4       4
> >>> 5       5
> >>> 6       6
> >>> ..
> >>>
> >>> I have an unsorted edge file with just int values.
> >>> http://www.ics.uci.edu/~vishalrp/public/testg.txt
> >>>
> >>> My test graph (head below) has 10,000 nodes ( from 0 to 9999) and 9998
> >>> edges. There are 4 connected components in the graph.
> >>>
> >>> 0       5800
> >>> 0       5981
> >>> 1       1239
> >>> 1       2989
> >>> 1       3961
> >>> 2       5417
> >>> 2       7350
> >>>
> >>> What am I doing wrong? Also, in general does the graph have to have int
> >>> values for nodes? Or can I have strings?
> >>>
> >>> Appreciate your help!
> >>>
> >>> Vishal
> >>>
> >>>
> >>>
> >>>
> >>> On Mon, Aug 6, 2012 at 2:22 PM, Sebastian Schelter <ssc@apache.org> wrote:
> >>>
> >>>> You cannot run the vertex class directly. Instead you can use
> >>>> GiraphRunner, e.g.
> >>>>
> >>>> hadoop jar giraph-jar-with-dependencies.jar
> >>>> org.apache.giraph.GiraphRunner
> >>>> org.apache.giraph.examples.ConnectedComponentsVertex --inputFormat
> >>>> org.apache.giraph.examples.IntIntNullIntTextInputFormat --inputPath
> >>>> hdfs:///path/to/input --outputFormat
> >>>> org.apache.giraph.examples.VertexWithComponentTextOutputFormat
> >>>> --outputPath hdfs:///path/to/output --workers numWorkers --combiner
> >>>> org.apache.giraph.examples.MinimumIntCombiner
> >>>>
> >>>> --sebastian
> >>>>
> >>>>
> >>>> 2012/8/6 Vishal Patel <write2vishal@gmail.com>:
> >>>>> Hi, I am trying to run the connected-components example. I have Giraph
> >>>>> installed, and all the tests pass on a 3-node cluster running
> >>>>> hadoop-1.0.3.
> >>>>>
> >>>>> The main method is missing in the ConnectedComponentsVertex class:
> >>>>>
> >>>>> cd target/classes
> >>>>> hadoop jar ../giraph-0.1-jar-with-dependencies.jar
> >>>>> org.apache.giraph.examples.ConnectedComponentsVertex
> >>>>>
> >>>>> Exception in thread "main" java.lang.NoSuchMethodException:
> >>>>> org.apache.giraph.examples.ConnectedComponentsVertex.main([Ljava.lang.String;)
> >>>>>         at java.lang.Class.getMethod(Class.java:1622)
> >>>>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:150)
> >>>>>
> >>>>> Can someone please help me with running this example?
> >>>>>
> >>>>> Vishal
> >>>>>
> >>>>
> >>>
> >>
> >>
> >
>
>
