hama-dev mailing list archives

From Thomas Jungblut <thomas.jungb...@gmail.com>
Subject Re: Error with fastgen input
Date Thu, 28 Feb 2013 10:48:12 GMT
Now I get how the partitioning works. Obviously, if you merge n sorted files
by just appending them to each other, the result is totally unsorted data
;-)
Why didn't you solve this via messaging?
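
To illustrate the point about appending versus merging, here is a minimal
sketch in plain Java. It is not the Hama partitioner code; the class name
SortedRunMerge, its helper methods, and the sample IDs are hypothetical. It
assumes vertex IDs are compared as strings (lexicographic order, so "10"
sorts before "3"), which is what the listing quoted below suggests. Appending
two individually sorted runs gives an unsorted sequence, while a k-way merge
over a priority queue keeps the order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

public class SortedRunMerge {

  /** Position inside one sorted run; ordered by the ID it currently points at. */
  private static final class Cursor implements Comparable<Cursor> {
    final List<String> run;
    int index;

    Cursor(List<String> run) {
      this.run = run;
    }

    String current() {
      return run.get(index);
    }

    @Override
    public int compareTo(Cursor other) {
      // Plain String comparison, i.e. lexicographic order ("10" < "3").
      return current().compareTo(other.current());
    }
  }

  /** Naive approach: append the runs one after another. */
  static List<String> append(List<List<String>> runs) {
    List<String> out = new ArrayList<>();
    for (List<String> run : runs) {
      out.addAll(run);
    }
    return out;
  }

  /** k-way merge: always emit the smallest head among all runs. */
  static List<String> merge(List<List<String>> runs) {
    PriorityQueue<Cursor> heads = new PriorityQueue<>();
    for (List<String> run : runs) {
      if (!run.isEmpty()) {
        heads.add(new Cursor(run));
      }
    }
    List<String> out = new ArrayList<>();
    while (!heads.isEmpty()) {
      Cursor smallest = heads.poll();
      out.add(smallest.current());
      smallest.index++;
      if (smallest.index < smallest.run.size()) {
        heads.add(smallest); // re-insert with its new head
      }
    }
    return out;
  }

  /** True if every element is >= its predecessor in String order. */
  static boolean isSorted(List<String> ids) {
    for (int i = 1; i < ids.size(); i++) {
      if (ids.get(i - 1).compareTo(ids.get(i)) > 0) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // Two hypothetical runs, each sorted on its own.
    List<List<String>> runs = Arrays.asList(
        Arrays.asList("50", "52", "54", "98"),
        Arrays.asList("1", "10", "12", "3"));

    System.out.println(append(runs) + " sorted=" + isSorted(append(runs)));
    System.out.println(merge(runs) + " sorted=" + isSorted(merge(runs)));
  }
}
```

With these sample runs, the appended list starts at 50 and falls back to 1
after 98, so isSorted reports false; the merged list is
[1, 10, 12, 3, 50, 52, 54, 98] and isSorted reports true.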

2013/2/28 Thomas Jungblut <thomas.jungblut@gmail.com>

> Seems that they are not correctly sorted:
>
> vertexID: 50
> vertexID: 52
> vertexID: 54
> vertexID: 56
> vertexID: 58
> vertexID: 61
> ...
> vertexID: 78
> vertexID: 81
> vertexID: 83
> vertexID: 85
> ...
> vertexID: 94
> vertexID: 96
> vertexID: 98
> vertexID: 1
> vertexID: 10
> vertexID: 12
> vertexID: 14
> vertexID: 16
> vertexID: 18
> vertexID: 21
> vertexID: 23
> vertexID: 25
> vertexID: 27
> vertexID: 29
> vertexID: 3
>
> So this won't work correctly then...
>
>
> 2013/2/28 Thomas Jungblut <thomas.jungblut@gmail.com>
>
>> sure, have fun on your holidays.
>>
>>
>> 2013/2/28 Edward J. Yoon <edwardyoon@apache.org>
>>
>>> Sure, but if you can fix it quickly, please do. March 1 is a holiday[1], so
>>> I'll be back next week.
>>>
>>> 1. http://en.wikipedia.org/wiki/Public_holidays_in_South_Korea
>>>
>>> On Thu, Feb 28, 2013 at 6:36 PM, Thomas Jungblut
>>> <thomas.jungblut@gmail.com> wrote:
>>> > Maybe 50 is missing from the file, didn't observe if all items were added.
>>> > As far as I remember, I copy/pasted the logic of the ID into the fastgen,
>>> > want to have a look into it?
>>> >
>>> > 2013/2/28 Edward J. Yoon <edwardyoon@apache.org>
>>> >
>>> >> I guess it's a bug in fastgen when it generates the adjacency matrix into
>>> >> multiple files.
>>> >>
>>> >> On Thu, Feb 28, 2013 at 6:29 PM, Thomas Jungblut
>>> >> <thomas.jungblut@gmail.com> wrote:
>>> >> > You have two files, are they partitioned correctly?
>>> >> >
>>> >> > 2013/2/28 Edward J. Yoon <edwardyoon@apache.org>
>>> >> >
>>> >> >> It looks like a bug.
>>> >> >>
>>> >> >> edward@udanax:~/workspace/hama-trunk$ ls -al /tmp/randomgraph/
>>> >> >> total 44
>>> >> >> drwxrwxr-x  3 edward edward  4096 Feb 28 18:03 .
>>> >> >> drwxrwxrwt 19 root   root   20480 Feb 28 18:04 ..
>>> >> >> -rwxrwxrwx  1 edward edward  2243 Feb 28 18:01 part-00000
>>> >> >> -rw-rw-r--  1 edward edward    28 Feb 28 18:01 .part-00000.crc
>>> >> >> -rwxrwxrwx  1 edward edward  2251 Feb 28 18:01 part-00001
>>> >> >> -rw-rw-r--  1 edward edward    28 Feb 28 18:01 .part-00001.crc
>>> >> >> drwxrwxr-x  2 edward edward  4096 Feb 28 18:03 partitions
>>> >> >> edward@udanax:~/workspace/hama-trunk$ ls -al /tmp/randomgraph/partitions/
>>> >> >> total 24
>>> >> >> drwxrwxr-x 2 edward edward 4096 Feb 28 18:03 .
>>> >> >> drwxrwxr-x 3 edward edward 4096 Feb 28 18:03 ..
>>> >> >> -rwxrwxrwx 1 edward edward 2932 Feb 28 18:03 part-00000
>>> >> >> -rw-rw-r-- 1 edward edward   32 Feb 28 18:03 .part-00000.crc
>>> >> >> -rwxrwxrwx 1 edward edward 2955 Feb 28 18:03 part-00001
>>> >> >> -rw-rw-r-- 1 edward edward   32 Feb 28 18:03 .part-00001.crc
>>> >> >> edward@udanax:~/workspace/hama-trunk$
>>> >> >>
>>> >> >>
>>> >> >> On Thu, Feb 28, 2013 at 5:27 PM, Edward <edward@udanax.org> wrote:
>>> >> >> > yes i'll check again
>>> >> >> >
>>> >> >> > Sent from my iPhone
>>> >> >> >
>>> >> >> > On Feb 28, 2013, at 5:18 PM, Thomas Jungblut <thomas.jungblut@gmail.com> wrote:
>>> >> >> >
>>> >> >> >> Can you verify an observation for me please?
>>> >> >> >>
>>> >> >> >> 2 files are created from fastgen, part-00000 and part-00001, both ~2.2kb
>>> >> >> >> sized.
>>> >> >> >> In the below partition directory, there is only a single 5.56kb file.
>>> >> >> >>
>>> >> >> >> Is it intended for the partitioner to write a single file if you configured
>>> >> >> >> two?
>>> >> >> >> It even reads it as two files, strange huh?
>>> >> >> >>
>>> >> >> >> 2013/2/28 Thomas Jungblut <thomas.jungblut@gmail.com>
>>> >> >> >>
>>> >> >> >>> Will have a look into it.
>>> >> >> >>>
>>> >> >> >>> gen fastgen 100 10 /tmp/randomgraph 1
>>> >> >> >>> pagerank /tmp/randomgraph /tmp/pageout
>>> >> >> >>>
>>> >> >> >>> did work for me the last time I profiled, maybe the partitioning doesn't
>>> >> >> >>> partition correctly with the input or something else.
>>> >> >> >>>
>>> >> >> >>>
>>> >> >> >>> 2013/2/28 Edward J. Yoon <edwardyoon@apache.org>
>>> >> >> >>>
>>> >> >> >>>> Fastgen input does not seem to work for graph examples.
>>> >> >> >>>>
>>> >> >> >>>> edward@edward-virtualBox:~/workspace/hama-trunk$ bin/hama jar examples/target/hama-examples-0.7.0-SNAPSHOT.jar gen fastgen 100 10 /tmp/randomgraph 2
>>> >> >> >>>> 13/02/28 10:32:02 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>>> >> >> >>>> 13/02/28 10:32:03 INFO bsp.BSPJobClient: Running job: job_localrunner_0001
>>> >> >> >>>> 13/02/28 10:32:03 INFO bsp.LocalBSPRunner: Setting up a new barrier for 2 tasks!
>>> >> >> >>>> 13/02/28 10:32:06 INFO bsp.BSPJobClient: Current supersteps number: 0
>>> >> >> >>>> 13/02/28 10:32:06 INFO bsp.BSPJobClient: The total number of supersteps: 0
>>> >> >> >>>> 13/02/28 10:32:06 INFO bsp.BSPJobClient: Counters: 3
>>> >> >> >>>> 13/02/28 10:32:06 INFO bsp.BSPJobClient:   org.apache.hama.bsp.JobInProgress$JobCounter
>>> >> >> >>>> 13/02/28 10:32:06 INFO bsp.BSPJobClient:     SUPERSTEPS=0
>>> >> >> >>>> 13/02/28 10:32:06 INFO bsp.BSPJobClient:     LAUNCHED_TASKS=2
>>> >> >> >>>> 13/02/28 10:32:06 INFO bsp.BSPJobClient:   org.apache.hama.bsp.BSPPeerImpl$PeerCounter
>>> >> >> >>>> 13/02/28 10:32:06 INFO bsp.BSPJobClient:     TASK_OUTPUT_RECORDS=100
>>> >> >> >>>> Job Finished in 3.212 seconds
>>> >> >> >>>> edward@edward-virtualBox:~/workspace/hama-trunk$ bin/hama jar examples/target/hama-examples-0.7.0-SNAPSHOT
>>> >> >> >>>> hama-examples-0.7.0-SNAPSHOT-javadoc.jar
>>> >> >> >>>> hama-examples-0.7.0-SNAPSHOT.jar
>>> >> >> >>>> edward@edward-virtualBox:~/workspace/hama-trunk$ bin/hama jar examples/target/hama-examples-0.7.0-SNAPSHOT.jar pagerank /tmp/randomgraph /tmp/pageour
>>> >> >> >>>> 13/02/28 10:32:29 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>>> >> >> >>>> 13/02/28 10:32:29 INFO bsp.FileInputFormat: Total input paths to process : 2
>>> >> >> >>>> 13/02/28 10:32:29 INFO bsp.FileInputFormat: Total input paths to process : 2
>>> >> >> >>>> 13/02/28 10:32:30 INFO bsp.BSPJobClient: Running job: job_localrunner_0001
>>> >> >> >>>> 13/02/28 10:32:30 INFO bsp.LocalBSPRunner: Setting up a new barrier for 2 tasks!
>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient: Current supersteps number: 1
>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient: The total number of supersteps: 1
>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient: Counters: 6
>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient:   org.apache.hama.bsp.JobInProgress$JobCounter
>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient:     SUPERSTEPS=1
>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient:     LAUNCHED_TASKS=2
>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient:   org.apache.hama.bsp.BSPPeerImpl$PeerCounter
>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient:     SUPERSTEP_SUM=4
>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient:     IO_BYTES_READ=4332
>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient:     TIME_IN_SYNC_MS=14
>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient:     TASK_INPUT_RECORDS=100
>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.FileInputFormat: Total input paths to process : 2
>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient: Running job: job_localrunner_0001
>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.LocalBSPRunner: Setting up a new barrier for 2 tasks!
>>> >> >> >>>> 13/02/28 10:32:33 INFO graph.GraphJobRunner: 50 vertices are loaded into local:1
>>> >> >> >>>> 13/02/28 10:32:33 INFO graph.GraphJobRunner: 50 vertices are loaded into local:0
>>> >> >> >>>> 13/02/28 10:32:33 ERROR bsp.LocalBSPRunner: Exception during BSP execution!
>>> >> >> >>>> java.lang.IllegalArgumentException: Messages must never be behind the vertex in ID! Current Message ID: 1 vs. 50
>>> >> >> >>>>        at org.apache.hama.graph.GraphJobRunner.iterate(GraphJobRunner.java:279)
>>> >> >> >>>>        at org.apache.hama.graph.GraphJobRunner.doSuperstep(GraphJobRunner.java:225)
>>> >> >> >>>>        at org.apache.hama.graph.GraphJobRunner.bsp(GraphJobRunner.java:129)
>>> >> >> >>>>        at org.apache.hama.bsp.LocalBSPRunner$BSPRunner.run(LocalBSPRunner.java:256)
>>> >> >> >>>>        at org.apache.hama.bsp.LocalBSPRunner$BSPRunner.call(LocalBSPRunner.java:286)
>>> >> >> >>>>        at org.apache.hama.bsp.LocalBSPRunner$BSPRunner.call(LocalBSPRunner.java:211)
>>> >> >> >>>>        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>> >> >> >>>>        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>> >> >> >>>>        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>> >> >> >>>>        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>> >> >> >>>>        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>> >> >> >>>>        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>>> >> >> >>>>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>>> >> >> >>>>        at java.lang.Thread.run(Thread.java:722)
>>> >> >> >>>>
>>> >> >> >>>>
>>> >> >> >>>> --
>>> >> >> >>>> Best Regards, Edward J. Yoon
>>> >> >> >>>> @eddieyoon
>>> >> >> >>>
>>> >> >> >>>
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >> --
>>> >> >> Best Regards, Edward J. Yoon
>>> >> >> @eddieyoon
>>> >> >>
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> Best Regards, Edward J. Yoon
>>> >> @eddieyoon
>>> >>
>>>
>>>
>>>
>>> --
>>> Best Regards, Edward J. Yoon
>>> @eddieyoon
>>>
>>
>>
>
