hama-dev mailing list archives

From "Edward J. Yoon" <edwardy...@apache.org>
Subject Re: Error with fastgen input
Date Thu, 28 Feb 2013 12:03:35 GMT
Eh... btw, does the re-partitioned data really need to be sorted?

On Thu, Feb 28, 2013 at 7:48 PM, Thomas Jungblut
<thomas.jungblut@gmail.com> wrote:
> Now I get how the partitioning works, obviously if you merge n sorted files
> by just appending to each other, this will result in totally unsorted data
> ;-)
> Why didn't you solve this via messaging?
>
> 2013/2/28 Thomas Jungblut <thomas.jungblut@gmail.com>
>
>> Seems that they are not correctly sorted:
>>
>> vertexID: 50
>> vertexID: 52
>> vertexID: 54
>> vertexID: 56
>> vertexID: 58
>> vertexID: 61
>> ...
>> vertexID: 78
>> vertexID: 81
>> vertexID: 83
>> vertexID: 85
>> ...
>> vertexID: 94
>> vertexID: 96
>> vertexID: 98
>> vertexID: 1
>> vertexID: 10
>> vertexID: 12
>> vertexID: 14
>> vertexID: 16
>> vertexID: 18
>> vertexID: 21
>> vertexID: 23
>> vertexID: 25
>> vertexID: 27
>> vertexID: 29
>> vertexID: 3
>>
>> So this won't work correctly then...
>>
>>
>> 2013/2/28 Thomas Jungblut <thomas.jungblut@gmail.com>
>>
>>> sure, have fun on your holidays.
>>>
>>>
>>> 2013/2/28 Edward J. Yoon <edwardyoon@apache.org>
>>>
>>>> Sure, but if you can fix it quickly, please do. March 1 is a holiday [1], so
>>>> I'll be back next week.
>>>>
>>>> 1. http://en.wikipedia.org/wiki/Public_holidays_in_South_Korea
>>>>
>>>> On Thu, Feb 28, 2013 at 6:36 PM, Thomas Jungblut
>>>> <thomas.jungblut@gmail.com> wrote:
>>>> > Maybe 50 is missing from the file, didn't observe if all items were added.
>>>> > As far as I remember, I copy/pasted the logic of the ID into the fastgen,
>>>> > want to have a look into it?
>>>> >
>>>> > 2013/2/28 Edward J. Yoon <edwardyoon@apache.org>
>>>> >
>>>> >> I guess it's a bug in fastgen when it generates the adjacency matrix into
>>>> >> multiple files.
>>>> >>
>>>> >> On Thu, Feb 28, 2013 at 6:29 PM, Thomas Jungblut
>>>> >> <thomas.jungblut@gmail.com> wrote:
>>>> >> > You have two files, are they partitioned correctly?
>>>> >> >
>>>> >> > 2013/2/28 Edward J. Yoon <edwardyoon@apache.org>
>>>> >> >
>>>> >> >> It looks like a bug.
>>>> >> >>
>>>> >> >> edward@udanax:~/workspace/hama-trunk$ ls -al /tmp/randomgraph/
>>>> >> >> total 44
>>>> >> >> drwxrwxr-x  3 edward edward  4096 Feb 28 18:03 .
>>>> >> >> drwxrwxrwt 19 root   root   20480 Feb 28 18:04 ..
>>>> >> >> -rwxrwxrwx  1 edward edward  2243 Feb 28 18:01 part-00000
>>>> >> >> -rw-rw-r--  1 edward edward    28 Feb 28 18:01 .part-00000.crc
>>>> >> >> -rwxrwxrwx  1 edward edward  2251 Feb 28 18:01 part-00001
>>>> >> >> -rw-rw-r--  1 edward edward    28 Feb 28 18:01 .part-00001.crc
>>>> >> >> drwxrwxr-x  2 edward edward  4096 Feb 28 18:03 partitions
>>>> >> >> edward@udanax:~/workspace/hama-trunk$ ls -al /tmp/randomgraph/partitions/
>>>> >> >> total 24
>>>> >> >> drwxrwxr-x 2 edward edward 4096 Feb 28 18:03 .
>>>> >> >> drwxrwxr-x 3 edward edward 4096 Feb 28 18:03 ..
>>>> >> >> -rwxrwxrwx 1 edward edward 2932 Feb 28 18:03 part-00000
>>>> >> >> -rw-rw-r-- 1 edward edward   32 Feb 28 18:03 .part-00000.crc
>>>> >> >> -rwxrwxrwx 1 edward edward 2955 Feb 28 18:03 part-00001
>>>> >> >> -rw-rw-r-- 1 edward edward   32 Feb 28 18:03 .part-00001.crc
>>>> >> >> edward@udanax:~/workspace/hama-trunk$
>>>> >> >>
>>>> >> >>
>>>> >> >> On Thu, Feb 28, 2013 at 5:27 PM, Edward <edward@udanax.org> wrote:
>>>> >> >> > yes i'll check again
>>>> >> >> >
>>>> >> >> > Sent from my iPhone
>>>> >> >> >
>>>> >> >> > On Feb 28, 2013, at 5:18 PM, Thomas Jungblut <thomas.jungblut@gmail.com> wrote:
>>>> >> >> >
>>>> >> >> >> Can you verify an observation for me please?
>>>> >> >> >>
>>>> >> >> >> 2 files are created from fastgen, part-00000 and part-00001, both ~2.2kb
>>>> >> >> >> sized.
>>>> >> >> >> In the below partition directory, there is only a single 5.56kb file.
>>>> >> >> >>
>>>> >> >> >> Is it intended for the partitioner to write a single file if you configured
>>>> >> >> >> two?
>>>> >> >> >> It even reads it as two files, strange huh?
>>>> >> >> >>
>>>> >> >> >> 2013/2/28 Thomas Jungblut <thomas.jungblut@gmail.com>
>>>> >> >> >>
>>>> >> >> >>> Will have a look into it.
>>>> >> >> >>>
>>>> >> >> >>> gen fastgen 100 10 /tmp/randomgraph 1
>>>> >> >> >>> pagerank /tmp/randomgraph /tmp/pageout
>>>> >> >> >>>
>>>> >> >> >>> did work for me the last time I profiled, maybe the partitioning doesn't
>>>> >> >> >>> partition correctly with the input or something else.
>>>> >> >> >>>
>>>> >> >> >>>
>>>> >> >> >>> 2013/2/28 Edward J. Yoon <edwardyoon@apache.org>
>>>> >> >> >>>
>>>> >> >> >>>> Fastgen input does not seem to work for graph examples.
>>>> >> >> >>>>
>>>> >> >> >>>> edward@edward-virtualBox:~/workspace/hama-trunk$ bin/hama jar examples/target/hama-examples-0.7.0-SNAPSHOT.jar gen fastgen 100 10 /tmp/randomgraph 2
>>>> >> >> >>>> 13/02/28 10:32:02 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>>>> >> >> >>>> 13/02/28 10:32:03 INFO bsp.BSPJobClient: Running job: job_localrunner_0001
>>>> >> >> >>>> 13/02/28 10:32:03 INFO bsp.LocalBSPRunner: Setting up a new barrier for 2 tasks!
>>>> >> >> >>>> 13/02/28 10:32:06 INFO bsp.BSPJobClient: Current supersteps number: 0
>>>> >> >> >>>> 13/02/28 10:32:06 INFO bsp.BSPJobClient: The total number of supersteps: 0
>>>> >> >> >>>> 13/02/28 10:32:06 INFO bsp.BSPJobClient: Counters: 3
>>>> >> >> >>>> 13/02/28 10:32:06 INFO bsp.BSPJobClient:   org.apache.hama.bsp.JobInProgress$JobCounter
>>>> >> >> >>>> 13/02/28 10:32:06 INFO bsp.BSPJobClient:     SUPERSTEPS=0
>>>> >> >> >>>> 13/02/28 10:32:06 INFO bsp.BSPJobClient:     LAUNCHED_TASKS=2
>>>> >> >> >>>> 13/02/28 10:32:06 INFO bsp.BSPJobClient:   org.apache.hama.bsp.BSPPeerImpl$PeerCounter
>>>> >> >> >>>> 13/02/28 10:32:06 INFO bsp.BSPJobClient:     TASK_OUTPUT_RECORDS=100
>>>> >> >> >>>> Job Finished in 3.212 seconds
>>>> >> >> >>>> edward@edward-virtualBox:~/workspace/hama-trunk$ bin/hama jar examples/target/hama-examples-0.7.0-SNAPSHOT
>>>> >> >> >>>> hama-examples-0.7.0-SNAPSHOT-javadoc.jar
>>>> >> >> >>>> hama-examples-0.7.0-SNAPSHOT.jar
>>>> >> >> >>>> edward@edward-virtualBox:~/workspace/hama-trunk$ bin/hama jar examples/target/hama-examples-0.7.0-SNAPSHOT.jar pagerank /tmp/randomgraph /tmp/pageour
>>>> >> >> >>>> 13/02/28 10:32:29 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>>>> >> >> >>>> 13/02/28 10:32:29 INFO bsp.FileInputFormat: Total input paths to process : 2
>>>> >> >> >>>> 13/02/28 10:32:29 INFO bsp.FileInputFormat: Total input paths to process : 2
>>>> >> >> >>>> 13/02/28 10:32:30 INFO bsp.BSPJobClient: Running job: job_localrunner_0001
>>>> >> >> >>>> 13/02/28 10:32:30 INFO bsp.LocalBSPRunner: Setting up a new barrier for 2 tasks!
>>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient: Current supersteps number: 1
>>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient: The total number of supersteps: 1
>>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient: Counters: 6
>>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient:   org.apache.hama.bsp.JobInProgress$JobCounter
>>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient:     SUPERSTEPS=1
>>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient:     LAUNCHED_TASKS=2
>>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient:   org.apache.hama.bsp.BSPPeerImpl$PeerCounter
>>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient:     SUPERSTEP_SUM=4
>>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient:     IO_BYTES_READ=4332
>>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient:     TIME_IN_SYNC_MS=14
>>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient:     TASK_INPUT_RECORDS=100
>>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.FileInputFormat: Total input paths to process : 2
>>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient: Running job: job_localrunner_0001
>>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.LocalBSPRunner: Setting up a new barrier for 2 tasks!
>>>> >> >> >>>> 13/02/28 10:32:33 INFO graph.GraphJobRunner: 50 vertices are loaded into local:1
>>>> >> >> >>>> 13/02/28 10:32:33 INFO graph.GraphJobRunner: 50 vertices are loaded into local:0
>>>> >> >> >>>> 13/02/28 10:32:33 ERROR bsp.LocalBSPRunner: Exception during BSP execution!
>>>> >> >> >>>> java.lang.IllegalArgumentException: Messages must never be behind the vertex in ID! Current Message ID: 1 vs. 50
>>>> >> >> >>>>        at org.apache.hama.graph.GraphJobRunner.iterate(GraphJobRunner.java:279)
>>>> >> >> >>>>        at org.apache.hama.graph.GraphJobRunner.doSuperstep(GraphJobRunner.java:225)
>>>> >> >> >>>>        at org.apache.hama.graph.GraphJobRunner.bsp(GraphJobRunner.java:129)
>>>> >> >> >>>>        at org.apache.hama.bsp.LocalBSPRunner$BSPRunner.run(LocalBSPRunner.java:256)
>>>> >> >> >>>>        at org.apache.hama.bsp.LocalBSPRunner$BSPRunner.call(LocalBSPRunner.java:286)
>>>> >> >> >>>>        at org.apache.hama.bsp.LocalBSPRunner$BSPRunner.call(LocalBSPRunner.java:211)
>>>> >> >> >>>>        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>>> >> >> >>>>        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>>> >> >> >>>>        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>>> >> >> >>>>        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>>> >> >> >>>>        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>>> >> >> >>>>        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>>>> >> >> >>>>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>>>> >> >> >>>>        at java.lang.Thread.run(Thread.java:722)
>>>> >> >> >>>>
>>>> >> >> >>>>
>>>> >> >> >>>> --
>>>> >> >> >>>> Best Regards, Edward J. Yoon
>>>> >> >> >>>> @eddieyoon
>>>> >> >> >>>
>>>> >> >> >>>
>>>> >> >>
>>>> >> >>
>>>> >> >>
>>>> >> >> --
>>>> >> >> Best Regards, Edward J. Yoon
>>>> >> >> @eddieyoon
>>>> >> >>
>>>> >>
>>>> >>
>>>> >>
>>>> >> --
>>>> >> Best Regards, Edward J. Yoon
>>>> >> @eddieyoon
>>>> >>
>>>>
>>>>
>>>>
>>>> --
>>>> Best Regards, Edward J. Yoon
>>>> @eddieyoon
>>>>
>>>
>>>
>>



-- 
Best Regards, Edward J. Yoon
@eddieyoon
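
Editorial sketch (not part of the original message): the point made in the thread, that appending n sorted files to each other yields unsorted data, can be illustrated with a small k-way merge. This is only a minimal sketch under stated assumptions and is not Hama's partitioner code. The class and method names (SortedMergeSketch, appendAll, mergeSorted, isSorted) are hypothetical, and the IDs are assumed to compare as strings, which is what the vertexID listing in the thread suggests (98 is followed by 1, and 29 by 3).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration only; none of these names come from Hama.
class SortedMergeSketch {

  // Current head element of one sorted run, plus the iterator over the rest of it.
  private static final class Head implements Comparable<Head> {
    final String value;
    final Iterator<String> rest;

    Head(String value, Iterator<String> rest) {
      this.value = value;
      this.rest = rest;
    }

    @Override
    public int compareTo(Head other) {
      // Assumption: IDs are ordered lexicographically, as strings.
      return value.compareTo(other.value);
    }
  }

  // Appends the runs one after another; the result is sorted only by accident.
  static List<String> appendAll(List<List<String>> runs) {
    List<String> out = new ArrayList<String>();
    for (List<String> run : runs) {
      out.addAll(run);
    }
    return out;
  }

  // K-way merge: repeatedly emit the smallest head among all runs.
  static List<String> mergeSorted(List<List<String>> runs) {
    PriorityQueue<Head> heap = new PriorityQueue<Head>();
    for (List<String> run : runs) {
      Iterator<String> it = run.iterator();
      if (it.hasNext()) {
        heap.add(new Head(it.next(), it));
      }
    }
    List<String> out = new ArrayList<String>();
    while (!heap.isEmpty()) {
      Head smallest = heap.poll();
      out.add(smallest.value);
      if (smallest.rest.hasNext()) {
        heap.add(new Head(smallest.rest.next(), smallest.rest));
      }
    }
    return out;
  }

  // True if every element is >= its predecessor in string order.
  static boolean isSorted(List<String> ids) {
    for (int i = 1; i < ids.size(); i++) {
      if (ids.get(i - 1).compareTo(ids.get(i)) > 0) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // Two runs, each sorted in string order, shaped like the listing in the thread.
    List<List<String>> runs = Arrays.asList(
        Arrays.asList("50", "52", "54"),
        Arrays.asList("1", "10", "12", "3"));
    System.out.println("appended sorted? " + isSorted(appendAll(runs)));
    System.out.println("merged sorted?   " + isSorted(mergeSorted(runs)));
  }
}
```

With the two sample runs, the appended list starts with 50 and later contains 1, so it fails the order check, while the merged list passes it. Whether the partitioner should merge like this or the graph runner should drop the sortedness requirement is exactly the question raised at the top of this message.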
