giraph-user mailing list archives

From: Young Han <young....@uwaterloo.ca>
Subject: Re: ConnectedComponents example
Date: Mon, 31 Mar 2014 20:15:15 GMT
Huh, it might be a bug in the code. Could it be that Pattern.compile has to
take "[\\t ]" (note the double backslash) to properly match tabs? If so,
that bug is in all the input formats...
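
A quick way to check: in Java source, "[\t ]" already contains a literal tab
character inside the character class, while "[\\t ]" contains the regex escape
\t for the same character, so both patterns should split a tab-separated line
the same way. A minimal standalone test (SeparatorCheck is just an illustrative
name, not anything from the Giraph code):

import java.util.Arrays;
import java.util.regex.Pattern;

// Compare the two separator patterns discussed above on sample
// adjacency-list lines like the ones in the input graph.
public class SeparatorCheck {
  public static void main(String[] args) {
    String tabLine = "2\t1\t3\t4";    // tab-separated line
    String spaceLine = "2 1 3 4";     // space-separated line

    Pattern literalTab = Pattern.compile("[\t ]");   // literal tab in the class
    Pattern escapedTab = Pattern.compile("[\\t ]");  // regex escape \t

    System.out.println(Arrays.toString(literalTab.split(tabLine)));
    System.out.println(Arrays.toString(escapedTab.split(tabLine)));
    System.out.println(Arrays.toString(literalTab.split(spaceLine)));
    // Expected: all three print [2, 1, 3, 4]
  }
}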

Happy to help :)

Young


On Mon, Mar 31, 2014 at 4:07 PM, ghufran malik <ghufran1malik@gmail.com> wrote:

> Hi,
>
> I removed the spaces and it worked! I don't understand it though. I'm sure
> the separator pattern means that it splits on tabs?
>
> Thanks for all your help though, somewhat relieved now!
>
> Kind regards,
>
> Ghufran
>
>
> On Mon, Mar 31, 2014 at 8:15 PM, Young Han <young.han@uwaterloo.ca> wrote:
>
>> Hi,
>>
>> That looks like an error with the algorithm... What do the Hadoop
>> userlogs say?
>>
>> And just to rule out weirdness, what happens if you use spaces instead of
>> tabs (for your input graph)?
>>
>> Young
>>
>>
>> On Mon, Mar 31, 2014 at 2:04 PM, ghufran malik <ghufran1malik@gmail.com> wrote:
>>
>>> Hey,
>>>
>>> No, even after I added the .txt it gets to map 100%, then drops back down
>>> to 50% and gives me this error:
>>>
>>> 14/03/31 18:22:56 INFO utils.ConfigurationUtils: No edge input format
>>> specified. Ensure your InputFormat does not require one.
>>> 14/03/31 18:22:56 WARN job.GiraphConfigurationValidator: Output format
>>> vertex index type is not known
>>> 14/03/31 18:22:56 WARN job.GiraphConfigurationValidator: Output format
>>> vertex value type is not known
>>> 14/03/31 18:22:56 WARN job.GiraphConfigurationValidator: Output format
>>> edge value type is not known
>>> 14/03/31 18:22:56 INFO job.GiraphJob: run: Since checkpointing is
>>> disabled (default), do not allow any task retries (setting
>>> mapred.map.max.attempts = 0, old value = 4)
>>> 14/03/31 18:22:57 INFO mapred.JobClient: Running job:
>>> job_201403311622_0004
>>> 14/03/31 18:22:58 INFO mapred.JobClient:  map 0% reduce 0%
>>> 14/03/31 18:23:16 INFO mapred.JobClient:  map 50% reduce 0%
>>> 14/03/31 18:23:19 INFO mapred.JobClient:  map 100% reduce 0%
>>> 14/03/31 18:33:25 INFO mapred.JobClient:  map 50% reduce 0%
>>> 14/03/31 18:33:30 INFO mapred.JobClient: Job complete:
>>> job_201403311622_0004
>>> 14/03/31 18:33:30 INFO mapred.JobClient: Counters: 6
>>> 14/03/31 18:33:30 INFO mapred.JobClient:   Job Counters
>>> 14/03/31 18:33:30 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=1238858
>>> 14/03/31 18:33:30 INFO mapred.JobClient:     Total time spent by all
>>> reduces waiting after reserving slots (ms)=0
>>> 14/03/31 18:33:30 INFO mapred.JobClient:     Total time spent by all
>>> maps waiting after reserving slots (ms)=0
>>> 14/03/31 18:33:30 INFO mapred.JobClient:     Launched map tasks=2
>>> 14/03/31 18:33:30 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
>>> 14/03/31 18:33:30 INFO mapred.JobClient:     Failed map tasks=1
>>>
>>>
>>> I did a check to make sure the graph was stored correctly by running:
>>>
>>> ghufran@ghufran:~/Downloads/hadoop-0.20.203.0/bin$ hadoop dfs -cat
>>> input/*
>>> 1 2
>>> 2 1 3 4
>>> 3 2
>>> 4 2
>>>
>>
>>
>
