hama-user mailing list archives

From "Edward J. Yoon" <edwardy...@apache.org>
Subject Re: Error partitioning input path
Date Sat, 19 Oct 2013 00:22:19 GMT
> Shouldn't the partitioner be able to handle bigger numbers of files than
> the maximum number of tasks the cluster may work with?
> To me it seems like an issue, or at least something which should be noted
> so users can take it into account.

Thanks for your suggestion. We'll.

Alternatively, you can set the property below to false and use the MR
job for input data partitioning.

  <property>
    <name>bsp.input.runtime.partitioning</name>
    <value>true</value>
    <description>Basically, we provide a data partitioning program based on a BSP job,
    which you can use without any extra program. Set this property to false if you
    want to use a custom partition program.
    </description>
  </property>
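
(As an illustration only: an override along the following lines would switch
the runtime partitioner off. Placing it in conf/hama-site.xml is an
assumption; the file is not named in this thread.)

  <property>
    <name>bsp.input.runtime.partitioning</name>
    <!-- false: skip the built-in BSP-based partitioning job -->
    <value>false</value>
  </property>

With this setting the input has to be partitioned beforehand by a separate
program, such as the MR job mentioned above.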

On Fri, Oct 18, 2013 at 10:27 PM, Steven van Beelen
<smcvbeelen@gmail.com> wrote:
> Hi,
>
> As a matter of fact I did.
> The problem was due to the maximum number of tasks my small system could
> handle.
> The number of files was larger than the maximum number of tasks.
> Therefore the Partitioner could not partition the files together.
> Shouldn't the partitioner be able to handle bigger numbers of files than
> the maximum number of tasks the cluster may work with?
> To me it seems like an issue, or at least something which should be noted
> so users can take it into account.
>
> Regards,
>
> Steven
>
>
> On Fri, Oct 18, 2013 at 1:47 PM, Edward J. Yoon <edwardyoon@apache.org> wrote:
>
>> Hi,
>>
>> http://wiki.apache.org/hama/FAQ
>>
>> Have you tried it with small inputs?
>>
>> On Wed, Oct 16, 2013 at 9:22 PM, Steven van Beelen <smcvbeelen@gmail.com>
>> wrote:
>> > Hello,
>> >
>> > I get roughly the same error that Yingyi Bu reported on the 27th of September,
>> > but with a difference.
>> > The error is as follows:
>> >
>> > 13/10/16 14:02:47 INFO mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
>> > 13/10/16 14:02:48 INFO bsp.FileInputFormat: Total input paths to process : 325
>> > 13/10/16 14:02:48 INFO bsp.FileInputFormat: Total input paths to process : 325
>> > 13/10/16 14:02:48 INFO bsp.BSPJobClient: Running job: job_201309201620_0037
>> > 13/10/16 14:02:48 INFO bsp.BSPJobClient: Job failed.
>> > 13/10/16 14:02:48 ERROR bsp.BSPJobClient: Error partitioning the input path.
>> > Exception in thread "main" java.io.IOException: Runtime partition failed for the job.
>> >     at org.apache.hama.bsp.BSPJobClient.partition(BSPJobClient.java:465)
>> >     at org.apache.hama.bsp.BSPJobClient.submitJobInternal(BSPJobClient.java:333)
>> >     at org.apache.hama.bsp.BSPJobClient.submitJob(BSPJobClient.java:293)
>> >     at org.apache.hama.bsp.BSPJob.submit(BSPJob.java:229)
>> >     at org.apache.hama.bsp.BSPJob.waitForCompletion(BSPJob.java:236)
>> >     at InvertedIndex.run(InvertedIndex.java:227)
>> >     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>> >     at InvertedIndex.main(InvertedIndex.java:234)
>> >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>> >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> >     at java.lang.reflect.Method.invoke(Method.java:597)
>> >     at org.apache.hama.util.RunJar.main(RunJar.java:146)
>> >
>> > As is evident, my job does not perform any supersteps at all, but fails
>> > immediately.
>> > On top of that, I cannot find any logs in the tasklog directories of my
>> > machines for the job job_201309201620_0037 from the above exception. The
>> > same holds for any subsequent jobs and for each of my three self-written
>> > Hama programs.
>> >
>> > Last, the BSPMaster log looks as follows:
>> >
>> > 13/10/16 14:02:48 INFO bsp.JobInProgress: num BSPTasks: 325
>> > 13/10/16 14:02:48 INFO bsp.JobInProgress: Job is initialized.
>> > 13/10/16 14:02:48 ERROR bsp.SimpleTaskScheduler: Could not schedule alltasks!
>> > 13/10/16 14:02:48 ERROR bsp.SimpleTaskScheduler: Scheduling of job Inverted Indexing could not be done successfully. Killing it!
>> >
>> > Currently I'm running in distributed mode with 7 nodes, and the Pi
>> > example works.
>> > Could anyone tell me what I'm doing wrong?
>> >
>> > Regards,
>> > Steven van Beelen
>> > Vrije Universiteit van Amsterdam / SURFsara
>>
>>
>>
>> --
>> Best Regards, Edward J. Yoon
>> @eddieyoon
>>



-- 
Best Regards, Edward J. Yoon
@eddieyoon
