hadoop-common-user mailing list archives

From "Devaraj Das" <d...@yahoo-inc.com>
Subject RE: Stackoverflow
Date Wed, 04 Jun 2008 13:30:14 GMT
Hi Andreas,

Here is what I did:

bin/hadoop jar build/hadoop-0.18.0-dev-examples.jar randomtextwriter
-Dtest.randomtextwrite.maps_per_host=1 textinput
(this would generate 1GB of text data with pretty long sentences. Refer

bin/hadoop jar build/hadoop-0.18.0-dev-examples.jar sort
-Dmapred.min.split.size=536870912 -Dio.sort.mb=256 -inFormat
org.apache.hadoop.mapred.KeyValueTextInputFormat -outFormat
org.apache.hadoop.mapred.lib.NullOutputFormat -outKey
org.apache.hadoop.io.Text -outValue org.apache.hadoop.io.Text   textinput

(This is similar to what you ran. Notice that I set fairly high values for
mapred.min.split.size and io.sort.mb to ensure that each invocation of
qsort processes a good amount of data.)
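
For reference, here is a rough sketch of the arithmetic behind the split size used above. It assumes the input is about 1 GiB in total, as in the randomtextwriter run; the variable names are only illustrative, and the real number of splits also depends on how the data is laid out in files and on the HDFS block size.

```shell
# 512 MiB in bytes; this is the value passed as -Dmapred.min.split.size
min_split_bytes=$((512 * 1024 * 1024))
echo "$min_split_bytes"

# Assumed total input size: 1 GiB
input_bytes=$((1024 * 1024 * 1024))

# Rough split count, using integer ceiling division
splits=$(( (input_bytes + min_split_bytes - 1) / min_split_bytes ))
echo "$splits"
```

So under these assumptions each map task would see roughly 512 MiB of input, and io.sort.mb=256 gives each one a 256 MB sort buffer to fill.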

This ran perfectly well.

I even tried reducing the length of the sentences by specifying 1 for all
four settings (min_words_key/value and max_words_key/value) during data
creation. That seemed to work fine too.
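
A minimal sketch of how such a short-sentence run could be assembled is below. It is a dry run that only prints the command. The four full property names are an assumption: they follow the test.randomtextwrite.* prefix used for maps_per_host above, so please check them against the RandomTextWriter source for your build before relying on them. WORDS is just an illustrative variable.

```shell
# Dry run: build the randomtextwriter command and print it, do not execute it.
# ASSUMPTION: the four word-count properties share the test.randomtextwrite.
# prefix seen in maps_per_host; verify against your RandomTextWriter source.
WORDS=1
cmd="bin/hadoop jar build/hadoop-0.18.0-dev-examples.jar randomtextwriter"
cmd="$cmd -Dtest.randomtextwrite.maps_per_host=1"
for prop in min_words_key max_words_key min_words_value max_words_value; do
  cmd="$cmd -Dtest.randomtextwrite.$prop=$WORDS"
done
cmd="$cmd textinput"
echo "$cmd"
```

Changing WORDS, or setting the four properties individually, should make it easier to try several sentence lengths when looking for a configuration that triggers the error.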

So could you please do this:

1) Generate data using RandomTextWriter with characteristics similar to
those of your input data set where qsort fails.
2) Try to reproduce the issue (you may have to do a couple of runs of (1)).
Let us know the configuration of RandomTextWriter with which you see
StackOverflow errors in qsort.

I hope I am not asking for too much. Please let us know if you need any
help in this regard.

Thanks a lot!


> -----Original Message-----
> From: Andreas Kostyrka [mailto:andreas@kostyrka.org] 
> Sent: Wednesday, June 04, 2008 4:56 AM
> To: core-user@hadoop.apache.org
> Subject: Re: Stackoverflow
> Ok, I've tried it out, the example sort bombs exactly like 
> streaming =>
> http://heaven.kostyrka.org/test.log
> Any recommendations?
> Thanks,
> Andreas
