hadoop-mapreduce-user mailing list archives

From Bertrand Dechoux <decho...@gmail.com>
Subject Re: changing split size in Hadoop configuration
Date Mon, 14 Jul 2014 18:54:37 GMT
For what it's worth, mapreduce.jobtracker.split.metainfo.maxsize is related
to the size of the file containing the information describing the input
splits. It is not directly related to the volume of data but to the number
of splits, which can explode when the input consists of too many (small)
files. It's basically a safeguard. Alternatively, you might want to reduce
the number of splits; raising the block size is one way to do it.
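
To make the relationship between input files, block size and split count
concrete, here is a rough Python sketch. It is not how Hadoop computes
splits internally; it assumes a simplified model in which every non-empty
file yields ceil(size / split size) splits and the split size equals the
block size. The function name and the sample file sizes are hypothetical.

```python
import math

MB = 1024 * 1024


def estimate_splits(file_sizes, split_size):
    """Rough split count under a simplified model.

    Assumption: each non-empty file is cut into ceil(size / split_size)
    splits, so every file contributes at least one split.
    """
    return sum(math.ceil(size / split_size) for size in file_sizes if size > 0)


# Hypothetical inputs: one 10 GB file versus 10,000 files of 1 MB each.
one_big_file = [10 * 1024 * MB]
many_small_files = [1 * MB] * 10_000

for block_mb in (64, 256):
    block = block_mb * MB
    print(
        f"block={block_mb}MB",
        f"big_file_splits={estimate_splits(one_big_file, block)}",
        f"small_files_splits={estimate_splits(many_small_files, block)}",
    )
```

Under this model a larger block size cuts the split count for large files,
while many small files still produce one split each, which is why the
number of files matters more than the total volume of data.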

Bertrand Dechoux


On Mon, Jul 14, 2014 at 7:50 PM, Adam Kawa <kawa.adam@gmail.com> wrote:

> It sounds like a JobTracker setting, so a restart looks to be required.
>
> You can verify it in pseudo-distributed mode by setting it to a very low
> value, restarting the JT and seeing if you get the exception that prints
> this new value.
>
> Sent from my iPhone
>
> On 14 jul 2014, at 16:03, Jan Warchoł <jan.warchol@codilime.com> wrote:
>
> Hello,
>
> I recently got "Split metadata size exceeded 10000000" error when running
> Cascading jobs with very big joins.  I found that I should change
> mapreduce.jobtracker.split.metainfo.maxsize property in hadoop
> configuration by adding this to the mapred-site.xml file:
>
>   <property>
>     <!-- allow more space for split metadata (default is 10000000) -->
>     <name>mapreduce.jobtracker.split.metainfo.maxsize</name>
>     <value>1000000000</value>
>   </property>
>
> but it didn't seem to have any effect - I'm probably doing something wrong.
>
> Where should I add this change so that it has the desired effect?  Do I
> understand correctly that jobtracker restart is required after making the
> change? The cluster I'm working on has Hadoop 1.0.4.
>
> thanks for any help,
> --
> *Jan Warchoł*
> *Software Engineer*
>
> -----------------------------------------
> M: +48 509 078 203
>  E: jan.warchol@codilime.com
> -----------------------------------------
> CodiLime Sp. z o.o. - Ltd. company with its registered office in Poland,
> 01-167 Warsaw, ul. Zawiszy 14/97. Registered by The District Court for the
> Capital City of Warsaw, XII Commercial Department of the National Court
> Register. Entered into National Court Register under No. KRS 0000388871.
> Tax identification number (NIP) 5272657478. Statistical number
> (REGON) 142974628.
>
>
