hadoop-mapreduce-user mailing list archives

From Jan Warchoł <jan.warc...@codilime.com>
Subject Re: changing split size in Hadoop configuration
Date Tue, 15 Jul 2014 08:19:52 GMT

On Mon, Jul 14, 2014 at 7:50 PM, Adam Kawa <kawa.adam@gmail.com> wrote:

> It sounds like a JobTracker setting, so a restart looks to be required.


> You can verify it in pseudo-distributed mode by setting it to a very low
> value, restarting the JT and seeing if you get the exception that prints
> this new value.

Well, the funny thing is that it did work when I made the change in a
pseudo-distributed "cluster" on my laptop, but it had no effect when I
tried it on the real cluster.  I probably changed the wrong configuration
file.  How do I check where the configuration actually used for
(re)starting the JobTracker comes from?
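One way to answer this is to ask the running daemon itself: Hadoop daemons expose their effective configuration as XML via a /conf servlet on their web UI port (50030 is the MRv1 JobTracker default; your host and port may differ). A minimal sketch of parsing that output, with a sample of the servlet's XML inlined here in place of a live fetch:

```python
import xml.etree.ElementTree as ET

# Sample of the XML returned by the /conf servlet (inlined for illustration;
# a real check would fetch http://<jobtracker-host>:50030/conf).
conf_xml = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapreduce.jobtracker.split.metainfo.maxsize</name>
    <value>10000000</value>
  </property>
</configuration>"""

def lookup(xml_text, key):
    """Return the value of the named property, or None if it is absent."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name") == key:
            return prop.findtext("value")
    return None

print(lookup(conf_xml, "mapreduce.jobtracker.split.metainfo.maxsize"))  # 10000000
```

If the value shown there differs from the one in your mapred-site.xml, the daemon was started with a different configuration directory than the one you edited.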

On Mon, Jul 14, 2014 at 8:54 PM, Bertrand Dechoux <dechouxb@gmail.com> wrote:

> For what it's worth, mapreduce.jobtracker.split.metainfo.maxsize is
> related to the size of the file containing the information describing the
> input splits. It is not related directly to the volume of data but to the
> number of splits which might explode when using too many (small) files.
> It's basically a safeguard. Alternatively, you might want to reduce the
> number of splits ; raising the block size is one way to do it.

OK, I'll keep this in mind and try changing the block size if necessary.
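To see why raising the block size helps, a rough back-of-the-envelope sketch: with FileInputFormat and splittable files, the common case is one split per block per file, so the split count (and hence the size of the split metainfo file) shrinks as blocks grow. The file sizes below are made-up numbers for illustration:

```python
def estimate_splits(file_sizes, block_size):
    """Rough split count assuming one split per block per file
    (the common case with FileInputFormat and splittable files)."""
    # -(-a // b) is ceiling division: a partial trailing block still
    # produces its own split.
    return sum(-(-size // block_size) for size in file_sizes)

files = [700 * 2**20] * 100  # hypothetical input: 100 files of 700 MiB each

print(estimate_splits(files, 128 * 2**20))  # 600 splits at 128 MiB blocks
print(estimate_splits(files, 512 * 2**20))  # 200 splits at 512 MiB blocks
```

The same input volume yields three times fewer splits at the larger block size, which is exactly the kind of reduction that keeps the split metainfo file under the safeguard limit.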

*Jan Warchoł*
*Software Engineer*

M: +48 509 078 203
E: jan.warchol@codilime.com
CodiLime Sp. z o.o. - Ltd. company with its registered office in Poland,
01-167 Warsaw, ul. Zawiszy 14/97. Registered by The District Court for the
Capital City of Warsaw, XII Commercial Department of the National Court
Register. Entered into National Court Register under No. KRS 0000388871.
Tax identification number (NIP) 5272657478. Statistical number
(REGON) 142974628.
