hadoop-mapreduce-user mailing list archives

From Elad Itzhakian <el...@mellanox.com>
Subject Help running Hadoop 2.0.5 with Snappy compression
Date Mon, 23 Sep 2013 12:27:22 GMT
Hi,

I'm trying to run some MapReduce jobs on Hadoop 2.0.5 with Snappy compression.

I built Hadoop with -Pnative, installed it and Snappy on all 3 machines (master + 2 slaves),
and copied the .so files as required to $HADOOP_HOME/lib/native.
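
A minimal sketch for double-checking the copy step on each machine. It assumes the native libraries are named libhadoop.so* and libsnappy.so*; the helper name find_native_libs is made up for illustration. It only confirms that the files are present, not that Hadoop can actually load them.

```python
import glob
import os


def find_native_libs(native_dir, prefixes=("libhadoop", "libsnappy")):
    """Map each library prefix to the matching shared-object files in native_dir."""
    found = {}
    for prefix in prefixes:
        # Match both the bare name and versioned names such as libsnappy.so.1
        pattern = os.path.join(native_dir, prefix + ".so*")
        found[prefix] = sorted(os.path.basename(p) for p in glob.glob(pattern))
    return found


if __name__ == "__main__":
    # Falls back to the current directory if HADOOP_HOME is not set.
    hadoop_home = os.environ.get("HADOOP_HOME", ".")
    native_dir = os.path.join(hadoop_home, "lib", "native")
    for prefix, files in find_native_libs(native_dir).items():
        print(prefix, "->", files if files else "MISSING")
```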

I also added the following to $HADOOP_CONF_DIR/mapred-site.xml:
<property>
  <name>mapreduce.map.output.compress</name>
  <value>true</value>
</property>
<property>
  <name>mapred.map.output.compress.codec</name>
  <value>org.apache.hadoop.io.compress.SnappyCodec</value>
</property>
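
One thing that may be worth ruling out is a misspelled property name, since Hadoop does not complain about names it does not recognize. The sketch below compares the names in a site file against a short, hand-written list of map-output compression properties (the Hadoop 2 names from mapred-default.xml and the Hadoop 1 names they replaced). The list and the helper name unknown_compression_properties are assumptions for illustration, not an exhaustive check.

```python
import xml.etree.ElementTree as ET

# Hand-written, non-exhaustive list of map-output compression property names.
KNOWN_NAMES = {
    "mapreduce.map.output.compress",
    "mapreduce.map.output.compress.codec",
    "mapred.compress.map.output",           # deprecated Hadoop 1 name
    "mapred.map.output.compression.codec",  # deprecated Hadoop 1 name
}


def unknown_compression_properties(xml_text):
    """Return property names that mention 'compress' but are not in KNOWN_NAMES."""
    root = ET.fromstring(xml_text)
    names = [p.findtext("name", default="").strip() for p in root.iter("property")]
    return [n for n in names if "compress" in n and n not in KNOWN_NAMES]


if __name__ == "__main__":
    # The two properties from mapred-site.xml, wrapped in a <configuration> root.
    sample = """
    <configuration>
      <property>
        <name>mapreduce.map.output.compress</name>
        <value>true</value>
      </property>
      <property>
        <name>mapred.map.output.compress.codec</name>
        <value>org.apache.hadoop.io.compress.SnappyCodec</value>
      </property>
    </configuration>
    """
    for name in unknown_compression_properties(sample):
        print("not a recognized property name:", name)
```

Run against the two properties above, this flags mapred.map.output.compress.codec, which matches neither the Hadoop 2 name (mapreduce.map.output.compress.codec) nor the Hadoop 1 name (mapred.map.output.compression.codec).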

And I added this to core-site.xml:

<property>
  <name>io.compression.codecs</name>
  <value>
    org.apache.hadoop.io.compress.GzipCodec,
    org.apache.hadoop.io.compress.DefaultCodec,
    org.apache.hadoop.io.compress.BZip2Codec,
    org.apache.hadoop.io.compress.SnappyCodec
  </value>
</property>


I then ran the following jobs:

Pi:
    bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.0.5-alpha.jar pi 8 2000

Teragen & Terasort:
    bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.0.5-alpha.jar teragen 100000 /in
    bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.0.5-alpha.jar terasort /in /out


But when grepping the logs I can't seem to find any sign of Snappy:

[eladi@r-zorro003 hadoop-2.0.5-alpha]$ grep -r "Snappy" /data1/elad/logs/*
[eladi@r-zorro003 hadoop-2.0.5-alpha]$ grep -r "compress" /data1/elad/logs/*
/data1/elad/logs/application_1379926544427_0001/container_1379926544427_0001_01_000010/syslog:2013-09-23 11:56:24,182 INFO [fetcher#5] org.apache.hadoop.io.compress.zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
/data1/elad/logs/application_1379926544427_0001/container_1379926544427_0001_01_000010/syslog:2013-09-23 11:56:24,183 INFO [fetcher#5] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.deflate]
/data1/elad/logs/application_1379926544427_0001/container_1379926544427_0001_01_000010/syslog:2013-09-23 11:56:24,331 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor [.deflate]
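
For reference, the bracketed suffix in the CodecPool lines is the codec's default file extension: .deflate belongs to DefaultCodec, while SnappyCodec would show up as .snappy. A rough sketch for tallying which codecs a container log mentions, assuming the message format shown in the grep output above (the helper name count_codecs is hypothetical):

```python
import re
from collections import Counter

# Matches CodecPool lines such as:
#   ... CodecPool: Got brand-new compressor [.deflate]
CODEC_LINE = re.compile(
    r"CodecPool: Got brand-new (compressor|decompressor)\s*\[(\.\w+)\]"
)


def count_codecs(log_text):
    """Count (kind, extension) pairs reported by CodecPool in the given log text."""
    # Pasted logs may be hard-wrapped, so collapse all whitespace before matching.
    flat = " ".join(log_text.split())
    return Counter(CODEC_LINE.findall(flat))


if __name__ == "__main__":
    sample = """
    11:56:24,183 INFO [fetcher#5] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
    [.deflate]
    11:56:24,331 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor
    [.deflate]
    """
    for (kind, ext), n in sorted(count_codecs(sample).items()):
        print(kind, ext, n)
```

A count that contains only .deflate entries, as here, would suggest that compression is switched on but that the configured codec setting is not being picked up.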

It seems as if Zlib is being loaded instead of Snappy.

What am I missing?

Thanks,
Elad Itzhakian
