hadoop-mapreduce-user mailing list archives

From Tao Xiao <xiaotao.cs....@gmail.com>
Subject Re: hadoop fs -text OutOfMemoryError
Date Sat, 14 Dec 2013 02:11:18 GMT
Hi Xiao Li,
   You said: "Basically, what I need is a Storm HDFS Bolt to be able to
write output to an HDFS file; to get fewer small files, I use HDFS append."
Did you configure the "append" property in your configuration file?
You could search for append-related issues first.
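To illustrate the "append" property mentioned above: on Hadoop 1.x releases, append is disabled by default and has to be switched on in hdfs-site.xml (on Hadoop 2.x it is generally on by default). A minimal sketch, assuming a Hadoop 1.x cluster; check the documentation for your exact version:

```xml
<!-- hdfs-site.xml: enable HDFS append (Hadoop 1.x; illustrative fragment) -->
<property>
  <name>dfs.support.append</name>
  <value>true</value>
</property>
```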


2013/12/14 xiao li <xelllee@outlook.com>

> export HADOOP_CLIENT_OPTS="-Xms268435456 -Xmx268435456 $HADOOP_CLIENT_OPTS"
>
>
>
> I guess it is not a memory issue, but rather the way I write the Snappy-
> compressed file to HDFS.
> Basically, what I need is a Storm HDFS Bolt to be able to write output to
> an HDFS file; to get fewer small files, I use HDFS append.
>
> Well, I just can't get Snappy working or write compressed files to HDFS
> through Java.
>
> I am looking at the Flume HDFS sink for better example code. ;)
>
>
> https://github.com/cloudera/flume-ng/blob/cdh4-1.1.0_4.0.0/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSCompressedDataStream.java
>
> ------------------------------
> Date: Fri, 13 Dec 2013 22:24:21 +0100
>
> Subject: Re: hadoop fs -text OutOfMemoryError
> From: kawa.adam@gmail.com
> To: user@hadoop.apache.org
>
>
> Hi,
>
> What is the value of HADOOP_CLIENT_OPTS in your hadoop-env.sh file?
>
> We had similar OOM problems with the hadoop fs command (I do not remember
> whether they were related to -text + snappy specifically) when we decreased
> the heap to some small value. With a higher value, e.g. 1 or 2 GB, we were
> fine:
>
> # The following applies to multiple commands (fs, dfs, fsck, distcp etc)
> export HADOOP_CLIENT_OPTS="-Xmx2048m ${HADOOP_CLIENT_OPTS}"
>
>
> 2013/12/13 xiao li <xelllee@outlook.com>
>
> Hi Tao
>
> Thanks for your reply,
>
> This is the code; it is pretty simple:
>
>     fsDataOutputStream.write(Snappy.compress(NEWLINE));
>     fsDataOutputStream.write(Snappy.compress(json.getBytes("UTF-8")));
>
>
> But the FSDataOutputStream is actually opened for appending. I guess I
> can't simply append to the Snappy file (I know nothing about its format).
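One likely explanation for the error below (an assumption on my part, not something confirmed in this thread): hadoop fs -text selects SnappyCodec from the .snappy extension, and its BlockDecompressorStream expects Hadoop's block-compressed format, which prefixes each block with a 4-byte big-endian uncompressed length. Bytes produced by raw Snappy.compress() carry snappy's own framing instead, so that length field decodes to garbage and the shell tries to allocate an enormous buffer. A self-contained sketch of the header misread (BlockHeaderDemo and the example bytes are hypothetical, not Hadoop code):

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

public class BlockHeaderDemo {

    // Simplified stand-in for how a block decompressor stream reads the
    // 4-byte big-endian length that precedes each uncompressed block,
    // then allocates a buffer of that size.
    static int readBlockLength(InputStream in) throws IOException {
        return new DataInputStream(in).readInt();
    }

    public static void main(String[] args) throws IOException {
        // A file written with raw Snappy.compress() does not start with such
        // a header, so the four bytes the reader sees are effectively
        // arbitrary and can decode to a huge "length".
        byte[] rawSnappyStart = { (byte) 0x7f, (byte) 0xff, (byte) 0xff, (byte) 0xf0 };
        int bogus = readBlockLength(new ByteArrayInputStream(rawSnappyStart));
        // Attempting to allocate a buffer of ~2 GB is what would surface as
        // java.lang.OutOfMemoryError: Java heap space in hadoop fs -text.
        System.out.println("decoded block length: " + bogus);
    }
}
```

Running this prints "decoded block length: 2147483632", i.e. a near-2 GB allocation request from just four misread bytes, regardless of how tiny the file is.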
>
>
>
> ------------------------------
> Date: Fri, 13 Dec 2013 21:42:38 +0800
> Subject: Re: hadoop fs -text OutOfMemoryError
> From: xiaotao.cs.nju@gmail.com
> To: user@hadoop.apache.org
>
>
> Can you describe your problem in more detail? For example, was the Snappy
> library installed correctly on your cluster, how did you encode your files
> with Snappy, and were they encoded correctly?
>
>
> 2013/12/13 xiao li <xelllee@outlook.com>
>
> I could view the Snappy file with hadoop fs -cat, but when I issue -text,
> it gives me this error even though the file size is really tiny. What
> have I done wrong? Thanks.
>
> hadoop fs -text /test/SinkToHDFS-ip-.us-west-2.compute.internal-6703-22-20131212-0.snappy
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>   at org.apache.hadoop.io.compress.BlockDecompressorStream.getCompressedData(BlockDecompressorStream.java:115)
>   at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:95)
>   at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:83)
>   at java.io.InputStream.read(InputStream.java:82)
>   at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:78)
>   at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:52)
>   at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:112)
>   at org.apache.hadoop.fs.shell.Display$Cat.printToStdout(Display.java:86)
>   at org.apache.hadoop.fs.shell.Display$Cat.processPath(Display.java:81)
>   at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:306)
>   at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:278)
>   at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:260)
>   at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:244)
>   at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:190)
>   at org.apache.hadoop.fs.shell.Command.run(Command.java:154)
>   at org.apache.hadoop.fs.FsShell.run(FsShell.java:254)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>   at org.apache.hadoop.fs.FsShell.main(FsShell.java:304)
>
>
>
>
