From: Dmitriy Ryaboy
To: common-user@hadoop.apache.org
Date: Thu, 1 Apr 2010 00:16:16 -0700
Subject: Errors reading lzo-compressed files from Hadoop

Hi folks,

We write a lot of lzo-compressed files to HDFS -- some via scribe, some using internal tools. Occasionally, we discover that the created lzo files cannot be read from HDFS -- they get through some (often large) portion of the file, and then fail with the following stack trace:

Exception in thread "main" java.lang.InternalError: lzo1x_decompress_safe returned:
        at com.hadoop.compression.lzo.LzoDecompressor.decompressBytesDirect(Native Method)
        at com.hadoop.compression.lzo.LzoDecompressor.decompress(LzoDecompressor.java:303)
        at com.hadoop.compression.lzo.LzopDecompressor.decompress(LzopDecompressor.java:122)
        at com.hadoop.compression.lzo.LzopInputStream.decompress(LzopInputStream.java:223)
        at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:74)
        at java.io.InputStream.read(InputStream.java:85)
        at com.twitter.twadoop.jobs.LzoReadTest.main(LzoReadTest.java:51)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

The initial thought is of course that the lzo file is corrupt -- however, plain-jane lzop is able to read these files. Moreover, if we pull the files out of hadoop, uncompress them, compress them again, and put them back into HDFS, we can usually read them from HDFS as well.
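For context, the read path that dies is just the lzo codec wrapped around the stream returned by FileSystem.open(). A trimmed-down sketch -- not our actual LzoReadTest, and it assumes com.hadoop.compression.lzo.LzopCodec is registered via io.compression.codecs and the native lzo library is loadable -- looks like this:

import java.io.InputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionCodecFactory;

public class LzoReadSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path path = new Path(args[0]);               // e.g. an .lzo file in HDFS (hypothetical path)
    FileSystem fs = path.getFileSystem(conf);

    // Resolves LzopCodec from the .lzo suffix of the path.
    CompressionCodec codec = new CompressionCodecFactory(conf).getCodec(path);

    // Wraps the raw stream in the codec's decompressing stream (the LzopInputStream in the trace).
    InputStream in = codec.createInputStream(fs.open(path));
    byte[] buf = new byte[64 * 1024];
    long total = 0;
    for (int n = in.read(buf); n != -1; n = in.read(buf)) {
      total += n;                                // the InternalError surfaces in this loop, part way through the file
    }
    in.close();
    System.out.println("read " + total + " uncompressed bytes");
  }
}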
We've been thinking that this strange behavior is caused by a bug in the hadoop-lzo libraries (we use the version with the Twitter and Cloudera fixes, on github: http://github.com/kevinweil/hadoop-lzo ). However, today I discovered that, using the exact same environment, codec, and InputStreams, we can successfully read from the local file system but cannot read from HDFS. This appears to point at possible issues in FSDataInputStream or further down the stack.

Here's a small test class that tries to read the same file from HDFS and from the local FS, along with the output of running it on our cluster. We are using the CDH2 distribution.

https://gist.github.com/e1bf7e4327c7aef56303

Any ideas on what could be going on?

Thanks,
-Dmitriy
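P.S. In case the gist is unreachable: the test is shaped roughly like the following (a paraphrase, not the exact gist code). It drains the identical .lzo file twice with the same codec and read loop, once through the local filesystem and once through HDFS; on the affected files only the HDFS pass hits the InternalError.

import java.io.IOException;
import java.io.InputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionCodecFactory;

public class LzoLocalVsHdfs {
  // Drains one .lzo file through the codec and returns the uncompressed byte count.
  static long drain(FileSystem fs, Path p, Configuration conf) throws IOException {
    CompressionCodec codec = new CompressionCodecFactory(conf).getCodec(p);
    InputStream in = codec.createInputStream(fs.open(p));
    byte[] buf = new byte[64 * 1024];
    long total = 0;
    for (int n = in.read(buf); n != -1; n = in.read(buf)) {
      total += n;
    }
    in.close();
    return total;
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path localCopy = new Path(args[0]);   // copy of the file on local disk (hypothetical path)
    Path hdfsCopy = new Path(args[1]);    // the same file as it sits in HDFS

    System.out.println("local fs: " + drain(FileSystem.getLocal(conf), localCopy, conf) + " bytes");
    // On the bad files, this second read is the one that dies with
    // "java.lang.InternalError: lzo1x_decompress_safe returned:".
    System.out.println("hdfs:     " + drain(FileSystem.get(conf), hdfsCopy, conf) + " bytes");
  }
}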