Date: Thu, 21 Oct 2010 21:12:25 -0400
Subject: Re: LZO Compression Libraries don't appear to work properly with MultipleOutputs
From: ed
To: common-user@hadoop.apache.org

Hi Todd,

I don't have the code in front of me right now, but I was looking over the
API docs and it looks like I forgot to call close() on the MultipleOutputs.
I'll post back if that fixes the problem. If not, I'll put together a unit
test.

Thanks!

~Ed

On Thu, Oct 21, 2010 at 6:31 PM, Todd Lipcon wrote:

> Hi Ed,
>
> Sounds like this might be a bug, either in MultipleOutputs or in LZO.
>
> Does it work properly with gzip compression? Which LZO implementation
> are you using? The one from Google Code or the more up-to-date one
> from GitHub (either kevinweil's or mine)?
>
> Any chance you could write a unit test that shows the issue?
>
> Thanks
> -Todd
>
> On Thu, Oct 21, 2010 at 2:52 PM, ed wrote:
> > Hello everyone,
> >
> > I am having problems using MultipleOutputs with LZO compression (could be
> > a bug or something wrong in my own code).
> >
> > In my driver I set:
> >
> >   MultipleOutputs.addNamedOutput(job, "test", TextOutputFormat.class,
> >       NullWritable.class, Text.class);
> >
> > In my reducer I have:
> >
> >   MultipleOutputs<NullWritable, Text> mOutput =
> >       new MultipleOutputs<NullWritable, Text>(context);
> >
> >   public String generateFileName(Key key) {
> >       return "custom_file_name";
> >   }
> >
> > Then in the reduce() method I have:
> >
> >   mOutput.write(mNullWritable, mValue, generateFileName(key));
> >
> > This results in LZO files that do not decompress properly (lzop -d
> > throws the error "lzop: unexpected end of file: outputFile.lzo").
> >
> > If I switch back to the regular context.write(mNullWritable, mValue);
> > everything works fine.
> >
> > Am I forgetting a step needed when using MultipleOutputs, or is this a
> > bug/non-feature of using LZO compression in Hadoop?
> >
> > Thank you!
> >
> > ~Ed
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
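[For readers who land on this thread with the same symptom: below is a minimal sketch of the pattern being discussed, with the close() call ed mentions added in cleanup(). It assumes the new-API org.apache.hadoop.mapreduce.lib.output.MultipleOutputs; the class name, key/value types, and generateFileName() are illustrative stand-ins taken from the thread, not a definitive implementation.]

```java
import java.io.IOException;

import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

// Sketch only: if close() is never called, the record writers behind each
// named output are never flushed/finalized, so block-compressed codecs
// such as LZO can leave truncated files ("lzop: unexpected end of file").
public class ExampleReducer extends Reducer<Text, Text, NullWritable, Text> {

    private MultipleOutputs<NullWritable, Text> mOutput;
    private final NullWritable mNullWritable = NullWritable.get();

    @Override
    protected void setup(Context context) {
        mOutput = new MultipleOutputs<NullWritable, Text>(context);
    }

    // Placeholder name from the thread; real code would derive this from the key.
    private String generateFileName(Text key) {
        return "custom_file_name";
    }

    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        for (Text value : values) {
            mOutput.write(mNullWritable, value, generateFileName(key));
        }
    }

    @Override
    protected void cleanup(Context context)
            throws IOException, InterruptedException {
        // The step under discussion: closing MultipleOutputs flushes and
        // closes every underlying (possibly compressed) output stream.
        mOutput.close();
    }
}
```

The plain context.write() path works without this because the framework closes the job's default record writer itself; writers opened through MultipleOutputs are the caller's responsibility.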