From: Leonardo Urbina
To: "common-user@hadoop.apache.org"
Subject: Re: Hadoop Serialization: Avro
Date: Sat, 26 Nov 2011 21:48:50 -0600
Message-ID: <930994860714920899@unknownmsgid>

Thanks, I will send the question to that list as well.

Best,
-Leo

Sent from my phone

On Nov 26, 2011, at 7:32 PM,
Brock Noland wrote:
> Hi,
>
> Depending on the response you get here, you might also post the
> question separately on avro-user.
>
> On Sat, Nov 26, 2011 at 1:46 PM, Leonardo Urbina wrote:
>> Hey everyone,
>>
>> First time posting to the list. I'm currently writing a Hadoop job that
>> will run daily and whose output will be part of the next day's input.
>> Also, the output will potentially be read by other programs for later
>> analysis.
>>
>> Since my program's output is used as part of the next day's input, it would
>> be nice if it was stored in some binary format that is easy to read the
>> next time around. But this format also needs to be readable by other
>> outside programs, not necessarily written in Java. After searching for a
>> while it seems that Avro is what I want to be using. In any case, I have
>> been looking around for a while and I can't seem to find a single example
>> of how to use Avro within a Hadoop job.
>>
>> It seems that in order to use Avro I need to change the io.serializations
>> value, however I don't know which value should be specified. Furthermore, I
>> found that there are classes Avro{Input,Output}Format, but as far as I
>> understand these seem to need the use of a series of other Avro classes
>> such as AvroWrapper, AvroKey, AvroValue, and as far as I can tell Avro*
>> (with * replaced with pretty much any Hadoop class name). It seems however
>> that these are used so that the Avro format is used throughout the Hadoop
>> process to pass objects around.
>>
>> I just want to use Avro to save my output and read it again as input next
>> time around. So far I have been using SequenceFile{Input,Output}Format, and
>> have implemented the Writable interface in the relevant classes, however
>> this is not portable to other languages. Is there a way to use Avro without
>> a substantial rewrite (using Avro* classes) of my Hadoop job? Thanks in
>> advance,
>>
>> Best,
>> -Leo
>>
>> --
>> Leo Urbina
>> Massachusetts Institute of Technology
>> Department of Electrical Engineering and Computer Science
>> Department of Mathematics
>> lurbina@mit.edu
>>