avro-user mailing list archives

From: Michael Malak <michaelma...@yahoo.com>
Subject: Re: Is it possible to append to an already existing avro file
Date: Thu, 07 Feb 2013 16:42:05 GMT

I confess to being a user of rather than a developer of open source, but perhaps you could
elaborate on what "depends on" means and what the consequences are?

Isn't it -- or couldn't it be made -- a run-time binding, so that only those who try to use
the HDFS append functionality would be required to also include the HDFS Jars in their classpath?
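
As a rough illustration of the run-time binding idea, here is a minimal Java sketch. The class OptionalAppendSupport and its methods are hypothetical and not part of Avro; the sketch only assumes that org.apache.hadoop.fs.FileSystem is present when the HDFS Jars are on the classpath, and it probes for that class by name instead of referencing it at compile time.

```java
public class OptionalAppendSupport {
    // Class that is only present when the Hadoop/HDFS Jars are on the classpath.
    public static final String HDFS_CLASS = "org.apache.hadoop.fs.FileSystem";

    // Returns true if the named class can be loaded; no compile-time dependency is needed.
    public static boolean isAvailable(String className) {
        try {
            // initialize=false: only check that the class resolves, do not run static initializers.
            Class.forName(className, false, OptionalAppendSupport.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Hypothetical helper: reports whether an HDFS-backed append path could be enabled.
    public static String describeAppendSupport(String className) {
        return isAvailable(className)
                ? "append enabled: found " + className
                : "append disabled: " + className + " not on classpath";
    }

    public static void main(String[] args) {
        System.out.println(describeAppendSupport(HDFS_CLASS));
    }
}
```

With a check like this, only users who actually call the append path would need the HDFS Jars; everyone else would see the feature reported as disabled rather than hitting a link error at startup.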

Or is the issue more of a bookkeeping one, whereby every update to HDFS will require an Avro
regression test?

Now that Hive supports Avro as of the Jan. 11 release of Hive 0.10, the use case of ingesting
data into Avro on HDFS is only going to get more popular, and appending is very handy for
ingesting, especially for real-time or near-real-time data.  So it seems to me that, if
the inconveniences are minor or can be worked around, Avro perhaps should indeed "depend
on" HDFS.

--- On Thu, 2/7/13, Harsh J <harsh@cloudera.com> wrote:

> From: Harsh J <harsh@cloudera.com>
> Subject: Re: Is it possible to append to an already existing avro file
> To: user@avro.apache.org
> Date: Thursday, February 7, 2013, 9:28 AM
> I assume by non-trivial you meant the
> extra Seekable stuff I needed to
> wrap around the DFS output streams to let Avro take it as
> append-able?
> I don't think it's possible for Avro to carry it since Avro
> (core) does
> not reverse-depend on Hadoop. Should we document it
> somewhere though?
> Do you have any ideas on the best place to do that?
> 
> On Thu, Feb 7, 2013 at 6:12 AM, Michael Malak <michaelmalak@yahoo.com>
> wrote:
> > Thanks so much for the code -- it works great!
> >
> > Since it is a non-trivial amount of code required to
> > achieve append, I suggest attaching that code to AVRO-1035,
> > in the hopes that someone will come up with an interface
> > that requires just one line of user code to achieve append.
> >
> > --- On Wed, 2/6/13, Harsh J <harsh@cloudera.com> wrote:
> >
> >> From: Harsh J <harsh@cloudera.com>
> >> Subject: Re: Is it possible to append to an already existing avro file
> >> To: user@avro.apache.org
> >> Date: Wednesday, February 6, 2013, 11:17 AM
> >> Hey Michael,
> >>
> >> It does implement the regular Java OutputStream interface,
> >> as seen in
> >> the API: http://hadoop.apache.org/docs/current/api/org/apache/hadoop/fs/FSDataOutputStream.html.
> >>
> >> Here's a sample program that works on Hadoop 2.x in my
> >> tests:
> >> https://gist.github.com/QwertyManiac/4724582

