flume-user mailing list archives

From Viral Bajaria <viral.baja...@gmail.com>
Subject Re: flume hdfs sink notify / callback to add partition
Date Fri, 01 Aug 2014 02:55:53 GMT
Any suggestions on this? I am still trying to figure out how to get a
notification when the HDFS sink creates a new partition, so that I can
add it with an ALTER TABLE statement on a separate thread.

Is adding a callback the right way to handle this?
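
To make the idea concrete, here is a rough sketch of the bookkeeping such a
callback could do. Everything in it is an assumption for illustration:
PartitionNotifier and onFileClosed are hypothetical names, not part of
Flume, and the sketch assumes Hive-style key=value directory names in the
file path. It only builds the ALTER TABLE statement the first time a
partition is seen; wiring it into the sink and running the statement
against Hive are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper, not part of Flume: maps closed-file paths to ADD PARTITION statements. */
class PartitionNotifier {
    private final String table;
    // Partition specs already seen by this process; thread-safe so a sink thread can call in.
    private final Set<String> seen = ConcurrentHashMap.newKeySet();

    PartitionNotifier(String table) {
        this.table = table;
    }

    /** Extracts key=value directory segments, e.g. "dt='2014-08-01', hour='02'". */
    static String partitionSpec(String filePath) {
        String[] segments = filePath.split("/");
        List<String> parts = new ArrayList<>();
        // Skip the last segment: it is the file name, not a partition directory.
        for (int i = 0; i < segments.length - 1; i++) {
            int eq = segments[i].indexOf('=');
            if (eq > 0) {
                String key = segments[i].substring(0, eq);
                // Assumption: values contain no quotes, so no escaping is done here.
                String value = segments[i].substring(eq + 1);
                parts.add(key + "='" + value + "'");
            }
        }
        return String.join(", ", parts);
    }

    /** Returns a statement only the first time a partition is seen, otherwise empty. */
    Optional<String> onFileClosed(String filePath) {
        String spec = partitionSpec(filePath);
        if (spec.isEmpty() || !seen.add(spec)) {
            return Optional.empty();
        }
        return Optional.of(
            "ALTER TABLE " + table + " ADD IF NOT EXISTS PARTITION (" + spec + ")");
    }
}
```

The seen set lives in memory only, so after a restart the first file in each
partition would trigger the statement again; IF NOT EXISTS should make that
harmless. If the data ends up on S3, the statement would probably also need
a LOCATION clause.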


On Mon, Jul 28, 2014 at 2:40 PM, Viral Bajaria <viral.bajaria@gmail.com> wrote:

> Hi,
> Is there a way to get the HDFS sink to signal that a file was just closed,
> and then use that signal to add a partition to Hive if one does not already
> exist?
> Right now, what I do is:
> - move files to S3
> - run recover partitions <--- this step takes forever
> Given how much historical data I have, it's not feasible to run recover
> partitions every single day.
> I would much rather add a partition whenever I see a file in that
> partition for the first time.
> I looked around the code base and it seems Flume-OG had something like
> this, but I don't see the capability in Flume-NG.
> One way I can see to add this is to pass another callback parameter to
> HDFSEventSink and create a custom wrapper around it.
> Any other suggestions?
> Thanks,
> Viral
