flume-user mailing list archives

From Otis Gospodnetic <otis.gospodne...@gmail.com>
Subject Re: AWS S3 flume source
Date Thu, 31 Jul 2014 23:58:03 GMT
+1 for seeing S3Source, starting with a JIRA issue.

But being able to dynamically add/remove S3 buckets from which to pull data
seems important.

Any suggestions for how to approach that?
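
One possible direction, sketched under assumptions rather than taken from Flume itself: keep the bucket names in a small text file, re-read it on every poll, and act on the difference between two reads. The file format (one bucket per line, `#` for comments) and the names `read_buckets` and `diff_buckets` are hypothetical.

```python
from pathlib import Path


def read_buckets(path):
    """Return the set of bucket names in a text file.

    Assumed format: one bucket name per line; anything after '#'
    is a comment; blank lines are ignored.
    """
    buckets = set()
    for line in Path(path).read_text().splitlines():
        name = line.split("#", 1)[0].strip()
        if name:
            buckets.add(name)
    return buckets


def diff_buckets(previous, current):
    """Return (added, removed) bucket names between two polls."""
    return current - previous, previous - current
```

A polling loop could call `read_buckets` at the start of each cycle, start pulling from the added buckets, and stop pulling from the removed ones, without restarting the process.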

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


On Thu, Jul 31, 2014 at 9:14 PM, Hari Shreedharan <hshreedharan@cloudera.com> wrote:

> Please go ahead and file a jira. If you are willing to submit a patch, you
> can post it on the jira.
>
> Viral Bajaria wrote:
>
>
> I have a similar use case that cropped up yesterday. I saw the archive
> and found that there was a recommendation to build it as Sharninder
> suggested.
>
> For now, I went down the route of writing a Python script which
> downloads from S3 and puts the files in a directory which is
> configured to be picked up by a spooldir source.
>
> I would prefer to get a direct S3 source, and maybe we could
> collaborate on it and open-source it. Let me know if you prefer that
> and we can work directly on it by creating a JIRA.
>
> Thanks,
> Viral
>
>
>
> On Thu, Jul 31, 2014 at 10:26 AM, Hari Shreedharan
> <hshreedharan@cloudera.com> wrote:
>
>     In both cases, Sharninder is right :)
>
>     Sharninder wrote:
>
>
>
>     As far as I know, there is no (open source) implementation of an S3
>     source, so yes, you'll have to implement your own. You'll have to
>     implement a Pollable source and the dev documentation has an outline
>     that you can use. You can also look at the existing ExecSource and
>     work your way up.
>
>     As far as I know, there is no way to configure Flume without using the
>     configuration file.
>
>
>
>     On Thu, Jul 31, 2014 at 7:57 PM, Paweł <prog88@gmail.com> wrote:
>
>         Hi,
>         I'm wondering if Flume is able to read directly from S3.
>
>         I'll describe my case. I have log files stored in AWS S3. I have
>         to periodically fetch new S3 objects and read log lines from them.
>         Then the log lines (events) are processed in Flume's standard way
>         (as with other sources).
>
>         *1) Is there any way to fetch S3 objects, or do I have to write my own
>         Source?*
>
>
>         There is also a second case. I want the Flume configuration to be
>         dynamic. Flume sources can change over time: new AWS keys and S3
>         buckets can be added or deleted.
>
>         *2) Is there any way to configure Flume other than by a static
>         configuration file?*
>
>         --
>         Paweł Róg
>
>
>

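The workaround Viral describes in the quoted thread (a script that downloads from S3 into a directory watched by a spooling directory source) might look roughly like the sketch below. It is not his script. The function names, the key-flattening scheme, and the `list_keys`/`fetch` callables that stand in for an S3 client are all assumptions. What the sketch tries to show is writing each file to a staging directory and then renaming it into the spool directory, because the spooling directory source expects files to be complete and unchanged once they appear.

```python
import os
import tempfile
from pathlib import Path


def deliver_to_spool(key, data, staging_dir, spool_dir):
    """Write data under staging_dir, then move it into spool_dir.

    os.replace is atomic only when both directories are on the same
    filesystem, so the watcher never sees a half-written file.
    """
    # Hypothetical naming scheme: flatten the S3 key into one file name.
    name = key.replace("/", "_")
    fd, tmp_path = tempfile.mkstemp(dir=str(staging_dir))
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    target = Path(spool_dir) / name
    os.replace(tmp_path, target)
    return target


def sync_once(list_keys, fetch, staging_dir, spool_dir, seen):
    """Deliver every key not yet in `seen`; return the new spool paths.

    list_keys() -> iterable of object keys   (assumed S3 client wrapper)
    fetch(key)  -> bytes of the object body  (assumed S3 client wrapper)
    """
    delivered = []
    for key in list_keys():
        if key in seen:
            continue  # already handed to the spool directory
        delivered.append(deliver_to_spool(key, fetch(key), staging_dir, spool_dir))
        seen.add(key)
    return delivered
```

Here `seen` lives only in memory, so a real script would need to persist it across restarts. `list_keys` and `fetch` are left as plain callables so that the sketch does not assume any particular S3 library.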