flume-user mailing list archives

From Shumin Wu <shumin...@gmail.com>
Subject Re: Best way to control the schedule period of process() in source/sink runner
Date Fri, 14 Sep 2012 14:40:37 GMT
Hi Brock,

Thanks for redirecting me to the right mailing list. My use case is like
this:

1. In a pollable source, the polling rate to pull data from a server is
about once every few seconds. At the same time, the source wakes up
periodically to do message routing based on some business logic.
2. On the sink side, it batches the Flume events it receives and then makes
a DB call that is scheduled at a certain rate. In addition, the sink needs
to periodically aggregate the data based on some business logic and make a
different DB call at a different rate.

So in my case, both the source and the sink run multiple scheduled services.

I have two questions.
Question 1: How do I control the polling rate in a pollable source other
than by sleeping inside process()? I need to adjust the polling rate when
the server does not have much data for me to pull.
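
To make Question 1 concrete, here is a rough sketch of the adaptive delay I
have in mind: back off while the server has nothing for me, and return to the
fast rate once data shows up. AdaptivePollDelay and its parameters are made-up
names for illustration and are not part of Flume; where the computed delay
should be applied is exactly what I am asking.

```java
// Hypothetical helper, not a Flume class: decides how long to wait
// before the next poll, based on how many events the last poll returned.
final class AdaptivePollDelay {
    private final long minDelayMs;
    private final long maxDelayMs;
    private long currentDelayMs;

    AdaptivePollDelay(long minDelayMs, long maxDelayMs) {
        if (minDelayMs <= 0 || maxDelayMs < minDelayMs) {
            throw new IllegalArgumentException("need 0 < min <= max");
        }
        this.minDelayMs = minDelayMs;
        this.maxDelayMs = maxDelayMs;
        this.currentDelayMs = minDelayMs;
    }

    // Records the outcome of a poll and returns the delay before the next one.
    long nextDelayMs(int eventsPulled) {
        if (eventsPulled > 0) {
            // Data is flowing again: go back to the fastest rate.
            currentDelayMs = minDelayMs;
        } else {
            // Empty poll: double the delay, capped at the maximum.
            currentDelayMs = Math.min(currentDelayMs * 2, maxDelayMs);
        }
        return currentDelayMs;
    }
}
```

With a 1 s minimum and an 8 s maximum, consecutive empty polls would wait
2 s, 4 s, 8 s, 8 s, and the first non-empty poll resets the delay to 1 s.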

Question 2: What's the best practice for managing several scheduled services
in one source/sink? Right now I allocate my own ScheduledThreadPools. Would
it be cleaner in terms of program architecture to have multiple source/sink
runners grouped in one source/sink?
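
For Question 2, a simplified sketch of the "own thread pool" approach looks
roughly like this: one ScheduledExecutorService per component, created and
shut down together with it, with each periodic job wrapped so that an
exception does not cancel its schedule. PeriodicTasks and the job parameters
are hypothetical names for illustration, not Flume classes, and this is not
my production code.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical holder for one component's periodic jobs; not a Flume class.
final class PeriodicTasks {
    private ScheduledExecutorService scheduler;

    // Intended to be called from the component's start().
    synchronized void start(Runnable flushJob, long flushPeriodMs,
                            Runnable aggregateJob, long aggregatePeriodMs) {
        if (scheduler != null) {
            throw new IllegalStateException("already started");
        }
        scheduler = Executors.newScheduledThreadPool(2);
        scheduler.scheduleWithFixedDelay(guard(flushJob),
                flushPeriodMs, flushPeriodMs, TimeUnit.MILLISECONDS);
        scheduler.scheduleWithFixedDelay(guard(aggregateJob),
                aggregatePeriodMs, aggregatePeriodMs, TimeUnit.MILLISECONDS);
    }

    // Intended to be called from the component's stop().
    synchronized void stop() {
        if (scheduler == null) {
            return;
        }
        scheduler.shutdown();
        try {
            if (!scheduler.awaitTermination(5, TimeUnit.SECONDS)) {
                scheduler.shutdownNow();
            }
        } catch (InterruptedException e) {
            scheduler.shutdownNow();
            Thread.currentThread().interrupt(); // keep the interrupt visible
        }
        scheduler = null;
    }

    synchronized boolean isRunning() {
        return scheduler != null;
    }

    // A periodic task that throws is never run again, so log and carry on.
    private static Runnable guard(Runnable job) {
        return () -> {
            try {
                job.run();
            } catch (RuntimeException e) {
                System.err.println("Periodic job failed: " + e);
            }
        };
    }
}
```

This works, but every source and sink ends up repeating the same lifecycle
code, which is why I am asking whether there is a shared facility for it.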

Thanks for your time!

Shumin

On Thu, Sep 13, 2012 at 1:28 PM, Brock Noland <brock@cloudera.com> wrote:

> Hi,
>
> At this point I think this is appropriate for the user list.
>
> Can you explain a little more about your use case and what you are
> trying to achieve?
>
> Brock
>
> On Thu, Sep 13, 2012 at 1:19 PM, Shumin Wu <shumin.wu@gmail.com> wrote:
> > Hi,
> >
> > Please point me to a better way to control the scheduled execution
> > period of the process() method in source/sink runners than the code
> > below.
> >
> >     @Override
> >     public Status process() throws EventDeliveryException {
> >         try {
> >             // Pause between polls; 1000L keeps the math in long arithmetic.
> >             Thread.sleep(period * 1000L);
> >         } catch (InterruptedException ie) {
> >             // Restore the interrupt flag so the caller can see it.
> >             Thread.currentThread().interrupt();
> >             logger.error("Process thread sleep interrupted", ie);
> >         } catch (Exception ex) {
> >             logger.error("Error in the empty process method", ex);
> >             throw new EventDeliveryException(ex);
> >         }
> >         return Status.READY;
> >     }
> >
> > What's the best practice if I need multiple services running at
> > different periods? I can of course manage my own ScheduledThreadPools, but
> > does flume-ng have any utilities for this, either available now or
> > planned down the road?
> >
> > Thanks,
> >
> > Shumin
>
>
>
> --
> Apache MRUnit - Unit testing MapReduce -
> http://incubator.apache.org/mrunit/
>
