flume-user mailing list archives

From Brock Noland <br...@cloudera.com>
Subject Re: HDFS SINK "txnEventMax" property
Date Wed, 26 Sep 2012 12:30:37 GMT
Sorry, you are correct: there are two parameters. We should perhaps check whether both are required.
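
For reference, a minimal sketch of what setting both properties to the same value (as suggested below) might look like in an agent configuration. The agent and sink names (agent1, hdfsSink) are placeholders, and the value 100 is only illustrative; it is the txnEventMax default quoted in this thread, not a tuned recommendation.

```properties
# Hypothetical agent/sink names; adjust to your own configuration.
agent1.sinks.hdfsSink.type = hdfs
# Events flushed to HDFS per batch (default quoted in this thread: 1)
agent1.sinks.hdfsSink.hdfs.batchSize = 100
# Events read off the channel per transaction (default quoted in this thread: 100)
agent1.sinks.hdfsSink.hdfs.txnEventMax = 100
```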



On Wednesday, September 26, 2012 at 6:59 AM, Harish Mandala wrote:

> It seems that batchSize defines how many events are flushed to HDFS by the sink, and
> txnEventMax defines how many events are read off the channel. I would just set them both to
> the same value.
> 
> -Harish 
> 
> On Wed, Sep 26, 2012 at 7:50 AM, Jagadish Bihani <jagadish.bihani@pubmatic.com> wrote:
> > Hi
> > 
> > Even in the file HDFSEventSink.java there
> > are 2 variables: 
> > defaultBatchSize (default value:1)
> > defaultTxnEventMax(default value: 100)
> > 
> > It would be very helpful to understand how each property works and the difference between them.
> > 
> > Regards,
> > Jagadish
> > 
> > 
> > On 09/26/2012 05:14 PM, Harish Mandala wrote:
> > > But there exists already a different property called batchSize. 
> > > 
> > > -Harish
> > > 
> > > On Wed, Sep 26, 2012 at 7:30 AM, Brock Noland <brock@cloudera.com> wrote:
> > > > A better name for that property would be batchSize.
> > > > 
> > > > Brock
> > > > 
> > > > On Wed, Sep 26, 2012 at 5:13 AM, Jagadish Bihani <jagadish.bihani@pubmatic.com> wrote:
> > > > > Hi
> > > > >
> > > > > What is the significance of this property?
> > > > > I think because of this property almost 100 files are being created within
> > > > > a particular rolling interval instead of 1.
> > > > >
> > > > > If I set it to 1, what performance penalty might it cause?
> > > > >
> > > > > Regards,
> > > > > Jagadish
> > > > 
> > > > 
> > > > 
> > > > --
> > > > Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/
> > > 
> > 
> 

