chukwa-dev mailing list archives

From Eric Yang <eric...@gmail.com>
Subject Re: Creating a new adaptor: FileTailingAdaptor that would not cut lines
Date Mon, 22 Apr 2013 04:25:19 GMT
maxReadSize can be increased in the configuration.  If a larger
maxReadSize is preferred, we can raise the default.

regards,
Eric
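
For illustration, the "temporarily increase the read size" idea from the quoted message could be sketched as below. This is only a sketch under assumptions, not the code in LWFTAdaptor.slurp(): the class GrowingReadSketch, the method readAtLeastOneRecord and the parameters initialSize and hardLimit are hypothetical names, and '\n' is assumed to be the record separator.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

public class GrowingReadSketch {

    /**
     * Hypothetical helper: reads from 'in' until at least one '\n' has been
     * seen, starting with a buffer of initialSize bytes and doubling it up to
     * hardLimit. Returns the bytes up to and including the last '\n' found,
     * or null if EOF or hardLimit is reached before any separator.
     * Bytes read after that separator are discarded here; a real adaptor
     * would instead advance its file offset only past the returned bytes.
     */
    static byte[] readAtLeastOneRecord(InputStream in, int initialSize, int hardLimit)
            throws IOException {
        byte[] buf = new byte[initialSize];
        int filled = 0;
        while (true) {
            int n = in.read(buf, filled, buf.length - filled);
            if (n < 0) {
                return null; // EOF: the pending data has no separator yet
            }
            filled += n;
            // Scan backward for the last separator in what has been read so far.
            for (int i = filled - 1; i >= 0; i--) {
                if (buf[i] == '\n') {
                    return Arrays.copyOf(buf, i + 1);
                }
            }
            if (filled == buf.length) {
                if (buf.length >= hardLimit) {
                    return null; // record is larger than we are willing to buffer
                }
                // Grow the buffer for this read only; the next call starts small again.
                int newSize = (int) Math.min((long) buf.length * 2, hardLimit);
                buf = Arrays.copyOf(buf, newSize);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "aaaaaaaaaa\nbb".getBytes("UTF-8");
        byte[] record = readAtLeastOneRecord(new ByteArrayInputStream(data), 4, 64);
        System.out.println(record == null ? "no complete record" : record.length + " bytes");
    }
}
```

The hard limit keeps a file that never contains a separator from growing the buffer without bound; past that limit the sketch gives up, which is where a warning would still be logged.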

On Sun, Apr 21, 2013 at 3:07 PM, Luangsay Sourygna <luangsay@gmail.com> wrote:

> As I said before, I don't think Chukwa should handle those situations,
> since I think this is a "log rotation" problem.
> Personally, I have never seen such a problem (log4j RFA, for instance, has a
> kind of "flexible" size, and every rotated file ended with a \n).
>
> On the other hand, there is a special situation I think Chukwa should take
> care of.
> The default value of the configuration property
> "chukwaAgent.fileTailingAdaptor.maxReadSize" is 128kB, which means that if
> a line/record is bigger than that size, the record won't be sent by the
> agent.
> We'll get a warning in Chukwa's log, but the record will be lost (see the
> LWFTAdaptor.slurp() method).
> In such a case, would it be possible to temporarily increase MAX_READ_SIZE
> so that we are able to send one record on the wire?
>
> Regards,
>
> Sourygna
>
>
>
>
> On Sun, Apr 21, 2013 at 7:05 PM, Eric Yang <eric818@gmail.com> wrote:
>
> > Do we need to consider rotation based on size?  For example, the last line
> > of a log file that reaches 300MB: there is no line break in the first file,
> > but the entry continues into the next rotated log and then has a line feed
> > delimiter.  If we are splitting lines based on \n, then we can reconstruct
> > the full line across the two files.  I am not sure if this case needs to be
> > supported?
> >
> > regards,
> > Eric
> >
> >
> > On Fri, Apr 19, 2013 at 12:01 PM, Luangsay Sourygna <luangsay@gmail.com> wrote:
> >
> > > Well, the log4j socket adaptor may be great if you control the software
> > > that generates the logs.
> > > That is not usually my case: customers don't really like having to
> > > install a Chukwa agent on their production servers, so I don't want to
> > > think about telling them to change the logging system of their software.
> > >
> > > As for partial lines when log files rotate, I don't think this is
> > > something Chukwa should manage (what is more: how could Chukwa be aware
> > > there is a problem?).
> > > In my view, this would be an error of the "logrotate" system. As far as I
> > > know, the RFA and DRFA log4j appenders handle rotation quite well.
> > >
> > > Regards,
> > >
> > > Sourygna
> > >
> > >
> > > On Fri, Apr 19, 2013 at 8:17 AM, Eric Yang <eric818@gmail.com> wrote:
> > >
> > > > I think the best solution is to use the Log4j socket appender and the
> > > > Chukwa log4j socket adaptor to get the full log entry without worrying
> > > > about line feeds.  However, this solution only works with programs
> > > > written in Java, and does not keep a copy of the existing log file on
> > > > disk.
> > > >
> > > > I think your proposal is a good idea for tailing text files so that
> > > > only line-delimited entries are sent.  How do we handle a partial line
> > > > when the log file has rotated?
> > > >
> > > > regards,
> > > > Eric
> > > >
> > > > On Thu, Apr 18, 2013 at 11:33 AM, Luangsay Sourygna <luangsay@gmail.com> wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > FileTailingAdaptor is great for tailing log files and sending them to
> > > > > Hadoop.
> > > > > However, the last line of a chunk is usually cut, which leads to some
> > > > > errors.
> > > > >
> > > > > I know that we can use CharFileTailingAdaptorUTF8 to solve this
> > > > > problem.
> > > > > Nonetheless, this adaptor calls the MapProcessor.process() method for
> > > > > every line in each chunk, which slows the Demux phase considerably.
> > > > >
> > > > > I suggest creating a new adaptor that would combine the benefits of
> > > > > the two adaptors: the (Demux) speed of FileTailingAdaptor and
> > > > > the preservation of lines from CharFileTailingAdaptorUTF8.
> > > > >
> > > > > The implementation of extractRecords() would be:
> > > > > - a "for" loop over the buffer, starting from the end of the buffer
> > > > >   and going backward
> > > > > - if we find a separator, save the offset and exit the loop
> > > > > - the rest of the method would be similar to CharFileTailingAdaptorUTF8.
> > > > >
> > > > > Could you please tell me what you think about it?
> > > > > How do you currently manage the "cut lines" with Chukwa?
> > > > >
> > > > > Regards,
> > > > >
> > > > > Sourygna
> > > > >
> > > >
> > >
> >
>
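
The extractRecords() outline in the original proposal (scan the buffer backward, stop at the first separator found, send everything up to it) could look roughly like the following. This is a standalone sketch, assuming '\n' as the separator; LastSeparatorSketch, lastSeparatorOffset and completeLines are hypothetical names and are not part of Chukwa's adaptor API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class LastSeparatorSketch {

    /** Returns the index of the last separator in buf[0..len), or -1 if there is none. */
    static int lastSeparatorOffset(byte[] buf, int len, byte separator) {
        // Walk backward from the end of the buffer and stop at the first hit.
        for (int i = len - 1; i >= 0; i--) {
            if (buf[i] == separator) {
                return i;
            }
        }
        return -1;
    }

    /**
     * Returns the bytes up to and including the last '\n'. The result is
     * empty when the buffer holds no complete line; the caller would keep
     * the remaining bytes for the next read instead of sending a cut line.
     */
    static byte[] completeLines(byte[] buf, int len) {
        int last = lastSeparatorOffset(buf, len, (byte) '\n');
        return Arrays.copyOfRange(buf, 0, last + 1);
    }

    public static void main(String[] args) {
        byte[] buf = "line one\nline two\npartial li".getBytes(StandardCharsets.UTF_8);
        byte[] chunk = completeLines(buf, buf.length);
        System.out.println(chunk.length + " bytes sent, "
                + (buf.length - chunk.length) + " bytes held back");
    }
}
```

Because the scan stops at the first separator it meets from the end, the cost per read is proportional to the length of the trailing partial line rather than to the number of lines in the chunk, and the whole chunk is still handed over as a single unit.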
