flume-dev mailing list archives

From Denes Arvay <de...@cloudera.com>
Subject Re: Review Request 50378: FLUME-2960: Support Wildcards in directoryname in TaildirSource
Date Tue, 28 Feb 2017 15:33:51 GMT


> On July 29, 2016, 4:49 p.m., Attila Simon wrote:
> > flume-ng-sources/flume-taildir-source/src/main/java/org/apache/flume/source/taildir/TaildirMatcher.java, lines 240-242
> > <https://reviews.apache.org/r/50378/diff/4/?file=1457767#file1457767line240>
> >
> >     performance downgrade due to the idempotent instantiations of matchers
> 
> qiao wen wrote:
>     I agree with you. But is there any good idea?
> 
> eskrm wrote:
>     Extract the matchers out to the enclosing class and finalize?

I second this. Extracting the matchers avoids creating a new instance and re-parsing the
pattern on every call. Both matchers can be moved to fields of the `TaildirMatcher` class.
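
To illustrate the extraction, here is a minimal sketch, not the code from the patch. The
class name `HoistedMatchers` and its constructor arguments are hypothetical, and it assumes
both patterns use glob syntax and a Unix-style default file system. The matchers are built
once in the constructor and reused on every check.

```java
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;

// Hypothetical stand-in for the matcher-holding class; not the real TaildirMatcher.
final class HoistedMatchers {
    // Created once per instance, so the patterns are parsed only once.
    private final PathMatcher dirMatcher;
    private final PathMatcher fileMatcher;

    HoistedMatchers(String dirGlob, String fileGlob) {
        FileSystem fs = FileSystems.getDefault();
        this.dirMatcher = fs.getPathMatcher("glob:" + dirGlob);
        this.fileMatcher = fs.getPathMatcher("glob:" + fileGlob);
    }

    // True when the parent directory and the file name both match their patterns.
    boolean matches(Path file) {
        Path parent = file.getParent();
        Path name = file.getFileName();
        return parent != null && name != null
                && dirMatcher.matches(parent)
                && fileMatcher.matches(name);
    }
}
```

With `new HoistedMatchers("/app/*", "log.*")`, repeated calls to `matches` reuse the same two
`PathMatcher` instances instead of asking the file system for new ones each time.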


> On July 29, 2016, 4:49 p.m., Attila Simon wrote:
> > flume-ng-sources/flume-taildir-source/src/main/java/org/apache/flume/source/taildir/TaildirMatcher.java, lines 241-251
> > <https://reviews.apache.org/r/50378/diff/4/?file=1457767#file1457767line241>
> >
> >     Missing short circuit on surely not matching huge subdirs. You need to override
> >     the preVisitDirectory for that. (Also I would recommend overriding visitFileFailed
> >     for robustness)
> 
> qiao wen wrote:
>     I add overriding visitFileFailed. But I don't override preVisitDirectory. @Denes
>     said "it's not needed to implement this method, the super implementation is basically
>     the same (does some not-null checks and returns CONTINUE)".
> 
> Denes Arvay wrote:
>     Your original implementation of `preVisitDirectory` simply returned `CONTINUE`, which
>     is the same as the super implementation. But I didn't take into account the possible
>     optimization so I agree with Attila that it would be good to short circuit if possible.

This deserves a bit more thought. It looks possible to optimize the walk by implementing
`preVisitDirectory` and using the `dirMatcher` to check whether any of the children of the
given directory could still match. It may be necessary to split the `dirMatcher` pattern on
the directory separator character to do this.
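
A rough sketch of that idea follows, under stated assumptions rather than as the patch's
implementation. The class `PruningWalk` and its methods are hypothetical. It assumes the
directory pattern is a simple glob relative to a fixed root, split on `/`, with no `**` and
no brace groups containing a separator. Each directory is checked segment by segment, and
`SKIP_SUBTREE` is returned as soon as a segment cannot match.

```java
import java.io.IOException;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; names are illustrative and not taken from the patch.
final class PruningWalk {
    // One matcher per path segment of the directory pattern, e.g. "*/logs" -> ["*", "logs"].
    static List<PathMatcher> segmentMatchers(String dirGlob) {
        FileSystem fs = FileSystems.getDefault();
        List<PathMatcher> out = new ArrayList<>();
        for (String seg : dirGlob.split("/")) {
            if (!seg.isEmpty()) out.add(fs.getPathMatcher("glob:" + seg));
        }
        return out;
    }

    // Number of path elements between root and dir; 0 for the root itself.
    static int depth(Path root, Path dir) {
        Path rel = root.relativize(dir);
        return rel.toString().isEmpty() ? 0 : rel.getNameCount();
    }

    // False when dir is deeper than the pattern or one of its segments fails to match.
    static boolean mayContainMatch(Path root, Path dir, List<PathMatcher> segments) {
        int d = depth(root, dir);
        if (d > segments.size()) return false;
        Path rel = root.relativize(dir);
        for (int i = 0; i < d; i++) {
            if (!segments.get(i).matches(rel.getName(i))) return false;
        }
        return true;
    }

    // Walks root, pruning subtrees that cannot match, and collects matching files.
    static List<Path> matchingFiles(Path root, String dirGlob, String fileGlob)
            throws IOException {
        List<PathMatcher> segments = segmentMatchers(dirGlob);
        PathMatcher fileMatcher = FileSystems.getDefault().getPathMatcher("glob:" + fileGlob);
        List<Path> found = new ArrayList<>();
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) {
                return mayContainMatch(root, dir, segments)
                        ? FileVisitResult.CONTINUE
                        : FileVisitResult.SKIP_SUBTREE;
            }

            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                // Only files whose parent consumed the whole directory pattern qualify.
                boolean fullDepth = depth(root, file.getParent()) == segments.size();
                if (fullDepth && fileMatcher.matches(file.getFileName())) found.add(file);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFileFailed(Path file, IOException exc) {
                // Unreadable entries are skipped rather than aborting the walk.
                return FileVisitResult.CONTINUE;
            }
        });
        return found;
    }
}
```

For a root of `/app` and a directory pattern of `*/logs`, a directory such as
`/app/dir1/cache` is pruned before any of its children are visited, which is the saving
Attila pointed out for large non-matching subtrees.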


- Denes


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/50378/#review144092
-----------------------------------------------------------


On July 30, 2016, 7:33 a.m., qiao wen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/50378/
> -----------------------------------------------------------
> 
> (Updated July 30, 2016, 7:33 a.m.)
> 
> 
> Review request for Flume.
> 
> 
> Repository: flume-git
> 
> 
> Description
> -------
> 
> In our log management project, we want to track many log files like this:
> /app/dir1/log.*
> /app/dir2/log.*
> ...
> /app/dirn/log.*
> But TaildirSource can't support wildcards in filegroup directory name. The following
> config is expected:
> a1.sources.r1.filegroups.fg = /app/*/log.*
> 
> 
> Diffs
> -----
> 
>   flume-ng-doc/sphinx/FlumeUserGuide.rst 3f08d8b 
>   flume-ng-sources/flume-taildir-source/src/main/java/org/apache/flume/source/taildir/TaildirMatcher.java ad9f720 
>   flume-ng-sources/flume-taildir-source/src/test/java/org/apache/flume/source/taildir/TestTaildirMatcher.java c341054 
>   flume-ng-sources/flume-taildir-source/src/test/java/org/apache/flume/source/taildir/TestTaildirSource.java 097ee0b 
> 
> Diff: https://reviews.apache.org/r/50378/diff/
> 
> 
> Testing
> -------
> 
> All tests in TestTaildirSource passed.
> 
> 
> Thanks,
> 
> qiao wen
> 
>

