flume-dev mailing list archives

From "Attila Simon (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (FLUME-2918) TaildirSource is underperforming with huge parent directories
Date Tue, 31 May 2016 19:27:12 GMT

     [ https://issues.apache.org/jira/browse/FLUME-2918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Attila Simon updated FLUME-2918:
--------------------------------
    Attachment: profiling_after.png

profiling_after.png shows that, with the fix applied, the time spent in getMatchFiles() is
reduced to 3.4% of thread time. The workload is unchanged: new files are occasionally
(about 2 per minute) added to the directory.

> TaildirSource is underperforming with huge parent directories
> -------------------------------------------------------------
>
>                 Key: FLUME-2918
>                 URL: https://issues.apache.org/jira/browse/FLUME-2918
>             Project: Flume
>          Issue Type: Improvement
>          Components: Sinks+Sources
>            Reporter: Attila Simon
>              Labels: performance
>             Fix For: v1.7.0
>
>         Attachments: profiling_after.png, profiling_before.png
>
>
> TaildirSource causes high CPU utilization when a large number of files sit in the
> target directory. The file pattern matches only a single file, but the parent directory
> contains about 50,000 other files.
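
This message does not describe how the fix works. As an illustration only, here is a minimal
sketch of one way a repeated directory scan could be made cheaper: cache the list of matching
files and re-list the parent directory only when its modification time changes. The class
CachedDirMatcher and all of its members are hypothetical names invented for this sketch; they
are not Flume's actual classes or the patch attached to FLUME-2918. The sketch assumes the
file system updates a directory's modification time when entries are added or removed.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of Flume. */
class CachedDirMatcher {
    private final Path parentDir;
    private final Pattern fileNamePattern;
    private FileTime lastSeenDirMTime;              // null until the first scan
    private List<Path> cachedMatches = Collections.emptyList();
    private int scanCount = 0;                      // counts full directory listings

    CachedDirMatcher(Path parentDir, String fileNameRegex) {
        this.parentDir = parentDir;
        this.fileNamePattern = Pattern.compile(fileNameRegex);
    }

    /** Returns matching files, listing the directory only if its mtime changed. */
    synchronized List<Path> getMatchingFiles() throws IOException {
        // Read the mtime before scanning, so a change made during the scan
        // is still detected on the next call.
        FileTime current = Files.getLastModifiedTime(parentDir);
        if (lastSeenDirMTime == null || !current.equals(lastSeenDirMTime)) {
            cachedMatches = scan();
            lastSeenDirMTime = current;
        }
        return cachedMatches;
    }

    /** Full scan: visits every entry of the parent directory once. */
    private List<Path> scan() throws IOException {
        scanCount++;
        List<Path> result = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(parentDir)) {
            for (Path entry : stream) {
                String name = entry.getFileName().toString();
                if (fileNamePattern.matcher(name).matches() && Files.isRegularFile(entry)) {
                    result.add(entry);
                }
            }
        }
        Collections.sort(result);
        return Collections.unmodifiableList(result);
    }

    int getScanCount() {
        return scanCount;
    }
}
```

With a cache like this, a polling loop that calls getMatchingFiles() repeatedly pays for a full
listing only when the directory actually changes. One caveat of the sketch: on file systems with
coarse timestamp granularity, a file created in the same timestamp tick as the last scan could be
missed until the next change, so a production version would need to handle that case.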



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
