hadoop-hdfs-issues mailing list archives

From "Robert Kanter (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-9782) RollingFileSystemSink should have configurable roll interval
Date Tue, 23 Feb 2016 21:56:18 GMT

    [ https://issues.apache.org/jira/browse/HDFS-9782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15159721#comment-15159721 ]

Robert Kanter commented on HDFS-9782:

Looks good overall.  A few things:
- Typo: "...some variance *is* the roll times..."
- If the idea here is to prevent attacking HDFS with everyone rolling at the same time, I
think the default value should not be 0.  That basically negates what we're trying to
do here.
- I'm not sure we should try to conform to HDFS-9821 here at this point.  You have to define
a lot of extra code to handle the parsing.  I imagine HDFS-9821 will eventually make a common
place for these and take care of it transparently to the code that's actually using the config
property's value; we don't want a bunch of different implementations of this.  Instead, it
sounds like you should be able to make this "compatible" by naming the key {{roll-offset-interval}}
instead of {{roll-offset-interval-millis}}.
- On a slow system or with some other delay, this could easily cause the test to be flaky:
    int count = 0;

    // Sleep until the flusher has run
    while (!RollingFileSystemSink.hasFlushed) {

      if (++count > 15) {
        fail("Flush thread did not run within 1.5 seconds");

> RollingFileSystemSink should have configurable roll interval
> ------------------------------------------------------------
>                 Key: HDFS-9782
>                 URL: https://issues.apache.org/jira/browse/HDFS-9782
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Daniel Templeton
>            Assignee: Daniel Templeton
>         Attachments: HDFS-9782.001.patch, HDFS-9782.002.patch, HDFS-9782.003.patch
> Right now it defaults to rolling at the top of every hour.  Instead, that interval should
> be configurable.  The interval should also allow for some play so that all hosts don't try
> to flush their files simultaneously.
> I'm filing this in HDFS because I suspect it will involve touching the HDFS tests.  If
> it turns out not to, I'll move it into common instead.
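
The "play" described in the issue can be pictured as a random offset added to each roll boundary, so that hosts sharing the same interval do not all roll at the same instant. The sketch below is only an illustration under stated assumptions, not the implementation in the attached patches: the class RollSchedule, the method nextRollMillis, and the parameter names are hypothetical, and it assumes roll boundaries are aligned to multiples of the interval since the epoch.

```java
import java.util.Random;

// Hypothetical illustration, not code from the HDFS-9782 patches.
class RollSchedule {
  /**
   * Returns the time of the next roll: the next interval boundary after
   * nowMillis plus a random offset in [0, maxOffsetMillis).
   * A maxOffsetMillis of 0 disables the offset, so every host rolls
   * exactly on the boundary.
   */
  static long nextRollMillis(long nowMillis, long intervalMillis,
      long maxOffsetMillis, Random random) {
    if (intervalMillis <= 0) {
      throw new IllegalArgumentException("interval must be positive");
    }

    // Assumes boundaries fall on multiples of the interval since the epoch.
    long nextBoundary = (nowMillis / intervalMillis + 1) * intervalMillis;

    // nextDouble() is in [0, 1), so the offset stays below maxOffsetMillis.
    long offset = maxOffsetMillis > 0
        ? (long) (random.nextDouble() * maxOffsetMillis)
        : 0L;

    return nextBoundary + offset;
  }
}
```

With an interval of one hour and a maximum offset of 30 seconds, for example, each host would roll somewhere in the first 30 seconds after the top of the hour. With an offset of 0, all hosts roll together, which is the situation the offset is meant to avoid.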
