hadoop-yarn-issues mailing list archives

From "Atul Sikaria (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (YARN-5915) ATS 1.5 FileSystemTimelineWriter causes flush() to be called after every event write
Date Sat, 19 Nov 2016 03:13:58 GMT

    [ https://issues.apache.org/jira/browse/YARN-5915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15678475#comment-15678475 ]

Atul Sikaria edited comment on YARN-5915 at 11/19/16 3:13 AM:
--------------------------------------------------------------

This was seen previously as well, in YARN-4814. 

The issue is with the writeEntities method in FileSystemTimelineWriter (https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/api/impl/FileSystemTimelineWriter.java#L317).
It calls getObjectMapper().writeValue(…), which calls flush() after every write under the
default configuration.

{noformat} 
@Override
public void writeValue(JsonGenerator jgen, Object value)
    throws IOException, JsonGenerationException, JsonMappingException
{
    SerializationConfig config = copySerializationConfig();
    if (config.isEnabled(SerializationConfig.Feature.CLOSE_CLOSEABLE) && (value instanceof Closeable)) {
        _writeCloseableValue(jgen, value, config);
    } else {
        _serializerProvider.serializeValue(config, jgen, value, _serializerFactory);
        if (config.isEnabled(SerializationConfig.Feature.FLUSH_AFTER_WRITE_VALUE)) {
            jgen.flush();
        }
    }
}
{noformat} 

On filesystems that map flush() to a no-op or a trivial operation, this is not a big deal. But
on filesystems where flush() is expensive, the per-write flush becomes a bottleneck for the
flow of timeline events.

The fix is to set the feature shown above (FLUSH_AFTER_WRITE_VALUE) to false, so the JsonGenerator
does not flush after every JSON write.
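
To make the effect of that setting concrete, here is a minimal sketch that uses only the JDK and does not touch Jackson or Hadoop. FlushCountingStream and FlushAfterWriteDemo are hypothetical names invented for this illustration, and the flushAfterWrite flag merely stands in for FLUSH_AFTER_WRITE_VALUE. With the Jackson 1.x API quoted above, the real change would presumably be a call along the lines of configure(SerializationConfig.Feature.FLUSH_AFTER_WRITE_VALUE, false) on the ObjectMapper; the actual patch may differ.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for a filesystem stream on which flush() is costly: it counts flush() calls. */
class FlushCountingStream extends FilterOutputStream {
    int flushCount = 0;

    FlushCountingStream(OutputStream out) {
        super(out);
    }

    @Override
    public void flush() throws IOException {
        flushCount++;
        super.flush();
    }
}

class FlushAfterWriteDemo {
    /**
     * Writes the given number of JSON-like records and returns how many
     * flush() calls reached the underlying stream. The flushAfterWrite flag
     * models FLUSH_AFTER_WRITE_VALUE; it is not the Jackson setting itself.
     */
    static int writeEvents(int events, boolean flushAfterWrite) {
        FlushCountingStream sink = new FlushCountingStream(new ByteArrayOutputStream());
        try {
            for (int i = 0; i < events; i++) {
                sink.write(("{\"event\":" + i + "}\n").getBytes(StandardCharsets.UTF_8));
                if (flushAfterWrite) {
                    sink.flush(); // one flush per event, as with the default config
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.flushCount;
    }

    public static void main(String[] args) {
        System.out.println("flush after every write: " + writeEvents(1000, true));
        System.out.println("flush disabled:          " + writeEvents(1000, false));
    }
}
```

In this model the number of flush() calls grows linearly with the number of events when the flag is on and drops to zero when it is off, which is why a separate periodic flush is then needed.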

The stream itself is flushed by a timer thread at a configurable interval (10 seconds by
default). As [~jlowe] pointed out in YARN-4814, the timer thread also needs to call flush()
on the JsonGenerator, to make sure the JSON serializer does not hold any buffered data, so that
the hflush() in the timer thread actually flushes all the data written so far.
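
The ordering concern can be sketched with JDK classes alone. This is an illustration under stated assumptions, not the FileSystemTimelineWriter code: a BufferedWriter stands in for the JsonGenerator, a ByteArrayOutputStream stands in for the filesystem stream, and its flush() stands in for hflush(). TimerFlushDemo and visibleBytes are hypothetical names. The method body shows what a single timer tick would do.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class TimerFlushDemo {
    /**
     * Writes three small records through a buffering writer, performs one
     * simulated timer tick, and returns how many bytes reached the stream.
     */
    static int visibleBytes(boolean flushGeneratorFirst) {
        ByteArrayOutputStream stream = new ByteArrayOutputStream();
        // Stand-in for the JsonGenerator: it buffers output internally.
        BufferedWriter generator =
            new BufferedWriter(new OutputStreamWriter(stream, StandardCharsets.UTF_8));
        try {
            for (int i = 0; i < 3; i++) {
                generator.write("{\"event\":" + i + "}\n");
            }
            // Simulated timer tick.
            if (flushGeneratorFirst) {
                generator.flush(); // push serializer-side buffers into the stream
            }
            stream.flush(); // stand-in for hflush() on the filesystem stream
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return stream.size();
    }

    public static void main(String[] args) {
        System.out.println("stream flush only:       " + visibleBytes(false) + " bytes");
        System.out.println("generator flush first:   " + visibleBytes(true) + " bytes");
    }
}
```

Flushing only the stream leaves the records sitting in the writer's buffer, so nothing reaches the stream; flushing the generator first makes all three records available to the stream flush. In real code the tick would run on a timer, for example via a ScheduledExecutorService, at the configured interval.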



> ATS 1.5 FileSystemTimelineWriter causes flush() to be called after every event write
> ------------------------------------------------------------------------------------
>
>                 Key: YARN-5915
>                 URL: https://issues.apache.org/jira/browse/YARN-5915
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: timelineserver
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Atul Sikaria
>



