hadoop-common-issues mailing list archives

From "Daniel Templeton (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-12702) Add an HDFS metrics sink
Date Wed, 27 Jan 2016 14:37:40 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-12702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Templeton updated HADOOP-12702:
--------------------------------------
    Attachment: HADOOP-12702.004.patch

bq. Given the sink we are adding here has some quirks to its behavior - new directory every
hour etc., the class name FileSystemSink seems too simple. Can we capture more of the behavior
in the name?

How about RollingFileSystemSink?

bq. checkAppend: If appending throws an IOE that is not because of not being supported, should
we allow appending? I would think not.

Actually, yes.  If the operation isn't supported, the method contains nothing but the exception
throw.  If we see anything other than that exception, the operation is supported and just
not working at the moment.  Could be transient, could be persistent.  Either way, the sink
should keep trying.
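
To make that distinction concrete, here is a minimal sketch of the probe logic, not the code from the patch. The {{Appender}} interface and {{AppendProbe}} class are hypothetical stand-ins for the real file system call, and using {{UnsupportedOperationException}} as the "not supported" signal is an assumption; the actual signal depends on the file system implementation.

```java
import java.io.IOException;

class AppendProbe {
    /** Hypothetical stand-in for a file system's append call. */
    interface Appender {
        void append(String path) throws IOException;
    }

    /**
     * Returns false only when append is definitively unsupported.
     * Any other IOException means the operation exists but is failing
     * right now, so the caller should keep trying to append.
     */
    static boolean appendSupported(Appender fs, String path) {
        try {
            fs.append(path);
            return true;
        } catch (UnsupportedOperationException e) {
            // Assumption: an unsupported append is signalled this way.
            return false;
        } catch (IOException e) {
            // Supported but not working at the moment (transient or persistent).
            return true;
        }
    }
}
```

The point of the second catch is that a failure other than "not supported" is not evidence against append support, so it should not switch the sink away from appending.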

bq. rollLogDirIfNeeded: For readability, should we split it into two ifs - the first is when
the directories don't match. Also, the comment in the method is wrongly indented and slightly
confusing.

Hmmm...  I'm not sure that actually improves readability.  I really dislike chained logic
like that.  I find it hard to follow.

I'll take this point as, "gee that's a long/busy method..."  I have a different idea for how
to clean it up.

bq. putMetrics: When throwing MetricsException, no need for a new line between setting the
message and actually throwing the exception. Also, should just have a method that takes a
message (String) and throws an exception if ignore error is not turned on. The only downside
would be the intern objects for the strings here.

Fair enough.  On the blank line, I always put a blank line before a throw because it's kinda
a big deal and should jump out at the reader.  If it really annoys you, tell me. Otherwise
I'm leaving it in.
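
For reference, the suggested helper could look roughly like the sketch below. This is an illustration only: {{ErrorPolicy}}, {{SinkException}} and {{throwUnlessIgnored}} are hypothetical names, with {{SinkException}} standing in for {{MetricsException}}, and the {{ignoreError}} flag is assumed to come from the sink's configuration.

```java
class ErrorPolicy {
    /** Hypothetical stand-in for the sink's MetricsException. */
    static class SinkException extends RuntimeException {
        SinkException(String message) {
            super(message);
        }
    }

    // Assumption: set once from the sink's "ignore error" configuration.
    private final boolean ignoreError;

    ErrorPolicy(boolean ignoreError) {
        this.ignoreError = ignoreError;
    }

    /** Throws with the given message unless errors are being ignored. */
    void throwUnlessIgnored(String message) {
        if (!ignoreError) {

            throw new SinkException(message);
        }
    }
}
```

Each failure site in {{putMetrics()}} would then reduce to a single call with its own message, which keeps the ignore-error check in one place.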

> Add an HDFS metrics sink
> ------------------------
>
>                 Key: HADOOP-12702
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12702
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: metrics
>    Affects Versions: 2.7.1
>            Reporter: Daniel Templeton
>            Assignee: Daniel Templeton
>         Attachments: HADOOP-12702.001.patch, HADOOP-12702.002.patch, HADOOP-12702.003.patch, HADOOP-12702.004.patch
>
>
> We need a metrics2 sink that can write metrics to HDFS. The sink should accept as configuration a "directory prefix" and do the following in {{putMetrics()}}
> * Get yyyyMMddHH from current timestamp.
> * If HDFS dir "dir prefix" + yyyyMMddHH doesn't exist, create it. Close any currently open file and create a new file called <hostname>.log in the new directory.
> * Write metrics to the current log file.
> * If a write fails, it should be fatal to the process running the sink.
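
The path handling in the steps above can be sketched without any HDFS dependency. This is an illustration under stated assumptions, not the patch itself: {{RollingPath}} and its methods are hypothetical names, the hour is computed in UTC (the description does not name a time zone), and the prefix is concatenated directly with the timestamp as the description words it. Directory creation, file handling, and the fatal-on-write-failure behavior are left out.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

class RollingPath {
    // Assumption: hours are computed in UTC.
    private static final DateTimeFormatter HOUR =
        DateTimeFormatter.ofPattern("yyyyMMddHH").withZone(ZoneOffset.UTC);

    private final String prefix;
    private final String hostname;
    private String currentDir; // null until the first roll

    RollingPath(String prefix, String hostname) {
        this.prefix = prefix;
        this.hostname = hostname;
    }

    /** "dir prefix" + yyyyMMddHH for the given timestamp. */
    static String dirFor(String prefix, long epochMillis) {
        return prefix + HOUR.format(Instant.ofEpochMilli(epochMillis));
    }

    /**
     * Switches to the directory for the given time if it differs from the
     * current one. Returns true when a roll happened, which is the point
     * where a real sink would close the open file and create a new one.
     */
    boolean rollIfNeeded(long epochMillis) {
        String dir = dirFor(prefix, epochMillis);
        if (dir.equals(currentDir)) {
            return false;
        }
        currentDir = dir;
        return true;
    }

    /** Path of the <hostname>.log file in the current directory. */
    String currentFile() {
        if (currentDir == null) {
            throw new IllegalStateException("rollIfNeeded has not been called");
        }
        return currentDir + "/" + hostname + ".log";
    }
}
```

A sink built on this would call {{rollIfNeeded}} at the start of each {{putMetrics()}} and reopen its output stream whenever the call returns true.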



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
