ambari-issues mailing list archives

From "Hemanth Yamijala (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AMBARI-17785) Provide support for S3 as a first class destination for log events
Date Tue, 19 Jul 2016 08:05:20 GMT

    [ https://issues.apache.org/jira/browse/AMBARI-17785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15383754#comment-15383754 ]

Hemanth Yamijala commented on AMBARI-17785:
-------------------------------------------

One approach I am trying out is to support {{S3OutputFile.write}} (note that AMBARI-17045
made this unsupported). In this approach, events are spooled temporarily to a local file. Once
a certain threshold is reached, the spool file is rolled over and writing continues in a new
file. The threshold is currently based on the number of events, but it could be based on other
criteria such as the time elapsed since the last upload or the size of the file. Meanwhile, the
rolled-over file is compressed and uploaded to S3 in a separate thread. Upon successful upload,
the spool file is deleted.
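
The spool, roll over, compress, upload, delete cycle described above could look roughly like the
following. This is only a minimal sketch in Python, not the logfeeder code: the class name
SpoolingWriter, the max_events parameter, and the upload callable are all hypothetical, and the
callable merely stands in for whatever S3 client call would actually be used.

```python
import gzip
import os
import shutil
from concurrent.futures import ThreadPoolExecutor


class SpoolingWriter:
    """Hypothetical sketch: spool events locally, upload rolled-over files."""

    def __init__(self, spool_dir, upload, max_events=1000):
        self.spool_dir = spool_dir
        self.upload = upload          # assumed callable: upload(path), raises on failure
        self.max_events = max_events  # rollover threshold, counted in events
        # A single worker keeps uploads off the writing thread and in order.
        self.executor = ThreadPoolExecutor(max_workers=1)
        self.pending = []
        self.sequence = 0
        self._open_new_spool_file()

    def _open_new_spool_file(self):
        self.sequence += 1
        self.path = os.path.join(self.spool_dir, f"spool-{self.sequence}.log")
        self.file = open(self.path, "w", encoding="utf-8")
        self.count = 0

    def write(self, event):
        self.file.write(event + "\n")
        self.count += 1
        if self.count >= self.max_events:
            self.roll_over()

    def roll_over(self):
        if self.count == 0:
            return  # nothing spooled, nothing to upload
        self.file.close()
        rolled = self.path
        self._open_new_spool_file()  # writing continues in a new file
        self.pending.append(self.executor.submit(self._compress_and_upload, rolled))

    def _compress_and_upload(self, path):
        gz_path = path + ".gz"
        with open(path, "rb") as src, gzip.open(gz_path, "wb") as dst:
            shutil.copyfileobj(src, dst)
        # If the upload raises, the spool file stays on disk and can be retried.
        self.upload(gz_path)
        os.remove(path)
        os.remove(gz_path)

    def close(self):
        self.roll_over()              # flush any partially filled spool file
        self.file.close()
        os.remove(self.path)          # the current spool file is empty at this point
        self.executor.shutdown(wait=True)
        for future in self.pending:
            future.result()           # re-raise any upload error
```

With max_events=2, writing five events and then calling close() would hand three compressed files
to the upload callable, the last holding a single event. A time-based or size-based threshold
would only change the condition checked in write().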

> Provide support for S3 as a first class destination for log events
> ------------------------------------------------------------------
>
>                 Key: AMBARI-17785
>                 URL: https://issues.apache.org/jira/browse/AMBARI-17785
>             Project: Ambari
>          Issue Type: Improvement
>          Components: ambari-logsearch
>            Reporter: Hemanth Yamijala
>
> AMBARI-17045 added support for uploading Hadoop service logs from machines to S3. The
> intended usage there was as a one-time trigger where, on demand, the log files matching certain
> paths can be uploaded to a given S3 bucket and path.
> While useful, there are some use cases where we might need more than this one-time activity,
> particularly when clusters are deployed on ephemeral machines such as cloud instances:
> * The machines running the logfeeder could be irrevocably lost, and in that case we would
> not be able to retrieve any logs.
> * If we copy logs that were generated over a long period of time all at once,
> the time to copy all the logs at the end could extend cluster up-time and cost.
> It would be nice to be able to support S3 as another output destination in logsearch,
> just like Kafka, Solr etc. This JIRA is to track work towards this enhancement.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
