hadoop-yarn-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3448) Add Rolling Time To Lives Level DB Plugin Capabilities
Date Thu, 16 Apr 2015 18:39:01 GMT

    [ https://issues.apache.org/jira/browse/YARN-3448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14498480#comment-14498480
] 

Hadoop QA commented on YARN-3448:
---------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12725932/YARN-3448.10.patch
  against trunk revision 1fa8075.

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:green}+1 tests included{color}.  The patch appears to include 4 new or modified
test files.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of
javac compiler warnings.

    {color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

    {color:green}+1 eclipse:eclipse{color}.  The patch built with eclipse:eclipse.

    {color:red}-1 findbugs{color}.  The patch appears to introduce 1 new Findbugs (version
2.0.3) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number
of release audit warnings.

    {color:green}+1 core tests{color}.  The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7361//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/7361//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-applicationhistoryservice.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7361//console

This message is automatically generated.

> Add Rolling Time To Lives Level DB Plugin Capabilities
> ------------------------------------------------------
>
>                 Key: YARN-3448
>                 URL: https://issues.apache.org/jira/browse/YARN-3448
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Jonathan Eagles
>            Assignee: Jonathan Eagles
>         Attachments: YARN-3448.1.patch, YARN-3448.10.patch, YARN-3448.2.patch, YARN-3448.3.patch,
YARN-3448.4.patch, YARN-3448.5.patch, YARN-3448.7.patch, YARN-3448.8.patch, YARN-3448.9.patch
>
>
> For large applications, the majority of the time in LeveldbTimelineStore is spent deleting
old entities one record at a time. An exclusive write lock is held during the entire deletion
phase, which in practice can last hours. If we are to relax some of the consistency constraints,
other performance-enhancing techniques can be employed to maximize the throughput and minimize
locking time.
> Split the 5 sections of the leveldb database (domain, owner, start time, entity, index)
into 5 separate databases. This allows each database to maximize the read cache effectiveness
based on the unique usage patterns of each database. With 5 separate databases each lookup
is much faster. Placing the entity and index databases on separate disks can also help with
I/O.
> Rolling DBs for entity and index DBs. 99.9% of the data is in these two sections, in a 4:1
ratio (index to entity), at least for Tez. We replace DB record removal with file system removal
if we create a rolling set of databases that age out and can be efficiently removed. To do
this we must place a constraint to always place an entity's events into its correct rolling
db instance based on start time. This allows us to stitch the data back together while
reading and to apply artificial paging.
> Relax the synchronous writes constraints. If we are willing to accept losing some records
that were not flushed by the operating system during a crash, we can use async writes that can
be much faster.
> Prefer sequential writes. Sequential writes can be several times faster than random writes.
Spend some small effort arranging the writes in such a way that will trend towards sequential
write performance over random write performance.
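
The rolling-database idea in the description can be sketched roughly as follows. This is a minimal illustration only, not code from the YARN-3448 patch: the class name RollingBuckets, the "entity-" directory naming, and the fixed-length period are all assumptions made for the example. It maps an entity's start time to the rolling instance that owns it, and reports which instances have aged out entirely so they could be dropped with a file system removal instead of record-by-record deletes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical sketch of rolling database selection; not the patch's classes. */
class RollingBuckets {
    private final long periodMs; // length of one rolling period (assumed fixed)
    private final long ttlMs;    // how long data must remain readable
    // Start of rolling period -> name of the database directory for that period.
    private final TreeMap<Long, String> buckets = new TreeMap<>();

    RollingBuckets(long periodMs, long ttlMs) {
        this.periodMs = periodMs;
        this.ttlMs = ttlMs;
    }

    /** Rounds a start time down to the beginning of its rolling period. */
    long bucketStart(long entityStartTimeMs) {
        return Math.floorDiv(entityStartTimeMs, periodMs) * periodMs;
    }

    /** Returns the rolling instance that owns an entity, creating it on first use. */
    String bucketFor(long entityStartTimeMs) {
        long start = bucketStart(entityStartTimeMs);
        return buckets.computeIfAbsent(start, s -> "entity-" + s);
    }

    /**
     * Removes and returns every instance whose whole period ended at or before
     * (now - ttl). A real store would delete these directories from disk.
     */
    List<String> evictExpired(long nowMs) {
        long cutoff = nowMs - ttlMs;
        // A bucket ends at start + periodMs, so it is expired when start <= cutoff - periodMs.
        NavigableMap<Long, String> expired = buckets.headMap(cutoff - periodMs, true);
        List<String> removed = new ArrayList<>(expired.values());
        expired.clear(); // headMap is a view, so this also removes from buckets
        return removed;
    }

    int liveBuckets() {
        return buckets.size();
    }
}
```

Because every event for an entity is routed by the entity's start time rather than the event's own timestamp, a reader only has to consult one rolling instance per entity, and eviction never splits an entity across a removed and a live instance.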

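One possible reading of "arranging the writes" is to buffer pending records and hand them to the store in key order. The sketch below assumes that reading and is not taken from the patch: SortedWriteBuffer is a hypothetical helper, and it uses unsigned bytewise ordering to match the order of LevelDB's default comparator. It requires Java 9 or later for Arrays.compareUnsigned.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper that releases buffered writes in bytewise key order. */
class SortedWriteBuffer {
    // Unsigned lexicographic comparison of byte arrays (Java 9+).
    private static final Comparator<byte[]> BYTEWISE = Arrays::compareUnsigned;

    private final TreeMap<byte[], byte[]> pending = new TreeMap<>(BYTEWISE);

    /** Buffers a write; a later put for the same key replaces the earlier one. */
    void put(byte[] key, byte[] value) {
        pending.put(key.clone(), value.clone());
    }

    /** Returns the buffered writes in ascending key order and empties the buffer. */
    List<Map.Entry<byte[], byte[]>> drain() {
        List<Map.Entry<byte[], byte[]>> ordered = new ArrayList<>(pending.size());
        for (Map.Entry<byte[], byte[]> e : pending.entrySet()) {
            ordered.add(new AbstractMap.SimpleImmutableEntry<>(e.getKey(), e.getValue()));
        }
        pending.clear();
        return ordered;
    }
}
```

A caller would drain the buffer periodically and apply the returned entries to the database in list order, for example as one batch per drain.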


--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
