hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sangjin Lee (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3411) [Storage implementation] explore the native HBase write schema for storage
Date Fri, 15 May 2015 00:16:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14544659#comment-14544659
] 

Sangjin Lee commented on YARN-3411:
-----------------------------------

The latest patch looks pretty good. I think we're almost there. I went over it one more time
bit more closely, and have mostly minor comments. Some are performance-related so you might
want to consider them. Hopefully it won't be too difficult to address these...

(EntityColumnDetails.java)
- let's add Private and Unstable to this

(EntityTableDetails.java)
- let's add Private and Unstable to this
- l.27: this also should be non-public?

(HBaseTimelineWriterImpl.java)
- l.88: (nit) you must be sick of me saying it, but let's declare row inside the for loop,
and also use a more standard declaration (byte[] row = ...)
- l.165: it might be a minor optimization, but we might want to reuse bytes for te.getType()
and te.getId() for other calls that need them (e.g. getInfoPut)
- l.234: we should get the bytes for the key *once* outside the inner for loop and reuse it
- l.274: we should get the bytes for the id *once* outside the inner for loop and reuse it

(Range.java)
- let's add Private and Unstable to this

(TimelineSchemaCreator.java)
- l.137: I think we should use LOG.error() as we have the logger
- l.144-147: I'm unsure why we're using the log4j logger when we're using the Apache Commons
logger; is this for HBase debug logging?

(TimelineWriterUtil.java)
- l.53: do we need to handle the case where a component byte array is empty?
- l.53: it's not critical but it could be good to handle the case where components.length
= 1 (just return it)
- l.181,197: what are "taskValues"?

(TestHBaseTimelineWriterImpl.java)
- l.56-60: the javadoc should be before the class name
- l.56: (nit) remove PoC
- l.62: (nit) it should be util (not upper case)
- l.141: (nit) it should call init() (best practice)
- l.147: (nit) should be long, not Long
- l.212: (nit) it should call stop() (best practice)

> [Storage implementation] explore the native HBase write schema for storage
> --------------------------------------------------------------------------
>
>                 Key: YARN-3411
>                 URL: https://issues.apache.org/jira/browse/YARN-3411
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Sangjin Lee
>            Assignee: Vrushali C
>            Priority: Critical
>         Attachments: ATSv2BackendHBaseSchemaproposal.pdf, YARN-3411-YARN-2928.001.patch,
YARN-3411-YARN-2928.002.patch, YARN-3411.poc.2.txt, YARN-3411.poc.3.txt, YARN-3411.poc.4.txt,
YARN-3411.poc.5.txt, YARN-3411.poc.6.txt, YARN-3411.poc.7.txt, YARN-3411.poc.txt
>
>
> There is work that's in progress to implement the storage based on a Phoenix schema (YARN-3134).
> In parallel, we would like to explore an implementation based on a native HBase schema
for the write path. Such a schema does not exclude using Phoenix, especially for reads and
offline queries.
> Once we have basic implementations of both options, we could evaluate them in terms of
performance, scalability, usability, etc. and make a call.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message