hadoop-yarn-issues mailing list archives

From "Vrushali C (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3411) [Storage implementation] explore the native HBase write schema for storage
Date Thu, 21 May 2015 15:36:19 GMT

    [ https://issues.apache.org/jira/browse/YARN-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14554532#comment-14554532 ]

Vrushali C commented on YARN-3411:
----------------------------------

Hi [~djp],
Thanks, I appreciate your review very much!
bq. Given all columns here belongs to INFO C_F, we can omit the parameter of EntityColumnFamily.INFO.
Also, rename EntityColumnDetails to EntityInfoColumnFamilyDetails could sound more clear.

I actually had it named EntityInfoColumnDetails in the previous patches. While discussing
the patches with [~jrottinghuis], we thought it might be better to give it a more generic
name so that, going forward, if we need to add more columns in other column families, we
can reuse some of this structure. [~jrottinghuis] also had some really good ideas on how
to make things very generic across tables, but in the interest of time, this was not done in
this patch.

bq. This is a private constructor and all callers are under control to make sure value is
already lower case. So toLowerCase() is not necessary (the same for EntityColumnFamily).

Yes, the values being initialized right now are lower case, but going forward, if someone adds
an upper-case or sentence-case value, we may not notice it. When storing in HBase,
we need to be careful about the column names, since we need to query for that exact name; hence
the explicit lower casing in the constructor.
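
A minimal sketch of this defensive lower casing follows. The enum name, its constants, and the accessor methods are hypothetical and are not taken from the patch; the sketch only illustrates normalizing the name once in the constructor so that the write path and the read path always see the same column name.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical enum for illustration only; not the class from the patch.
enum ExampleColumn {
  ID("id"),
  // Deliberately mixed case, standing in for a value added carelessly later.
  CREATED_TIME("Created_Time");

  private final String name;

  ExampleColumn(String name) {
    // Normalize once here so every caller gets the same column name,
    // regardless of how the constant was declared.
    this.name = name.toLowerCase(Locale.ROOT);
  }

  String getName() {
    return name;
  }

  // Byte form of the normalized name, as it would be used for a column qualifier.
  byte[] getBytes() {
    return name.getBytes(StandardCharsets.UTF_8);
  }
}
```

With this in place, ExampleColumn.CREATED_TIME.getName() yields "created_time" even though the constant was declared in mixed case, so a query for the lower-case name still matches what was written.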

bq. Do we try to add a new configuration "yarn.application.id" here? It doesn't sounds right
and HBaseTimelineWriterImpl.class.getName() doesn't sounds like a default value for this configuration.
Am I missing anything?

I didn't have a good default string to pass in, so I picked a value, but I am open to initializing
it to something more appropriate. Is there a general recommendation for this? If so, I can change
it to that.

Thanks,
Vrushali

> [Storage implementation] explore the native HBase write schema for storage
> --------------------------------------------------------------------------
>
>                 Key: YARN-3411
>                 URL: https://issues.apache.org/jira/browse/YARN-3411
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Sangjin Lee
>            Assignee: Vrushali C
>            Priority: Critical
>         Attachments: ATSv2BackendHBaseSchemaproposal.pdf, YARN-3411-YARN-2928.001.patch,
> YARN-3411-YARN-2928.002.patch, YARN-3411-YARN-2928.003.patch, YARN-3411-YARN-2928.004.patch,
> YARN-3411-YARN-2928.005.patch, YARN-3411-YARN-2928.006.patch, YARN-3411-YARN-2928.007.patch,
> YARN-3411.poc.2.txt, YARN-3411.poc.3.txt, YARN-3411.poc.4.txt, YARN-3411.poc.5.txt, YARN-3411.poc.6.txt,
> YARN-3411.poc.7.txt, YARN-3411.poc.txt
>
>
> There is work that's in progress to implement the storage based on a Phoenix schema (YARN-3134).
> In parallel, we would like to explore an implementation based on a native HBase schema
> for the write path. Such a schema does not exclude using Phoenix, especially for reads and
> offline queries.
> Once we have basic implementations of both options, we could evaluate them in terms of
> performance, scalability, usability, etc. and make a call.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
