hadoop-yarn-issues mailing list archives

From "Junping Du (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3134) [Storage implementation] Exploiting the option of using Phoenix to access HBase backend
Date Mon, 27 Apr 2015 16:22:40 GMT

    [ https://issues.apache.org/jira/browse/YARN-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14514388#comment-14514388
] 

Junping Du commented on YARN-3134:
----------------------------------

I would like to raise an important issue about reusing JDBC connections in PhoenixTimelineWriterImpl:
It sounds like we only release/close these JDBC connections when the writer is stopped.
The writer's lifecycle is the same as that of TimelineCollectorManager (in the current design, which
could change as a result of the discussions above), which means it is almost the same as that of the RM or NM. It
also means we don't close/release any JDBC connections during the whole lifecycle of the NM/RM. That
doesn't sound right, as JDBC connections are a pretty expensive and very limited resource
(in the traditional DB case). Phoenix could be better, as the client only serves the local node.
However, it could still be expensive when the number of apps is large, especially for RMTimelineCollectorManager.
 
In addition, caching the connection per thread also sounds problematic: these threads
come from the individual collectors, and we cache them in a HashMap that could live forever. That
could prevent GC of these collectors, even though they should be removed when the application
finishes.
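
One possible way to address both concerns, as a minimal sketch rather than the code in the attached patches: key the cache by application instead of by thread, and close and drop the entry when the application finishes. The class name PerAppConnectionCache, its methods, and the use of a plain AutoCloseable in place of a real Phoenix JDBC connection are all hypothetical and chosen only for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/**
 * Hypothetical per-application connection cache (not part of the YARN-3134 patch).
 * C stands in for a JDBC connection; any AutoCloseable works for this sketch.
 */
class PerAppConnectionCache<C extends AutoCloseable> {
  // Keyed by application id, so no reference to collector threads is retained.
  private final Map<String, C> connections = new ConcurrentHashMap<>();
  private final Supplier<C> factory;

  PerAppConnectionCache(Supplier<C> factory) {
    this.factory = factory;
  }

  /** Returns the cached connection for the app, creating it on first use. */
  C get(String appId) {
    return connections.computeIfAbsent(appId, id -> factory.get());
  }

  /** Called when the application finishes: closes and forgets its connection. */
  void release(String appId) {
    C conn = connections.remove(appId);
    if (conn == null) {
      return; // nothing cached for this app
    }
    try {
      conn.close();
    } catch (Exception e) {
      throw new IllegalStateException("Failed to close connection for " + appId, e);
    }
  }

  /** Number of connections currently held open. */
  int size() {
    return connections.size();
  }
}
```

With something like this, the number of open connections would track the number of running applications rather than growing for the whole lifetime of the NM/RM, and a finished collector would no longer be reachable through the cache.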

> [Storage implementation] Exploiting the option of using Phoenix to access HBase backend
> ---------------------------------------------------------------------------------------
>
>                 Key: YARN-3134
>                 URL: https://issues.apache.org/jira/browse/YARN-3134
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Zhijie Shen
>            Assignee: Li Lu
>         Attachments: YARN-3134-040915_poc.patch, YARN-3134-041015_poc.patch, YARN-3134-041415_poc.patch,
YARN-3134-042115.patch, YARN-3134DataSchema.pdf
>
>
> Quote the introduction on Phoenix web page:
> {code}
> Apache Phoenix is a relational database layer over HBase delivered as a client-embedded
JDBC driver targeting low latency queries over HBase data. Apache Phoenix takes your SQL query,
compiles it into a series of HBase scans, and orchestrates the running of those scans to produce
regular JDBC result sets. The table metadata is stored in an HBase table and versioned, such
that snapshot queries over prior versions will automatically use the correct schema. Direct
use of the HBase API, along with coprocessors and custom filters, results in performance on
the order of milliseconds for small queries, or seconds for tens of millions of rows.
> {code}
> It may simplify how our implementation reads/writes data from/to HBase, and make it easy to build
indexes and compose complex queries.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
