drill-issues mailing list archives

From "Aditya Kishore (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-1414) Move profile storage to DFS rather than using PStore
Date Thu, 25 Sep 2014 00:57:33 GMT

    [ https://issues.apache.org/jira/browse/DRILL-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147194#comment-14147194 ]

Aditya Kishore commented on DRILL-1414:
---------------------------------------

I have been thinking about a couple of ways to do this.

# Extend the {{org.apache.drill.exec.store.sys.PStore}} interface with two additional methods:
{code}
  public V getBlob(String key);
  public void putBlob(String key, V value);
{code}
Consumers can use these two methods to store large amounts of data that do not require
frequent enumeration and are not suitable for storage on systems like ZooKeeper. A
particular PStore implementation could choose to store the blob data differently from the
primary value: for example, an HBase PStore provider could store it in a different column
family, while a ZooKeeper PStore provider could store it on DFS (as this JIRA summary suggests).
The query profile can then be split into two parts: the small meta information about the query
is stored with {{put()}}, while the fragment profiles are stored with {{putBlob()}}.
# Alternatively, we could handle this narrowly by modifying only {{org.apache.drill.exec.work.foreman.QueryStatus}}
to split the profile and store the profile metadata separately from the individual query profiles.
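
To make approach #1 concrete, here is a minimal sketch under stated assumptions, not the actual patch. {{BlobPStore}} is a hypothetical, simplified stand-in for the real {{PStore}} interface (which has more methods than shown), and {{InMemoryBlobPStore}} and {{ProfileStoreSketch}} are illustrative names only. The in-memory provider keeps blobs in a separate map, much as a real provider might use a separate column family or DFS.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified stand-in for org.apache.drill.exec.store.sys.PStore.
// The real interface has more methods; only what the sketch needs is shown.
interface BlobPStore<V> {
  V get(String key);
  void put(String key, V value);

  // Proposed additions: large values that are rarely enumerated.
  V getBlob(String key);
  void putBlob(String key, V value);
}

// Illustrative in-memory provider. Blobs live in their own map so that
// enumerating the primary values never touches the large data.
class InMemoryBlobPStore<V> implements BlobPStore<V> {
  private final Map<String, V> values = new ConcurrentHashMap<>();
  private final Map<String, V> blobs = new ConcurrentHashMap<>();

  @Override public V get(String key) { return values.get(key); }
  @Override public void put(String key, V value) { values.put(key, value); }
  @Override public V getBlob(String key) { return blobs.get(key); }
  @Override public void putBlob(String key, V value) { blobs.put(key, value); }

  // Number of primary entries; blobs are deliberately not counted.
  int size() { return values.size(); }
}

class ProfileStoreSketch {
  public static void main(String[] args) {
    BlobPStore<String> store = new InMemoryBlobPStore<>();
    // Small meta info about the query goes through put() ...
    store.put("query-1", "state=COMPLETED");
    // ... while the bulky fragment profiles go through putBlob().
    store.putBlob("query-1", "fragment profiles (placeholder text)");
    System.out.println(store.get("query-1"));
    System.out.println(store.getBlob("query-1"));
  }
}
```

The same key is used for both calls, so a consumer can list the small entries cheaply and fetch the blob for a given query only when it is needed.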

I am inclined to go with approach #1, as it will allow any future consumer to reuse it easily.
I already have a partial patch, excluding the modifications to the Web UI, that I am currently
testing. If I do not hear any concerns about approach #1, I'll post the patch shortly for
review.

> Move profile storage to DFS rather than using PStore
> ----------------------------------------------------
>
>                 Key: DRILL-1414
>                 URL: https://issues.apache.org/jira/browse/DRILL-1414
>             Project: Apache Drill
>          Issue Type: Improvement
>            Reporter: Jacques Nadeau
>            Assignee: Aditya Kishore
>             Fix For: 0.6.0
>
>
> PStores were really built for trivial configuration data, not large query profiles. As such,
> we should move to using the DFS for storage of query profiles when in distributed mode.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
