hbase-issues mailing list archives

From "Himanshu Vashishtha (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-3607) Cursor functionality for results generated by Coprocessors
Date Thu, 14 Apr 2011 21:52:06 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020055#comment-13020055 ]

Himanshu Vashishtha commented on HBASE-3607:
--------------------------------------------

Thanks for the review, Gary; I really appreciate your time and effort. The key points I
took away are:

a) Flesh out a cleaner client-side API in which the client gets a handle on which to invoke
iteration methods. It should be as simple as that; handing the client a long integer id does
not help.

b) Use the existing cp RPC mechanism to execute those calls (and so get rid of CursorCallable).

c) Reuse the existing code for scanners and other stateful objects at the RS. Can RegionObserver
be used to maintain these objects?
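
To make point (a) concrete, here is a minimal sketch of what such a client-side handle could look like. It is plain Java with no HBase dependency. ResultCursor and BatchFetcher are hypothetical names, and BatchFetcher only stands in for whatever call the cp RPC mechanism would end up exposing; none of this is taken from the patch.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical stand-in for the coprocessor RPC that fetches the next batch. */
interface BatchFetcher<T> {
    /** Returns up to {@code limit} results; an empty list means the cursor is exhausted. */
    List<T> fetch(long cursorId, int limit);
}

/** Client-side handle: callers iterate and never handle the cursor id themselves. */
final class ResultCursor<T> implements Iterator<T>, AutoCloseable {
    private final long cursorId;      // kept private, an implementation detail
    private final int cacheSize;      // results requested per round trip
    private final BatchFetcher<T> fetcher;
    private Iterator<T> buffer = Collections.emptyIterator();
    private boolean exhausted;

    ResultCursor(long cursorId, int cacheSize, BatchFetcher<T> fetcher) {
        if (cacheSize < 1) {
            throw new IllegalArgumentException("cacheSize must be >= 1");
        }
        this.cursorId = cursorId;
        this.cacheSize = cacheSize;
        this.fetcher = fetcher;
    }

    @Override
    public boolean hasNext() {
        if (buffer.hasNext()) {
            return true;
        }
        if (exhausted) {
            return false;
        }
        // Local buffer is drained: one round trip refills it.
        List<T> batch = fetcher.fetch(cursorId, cacheSize);
        if (batch.isEmpty()) {
            exhausted = true;
            return false;
        }
        buffer = batch.iterator();
        return true;
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return buffer.next();
    }

    @Override
    public void close() {
        // A real implementation would also tell the server to release the cursor.
        exhausted = true;
        buffer = Collections.emptyIterator();
    }
}
```

With a handle like this the client code reduces to a try-with-resources block and a hasNext()/next() loop, and the cache size plays the same role as scanner caching in limiting the number of round trips.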

I am working on a better approach and have run into one design question on the server side.
It would be great to get comments on it:
a) The stateful scanners to be maintained on the RS side are InternalScanner instances,
created to scan a single region. As things stand they can't be registered with the
RegionServer, because a region has only limited access to its HRS (via RegionServerServices).
Storing these objects at the RS level has at least two benefits:
i) Current scanners are already registered this way, so existing code can be reused.
ii) These internal scanners are instantiated per region, so if we tried to register (housekeep)
them in a cp, we would end up with that many lease objects (daemon threads), which is
not justifiable, or with a timer object or similar to keep the resources in check.

So these stateful scan objects should be registered at the RS level. To do so, a region (or the
CP) needs access to an RS API that does this job, such as addScanner(InternalScanner).
Currently it has RegionServerServices, but that can't be used to register these scan objects.
One approach is to add such a method in HRS and then either add a method in RS or refactor
the existing addScanner method appropriately.
Is this the right way, or is there a better approach?
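
To illustrate the trade-off in (ii), here is a rough sketch of a single RS-level registry that tracks every scanner in one map and expires idle ones in a single sweep, instead of one lease thread per scanner. ScannerRegistry and its methods are hypothetical names, not existing HBase APIs; Closeable stands in for InternalScanner, and the injected clock is only there so the sketch can be exercised without sleeping.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Hypothetical RS-level registry: one map and one sweep for all stateful scanners. */
final class ScannerRegistry {
    /** A registered scanner plus the time it was last used. */
    private static final class Lease {
        final Closeable scanner;
        volatile long lastTouchedMs;

        Lease(Closeable scanner, long nowMs) {
            this.scanner = scanner;
            this.lastTouchedMs = nowMs;
        }
    }

    private final Map<Long, Lease> scanners = new ConcurrentHashMap<>();
    private final AtomicLong nextId = new AtomicLong();
    private final long leaseMs;
    private final LongSupplier clock; // e.g. System::currentTimeMillis

    ScannerRegistry(long leaseMs, LongSupplier clock) {
        this.leaseMs = leaseMs;
        this.clock = clock;
    }

    /** Registers a scanner and returns the id used to look it up later. */
    long addScanner(Closeable scanner) {
        long id = nextId.incrementAndGet();
        scanners.put(id, new Lease(scanner, clock.getAsLong()));
        return id;
    }

    /** Looks up a scanner and renews its lease; returns null if unknown or expired. */
    Closeable get(long id) {
        Lease lease = scanners.get(id);
        if (lease == null) {
            return null;
        }
        lease.lastTouchedMs = clock.getAsLong();
        return lease.scanner;
    }

    /** Closes and removes every scanner idle for at least leaseMs; returns how many. */
    int expireIdle() {
        long now = clock.getAsLong();
        int closed = 0;
        Iterator<Map.Entry<Long, Lease>> it = scanners.entrySet().iterator();
        while (it.hasNext()) {
            Lease lease = it.next().getValue();
            if (now - lease.lastTouchedMs >= leaseMs) {
                it.remove();
                try {
                    lease.scanner.close();
                } catch (IOException ignored) {
                    // Best effort: the scanner is dropped from the registry either way.
                }
                closed++;
            }
        }
        return closed;
    }
}
```

In this sketch a single periodic task calling expireIdle() would be enough for the whole server, whatever the number of regions or coprocessors holding scanners.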

> Cursor functionality for results generated by Coprocessors
> ----------------------------------------------------------
>
>                 Key: HBASE-3607
>                 URL: https://issues.apache.org/jira/browse/HBASE-3607
>             Project: HBase
>          Issue Type: New Feature
>          Components: coprocessors
>            Reporter: Himanshu Vashishtha
>         Attachments: patch-2.txt
>
>
> I tried to come up with scanner-like functionality for results generated by coprocessors
> at the region level.
> This is just a poc, and it would be good to have your comments on it.
> It supports both incremental and in-memory result sets. Attached is a patch that has a test
> case for an incremental result (i.e., the client receives a cursorId from the CP core method,
> instantiates a cursor object, and iterates over the result set). The client can set a cache
> limit on the CursorCallable object to reduce the number of RPCs, just like scanners.
> In its current state it has some limitations too :)). For example, it is region specific,
> i.e., one can instantiate and use a cursor at one region only (and that region is determined
> by the input row when instantiating the cursor). I will try to expand it so that it has
> at least sequential access to other regions, but as I said, I want the opinion of experts
> on whether this approach really makes sense.
> I have tested it only with the built-in testing framework on my laptop.
> It will be good if I copy the use case here in the description too:
> Test table has rows like:
>  /**
>    * The scenario is that I have these rows keys in the test table:
>   'aaa-123'
>   'aaa-456'
>   'abc-111'
>   'abd-111'
>   'abd-222'
>   & I want to return:
>   ('aaa', 2)
>   ('abc', 1)
>   ('abd', 2)
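
The expected result in the scenario above amounts to counting rows per key prefix. A minimal sketch of that aggregation in plain Java, assuming the prefix is everything before the first '-' in the row key (the description does not spell this out) and that keys arrive in scan order. PrefixCounter is a hypothetical name and this is not the code in patch-2.txt.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class PrefixCounter {
    /** Counts row keys per prefix, keeping prefixes in first-seen (scan) order. */
    static Map<String, Integer> countByPrefix(List<String> rowKeys) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String key : rowKeys) {
            int dash = key.indexOf('-');
            // Assumption: a key without '-' counts as its own prefix.
            String prefix = dash < 0 ? key : key.substring(0, dash);
            counts.merge(prefix, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> rows =
            Arrays.asList("aaa-123", "aaa-456", "abc-111", "abd-111", "abd-222");
        System.out.println(countByPrefix(rows)); // prints {aaa=2, abc=1, abd=2}
    }
}
```

In the cursor setting, each (prefix, count) pair would be one element of the result set that the client iterates over.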

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
