hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Coprocessors vs MapReduce?
Date Tue, 24 Jul 2012 17:55:52 GMT
Your questions are quite common ones.
Let me try clarifying a few. Andy or Gary should be able to give better
answers.

For #1, coprocessors are only supported when your code is compiled and
packaged in a jar.
I need to get a bit more familiar with Cascading :-)

For #2, can you give us some scenarios in which parameters should be passed
to a coprocessor at runtime?

For #3, can you clarify what type of blocks are involved? As you know,
HBase organizes data into regions.

For #5, it is up to you how to partition work between the coprocessor
(executed on region servers) and your client code.
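
To make that split concrete, here is a minimal sketch in plain Java. It is only a simulation and uses no HBase classes: the class name RegionAggregationSketch and the methods partialSum and aggregate are hypothetical names chosen for illustration. partialSum stands in for what a coprocessor might compute inside one region, and aggregate stands in for client code that folds each partial result into a running total as soon as it arrives, rather than waiting for every region to finish.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Plain-Java simulation of the work split; no HBase API is involved.
public class RegionAggregationSketch {

    // Stand-in for the region-side work: reduce one region's values
    // to a single partial sum so only one number crosses the wire.
    static long partialSum(List<Long> regionValues) {
        long sum = 0;
        for (long v : regionValues) {
            sum += v;
        }
        return sum;
    }

    // Stand-in for the client: start one task per region, then merge
    // each partial sum in completion order (early aggregation).
    static long aggregate(List<List<Long>> regions)
            throws InterruptedException, ExecutionException {
        if (regions.isEmpty()) {
            return 0;
        }
        ExecutorService pool = Executors.newFixedThreadPool(regions.size());
        try {
            CompletionService<Long> done = new ExecutorCompletionService<>(pool);
            for (List<Long> region : regions) {
                done.submit(() -> partialSum(region));
            }
            long total = 0;
            for (int i = 0; i < regions.size(); i++) {
                // take() blocks only until the next region finishes.
                total += done.take().get();
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<List<Long>> regions = Arrays.asList(
                Arrays.asList(1L, 2L, 3L),
                Arrays.asList(4L, 5L),
                Arrays.asList(6L));
        System.out.println(aggregate(regions)); // prints 21
    }
}
```

The more of the reduction that happens in partialSum, the less data the client has to receive; the more that stays in aggregate, the lighter the load on the region servers.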


On Tue, Jul 24, 2012 at 7:59 AM, Bertrand Dechoux <dechouxb@gmail.com> wrote:

> Hello,
> I am learning about coprocessors and would like to know more about how to
> choose between coprocessors and MapReduce.
> First, I thought coprocessors needed a restart but it seems the shell can be
> used to add/remove them without requiring a restart. However, at the moment
> the coprocessors are defined within a jar and cannot be dynamically created.
> Could you confirm that? (I am thinking about the Cascading way of creating
> the implementation, which is then serialized, sent and executed.)
> Second, I didn't see any way to give parameters to coprocessors. Is that
> really the case? If not, how would the parameters be handled?
> Third, I assume coprocessors are using the process/thread of the region
> server. Does that mean that, if multiple blocks need to be processed,
> MapReduce should be more efficient? Are there other ways to know whether
> coprocessors or MapReduce should be chosen?
> Fourth, I know this is a really broad question but how would you compare
> coprocessors to YARN? I have yet to know more about both subjects but I
> feel that the concepts are not totally unrelated.
> Lastly, this is an implementation detail, but how does the client side wait
> for the results? Is it possible to perform early aggregation or does the
> client need to receive all the information before doing anything else?
> Regards
> Bertrand
> PS: My two sources on this subject, both for HBase 0.92, are:
> * https://blogs.apache.org/hbase/entry/coprocessor_introduction
> * HBase The Definitive Guide.
