cassandra-user mailing list archives

From DuyHai Doan <doanduy...@gmail.com>
Subject Re: read time coprocessor?
Date Fri, 11 Dec 2015 16:34:40 GMT
The UDFs (User Defined Functions) and UDAs (User Defined Aggregates)
introduced in Cassandra 2.2 are the closest feature to HBase coprocessors.

1. They are real time, in the sense that they are applied on the fly, right
after the data is fetched from C*
2. The computation is done on the coordinator, not on the replicas

The second point may be surprising. One might expect UDF and UDA
computation to be *distributed* among the replicas, but because of the eventual
consistency model, data needs to be retrieved and reconciled on the
coordinator node first (last write wins) before any UDF or UDA is applied.
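
To make the ordering concrete, here is a minimal Python sketch of why the
aggregate has to run after reconciliation. It is a toy model, not Cassandra
code: the Cell class, reconcile() and coordinator_read() are hypothetical
names, and tie-breaking on equal timestamps is ignored.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Cell:
    """Hypothetical stand-in for one value returned by a replica."""
    key: str
    value: int
    write_ts: int  # write timestamp; the highest one wins


def reconcile(replica_responses: Iterable[List[Cell]]) -> Dict[str, Cell]:
    """Merge the per-replica results, keeping the newest cell per key."""
    newest: Dict[str, Cell] = {}
    for response in replica_responses:
        for cell in response:
            current = newest.get(cell.key)
            if current is None or cell.write_ts > current.write_ts:
                newest[cell.key] = cell
    return newest


def coordinator_read(replica_responses: Iterable[List[Cell]],
                     aggregate: Callable[[List[int]], int]) -> int:
    """Reconcile first, then apply the aggregate on the coordinator."""
    reconciled = reconcile(replica_responses)
    return aggregate([cell.value for cell in reconciled.values()])


# replica_a missed the latest write to k1; replica_b has it.
replica_a = [Cell("k1", 10, write_ts=1), Cell("k2", 5, write_ts=1)]
replica_b = [Cell("k1", 20, write_ts=2), Cell("k2", 5, write_ts=1)]

# Aggregating after reconciliation uses one value per key.
result = coordinator_read([replica_a, replica_b], sum)

# Aggregating on each replica and combining the partial sums would count
# every key once per replica and include the stale value of k1.
naive = sum(c.value for c in replica_a) + sum(c.value for c in replica_b)
```

In this toy data the reconciled sum only sees the newest k1 and a single k2,
while the per-replica partial sums mix stale and duplicated values.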

Now, if you're using consistency level ONE or LOCAL_ONE and a client with
the TokenAware load balancing strategy, the coordinator node is itself a
replica. In this particular configuration, UDF/UDA are applied
locally.
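
The condition above can be sketched as a small check. Again this is only a
toy model with hypothetical names (pick_coordinator, udf_runs_on_replica); it
assumes a token-aware client always picks a replica as coordinator and that a
coordinator that is a replica reads from itself, which a real cluster does
not strictly guarantee.

```python
from typing import List


def pick_coordinator(token_aware: bool, all_nodes: List[str],
                     replicas: List[str]) -> str:
    """Token-aware routing targets a replica; otherwise any node may be hit."""
    return replicas[0] if token_aware else all_nodes[0]


def udf_runs_on_replica(coordinator: str, replicas: List[str],
                        replicas_contacted: int) -> bool:
    """True when the coordinator holds the data and needs no other replica."""
    return replicas_contacted == 1 and coordinator in replicas


all_nodes = ["n1", "n2", "n3", "n4"]
replicas = ["n2", "n3", "n4"]  # nodes owning the partition (RF = 3)

# Consistency level ONE with token-aware routing: computation is local.
local_one_token_aware = udf_runs_on_replica(
    pick_coordinator(True, all_nodes, replicas), replicas, 1)

# Consistency level ONE without token-aware routing: n1 is not a replica.
local_one_round_robin = udf_runs_on_replica(
    pick_coordinator(False, all_nodes, replicas), replicas, 1)

# QUORUM (2 of 3 replicas): data from another replica must be reconciled.
quorum_token_aware = udf_runs_on_replica(
    pick_coordinator(True, all_nodes, replicas), replicas, 2)
```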

More info on UDF/UDA here:
http://www.slideshare.net/doanduyhai/cassandra-udf-and-materialized-views

On Fri, Dec 11, 2015 at 7:52 AM, Li Yang <liyang@apache.org> wrote:

> This is Yang from Apache Kylin project. We are thinking about using
> Cassandra instead of HBase as storage. I searched and read around and still
> have one question.
>
> Does Cassandra support read time coprocessor that allows moving
> computation to data node before scan result is returned? This shall reduce
> network traffic greatly in our case.
>
> Thank
> Yang
>
