cassandra-commits mailing list archives

From "Alex Liu (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-6048) Add the ability to use multiple indexes in a single query
Date Fri, 18 Oct 2013 18:42:42 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-6048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13799394#comment-13799394
] 

Alex Liu edited comment on CASSANDRA-6048 at 10/18/13 6:42 PM:
---------------------------------------------------------------

We could potentially develop a DB engine on top of Cassandra to enable real-time joins across
multiple tables or other more complex queries. It would relate to Cassandra as Hive relates to
Hadoop. This would make Cassandra feel more like a relational DB, while under the hood it
remains a distributed DB.


was (Author: alexliu68):
We could potentially develop a DB engine on top of Cassandra to enable real-time joins across
multiple tables or other more complex queries. It would relate to Cassandra as Hive relates to Hadoop.

> Add the ability to use multiple indexes in a single query
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-6048
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6048
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Alex Liu
>            Assignee: Alex Liu
>             Fix For: 2.1
>
>         Attachments: 6048-1.2-branch.txt, 6048-trunk.txt
>
>
> Existing data filtering uses the following algorithm
> {code}
>    1. Find the most selective predicate, based on the smallest mean column count.
>    2. Fetch rows for that predicate, then filter them against the remaining predicates.
> {code}
> So potentially we could improve the performance by
> {code}
>    1. Joining multiple predicates, then filtering the data on the remaining predicates.
>    2. Fine-tuning the algorithm that selects the best predicate.
> {code}
> For a multi-predicate join, performance could improve when one predicate has many index entries
> and another has very few. That means a few index CF reads, joining the row keys, fetching the
> rows, then filtering on the other predicates.
> Another approach is to have an index on multiple columns.
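
To make the difference between the two strategies concrete, here is a minimal in-memory sketch in Python. It is not Cassandra code and does not reflect the attached patches: the dictionaries standing in for the index CFs, the equality-only predicates, and all function names (`query_single_index`, `query_multi_index`, and so on) are assumptions made for illustration. Selectivity is approximated here by the actual number of matching index entries, whereas the description above refers to an estimated mean column count.

```python
from typing import Dict, List, Set, Tuple

Row = Dict[str, str]
Predicate = Tuple[str, str]          # hypothetical form: (column, value), equality only
Index = Dict[str, Dict[str, Set[str]]]  # column -> value -> row keys


def build_index(rows: Dict[str, Row], columns: List[str]) -> Index:
    """Build a toy secondary index for the given columns."""
    index: Index = {c: {} for c in columns}
    for key, row in rows.items():
        for c in columns:
            if c in row:
                index[c].setdefault(row[c], set()).add(key)
    return index


def index_lookup(index: Index, pred: Predicate) -> Set[str]:
    column, value = pred
    return index.get(column, {}).get(value, set())


def matches(row: Row, preds: List[Predicate]) -> bool:
    return all(row.get(c) == v for c, v in preds)


def query_single_index(rows, index, preds):
    """Existing approach: drive the query from one indexed predicate,
    fetch its rows, then filter on everything else."""
    indexed = [p for p in preds if p[0] in index]
    if not indexed:
        raise ValueError("at least one predicate must be indexed")
    best = min(indexed, key=lambda p: len(index_lookup(index, p)))
    candidates = index_lookup(index, best)
    fetched = len(candidates)  # rows read before filtering
    return sorted(k for k in candidates if matches(rows[k], preds)), fetched


def query_multi_index(rows, index, preds):
    """Proposed approach: intersect the row keys of all indexed
    predicates first, then fetch and filter only the survivors."""
    indexed = [p for p in preds if p[0] in index]
    if not indexed:
        raise ValueError("at least one predicate must be indexed")
    key_sets = sorted((index_lookup(index, p) for p in indexed), key=len)
    candidates = set(key_sets[0])
    for keys in key_sets[1:]:
        candidates &= keys
    fetched = len(candidates)  # rows read before filtering
    return sorted(k for k in candidates if matches(rows[k], preds)), fetched


# Illustrative data only.
ROWS = {
    "k1": {"state": "TX", "age": "30", "name": "a"},
    "k2": {"state": "TX", "age": "41", "name": "b"},
    "k3": {"state": "TX", "age": "30", "name": "c"},
    "k4": {"state": "CA", "age": "30", "name": "d"},
}
INDEX = build_index(ROWS, ["state", "age"])
PREDS = [("state", "TX"), ("age", "30"), ("name", "c")]
```

With this data both functions return the same matching key, but the single-index path fetches three rows before filtering while the intersection path fetches two. The trade-off the sketch leaves out is the cost of the extra index reads, which is why the join is most attractive when at least one of the predicates has very few entries.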



--
This message was sent by Atlassian JIRA
(v6.1#6144)
