cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Alex Petrov (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-11031) MultiTenant : support “ALLOW FILTERING" for Partition Key
Date Thu, 25 Aug 2016 09:18:20 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-11031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15436560#comment-15436560
] 

Alex Petrov edited comment on CASSANDRA-11031 at 8/25/16 9:17 AM:
------------------------------------------------------------------

Thank you for the review! I've addressed your comments and wait for the test results now.
All changes are done in a separate commit to make it simpler to see what's been done, I'll
combine them if you +1.

bq. The hasSlice method is now duplicated in PartitionKeySingleRestrictionSet and ClusteringColumnRestrictions.
It would be better to put it only into RestrictionSet and expose it into Restrictions.

Removed the redundant method, pulled it up.

bq. We should have some unit test for MaterializedViews with Slice and unrestricted partition
keys.

Added the MV tests, similar to the previously existing ones.

bq. In StatementRestrictions::processPartitionKeyRestrictions I do not think that the error
message: "Partition key parts: %s must be restricted as other parts are" makes sense any more
(for non-MVs). We should probably only use the ALLOW FILTERING error message.

Now we just have one error message, same everywhere. Tests adjusted accordingly.

bq. The part of StatementRestrictions::processPartitionKeyRestrictions dealing with secondary
index can probably be combined with the one dealing with the filtering.

True, I've separated them for sakes of simplicity, but I hope that new version looks clean
as well. Combining them furhter is tricky and will require double-checking of non-restricted
PK parts and whether or not restrictions are empty. It was a bit simpler previously since
we did't have to take care of slices.

bq. In RowFilter we should not use Lambda expression to filter out the rows as they are pretty
bad from the performance point of view and result in a lot of garbage. In RowFilter: Iterables.size(rowLevelExpressions)
can be replaced by rowLevelExpressions.size()

Optimised that part as well, got rid of streams alltogether, we iterate through restrictions
just once now.

bq. The tests using secondary indices should be put into SecondaryIndexTest

Moved the tests.

bq. It will be good to add more tests into SelectOrderedPartitionerTest::testFilteringOnPartitionKeyWithToken:
with > on one of the partition key or only the second partition key being specified

I've added one more clause that seemed missing, although since token restrictions require
both parts for the token function, I did not add anything else.

bq. In SelectTest::testAllowFilteringOnPartitionKey the message IN restrictions are not supported
on indexed columns looks wrong to me as the test do not use any secondary index.

For clarity here, I've added a more elaborate message that specifies that filtering is involved.

|[trunk|https://github.com/ifesdjeen/cassandra/tree/11031-trunk]|[dtest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11031-trunk-dtest/]|[testall|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11031-trunk-testall/]|


was (Author: ifesdjeen):
Thank you for the review! I've addressed your comments and wait for the test results now.
All changes are done in a separate commit to make it simpler to see what's been done, I'll
combine them if you +1.

bq. The hasSlice method is now duplicated in PartitionKeySingleRestrictionSet and ClusteringColumnRestrictions.
It would be better to put it only into RestrictionSet and expose it into Restrictions.

Removed the redundant method, pulled it up.

bw. We should have some unit test for MaterializedViews with Slice and unrestricted partition
keys.

Added the MV tests, similar to the previously existing ones.

bq. In StatementRestrictions::processPartitionKeyRestrictions I do not think that the error
message: "Partition key parts: %s must be restricted as other parts are" makes sense any more
(for non-MVs). We should probably only use the ALLOW FILTERING error message.

Now we just have one error message, same everywhere. Tests adjusted accordingly.

bq. The part of StatementRestrictions::processPartitionKeyRestrictions dealing with secondary
index can probably be combined with the one dealing with the filtering.

True, I've separated them for sakes of simplicity, but I hope that new version looks clean
as well. Combining them furhter is tricky and will require double-checking of non-restricted
PK parts and whether or not restrictions are empty. It was a bit simpler previously since
we did't have to take care of slices.

bq. In RowFilter we should not use Lambda expression to filter out the rows as they are pretty
bad from the performance point of view and result in a lot of garbage. In RowFilter: Iterables.size(rowLevelExpressions)
can be replaced by rowLevelExpressions.size()

Optimised that part as well, got rid of streams alltogether, we iterate through restrictions
just once now.

bq. The tests using secondary indices should be put into SecondaryIndexTest

Moved the tests.

bq. It will be good to add more tests into SelectOrderedPartitionerTest::testFilteringOnPartitionKeyWithToken:
with > on one of the partition key or only the second partition key being specified

I've added one more clause that seemed missing, although since token restrictions require
both parts for the token function, I did not add anything else.

bq. In SelectTest::testAllowFilteringOnPartitionKey the message IN restrictions are not supported
on indexed columns looks wrong to me as the test do not use any secondary index.

For clarity here, I've added a more elaborate message that specifies that filtering is involved.

|[trunk|https://github.com/ifesdjeen/cassandra/tree/11031-trunk]|[dtest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11031-trunk-dtest/]|[testall|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11031-trunk-testall/]|

> MultiTenant : support “ALLOW FILTERING" for Partition Key
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-11031
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11031
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: CQL
>            Reporter: ZhaoYang
>            Assignee: ZhaoYang
>            Priority: Minor
>             Fix For: 3.x
>
>
> Currently, Allow Filtering only works for secondary Index column or clustering columns.
And it's slow, because Cassandra will read all data from SSTABLE from hard-disk to memory
to filter.
> But we can support allow filtering on Partition Key, as far as I know, Partition Key
is in memory, so we can easily filter them, and then read required data from SSTable.
> This will similar to "Select * from table" which scan through entire cluster.
> CREATE TABLE multi_tenant_table (
> 	tenant_id text,
> 	pk2 text,
> 	c1 text,
> 	c2 text,
> 	v1 text,
> 	v2 text,
> 	PRIMARY KEY ((tenant_id,pk2),c1,c2)
> ) ;
> Select * from multi_tenant_table where tenant_id = "datastax" allow filtering;



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message