cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sylvain Lebresne (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-10471) fix flapping empty_in_test dtest
Date Thu, 15 Oct 2015 09:19:05 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-10471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14958561#comment-14958561
] 

Sylvain Lebresne commented on CASSANDRA-10471:
----------------------------------------------

bq. As a reviewer I didn't figure out how to verify that this statement is true or why. I
mucked about with StatementRestrictions and family and found where IN restrictions are expressed,
but it's all pretty big in scope. Do you have any pointers?

The query that triggers the problem is
{noformat}
SELECT v FROM test_compact WHERE k1 = 0 AND k2 IN ()
{noformat}
That query explicitly requests no row (it's a query by name with an empty list of names).
One would expect such valid but somewhat uninteresting query to be dealt with at the CQL layer
(by returning nothing), yielding no internal query, and that is what happens on 3.0 (see {{SelectStatement.makeClusteringIndexFilter}},
in the case of a query by names, the code checks if {{getRequestedRows}} returns an empty
list and return {{null}} if so which is code for "we know the query return nothing"). That's
the reason why I initially made {{ColumnFilter.Builder}} reject the case where nothing was
selected, but that was misguided especially since the code is perfectly fine dealing with
an empty {{ColumnFilter}}.

And it happens that this optimization isn't done in 2.2. Or rather, it used to be done but
was "broken" by CASSANDRA-7981. If you look at the equivalent code on 2.2, in {{SelectStatement.makeFilter()}},
it assumes a {{IN ()}} would result in {{getRequestedColumns}} returning {{null}}, but it's
easy to see it's not the case anymore (and it's equally easy to see that CASSANDRA-7981 is
the culprit for that). So on the 2.2 node, the query simply generates an empty list in {{SelectStatement.getRequestedColumns()}}
(because {{SingleColumnRestriction.InWithValues.getValues()}} does nothing special if its
list of terms is empty, is just returns an empty list) and queries with that. That's where,
during an upgrade, the 3.0 node receives a query by name with an empty list of names, and
that triggers my misguided assertion.

I'll note that I'm fine "fixing" that broken optimization in 2.2 and I've pushed a very trivial
fix to do so [here|https://github.com/pcmanus/cassandra/commits/10471-2.2], but I don't really
care whether we do it or not since 1) it's pretty inconsequential for 2.2 users and 2) even
if we do commit that 2.2 patch, we still need to fix 3.0 for users who might upgrade from
a 2.2 version that don't have that fix.

bq. As far as I can tell null isn't used as a signal anywhere

Well it does. It signals we don't want to skip any value when {{isFetchAll}} is set (paraphrasing
the comment on the declaration of {{selection}} here). If {{isFetchAll == true}}, {{selection}}
is the subset of columns for which we want to include the values (the ones whose values are
not skipped). As the case where we don't skip any value is common, we use {{null}} to signal
it. The equivalent if we were to not use {{null}} would be to have {{selection}} be all the
columns, but that would mean a slightly less efficient {{canSkipValue}} in that common case.
I'll note that I feel all this is reasonably well explained in the class javadoc and the comments
around the class field declarations, but I'm open to improvement suggestions.

bq. canSkipValue might have depended on it before this change

Why only before this change? {{selection == null}} only matter in {{canSkipValue}} if {{isFetchAll
== true}}, and {{ColumnFilter.Builder.build()}} explicitly only force {{PartitionColumns.NONE}}
if {{!isFetchAll}}.

bq. There is no unit test for ColumnFilter

There isn't and I've created CASSANDRA-10531 to change that.



> fix flapping empty_in_test dtest
> --------------------------------
>
>                 Key: CASSANDRA-10471
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10471
>             Project: Cassandra
>          Issue Type: Sub-task
>            Reporter: Jim Witschey
>            Assignee: Sylvain Lebresne
>             Fix For: 3.0.0 rc2
>
>
> {{upgrade_tests/cql_tests.py:TestCQL.empty_in_test}} fails about half the time on the
upgrade path from 2.2 to 3.0:
> http://cassci.datastax.com/view/Upgrades/job/storage_engine_upgrade_dtest-22_tarball-30_HEAD/42/testReport/upgrade_tests.cql_tests/TestCQL/empty_in_test/history/
> Once [this dtest PR|https://github.com/riptano/cassandra-dtest/pull/586] is merged, these
tests should also run with this upgrade path on normal 3.0 jobs. Until then, you can run it
with the following command:
> {code}
> SKIP=false CASSANDRA_VERSION=binary:2.2.0 UPGRADE_TO=git:cassandra-3.0 nosetests 2>&1
upgrade_tests/cql_tests.py:TestCQL.empty_in_test
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message