cassandra-commits mailing list archives

From "Sam Tunnicliffe (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-5732) Can not query secondary index
Date Thu, 10 Oct 2013 14:02:43 GMT


Sam Tunnicliffe commented on CASSANDRA-5732:

The reason for the missing results is that CFS.getColumnFamily() looks up the cfId from
Schema to calculate the cache key. However, 2i CFSes are never loaded into the Schema, so
Schema.instance.getId always returns null. Simply fixing this by calling Schema.instance.load()
with the 2i CFMD when the index is initialized uncovers another issue: the cfId is now
retrievable, but deserialization of a cached 2i row fails, because it depends on the 2i CFMD
being present in the enclosing KSMD for the eventual call to Schema.getCFMD(). Once we start
adding index CFs to Schema, they become involved in schema migrations, which makes everything
very messy. So rather than adding them directly to KSMD like regular CFs, I added a separate
cfId->CFMD map to Schema. As far as most things are concerned nothing has changed; we just
have one further place to look when retrieving the CFMD for a given cfId.
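
As a rough illustration of the "one further place to look" idea, here is a minimal sketch. It
is not the attached patch and not Cassandra's actual Schema class: the class name SchemaSketch,
its methods, and the use of a String as a stand-in for CFMD are all hypothetical.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a schema registry; NOT Cassandra's real Schema class.
// A String stands in for CFMD to keep the sketch self-contained.
final class SchemaSketch {
    // Regular column families: these would take part in schema migrations.
    private final Map<UUID, String> regularCfs = new ConcurrentHashMap<>();
    // Secondary-index column families: kept in a separate map, outside migrations.
    private final Map<UUID, String> indexCfs = new ConcurrentHashMap<>();

    void loadRegular(UUID cfId, String metadata) {
        regularCfs.put(cfId, metadata);
    }

    void loadIndex(UUID cfId, String metadata) {
        indexCfs.put(cfId, metadata);
    }

    // Looks in the regular map first, then falls back to the index map.
    // Returns null when the cfId is unknown to both.
    String getMetadata(UUID cfId) {
        String metadata = regularCfs.get(cfId);
        return metadata != null ? metadata : indexCfs.get(cfId);
    }

    // Only regular column families are visible to (hypothetical) migration code.
    boolean participatesInMigrations(UUID cfId) {
        return regularCfs.containsKey(cfId);
    }
}
```

With this shape, callers that resolve metadata by cfId see index CFs, while anything that
iterates the regular map is unaffected.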

The attached patch is against the 1.2 branch. CASSANDRA-4875 is a duplicate of this, but has
a fixver of 1.1. [~jbellis], do you want me to submit a patch against 1.1 as well?

I wrote a dtest for this, pull request for that here:

Looking at this, I also uncovered what I think is an issue with the setup of the 2i cache
config. In AbstractSimplePerColumnSecondaryIndex (in 1.2; the same code is in KeysIndex in
1.1), the estimated key and mean column counts are used to gauge the index's cardinality,
which is then used to decide whether or not to enable row caching. This calculation is first
performed before the index is actually built, so there are no SSTables to provide the
estimates. As a result, row caching is always disabled until the next time the index is
initialized, i.e. when C* is restarted (this appears to be why the repro steps require a
restart). If this is a genuine problem, I'll create a separate JIRA to address it.
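
To make the ordering problem concrete, here is a small sketch of a cardinality-gated decision.
The class and method names are hypothetical, and the exact rule (keys divided by mean columns
greater than 1) is an illustrative assumption rather than a quote from the Cassandra source.
The point is only that zero estimates, as seen before any SSTables exist, can never pass the
check.

```java
// Hypothetical sketch; the threshold and comparison are assumptions for illustration.
final class RowCacheDecisionSketch {
    // Decides whether to enable row caching from two SSTable-derived estimates.
    // With no SSTables, both estimates are zero and the result is always false.
    static boolean shouldEnableRowCache(long estimatedKeys, double meanColumns) {
        if (meanColumns <= 0) {
            return false; // no data yet, so the cardinality cannot be gauged
        }
        return estimatedKeys / meanColumns > 1;
    }
}
```

Called as shouldEnableRowCache(0, 0.0) at index creation, this returns false no matter what
the index looks like once built; the decision only changes when it is re-evaluated with real
estimates.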

> Can not query secondary index
> -----------------------------
>                 Key: CASSANDRA-5732
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.2.5
>         Environment: Windows 8, Jre 1.6.0_45 32-bit
>            Reporter: Tony Anecito
>            Assignee: Sam Tunnicliffe
>         Attachments: 5732-1.2
> Noticed, after taking a column family that already existed and assigning index_type:KEYS
> to an IntegerType column while caching was already set to 'ALL', that the prepared statement
> did not return rows, nor did it throw an exception. Here is the sequence.
> 1. Starting state: query running with caching off for a Column Family, with the query using
> the secondary index for the WHERE clause.
> 2. Set Column Family caching to ALL using Cassandra-cli and update CQL. Cassandra-cli
> Describe shows column family caching set to ALL.
> 3. Rerun query and it works.
> 4. Restart Cassandra and run query: no rows returned. Cassandra-cli Describe shows
> column family caching set to ALL.
> 5. Set Column Family caching to NONE using Cassandra-cli and update CQL. Rerun query:
> no rows returned. Cassandra-cli Describe for column family shows caching set to NONE.
> 6. Restart Cassandra. Rerun query and it is working again. We are now back to the starting
> Best Regards,
> -Tony

