Joel,

Thanks for your reply. I have checked and found the same behavior with CASSANDRA-11513. I have verified that both issues (11513 and 12003) occur in 3.4 and 3.5. Neither occurs in 3.0.4, 3.6, or 3.7.

Below are the results of selecting only the partition key and clustering key from the 11513 table. I have also verified that both issues occur whether all columns or only a subset are selected, so the selection is not the cause; the filtering by WHERE is:

cqlsh:ks> select pk,a from test0 where pk=0 and a=2;

 pk | a
----+---
  0 | 1
  0 | 2
  0 | 3

We can verify this claim by applying the 11513 patch to the 3.5 tag, then building and testing for 12003. If that fixes it, the claim is confirmed. Let me know if any further input is needed.

On Wed, Jun 15, 2016 at 2:23 AM, Joel Knighton <joel.knighton@datastax.com> wrote:
The important part of that query is that it's selecting a static column (with select *), not whether it is filtering on one. In CASSANDRA-12003 and this thread, it looks like you're only selecting the primary and clustering columns. I'd be cautious about concluding that CASSANDRA-12003 and CASSANDRA-11513 are the same issue and that CASSANDRA-12003 is fixed. 

If you have a reproduction path for CASSANDRA-12003, I'd recommend attaching it to a ticket, and someone can investigate internals to see if CASSANDRA-11513 (or something else entirely) fixed the issue.

On Tue, Jun 14, 2016 at 2:13 PM, Bhuvan Rawal <bhu1rawal@gmail.com> wrote:
Joel,

If we look at the schema carefully:
CREATE TABLE test0 (
    pk int,
    a int,
    b text,
    s text static,
    PRIMARY KEY (pk, a)
);
the filtering is performed on the clustering column a, which is not a static column:
select * from test0 where pk=0 and a=2;
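
A minimal reproduction sketch for the query above. The keyspace name ks is taken from the cqlsh prompt elsewhere in this thread; the replication settings and the inserted row values are illustrative assumptions, not data from the original report:

```cql
-- Assumed single-node test keyspace (replication settings are hypothetical).
CREATE KEYSPACE IF NOT EXISTS ks
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
USE ks;

-- Schema as quoted in this message.
CREATE TABLE IF NOT EXISTS test0 (
    pk int,
    a int,
    b text,
    s text static,
    PRIMARY KEY (pk, a)
);

-- Hypothetical rows: one partition, three clustering values, one static value.
INSERT INTO test0 (pk, a, b, s) VALUES (0, 1, 'b1', 's0');
INSERT INTO test0 (pk, a, b, s) VALUES (0, 2, 'b2', 's0');
INSERT INTO test0 (pk, a, b, s) VALUES (0, 3, 'b3', 's0');

-- Restricts on the partition key and the clustering column.
SELECT * FROM test0 WHERE pk = 0 AND a = 2;
```

With correct behavior the final query returns only the row with a = 2; more than one row indicates the problem discussed in this thread.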


On Wed, Jun 15, 2016 at 12:39 AM, Joel Knighton <joel.knighton@datastax.com> wrote:
It doesn't seem to be an exact duplicate - CASSANDRA-11513 relies on you selecting a static column, which you weren't doing in the reported issue. That said, I haven't looked too closely.

On Tue, Jun 14, 2016 at 2:07 PM, Bhuvan Rawal <bhu1rawal@gmail.com> wrote:
I can reproduce CASSANDRA-11513 locally on 3.5; this is possibly a duplicate.

On Wed, Jun 15, 2016 at 12:29 AM, Joel Knighton <joel.knighton@datastax.com> wrote:
There's some precedent for similar issues with static columns in 3.5 with https://issues.apache.org/jira/browse/CASSANDRA-11513 - a deterministic (or somewhat deterministic) path for reproduction would help narrow the issue down further. I've played around locally with similar schemas (sans the stratio indices) and couldn't reproduce the issue.

On Tue, Jun 14, 2016 at 1:41 PM, Bhuvan Rawal <bhu1rawal@gmail.com> wrote:
JIRA ticket CASSANDRA-12003 has been created for this issue.

On Tue, Jun 14, 2016 at 11:54 PM, Atul Saroha <atul.saroha@snapdeal.com> wrote:
Hi Tyler,

This issue is mainly visible for tables with static columns; we are still investigating.
We will try to test after removing the Lucene index, but I don't think this plug-in could change how Cassandra writes to the table's memtable.

---------------------------------------------------------------------------------------------------------------------
Atul Saroha

Lead Software Engineer
M: +91 8447784271 T: +91 124-415-6069 EXT: 12369
Plot # 362, ASF Centre - Tower A, Udyog Vihar,
 Phase -4, Sector 18, Gurgaon, Haryana 122016, INDIA

On Tue, Jun 14, 2016 at 9:54 PM, Tyler Hobbs <tyler@datastax.com> wrote:
Is 'id' your partition key? I'm not familiar with the stratio indexes, but it looks like the primary key columns are both indexed.  Perhaps this is related?

On Tue, Jun 14, 2016 at 1:25 AM, Atul Saroha <atul.saroha@snapdeal.com> wrote:
After further debugging, the issue appears to be in the in-memory memtable, since running nodetool flush followed by compact resolves it. No batch writes are used for the affected table.
Table properties:

WITH CLUSTERING ORDER BY (f_name ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';
CREATE CUSTOM INDEX nbf_index ON nbf () USING 'com.stratio.cassandra.lucene.Index' WITH OPTIONS = {'refresh_seconds': '1', 'schema': '{
        fields : {
            id  : {type : "bigint"},
            f_d_name : {
                type           : "string",
                indexed        : true,
                sorted         : false,
                validated      : true,
                case_sensitive : false
            }       
        }
    }'};


On Mon, Jun 13, 2016 at 11:11 PM, Siddharth Verma <verma.siddharth@snapdeal.com> wrote:
No, all rows were not the same.
Querying only on the partition key gives 20 rows.
In the erroneous result, while querying on partition key and clustering key, we got 16 of those 20 rows.

And for "tombstone_threshold" there isn't any entry at column family level.

Thanks,
Siddharth Verma

--
Tyler Hobbs
DataStax

--
Joel Knighton
Cassandra Developer | joel.knighton@datastax.com