cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Marco Matarazzo (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-5501) Missing data on SELECT on secondary index
Date Mon, 22 Apr 2013 09:49:17 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Marco Matarazzo updated CASSANDRA-5501:
---------------------------------------

    Description: 
We have a 3 nodes cluster, and a keyspace with RF = 3.

>From cassandra-cli everything is fine (we actually never use it, I just launched it for
a check in this particular case).

[default@goh_master] get agents where station_id = ascii(1110129);
-------------------
RowKey: 6c8efeb6-7209-11e2-890a-aacc00000216
=> (column=, value=, timestamp=1364580868176000)
=> (column=character_points, value=, timestamp=1361030686890000)
=> (column=component_id, value=0, timestamp=1364580868176000)
=> (column=corporation_id, value=3efc729e-7209-11e2-890a-aacc00000216, timestamp=1361030686890000)
=> (column=entity_id, value=0, timestamp=1364580868176000)
=> (column=manufacturing, value=, timestamp=1361030686890000)
=> (column=model, value=500005, timestamp=1361030686890000)
=> (column=name, value=Jenny Olifield, timestamp=1361030686890000)
=> (column=name_check, value=jenny_olifield, timestamp=1361030686890000)
=> (column=station_id, value=1110129, timestamp=1364580868176000)
=> (column=stats_intellect, value=8, timestamp=1361030686890000)
=> (column=stats_reflexes, value=8, timestamp=1361030686890000)
=> (column=stats_stamina, value=7, timestamp=1361030686890000)
=> (column=stats_technology, value=7, timestamp=1361030686890000)
=> (column=trading, value=, timestamp=1361030686890000)
-------------------
RowKey: dc413373-6b06-11e2-8943-aacc00000216
=> (column=, value=, timestamp=1366568185220000)
=> (column=character_points, value=100, timestamp=1364580381651000)
=> (column=component_id, value=, timestamp=1364580381651000)
=> (column=corporation_id, value=574934cc-6b06-11e2-a512-aacc00000200, timestamp=1364580381651000)
=> (column=entity_id, value=0, timestamp=1364580381651000)
=> (column=manufacturing, value=, timestamp=1364580381651000)
=> (column=model, value=500018, timestamp=1364580381651000)
=> (column=name, value=Darren Matar, timestamp=1364580381651000)
=> (column=name_check, value=darren_matar, timestamp=1364580381651000)
=> (column=station_id, value=1110129, timestamp=1364580381651000)
=> (column=stats_intellect, value=10, timestamp=1364580381651000)
=> (column=stats_reflexes, value=10, timestamp=1364580381651000)
=> (column=stats_stamina, value=10, timestamp=1364580381651000)
=> (column=stats_technology, value=10, timestamp=1364580381651000)
=> (column=trading, value=1, timestamp=1366568185220000)
-------------------
RowKey: 0e7074ac-64bd-11e2-8c38-aacc00000201
=> (column=, value=, timestamp=1364828039093000)
=> (column=character_points, value=, timestamp=1361030686760000)
=> (column=component_id, value=0, timestamp=1364828039093000)
=> (column=corporation_id, value=e398294e-64bc-11e2-8c38-aacc00000201, timestamp=1361030686760000)
=> (column=entity_id, value=0, timestamp=1364828039093000)
=> (column=manufacturing, value=1, timestamp=1362517535613000)
=> (column=model, value=500008, timestamp=1361030686760000)
=> (column=name, value=Tom Bishop, timestamp=1361030686760000)
=> (column=name_check, value=tom_bishop, timestamp=1361030686760000)
=> (column=station_id, value=1110129, timestamp=1364828039093000)
=> (column=stats_intellect, value=9, timestamp=1361030686760000)
=> (column=stats_reflexes, value=7, timestamp=1361030686760000)
=> (column=stats_stamina, value=5, timestamp=1361030686760000)
=> (column=stats_technology, value=9, timestamp=1361030686760000)
=> (column=trading, value=, timestamp=1361030686760000)
-------------------
RowKey: 1b462f09-65f3-4148-a1a6-536b52b3bcfa
=> (column=, value=, timestamp=1366568185096000)
=> (column=character_points, value=100, timestamp=1364580381537000)
=> (column=component_id, value=, timestamp=1364580381537000)
=> (column=corporation_id, value=1d2a8803-d139-4b50-85eb-92cb1082de2e, timestamp=1364580381537000)
=> (column=entity_id, value=0, timestamp=1364580381537000)
=> (column=manufacturing, value=, timestamp=1364580381537000)
=> (column=model, value=500003, timestamp=1364580381537000)
=> (column=name, value=Andrea Len, timestamp=1364580381537000)
=> (column=name_check, value=andrea_len, timestamp=1364580381537000)
=> (column=station_id, value=1110129, timestamp=1364580381537000)
=> (column=stats_intellect, value=10, timestamp=1364580381537000)
=> (column=stats_reflexes, value=10, timestamp=1364580381537000)
=> (column=stats_stamina, value=10, timestamp=1364580381537000)
=> (column=stats_technology, value=10, timestamp=1364580381537000)
=> (column=trading, value=1, timestamp=1366568185096000)

4 Rows Returned.


>From CQLSH, hovewer, the result is different, and 2 rows are missing.


cqlsh:goh_master> select agent_id,name,station_id from agents where station_id='1110129';

 agent_id                             | name           | station_id
--------------------------------------+----------------+------------
 6c8efeb6-7209-11e2-890a-aacc00000216 | Jenny Olifield |    1110129
 0e7074ac-64bd-11e2-8c38-aacc00000201 |     Tom Bishop |    1110129

cqlsh:goh_master> select agent_id, name, station_id from agents where agent_id = '1b462f09-65f3-4148-a1a6-536b52b3bcfa';

 agent_id                             | name       | station_id
--------------------------------------+------------+------------
 1b462f09-65f3-4148-a1a6-536b52b3bcfa | Andrea Len |    1110129


Updating one column makes the single row reappear in the index, but just for that row and
that columns/index.

cqlsh:goh_master> update agents set station_id = '1110129' where agent_id = '1b462f09-65f3-4148-a1a6-536b52b3bcfa';

cqlsh:goh_master> select agent_id,name,station_id from agents where station_id='1110129';

 agent_id                             | name           | station_id
--------------------------------------+----------------+------------
 6c8efeb6-7209-11e2-890a-aacc00000216 | Jenny Olifield |    1110129
 0e7074ac-64bd-11e2-8c38-aacc00000201 |     Tom Bishop |    1110129
 1b462f09-65f3-4148-a1a6-536b52b3bcfa |     Andrea Len |    1110129


Updating one columns does not make all the row re-appear on all indexes (as it would be somewhat
expected), but just on the updated one.

cqlsh:goh_master> select * from agents where name = 'Andrea Len';
cqlsh:goh_master> 


Running nodetool rebuild_index on all three nodes apparently DOES NOT fixes the problem, neither
do nodetool repair.


We also used COPY TO to dump the entire row to check for hidden spaces or anything like that,
but we can't see anything:

....
dc413373-6b06-11e2-8943-aacc00000216,100,,574934cc-6b06-11e2-a512-aacc00000200,0,,500018,Darren
Matar,darren_matar,1110129,10,10,10,10,1
1b462f09-65f3-4148-a1a6-536b52b3bcfa,100,,1d2a8803-d139-4b50-85eb-92cb1082de2e,0,,500003,Andrea
Len,andrea_len,1110129,10,10,10,10,1
....


Logs shows no error. I lowered log level to DEBUG and the only thing that I see when I launch
the query is just:

DEBUG [Thrift:16] 2013-04-22 10:47:27,727 CassandraServer.java (line 1703) execute_cql3_query
DEBUG [Thrift:16] 2013-04-22 10:47:27,737 Tracing.java (line 167) request complete


Attached, a log of the query with TRACE level log.


  was:
We have a 3 nodes cluster, and a keyspace with RF = 3.

>From cassandra-cli everything is fine (we actually never use it, I just launched it for
a check in this particular case).

[default@goh_master] get agents where station_id = ascii(1110129);
-------------------
RowKey: 6c8efeb6-7209-11e2-890a-aacc00000216
=> (column=, value=, timestamp=1364580868176000)
=> (column=character_points, value=, timestamp=1361030686890000)
=> (column=component_id, value=0, timestamp=1364580868176000)
=> (column=corporation_id, value=3efc729e-7209-11e2-890a-aacc00000216, timestamp=1361030686890000)
=> (column=entity_id, value=0, timestamp=1364580868176000)
=> (column=manufacturing, value=, timestamp=1361030686890000)
=> (column=model, value=500005, timestamp=1361030686890000)
=> (column=name, value=Jenny Olifield, timestamp=1361030686890000)
=> (column=name_check, value=jenny_olifield, timestamp=1361030686890000)
=> (column=station_id, value=1110129, timestamp=1364580868176000)
=> (column=stats_intellect, value=8, timestamp=1361030686890000)
=> (column=stats_reflexes, value=8, timestamp=1361030686890000)
=> (column=stats_stamina, value=7, timestamp=1361030686890000)
=> (column=stats_technology, value=7, timestamp=1361030686890000)
=> (column=trading, value=, timestamp=1361030686890000)
-------------------
RowKey: dc413373-6b06-11e2-8943-aacc00000216
=> (column=, value=, timestamp=1366568185220000)
=> (column=character_points, value=100, timestamp=1364580381651000)
=> (column=component_id, value=, timestamp=1364580381651000)
=> (column=corporation_id, value=574934cc-6b06-11e2-a512-aacc00000200, timestamp=1364580381651000)
=> (column=entity_id, value=0, timestamp=1364580381651000)
=> (column=manufacturing, value=, timestamp=1364580381651000)
=> (column=model, value=500018, timestamp=1364580381651000)
=> (column=name, value=Darren Matar, timestamp=1364580381651000)
=> (column=name_check, value=darren_matar, timestamp=1364580381651000)
=> (column=station_id, value=1110129, timestamp=1364580381651000)
=> (column=stats_intellect, value=10, timestamp=1364580381651000)
=> (column=stats_reflexes, value=10, timestamp=1364580381651000)
=> (column=stats_stamina, value=10, timestamp=1364580381651000)
=> (column=stats_technology, value=10, timestamp=1364580381651000)
=> (column=trading, value=1, timestamp=1366568185220000)
-------------------
RowKey: 0e7074ac-64bd-11e2-8c38-aacc00000201
=> (column=, value=, timestamp=1364828039093000)
=> (column=character_points, value=, timestamp=1361030686760000)
=> (column=component_id, value=0, timestamp=1364828039093000)
=> (column=corporation_id, value=e398294e-64bc-11e2-8c38-aacc00000201, timestamp=1361030686760000)
=> (column=entity_id, value=0, timestamp=1364828039093000)
=> (column=manufacturing, value=1, timestamp=1362517535613000)
=> (column=model, value=500008, timestamp=1361030686760000)
=> (column=name, value=Tom Bishop, timestamp=1361030686760000)
=> (column=name_check, value=tom_bishop, timestamp=1361030686760000)
=> (column=station_id, value=1110129, timestamp=1364828039093000)
=> (column=stats_intellect, value=9, timestamp=1361030686760000)
=> (column=stats_reflexes, value=7, timestamp=1361030686760000)
=> (column=stats_stamina, value=5, timestamp=1361030686760000)
=> (column=stats_technology, value=9, timestamp=1361030686760000)
=> (column=trading, value=, timestamp=1361030686760000)
-------------------
RowKey: 1b462f09-65f3-4148-a1a6-536b52b3bcfa
=> (column=, value=, timestamp=1366568185096000)
=> (column=character_points, value=100, timestamp=1364580381537000)
=> (column=component_id, value=, timestamp=1364580381537000)
=> (column=corporation_id, value=1d2a8803-d139-4b50-85eb-92cb1082de2e, timestamp=1364580381537000)
=> (column=entity_id, value=0, timestamp=1364580381537000)
=> (column=manufacturing, value=, timestamp=1364580381537000)
=> (column=model, value=500003, timestamp=1364580381537000)
=> (column=name, value=Andrea Len, timestamp=1364580381537000)
=> (column=name_check, value=andrea_len, timestamp=1364580381537000)
=> (column=station_id, value=1110129, timestamp=1364580381537000)
=> (column=stats_intellect, value=10, timestamp=1364580381537000)
=> (column=stats_reflexes, value=10, timestamp=1364580381537000)
=> (column=stats_stamina, value=10, timestamp=1364580381537000)
=> (column=stats_technology, value=10, timestamp=1364580381537000)
=> (column=trading, value=1, timestamp=1366568185096000)

4 Rows Returned.


>From CQLSH, hovewer, the result is different, and 2 rows are missing.


cqlsh:goh_master> select agent_id,name,station_id from agents where station_id='1110129';

 agent_id                             | name           | station_id
--------------------------------------+----------------+------------
 6c8efeb6-7209-11e2-890a-aacc00000216 | Jenny Olifield |    1110129
 0e7074ac-64bd-11e2-8c38-aacc00000201 |     Tom Bishop |    1110129

cqlsh:goh_master> select agent_id, name, station_id from agents where agent_id = '1b462f09-65f3-4148-a1a6-536b52b3bcfa';

 agent_id                             | name       | station_id
--------------------------------------+------------+------------
 1b462f09-65f3-4148-a1a6-536b52b3bcfa | Andrea Len |    1110129


Updating one column makes the single row reappear in the index, but just for that row and
that columns/index.

cqlsh:goh_master> update agents set station_id = '1110129' where agent_id = '1b462f09-65f3-4148-a1a6-536b52b3bcfa';

cqlsh:goh_master> select agent_id,name,station_id from agents where station_id='1110129';

 agent_id                             | name           | station_id
--------------------------------------+----------------+------------
 6c8efeb6-7209-11e2-890a-aacc00000216 | Jenny Olifield |    1110129
 0e7074ac-64bd-11e2-8c38-aacc00000201 |     Tom Bishop |    1110129
 1b462f09-65f3-4148-a1a6-536b52b3bcfa |     Andrea Len |    1110129


Updating one columns does not make all the row re-appear on all indexes (as it would be somewhat
expected), but just on the updated one.

cqlsh:goh_master> select * from agents where name = 'Andrea Len';
cqlsh:goh_master> 


Running nodetool rebuild_index on all three nodes apparently DOES NOT fixes the problem, neither
do nodetool repair.


We also used COPY TO to dump the entire row to check for hidden spaces or anything like that,
but we can't see anything:

....
dc413373-6b06-11e2-8943-aacc00000216,100,,574934cc-6b06-11e2-a512-aacc00000200,0,,500018,Darren
Matar,darren_matar,1110129,10,10,10,10,1
1b462f09-65f3-4148-a1a6-536b52b3bcfa,100,,1d2a8803-d139-4b50-85eb-92cb1082de2e,0,,500003,Andrea
Len,andrea_len,1110129,10,10,10,10,1
....


Logs shows no error. I lowered log level to DEBUG and the only thing that I see when I launch
the query is just:

DEBUG [Thrift:16] 2013-04-22 10:47:27,727 CassandraServer.java (line 1703) execute_cql3_query
DEBUG [Thrift:16] 2013-04-22 10:47:27,737 Tracing.java (line 167) request complete



    
> Missing data on SELECT on secondary index 
> ------------------------------------------
>
>                 Key: CASSANDRA-5501
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5501
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.2.4
>         Environment: linux ubuntu 12.04
>            Reporter: Marco Matarazzo
>
> We have a 3 nodes cluster, and a keyspace with RF = 3.
> From cassandra-cli everything is fine (we actually never use it, I just launched it for
a check in this particular case).
> [default@goh_master] get agents where station_id = ascii(1110129);
> -------------------
> RowKey: 6c8efeb6-7209-11e2-890a-aacc00000216
> => (column=, value=, timestamp=1364580868176000)
> => (column=character_points, value=, timestamp=1361030686890000)
> => (column=component_id, value=0, timestamp=1364580868176000)
> => (column=corporation_id, value=3efc729e-7209-11e2-890a-aacc00000216, timestamp=1361030686890000)
> => (column=entity_id, value=0, timestamp=1364580868176000)
> => (column=manufacturing, value=, timestamp=1361030686890000)
> => (column=model, value=500005, timestamp=1361030686890000)
> => (column=name, value=Jenny Olifield, timestamp=1361030686890000)
> => (column=name_check, value=jenny_olifield, timestamp=1361030686890000)
> => (column=station_id, value=1110129, timestamp=1364580868176000)
> => (column=stats_intellect, value=8, timestamp=1361030686890000)
> => (column=stats_reflexes, value=8, timestamp=1361030686890000)
> => (column=stats_stamina, value=7, timestamp=1361030686890000)
> => (column=stats_technology, value=7, timestamp=1361030686890000)
> => (column=trading, value=, timestamp=1361030686890000)
> -------------------
> RowKey: dc413373-6b06-11e2-8943-aacc00000216
> => (column=, value=, timestamp=1366568185220000)
> => (column=character_points, value=100, timestamp=1364580381651000)
> => (column=component_id, value=, timestamp=1364580381651000)
> => (column=corporation_id, value=574934cc-6b06-11e2-a512-aacc00000200, timestamp=1364580381651000)
> => (column=entity_id, value=0, timestamp=1364580381651000)
> => (column=manufacturing, value=, timestamp=1364580381651000)
> => (column=model, value=500018, timestamp=1364580381651000)
> => (column=name, value=Darren Matar, timestamp=1364580381651000)
> => (column=name_check, value=darren_matar, timestamp=1364580381651000)
> => (column=station_id, value=1110129, timestamp=1364580381651000)
> => (column=stats_intellect, value=10, timestamp=1364580381651000)
> => (column=stats_reflexes, value=10, timestamp=1364580381651000)
> => (column=stats_stamina, value=10, timestamp=1364580381651000)
> => (column=stats_technology, value=10, timestamp=1364580381651000)
> => (column=trading, value=1, timestamp=1366568185220000)
> -------------------
> RowKey: 0e7074ac-64bd-11e2-8c38-aacc00000201
> => (column=, value=, timestamp=1364828039093000)
> => (column=character_points, value=, timestamp=1361030686760000)
> => (column=component_id, value=0, timestamp=1364828039093000)
> => (column=corporation_id, value=e398294e-64bc-11e2-8c38-aacc00000201, timestamp=1361030686760000)
> => (column=entity_id, value=0, timestamp=1364828039093000)
> => (column=manufacturing, value=1, timestamp=1362517535613000)
> => (column=model, value=500008, timestamp=1361030686760000)
> => (column=name, value=Tom Bishop, timestamp=1361030686760000)
> => (column=name_check, value=tom_bishop, timestamp=1361030686760000)
> => (column=station_id, value=1110129, timestamp=1364828039093000)
> => (column=stats_intellect, value=9, timestamp=1361030686760000)
> => (column=stats_reflexes, value=7, timestamp=1361030686760000)
> => (column=stats_stamina, value=5, timestamp=1361030686760000)
> => (column=stats_technology, value=9, timestamp=1361030686760000)
> => (column=trading, value=, timestamp=1361030686760000)
> -------------------
> RowKey: 1b462f09-65f3-4148-a1a6-536b52b3bcfa
> => (column=, value=, timestamp=1366568185096000)
> => (column=character_points, value=100, timestamp=1364580381537000)
> => (column=component_id, value=, timestamp=1364580381537000)
> => (column=corporation_id, value=1d2a8803-d139-4b50-85eb-92cb1082de2e, timestamp=1364580381537000)
> => (column=entity_id, value=0, timestamp=1364580381537000)
> => (column=manufacturing, value=, timestamp=1364580381537000)
> => (column=model, value=500003, timestamp=1364580381537000)
> => (column=name, value=Andrea Len, timestamp=1364580381537000)
> => (column=name_check, value=andrea_len, timestamp=1364580381537000)
> => (column=station_id, value=1110129, timestamp=1364580381537000)
> => (column=stats_intellect, value=10, timestamp=1364580381537000)
> => (column=stats_reflexes, value=10, timestamp=1364580381537000)
> => (column=stats_stamina, value=10, timestamp=1364580381537000)
> => (column=stats_technology, value=10, timestamp=1364580381537000)
> => (column=trading, value=1, timestamp=1366568185096000)
> 4 Rows Returned.
> From CQLSH, hovewer, the result is different, and 2 rows are missing.
> cqlsh:goh_master> select agent_id,name,station_id from agents where station_id='1110129';
>  agent_id                             | name           | station_id
> --------------------------------------+----------------+------------
>  6c8efeb6-7209-11e2-890a-aacc00000216 | Jenny Olifield |    1110129
>  0e7074ac-64bd-11e2-8c38-aacc00000201 |     Tom Bishop |    1110129
> cqlsh:goh_master> select agent_id, name, station_id from agents where agent_id = '1b462f09-65f3-4148-a1a6-536b52b3bcfa';
>  agent_id                             | name       | station_id
> --------------------------------------+------------+------------
>  1b462f09-65f3-4148-a1a6-536b52b3bcfa | Andrea Len |    1110129
> Updating one column makes the single row reappear in the index, but just for that row
and that columns/index.
> cqlsh:goh_master> update agents set station_id = '1110129' where agent_id = '1b462f09-65f3-4148-a1a6-536b52b3bcfa';
> cqlsh:goh_master> select agent_id,name,station_id from agents where station_id='1110129';
>  agent_id                             | name           | station_id
> --------------------------------------+----------------+------------
>  6c8efeb6-7209-11e2-890a-aacc00000216 | Jenny Olifield |    1110129
>  0e7074ac-64bd-11e2-8c38-aacc00000201 |     Tom Bishop |    1110129
>  1b462f09-65f3-4148-a1a6-536b52b3bcfa |     Andrea Len |    1110129
> Updating one columns does not make all the row re-appear on all indexes (as it would
be somewhat expected), but just on the updated one.
> cqlsh:goh_master> select * from agents where name = 'Andrea Len';
> cqlsh:goh_master> 
> Running nodetool rebuild_index on all three nodes apparently DOES NOT fixes the problem,
neither do nodetool repair.
> We also used COPY TO to dump the entire row to check for hidden spaces or anything like
that, but we can't see anything:
> ....
> dc413373-6b06-11e2-8943-aacc00000216,100,,574934cc-6b06-11e2-a512-aacc00000200,0,,500018,Darren
Matar,darren_matar,1110129,10,10,10,10,1
> 1b462f09-65f3-4148-a1a6-536b52b3bcfa,100,,1d2a8803-d139-4b50-85eb-92cb1082de2e,0,,500003,Andrea
Len,andrea_len,1110129,10,10,10,10,1
> ....
> Logs shows no error. I lowered log level to DEBUG and the only thing that I see when
I launch the query is just:
> DEBUG [Thrift:16] 2013-04-22 10:47:27,727 CassandraServer.java (line 1703) execute_cql3_query
> DEBUG [Thrift:16] 2013-04-22 10:47:27,737 Tracing.java (line 167) request complete
> Attached, a log of the query with TRACE level log.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message