cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Marco Matarazzo (JIRA)" <j...@apache.org>
Subject [jira] [Issue Comment Deleted] (CASSANDRA-5501) Missing data on SELECT on secondary index
Date Mon, 22 Apr 2013 09:51:15 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Marco Matarazzo updated CASSANDRA-5501:
---------------------------------------

    Comment: was deleted

(was: I don't know how the internal mechanism of Cassandra works, so this could absolutely
be no related, but the two columns that are not coming out have a more recent timestamp as
it can be seen from cassandra-cli, and this CF has two sstables on disk, ...-435 and ...-436
.

Trying to rebuild the index with nodetool and with log level lowered to trace, it seems to
me that -436 get skipped due to bloom filter, and i find it strange during an index rebuild
phase. 

I repeat, I have just a very vague sense of how the internal mechanism works, so it could
be perfectly fine and I could be completely wrong, that's why I'm writing this as a comment
:-)

Sensitive (to me) parts of the logs follows:

 INFO [RMI TCP Connection(6)-194.116.72.170] 2013-04-22 11:21:31,491 ColumnFamilyStore.java
(line 552) User Requested secondary index re-build for goh_master/agents indexes
 INFO [RMI TCP Connection(6)-194.116.72.170] 2013-04-22 11:21:31,491 SecondaryIndexManager.java
(line 136) Submitting index build of [agents_station_id] for data in SSTableReader(path='/var/lib/cassandra/data/goh_master/agents/goh_master-agents-ib-435-Data.db'),
SSTableReader(path='/var/lib/cassandra/data/goh_master/agents/goh_master-agents-ib-436-Data.db')
DEBUG [CompactionExecutor:4] 2013-04-22 11:21:31,493 Table.java (line 412) Indexing row a4a0c7d4-a1e9-11e2-a860-aacc00000216

TRACE [CompactionExecutor:4] 2013-04-22 11:21:31,493 CollationController.java (line 233) collectAllData
TRACE [CompactionExecutor:4] 2013-04-22 11:21:31,493 SSTableReader.java (line 748) Cache hit
for KeyCacheKey(/var/lib/cassandra/data/goh_master/agents/goh_master-agents-ib-435, 61346130633764342d613165392d313165322d613836302d616163633030303030323136)
-> org.apache.cassandra.db.RowIndexEntry@39b0595c
TRACE [CompactionExecutor:4] 2013-04-22 11:21:31,493 SliceQueryFilter.java (line 161) collecting
0 of 10000: :false:0@1366040879670000
TRACE [CompactionExecutor:4] 2013-04-22 11:21:31,493 SliceQueryFilter.java (line 161) collecting
1 of 10000: component_id:false:3@1365603563600000
TRACE [CompactionExecutor:4] 2013-04-22 11:21:31,494 SliceQueryFilter.java (line 161) collecting
2 of 10000: corporation_id:false:36@1365603563600000
TRACE [CompactionExecutor:4] 2013-04-22 11:21:31,494 SliceQueryFilter.java (line 161) collecting
3 of 10000: entity_id:false:1@1366040879670000
TRACE [CompactionExecutor:4] 2013-04-22 11:21:31,494 SliceQueryFilter.java (line 161) collecting
4 of 10000: name:false:6@1365603563600000
TRACE [CompactionExecutor:4] 2013-04-22 11:21:31,494 SliceQueryFilter.java (line 161) collecting
5 of 10000: name_check:false:6@1365603563600000
TRACE [CompactionExecutor:4] 2013-04-22 11:21:31,495 SliceQueryFilter.java (line 161) collecting
6 of 10000: station_id:false:36@1366040879670000
TRACE [CompactionExecutor:4] 2013-04-22 11:21:31,495 SliceQueryFilter.java (line 161) collecting
7 of 10000: stats_intellect:false:1@1365603563600000
TRACE [CompactionExecutor:4] 2013-04-22 11:21:31,495 SliceQueryFilter.java (line 161) collecting
8 of 10000: stats_reflexes:false:1@1365603563600000
TRACE [CompactionExecutor:4] 2013-04-22 11:21:31,495 SliceQueryFilter.java (line 161) collecting
9 of 10000: stats_stamina:false:1@1365603563600000
TRACE [CompactionExecutor:4] 2013-04-22 11:21:31,495 SliceQueryFilter.java (line 161) collecting
10 of 10000: stats_technology:false:1@1365603563600000
DEBUG [CompactionExecutor:4] 2013-04-22 11:21:31,496 Table.java (line 412) Indexing row 5315b187-bc33-4747-997a-dcc7bd60d768

TRACE [CompactionExecutor:4] 2013-04-22 11:21:31,496 CollationController.java (line 233) collectAllData
TRACE [CompactionExecutor:4] 2013-04-22 11:21:31,496 SSTableReader.java (line 748) Cache hit
for KeyCacheKey(/var/lib/cassandra/data/goh_master/agents/goh_master-agents-ib-435, 35333135623138372d626333332d343734372d393937612d646363376264363064373638)
-> org.apache.cassandra.db.RowIndexEntry@6ee09a07
TRACE [CompactionExecutor:4] 2013-04-22 11:21:31,496 SliceQueryFilter.java (line 161) collecting
0 of 10000: :false:0@1366568185181000
TRACE [CompactionExecutor:4] 2013-04-22 11:21:31,496 SliceQueryFilter.java (line 161) collecting
1 of 10000: character_points:false:3@1364580381572000

[...lots of these repeated until...]

DEBUG [CompactionExecutor:4] 2013-04-22 11:21:31,642 Table.java (line 412) Indexing row 7b594dc0-a901-11e2-80b4-aacc00000201

TRACE [CompactionExecutor:4] 2013-04-22 11:21:31,642 CollationController.java (line 233) collectAllData
TRACE [CompactionExecutor:4] 2013-04-22 11:21:31,642 SSTableReader.java (line 748) Cache hit
for KeyCacheKey(/var/lib/cassandra/data/goh_master/agents/goh_master-agents-ib-436, 37623539346463302d613930312d313165322d383062342d616163633030303030323031)
-> org.apache.cassandra.db.RowIndexEntry@cb8a0c8
DEBUG [CompactionExecutor:4] 2013-04-22 11:21:31,642 Table.java (line 412) Indexing row 45b2939c-5ce1-45c4-a80e-c55cbe04fb8b

TRACE [CompactionExecutor:4] 2013-04-22 11:21:31,643 CollationController.java (line 233) collectAllData
DEBUG [CompactionExecutor:4] 2013-04-22 11:21:31,643 SSTableReader.java (line 735) Bloom filter
allows skipping sstable 436

[... then again lots of SliceQueryFilter lines with "skipping sstable 436"...]

)
    
> Missing data on SELECT on secondary index 
> ------------------------------------------
>
>                 Key: CASSANDRA-5501
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5501
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.2.4
>         Environment: linux ubuntu 12.04
>            Reporter: Marco Matarazzo
>
> We have a 3 nodes cluster, and a keyspace with RF = 3.
> From cassandra-cli everything is fine (we actually never use it, I just launched it for
a check in this particular case).
> [default@goh_master] get agents where station_id = ascii(1110129);
> -------------------
> RowKey: 6c8efeb6-7209-11e2-890a-aacc00000216
> => (column=, value=, timestamp=1364580868176000)
> => (column=character_points, value=, timestamp=1361030686890000)
> => (column=component_id, value=0, timestamp=1364580868176000)
> => (column=corporation_id, value=3efc729e-7209-11e2-890a-aacc00000216, timestamp=1361030686890000)
> => (column=entity_id, value=0, timestamp=1364580868176000)
> => (column=manufacturing, value=, timestamp=1361030686890000)
> => (column=model, value=500005, timestamp=1361030686890000)
> => (column=name, value=Jenny Olifield, timestamp=1361030686890000)
> => (column=name_check, value=jenny_olifield, timestamp=1361030686890000)
> => (column=station_id, value=1110129, timestamp=1364580868176000)
> => (column=stats_intellect, value=8, timestamp=1361030686890000)
> => (column=stats_reflexes, value=8, timestamp=1361030686890000)
> => (column=stats_stamina, value=7, timestamp=1361030686890000)
> => (column=stats_technology, value=7, timestamp=1361030686890000)
> => (column=trading, value=, timestamp=1361030686890000)
> -------------------
> RowKey: dc413373-6b06-11e2-8943-aacc00000216
> => (column=, value=, timestamp=1366568185220000)
> => (column=character_points, value=100, timestamp=1364580381651000)
> => (column=component_id, value=, timestamp=1364580381651000)
> => (column=corporation_id, value=574934cc-6b06-11e2-a512-aacc00000200, timestamp=1364580381651000)
> => (column=entity_id, value=0, timestamp=1364580381651000)
> => (column=manufacturing, value=, timestamp=1364580381651000)
> => (column=model, value=500018, timestamp=1364580381651000)
> => (column=name, value=Darren Matar, timestamp=1364580381651000)
> => (column=name_check, value=darren_matar, timestamp=1364580381651000)
> => (column=station_id, value=1110129, timestamp=1364580381651000)
> => (column=stats_intellect, value=10, timestamp=1364580381651000)
> => (column=stats_reflexes, value=10, timestamp=1364580381651000)
> => (column=stats_stamina, value=10, timestamp=1364580381651000)
> => (column=stats_technology, value=10, timestamp=1364580381651000)
> => (column=trading, value=1, timestamp=1366568185220000)
> -------------------
> RowKey: 0e7074ac-64bd-11e2-8c38-aacc00000201
> => (column=, value=, timestamp=1364828039093000)
> => (column=character_points, value=, timestamp=1361030686760000)
> => (column=component_id, value=0, timestamp=1364828039093000)
> => (column=corporation_id, value=e398294e-64bc-11e2-8c38-aacc00000201, timestamp=1361030686760000)
> => (column=entity_id, value=0, timestamp=1364828039093000)
> => (column=manufacturing, value=1, timestamp=1362517535613000)
> => (column=model, value=500008, timestamp=1361030686760000)
> => (column=name, value=Tom Bishop, timestamp=1361030686760000)
> => (column=name_check, value=tom_bishop, timestamp=1361030686760000)
> => (column=station_id, value=1110129, timestamp=1364828039093000)
> => (column=stats_intellect, value=9, timestamp=1361030686760000)
> => (column=stats_reflexes, value=7, timestamp=1361030686760000)
> => (column=stats_stamina, value=5, timestamp=1361030686760000)
> => (column=stats_technology, value=9, timestamp=1361030686760000)
> => (column=trading, value=, timestamp=1361030686760000)
> -------------------
> RowKey: 1b462f09-65f3-4148-a1a6-536b52b3bcfa
> => (column=, value=, timestamp=1366568185096000)
> => (column=character_points, value=100, timestamp=1364580381537000)
> => (column=component_id, value=, timestamp=1364580381537000)
> => (column=corporation_id, value=1d2a8803-d139-4b50-85eb-92cb1082de2e, timestamp=1364580381537000)
> => (column=entity_id, value=0, timestamp=1364580381537000)
> => (column=manufacturing, value=, timestamp=1364580381537000)
> => (column=model, value=500003, timestamp=1364580381537000)
> => (column=name, value=Andrea Len, timestamp=1364580381537000)
> => (column=name_check, value=andrea_len, timestamp=1364580381537000)
> => (column=station_id, value=1110129, timestamp=1364580381537000)
> => (column=stats_intellect, value=10, timestamp=1364580381537000)
> => (column=stats_reflexes, value=10, timestamp=1364580381537000)
> => (column=stats_stamina, value=10, timestamp=1364580381537000)
> => (column=stats_technology, value=10, timestamp=1364580381537000)
> => (column=trading, value=1, timestamp=1366568185096000)
> 4 Rows Returned.
> From CQLSH, hovewer, the result is different, and 2 rows are missing.
> cqlsh:goh_master> select agent_id,name,station_id from agents where station_id='1110129';
>  agent_id                             | name           | station_id
> --------------------------------------+----------------+------------
>  6c8efeb6-7209-11e2-890a-aacc00000216 | Jenny Olifield |    1110129
>  0e7074ac-64bd-11e2-8c38-aacc00000201 |     Tom Bishop |    1110129
> cqlsh:goh_master> select agent_id, name, station_id from agents where agent_id = '1b462f09-65f3-4148-a1a6-536b52b3bcfa';
>  agent_id                             | name       | station_id
> --------------------------------------+------------+------------
>  1b462f09-65f3-4148-a1a6-536b52b3bcfa | Andrea Len |    1110129
> Updating one column makes the single row reappear in the index, but just for that row
and that columns/index.
> cqlsh:goh_master> update agents set station_id = '1110129' where agent_id = '1b462f09-65f3-4148-a1a6-536b52b3bcfa';
> cqlsh:goh_master> select agent_id,name,station_id from agents where station_id='1110129';
>  agent_id                             | name           | station_id
> --------------------------------------+----------------+------------
>  6c8efeb6-7209-11e2-890a-aacc00000216 | Jenny Olifield |    1110129
>  0e7074ac-64bd-11e2-8c38-aacc00000201 |     Tom Bishop |    1110129
>  1b462f09-65f3-4148-a1a6-536b52b3bcfa |     Andrea Len |    1110129
> Updating one columns does not make all the row re-appear on all indexes (as it would
be somewhat expected), but just on the updated one.
> cqlsh:goh_master> select * from agents where name = 'Andrea Len';
> cqlsh:goh_master> 
> Running nodetool rebuild_index on all three nodes apparently DOES NOT fixes the problem,
neither do nodetool repair.
> We also used COPY TO to dump the entire row to check for hidden spaces or anything like
that, but we can't see anything:
> ....
> dc413373-6b06-11e2-8943-aacc00000216,100,,574934cc-6b06-11e2-a512-aacc00000200,0,,500018,Darren
Matar,darren_matar,1110129,10,10,10,10,1
> 1b462f09-65f3-4148-a1a6-536b52b3bcfa,100,,1d2a8803-d139-4b50-85eb-92cb1082de2e,0,,500003,Andrea
Len,andrea_len,1110129,10,10,10,10,1
> ....
> Logs shows no error. I lowered log level to DEBUG and the only thing that I see when
I launch the query is just:
> DEBUG [Thrift:16] 2013-04-22 10:47:27,727 CassandraServer.java (line 1703) execute_cql3_query
> DEBUG [Thrift:16] 2013-04-22 10:47:27,737 Tracing.java (line 167) request complete
> Attached, a log of the query with TRACE level log.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message