cassandra-commits mailing list archives

From "DOAN DuyHai (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-11525) StaticTokenTreeBuilder should respect posibility of duplicate tokens
Date Sat, 09 Apr 2016 13:28:25 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-11525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15233546#comment-15233546 ]

DOAN DuyHai commented on CASSANDRA-11525:
-----------------------------------------

[~xedin]   [~jrwest]

OK, the fix is confirmed. I have successfully fetched ~36 million CQL rows using the index
without the exception.

Good job, and thanks for fixing this tricky bug.

> StaticTokenTreeBuilder should respect posibility of duplicate tokens
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-11525
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11525
>             Project: Cassandra
>          Issue Type: Bug
>          Components: sasi
>         Environment: Cassandra 3.5-SNAPSHOT
>            Reporter: DOAN DuyHai
>            Assignee: Jordan West
>             Fix For: 3.5
>
>
> Bug reproduced in *Cassandra 3.5-SNAPSHOT* (after the fix of OOM)
> {noformat}
> create table if not exists test.resource_bench ( 
>  dsr_id uuid,
>  rel_seq bigint,
>  seq bigint,
>  dsp_code varchar,
>  model_code varchar,
>  media_code varchar,
>  transfer_code varchar,
>  commercial_offer_code varchar,
>  territory_code varchar,
>  period_end_month_int int,
>  authorized_societies_txt text,
>  rel_type text,
>  status text,
>  dsp_release_code text,
>  title text,
>  contributors_name list<text>,
>  unic_work text,
>  paying_net_qty bigint,
> PRIMARY KEY ((dsr_id, rel_seq), seq)
> ) WITH CLUSTERING ORDER BY (seq ASC); 
> CREATE CUSTOM INDEX resource_period_end_month_int_idx ON test.resource_bench (period_end_month_int) USING 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {'mode': 'PREFIX'};
> {noformat}
> So the index is a {{DENSE}} numerical index.
> When running the request {{SELECT dsp_code, unic_work, paying_net_qty FROM test.resource_bench WHERE period_end_month_int = 201401}} with server-side paging, I bumped into this stack trace:
> {noformat}
> WARN  [SharedPool-Worker-1] 2016-04-06 00:00:30,825 AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread Thread[SharedPool-Worker-1,5,main]: {}
> java.lang.ArrayIndexOutOfBoundsException: -55
> 	at org.apache.cassandra.db.ClusteringPrefix$Serializer.deserialize(ClusteringPrefix.java:268) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
> 	at org.apache.cassandra.db.Serializers$2.deserialize(Serializers.java:128) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
> 	at org.apache.cassandra.db.Serializers$2.deserialize(Serializers.java:120) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
> 	at org.apache.cassandra.io.sstable.IndexHelper$IndexInfo$Serializer.deserialize(IndexHelper.java:148) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
> 	at org.apache.cassandra.db.RowIndexEntry$Serializer.deserialize(RowIndexEntry.java:218) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
> 	at org.apache.cassandra.io.sstable.format.SSTableReader.keyAt(SSTableReader.java:1823) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
> 	at org.apache.cassandra.index.sasi.SSTableIndex$DecoratedKeyFetcher.apply(SSTableIndex.java:168) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
> 	at org.apache.cassandra.index.sasi.SSTableIndex$DecoratedKeyFetcher.apply(SSTableIndex.java:155) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
> 	at org.apache.cassandra.index.sasi.disk.TokenTree$KeyIterator.computeNext(TokenTree.java:518) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
> 	at org.apache.cassandra.index.sasi.disk.TokenTree$KeyIterator.computeNext(TokenTree.java:504) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
> 	at org.apache.cassandra.index.sasi.utils.AbstractIterator.tryToComputeNext(AbstractIterator.java:116) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
> 	at org.apache.cassandra.index.sasi.utils.AbstractIterator.hasNext(AbstractIterator.java:110) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
> 	at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:374) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
> 	at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:186) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
> 	at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:155) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
> 	at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
> 	at org.apache.cassandra.index.sasi.plan.QueryPlan$ResultIterator.computeNext(QueryPlan.java:106) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
> 	at org.apache.cassandra.index.sasi.plan.QueryPlan$ResultIterator.computeNext(QueryPlan.java:71) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
> 	at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
> 	at org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:72) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
> 	at org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:289) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
> {noformat}
> There are 2 possible root causes:
> 1. Index corrupted
> 2. Raw SSTable is corrupted
> To rule out *scenario 1*, I dropped and rebuilt the index *many times*, but the exception was still there, so I modified the method {{SSTableReader.keyAt(long indexPosition)}} to log the impacted partition:
> {noformat}
>             try
>             {
>                 if (isKeyCacheSetup())
>                     cacheKey(key, rowIndexEntrySerializer.deserialize(in));
>             } catch (IndexOutOfBoundsException ex)
>             {
>                 logger.error(String.format(
>                 "Error when reading index entry for token '%s' at indexPosition %s ",
>                 key.getToken().getTokenValue(), indexPosition));
>             }
> {noformat}
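The modification above catches {{IndexOutOfBoundsException}}, while the stack trace reports {{ArrayIndexOutOfBoundsException}}. That works because the latter is a subclass of the former in the JDK. The following is a minimal, self-contained sketch of the same catch-and-log pattern; the class {{CatchDemo}} and its methods are hypothetical stand-ins and are not Cassandra code.

```java
public class CatchDemo {
    // Hypothetical stand-in for a deserializer that ends up with a bad array index.
    public static int readAt(int[] values, int index) {
        // Throws ArrayIndexOutOfBoundsException when index is negative or too large.
        return values[index];
    }

    // Returns either the value read or a log-style error message,
    // mirroring the try/catch wrapped around the deserialize call above.
    public static String tryRead(int[] values, int index, long token, long indexPosition) {
        try {
            return "value=" + readAt(values, index);
        } catch (IndexOutOfBoundsException ex) {
            // ArrayIndexOutOfBoundsException extends IndexOutOfBoundsException,
            // so this clause also handles the exception seen in the stack trace.
            return String.format(
                "Error when reading index entry for token '%s' at indexPosition %s",
                token, indexPosition);
        }
    }

    public static void main(String[] args) {
        int[] values = {10, 20, 30};
        System.out.println(tryRead(values, 1, 42L, 100L));
        System.out.println(tryRead(values, -55, -7005474773654630139L, 2147457128L));
    }
}
```

Catching the exception this way only records which entry failed; it does not repair the underlying read.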
> Below is the log output after the code modification:
> {noformat}
> system_ns3038406.ip-5-39-72.eu.log:ERROR [SharedPool-Worker-1] 2016-04-07 17:08:28,843 SSTableReader.java:1830 - Error when reading index entry for token '-7005474773654630139' at indexPosition 2147457128
> system_ns3038406.ip-5-39-72.eu.log:ERROR [SharedPool-Worker-1] 2016-04-07 17:08:28,917 SSTableReader.java:1830 - Error when reading index entry for token '-5016711186446865616' at indexPosition 2147458268
> system_ns3038406.ip-5-39-72.eu.log:ERROR [SharedPool-Worker-1] 2016-04-07 17:08:28,918 SSTableReader.java:1830 - Error when reading index entry for token '1027994831942941747' at indexPosition 2147459218
> {noformat}
> I double-checked the original C* data using {{cqlsh}}, but it seems that there is no data for those tokens:
> {noformat}
> SELECT dsr_id,rel_seq FROM resource_bench WHERE token(dsr_id,rel_seq)=-7005474773654630139;
>  dsr_id | rel_seq
> --------+---------
> (0 rows)
>  SELECT dsr_id,rel_seq FROM resource_bench WHERE token(dsr_id,rel_seq)=-5016711186446865616;
>  dsr_id | rel_seq
> --------+---------
> (0 rows)
> SELECT dsr_id,rel_seq FROM resource_bench WHERE token(dsr_id,rel_seq)=1027994831942941747;
>  dsr_id | rel_seq
> --------+---------
> (0 rows)
> {noformat}
> /cc [~xedin] [~beobal]
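
The issue title refers to the possibility of duplicate tokens. As a hedged illustration only, and not the actual {{StaticTokenTreeBuilder}} code, the sketch below shows one way to read that requirement: a structure keyed by token has to keep every offset recorded for a token instead of assuming a single one. The class {{TokenOffsets}} and its method names are hypothetical.

```java
import java.util.Collections;
import java.util.SortedMap;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

public class TokenOffsets {
    // Hypothetical index: token -> all key offsets recorded for that token.
    private final SortedMap<Long, SortedSet<Long>> offsetsByToken = new TreeMap<>();

    // Adding the same token twice merges the offsets instead of replacing them.
    public void add(long token, long keyOffset) {
        offsetsByToken.computeIfAbsent(token, t -> new TreeSet<>()).add(keyOffset);
    }

    // Returns every offset stored for the token, or an empty set if none.
    public SortedSet<Long> offsetsFor(long token) {
        return offsetsByToken.getOrDefault(token, Collections.emptySortedSet());
    }

    // Number of distinct tokens, which can be smaller than the number of entries added.
    public int tokenCount() {
        return offsetsByToken.size();
    }

    public static void main(String[] args) {
        TokenOffsets index = new TokenOffsets();
        index.add(5L, 100L);
        index.add(5L, 200L); // duplicate token, different offset
        index.add(7L, 300L);
        System.out.println(index.tokenCount());   // distinct tokens
        System.out.println(index.offsetsFor(5L)); // both offsets are kept
    }
}
```

With a plain token-to-single-offset map, the second `add` for token 5 would overwrite the first offset, which is the kind of loss this sketch is meant to illustrate.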



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
