incubator-cassandra-user mailing list archives

From "Desimpel, Ignace" <Ignace.Desim...@nuance.com>
Subject RE: AssertionError
Date Wed, 18 May 2011 13:09:25 GMT
Great! I'm not using PIG.

Thanks.

-----Original Message-----
From: Sylvain Lebresne [mailto:sylvain@datastax.com] 
Sent: Wednesday, May 18, 2011 3:07 PM
To: user@cassandra.apache.org
Subject: Re: AssertionError

The compose() and decompose() methods of AbstractType are used only by the PIG driver (in
0.7 at least; in 0.8 I think CQL uses them too). If you're not using PIG, you're safe making
those functions simple pass-throughs, i.e., something along the lines of:
  class CustomComparator extends AbstractType<ByteBuffer>
  {
      ...
      public ByteBuffer compose(ByteBuffer v) { return v; }
      public ByteBuffer decompose(ByteBuffer v) { return v; }
  }

I'm not a PIG expert, but even if you're using it I'm not sure how useful it is to actually
diverge from what's above, since PIG probably doesn't know much about your type. In any case,
those functions are not called during "normal" queries.
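
To illustrate the allocation question raised further down in the thread, here is a minimal standalone sketch. The compose()/decompose() methods in it are hypothetical stand-ins, not the Cassandra AbstractType API. It only shows what plain java.nio does: a pass-through returns the same object, while duplicate() allocates a small view that shares the bytes but has its own position.

```java
import java.nio.ByteBuffer;

public class PassThroughSketch {
    // Hypothetical stand-ins for the pass-through pair discussed above;
    // these are NOT the Cassandra AbstractType methods themselves.
    static ByteBuffer compose(ByteBuffer v) { return v; }
    static ByteBuffer decompose(ByteBuffer v) { return v; }

    public static void main(String[] args) {
        ByteBuffer original = ByteBuffer.wrap(new byte[] {1, 2, 3, 4});

        // Pass-through: the very same object comes back, nothing is allocated.
        ByteBuffer same = decompose(compose(original));
        System.out.println(same == original);        // true

        // duplicate(): a new view object over the same bytes.
        ByteBuffer view = original.duplicate();
        view.get();                                   // advances only the view
        System.out.println(original.position());     // 0
        System.out.println(view.position());         // 1

        // The content is shared, so a write through the view is visible
        // through the original buffer as well.
        view.put(1, (byte) 42);
        System.out.println(original.get(1));         // 42
    }
}
```

So duplicate() protects the caller's position and limit but not the bytes themselves; only a full copy would do that, at the cost of an extra allocation per value.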

Sylvain

On Wed, May 18, 2011 at 2:40 PM, Desimpel, Ignace <Ignace.Desimpel@nuance.com> wrote:
> Hi Sylvain,
>
> I did the upgrade from 0.7.4 to 0.7.5 and the exception does not occur anymore (on Windows
> ...). Thanks for pointing me to the bug fix.
> From the 0.7.5 version I upgraded to the 0.7.6 version, and this is also OK, without
> any code changes and while still keeping the same data files generated with the 0.7.4 version.
>
> Could you still give me a comment on the question regarding the AbstractType class change?
> To be on the safe side, I could simply make new array-backed byte buffers (that is what I
> need). But I ask the question because I want to avoid allocating any object if it is not really
> needed, since I know that I will query for a lot of data of that type.
>
> Ignace
>
>
> -----Original Message-----
> From: Desimpel, Ignace [mailto:Ignace.Desimpel@nuance.com]
> Sent: Tuesday, May 17, 2011 3:33 PM
> To: user@cassandra.apache.org
> Subject: RE: AssertionError
>
> Seems like the AbstractType class has changed going from 0.7.4 to 0.7.5.
> It is now required to implement a compose and a decompose method. I already did that, and
> it starts up with the 0.7.5 code using the 0.7.4 data and configuration (using a smaller extra
> test database). Below I made a sample implementation to illustrate another question: in the
> compose method, can I simply create an instance of my own AbstractType class and use the given
> ByteBuffer? Or, like in the decompose example, do I need to duplicate the ByteBuffer, or could
> the paramT object be reused, or should I make a complete copy?
>
>        @Override
>        public Object compose(ByteBuffer paramByteBuffer){
>                ReverseCFFloatValues oNew = new ReverseCFFloatValues();
>                oNew.paramByteBuffer = paramByteBuffer;
>                return oNew;
>        }
>
>        @Override
>        public ByteBuffer decompose(Object paramT){
>                return ((ReverseCFFloatValues)paramT).paramByteBuffer.duplicate();
>        }
>
>
>
> -----Original Message-----
> From: Sylvain Lebresne [mailto:sylvain@datastax.com]
> Sent: Tuesday, May 17, 2011 2:50 PM
> To: user@cassandra.apache.org
> Subject: Re: AssertionError
>
> On Tue, May 17, 2011 at 1:46 PM, Desimpel, Ignace <Ignace.Desimpel@nuance.com> wrote:
>> Ok, I will do that (the next test will be done on some linux boxes being installed now,
>> but at this time I need to go on with the current windows setup).
>> Question: can I use the 0.7.4 data files as is? Do I need to back up the data files
>> in order to be able to get back to the 0.7.4 version if needed?
>
> Yes, you can use 0.7.4 data files as is. And I can't think of a reason why you should
> have a problem getting back to 0.7.4 if needed, though snapshotting beforehand cannot hurt.
>
>>
>> Ignace
>>
>> -----Original Message-----
>> From: Sylvain Lebresne [mailto:sylvain@datastax.com]
>> Sent: Tuesday, May 17, 2011 1:16 PM
>> To: user@cassandra.apache.org
>> Subject: Re: AssertionError
>>
>> First thing to do would be to update to 0.7.5.
>>
>> The AssertionError you're running into is an assertion where we check that a skipBytes
>> did skip all the bytes we had asked it to. As it turns out, the spec for skipBytes allows
>> it to skip fewer bytes than asked, even for no good reason. I'm pretty sure that on a linux box
>> skipBytes on a file will always skip the number of asked bytes unless it reaches EOF, but
>> I see you're running windows, so who knows what can happen.
>>
>> Anyway, long story short, it's a "bug" in 0.7.4 that has been fixed in 0.7.5. If
>> you still run into this in 0.7.5, at least we'll know it's something else (and we will have
>> a more helpful error message).
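
For readers unfamiliar with the skipBytes contract mentioned above, here is a minimal sketch of the usual defensive pattern: call skipBytes in a loop until everything has been skipped. The helper name skipFully is hypothetical, and this is not necessarily how the 0.7.5 fix is written; it only relies on the documented behaviour that DataInput.skipBytes may skip fewer bytes than requested and returns the number actually skipped.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

public class SkipFully {
    // Hypothetical helper: skip exactly n bytes or fail with EOFException.
    static void skipFully(DataInputStream in, int n) throws IOException {
        int remaining = n;
        while (remaining > 0) {
            int skipped = in.skipBytes(remaining);
            if (skipped > 0) {
                remaining -= skipped;
            } else if (in.read() >= 0) {
                // skipBytes made no progress; a successful single-byte read
                // shows we are not at EOF, and it consumes one byte.
                remaining--;
            } else {
                throw new EOFException("EOF with " + remaining + " bytes left to skip");
            }
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {10, 11, 12, 13, 14, 15};
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        skipFully(in, 4);
        System.out.println(in.readByte()); // prints 14
    }
}
```

An assertion on the return value of a single skipBytes call, by contrast, can fail even when the file is intact.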
>>
>> --
>> Sylvain
>>
>> On Tue, May 17, 2011 at 12:41 PM, Desimpel, Ignace <Ignace.Desimpel@nuance.com> wrote:
>>> I use a custom comparator class. So I think there is a high chance
>>> that I do something wrong there. I was thinking that the stack trace
>>> could give a clue and help me on the way, maybe because someone already got the
>>> same error.
>>>
>>>
>>>
>>> Anyway, here is some more information you requested.
>>>
>>>
>>>
>>> Yaml definition :
>>>
>>> name: ForwardStringValues
>>>
>>>           column_type: Super
>>>
>>>           compare_with: be.landc.services.search.server.db.cassandra.node.ForwardCFStringValues
>>>
>>>           compare_subcolumns_with: BytesType
>>>
>>>           keys_cached: 100000
>>>
>>>           rows_cached: 0
>>>
>>>           comment: Stores the values of functions returning string
>>>
>>>           memtable_throughput_in_mb: 64
>>>
>>>           memtable_operations_in_millions: 15
>>>
>>>           min_compaction_threshold: 2
>>>
>>>           max_compaction_threshold: 5
>>>
>>>
>>>
>>> Column Family: ForwardStringValues
>>>
>>>                 SSTable count: 8
>>>
>>>                 Space used (live): 131311776690
>>>
>>>                 Space used (total): 131311776690
>>>
>>>                 Memtable Columns Count: 0
>>>
>>>                 Memtable Data Size: 0
>>>
>>>                 Memtable Switch Count: 0
>>>
>>>                 Read Count: 1
>>>
>>>                 Read Latency: 404.890 ms.
>>>
>>>                 Write Count: 0
>>>
>>>                 Write Latency: NaN ms.
>>>
>>>                 Pending Tasks: 0
>>>
>>>                 Key cache capacity: 100000
>>>
>>>                 Key cache size: 8
>>>
>>>                 Key cache hit rate: 1.0
>>>
>>>                 Row cache: disabled
>>>
>>>                 Compacted row minimum size: 150
>>>
>>>                 Compacted row maximum size: 7152383774
>>>
>>>                 Compacted row mean size: 3064535
>>>
>>>
>>>
>>> No secondary indexes.
>>>
>>> Total database disk size 823 Gb
>>>
>>> disk_access_mode: auto on 64 bit windows os
>>>
>>> partitioner: org.apache.cassandra.dht.ByteOrderedPartitioner
>>>
>>> Data was stored over a period of 5 days.
>>>
>>> Cassandra 0.7.4 was running as an embedded server.
>>>
>>> Batch insert, using the StorageProxy.mutate.
>>>
>>> No errors were logged during the batch insert period.
>>>
>>> The row key is a string representation of a positive integer value.
>>>
>>> The same row key is used during many different mutate calls, but all 
>>> super column names are different for each call.
>>>
>>> The name of each super column stored is composed of 32
>>> bytes, the bytes of 2 integer (positive and negative) values, and
>>> the bytes (UTF8) of the string value: [32 bytes][4 int bytes][4 int
>>> bytes][string bytes]
>>>
>>> The custom comparator class ...ForwardCFStringValues sorts the names 
>>> by first sorting the string , then the 32 bytes, and then the two 
>>> integer values
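
The ordering described above (string first, then the 32 bytes, then the two integers) could be sketched roughly as follows. This is a standalone illustration and not the actual ForwardCFStringValues class, which the thread does not show. It assumes signed byte-wise comparison, big-endian integers, and a hypothetical name() helper for building test values; none of these details is confirmed by the original message.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class ColumnNameOrder {
    static final int PREFIX = 32;              // leading fixed-size block
    static final int HEADER = PREFIX + 4 + 4;  // block + two ints

    // Lexicographic comparison of two byte ranges (signed bytes: an assumption).
    static int compareRange(ByteBuffer a, int aFrom, int aLen,
                            ByteBuffer b, int bFrom, int bLen) {
        int n = Math.min(aLen, bLen);
        for (int i = 0; i < n; i++) {
            int c = Byte.compare(a.get(aFrom + i), b.get(bFrom + i));
            if (c != 0) return c;
        }
        return Integer.compare(aLen, bLen);
    }

    // Layout [32 bytes][int][int][string bytes]; only absolute reads are used,
    // so neither buffer's position is modified.
    static int compare(ByteBuffer a, ByteBuffer b) {
        int aBase = a.position(), bBase = b.position();
        int c = compareRange(a, aBase + HEADER, a.remaining() - HEADER,
                             b, bBase + HEADER, b.remaining() - HEADER);
        if (c != 0) return c;                                   // 1. string
        c = compareRange(a, aBase, PREFIX, b, bBase, PREFIX);
        if (c != 0) return c;                                   // 2. 32 bytes
        c = Integer.compare(a.getInt(aBase + PREFIX), b.getInt(bBase + PREFIX));
        if (c != 0) return c;                                   // 3. first int
        return Integer.compare(a.getInt(aBase + PREFIX + 4),
                               b.getInt(bBase + PREFIX + 4));   // 4. second int
    }

    // Hypothetical helper that builds a column name in the layout above.
    static ByteBuffer name(byte fill, int i1, int i2, String s) {
        byte[] str = s.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(HEADER + str.length);
        for (int i = 0; i < PREFIX; i++) buf.put(fill);
        buf.putInt(i1).putInt(i2).put(str);
        buf.flip();
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer low = name((byte) 0, 0, 0, "ABC");
        ByteBuffer high = name((byte) 0x7F, Integer.MAX_VALUE, Integer.MAX_VALUE, "ABC");
        ByteBuffer other = name((byte) 0, 0, 0, "ABD");
        System.out.println(compare(low, high) < 0);   // true: same string, smaller prefix
        System.out.println(compare(high, other) < 0); // true: the string dominates
    }
}
```

Because the string is compared first, a start and finish name that share the same string bytes bracket every column with that string, which is the idea behind the slice query described further down.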
>>>
>>> For each column name two subcolumns are inserted with fixed name and 
>>> some small binary value (about 40 bytes)
>>>
>>>
>>>
>>> The query :
>>>
>>> Get_slice using thrift.
>>>
>>> Params :
>>>
>>>   Row key : the string representation of the positive integer String '1788'
>>> thus hex values 31 37 38 38
>>>
>>>   ColumnParent : the column family ForwardStringValues
>>>
>>>   SlicePredicate : SlicePredicate(slice_range:SliceRange(start:00 00
>>> 00 00
>>> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>> 00 00
>>> 00 00 00 FF FF FF FF FF FF FF FF 55 52 49 4E 41 52 59 20 54 52 41 43
>>> 54 20
>>> 49 4E 46 45 43 54 49 4F 4E, finish:7F 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F 
>>> 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F FF FF 
>>> FF 7F FF FF FF 7F 55 52 49 4E 41 52 59 20 54 52 41 43 54 20 49 4E 46
>>> 45
>>> 43 54 49 4F 4E, reversed:false, count:10000))
>>>
>>>
>>>
>>> This SlicePredicate is supposed to fetch all the columns with the 
>>> string '55
>>> 52 49 4E 41 52 59 20 54 52 41 43 54 20 49 4E 46 45 43 54 49 4F 4E'
>>> regardless of the other bytes in the column name. So the start and 
>>> finish have the same string bytes. The rest of the bytes for the 
>>> start values are set to the lowest possible value (32 zero bytes and 
>>> the bytes FFFFFFFF representing the integer value -1) , the finish 
>>> is set the highest possible value (32 bytes with value 7F, ...)
>>>
>>>
>>>
>>> I tested the same code but with a small data set and all seemed to be OK.
>>> Even on the same database I get back results without exception if I 
>>> use different String values. I'm almost sure that there should be 
>>> columns with that string. If the string is not present I don't get the error.
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> From: Aaron Morton [mailto:aaron@thelastpickle.com]
>>> Sent: Monday, May 16, 2011 11:33 PM
>>> To: user@cassandra.apache.org
>>> Subject: Re: AssertionError
>>>
>>>
>>>
>>> The code is trying to follow the column index for a row in an
>>> sstable, but it cannot skip as many bytes as it would like to in order to get to the column.
>>> Helpfully, the help says running out of bytes is only one of the
>>> reasons why this could happen :)
>>>
>>>
>>>
>>> Can you provide some more information about the query and the data, 
>>> and also the upgrade history for your cluster.
>>>
>>>
>>>
>>> Thanks
>>>
>>> Aaron
>>>
>>> On 17/05/2011, at 3:07 AM, "Desimpel, Ignace" <Ignace.Desimpel@nuance.com> wrote:
>>>
>>> Environment : java 64 bit server, java client, thrift get_slice 
>>> method, Cassandra 0.7.4, single node
>>>
>>> Depending on the data I pass for a query on a CF I get the following 
>>> listed below. Any suggestions what could be wrong based on the stack trace?
>>>
>>>
>>>
>>> java.lang.AssertionError
>>>     at org.apache.cassandra.db.columniterator.IndexedSliceReader$IndexedBlockFetcher.getNextBlock(IndexedSliceReader.java:176)
>>>     at org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(IndexedSliceReader.java:120)
>>>     at org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(IndexedSliceReader.java:48)
>>>     at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
>>>     at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
>>>     at org.apache.cassandra.db.columniterator.SSTableSliceIterator.hasNext(SSTableSliceIterator.java:108)
>>>     at org.apache.commons.collections.iterators.CollatingIterator.set(CollatingIterator.java:282)
>>>     at org.apache.commons.collections.iterators.CollatingIterator.least(CollatingIterator.java:325)
>>>     at org.apache.commons.collections.iterators.CollatingIterator.next(CollatingIterator.java:229)
>>>     at org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:68)
>>>     at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
>>>     at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
>>>     at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:116)
>>>     at org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(QueryFilter.java:130)
>>>     at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1368)
>>>     at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1245)
>>>     at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1173)
>>>     at org.apache.cassandra.db.Table.getRow(Table.java:333)
>>>     at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:63)
>>>     at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:453)
>>>     at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>     at java.lang.Thread.run(Thread.java:662)
>>>
>>>
>>>
>>> Ignace Desimpel
>>
>
