incubator-cassandra-user mailing list archives

From "Desimpel, Ignace" <Ignace.Desim...@nuance.com>
Subject RE: Question about AbstractType class
Date Wed, 20 Apr 2011 13:06:40 GMT


-----Original Message-----
From: Sylvain Lebresne [mailto:sylvain@datastax.com] 
Sent: Wednesday, April 20, 2011 2:07 PM
To: user@cassandra.apache.org
Subject: Re: Question about AbstractType class

On Wed, Apr 20, 2011 at 1:35 PM, Desimpel, Ignace <Ignace.Desimpel@nuance.com> wrote:
> Cassandra version 0.7.4
>
>
>
> Hi,
>
>
>
> I created my own java class as an extension of the AbstractType class. 
> But I'm not sure about the following items related to the compare function :
>
> # The remaining bytes of the buffer sometimes is zero during thrift 
> get_slice execution, however I never store any zero length column name 
> nor query for it . If normal, what would be the correct handling of 
> the zero remaining bytes?

It is normal: the empty ByteBuffer is used in slice queries to indicate the beginning of the
row (start=""). More generally, compare and validate should work for anything you store, but
also for anything you provide as the 'start' and 'end' arguments of slices.

> Would it be something like :
>
> public int compare(ByteBuffer o1, ByteBuffer o2) {
>     int ar1Rem = o1.remaining();
>     int ar2Rem = o2.remaining();
>     if (ar1Rem == 0 || ar2Rem == 0) {
>         if (ar1Rem != 0) {
>             return 1;
>         } else if (ar2Rem != 0) {
>             return -1;
>         } else {
>             return 0;
>         }
>     }
>     // Add the real compare here
>     .......
> }

That looks reasonable (though not optimal in the number of comparisons :))
->OK
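
For illustration, the empty-buffer check above can be written with fewer comparisons. This is only a sketch: the class name EmptyFirstCompare and the nonEmptyOrder parameter are hypothetical stand-ins for "the real compare", and the code is written as a standalone helper rather than as an AbstractType subclass.

```java
import java.nio.ByteBuffer;
import java.util.Comparator;

public final class EmptyFirstCompare {
    private EmptyFirstCompare() {}

    /**
     * Sorts an empty buffer before any non-empty one, matching the return
     * values of the quoted code. nonEmptyOrder is a hypothetical placeholder
     * for the type-specific comparison and is only called when both buffers
     * have bytes remaining.
     */
    public static int compare(ByteBuffer o1, ByteBuffer o2,
                              Comparator<ByteBuffer> nonEmptyOrder) {
        if (!o1.hasRemaining()) {
            // o1 is empty: equal if o2 is empty too, otherwise o1 sorts first.
            return o2.hasRemaining() ? -1 : 0;
        }
        if (!o2.hasRemaining()) {
            // o1 is non-empty and o2 is empty: o1 sorts after o2.
            return 1;
        }
        return nonEmptyOrder.compare(o1, o2);
    }
}
```

At most two remaining-bytes checks run on any path, against up to four in the quoted version, and the results are the same.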

> # Since in version 0.6.3 the same function was passing an array of 
> bytes, I assumed that I could now call the ByteBuffer.array() function 
> in order to get the array of bytes backing up the ByteBuffer.

It's not that simple. First, even if you use ByteBuffer.array(), keep in mind that the
ByteBuffer has a position, a limit and an arrayOffset, and you must take all of them into
account when accessing the backing array. There is also no guarantee that the ByteBuffer
has a backing array at all, so you need to handle that case too (I refer you to the
ByteBuffer documentation).
->OK
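
A minimal sketch of what that means in practice, using only standard java.nio calls. The class and method names (ByteBufferBytes, remainingBytes) are made up for this example. It copies the readable bytes without moving the buffer's position, whether or not a backing array is available.

```java
import java.nio.ByteBuffer;

public final class ByteBufferBytes {
    private ByteBufferBytes() {}

    /** Copies the bytes between position and limit; the buffer itself is not modified. */
    public static byte[] remainingBytes(ByteBuffer buf) {
        int length = buf.remaining();
        byte[] out = new byte[length];
        if (buf.hasArray()) {
            // The readable region starts at arrayOffset() + position() in the
            // backing array, which may be larger than the region itself.
            System.arraycopy(buf.array(), buf.arrayOffset() + buf.position(), out, 0, length);
        } else {
            // No accessible backing array (for example a direct or read-only
            // buffer): read through a duplicate so the original position stays put.
            buf.duplicate().get(out);
        }
        return out;
    }
}
```

Note that this still allocates an array per call, so inside a compare function it is usually better to read the bytes in place.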

> Also the length of the
> byte array in 0.6.3 seemed always to correspond to the bytes of column 
> name stored. But now in version 0.7.4 that ByteBuffer is not always 
> backed by such an array.
>
> I can still get around this by making the needed buffer myself like :
>
> int ar2Rem = o2.remaining();
> byte[] ar2 = new byte[ar2Rem];
> o2.get(ar2, 0, ar2Rem);
>
> Question is: are the remaining bytes the actual bytes for this column
> name (e.g. 20 bytes), or could that ByteBuffer ever be a wrapper around
> some larger stream of data, so that the number of remaining bytes could be 10 M bytes?
> In that case I would not be able to detect the end of the column to compare,
> and I would possibly be allocating a large unneeded byte array.

As said above, the remaining bytes won't (always) be the actual bytes.
->Then how do I know the end is near? E.g., if the stored value is a char string, it would
be nice to know the end, unless I also store it before the char string.
->Assuming that both ByteBuffers have the same data and the same position and limit, and thus
the same remaining, one can imagine a loop comparing each byte until the remaining is used up.
At that point I cannot get any more data, so I should return 0?

> #Using the ByteBuffer's 'get' function also updates the position of 
> the ByteBuffer. Is the compare function expected to do that or should 
> it reset the position back to what it was or ...?

Neither. You should *not* use any function that changes the ByteBuffer position.
That is, changing it and resetting it afterward is *not* ok.
->OK
Instead you should use only the absolute get() methods, which do not change the position
at all.
Or, you start your compare function by calling ByteBuffer.duplicate() on both buffers, and
then you're free to change the position of the duplicates.
->OK
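
Both options can be sketched as follows. The unsigned byte-by-byte ordering is an assumption made for the example, not the ordering of the custom type discussed in this thread. The class and method names are hypothetical, and the code is standalone rather than an AbstractType subclass. When both buffers run out with every byte equal, the result is 0.

```java
import java.nio.ByteBuffer;

public final class PositionSafeCompare {
    private PositionSafeCompare() {}

    /** Option 1: absolute get(int) only, so position and limit are never touched. */
    public static int compareAbsolute(ByteBuffer o1, ByteBuffer o2) {
        int p1 = o1.position();
        int p2 = o2.position();
        int n1 = o1.remaining();
        int n2 = o2.remaining();
        int common = Math.min(n1, n2);
        for (int i = 0; i < common; i++) {
            // Mask to 0..255 so bytes compare as unsigned (assumed ordering).
            int b1 = o1.get(p1 + i) & 0xFF;
            int b2 = o2.get(p2 + i) & 0xFF;
            if (b1 != b2) {
                return b1 < b2 ? -1 : 1;
            }
        }
        // Common prefix is identical: the shorter buffer sorts first, so an
        // empty buffer sorts before everything and equal buffers give 0.
        return Integer.compare(n1, n2);
    }

    /** Option 2: relative get() on duplicates; the originals keep their position. */
    public static int compareOnDuplicates(ByteBuffer o1, ByteBuffer o2) {
        ByteBuffer d1 = o1.duplicate();
        ByteBuffer d2 = o2.duplicate();
        while (d1.hasRemaining() && d2.hasRemaining()) {
            int b1 = d1.get() & 0xFF;
            int b2 = d2.get() & 0xFF;
            if (b1 != b2) {
                return b1 < b2 ? -1 : 1;
            }
        }
        // At least one duplicate is exhausted; whichever still has bytes is larger.
        return Integer.compare(d1.remaining(), d2.remaining());
    }
}
```

duplicate() shares the underlying bytes and copies only the position, limit and mark, so option 2 costs two small objects per call while option 1 allocates nothing.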

--
Sylvain

Thanks Sylvain!
