incubator-cassandra-user mailing list archives

From Keith Wright <kwri...@nanigans.com>
Subject Re: Bulk loading into CQL3 Composite Columns
Date Fri, 31 May 2013 00:12:52 GMT
StringSerializer and CompositeSerializer are actually from Astyanax, for what it's worth.  I
would recommend you change your table definition so that only val1 is part of the primary
key.  There is no reason to include val2.  Perhaps sending the IndexOutOfBoundsException would
help.

All the StringSerializer is really doing is

ByteBuffer.wrap(obj.getBytes(charset))

Using UTF-8 as the charset (see http://grepcode.com/file/repo1.maven.org/maven2/com.netflix.astyanax/astyanax/1.56.26/com/netflix/astyanax/serializers/StringSerializer.java#StringSerializer)
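
As a rough illustration of the one-liner above, here is a standalone sketch that reproduces it with only the JDK. The class and method names (StringSerializerSketch, toByteBuffer) are made up for this example and are not Astyanax's; the sketch simply wraps the UTF-8 bytes of a string in a ByteBuffer, which is the form newRow() expects for the row key.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for Astyanax's StringSerializer, for illustration only.
class StringSerializerSketch {

    // Wrap the UTF-8 encoding of the string in a ByteBuffer, as described above.
    static ByteBuffer toByteBuffer(String obj) {
        return ByteBuffer.wrap(obj.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        ByteBuffer rowKey = toByteBuffer("20101201");
        // Print the number of bytes that would be written as the row key.
        System.out.println(rowKey.remaining());
    }
}
```

For plain ASCII keys such as "20101201" this should give the same bytes as ByteBufferUtil.bytes(String) from cassandra-all, so no extra library is needed for the row key.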

You can see the source for CompositeSerializer here:  http://grepcode.com/file/repo1.maven.org/maven2/com.netflix.astyanax/astyanax/1.56.26/com/netflix/astyanax/serializers/CompositeSerializer.java

Good luck!

From: Daniel Morton <daniel@djmorton.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Thursday, May 30, 2013 4:33 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Bulk loading into CQL3 Composite Columns

Hi Keith... Thanks for the help.

I'm presently not importing the Hector library (which is where classes like CompositeSerializer
and StringSerializer come from, yes?), only the cassandra-all maven artifact.  Is the behaviour
of the CompositeSerializer much different than using a Builder from a CompositeType?  When
I saw the error about '20101201' failing to decode, I tried only including the values for
val1 and val2 like:


final List<AbstractType<?>> compositeTypes = new ArrayList<>();
compositeTypes.add(IntegerType.instance);
compositeTypes.add(IntegerType.instance);

final CompositeType compType = CompositeType.getInstance(compositeTypes);
final Builder builder = new CompositeType.Builder(compType);

builder.add(bytes(5));
builder.add(bytes(10));

ssTableWriter.newRow(bytes("20101201"));
ssTableWriter.addColumn(builder.build(), ByteBuffer.allocate(0), System.currentTimeMillis());



(where bytes is the statically imported ByteBufferUtil.bytes method)

But doing this resulted in an ArrayIndexOutOfBoundsException from Cassandra.  Is doing this
any different than using the CompositeSerializer you suggest?

Thanks again,

Daniel Morton


On Thu, May 30, 2013 at 3:32 PM, Keith Wright <kwright@nanigans.com> wrote:
You do not want to repeat the first item of your primary key.  If you recall, in CQL3
a primary key defined as below means that the row key is the first item (key) and the
column names are composites of val1 and val2, although I don't see why you need val2 as part
of the primary key in this case.  In any event, you would do something like this (although
I've never tested passing a null value):

ssTableWriter.newRow(StringSerializer.get().toByteBuffer("20101201"));
Composite columnComposite = new Composite();
columnComposite.setComponent(0, 5, IntegerSerializer.get());   // val1
columnComposite.setComponent(1, 10, IntegerSerializer.get());  // val2
ssTableWriter.addColumn(
CompositeSerializer.get().toByteBuffer(columnComposite),
null,
System.currentTimeMillis()
);
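
To make the composite column name above more concrete, here is a minimal JDK-only sketch of the byte layout that a composite name uses, as far as I understand it: each component is written as a two-byte length, the value bytes, and a one-byte end-of-component marker (0). The class and helper names (CompositeNameSketch, build, intBytes) are hypothetical and are not part of Cassandra or Astyanax. The sketch only shows the layout; it does not settle which component types or how many components a given CQL3 table expects.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, for illustration only; not part of Cassandra or Astyanax.
class CompositeNameSketch {

    // Encode each component as: 2-byte length, value bytes, 1-byte end-of-component (0).
    static ByteBuffer build(ByteBuffer... components) {
        int size = 0;
        for (ByteBuffer c : components) {
            size += 2 + c.remaining() + 1;
        }
        ByteBuffer out = ByteBuffer.allocate(size);
        for (ByteBuffer c : components) {
            out.putShort((short) c.remaining());
            out.put(c.duplicate()); // duplicate so the caller's buffer position is untouched
            out.put((byte) 0);
        }
        out.flip();
        return out;
    }

    // Four-byte big-endian encoding of an int (assumed to match ByteBufferUtil.bytes(int)).
    static ByteBuffer intBytes(int value) {
        ByteBuffer b = ByteBuffer.allocate(4);
        b.putInt(value);
        b.flip();
        return b;
    }

    public static void main(String[] args) {
        // Column name made of val1 = 5 and val2 = 10; the row key is not repeated here.
        ByteBuffer name = build(intBytes(5), intBytes(10));
        System.out.println(name.remaining()); // 2 components * (2 + 4 + 1) bytes
    }
}
```

Comparing these bytes with what your Builder or CompositeSerializer produces is one way to check whether the two approaches really differ.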

From: Daniel Morton <daniel@djmorton.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Thursday, May 30, 2013 1:06 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Bulk loading into CQL3 Composite Columns

Hi All.  I am trying to bulk load some data into a CQL3 table using the sstableloader utility
and I am having some difficulty figuring out how to use the SSTableSimpleUnsortedWriter with
composite columns.

I have created this simple contrived table for testing:

create table test (key varchar, val1 int, val2 int, primary key (key, val1, val2));

Loosely following the bulk loading example in the docs, I have constructed the following method
to create my temporary SSTables.

public static void main(String[] args) throws Exception {
   final List<AbstractType<?>> compositeTypes = new ArrayList<>();
   compositeTypes.add(UTF8Type.instance);
   compositeTypes.add(IntegerType.instance);
   compositeTypes.add(IntegerType.instance);
   final CompositeType compType =
      CompositeType.getInstance(compositeTypes);
   SSTableSimpleUnsortedWriter ssTableWriter =
      new SSTableSimpleUnsortedWriter(
         new File("/tmp/cassandra_bulk/bigdata/test"),
         new Murmur3Partitioner() ,
         "bigdata",
         "test",
         compType,
         null,
         128);

   final Builder builder =
      new CompositeType.Builder(compType);

   builder.add(bytes("20101201"));
   builder.add(bytes(5));
   builder.add(bytes(10));

   ssTableWriter.newRow(bytes("20101201"));
   ssTableWriter.addColumn(
         builder.build(),
         ByteBuffer.allocate(0),
         System.currentTimeMillis()
   );

   ssTableWriter.close();
}

When I execute this method and load the data using sstableloader, if I do a 'SELECT * FROM
test' in cqlsh, I get the results:

key      | val1       | val2
----------------------------
20101201 | '20101201' | 5

And the error:  Failed to decode value '20101201' (for column 'val1') as int.

The error I get makes sense, as apparently it tried to place the key value into the val1 column.
From this error, I then assumed that the key value should not be part of the composite type
when the row is added, so I removed the UTF8Type from the composite type, and only added the
two integer values through the builder, but when I repeat the select with that data loaded,
Cassandra throws an ArrayIndexOutOfBoundsException in the ColumnGroupMap class.

Can anyone offer any advice on the correct way to insert data via the bulk loading process
into CQL3 tables with composite columns?  Does the fact that I am not inserting a value for
the columns make a difference?  For my particular use case, all I care about is the values
in the column names themselves (and the associated sorting that goes with them).

Any info or help anyone could provide would be very much appreciated.

Regards,

Daniel Morton

