cassandra-user mailing list archives

From Leonid Ilyevsky <lilyev...@mooncapital.com>
Subject RE: Dynamic CF
Date Tue, 10 Jul 2012 19:02:10 GMT
I see now that there is a package, org.apache.cassandra.cql3.statements, with a BatchStatement class.
Is this what I should use?

-----Original Message-----
From: Leonid Ilyevsky [mailto:lilyevsky@mooncapital.com]
Sent: Tuesday, July 10, 2012 11:45 AM
To: user@cassandra.apache.org
Subject: RE: Dynamic CF

I see. I actually tried it, and it consistently throws an exception. Below is my test code.
I have two tests: test1 is for the composite key case, and test2 is for the simple key case.
test2 works fine, while test1 gives me:

Exception in thread "main" InvalidRequestException(why:Not enough bytes to read value of component 0)
        at org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:20253)
        at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
        at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:922)
        at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:908)
        at com.moon.cql.BatchTest.test1(BatchTest.java:99)
        at com.moon.cql.BatchTest.main(BatchTest.java:45)


So you suggest using the BATCH statement. Since I do it from Java, that means creating a huge string
(I may need to update thousands of records at once) and executing it. Does that even make sense?
Why is this going to be any better than simply executing a prepared statement multiple times?
The only thing it does is reduce the number of calls to the server, but I have to figure out whether
this is the bottleneck I need to optimize.
Or maybe I need to break all my updates into a number of batches.
By the way, can a batch statement be prepared? With thousands of question marks in it?
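
To make the "break all my updates into a number of batches" idea concrete, here is a minimal
sketch that only builds the CQL3 BATCH strings, one per chunk of rows. The table name quotes,
its columns (name, time, price), the Quote class and the chunk size are all assumptions made
for illustration; nothing here talks to a server, and whether batching beats repeated
prepared statements would still have to be measured.

```java
import java.util.ArrayList;
import java.util.List;

class BatchChunks {

    /** Hypothetical quote row: (name, time in ms, price). Not from the original post. */
    static final class Quote {
        final String name;
        final long timeMs;
        final double price;

        Quote(String name, long timeMs, double price) {
            this.name = name;
            this.timeMs = timeMs;
            this.price = price;
        }
    }

    /** Builds one BEGIN BATCH ... APPLY BATCH statement for the given rows. */
    static String toBatch(List<Quote> rows) {
        StringBuilder sb = new StringBuilder("BEGIN BATCH\n");
        for (Quote q : rows) {
            sb.append("  INSERT INTO quotes (name, time, price) VALUES ('")
              .append(q.name.replace("'", "''"))   // escape single quotes in the text literal
              .append("', ").append(q.timeMs)
              .append(", ").append(q.price)
              .append(");\n");
        }
        return sb.append("APPLY BATCH;").toString();
    }

    /** Splits the rows into chunks of at most chunkSize and builds one batch per chunk. */
    static List<String> toBatches(List<Quote> rows, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<String> batches = new ArrayList<>();
        for (int i = 0; i < rows.size(); i += chunkSize) {
            int end = Math.min(i + chunkSize, rows.size());
            batches.add(toBatch(rows.subList(i, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Quote> rows = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            rows.add(new Quote("IBM", 1341946930000L + i, 190.0 + i));
        }
        // Each string could then be passed to whatever CQL3 query call the client uses.
        for (String batch : toBatches(rows, 2)) {
            System.out.println(batch);
            System.out.println();
        }
    }
}
```

Each resulting string is one server call, so the chunk size trades the number of round trips
against the size of each request.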


============================================================

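package com.moon.cql;

import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.cassandra.db.marshal.AsciiType;
import org.apache.cassandra.db.marshal.DoubleType;
import org.apache.cassandra.db.marshal.LongType;
import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.Column;
import org.apache.cassandra.thrift.ColumnOrSuperColumn;
import org.apache.cassandra.thrift.ConsistencyLevel;
import org.apache.cassandra.thrift.InvalidRequestException;
import org.apache.cassandra.thrift.Mutation;
import org.apache.cassandra.thrift.TimedOutException;
import org.apache.cassandra.thrift.UnavailableException;
import org.apache.thrift.TException;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.protocol.TProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TTransport;
import org.apache.thrift.transport.TTransportException;

public class BatchTest {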

    /**
     * @param args the command line arguments
     */
    public static void main(String[] args) throws TTransportException,
            InvalidRequestException, TException, UnavailableException,
            TimedOutException {

        String host = args[0];
        int port = Integer.parseInt(args[1]);

        test1(host, port);
        //test2(host, port);
    }

    private static void test1(String host, int port) throws TTransportException,
            InvalidRequestException, TException, UnavailableException,
            TimedOutException {
        TTransport transport =
                new TFramedTransport(new org.apache.thrift.transport.TSocket(
                host, port));
        transport.open();
        TProtocol protocol = new TBinaryProtocol(transport);
        Cassandra.Client client = new Cassandra.Client(protocol);
        client.set_cql_version("3.0.0");
        client.set_keyspace("test");

        Map<ByteBuffer, Map<String, List<Mutation>>> mutationMap =
                new HashMap<>();

        Map<String, List<Mutation>> mutations = new HashMap<>();
        List<Mutation> columnsMutations = new ArrayList<>();

        // key
        ByteBuffer keyBuffer = AsciiType.instance.decompose("KEY1");

        // key1 as column
        Column key1 = new Column();
        key1.setName("key1".getBytes());
        key1.setValue(LongType.instance.decompose(System.nanoTime()));
        key1.setTimestamp(System.currentTimeMillis());
        ColumnOrSuperColumn cc = new ColumnOrSuperColumn();
        cc.setColumn(key1);
        Mutation m = new Mutation();
        m.setColumn_or_supercolumn(cc);
        columnsMutations.add(m);

        // value column
        Column value = new Column();
        value.setName("value".getBytes());
        value.setValue(DoubleType.instance.decompose(5.3));
        value.setTimestamp(System.currentTimeMillis());
        cc = new ColumnOrSuperColumn();
        cc.setColumn(value);
        m = new Mutation();
        m.setColumn_or_supercolumn(cc);
        columnsMutations.add(m);

        // Inner mutation map
        mutations.put("testtable1", columnsMutations);

        // outer map : use the partition key
        mutationMap.put(keyBuffer, mutations);

        // Execute
        client.batch_mutate(mutationMap, ConsistencyLevel.ANY);
    }

    private static void test2(String host, int port) throws TTransportException,
            InvalidRequestException, TException, UnavailableException,
            TimedOutException {
        TTransport transport =
                new TFramedTransport(new org.apache.thrift.transport.TSocket(
                host, port));
        transport.open();
        TProtocol protocol = new TBinaryProtocol(transport);
        Cassandra.Client client = new Cassandra.Client(protocol);
        client.set_cql_version("3.0.0");
        client.set_keyspace("test");

        Map<ByteBuffer, Map<String, List<Mutation>>> mutationMap =
                new HashMap<>();

        Map<String, List<Mutation>> mutations = new HashMap<>();
        List<Mutation> columnsMutations = new ArrayList<>();

        // key
        ByteBuffer keyBuffer = AsciiType.instance.decompose("KEY1");

        // value column
        Column value = new Column();
        value.setName("value".getBytes());
        value.setValue(DoubleType.instance.decompose(5.3));
        value.setTimestamp(System.currentTimeMillis());
        ColumnOrSuperColumn cc = new ColumnOrSuperColumn();
        cc.setColumn(value);
        Mutation m = new Mutation();
        m.setColumn_or_supercolumn(cc);
        columnsMutations.add(m);

        // Inner mutation map
        mutations.put("testtable2", columnsMutations);

        // outer map : use the partition key
        mutationMap.put(keyBuffer, mutations);

        // Execute
        client.batch_mutate(mutationMap, ConsistencyLevel.ANY);
    }
}

============================================================

-----Original Message-----
From: Sylvain Lebresne [mailto:sylvain@datastax.com]
Sent: Tuesday, July 10, 2012 10:37 AM
To: user@cassandra.apache.org
Subject: Re: Dynamic CF

On Tue, Jul 10, 2012 at 4:17 PM, Leonid Ilyevsky
<lilyevsky@mooncapital.com> wrote:
> So I guess, in the batch_mutate call, in the map that I pass to it, only the first element
> of the composite key should be used as a key (because it is the real key), and the other parts
> of the key should be passed as regular columns? Is this correct? While I am waiting for your
> confirmation, I am going to try it.

I would really advise you to use the BATCH statement of CQL3 rather
than the thrift batch_mutate call. If only because until
https://issues.apache.org/jira/browse/CASSANDRA-4377 is resolved it
won't work at all, but also because the whole point of CQL3 is to hide
that kind of complexity.

--
Sylvain
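
As one illustration of the kind of complexity meant here: a table with a composite primary key
stores its column names in a composite layout, where each component is written as a 2-byte
length, the component bytes, and a 1-byte end-of-component marker. The sketch below is a
stand-alone approximation of that layout (the class and method names are made up, and it is
not the server's code); it shows why a plain name such as "key1" is likely to be rejected
with "Not enough bytes to read value of component 0". It does not make batch_mutate work
against such a table.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class CompositeNameSketch {

    /** Encodes components as: 2-byte length, bytes, 1-byte end-of-component (0). */
    static ByteBuffer encode(byte[]... components) {
        int size = 0;
        for (byte[] c : components) {
            if (c.length > 0xFFFF) {
                throw new IllegalArgumentException("component too long");
            }
            size += 2 + c.length + 1;
        }
        ByteBuffer bb = ByteBuffer.allocate(size);
        for (byte[] c : components) {
            bb.putShort((short) c.length);
            bb.put(c);
            bb.put((byte) 0);
        }
        bb.flip();
        return bb;
    }

    /** Returns the index of the first component that cannot be read, or -1 if the layout parses. */
    static int firstBadComponent(ByteBuffer name) {
        ByteBuffer bb = name.duplicate();
        int i = 0;
        while (bb.remaining() > 0) {
            if (bb.remaining() < 2) return i;            // no room for the length
            int len = bb.getShort() & 0xFFFF;            // unsigned 2-byte length
            if (bb.remaining() < len) return i;          // "not enough bytes to read value"
            bb.position(bb.position() + len);
            if (bb.remaining() < 1) return i;            // missing end-of-component byte
            bb.get();
            i++;
        }
        return -1;
    }

    public static void main(String[] args) {
        // Raw ASCII bytes: 'k','e' are read as the length 0x6B65, far more than remains.
        ByteBuffer raw = ByteBuffer.wrap("key1".getBytes(StandardCharsets.US_ASCII));
        System.out.println("raw name, first bad component: " + firstBadComponent(raw));

        // A two-component name: an 8-byte long followed by the CQL column name.
        byte[] time = ByteBuffer.allocate(8).putLong(42L).array();
        byte[] col = "value".getBytes(StandardCharsets.US_ASCII);
        System.out.println("composite name, first bad component: "
                + firstBadComponent(encode(time, col)));
    }
}
```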

>
> -----Original Message-----
> From: Sylvain Lebresne [mailto:sylvain@datastax.com]
> Sent: Tuesday, July 10, 2012 8:24 AM
> To: user@cassandra.apache.org
> Subject: Re: Dynamic CF
>
> On Fri, Jul 6, 2012 at 10:49 PM, Leonid Ilyevsky
> <lilyevsky@mooncapital.com> wrote:
>> At this point I am really confused about what direction Cassandra is going. CQL 3
>> has the benefit of composite keys, but no dynamic columns.
>> I thought, the whole point of Cassandra was to provide dynamic tables.
>
> CQL3 absolutely provides "dynamic tables"/wide rows; the syntax is just
> different. The typical example of a wide row is a time series, for
> instance keeping all the events for a given event_kind in the same C*
> row ordered by time. You declare that in CQL3 using:
>   CREATE TABLE events (
>     event_kind text,
>     time timestamp,
>     event_name text,
>     event_details text,
>     PRIMARY KEY (event_kind, time)
>   )
>
> The important part of such a definition is that one CQL row (i.e. a given
> event_kind, time, event_name, event_details) does not map to one internal
> Cassandra row. More precisely, all events sharing the same event_kind will be
> in the same internal row. This is a wide row/dynamic table in the sense of
> thrift.
>
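
The mapping described above can be mimicked with plain Java collections. This is only a
simplified model, assuming the events table from the example: the outer map is keyed by
event_kind (one entry per internal row), and each internal row is a sorted map keyed by time.
The class and method names are hypothetical, and the real storage format has more detail than
this.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class WideRowSketch {

    /** One CQL row of the events table. */
    static final class Event {
        final String kind;
        final long time;
        final String name;
        final String details;

        Event(String kind, long time, String name, String details) {
            this.kind = kind;
            this.time = time;
            this.name = name;
            this.details = details;
        }
    }

    /** Groups CQL rows into internal rows: event_kind -> (time -> column values). */
    static Map<String, TreeMap<Long, Map<String, String>>> toInternalRows(List<Event> events) {
        Map<String, TreeMap<Long, Map<String, String>>> rows = new TreeMap<>();
        for (Event e : events) {
            TreeMap<Long, Map<String, String>> wideRow = rows.get(e.kind);
            if (wideRow == null) {
                wideRow = new TreeMap<>();       // kept sorted by time
                rows.put(e.kind, wideRow);
            }
            Map<String, String> cells = new TreeMap<>();
            cells.put("event_name", e.name);
            cells.put("event_details", e.details);
            wideRow.put(e.time, cells);
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Event> events = new ArrayList<>();
        events.add(new Event("login", 20L, "second", "details 2"));
        events.add(new Event("login", 10L, "first", "details 1"));
        events.add(new Event("logout", 15L, "third", "details 3"));
        // Three CQL rows end up in two internal rows, one per event_kind.
        System.out.println(toInternalRows(events));
    }
}
```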
>
>> I need to have a huge table to store market quotes, and be able to query it by name
>> and timestamp (t1 <= t <= t2), therefore I wanted the composite key.
>> Loading data to such table using prepared statements (CQL 3-based) was very slow,
>> because it makes a server call for each row.
>
> You should use a BATCH statement, which is the equivalent of batch_mutate.
>
> --
> Sylvain
>
> This email, along with any attachments, is confidential and may be legally privileged
or otherwise protected from disclosure. Any unauthorized dissemination, copying or use of
the contents of this email is strictly prohibited and may be in violation of law. If you are
not the intended recipient, any disclosure, copying, forwarding or distribution of this email
is strictly prohibited and this email and any attachments should be deleted immediately. 
This email and any attachments do not constitute an offer to sell or a solicitation of an
offer to purchase any interest in any investment vehicle sponsored by Moon Capital Management
LP ("Moon Capital"). Moon Capital does not provide legal, accounting or tax advice. Any statement
regarding legal, accounting or tax matters was not intended or written to be relied upon by
any person as advice. Moon Capital does not waive confidentiality or privilege as a result
of this email.
