incubator-cassandra-user mailing list archives

From Nate McCall <n...@thelastpickle.com>
Subject Re: heavy insert load overloads CPUs, with MutationStage pending
Date Fri, 13 Sep 2013 16:19:40 GMT
Also, I was working on this a bit for a client, so I compiled my notes and
approach into a blog post for posterity (and so it's easier for others to
find):
http://thelastpickle.com/blog/2013/09/13/CQL3-to-Astyanax-Compatibility.html

Paul's method on this thread is cited at the bottom as well.


On Fri, Sep 13, 2013 at 11:16 AM, Nate McCall <nate@thelastpickle.com> wrote:

> https://github.com/Netflix/astyanax/issues/391
>
> I've gotten in touch with a couple of netflix folks and they are going to
> try to roll a release shortly.
>
> You should be able to build against 1.2.2, and 'talking' to a 1.2.9
> instance should work. It's just a PITA development-wise to maintain
> different versions.
>
>
> On Fri, Sep 13, 2013 at 10:52 AM, Keith Freeman <8forty@gmail.com> wrote:
>
>> Paul-  Sorry to go off-list but I'm diving pretty far into details here.
>>  Ignore if you wish.
>>
>> Thanks a lot for the example, definitely very helpful.  I'm surprised
>> that the Cassandra experts aren't more interested in or alarmed by our
>> results. It seems like we've proved that insert performance for wide rows
>> in CQL is enormously worse than it was before CQL.  And I have a feeling
>> 2.0 won't help much -- I'm already using entirely-prepared batches.
>>
>> To reproduce your example, I switched to cassandra 1.2.6  and astyanax
>> 1.56.42.  But anything I try to do with that version combination gives me
>> an exception on the client side (e.g. execute() on a query):
>>
>>> 13-09-13 15:42:42.511 [pool-6-thread-1] ERROR c.n.a.t.ThriftSyncConnectionFactoryImpl - Error creating connection
>>> java.lang.NoSuchMethodError: org.apache.cassandra.thrift.TBinaryProtocol: method <init>(Lorg/apache/thrift/transport/TTransport;)V not found
>>>     at com.netflix.astyanax.thrift.ThriftSyncConnectionFactoryImpl$ThriftConnection.open(ThriftSyncConnectionFactoryImpl.java:195) ~[astyanax-thrift-1.56.37.jar:na]
>>>     at com.netflix.astyanax.thrift.ThriftSyncConnectionFactoryImpl$ThriftConnection$1.run(ThriftSyncConnectionFactoryImpl.java:232) [astyanax-thrift-1.56.37.jar:na]
>>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_07]
>>>
>> From my googling this is due to a Cassandra API change in
>> TBinaryProtocol, which is why I had to use Cassandra 1.2.5 jars to get my
>> astyanax client to work at all in my earlier experiments. Did you encounter
>> this?  Also, you had 1.2.8 in the stackoverflow post but 1.2.6 in this
>> email; did you have to roll back?
>>
>> Thanks for any help you can offer, hope I can return the favor at some
>> point.
>>
>>
>>
>> On 09/12/2013 02:26 PM, Paul Cichonski wrote:
>>
>>> I'm running Cassandra 1.2.6 without compact storage on my tables. The
>>> trick is making your Astyanax (I'm running 1.56.42) mutation work with the
>>> CQL table definition. This is definitely a bit of a hack, since most of the
>>> advice says not to mix the CQL and Thrift APIs, so it is your call on how
>>> far you want to go. If you still want to test it out, you need to use the
>>> Astyanax CompositeColumn construct to make it work
>>> (https://github.com/Netflix/astyanax/wiki/Composite-columns).
>>>
>>> I've provided a slightly modified version of what I am doing below:
>>>
>>> CQL table def:
>>>
>>> CREATE TABLE standard_subscription_index
>>> (
>>>         subscription_type text,
>>>         subscription_target_id text,
>>>         entitytype text,
>>>         entityid int,
>>>         creationtimestamp timestamp,
>>>         indexed_tenant_id uuid,
>>>         deleted boolean,
>>>         PRIMARY KEY ((subscription_type, subscription_target_id), entitytype, entityid)
>>> )
>>>
>>> ColumnFamily definition:
>>>
>>> private static final ColumnFamily<SubscriptionIndexCompositeKey, SubscribingEntityCompositeColumn> COMPOSITE_ROW_COLUMN =
>>>         new ColumnFamily<SubscriptionIndexCompositeKey, SubscribingEntityCompositeColumn>(
>>>                 SUBSCRIPTION_CF_NAME,
>>>                 new AnnotatedCompositeSerializer<SubscriptionIndexCompositeKey>(SubscriptionIndexCompositeKey.class),
>>>                 new AnnotatedCompositeSerializer<SubscribingEntityCompositeColumn>(SubscribingEntityCompositeColumn.class));
>>>
>>>
>>> SubscriptionIndexCompositeKey is a class that contains the fields from
>>> the row key (i.e., subscription_type, subscription_target_id), and
>>> SubscribingEntityCompositeColumn contains the fields from the
>>> composite column (as it would look if you viewed your data using
>>> cassandra-cli), so: entityType, entityId, columnName. The columnName field
>>> is the tricky part, as it defines what to interpret the column value as
>>> (e.g., for a creationtimestamp value the column name might be
>>> "someEntityType:4:creationtimestamp").
>>> The actual mutation looks something like this:
>>>
>>> final MutationBatch mutation = getKeyspace().prepareMutationBatch();
>>> final ColumnListMutation<SubscribingEntityCompositeColumn> row =
>>>         mutation.withRow(COMPOSITE_ROW_COLUMN,
>>>                 new SubscriptionIndexCompositeKey(targetEntityType.getName(), targetEntityId));
>>>
>>> for (Subscription sub : subs) {
>>>         row.putColumn(new SubscribingEntityCompositeColumn(sub.getEntityType().getName(), sub.getEntityId(),
>>>                                 "creationtimestamp"), sub.getCreationTimestamp());
>>>         row.putColumn(new SubscribingEntityCompositeColumn(sub.getEntityType().getName(), sub.getEntityId(),
>>>                                 "deleted"), sub.isDeleted());
>>>         row.putColumn(new SubscribingEntityCompositeColumn(sub.getEntityType().getName(), sub.getEntityId(),
>>>                                 "indexed_tenant_id"), tenantId);
>>> }
>>>
>>> Hope that helps,
>>> Paul
>>>
>>>
>
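
The composite column layout Paul describes in the quoted message (entityType,
entityId, columnName) can be sketched in plain Java. This is only an
illustration of how one CQL row in standard_subscription_index fans out into
one cell per non-key column; CompositeColumnSketch and its methods are
hypothetical names, not Astyanax classes, and the colon-separated rendering
merely mimics the "someEntityType:4:creationtimestamp" form from the message.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the composite column name; not part of Astyanax. */
final class CompositeColumnSketch {
    final String entityType;  // first clustering column (entitytype)
    final int entityId;       // second clustering column (entityid)
    final String columnName;  // name of the CQL column this cell holds a value for

    CompositeColumnSketch(String entityType, int entityId, String columnName) {
        this.entityType = entityType;
        this.entityId = entityId;
        this.columnName = columnName;
    }

    /** Renders the three components colon-separated, as in the example from the message. */
    String toCliString() {
        return entityType + ":" + entityId + ":" + columnName;
    }

    /** One (entitytype, entityid) pair yields one cell per non-key column of the table. */
    static List<CompositeColumnSketch> cellsForRow(String entityType, int entityId) {
        // Same three columns, in the same order, as the putColumn calls in the quoted mutation.
        String[] nonKeyColumns = {"creationtimestamp", "deleted", "indexed_tenant_id"};
        List<CompositeColumnSketch> cells = new ArrayList<CompositeColumnSketch>();
        for (String name : nonKeyColumns) {
            cells.add(new CompositeColumnSketch(entityType, entityId, name));
        }
        return cells;
    }

    public static void main(String[] args) {
        for (CompositeColumnSketch cell : cellsForRow("someEntityType", 4)) {
            System.out.println(cell.toCliString());
        }
    }
}
```

Each of the three putColumn calls in the quoted loop corresponds to one of
these cells, which is why a single subscription produces three columns in the
underlying wide row.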
