cassandra-user mailing list archives

From Ryan Svihla ...@foundev.pro>
Subject Re: Are Triggers in Cassandra 2.1.2 performance Hog??
Date Wed, 07 Jan 2015 16:03:13 GMT
@Ken I actually support a lot of the DSE Search users and teach classes on
it. As long as you're not dropping mutations, you're in sync. If you are
dropping mutations, you're probably sized way too small anyway, and once
you run repair (which you should be doing anyway when dropping mutations)
you're back in sync. Because of that, I actually think the models work well
together.

FWIW the improvement since DSE 3.0 is MASSIVE (it's been what I'd call
stable since 3.2.x, and we're on 4.6 now).

@Asit To answer the ES question: it's not really for me to say what the lag
will be or to advise on sizing ES, so that's probably more of a question
for them.


On Wed, Jan 7, 2015 at 8:56 AM, Asit KAUSHIK <asitkaushiknosql@gmail.com>
wrote:

> Hi all,
>
> What I intend to do is push the data to Elasticsearch on every write,
> using the trigger. I know it would impact Cassandra writes, but given
> that writes are pretty performant on Cassandra, would that lag be a big
> one?
>
> Also, as far as I know, Solr has limitations with nested JSON documents,
> which Elasticsearch handles seamlessly, and hence it was our preference.
>
> Please let me know your thoughts on this, as we are stuck on it and I am
> looking into the streaming part of Cassandra in the hope that I can find
> something.
>
> Regards
> Asit
>
>
>
> On Wed, Jan 7, 2015 at 8:16 PM, Ken Hancock <ken.hancock@schange.com>
> wrote:
>
>> When I last looked at DataStax Enterprise (DSE 3.0ish), it exhibited the
>> same problem that you highlight, no different from your good idea of
>> asynchronously pushing to ES.
>>
>> Each Cassandra write was indexed independently by each server in the
>> replication group.  If a node timed out or a mutation was dropped, that
>> Solr node would have an out-of-sync index.  A Solr query such as a
>> count(*) of users could return inconsistent results depending on which
>> node you hit, since Solr didn't support Cassandra consistency levels.
>>
>> I haven't seen any blog posts or docs as to whether this intrinsic
>> mismatch between Cassandra's handling of eventual consistency and Solr
>> has ever been resolved.
>>
>> Ken
>>
>>
>> On Wed, Jan 7, 2015 at 9:05 AM, DuyHai Doan <doanduyhai@gmail.com> wrote:
>>
>>> Be very careful not to perform blocking calls to Elasticsearch in your
>>> trigger; otherwise you will kill C* performance. The biggest danger of
>>> triggers in their current state is that they are on the write path.
>>>
>>> In your trigger, you can try to push the mutation asynchronously to ES,
>>> but that means managing a thread pool and all the related issues.
>>>
>>> Not to mention atomicity issues, such as: what happens if the update to
>>> ES fails or the connection times out?
>>>
>>> As an alternative, instead of implementing the integration with ES
>>> yourself, you can have a look at the DataStax Enterprise integration of
>>> Cassandra with Apache Solr (not free) or some open-source alternatives
>>> like Stratio or TupleJump fork of Cassandra with Lucene integration.
>>>
>>> On Wed, Jan 7, 2015 at 2:40 PM, Asit KAUSHIK <asitkaushiknosql@gmail.com>
>>> wrote:
>>>
>>>> Hi all,
>>>>
>>>> We are trying to integrate Elasticsearch with Cassandra, and as the
>>>> river plugin uses select * from any table, it seems to be a bad choice
>>>> for performance. So I was thinking of inserting into Elasticsearch
>>>> using a Cassandra trigger.
>>>> So I wanted your view: does a Cassandra trigger impact the read/write
>>>> performance of Cassandra?
>>>>
>>>> Also, if you achieve this in any other way, please guide me. I am stuck
>>>> on this.
>>>>
>>>> Regards
>>>> Asit
>>>>
>>>>
>>>
>>
>>
>>
>>
>
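
The asynchronous hand-off DuyHai describes above (keep the blocking ES call
off the write path, at the cost of managing a thread pool) might look
roughly like the sketch below. It is only an illustration under stated
assumptions: AsyncIndexForwarder, offer and the sink callback are
hypothetical names, the sink stands in for whatever ES client call you
would make, and the real Cassandra trigger interface is deliberately not
shown. A bounded queue means a slow or unreachable ES causes documents to be
dropped and counted rather than stalling writes, which is exactly the
atomicity gap raised in the thread.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

/** Hypothetical helper: hands documents to a slow sink without blocking the caller. */
final class AsyncIndexForwarder {
    private final ThreadPoolExecutor pool;
    private final Consumer<String> sink;              // assumption: wraps the ES client call
    private final AtomicLong dropped = new AtomicLong();
    private final AtomicLong failed = new AtomicLong();

    AsyncIndexForwarder(int threads, int queueCapacity, Consumer<String> sink) {
        this.sink = sink;
        // Bounded queue plus AbortPolicy: when the queue is full, execute()
        // throws instead of blocking the calling (write-path) thread.
        this.pool = new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity),
                new ThreadPoolExecutor.AbortPolicy());
    }

    /** Returns immediately. False means the document was dropped, not indexed. */
    boolean offer(String document) {
        try {
            pool.execute(() -> {
                try {
                    sink.accept(document);
                } catch (RuntimeException e) {
                    // The write has already succeeded; the index is now behind.
                    failed.incrementAndGet();
                }
            });
            return true;
        } catch (RejectedExecutionException e) {
            dropped.incrementAndGet();
            return false;
        }
    }

    long droppedCount() { return dropped.get(); }

    long failedCount() { return failed.get(); }

    /** Stops accepting work and waits for queued documents to drain. */
    boolean shutdownAndWait(long timeoutMillis) {
        pool.shutdown();
        try {
            return pool.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

A trigger would call offer(...) and return straight away. Anything counted
by droppedCount() or failedCount() was written to Cassandra but never
reached the index, so some separate reconciliation step would still be
needed to bring the two back in sync.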


-- 

Thanks,
Ryan Svihla
