cassandra-user mailing list archives

From siddharth verma <sidd.verma29.l...@gmail.com>
Subject Re: [External] Re: Cassandra ad hoc search options
Date Tue, 31 Jan 2017 06:20:01 GMT
Hi,
*Are you using the DataStax connector as well?*
Yes, we used it to query the Lucene index.

*Does it support querying against any column well (not just clustering
columns)?*
Yes, it does. We used Lucene specifically for this purpose.
(For more details, see:
1. https://github.com/Stratio/cassandra-lucene-index/blob/branch-3.0.10/doc/documentation.rst#searching
2. https://www.youtube.com/watch?v=Hg5s-hXy_-M)
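
To illustrate what a Lucene search on an arbitrary column can look like, here is a minimal Python sketch that only builds the CQL text for a prefix search. It assumes the expr(index, '<json>') form described in the Stratio documentation linked above; the table, index, and column names (ks.users, users_idx, name) are hypothetical, and the exact JSON syntax should be checked against the documentation for your plugin version.

```python
import json


def lucene_prefix_query(table, index, field, prefix, limit=100):
    """Build a CQL SELECT that passes a Lucene prefix filter through expr().

    All names passed in are illustrative; nothing here talks to a cluster.
    """
    search = {"filter": {"type": "prefix", "field": field, "value": prefix}}
    # The JSON sits inside a CQL string literal, so single quotes are doubled.
    expr = json.dumps(search).replace("'", "''")
    return f"SELECT * FROM {table} WHERE expr({index}, '{expr}') LIMIT {limit};"


if __name__ == "__main__":
    # Hypothetical table, index, and column names.
    print(lucene_prefix_query("ks.users", "users_idx", "name", "sid"))
```

The resulting string could then be run from cqlsh or a driver session against a table that has a Stratio Lucene index on it.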

*I’m wondering how it could build the index around them “on-the-fly”*
You can build indexes at run time, but it takes time. On our cluster it took
a long time, and CPU utilization went through the roof.

*did you use Spark for the full set of data or just partial*
We weren't allowed to install Spark (a tech decision).
There are still some tech discussions going on about the bulk job ecosystem.

Hence, as a workaround, we used a faster scan utility.
For any ad hoc purposes or scripts, you could do a full scan.
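
As a rough idea of how a full scan can be broken into independent pieces, the Python sketch below splits the Murmur3Partitioner token range into contiguous slices and generates one token-range SELECT per slice. This is a general technique and not necessarily how the scan utility mentioned above works; it assumes the cluster uses Murmur3Partitioner, and the table and partition key names (ks.events, id) are hypothetical.

```python
MIN_TOKEN = -(2 ** 63)   # smallest Murmur3Partitioner token
MAX_TOKEN = 2 ** 63 - 1  # largest Murmur3Partitioner token


def split_token_ring(pieces):
    """Split the full token range into `pieces` contiguous (start, end) slices."""
    if pieces < 1:
        raise ValueError("pieces must be at least 1")
    span = MAX_TOKEN - MIN_TOKEN
    bounds = [MIN_TOKEN + span * i // pieces for i in range(pieces + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def scan_statements(table, partition_key, pieces):
    """Return one CQL SELECT per slice; together they cover the whole ring."""
    statements = []
    for i, (start, end) in enumerate(split_token_ring(pieces)):
        # The first slice uses >= so the minimum token is not skipped;
        # later slices use > so boundary tokens are not read twice.
        op = ">=" if i == 0 else ">"
        statements.append(
            f"SELECT * FROM {table} "
            f"WHERE token({partition_key}) {op} {start} "
            f"AND token({partition_key}) <= {end};"
        )
    return statements


if __name__ == "__main__":
    # Hypothetical table and partition key names.
    for stmt in scan_statements("ks.events", "id", 4):
        print(stmt)
```

Each statement can be handed to a separate worker or thread, so the slices are read in parallel instead of through one long sequential query.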

I hope it helps.

Regards


On Tue, Jan 31, 2017 at 4:11 AM, Yu, John <john.yu@sandc.com> wrote:

> A follow up question is: did you use Spark for the full set of data or
> just partial? In our case, I feel we need all the data to support ad hoc
> queries (with multiple conditional filters).
>
>
>
> Thanks,
>
> John
>
>
>
> *From:* Yu, John [mailto:john.yu@sandc.com]
> *Sent:* Monday, January 30, 2017 12:04 AM
> *To:* user@cassandra.apache.org
> *Subject:* RE: [External] Re: Cassandra ad hoc search options
>
>
>
> Thanks for the input! Are you using the DataStax connector as well? Does
> it support querying against any column well (not just clustering columns)?
> I’m wondering how it could build the index around them “on-the-fly”.
>
>
>
> Regards,
>
> John
>
>
>
> *From:* siddharth verma [mailto:sidd.verma29.list@gmail.com]
> *Sent:* Friday, January 27, 2017 12:15 AM
> *To:* user@cassandra.apache.org
> *Subject:* Re: [External] Re: Cassandra ad hoc search options
>
>
>
> Hi
>
> We used the Stratio Lucene plugin with C* 3.0.3.
>
>
>
> It helped to serve a lot of read patterns, and worked well for prefix searches.
>
> But it created problems, as repairs failed repeatedly.
>
> We might have used it suboptimally; not sure.
>
>
>
> Later, we had to do away with it, and tried to serve most of the read
> patterns with materialized views (currently on C* 3.0.9).
>
>
>
> Currently, for ad hoc queries, we use Spark or a full scan.
>
>
>
> Regards,
>
>
>
> On Fri, Jan 27, 2017 at 1:03 PM, Yu, John <john.yu@sandc.com> wrote:
>
> Thanks a lot. Mind sharing a couple of points where you feel it’s better
> than the alternatives?
>
>
>
> Regards,
>
> John
>
>
>
> *From:* Jonathan Haddad [mailto:jon@jonhaddad.com]
> *Sent:* Thursday, January 26, 2017 2:33 PM
> *To:* user@cassandra.apache.org
> *Subject:* [External] Re: Cassandra ad hoc search options
>
>
>
> > With Cassandra, what are the options for ad hoc query/search similar to
> > RDBMS?
>
>
>
> Your best options are Spark w/ the DataStax connector or Presto.
> Cassandra isn't built for ad-hoc queries so you need to use other tools to
> make it work.
>
>
>
> On Thu, Jan 26, 2017 at 2:22 PM Yu, John <john.yu@sandc.com> wrote:
>
> Hi All,
>
>
>
> Hope I can get some help here. We’re using Cassandra for services, and
> recently we’re adding UI support.
>
> With Cassandra, what are the options for ad hoc query/search similar to
> RDBMS? We love the features of Cassandra, but it seems it’s a known
> “weakness” that it doesn’t come with strong support for indexing and ad hoc
> queries. There has been some recent development with SASI as part of secondary
> indexing. However, I heard in a video that it should not be used
> extensively.
>
>
>
> Does anyone have much experience with SASI? How does it compare to the Lucene
> plugin?
>
> What is the direction of Apache Cassandra in the search area?
>
>
>
> We’re also looking into Solr or ElasticSearch integration, but it seems it
> might take more effort, and possibly involve data duplication.
>
> For Solr, we don’t have DSE.
>
> Sorry if this has been asked before, but I haven’t seen a more complete
> answer.
>
>
>
> Thanks!
>
> John
>
>
>
>
>
> --
>
> Siddharth Verma
>
> (Visit https://github.com/siddv29/cfs for a high speed cassandra full
> table scan)
>



-- 
Siddharth Verma
(Visit https://github.com/siddv29/cfs for a high speed cassandra full table
scan)
