incubator-cassandra-user mailing list archives

From Peter Lin <wool...@gmail.com>
Subject Re: Ad-hoc queries question
Date Fri, 20 Sep 2013 23:52:08 GMT
There are several ways of handling these types of use cases. Some people
take a soft real-time approach by calculating aggregates in memory and
saving them to tables periodically. One example of this is Twitter and Storm.
Other techniques include using a batch process to extract summaries and
storing them in an OLAP cube for reporting purposes.
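
To make the soft real-time idea concrete, here is a minimal sketch in Python.
It is not how Storm works internally; it only shows the pattern of counting
in memory and saving totals periodically. The class name, the `sink` callable
(standing in for a write to a summary table) and the injectable `clock` are
all assumptions made for illustration.

```python
import time
from collections import defaultdict


class PeriodicAggregator:
    """Counts events in memory and hands the totals to a sink at a fixed interval."""

    def __init__(self, flush_interval_s, sink, clock=time.monotonic):
        self.flush_interval_s = flush_interval_s
        self.sink = sink      # hypothetical: any callable that persists {key: count}
        self.clock = clock    # injectable so the sketch can be exercised without sleeping
        self.counts = defaultdict(int)
        self.last_flush = clock()

    def record(self, key, amount=1):
        """Add to the in-memory total and flush if the interval has elapsed."""
        self.counts[key] += amount
        if self.clock() - self.last_flush >= self.flush_interval_s:
            self.flush()

    def flush(self):
        """Pass a copy of the current totals to the sink, then start a new window."""
        if self.counts:
            self.sink(dict(self.counts))
            self.counts.clear()
        self.last_flush = self.clock()


# Example: collect flushed windows in a list instead of writing to a table.
saved_windows = []
aggregator = PeriodicAggregator(60, saved_windows.append)
aggregator.record("page_a")
aggregator.flush()  # force a flush rather than waiting for the interval
```

Queries for reports then read the saved totals rather than scanning the raw
events, which is what makes this approach feel close to real time.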

If your application doesn't need ad-hoc query results immediately,
MapReduce is usually sufficient. Many people use Pig and Hive for this
type of operation.
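
As a rough illustration of what a MapReduce-style join does, the sketch below
runs the map, shuffle and reduce steps in plain Python on in-memory rows. It
does not use Hadoop, Pig or Hive. The function names and the `users`/`orders`
sample data are hypothetical, and the two table names are assumed to differ.

```python
from collections import defaultdict


def map_phase(table_name, rows, join_key):
    """Emit (join key, (table name, row)) pairs, as a mapper would."""
    for row in rows:
        yield row[join_key], (table_name, row)


def reduce_phase(tagged_rows, left_name):
    """Pair every left row with every right row that shares the same key."""
    left = [row for name, row in tagged_rows if name == left_name]
    right = [row for name, row in tagged_rows if name != left_name]
    for left_row in left:
        for right_row in right:
            yield {**left_row, **right_row}


def mapreduce_join(left_name, left_rows, right_name, right_rows, join_key):
    """Inner join of two row sets, structured as map -> shuffle -> reduce."""
    # Shuffle step: group the mapper output by join key.
    groups = defaultdict(list)
    for name, rows in ((left_name, left_rows), (right_name, right_rows)):
        for key, tagged in map_phase(name, rows, join_key):
            groups[key].append(tagged)
    joined = []
    for key in sorted(groups):  # sorted only to make the output order stable
        joined.extend(reduce_phase(groups[key], left_name))
    return joined


users = [{"user_id": 1, "name": "ann"}, {"user_id": 2, "name": "bob"}]
orders = [{"user_id": 1, "total": 30}, {"user_id": 3, "total": 5}]
print(mapreduce_join("users", users, "orders", orders, "user_id"))
```

In a real cluster the shuffle is done by the framework across machines, which
is why this works for data too large to join in a single process, at the cost
of batch latency.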



On Fri, Sep 20, 2013 at 7:41 PM, Hartzman, Leslie <
leslie.d.hartzman@medtronic.com> wrote:

> By ad-hoc queries I mean exactly what you’ve described. The need to
> access data from multiple column families, typically addressed in RDBs with
> JOINs.
>
> I haven’t really become familiar enough with MapReduce yet, so I’ll have
> to delve deeper into that. I’m hoping that the de-normalized nature of
> things would obviate the need for complex subquery-type of operations.
>
> *From:* Peter Lin [mailto:woolfel@gmail.com]
> *Sent:* Friday, September 20, 2013 4:30 PM
> *To:* user@cassandra.apache.org
> *Subject:* Re: Ad-hoc queries question
>
> What do you mean by ad-hoc queries?
>
> Most NoSQL databases do not support cross-table joins, due to their
> distributed nature. For comparison, in partitioned databases in the RDB
> world, cross-partition joins are also more expensive than joins in
> non-partitioned databases.
>
> You can do ad-hoc queries on a single table as long as the columns have
> secondary indexes defined. You can do multi-table joins using MapReduce, or
> you can use CQL and handle that logic in your application. In some cases, you
> can use summary tables to speed up complex multi-table ad-hoc queries
> that have nasty joins. One thing that is very hard to do with all NoSQL
> databases is complex correlated subqueries. For those kinds of use cases,
> MapReduce is the "preferred" technique.
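
A minimal sketch of handling the join logic in the application, in Python.
The two lists stand in for the results of two separate single-table CQL
queries; the function name, the column names and the sample rows are
hypothetical, and no Cassandra driver is involved.

```python
def join_in_application(device_rows, reading_rows):
    """Hash join: index one result set by key, then probe it with the other."""
    devices_by_id = {device["device_id"]: device for device in device_rows}
    joined = []
    for reading in reading_rows:
        device = devices_by_id.get(reading["device_id"])
        if device is not None:  # inner join: skip readings with no matching device
            joined.append({**device, **reading})
    return joined


# Stand-ins for rows returned by two separate queries.
devices = [{"device_id": "d1", "model": "x"}, {"device_id": "d2", "model": "y"}]
readings = [{"device_id": "d1", "value": 7}, {"device_id": "d3", "value": 9}]
print(join_in_application(devices, readings))
```

This only works well when the smaller result set fits in memory; beyond that,
a summary table or a batch job is the more realistic option.
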
>
> For comparison, databases like Oracle RAC distribute table indexes and
> perform index joins to speed up complex multi-table joins. The downside is
> that a full Oracle RAC is very expensive and has a high up-front cost.
>
> On Fri, Sep 20, 2013 at 7:20 PM, Hartzman, Leslie <
> leslie.d.hartzman@medtronic.com> wrote:
>
> Thanks Rob. I thought that might have been the situation but wasn’t sure.
> So does this negate the use of cqlsh to do this then? I’d hate to have to
> provide custom code to support ad-hoc queries.
>
> Les
>
> *From:* Robert Coli [mailto:rcoli@eventbrite.com]
> *Sent:* Friday, September 20, 2013 4:06 PM
> *To:* user@cassandra.apache.org
> *Subject:* Re: Ad-hoc queries question
>
> On Fri, Sep 20, 2013 at 3:25 PM, Hartzman, Leslie <
> leslie.d.hartzman@medtronic.com> wrote:
>
> So are ad-hoc queries more awkward or not feasible?
>
> Yes.
>
> To expand slightly, you will probably end up querying multiple
> columnfamilies and doing the ad-hoc JOIN-esque aspect in application code.
>
> =Rob
>
