incubator-cassandra-user mailing list archives

From "Hartzman, Leslie" <leslie.d.hartz...@medtronic.com>
Subject RE: Ad-hoc queries question
Date Sat, 21 Sep 2013 00:20:33 GMT
Yeah, I know it was vague, but that is due to the fact that I'm still coming up to speed on
the project and have yet to hear some of the details. Since I had heard that there has always
been a requirement for ad-hoc queries against the Oracle DB for data-mining purposes, that
was the best I could do. The database definition to support the day-to-day needs was not a
worry. It was how some analysts or business people might use it.

My hope is that with a better understanding of how Oracle is used now, a better database definition
will remove/minimize the need for some of the existing complexity.

Les

From: Robert Coli [mailto:rcoli@eventbrite.com]
Sent: Friday, September 20, 2013 5:10 PM
To: user@cassandra.apache.org
Subject: Re: Ad-hoc queries question

On Fri, Sep 20, 2013 at 4:20 PM, Hartzman, Leslie <leslie.d.hartzman@medtronic.com> wrote:
Thanks Rob. I thought that might have been the situation but wasn't sure. So does this negate
the use of cqlsh to do this then? I'd hate to have to provide custom code to support ad-hoc
queries.

The form of your question is pretty vague. CQLsh, and CQL generally, give you a bit more flexibility
to construct complex queries than the old thrift interface. The more complex these queries,
however, the worse/less predictably they are likely to perform. An example would be ALLOW
FILTERING, which the docs describe thusly.

"
By default, CQL only allows select queries that don't involve "filtering" server side, i.e.
queries where we know that all (live) record read will be returned (maybe partly) in the result
set. The reasoning is that those "non filtering" queries have predictable performance in the
sense that they will execute in a time that is proportional to the amount of data returned
by the query (which can be controlled through LIMIT).

The ALLOW FILTERING option allows to explicitely [sic] allow (some) queries that require filtering.
Please note that a query using ALLOW FILTERING may thus have unpredictable performance (for
the definition above), i.e. even a query that selects a handful of records may exhibit performance
that depends on the total amount of data stored in the cluster.
"

=Rob
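
The distinction in the quoted docs (cost proportional to the data returned versus cost that depends on the total data stored) can be illustrated with a small toy model. This is only a sketch under stated assumptions: ToyTable, its methods, and the sample rows are hypothetical names made up for illustration, the CQL in the comments refers to a hypothetical table t, and nothing here reflects how Cassandra is actually implemented. It simply counts rows read against rows returned for the two query shapes.

```python
# Toy model only: NOT Cassandra's implementation. All names are hypothetical.
from collections import defaultdict


class ToyTable:
    """Rows grouped by partition key, so a lookup by key touches one group."""

    def __init__(self):
        self.partitions = defaultdict(list)

    def insert(self, partition_key, row):
        self.partitions[partition_key].append(row)

    def select_by_partition(self, partition_key, limit=None):
        """Non-filtering shape: every row read is also returned.

        Illustrative CQL shape: SELECT * FROM t WHERE key = 42 LIMIT 10;
        """
        rows = self.partitions.get(partition_key, [])
        if limit is not None:
            rows = rows[:limit]
        return list(rows), len(rows)  # (result, rows_read)

    def select_with_filtering(self, predicate):
        """Filtering shape: read every stored row, keep only the matches.

        Illustrative CQL shape: SELECT * FROM t WHERE value < 5 ALLOW FILTERING;
        """
        rows_read = 0
        result = []
        for rows in self.partitions.values():
            for row in rows:
                rows_read += 1
                if predicate(row):
                    result.append(row)
        return result, rows_read


# Made-up sample data: 1000 partitions with 10 rows each.
table = ToyTable()
for key in range(1000):
    for seq in range(10):
        table.insert(key, {"key": key, "seq": seq, "value": key * 10 + seq})

by_key, read_by_key = table.select_by_partition(42)
filtered, read_filtered = table.select_with_filtering(lambda r: r["value"] < 5)

print("by partition: returned", len(by_key), "read", read_by_key)
print("filtering:    returned", len(filtered), "read", read_filtered)
```

In this toy, the lookup by partition key reads exactly as many rows as it returns, while the filtering query returns a handful of rows but reads every row in the table. That is the sense in which the docs call ALLOW FILTERING performance unpredictable: the work tracks total data stored, not result size.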

