cassandra-user mailing list archives

From Dinesh Shanbhag <dinesh.shanb...@isanasystems.com>
Subject Re: Cassandra 3.1 - Aggregation query failure
Date Thu, 24 Dec 2015 06:07:06 GMT

Even if an aggregation that forces a full table scan across partitions is 
not recommended, the message/exception does seem unrelated to partitioning:

    cqlsh:flightdata> select late_flights(uniquecarrier, depdel15) from
    flightsbydate in ('2015-09-15', '2015-09-16',
    '2015-09-17', '2015-09-18', '2015-09-19', '2015-09-20', '2015-09-21');

    Traceback (most recent call last):
      File "CassandraInstall-3.1/bin/cqlsh.py", line 1258, in perform_simple_statement
        result = future.result()
      File "/home/wpl/CassandraInstall-3.1/bin/../lib/cassandra-driver-internal-only-3.0.0-6af642d.zip/cassandra-driver-3.0.0-6af642d/cassandra/cluster.py", line 3122, in result
        raise self._final_exception
    FunctionFailure: code=1400 [User Defined Function failure]
    message="execution of 'flightdata.state_late_flights[map<text,
    frozen<tuple<int, int>>>, text, decimal]' failed:
    java.security.AccessControlException: access denied
    ("java.io.FilePermission"
    "/home/wpl/CassandraInstall-3.1/conf/logback.xml" "read")"

Is that right?

And note that this same aggregation query (on a subset of the month's 
days) does complete successfully sometimes.

The behavior is similar with Cassandra 3.0: on the same set of days, 
the query sometimes succeeds but fails most of the time.  Would trying 
the DataStax distribution offer any better chances?

Thanks,
Dinesh.


On 12/24/2015 2:59 AM, DuyHai Doan wrote:
> Thanks for the pointer on internal paging, Tyler; I missed this one. 
> But it raises some questions:
>
> 1. Is it possible to "tune" the page size, or is it hard-coded internally?
> 2. Is read-repair performed on EACH page, or on the whole set of 
> requested rows once they have all been fetched?
>
> Question 2 is relevant in scenarios where the user is using CL QUORUM 
> (or higher) and some replicas are out of sync. Even when aggregating 
> over a single partition, if that partition is wide and spans many 
> fetch pages, the time the coordinator spends on read-repair and 
> reconciliation across the QUORUM replicas may cause the query to 
> time out very quickly.
>
>
> On Fri, Dec 18, 2015 at 5:26 PM, Tyler Hobbs <tyler@datastax.com> wrote:
>
>
>     On Fri, Dec 18, 2015 at 9:17 AM, DuyHai Doan <doanduyhai@gmail.com> wrote:
>
>         Cassandra will perform a full table scan and fetch all the
>         data in memory to apply the aggregate function.
>
>
>     Just to clarify for others on the list: when executing aggregation
>     functions, Cassandra /will/ use paging internally, so at most one
>     page worth of data will be held in memory at a time.  However, if
>     your aggregation function retains a large amount of data, this may
>     contribute to heap pressure.
>
>
>     -- 
>     Tyler Hobbs
>     DataStax <http://datastax.com/>
>
>
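
As a rough illustration of the internal paging Tyler describes above, 
here is a minimal Python sketch, not Cassandra's actual implementation. 
It folds a state function over rows one page at a time, so only the 
current page and the accumulated state are held in memory. The names 
pages() and aggregate(), the page size, and the meaning of the 
tuple<int, int> (total flights, late flights) are all assumptions; only 
the argument types come from the signature in the error message.

```python
from decimal import Decimal
from typing import Dict, Iterable, Iterator, List, Tuple

Row = Tuple[str, Decimal]            # (uniquecarrier, depdel15)
State = Dict[str, Tuple[int, int]]   # carrier -> (total, late); assumed meaning


def pages(rows: Iterable[Row], page_size: int) -> Iterator[List[Row]]:
    """Yield rows in fixed-size pages so only one page is materialized at a time."""
    page: List[Row] = []
    for row in rows:
        page.append(row)
        if len(page) == page_size:
            yield page
            page = []
    if page:
        # Emit the final, possibly short, page.
        yield page


def state_late_flights(state: State, carrier: str, depdel15: Decimal) -> State:
    """Hypothetical state function: count flights and late flights per carrier."""
    total, late = state.get(carrier, (0, 0))
    state[carrier] = (total + 1, late + (1 if depdel15 == 1 else 0))
    return state


def aggregate(rows: Iterable[Row], page_size: int = 2) -> State:
    """Apply the state function row by row, consuming one page at a time."""
    state: State = {}
    for page in pages(rows, page_size):
        for carrier, depdel15 in page:
            state = state_late_flights(state, carrier, depdel15)
    return state


if __name__ == "__main__":
    sample = [("AA", Decimal(1)), ("AA", Decimal(0)), ("DL", Decimal(1))]
    print(aggregate(sample))
```

In this sketch the state grows with the number of distinct carriers 
rather than with the number of rows, which is the kind of retained data 
that could add heap pressure if it became large.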

