cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Dinesh Shanbhag <dinesh.shanb...@isanasystems.com>
Subject Cassandra 3.1 - Aggregation query failure
Date Fri, 18 Dec 2015 12:57:12 GMT


I am trying out Aggregations in a single node Cassandra 3.1 
installation.  The node has 4GB RAM.  The table being aggregated on 
contains ~450000 rows.  It contains information on US domestic flights 
for a single month (from 
http://www.transtats.bts.gov/DL_SelectFields.asp?Table_ID=236&DB_Short_Name=On-Time).

CREATE AGGREGATE flightdata.late_flights(text, decimal)
     SFUNC state_late_flights
     STYPE map<text, frozen<tuple<int, int>>>
     INITCOND {};

The late_flights aggregation function uses a state_late_flights() 
User-Defined Function that maintains a map of uniquecarrier to 
tuple<int,int>.  The first int in the tuple represents delayed flights 
of the corresponding uniquecarrier.  The second int represents total 
flights of the uniquecarrier.

This aggregation query on a subset of the days of the month works:

    cqlsh:flightdata> select late_flights(uniquecarrier, depdel15) from
    flightsbydate *where flightdate in ('2015-09-15', '2015-09-16',
    '2015-09-17', '2015-09-18', '2015-09-19', '2015-09-20', '2015-09-21')*;

      flightdata.late_flights(uniquecarrier, depdel15)
    -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
      {'AA': (2395, 17138), 'AS': (234, 3308), 'B6': (703, 4832), 'DL':
    (1452, 17311), 'EV': (1028, 10502), 'F9': (221, 1837), 'HA': (79,
    1414), 'MQ': (892, 4926), 'NK': (535, 2300), 'OO': (1539, 11299),
    'UA': (1422, 9792), 'VX': (181, 1209), 'WN': (3446, 23659)}

    (1 rows)

    Warnings :
    Aggregation query used on multiple partition keys (IN restriction)


However, the aggregation on all ~450000 rows always fails, sometimes 
immediately, sometimes after 30-60 seconds:

    cqlsh:flightdata> select late_flights(uniquecarrier, depdel15) from
    flightsbydate;

    Traceback (most recent call last):
       File "CassandraInstall-3.1/bin/cqlsh.py", line 1258, in
    perform_simple_statement
         result = future.result()
       File
    "/home/wpl/CassandraInstall-3.1/bin/../lib/cassandra-driver-internal-only-3.0.0-6af642d.zip/cassandra-driver-3.0.0-6af642d/cassandra/cluster.py",
    line 3122, in result
         raise self._final_exception
    FunctionFailure: code=1400 [User Defined Function failure]
    message="execution of 'flightdata.state_late_flights[map<text,
    frozen<tuple<int, int>>>, text, decimal]' failed:
    java.security.AccessControlException: access denied
    ("java.io.FilePermission"
    "/home/wpl/CassandraInstall-3.1/conf/logback.xml" "read")"


While this query runs, CPU utilization is 100% - 120%, Peak RAM used is 
less than 3.5GB.

Just in case it is useful, the state_late_flights User-Defined function:

    cqlsh:flightdata> describe function state_late_flights;

    CREATE FUNCTION flightdata.state_late_flights(state map<text,
    frozen<tuple<int, int>>>, flid text, fldelay decimal)
         CALLED ON NULL INPUT
         RETURNS map<text, frozen<tuple<int, int>>>
         LANGUAGE java
         AS $$com.datastax.driver.core.TupleType tt =
    com.datastax.driver.core.TupleType.of(com.datastax.driver.core.ProtocolVersion.NEWEST_SUPPORTED,
    com.datastax.driver.core.CodecRegistry.DEFAULT_INSTANCE,
    com.datastax.driver.core.DataType.cint(),
    com.datastax.driver.core.DataType.cint());
    com.datastax.driver.core.TupleValue tv = tt.newValue(); tv.setInt(0,
    0); tv.setInt(1, 1); if (flid == null) { state.put("EMPTY", tv);
    return state; } if (state.get(flid) != null) {  tv =
    (com.datastax.driver.core.TupleValue) state.get(flid);  tv.setInt(1,
    tv.getInt(1) + 1); if
    (fldelay.compareTo(java.math.BigDecimal.valueOf(0)) == 1) {
    tv.setInt(0, tv.getInt(0) + 1); } } state.put(flid, tv); return
    state;$$;


What should be checked on to investigate this further?
Thanks,
Dinesh.

Mime
View raw message