Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@cassandra.apache.org
Date: Fri, 22 Nov 2013 23:11:36 +0000 (UTC)
From: "Alex Liu (JIRA)" <jira@apache.org>
To: commits@cassandra.apache.org
Message-ID: <JIRA.12679298.1384466139432.17199.1385161895986@arcas>
In-Reply-To: <JIRA.12679298.1384466139432@arcas>
References: <JIRA.12679298.1384466139432@arcas>
Subject: [jira] [Commented] (CASSANDRA-6348) TimeoutException throws if Cql
 query allows data filtering and index is too big and it can't find the data
 in base CF after filtering
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/CASSANDRA-6348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830401#comment-13830401 ] 

Alex Liu commented on CASSANDRA-6348:
-------------------------------------

rowsPerQuery is only used as the page size for the index CF during a 2i (secondary index) search.

maxColumns is the value of the LIMIT clause. If meanColumns is large, filter.maxColumns()/meanColumns is less than 1 (0 in integer division), so rowsPerQuery falls back to its minimum of 2. A page size of 2 for the index CF is too small: we end up with too many random seeks between the index CF and the base CF, which is why a 2i search is sometimes so slow. We need to keep the index CF page size from getting too small. The goal is a page size that is large enough to reduce random seeks between the index CF and the base CF, but not so large that it causes OOM.

If data filtering is involved and many base CF columns don't match the filter, the small page size makes the problem even worse, because we need to page through more pages of the index CF.
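To put rough numbers on the cost of a small page size, here is a minimal sketch that counts how many index pages are needed to walk one index row. The class name, the method name and the index row size are all hypothetical and chosen only for illustration; nothing here is Cassandra code.

```java
// Rough model: every index page means another round trip to the index CF,
// followed by lookups in the base CF for the entries on that page.
final class IndexPagingEstimate
{
    // Number of index pages needed to walk indexEntries entries at the given page size.
    static long pagesNeeded(long indexEntries, int pageSize)
    {
        if (pageSize <= 0)
            throw new IllegalArgumentException("pageSize must be positive");
        return (indexEntries + pageSize - 1) / pageSize; // ceiling division
    }

    public static void main(String[] args)
    {
        long indexEntries = 1_000_000L; // hypothetical size of one index row
        System.out.println(pagesNeeded(indexEntries, 2));    // prints 500000
        System.out.println(pagesNeeded(indexEntries, 1000)); // prints 1000
    }
}
```

With a page size of 2, a filter that rejects most base CF rows forces the scan through hundreds of thousands of pages before the index row is exhausted.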

{code}
    public int maxRows()
    {
        return countCQL3Rows ? Integer.MAX_VALUE : maxResults;
    }

    public int maxColumns()
    {
        return countCQL3Rows ? maxResults : Integer.MAX_VALUE;
    }
{code}

For a non-CQL query:
{code}
            rowsPerQuery = Math.max(Math.min(filter.maxResults, Integer.MAX_VALUE / meanColumns), 2);
            // most likely becomes: rowsPerQuery = Math.max(filter.maxResults, 2);
            // most likely becomes: rowsPerQuery = filter.maxResults
            // which is the same as the number of rows to fetch
{code}

For a CQL query:
{code}
            rowsPerQuery = Math.max(Math.min(Integer.MAX_VALUE, filter.maxResults / meanColumns), 2);
            // most likely becomes: rowsPerQuery = Math.max(filter.maxResults / meanColumns, 2);
            // most likely becomes: rowsPerQuery = filter.maxResults / meanColumns
            // if meanColumns is large, the quotient can be less than 1, so rowsPerQuery ends up as 2.
            // if there is no LIMIT clause in the CQL query, it becomes Integer.MAX_VALUE / meanColumns, which is a big number.
{code}
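The two cases can be checked with concrete numbers. This is a standalone sketch, not the Cassandra source: it folds maxRows() and maxColumns() into a single hypothetical helper, assumes meanColumns is positive, and uses made-up values for the limit and for meanColumns.

```java
final class RowsPerQueryDemo
{
    // Mirrors the formula discussed above; countCQL3Rows decides which bound carries the limit.
    static int rowsPerQuery(int maxResults, boolean countCQL3Rows, int meanColumns)
    {
        int maxRows = countCQL3Rows ? Integer.MAX_VALUE : maxResults;
        int maxColumns = countCQL3Rows ? maxResults : Integer.MAX_VALUE;
        return Math.max(Math.min(maxRows, maxColumns / meanColumns), 2);
    }

    public static void main(String[] args)
    {
        // Non-CQL query, limit 100, about 50 columns per row: page size equals the limit.
        System.out.println(rowsPerQuery(100, false, 50));                // prints 100
        // CQL query, LIMIT 100, about 5000 columns per row: 100 / 5000 == 0, so the floor of 2 applies.
        System.out.println(rowsPerQuery(100, true, 5000));               // prints 2
        // CQL query where maxResults is Integer.MAX_VALUE (no effective limit).
        System.out.println(rowsPerQuery(Integer.MAX_VALUE, true, 5000)); // prints 429496
    }
}
```

The second call shows the problematic case: integer division drives the quotient to 0 and the page size collapses to the minimum of 2.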

So the question is how to calculate the page size for the index CF so that we avoid too many random seeks between the index CF and the base CF, while also avoiding fetching so many index columns that we run out of memory.
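One possible shape for an answer, purely as an illustration and not the approach taken in the attached patch, is to clamp the computed value between a floor and a ceiling. The class name and both bounds below are hypothetical; suitable values would have to come from measurement.

```java
final class IndexPageSize
{
    // Hypothetical bounds, not values taken from Cassandra or from 6348.txt.
    static final int MIN_PAGE_SIZE = 1_000;  // large enough to limit random seeks
    static final int MAX_PAGE_SIZE = 10_000; // small enough to limit memory use

    // Keeps the requested page size within [MIN_PAGE_SIZE, MAX_PAGE_SIZE].
    static int clamp(int requested)
    {
        return Math.max(MIN_PAGE_SIZE, Math.min(requested, MAX_PAGE_SIZE));
    }

    public static void main(String[] args)
    {
        System.out.println(clamp(2));      // prints 1000
        System.out.println(clamp(5_000));  // prints 5000
        System.out.println(clamp(429_496)); // prints 10000
    }
}
```

A fixed floor trades some extra index reads on small result sets for far fewer round trips when the filter rejects most rows.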


> TimeoutException throws if Cql query allows data filtering and index is too big and it can't find the data in base CF after filtering 
> --------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-6348
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6348
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Alex Liu
>            Assignee: Alex Liu
>         Attachments: 6348.txt
>
>
> If the index row is too big and filtering can't find a matching CQL row in the base CF, it keeps scanning the index row and reading from the base CF until the index row has been scanned completely, which may take too long, and the thrift server returns a TimeoutException. This is one of the reasons why we shouldn't index a column if the index is too big.
> Merging multiple indexes can resolve the case where there are only EQUAL clauses (CASSANDRA-6048 addresses it).
> If the query has non-EQUAL clauses, we still need to do data filtering, which might lead to a timeout exception.
> We can either disable those kinds of queries or WARN the user that data filtering might lead to a timeout exception or OOM.


--
This message was sent by Atlassian JIRA
(v6.1#6144)