cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (CASSANDRA-8824) cassandra python driver return None when querying static column on partition bigger than 5000 entites
Date Wed, 18 Feb 2015 14:10:11 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-8824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne resolved CASSANDRA-8824.
-----------------------------------------
    Resolution: Duplicate

This is a duplicate of CASSANDRA-8502.

> cassandra python driver return None when querying static column on partition bigger than 5000 entites
> -----------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-8824
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8824
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Mateusz Moneta
>
> When we query a partition with a static column that has more than 5000 entities, some of the returned rows have the static value unset; querying through cqlsh, however, returns everything correctly.
> Here is an example; {{expire}} is a static column and {{folder_id}} is the primary key.
> {noformat}
> cqlsh> select id, parent_id, expire, mtime from share.entity where folder_id='68f2af3a2d1e4f95a231d5cb47e57cf2' and mtime < '2015-02-01 06:21:25+0000';
>  id                               | parent_id | expire                   | mtime
> ----------------------------------+-----------+--------------------------+--------------------------
>  68f2af3a2d1e4f95a231d5cb47e57cf2 |      null | 2015-02-22 10:51:27+0000 | 2015-02-01 06:21:24+0000
> cqlsh> select count(*) from share.entity where folder_id='68f2af3a2d1e4f95a231d5cb47e57cf2';
>  count
> -------
>   5547
> In [1]: from django.db import connection
> In [2]: ses = connection.connection.session
> In [3]: from cassandra.query import SimpleStatement
> In [13]: query = "select * from share.entity where folder_id='68f2af3a2d1e4f95a231d5cb47e57cf2'";
> In [14]: st = SimpleStatement(query)
> In [15]: c, d = 0, 0
> In [16]: for e in ses.execute(st):
>     if e['expire'] is None:
>         c += 1
>     else:
>         d += 1
> In [17]: c
> Out[17]: 547
> In [18]: d
> Out[18]: 5000
> {noformat}
> After further digging, it turned out that this is a problem with the fetch_size parameter, and it can be easily reproduced:
> {noformat}
> In [1]: from cassandra.query import SimpleStatement
> In [2]: from django.db import connection
> In [3]: ses = connection.connection.session
> In [4]: ses.execute(SimpleStatement("create table t (k text, s text static, i int, primary key(k, i));"))
> In [5]: for i in range(1, 500):
>    ....:     ses.execute(SimpleStatement("insert into t (k, i) values ('k', %d);" % i))
> In [6]: c, d = 0, 0
> In [7]: for e in ses.execute(SimpleStatement("select * from t", fetch_size=100)):
>     if e['s'] is None:
>         c += 1
>     else:
>         d += 1
>    ....:         
> In [8]: c
> Out[8]: 400
> In [9]: d
> Out[9]: 100
> {noformat}
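
The counts in both sessions above are consistent with the static value being present only on the first page of results: 5000 matches the Python driver's default fetch size, and 100 is the fetch_size set explicitly in the reproduction. The following is a minimal pure-Python sketch of that arithmetic, not driver or server code. The function names and the "first page only" model are assumptions made for illustration, and no Cassandra connection is involved.

```python
def simulate_paged_rows(total_rows, fetch_size, static_value="s"):
    """Toy model (an assumption, not real driver behaviour): only rows
    that fall on the first page carry the static column value."""
    for index in range(total_rows):
        # With paging disabled (fetch_size is None) everything is one page.
        page = index // fetch_size if fetch_size else 0
        yield {"i": index, "s": static_value if page == 0 else None}


def count_static(total_rows, fetch_size):
    """Return (rows with the static value unset, rows with it set)."""
    unset = sum(
        1 for row in simulate_paged_rows(total_rows, fetch_size)
        if row["s"] is None
    )
    return unset, total_rows - unset


if __name__ == "__main__":
    print(count_static(500, 100))     # reproduction: 500 rows, fetch_size=100
    print(count_static(5547, 5000))   # original report: 5547 rows, page of 5000
    print(count_static(5547, None))   # single page: nothing unset in this model
```

Under this model, a partition that fits in a single page shows no unset values, which agrees with the report that only partitions larger than the page size were affected.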



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
