cassandra-user mailing list archives

From Carlos Alonso <i...@mrcalonso.com>
Subject Re: Sorting & pagination in apache cassandra 2.1
Date Tue, 12 Jan 2016 10:50:26 GMT
Hi Anuja.

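Cassandra stores the rows within a partition on disk sorted by the clustering
columns. In this case you haven't declared any: user_name alone is the primary
key, so it is the partition key and each partition holds a single row. An
unfiltered SELECT returns partitions in token order (the hash of user_name
computed by the partitioner), so the ordering by birth_year you see is most
likely a coincidence of these three rows rather than a sort on that column.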
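
To get a guaranteed ordering by birth_year, a clustering column has to be
declared explicitly in the primary key. A minimal sketch, assuming a
hypothetical users_by_state table (the table name and the choice of state as
the partition key are illustrative, not taken from your schema):

```cql
-- Hypothetical table: state is the partition key,
-- birth_year and user_name are the clustering columns.
CREATE TABLE users_by_state (
    state varchar,
    birth_year bigint,
    user_name varchar,
    gender varchar,
    PRIMARY KEY ((state), birth_year, user_name)
) WITH CLUSTERING ORDER BY (birth_year ASC, user_name ASC);

-- Rows inside one partition come back sorted by birth_year, then user_name.
SELECT user_name, birth_year FROM users_by_state WHERE state = 'Gujarat';
```

The ordering only applies within a single partition, so the query has to
restrict the partition key for the result to be sorted.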

Regards

Carlos Alonso | Software Engineer | @calonso <https://twitter.com/calonso>

On 12 January 2016 at 07:30, anuja jain <anujajain4@gmail.com> wrote:

> One more question: what is meant by "cassandra inherently sorts data"?
> For example, I have a table with this schema:
>
> CREATE TABLE users (
>     user_name varchar PRIMARY KEY,
>     password varchar,
>     gender varchar,
>     session_token varchar,
>     state varchar,
>     birth_year bigint
> );
>
> I inserted data in random order, but when I run a select statement I get
> the data sorted by birth_year. Why does this happen?
>
>  cqlsh:learning> select * from users;
>
>  user_name | birth_year | gender | password | session_token | state
> -----------+------------+--------+----------+---------------+---------
>       John |       1979 |      M |     qwer |           abc |      JK
>    Dharini |       1980 |      F |      Xyz |           abc | Gujarat
>      Keval |       1990 |      M |      DDD |           abc |      WB
>
>
> On Tue, Jan 12, 2016 at 12:52 PM, anuja jain <anujajain4@gmail.com> wrote:
>
>> What is the alternative if my Cassandra version is prior to 3.0
>> (specifically 2.1) and is already in production?
>>
>> Also, the docs at
>>
>> https://docs.datastax.com/en/datastax_enterprise/4.6/datastax_enterprise/srch/srchCapazty.html
>>
>> say we need to do capacity planning if we want to search using Solr. What
>> does that mean, and what is the alternative when we do not know the size of
>> the data?
>>
>>  Thanks,
>>
>> Anuja
>>
>>
>>
>> On Fri, Jan 8, 2016 at 12:15 AM, Tyler Hobbs <tyler@datastax.com> wrote:
>>
>>>
>>> On Thu, Jan 7, 2016 at 6:45 AM, anuja jain <anujajain4@gmail.com> wrote:
>>>
>>>> My question is, what is the alternative if we need to order by col3 or
>>>> col4 in my above example without including col2 in the ORDER BY clause?
>>>>
>>>
>>> The server-side alternative is to create a second table (or a
>>> materialized view, if you're using 3.0+) that uses a different clustering
>>> order.  Cassandra purposefully only supports simple and efficient queries
>>> that can be handled quickly (with a few exceptions), and arbitrary ordering
>>> is not part of that, especially if you consider complications like paging.
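>>>
>>> A minimal sketch of the second-table approach, assuming hypothetical
>>> tables items_by_col2 and items_by_col3 with col1 as the partition key
>>> (the table names, column types, and key layout are illustrative guesses,
>>> not the schema from the earlier message):
>>>
>>> ```cql
>>> -- Hypothetical base table: clustered by col2 first.
>>> CREATE TABLE items_by_col2 (
>>>     col1 int, col2 int, col3 int, col4 int,
>>>     PRIMARY KEY ((col1), col2, col3, col4)
>>> );
>>>
>>> -- Second table holding the same data, clustered by col3 first.
>>> CREATE TABLE items_by_col3 (
>>>     col1 int, col2 int, col3 int, col4 int,
>>>     PRIMARY KEY ((col1), col3, col2, col4)
>>> );
>>>
>>> -- The application writes every row to both tables.
>>> BEGIN BATCH
>>>     INSERT INTO items_by_col2 (col1, col2, col3, col4) VALUES (1, 10, 7, 3);
>>>     INSERT INTO items_by_col3 (col1, col2, col3, col4) VALUES (1, 10, 7, 3);
>>> APPLY BATCH;
>>>
>>> -- Rows in this partition come back ordered by col3, with no col2 in ORDER BY.
>>> SELECT * FROM items_by_col3 WHERE col1 = 1;
>>> ```
>>>
>>> The cost is that every write goes to both tables, which is the work a
>>> materialized view would otherwise do for you on 3.0+.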
>>>
>>>
>>> --
>>> Tyler Hobbs
>>> DataStax <http://datastax.com/>
>>>
>>
>>
>
