cassandra-user mailing list archives

From Sylvain Lebresne <sylv...@yakaz.com>
Subject Re: Just starting to play with Cassandra: (Surely) Dumb Question
Date Sat, 17 Apr 2010 15:09:13 GMT
On Sat, Apr 17, 2010 at 4:52 PM, Lucas Di Pentima
<lucas@di-pentima.com.ar> wrote:
> Hello Jonathan,
>
> I supposed the same, that's why I tried the count_columns() call, but when I try it with some big SCF, I get the same error message:
>
> Thrift::TransportException: Socket: Timed out reading 4096 bytes from 127.0.0.1:9160
>
> Should I use count_columns() or is there any other way to know how many columns exist?

get_count() (which, although I don't know the Ruby gem, is most probably
what count_columns() uses under the hood) actually reads the whole row
and simply returns the number of columns found. The only thing you gain
by counting columns instead of requesting them is that you don't have to
pull all the columns over the network. Counting is therefore (roughly)
as costly as requesting the whole row, so it is no wonder it times out
in your case.

Once https://issues.apache.org/jira/browse/CASSANDRA-744 is included,
you'll be able to count the columns chunk by chunk (but it will still be
as costly as reading the row chunk by chunk, except for the network
transfer of all those columns).
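
In the meantime, counting can be done on the client by reading the row
in pages and adding up the page sizes, at the price of transferring
every column. Below is a minimal sketch of that idea, not the gem's own
method. The helper count_columns_paged and its fetch_page argument are
hypothetical names; fetch_page is assumed to return an ordered Hash of
at most `count` columns beginning at `start` (inclusive). An in-memory
stand-in replaces the real row so the sketch runs without a server.

```ruby
# Hypothetical helper: counts the columns of one row by reading it in pages.
# fetch_page is an assumption: any callable taking (start, count) and
# returning an ordered Hash of column name => value, beginning at `start`
# (inclusive) and holding at most `count` entries.
def count_columns_paged(fetch_page, page_size = 1000)
  raise ArgumentError, 'page_size must be at least 2' if page_size < 2

  total = 0
  start = ''          # an empty start means "from the first column"
  first_page = true

  loop do
    page = fetch_page.call(start, page_size)
    names = page.keys
    # The start column is inclusive, so each page after the first begins
    # with the last column of the previous page; do not count it twice.
    names.shift if !first_page && names.first == start
    total += names.size
    break if page.size < page_size   # a short page means the row is exhausted
    start = page.keys.last
    first_page = false
  end

  total
end

# In-memory stand-in for a row, used only to exercise the helper.
SAMPLE_COLUMNS = ('a'..'g').to_a
sample_fetch = lambda do |start, count|
  names = SAMPLE_COLUMNS.select { |name| name >= start }.first(count)
  Hash[names.map { |name| [name, 'value'] }]
end

sample_total = count_columns_paged(sample_fetch, 3)
```

Against a real cluster, fetch_page could wrap the gem's get(), for
instance lambda { |s, n| db.get('SomeSCFName', 'SomeKey', :start => s,
:count => n) }, assuming your gem version accepts a :start option
alongside :count. A page size of a few hundred to a few thousand
columns should keep each call well below the socket timeout.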

--
Sylvain

>
> Best regards
>
> El 17/04/2010, a las 01:14, Jonathan Ellis escribió:
>
>> You're supposed to request a few hundred or thousand columns per call,
>> then if you need more request the next set using the start parameter.
>>
>> On Fri, Apr 16, 2010 at 7:13 PM, Lucas Di Pentima
>> <lucas@di-pentima.com.ar> wrote:
>>> Hello all,
>>>
>>> I'm playing with Cassandra 0.6.0-rc1 on Mac OS X, with the 'cassandra' ruby gem.
>>>
>>> I loaded some test data into it and was trying the gem's get() API when I realized that if I call it like this:
>>>
>>> db.get('SomeSCFName', 'SomeKey')
>>>
>>> It returned only 100 subcolumns, although 'SomeKey' has approx 150000 subcolumns. Next I tried calling get() like this:
>>>
>>> db.get('SomeSCFName', 'SomeKey', :count => N)
>>>
>>> My problem is that when N is a number higher than 50000 (approximately), I get the following error:
>>>
>>> Thrift::TransportException: Socket: Timed out reading 4096 bytes from 127.0.0.1:9160
>>>
>>> The same happens if I call:
>>>
>>> db.count_columns('SomeSCFName', 'SomeKey')
>>>
>>> ...on the same 'SomeKey', but if I call count_columns() with some other key that holds fewer columns, it works without problems.
>>>
>>> My setup is:
>>>
>>> * Cassandra 0.6.0-rc1 downloaded from the website, with all default configurations
>>> * Ruby 1.8.7
>>> * Cassandra gem 0.8.1
>>> * MacOSX 1.6.3
>>>
>>> Any help will be appreciated!
>>>
>>> --
>>> Lucas Di Pentima - Santa Fe, Argentina
>>> Jabber: lucas@di-pentima.com.ar
>>> MSN: ldipenti75@hotmail.com
>
> --
> Lucas Di Pentima - Santa Fe, Argentina
> Jabber: lucas@di-pentima.com.ar
> MSN: ldipenti75@hotmail.com
