cassandra-user mailing list archives

From Ryan Svihla <rsvi...@datastax.com>
Subject Re: Get column family size
Date Thu, 11 Dec 2014 21:27:21 GMT
An estimated partition key count is available from nodetool cfstats. For
analytics-style queries over large data sets (such as verifying a large
data set), however, I recommend Spark, Hive, Hadoop, or even Solr for some
use cases.
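
As a minimal sketch of reading that estimate programmatically, the Python
below pulls the per-table key estimate out of cfstats-style text. The sample
text is illustrative only (not captured from a real cluster), the
word_frequency table is hypothetical, and the "Table:" and "Number of keys
(estimate):" labels are assumptions about the 2.1-era output layout, so check
them against what your version actually prints.

```python
import re

# Illustrative text only: it mimics the assumed layout of `nodetool cfstats`
# output and is not captured from a real cluster. `word_frequency` is a
# hypothetical second table added to show per-table handling.
SAMPLE_CFSTATS = """\
Keyspace: corpus
\tTable: word_usage
\t\tSSTable count: 4
\t\tNumber of keys (estimate): 1234567
\tTable: word_frequency
\t\tNumber of keys (estimate): 890
"""

# Assumed labels; older releases may print "Column Family:" instead of "Table:".
TABLE_LINE = re.compile(r"^\s*(?:Table|Column Family):\s*(\S+)")
KEYS_LINE = re.compile(r"Number of keys \(estimate\):\s*(\d+)")


def estimated_keys(cfstats_text):
    """Map each table name to its estimated partition key count."""
    estimates = {}
    current_table = None
    for line in cfstats_text.splitlines():
        table_match = TABLE_LINE.match(line)
        if table_match:
            # Remember which table the following statistics belong to.
            current_table = table_match.group(1)
            continue
        keys_match = KEYS_LINE.search(line)
        if keys_match and current_table is not None:
            estimates[current_table] = int(keys_match.group(1))
    return estimates


if __name__ == "__main__":
    print(estimated_keys(SAMPLE_CFSTATS))
```

In practice the text would come from running nodetool cfstats on a node
rather than from a constant. Keep in mind that the figure is an estimate and
that cfstats reports on the node it is run against, not the whole cluster.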

On Thu, Dec 11, 2014 at 3:10 PM, Philip Thompson <
philip.thompson@datastax.com> wrote:
>
> Chamila,
>
> You can find more detailed explanations in previous posts on this mailing
> list as to why, but a "Select count(*) from table;" query is inefficient in
> Cassandra for non-trivial datasets. You will need a better way to get the
> number of partition keys of a CF, which hopefully someone else in the user
> list can provide, as I have never needed to do that.
>
> On Thu, Dec 11, 2014 at 1:59 PM, Chamila Wijayarathna <
> cdwijayarathna@gmail.com> wrote:
>
>> Hi Philip,
>>
>> Yes, I'm using cqlsh. Is there any way I can solve this?
>>
>> Thank You!
>>
>> On Fri, Dec 12, 2014 at 12:26 AM, Philip Thompson <
>> philip.thompson@datastax.com> wrote:
>>
>>> I assume the query you are sending is through cqlsh. You are actually
>>> getting a client-side timeout error, which is unclear in 2.1.2, but I
>>> believe the error message will be more helpful as of 2.1.3.
>>>
>>> On Thu, Dec 11, 2014 at 1:52 PM, Chamila Wijayarathna <
>>> cdwijayarathna@gmail.com> wrote:
>>>
>>>> Hello all,
>>>>
>>>> I am trying to get the number of key value pairs.
>>>>
>>>> I used the following query for this:
>>>>
>>>> select count(*) from corpus.word_usage ;
>>>>
>>>> This returns the number of key-value pairs when the CF is relatively
>>>> small. But when I insert more key-value pairs, I get an error saying,
>>>> "errors={}, last_host=127.0.0.1".
>>>>
>>>> What is the reason for this? Is there any better way to get the size
>>>> (number of key value pairs) of a CF in CQL?
>>>>
>>>> Thank You!
>>>>
>>>> --
>>>> *Chamila Dilshan Wijayarathna,*
>>>> SMIEEE, SMIESL,
>>>> Undergraduate,
>>>> Department of Computer Science and Engineering,
>>>> University of Moratuwa.
>>>>
>>>
>>>
>>
>>
>>
>
>

-- 

Ryan Svihla

Solution Architect
