cassandra-user mailing list archives

From siddharth verma <sidd.verma29.l...@gmail.com>
Subject Re: Count(*) is not working
Date Fri, 17 Feb 2017 11:12:20 GMT
Hi,
We faced this issue too.
You could try a smaller paging size, so that the tombstone threshold isn't
breached.

Try using "PAGING 500" in cqlsh
[ https://docs.datastax.com/en/cql/3.3/cql/cql_reference/cqlshPaging.html ]
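
To see why page size matters here, a rough back-of-the-envelope sketch follows.
It assumes tombstones are spread evenly, so the number scanned per page grows
linearly with the page size, and it takes the ratio from one of the warnings
quoted below (100 live rows, 1423 tombstone cells). The threshold values are
what I believe the cassandra.yaml defaults to be for tombstone_warn_threshold
and tombstone_failure_threshold; check your own config. The function names are
made up for illustration.

```python
# Rough estimate only: assumes tombstones are evenly spread, so the number
# scanned per page scales linearly with the page size.
TOMBSTONE_WARN_THRESHOLD = 1000       # assumed cassandra.yaml default
TOMBSTONE_FAILURE_THRESHOLD = 100000  # assumed cassandra.yaml default


def tombstones_per_page(page_size, live_rows=100, tombstone_cells=1423):
    """Scale the live/tombstone ratio from one logged warning to a page size."""
    return page_size * tombstone_cells / live_rows


def classify(page_size):
    """Return 'ok', 'warn' or 'fail' for the estimated tombstones per page."""
    estimate = tombstones_per_page(page_size)
    if estimate > TOMBSTONE_FAILURE_THRESHOLD:
        return "fail"
    if estimate > TOMBSTONE_WARN_THRESHOLD:
        return "warn"
    return "ok"


if __name__ == "__main__":
    for size in (50, 100, 500, 10000):
        print(size, tombstones_per_page(size), classify(size))
```

Under these assumptions a page of 500 rows would still log the warning but
stay well below the failure threshold, which is the one that aborts the read.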

Similarly, the paging size (fetch size) can be set in the Java driver as well.

This is a workaround.
Since you are getting this warning, do review your data model as well.

Regards


On Fri, Feb 17, 2017 at 4:36 PM, Sylvain Lebresne <sylvain@datastax.com>
wrote:

> On Fri, Feb 17, 2017 at 11:54 AM, kurt greaves <kurt@instaclustr.com>
> wrote:
>
>> If you want a reliable count, you should use Spark. Performing a count(*)
>> will inevitably fail unless you make your server read timeouts and
>> tombstone fail thresholds ridiculous.
>>
>
> That's just not true. count(*) is paged internally, so while it is not
> particularly fast, it shouldn't require bumping either the read timeout or
> the tombstone fail threshold in any way to work.
>
> In that case, it seems the partition does have many tombstones (more than
> live rows) and so the tombstone threshold is doing its job of warning about
> it.
>
>
>>
>> On 17 Feb. 2017 04:34, "Jan" <jan@dafuer.de> wrote:
>>
>>> Hi,
>>>
>>> could you post the output of nodetool cfstats for the table?
>>>
>>> Cheers,
>>>
>>> Jan
>>>
>>> On 16.02.2017 at 17:00, Selvam Raman wrote:
>>>
>>> I am not getting a count as the result. Instead, I keep getting any number
>>> of messages like the one below.
>>>
>>> Read 100 live rows and 1423 tombstone cells for query SELECT * FROM
>>> keysace.table WHERE token(id) > token(test:ODP0144-0883E-022R-002/047-052)
>>> LIMIT 100 (see tombstone_warn_threshold)
>>>
>>> On Thu, Feb 16, 2017 at 12:37 PM, Jan Kesten <jan@dafuer.de> wrote:
>>>
>>>> Hi,
>>>>
>>>> did you get a result in the end?
>>>>
>>>> Those messages are simply warnings telling you that c* had to read many
>>>> tombstones while processing your query - rows that are deleted but not
>>>> yet garbage collected/compacted. The warning explains why things might be
>>>> much slower than expected: for every 100 rows that count, c* had to read
>>>> about 15 times as many entries that were already deleted.
>>>>
>>>> Apart from that, count(*) is almost always slow - and there is a
>>>> default limit of 10,000 rows in a result.
>>>>
>>>> Do you really need the actual live count? To get an idea you can always
>>>> look at nodetool cfstats (but those numbers also contain deleted rows).
>>>>
>>>>
>>>> On 16.02.2017 at 13:18, Selvam Raman wrote:
>>>>
>>>> Hi,
>>>>
>>>> I want to know the total record count of the table.
>>>>
>>>> I ran the query below:
>>>>        select count(*) from tablename;
>>>>
>>>> and I got the output below:
>>>>
>>>> Read 100 live rows and 1423 tombstone cells for query SELECT * FROM
>>>> keysace.table WHERE token(id) > token(test:ODP0144-0883E-022R-002/047-052)
>>>> LIMIT 100 (see tombstone_warn_threshold)
>>>>
>>>> Read 100 live rows and 1435 tombstone cells for query SELECT * FROM
>>>> keysace.table WHERE token(id) > token(test:2565-AMK-2) LIMIT 100 (see
>>>> tombstone_warn_threshold)
>>>>
>>>> Read 96 live rows and 1385 tombstone cells for query SELECT * FROM
>>>> keysace.table WHERE token(id) > token(test:-2220-UV033/04) LIMIT 100 (see
>>>> tombstone_warn_threshold).
>>>>
>>>>
>>>>
>>>>
>>>> Can you please help me get the total count of the table?
>>>>
>>>> --
>>>> Selvam Raman
>>>> "லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"
>>>>
>>>>
>>>
>>>
>>> --
>>> Selvam Raman
>>> "லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"
>>>
>>>
>>>
>


-- 
Siddharth Verma
(Visit https://github.com/siddv29/cfs for a high speed cassandra full table
scan)
