cassandra-commits mailing list archives

From "Tyler Hobbs (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-8938) Full Row Scan does not count towards Reads
Date Thu, 09 Apr 2015 15:58:14 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-8938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14487564#comment-14487564
] 

Tyler Hobbs commented on CASSANDRA-8938:
----------------------------------------

[~eanujwa] so, there are two separate issues here: the read count and latency metrics (what
you see in cfstats), and the hotness measurements for sstables.  We don't have to update them
the same way.

Regarding metrics, I would be okay with having separate range scan count and latency metrics.
 We need to decide exactly how those metrics behave, though (e.g. increment the read count for
each full scan, or each partition scanned, or each row scanned?).
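
To make the three options concrete, here is a rough sketch in Python (not Cassandra code; {{ScanCountMode}} and {{scan_count_increment}} are hypothetical names invented for illustration) of how much a single range scan would add to the scan count under each choice:

```python
from enum import Enum


class ScanCountMode(Enum):
    """Hypothetical choices for what one range scan adds to the scan count."""
    PER_SCAN = "scan"
    PER_PARTITION = "partition"
    PER_ROW = "row"


def scan_count_increment(mode, rows_per_partition):
    """Return the increment for one scan.

    rows_per_partition holds one entry per partition the scan touched,
    each entry being the number of rows read from that partition.
    """
    if mode is ScanCountMode.PER_SCAN:
        return 1
    if mode is ScanCountMode.PER_PARTITION:
        return len(rows_per_partition)
    if mode is ScanCountMode.PER_ROW:
        return sum(rows_per_partition)
    raise ValueError(f"unknown mode: {mode!r}")


# One scan that touched three partitions holding 3, 5 and 2 rows.
rows_per_partition = [3, 5, 2]
increments = {
    mode.value: scan_count_increment(mode, rows_per_partition)
    for mode in ScanCountMode
}
print(increments)  # {'scan': 1, 'partition': 3, 'row': 10}
```

The same scan moves the counter by 1, 3 or 10 depending on the definition, which is why the semantics need to be settled before the metric is added.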

For the hotness measurements, I do _not_ think we should increment the read count for each
row (or even partition) in a scan.  After the removal of {{cold_reads_to_omit}} in CASSANDRA-8860,
the hotness measurements do two things: prioritize compaction of certain sstables when there
are multiple sstable sets that can be compacted, and determine the amount of space to allocate
for the index summary for an sstable.  Since the index summary is far more important for partition
reads than scans, I think we can agree that scans shouldn't have a big impact on this.  For
prioritizing compaction, the absolute read numbers don't matter, only how large they are relative
to each other.  So, incrementing the count by one for each scan should be sufficient to handle
a scan-only workload.  If the workload is mixed, I think it's okay if partition reads have
a greater influence on compaction prioritization than range scans do.
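
A toy illustration of the relative-ordering argument, in Python. Plain integer counters stand in for whatever Cassandra actually tracks per sstable, and {{apply_workload}} and {{hottest_candidate}} are hypothetical helpers, so treat this as a sketch under those assumptions rather than the real compaction logic:

```python
def apply_workload(partition_reads, scans):
    """Combine reads into one counter per compaction candidate.

    Assumption for this sketch: every partition read adds 1 and
    every range scan also adds exactly 1, regardless of how many
    rows or partitions the scan covered.
    """
    names = set(partition_reads) | set(scans)
    return {
        name: partition_reads.get(name, 0) + scans.get(name, 0)
        for name in names
    }


def hottest_candidate(read_counts):
    """Pick the candidate with the largest count; only the ordering matters."""
    return max(read_counts, key=read_counts.get)


# Scan-only workload: +1 per scan is enough to rank the candidates.
scan_only = apply_workload({}, {"set_a": 4, "set_b": 1})

# Mixed workload: partition reads on set_b outweigh the scans on set_a.
mixed = apply_workload({"set_b": 500}, {"set_a": 4, "set_b": 1})

print(hottest_candidate(scan_only))  # set_a
print(hottest_candidate(mixed))      # set_b
```

In the scan-only case the candidate scanned most often still wins, and in the mixed case partition reads dominate, which matches the behavior described above.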

> Full Row Scan does not count towards Reads
> ------------------------------------------
>
>                 Key: CASSANDRA-8938
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8938
>             Project: Cassandra
>          Issue Type: Bug
>          Components: API, Core, Tools
>         Environment: Unix, Cassandra 2.0.3
>            Reporter: Amit Singh Chowdhery
>            Assignee: Marcus Eriksson
>            Priority: Minor
>              Labels: none
>
> When a CQL SELECT statement is executed with WHERE clause, Read Count is incremented
in cfstats of the column family. But, when a full row scan is done using SELECT statement
without WHERE clause, Read Count is not incremented. 
> Similarly, when using Size Tiered Compaction, if we do a full row scan using Hector RangeslicesQuery,
Read Count is not incremented in cfstats, and Cassandra still considers all sstables as cold and
does not trigger compaction for them. If we fire MultigetSliceQuery, Read Count is incremented
and sstables become hot, triggering compaction of these sstables. 
> Expected Behavior:
> 1. Read Count must be incremented by number of rows read during a full row scan done
using CQL SELECT statement or Hector RangeslicesQuery.
> 2. Size Tiered compaction must consider all sstables as Hot after a full row scan.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
