cassandra-commits mailing list archives

From "Brandon Williams (JIRA)" <j...@apache.org>
Subject [jira] Updated: (CASSANDRA-1155) keep persistent row statistics
Date Thu, 22 Jul 2010 18:52:51 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-1155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-1155:
----------------------------------------

    Attachment: 1155-v3.txt

v3 builds on v2: it finishes the TODOs in CFS and loads the persistent statistics into the
SSTableReader (SSTR) when opening an existing SSTable.  It has a deadlock when flushing,
though.  A flush of A goes to write its stats, but meanwhile B has acquired the flusherlock
in preparation to flush.  A can't acquire the lock to do the stats write, and B can't release
the lock because we only allow N flushes at a time.  Because that's pretty hairy, I'm going
to go the route of storing a separate -Statistics.db instead, but am posting this patch in
case it turns out to be useful later.
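
The flush standoff described above can be reproduced in a small model. This is only a sketch under stated assumptions, not the patch's code: it is written in Python rather than Cassandra's Java, the flusherlock is modelled as a plain mutex, "N flushes at a time" as a semaphore with N slots, and every name (simulate_flush_standoff, flush_slots, and so on) is hypothetical. Timeouts replace the indefinite blocking so that the demonstration terminates.

```python
import threading


def simulate_flush_standoff(max_flushes, timeout=0.3):
    """Model two flushes contending for a lock and a bounded set of flush slots.

    Returns (a_got_lock, b_got_slot). All names are illustrative assumptions.
    """
    flusher_lock = threading.Lock()                 # stands in for the flusherlock
    flush_slots = threading.Semaphore(max_flushes)  # "only N flushes at a time"
    a_in_flight, b_holds_lock = threading.Event(), threading.Event()
    a_done, b_done = threading.Event(), threading.Event()
    outcome = {}

    def flush_a():
        flush_slots.acquire()            # A is an in-flight flush and owns a slot
        a_in_flight.set()
        b_holds_lock.wait()              # B grabs the lock before A writes stats
        got = flusher_lock.acquire(timeout=timeout)  # A needs the lock for stats
        outcome["a_got_lock"] = got
        if got:
            flusher_lock.release()
        a_done.set()
        b_done.wait()                    # A keeps its slot until B has tried
        flush_slots.release()

    def flush_b():
        a_in_flight.wait()
        flusher_lock.acquire()           # B takes the lock in preparation to flush
        b_holds_lock.set()
        got = flush_slots.acquire(timeout=timeout)   # B needs a free flush slot
        outcome["b_got_slot"] = got
        if got:
            flush_slots.release()        # B's flush went ahead, so it lets go
            flusher_lock.release()
            b_done.set()
        else:
            b_done.set()
            a_done.wait()                # B keeps the lock until A has tried
            flusher_lock.release()

    threads = [threading.Thread(target=flush_a), threading.Thread(target=flush_b)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return outcome["a_got_lock"], outcome["b_got_slot"]


if __name__ == "__main__":
    print("one slot: ", simulate_flush_standoff(1))
    print("two slots:", simulate_flush_standoff(2))
```

With a single slot neither side makes progress, which is the deadlock once the timeouts are removed. With a spare slot B proceeds and releases the lock, so A can write its stats. Writing the statistics to a separate file avoids taking the lock on that path at all.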

> keep persistent row statistics
> ------------------------------
>
>                 Key: CASSANDRA-1155
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1155
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Brandon Williams
>             Fix For: 0.7 beta 1
>
>         Attachments: 1155-v2.txt, 1155-v3.txt, 1155.txt
>
>
> during flush and compaction we should keep row size statistics using EstimatedHistogram
> (column count, and row size), replacing min/max/total sizes in CFS.
> having this detail will let us estimate, given an index CF, how many nodes we need to
> query to get the number of matching rows requested by the client.
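
As a rough illustration of the idea in the description, the sketch below keeps counts in buckets with growing boundaries, so a fixed and small number of counters can summarise row sizes or column counts over a wide range. It is not Cassandra's EstimatedHistogram: the class name BucketedHistogram, the growth factor of 1.2, and the method names are assumptions chosen for the example, and it is in Python rather than Java.

```python
import bisect


class BucketedHistogram:
    """Fixed-size histogram with growing bucket bounds (illustrative only)."""

    def __init__(self, bucket_count=90, growth=1.2):
        # Upper bounds: each is about `growth` times the previous, always increasing.
        offsets = [1]
        while len(offsets) < bucket_count:
            offsets.append(max(offsets[-1] + 1, round(offsets[-1] * growth)))
        self.offsets = offsets
        # One counter per bound, plus a final overflow counter.
        self.counts = [0] * (bucket_count + 1)

    def add(self, value):
        # Bucket i holds values v with offsets[i-1] < v <= offsets[i].
        self.counts[bisect.bisect_left(self.offsets, value)] += 1

    def merge(self, other):
        # Combining per-bucket counts, e.g. when several tables are merged.
        if self.offsets != other.offsets:
            raise ValueError("histograms use different bucket bounds")
        self.counts = [a + b for a, b in zip(self.counts, other.counts)]

    def total(self):
        return sum(self.counts)


if __name__ == "__main__":
    row_sizes = BucketedHistogram()
    column_counts = BucketedHistogram()
    for size, columns in [(120, 3), (4096, 40), (130, 3)]:
        row_sizes.add(size)
        column_counts.add(columns)
    print(row_sizes.total(), column_counts.total())
```

Because the bounds are fixed up front, two histograms can be merged by adding counters, which is what makes the statistics cheap to carry through both flush and compaction.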

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

