cassandra-commits mailing list archives

From "Brandon Williams (JIRA)" <>
Subject [jira] Updated: (CASSANDRA-1155) keep persistent row statistics
Date Thu, 22 Jul 2010 18:52:51 GMT


Brandon Williams updated CASSANDRA-1155:

    Attachment: 1155-v3.txt

v3 builds on v2: it finishes the TODOs in CFS and loads the persistent statistics on SSTRs
when opening an existing SST.  It has a deadlock problem when flushing, though: a flush of
A goes to write the stats, but meanwhile B has acquired the flusherlock in preparation to
flush.  A can't acquire the lock to do the stats write, and B can't release the lock because
we only allow N flushes at a time.  Because that's pretty hairy, I'm going to store the
statistics in a separate -Statistics.db instead, but I'm posting this patch in case it turns
out to be useful later.
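
The circular wait described above can be sketched in isolation.  This is a minimal
illustration, not code from the patch: it assumes N = 1, models the flush limit with a
Semaphore and the flusher lock with a plain ReentrantLock, and every name in it
(FlushDeadlockSketch, flushSlots, and so on) is hypothetical.  Timeouts are used so the
demo terminates instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for the flush path; none of these names come from the patch.
class FlushDeadlockSketch {
    // Returns {aTimedOut, bTimedOut}; both true means the circular wait was reproduced.
    static boolean[] run() {
        final Semaphore flushSlots = new Semaphore(1);         // "N flushes at a time", N = 1
        final ReentrantLock flusherLock = new ReentrantLock(); // plays the role of the flusher lock
        final CountDownLatch aHasSlot = new CountDownLatch(1);
        final CountDownLatch bHasLock = new CountDownLatch(1);
        final CountDownLatch aGaveUp = new CountDownLatch(1);
        final CountDownLatch bGaveUp = new CountDownLatch(1);
        final boolean[] timedOut = new boolean[2];

        Thread a = new Thread(() -> {
            try {
                flushSlots.acquire();            // A's flush is in progress
                try {
                    aHasSlot.countDown();
                    bHasLock.await();            // B takes the lock in the meantime
                    // A now needs the lock for its stats write.
                    if (flusherLock.tryLock(100, TimeUnit.MILLISECONDS)) {
                        flusherLock.unlock();
                    } else {
                        timedOut[0] = true;
                    }
                    aGaveUp.countDown();
                    bGaveUp.await();             // keep the slot until B has also given up
                } finally {
                    flushSlots.release();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread b = new Thread(() -> {
            try {
                aHasSlot.await();
                flusherLock.lock();              // B prepares to flush
                try {
                    bHasLock.countDown();
                    // B needs a flush slot, but A holds the only one.
                    if (flushSlots.tryAcquire(100, TimeUnit.MILLISECONDS)) {
                        flushSlots.release();
                    } else {
                        timedOut[1] = true;
                    }
                    bGaveUp.countDown();
                    aGaveUp.await();             // keep the lock until A has also given up
                } finally {
                    flusherLock.unlock();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return timedOut;
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("A timed out: " + r[0] + ", B timed out: " + r[1]);
    }
}
```

Each side holds the resource the other needs, so neither wait can succeed; with blocking
calls in place of the timed ones, both threads would hang.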

> keep persistent row statistics
> ------------------------------
>                 Key: CASSANDRA-1155
>                 URL:
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Brandon Williams
>             Fix For: 0.7 beta 1
>         Attachments: 1155-v2.txt, 1155-v3.txt, 1155.txt
> During flush and compaction we should keep row size statistics using EstimatedHistogram
> (column count and row size), replacing the min/max/total sizes in CFS.
> Having this detail will let us estimate, given an index CF, how many nodes we need to
> query to get the number of matching rows requested by the client.
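
To illustrate the kind of structure the description has in mind, here is a minimal sketch of
a bucketed histogram that could track row sizes or column counts.  It is a simplified
stand-in, not Cassandra's EstimatedHistogram class: the class name, the method names, and
the 1.2 bucket growth factor are all assumptions made for the example.

```java
import java.util.Arrays;

// Hypothetical, simplified histogram; not the actual EstimatedHistogram implementation.
class RowSizeHistogramSketch {
    private final long[] offsets; // inclusive upper bound of each bucket, strictly increasing
    private final long[] counts;  // one counter per bucket, plus a final overflow counter

    RowSizeHistogramSketch(int bucketCount) {
        offsets = new long[bucketCount];
        long last = 1;
        offsets[0] = last;
        for (int i = 1; i < bucketCount; i++) {
            // Grow bounds by roughly 20% (illustrative choice), always by at least 1.
            long next = Math.round(last * 1.2);
            if (next == last) {
                next++;
            }
            offsets[i] = next;
            last = next;
        }
        counts = new long[bucketCount + 1];
    }

    // Record one observation, e.g. the size of a row written during flush.
    void add(long value) {
        int idx = Arrays.binarySearch(offsets, value);
        if (idx < 0) {
            idx = -idx - 1; // insertion point: first bucket whose bound exceeds value
        }
        counts[idx]++;      // idx == offsets.length means the overflow counter
    }

    long[] counts() {
        return counts.clone();
    }

    long total() {
        long sum = 0;
        for (long c : counts) {
            sum += c;
        }
        return sum;
    }

    public static void main(String[] args) {
        RowSizeHistogramSketch h = new RowSizeHistogramSketch(10);
        for (long size : new long[] {1, 2, 2, 100}) {
            h.add(size);
        }
        System.out.println(Arrays.toString(h.counts()));
    }
}
```

Unlike min/max/total sizes, the bucket counts keep the shape of the distribution, which is
what an estimate of matching rows per node would need.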

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
