incubator-cassandra-user mailing list archives

From aaron morton <>
Subject Re: hadoop results
Date Thu, 30 Jun 2011 03:16:09 GMT
How about get_slice() with reversed = true and count = 1 to get the highest TimeUUID?
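
To illustrate the idea, here is a rough in-memory sketch in Python. It does not call Cassandra at all: `get_slice` below is a hypothetical stand-in for the Thrift call of the same name, `time_uuid` is a made-up helper for building deterministic version-1 UUIDs, and the row is just a dict. In the real Thrift API the equivalent request would, as far as I know, be a SlicePredicate wrapping a SliceRange with empty start and finish, reversed set to true, and count set to 1.

```python
import uuid


def time_uuid(timestamp_100ns, node=0x123456789ABC):
    """Hypothetical helper: build a version-1 UUID from a 60-bit timestamp."""
    time_low = timestamp_100ns & 0xFFFFFFFF
    time_mid = (timestamp_100ns >> 32) & 0xFFFF
    time_hi = (timestamp_100ns >> 48) & 0x0FFF
    # version=1 makes the uuid module set the version and variant bits.
    return uuid.UUID(fields=(time_low, time_mid, time_hi, 0, 0, node), version=1)


def get_slice(row, reverse=False, count=100):
    """In-memory stand-in for a slice over one row.

    Columns are ordered by the UUID timestamp (then raw bytes as a
    tie-breaker), which approximates how a TimeUUIDType comparator sorts.
    """
    ordered = sorted(
        row.items(),
        key=lambda item: (item[0].time, item[0].bytes),
        reverse=reverse,
    )
    return ordered[:count]


# One "stats" row: column name is the time the stat was computed.
row = {
    time_uuid(100): "10",
    time_uuid(300): "30",
    time_uuid(200): "20",
}

# Reversed order with count = 1 yields only the newest column.
newest_name, newest_value = get_slice(row, reverse=True, count=1)[0]
```

The point is that the columns are already stored in comparator order, so asking for one column in reverse order should not require iterating over the whole row.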

Or you can store a column with a magic name whose value is the TimeUUID of
the current metric to use.
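
A minimal sketch of that second option, again in plain Python with dicts standing in for the column families. Everything named here is an assumption for illustration: the magic name `current`, the helpers `record_stat` and `current_stat`, and the choice to keep the pointer in a separate structure from the TimeUUID history (where exactly the pointer lives is left open above).

```python
import uuid

CURRENT = "current"  # hypothetical magic column name


def record_stat(history, pointers, stat_name, column_name, value):
    """Store a new value and move the pointer to it.

    Assumes stats are written in time order, so the last write is the
    current one; a real writer may want to compare timestamps first.
    """
    history.setdefault(stat_name, {})[column_name] = value
    pointers.setdefault(stat_name, {})[CURRENT] = column_name


def current_stat(history, pointers, stat_name):
    """Read the pointer, then fetch the single column it names."""
    latest = pointers[stat_name][CURRENT]
    return history[stat_name][latest]


history, pointers = {}, {}
stat = "ClassOfSomething/ID/MetricName"  # URI shape borrowed from the question
record_stat(history, pointers, stat, uuid.uuid1(), "41")
record_stat(history, pointers, stat, uuid.uuid1(), "42")
value_now = current_stat(history, pointers, stat)
```

The trade-off compared with the reversed slice is an extra write per update and two reads per lookup, in exchange for being able to point at a value other than the newest one.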


Aaron Morton
Freelance Cassandra Developer

On 30 Jun 2011, at 06:35, William Oberman wrote:

> I'll start with my question: given a CF with comparator TimeUUIDType, what is the most
> efficient way to get the greatest column's value?
> Context: I've been running cassandra for a couple of months now, so obviously it's time
> to start layering more on top :-)  In my test environment, I managed to get pig/hadoop running,
> and developed a few scripts to collect metrics I've been missing since I switched from MySQL
> to cassandra (including the ever useful "select count(*) from table" equivalent).
> I was hoping to dump the results of this processing back into cassandra for use in other
> tools/processes.  My initial thought was: new CF called "stats" with comparator TimeUUIDType.
> The basic idea being I'd store:
> stat_name -> time stat was computed (as UUID) -> value
> That way I can also see a historical perspective of any given stat for auditing (and
> for cumulative stats to see trends).  The stat_name itself is a URI that is composed of "what"
> and any constraints on the "what" (including an optional time range, if the stat supports
> it).  E.g. ClassOfSomething/ID/MetricName/OptionalTimeRange (or something, still deciding
> on the format of the URI).  But, right now, the only way I know to get the "current" stat
> value would be to iterate over all columns (the TimeUUIDs) and then return the last one.
> Thanks for any tips,
> will
