accumulo-user mailing list archives

From Josh Elser
Subject Re: Fetch rows in reversed order and how to model time range for quick fetching
Date Tue, 17 Jun 2014 05:15:53 GMT
The "acct:" prefix in the row ID seems unnecessary; the ID alone 
should be enough. You'll want to consider the maximum number of 
transactions that you want to support. You don't want a single row to grow 
indefinitely, but you're probably talking about GBs of data (compressed).

The column family usually works best as a filtering mechanism. 
Limiting it to "payment" alone is a good idea, as you can then 
efficiently filter on that column family (or other relevant column 
families) by configuring a locality group.

You could then make the column qualifier: timestamp_receiverId_edgeId.
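
As a rough illustration of that qualifier layout, here is a minimal sketch in plain Java (no Accumulo dependency). The class name, method name, sample IDs, and the choice of a zero-padded epoch-millisecond timestamp are all assumptions made for the example, not something prescribed in this thread. Fixed-width padding is used so that lexicographic order of the qualifier matches chronological order; the sketch also assumes the IDs themselves never contain "_".

```java
import java.util.Locale;
import java.util.TreeSet;

class QualifierSketch {
    // Hypothetical helper: builds "timestamp_receiverId_edgeId".
    // The timestamp is zero-padded to 19 digits (the width of Long.MAX_VALUE)
    // so that string comparison agrees with numeric comparison.
    static String qualifier(long epochMillis, String receiverId, String edgeId) {
        if (epochMillis < 0) {
            throw new IllegalArgumentException("negative timestamps are not supported in this sketch");
        }
        return String.format(Locale.ROOT, "%019d_%s_%s", epochMillis, receiverId, edgeId);
    }

    public static void main(String[] args) {
        // A TreeSet of ASCII strings sorts the same way a byte-wise sort would.
        TreeSet<String> sorted = new TreeSet<>();
        sorted.add(qualifier(1402982040000L, "456", "1001"));
        sorted.add(qualifier(999L, "789", "1002"));
        // The oldest payment comes first, the newest last.
        for (String q : sorted) {
            System.out.println(q);
        }
    }
}
```

With this layout the oldest payment sorts first within the row, so fetching the latest payment still means reading to the end of the row unless the timestamp is encoded in reverse order.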

You might also be able to use the ReverseLexicoder[1] and the 
DateLexicoder[2] to encode the date so you can get the most recent 
transactions first.
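
To show why a reversed encoding puts the newest entries first, here is a stdlib-only sketch of the idea. It is not the Accumulo ReverseLexicoder or DateLexicoder, and the bytes those classes produce may well differ; the class and method names are hypothetical. The sketch encodes an epoch-millisecond timestamp as 8 big-endian bytes and inverts them, so that a later timestamp yields a byte array that sorts earlier under unsigned byte comparison.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

class ReverseTimestampSketch {
    // Encode so that LATER timestamps sort EARLIER in an unsigned byte-wise sort.
    static byte[] encodeNewestFirst(long epochMillis) {
        // Flip the sign bit so signed order matches unsigned byte order.
        long ordered = epochMillis ^ Long.MIN_VALUE;
        // Invert every bit to reverse that order.
        long reversed = ~ordered;
        // ByteBuffer writes big-endian by default, which keeps byte order meaningful.
        return ByteBuffer.allocate(Long.BYTES).putLong(reversed).array();
    }

    // Undo the two transformations above to recover the original timestamp.
    static long decode(byte[] encoded) {
        long reversed = ByteBuffer.wrap(encoded).getLong();
        return (~reversed) ^ Long.MIN_VALUE;
    }

    public static void main(String[] args) {
        byte[] older = encodeNewestFirst(1402982040000L);
        byte[] newer = encodeNewestFirst(1402982100000L);
        // Arrays.compareUnsigned requires Java 9 or later.
        System.out.println(Arrays.compareUnsigned(newer, older) < 0); // prints true
    }
}
```

With an encoding along these lines at the front of the column qualifier, a normal forward scan of the row returns the most recent payment first, so no reversed scan is needed.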

There are lots of different ways to approach this; the right one depends 
on exactly what you want to support.


On 6/16/14, 10:02 PM, Jianshi Huang wrote:
> Hi all,
> I'm thinking about storing payments in the following format:
> rowId: senderId (i.e. "acct:123")
> CF: "payment@<timestamp>" (i.e. "payment@201406171224000")
> CQ: receiverId_edgeId ("acct:456_payment:1001")
> Value: properties
> Is this a good way to model payment events? The most frequent op is to
> get the last payment, so can I scan the table using a reversed range?
> Also I'd like to know if point-in-time status data can be modeled in a
> similar fashion, or should I take advantage of the timestamp column.
> Cheers,
> --
> Jianshi Huang
> LinkedIn: jianshi
> Twitter: @jshuang
> Github & Blog:
