accumulo-user mailing list archives

From Keith Turner <ke...@deenlo.com>
Subject Re: sorting in Accumulo
Date Tue, 06 Mar 2012 18:26:25 GMT
If you want to sort in descending order, you can make the row
(Long.MAX_VALUE - timestamp).  Still make this fixed width.
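
A minimal sketch of both row encodings in plain Java, with no Accumulo
dependency. The class and method names (RowKeys, ascendingRow,
descendingRow) are made up for illustration, and it assumes timestamps
are non-negative seconds since the epoch. String comparison stands in
here for Accumulo's byte-wise row ordering, which agrees with it for
ASCII digit strings.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of any Accumulo API.
class RowKeys {
    // Long.MAX_VALUE (9223372036854775807) has 19 decimal digits,
    // so 19 is wide enough for any non-negative long.
    static final int WIDTH = 19;

    // Zero-padded, fixed-width key: lexicographic order == numeric order.
    static String ascendingRow(long epochSeconds) {
        return String.format("%0" + WIDTH + "d", epochSeconds);
    }

    // Subtracting from Long.MAX_VALUE reverses the order, so the newest
    // record sorts first. Assumes epochSeconds >= 0.
    static String descendingRow(long epochSeconds) {
        return String.format("%0" + WIDTH + "d", Long.MAX_VALUE - epochSeconds);
    }

    public static void main(String[] args) {
        long[] timestamps = {999999999L, 1331058385L, 1000000000L};
        List<String> asc = new ArrayList<>();
        List<String> desc = new ArrayList<>();
        for (long t : timestamps) {
            asc.add(ascendingRow(t));
            desc.add(descendingRow(t));
        }
        // Sorting the strings mimics the order a table scan would return.
        Collections.sort(asc);
        Collections.sort(desc);
        System.out.println("oldest first: " + asc);
        System.out.println("newest first: " + desc);
    }
}
```

Without the padding, "999999999" would sort after "1000000000" as a
string, which is why the fixed width matters in both directions.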


On Tue, Mar 6, 2012 at 1:06 PM, Jason Trost <jason.trost@gmail.com> wrote:
> You could ingest this data into accumulo using the following "schema"
>
> row:       timestamp
> colfam:  "record"
> colqual: md5(JSON)
> value:   JSON record
>
> Accumulo would sort this for you in lexicographical order by timestamp
> (stored as a string). Depending on the range your data comes from, if
> all the epoch timestamps are the same length, then lexicographical
> should equal numeric sorting.  If this is not the case for you, then
> you could convert your timestamps to a string using the following
> template (with each field zero padded to its max length):
>
> ${year}${month}${day}${hour}${minute}${second}
>
> The md5(JSON) is there b/c I assume some of your events could have the
> same timestamp.  If you could have events that are exactly the same
> (and you need to track this) you may want to append a one-up counter
> to the md5 just to guarantee that you won't overwrite duplicates.
> Without the md5 (or another similar mechanism), Accumulo would
> overwrite any previously stored values with the exact same [row,
> colfam, colqual, colvis].
>
> Iterating in temporal order would just be a simple full table scan.
>
> I hope this helps.
>
> --Jason
>
> On Tue, Mar 6, 2012 at 12:15 PM, John R. Frank <jrf@mit.edu> wrote:
>> Accumulo Experts,
>>
>> Is there an example of working with a time-ordered stream in Accumulo?
>>
>>
>> Given:
>>        ~500M JSON records each about 30kb
>>        each record has a timestamp field (seconds since the epoch)
>>
>>
>> Goal:
>>        iterate over all records in temporal order
>>        run some function on this simulated stream
>>
>>
>> Thanks for any pointers or advice!
>>
>> John
