accumulo-user mailing list archives

From Lam <dnae...@gmail.com>
Subject Re: querying for relevant rows
Date Fri, 29 Jun 2012 19:01:47 GMT
This sounds like a good idea.  But how do I scan forward -- do I set
end=null in the following code?


			Scanner scan=conn.createScanner(tableName, auths);

			Text start=new Text(Value.longToBytes(beginTimestamp));
			Text end=new Text(Value.longToBytes(endTimestamp));
			scan.setRange(new Range(start, true, end, false));

			for(Entry<Key,Value> e:scan) ...


And is it efficient?  i.e., the scanner won't move to the next entry
until the next iteration through the for loop, right?

I'll run a test right now.
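
For reference, a minimal sketch of the forward scan described in the quoted reply below, under stated assumptions: it does not touch Accumulo at all, and instead models the sorted table with a java.util.TreeMap keyed by the last timestamp of each row. The class name ForwardScanSketch and the helpers timestampOf and query are hypothetical. The idea is to seek to the first key >= the query start, read forward, and stop right after the first row whose key is >= the query stop. With a real Scanner the analogous choice would be to leave the end of the Range open (a null end row is treated as unbounded) and break out of the for loop.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ForwardScanSketch {

    // Parse the timestamp out of an item of the form <letter>.<timestamp>, e.g. "c.2" -> 2.
    static long timestampOf(String item) {
        return Long.parseLong(item.substring(item.indexOf('.') + 1));
    }

    // Return every item whose timestamp lies in [start, stop], reading rows in key order.
    // Assumes each row key is the LAST timestamp in that row and rows do not overlap.
    static List<String> query(TreeMap<Long, List<String>> table, long start, long stop) {
        List<String> result = new ArrayList<>();
        // tailMap(start, true) plays the role of seeking to the first key >= start.
        for (Map.Entry<Long, List<String>> row : table.tailMap(start, true).entrySet()) {
            for (String item : row.getValue()) {
                long ts = timestampOf(item);
                if (ts >= start && ts <= stop) {
                    result.add(item);
                }
            }
            // This row already reaches the end of the time span, so later rows
            // start after it and cannot contain relevant data.
            if (row.getKey() >= stop) {
                break;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        TreeMap<Long, List<String>> table = new TreeMap<>();
        table.put(2L, Arrays.asList("a.1", "b.1", "c.2", "d.2"));
        table.put(5L, Arrays.asList("m.3", "n.4", "o.5"));
        table.put(7L, Arrays.asList("x.6", "y.6", "z.7"));

        // timespan=[2 4]: rows key=2 and key=5 are read, key=7 is never visited.
        System.out.println(query(table, 2, 4)); // prints [c.2, d.2, m.3, n.4]
    }
}
```

Only two rows are read for the example query; the loop exits before the row with key=7 is examined.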

--
D. Lam


On Fri, Jun 29, 2012 at 1:52 PM, Adam Fuchs <afuchs@apache.org> wrote:
> You can't scan backwards in Accumulo, but you probably don't need to. What
> you can do instead is use the last timestamp in the range as the key like
> this:
>
>     key=2  value= {a.1 b.1 c.2 d.2}
>     key=5  value= {m.3 n.4 o.5}
>     key=7  value= {x.6 y.6 z.7}
>
> As long as your ranges are non-overlapping, you can just stop when you get
> to the first key/value pair that starts after your given time range. If your
> ranges are overlapping then you will have to do a more complicated
> intersection between forward and reverse orderings to efficiently select
> ranges, or maybe use some type of hierarchical range intersection index akin
> to a binary space partitioning tree.
>
> Cheers,
> Adam
>
>
>
> On Fri, Jun 29, 2012 at 2:19 PM, Lam <dnaelam@gmail.com> wrote:
>>
>> I'm using a timestamp as a key and the value is all the relevant data
>> starting at that timestamp up to the timestamp represented by the key
>> of the next row.
>>
>> When querying, I'm given a time span, consisting of a start and stop
>> time.  I want to return all the relevant data within the time span, so
>> I want to retrieve the appropriate rows (then filter the data for the
>> given timespan).
>>
>> Example:
>> In Accumulo:  (the format of the value is  <letter>.<timestamp>)
>>     key=1  value= {a.1 b.1 c.2 d.2}
>>     key=3  value= {m.3 n.4 o.5}
>>     key=6  value= {x.6 y.6 z.7}
>>
>> Query:  timespan=[2 4]  (get all data from timestamp 2 to 4, inclusive)
>>
>> Desired result: retrieve key=1 and key=3, then filter out a.1, b.1, and
>> o.5, and return the rest
>>
>> Problem: How do I know to retrieve key=1 and key=3 without scanning
>> all the keys?
>>
>> Can I create a scanner that looks for the given start key=2 and then
>> goes to the prior row (i.e. key=1)?
>>
>> --
>> D. Lam
>
>
