jackrabbit-users mailing list archives

From Bertrand Delacretaz <bdelacre...@apache.org>
Subject Re: Performance of a large number of small nodes
Date Fri, 28 Aug 2009 07:05:44 GMT
Hi Nigel,

On Thu, Aug 27, 2009 at 1:09 PM, Bertrand Delacretaz<bdelacretaz@apache.org> wrote:
> On Thu, Aug 27, 2009 at 5:40 AM, Nigel Sim<nigel.sim@gmail.com> wrote:
>> ...In Jackrabbit the path looks like /<instrument>/<dataset>/YYYY/MM/DD/<value>
>>
> ...What's the query?
>
> Find all values for a given instrument and dataset in a specific period of time?

Have you tried something like this?

Given timestamps T1 and T2, the boundaries of the data to retrieve:

Compute the start and end paths P1 and P2 that correspond to T1 and T2.

Start at P1, list its value nodes using Node.getNodes(), and keep
only those where timestamp >= T1 (try to avoid data conversions when
doing that - maybe store the timestamps as longs).

Compute the next path and retrieve all of its value nodes using
Node.getNodes() (they are within range by definition).

Repeat, and when the next path is P2, keep only the value nodes where
timestamp <= T2. If P1 and P2 are the same path, apply both checks
to it.
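
The path walk described above could be sketched roughly like this. It
only covers the "compute P1, next path, ..., P2" part and leaves out the
repository calls, so it runs without a JCR implementation. The class and
method names (DayPathWalker, DayPath, dayPaths) are made up for
illustration, and it assumes that timestamps are epoch milliseconds, that
days are cut in UTC, and that the YYYY/MM/DD path segments are
zero-padded. None of that is confirmed in this thread.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Jackrabbit or the JCR API.
class DayPathWalker {

    /** A day-level path plus flags saying which timestamp checks its value nodes still need. */
    static final class DayPath {
        final String path;
        final boolean checkLowerBound; // true for P1: keep only timestamp >= T1
        final boolean checkUpperBound; // true for P2: keep only timestamp <= T2

        DayPath(String path, boolean checkLowerBound, boolean checkUpperBound) {
            this.path = path;
            this.checkLowerBound = checkLowerBound;
            this.checkUpperBound = checkUpperBound;
        }
    }

    /**
     * Lists the day paths from P1 (the day of T1) to P2 (the day of T2), inclusive.
     * "base" stands for the /<instrument>/<dataset> prefix.
     * Assumption: days are computed in UTC.
     */
    static List<DayPath> dayPaths(String base, long t1Millis, long t2Millis) {
        LocalDate first = toUtcDate(t1Millis);
        LocalDate last = toUtcDate(t2Millis);
        List<DayPath> result = new ArrayList<>();
        for (LocalDate d = first; !d.isAfter(last); d = d.plusDays(1)) {
            String path = String.format(Locale.ROOT, "%s/%04d/%02d/%02d",
                    base, d.getYear(), d.getMonthValue(), d.getDayOfMonth());
            // Only the boundary days need per-node timestamp checks;
            // every day in between is entirely within range.
            result.add(new DayPath(path, d.equals(first), d.equals(last)));
        }
        return result;
    }

    private static LocalDate toUtcDate(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis).atZone(ZoneOffset.UTC).toLocalDate();
    }
}
```

For each returned path you would then fetch the node and iterate
Node.getNodes(), comparing the stored timestamp only where one of the two
flags is set. When T1 and T2 fall on the same day, the single path comes
back with both flags set.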

Dunno if that's what you're doing already, and I didn't test the
performance, but that feels intuitively like the fastest way to run
such a query with a large number of results.

-Bertrand
