jackrabbit-users mailing list archives

From Bertrand Delacretaz <bdelacre...@apache.org>
Subject Re: Performance of a large number of small nodes
Date Fri, 14 Aug 2009 07:11:46 GMT
Hi,

On Fri, Aug 14, 2009 at 6:34 AM, Nigel Sim<nigel.sim@gmail.com> wrote:
> ...I am using Jackrabbit to store a mixture of scientific data, which includes
> files and numerical data. The performance of files are fine, but the
> numerical data needs to be extracted as datasets based on attributes such as
> observation time, and this appears to be quite slow in comparison to a
> native DB (obviously). I would really prefer to keep all this related data
> in the same management system, so is there a way to improve the ingestion
> and retrieval of many small nodes?...

Could you take advantage of paths to express the observation time, and
use that for "queries"?

Storing data under paths like /data/2009/12/24/23/02/58 would allow
you to find nodes that belong to a specific day, or hour, by
navigating paths, which might be much more efficient than queries.
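As a rough illustration of the path layout described above, here is a minimal sketch in plain Java that turns an observation timestamp into such a path. The class name ObservationPaths, the method names, the "/data" root and the choice of UTC are assumptions made for the example, not anything Jackrabbit prescribes. Zero-padding each segment keeps sibling nodes in chronological order when sorted by name.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: maps a timestamp to a date-based repository path.
public class ObservationPaths {

    // Full path down to the second, e.g. 2009/12/24/23/02/58 (UTC is an assumption).
    private static final DateTimeFormatter NODE_FORMAT =
            DateTimeFormatter.ofPattern("yyyy/MM/dd/HH/mm/ss").withZone(ZoneOffset.UTC);

    // Path of the day "folder", e.g. 2009/12/24.
    private static final DateTimeFormatter DAY_FORMAT =
            DateTimeFormatter.ofPattern("yyyy/MM/dd").withZone(ZoneOffset.UTC);

    // Path under which a single observation would be stored.
    static String nodePath(String root, Instant time) {
        return root + "/" + NODE_FORMAT.format(time);
    }

    // Path of the subtree holding every observation of that day.
    static String dayPath(String root, Instant time) {
        return root + "/" + DAY_FORMAT.format(time);
    }

    public static void main(String[] args) {
        Instant t = Instant.parse("2009-12-24T23:02:58Z");
        System.out.println(nodePath("/data", t));
        System.out.println(dayPath("/data", t));
    }
}
```

Fetching one day's data then amounts to getting the node at the day path and walking its children, with no query involved.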

> ...My second question, is there an efficient way to query for the latest
> observation? I would assume querying for the node type, sorting, and just
> retrieving the first result?...

Paths would also help here, and you could use JCR observation (event
listeners) to keep track of the path that corresponds to the most
recent data, if needed.
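One way to read "paths would also help here", sketched under assumptions: with zero-padded segment names, the most recent data sits under the greatest child name at each level, so it can be found by descending the tree without a query or a sort over all nodes. The code below is plain Java with no JCR calls; the class LatestObservation and the childNames lookup function are hypothetical stand-ins for listing a node's children in the repository.

```java
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper: finds the newest path in a date-based hierarchy.
public class LatestObservation {

    // Descends 'depth' levels from 'root', picking the greatest child name
    // at each level. Relies on zero-padded names so that string order
    // matches chronological order. Returns null if a level has no children.
    static String latestPath(String root, int depth,
                             Function<String, List<String>> childNames) {
        String path = root;
        for (int level = 0; level < depth; level++) {
            List<String> names = childNames.apply(path);
            if (names == null || names.isEmpty()) {
                return null; // hierarchy is empty or incomplete below this point
            }
            path = path + "/" + Collections.max(names);
        }
        return path;
    }
}
```

For the /data/2009/12/24/23/02/58 layout the depth would be 6. An observation listener that records the newest path as nodes are added would avoid even this short walk.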

-Bertrand
