jackrabbit-users mailing list archives

From "Michael Neale" <michael.ne...@gmail.com>
Subject Re: Query Performance and Optimization
Date Wed, 07 Mar 2007 23:35:27 GMT
On 3/7/07, Stefan Guggisberg <stefan.guggisberg@gmail.com> wrote:
>
> On 3/7/07, Michael Neale <michael.neale@gmail.com> wrote:
> > Hi Marcel - yes it would be interesting - I guess to get the most out of it,
> > the node type definitions would have to come into play to generate DDL for
> > the database - so the node type definitions will map to a more "tuned"
> > database schema - of course some concepts may not work that way, like
>
> i guess by "tuned" you mean a normalized schema. why do you think that
> such a normalized schema would improve performance?


Mainly because it would allow the RDBMS to perform the queries natively.

> > hierarchies, or "nt:unstructured", in which case it would need to use the
> > current style.
> >
> > As for fulltext - database support varies with each vendor, so I would
> > hazard a guess that lucene would still need to be part of it (that is the
> > approach that the newer versions of hibernate have taken - take full text
> > out of the hands of the database).
> >
> > The DDL generation kind of scares me, in terms of complexity, but I think
> > it's necessary to let the RDBMS "do its thing", so to speak?
>
> why?


Mainly for queries. If we have a node type def that has something:title,
something:size etc., and those map to columns called something_title,
something_size in a table, then we can get the RDBMS to do the indexing.
However, this is turning jackrabbit into a kind of ORM itself - probably not
one of its aims ;)
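
To make that a bit more concrete, here is a minimal sketch of the idea in
Python with sqlite3. It is only an illustration, not anything jackrabbit does:
the dict layout for the node type, the "something:doc" type name, the type
mapping and the generate_ddl helper are all assumptions I made up.

```python
import sqlite3

# Hypothetical node type definition. "something:title" and "something:size"
# come from the message; the dict layout and "something:doc" are assumptions.
NODE_TYPE = {
    "name": "something:doc",
    "properties": {"something:title": "STRING", "something:size": "LONG"},
}

# Assumed mapping from JCR-like property types to SQL column types.
SQL_TYPES = {"STRING": "TEXT", "LONG": "INTEGER"}


def column_name(prop):
    # something:title -> something_title
    return prop.replace(":", "_")


def generate_ddl(node_type):
    """Return a CREATE TABLE statement plus one CREATE INDEX per property."""
    table = column_name(node_type["name"])
    cols = ["node_id TEXT PRIMARY KEY"]
    indexes = []
    for prop, jcr_type in node_type["properties"].items():
        col = column_name(prop)
        cols.append(f"{col} {SQL_TYPES[jcr_type]}")
        indexes.append(f"CREATE INDEX idx_{table}_{col} ON {table} ({col})")
    create = f"CREATE TABLE {table} ({', '.join(cols)})"
    return [create] + indexes


conn = sqlite3.connect(":memory:")
for stmt in generate_ddl(NODE_TYPE):
    conn.execute(stmt)

# With typed, indexed columns the database can evaluate the predicate itself.
conn.execute("INSERT INTO something_doc VALUES ('n1', 'Hello', 42)")
rows = conn.execute(
    "SELECT node_id FROM something_doc WHERE something_size > 10"
).fetchall()
```

Something like "nt:unstructured" has no fixed property list, so a generator
like this would have nothing to work from and would need a fallback.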

> > ORM tools can certainly help here - can avoid programmatically generating
> > DDL by instead generating a meta model that ORM tools work off - just a
> > thought (let the ORMs generate DB specific schemas).
> >
> > I know RDBMS are a proven way to scale up - but as for content, I am a
> > novice, so I am happy to follow the lead of those in the know in how best to
> > help jackrabbit scale. So far I have not been that "whelmed" by the query
> > performance - I am using the SQL dialect because it's familiar, but I think its
> > familiarity makes me want to do things that it is perhaps not optimised for,
> > maybe that is my problem.
> >
> > I should and will join the dev list, so as to not pollute the user list with
> > ponderings over jackrabbit internals ;)
> >
> > Thoughts?
> >
> > Michael.
> >
> > On 3/6/07, Marcel Reutegger <marcel.reutegger@gmx.net> wrote:
> > >
> > > Michael Neale wrote:
> > > > > I know from previous discussions that it is a design decision of Jackrabbit
> > > > > to not exclusively work with RDBMS - if it was, I would be all in favour of
> > > > > leaning on it to do the hard work.
> > >
> > > > please note that it is possible to exclusively use an RDBMS for storing and
> > > > querying content, though you have to create your own persistence manager and
> > > > query handler. the jackrabbit core does not force you to separate the store and
> > > > the index.
> > >
> > > > but you are right that it was a design decision to allow separation if you want
> > > > to. because jackrabbit initially only had plain file based persistence managers
> > > > and because lucene provides very good fulltext indexing we decided to go with
> > > > lucene.
> > >
> > > > coming back to the RDBMS only approach. you would have to implement a
> > > > persistence manager that stores nodes and properties in a way that allows the
> > > > database to use its indexes. then create a query handler that translates an
> > > > abstract query tree into a SQL statement based on the database schema.
> > >
> > > there are some obstacles you will have to overcome (or actually the
> > > database):
> > > 1) handle node hierarchies (e.g. get all ancestors of a certain node)
> > > 2) provide fulltext indexing
> > >
> > > > I think this would be a very useful extension for jackrabbit. so, if anyone is
> > > > interested in implementing this, I'm very curious how well it performs compared
> > > > to the current implementation using lucene.
> > >
> > > regards
> > >   marcel
> > >
> >
>
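
On Marcel's point about a query handler that translates an abstract query
tree into a SQL statement: a rough sketch of what that translation step might
look like, in Python. The Compare and And classes are hypothetical stand-ins
I invented for illustration, not jackrabbit's actual query tree classes, and
the column naming assumes a prefix_name mapping for each property.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Compare:
    prop: str      # property name, e.g. "something:size"
    op: str        # must be one of OPS
    value: object  # bound as a SQL parameter, never inlined


@dataclass
class And:
    left: "Node"
    right: "Node"


Node = Union[Compare, And]
OPS = {"=", "<", ">", "<=", ">=", "<>"}


def to_sql(node):
    """Translate a (hypothetical) query tree into (where_clause, params)."""
    if isinstance(node, Compare):
        if node.op not in OPS:
            raise ValueError(f"unsupported operator: {node.op}")
        # Assumes property names come from trusted node type definitions,
        # since they end up in the SQL text as column names.
        col = node.prop.replace(":", "_")
        return f"{col} {node.op} ?", [node.value]
    if isinstance(node, And):
        lsql, lparams = to_sql(node.left)
        rsql, rparams = to_sql(node.right)
        return f"({lsql} AND {rsql})", lparams + rparams
    raise TypeError(f"unknown node: {node!r}")


tree = And(
    Compare("something:title", "=", "Hello"),
    Compare("something:size", ">", 10),
)
where, params = to_sql(tree)
```

The resulting clause and parameter list could then be appended to a SELECT
against whatever table the node type maps to.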

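As for obstacle 1 (getting all ancestors of a node), one way a database could
handle it is a recursive query over a parent pointer. A minimal sketch,
assuming a made-up "node" table with id and parent_id columns and a database
that supports recursive common table expressions (not every RDBMS does):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical adjacency-list schema: each node stores its parent's id.
conn.execute("CREATE TABLE node (id TEXT PRIMARY KEY, parent_id TEXT)")
conn.executemany(
    "INSERT INTO node VALUES (?, ?)",
    [("root", None), ("a", "root"), ("b", "a"), ("c", "b"), ("x", "root")],
)

# Start at the given node and repeatedly follow parent_id upwards.
ANCESTORS = """
WITH RECURSIVE anc(id, parent_id, depth) AS (
    SELECT id, parent_id, 0 FROM node WHERE id = ?
    UNION ALL
    SELECT n.id, n.parent_id, anc.depth + 1
    FROM node n JOIN anc ON n.id = anc.parent_id
)
SELECT id FROM anc WHERE depth > 0 ORDER BY depth
"""


def ancestors(conn, node_id):
    """Return ancestor ids of node_id, nearest parent first."""
    return [row[0] for row in conn.execute(ANCESTORS, (node_id,))]
```

The recursion stops at the root because its parent_id is NULL and matches no
row. How well this performs on deep or very large trees is exactly the kind of
thing that would need measuring against the lucene based implementation.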