jackrabbit-users mailing list archives

From Lei Zhou <L...@pointalliance.com>
Subject Re: Question on jcr:deref usage
Date Fri, 24 Nov 2006 15:05:25 GMT
Hi Thomas, 

Thanks for responding.

>I think there are two solutions: Add the missing features to the JCR
>API, and provide the missing features in some other way. Do you have a
>suggestion how to extend the API to support the features you like to
>have (seems to be: aggregation, join, ordering)?

I came to JCR and Jackrabbit more from the perspective of an 
integrator/end user, and have not had a chance to study the implementation 
in great technical detail. So I wouldn't dare to say "here is how I 
think things should be done".
I can offer some thoughts & observations, though.

Jackrabbit (or JCR) handles all kinds of persistence storage 
(local files, DB, etc.) through a unified PersistenceManager interface. This 
is great because it lets the product adapt easily to any usage scenario. 

This approach also has a drawback: by treating local file systems and RDB 
systems the same way - same set of features, same indexing & searching API 
(??) - a lot of good stuff from the RDBMS is wasted. For example, SELECT 
DISTINCT, JOIN, GROUP BY, triggers, stored procedures, etc. Some may argue 
that JCR is about "structural", not "relational" data, so why would we care? 
These features are very useful even when querying a structural data model - my 
previous email discussed one use case as an example.

I'm not saying we should use all RDBMS features wherever they exist, because 
there are compatibility & portability issues. But since we already 
provide a DB persistence manager and schema DDLs for several RDBMS, it 
wouldn't hurt to extend that effort and do more with the 'native' 
features of the supported RDBMS. 

One idea is to have an "extended set" of features for RDBMS that can be 
discovered through Repository.getDescriptorKeys(). These features would support 
extended SQL capabilities like SELECT DISTINCT, JOIN, GROUP BY, and ORDER 
BY.
I'm also not proposing to completely "normalize" the DB schema; there is 
always a line between "better" and "extreme".

My personal experience is that production-level content management systems 
are more often implemented on an RDBMS than on a local file system. If this 
applies to most of the community (??), why would we restrict ourselves? 

>If you want to integrate other products in the DB schema level, then
>the current schema may not be the best. However I don't think it was
>the idea that other software accesses the schema of Jackrabbit
>directly.

As described above, I'm not trying to manipulate the repository at the DB 
level. There are two reasons for me to raise that point: 

1.  For the same reasons as mentioned above and in previous emails, I feel a 
    more normalized schema would be more beneficial for people who use an 
    RDBMS for the repository - and I would bet that represents a good 
    portion of JCR/Jackrabbit based applications. 

2.  When presenting an architecture design to a business client (usually with 
    'some' knowledge of the IT systems/products), the first question would 
    be "is this a serious design? why is all the data in Blobs?". Although 
    we as developers know that there are good reasons for that, it may not 
    be easily conveyed to the client.

Again, these are just personal observations and I'm not yet an expert in 
JCR/Jackrabbit. Any comments / corrections are appreciated.

Best regards,
Lei





"Thomas Mueller" <thomas.tom.mueller@gmail.com> 
11/24/06 03:53 AM
Please respond to: users@jackrabbit.apache.org
To: users@jackrabbit.apache.org
cc:
Subject: Re: Question on jcr:deref usage

Hi,

> So it seems that due to the limitation of JCR (no aggregation query
> support).

I think there are two solutions: Add the missing features to the JCR
API, and provide the missing features in some other way. Do you have a
suggestion how to extend the API to support the features you like to
have (seems to be: aggregation, join, ordering)?

One option is to make the structured part of the JCR repository
accessible like a 'standard' SQL database. Existing (SQL based) report
generators could then be used as well. If you could access the data
stored in the repository through the JDBC API with the following SQL
query, would this provide the convenience you are looking for?

select m.uuid from manual m, product p, region r
where p.uuid = m.product and r.uuid = m.region
and p.name in ('TV', 'VCR', 'DVD')
and r.name in ('North America', 'Europe')
and p.availableFor in ('distributor', 'repairHouse')
order by r.name, p.name
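
[Editor's sketch] To make the query above concrete, here is a minimal,
runnable illustration of what it would return if the repository content
were exposed as plain relational tables. Everything except the query text
is an assumption made for illustration: the table layouts, the sample
rows, and the use of Python's built-in sqlite3 module in place of JDBC
and a real 'jcr view'. It is not how Jackrabbit or H2 expose repository
data.

```python
import sqlite3

# The query from the message, unchanged.
QUERY = """
select m.uuid from manual m, product p, region r
where p.uuid = m.product and r.uuid = m.region
and p.name in ('TV', 'VCR', 'DVD')
and r.name in ('North America', 'Europe')
and p.availableFor in ('distributor', 'repairHouse')
order by r.name, p.name
"""


def build_sample_db():
    """Create an in-memory database with hypothetical tables and rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        create table product (uuid text primary key, name text, availableFor text);
        create table region  (uuid text primary key, name text);
        create table manual  (uuid text primary key, product text, region text);
    """)
    conn.executemany("insert into product values (?, ?, ?)", [
        ("p1", "TV", "distributor"),
        ("p2", "DVD", "repairHouse"),
        ("p3", "VCR", "endUser"),        # excluded by the availableFor filter
        ("p4", "Radio", "distributor"),  # excluded by the product name filter
    ])
    conn.executemany("insert into region values (?, ?)", [
        ("r1", "North America"),
        ("r2", "Europe"),
        ("r3", "Asia"),                  # excluded by the region name filter
    ])
    conn.executemany("insert into manual values (?, ?, ?)", [
        ("m1", "p1", "r1"),  # TV, North America
        ("m2", "p2", "r2"),  # DVD, Europe
        ("m3", "p1", "r2"),  # TV, Europe
        ("m4", "p3", "r1"),  # VCR for endUser only
        ("m5", "p4", "r2"),  # Radio
        ("m6", "p2", "r3"),  # DVD, Asia
    ])
    return conn


def find_manuals(conn):
    """Run the query and return the matching manual uuids in result order."""
    return [row[0] for row in conn.execute(QUERY)]


if __name__ == "__main__":
    print(find_manuals(build_sample_db()))
```

With this sample data the rows come back ordered by region name and then
by product name, so the Europe manuals precede the North America one.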

My idea is to add support for 'jcr views' to my database
(http://www.h2database.com).

> #2. The RDBMS based repository, current DB schema is not very convincing
> for large enterprise level applications. A more normalized schema might
> help both performance and #1, but yes, more DB level code may be needed
> (for performance's sake) and that may limit the portability of the
> product.

If you want to integrate other products in the DB schema level, then
the current schema may not be the best. However I don't think it was
the idea that other software accesses the schema of Jackrabbit
directly.

Thomas

