jackrabbit-dev mailing list archives

From Cosmin Lehene <cleh...@adobe.com>
Subject Re: AW: NoSql Support
Date Thu, 18 Aug 2011 13:07:17 GMT
It might not be feasible to have a full JCR on top of NoSQL; I don't know.
However supporting basic search is definitely possible and it should be
fast as well. Whether that's synchronous (fully consistent) or
asynchronous should be optional.
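To make the synchronous/asynchronous choice concrete, here is a minimal sketch in Python. It is not Jackrabbit code: the class IndexingStore, its methods, and the in-memory dictionaries are all hypothetical stand-ins for a real store and a real search engine. It only shows that the same save path can either index immediately or queue the work for later.

```python
import queue


class IndexingStore:
    """Toy content store with an optional deferred search index (hypothetical)."""

    def __init__(self, synchronous=True):
        self.synchronous = synchronous
        self.content = {}              # path -> text
        self.index = {}                # term -> set of paths
        self.pending = queue.Queue()   # paths saved but not yet indexed

    def save(self, path, text):
        self.content[path] = text
        if self.synchronous:
            # Fully consistent: the index is updated before save() returns.
            self._index(path)
        else:
            # Save first, index later: search may lag behind the content.
            self.pending.put(path)

    def flush_index(self):
        """Index everything queued so far; returns the number of entries processed.

        A real deployment would run this in a background worker instead.
        """
        processed = 0
        while not self.pending.empty():
            self._index(self.pending.get())
            processed += 1
        return processed

    def _index(self, path):
        # Drop stale postings for this path, then add the current terms.
        for paths in self.index.values():
            paths.discard(path)
        for term in self.content[path].lower().split():
            self.index.setdefault(term, set()).add(path)

    def search(self, term):
        return sorted(self.index.get(term.lower(), set()))
```

With synchronous=False a search issued right after save() can miss the new content until flush_index() has run, which is exactly the trade-off that should be configurable.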
I assume some of the features (e.g. transactions or indexing) should be
available in the NoSQL store and the persistence manager should deal with
existing interfaces and data layout.
However there's a relatively clear solution right now for JCR on top of
HBase and it should have enough features so that someone looking for
scalability could use it.
I also think global write locks need to go away (at least for NoSQL
persistence). This can be taken care of at a more granular level inside the
actual store.
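As a rough illustration of what "more granular" could mean, the sketch below applies a change log under per-bundle locks instead of one global write lock. It is written in Python for brevity and is not Jackrabbit's API: BundleStore, apply_change_log and load are hypothetical names, and the in-memory dictionary stands in for a store with row-level locking. Locks are taken in sorted id order so that two writers with overlapping change logs cannot deadlock.

```python
import threading


class BundleStore:
    """Toy bundle store with one lock per bundle id (hypothetical sketch)."""

    def __init__(self):
        self._bundles = {}                     # node id -> bundle (dict)
        self._locks = {}                       # node id -> lock
        self._locks_guard = threading.Lock()   # protects the lock table only

    def _lock_for(self, node_id):
        with self._locks_guard:
            return self._locks.setdefault(node_id, threading.Lock())

    def apply_change_log(self, updates, deletes=()):
        """Apply created/updated and deleted bundles while holding only their locks."""
        # Sorted, de-duplicated ids give every writer the same lock order.
        ids = sorted(set(updates) | set(deletes))
        locks = [self._lock_for(node_id) for node_id in ids]
        for lock in locks:
            lock.acquire()
        try:
            for node_id in deletes:
                self._bundles.pop(node_id, None)
            for node_id, bundle in updates.items():
                self._bundles[node_id] = bundle
        finally:
            for lock in reversed(locks):
                lock.release()

    def load(self, node_id):
        with self._lock_for(node_id):
            return self._bundles.get(node_id)
```

Writers that touch disjoint sets of bundles never wait on each other here. What the sketch does not provide is crash atomicity or a consistent read across several bundles; those would still have to come from the store itself.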


On 8/18/11 3:14 PM, "Bart van der Schans" <b.vanderschans@onehippo.com> wrote:

>On Wed, Aug 17, 2011 at 1:51 PM, Jukka Zitting <jukka.zitting@gmail.com> wrote:
>> Hi,
>> On Wed, Aug 17, 2011 at 1:37 PM, Cosmin Lehene <clehene@adobe.com> wrote:
>>> First I'll have to better understand what a bundle is :) (JCR newbie
>>> here:)). I'll try to read about it.
>> A bundle is the unit of data stored by a bundle persistence manager.
>> It contains the properties and the list of child nodes of a single JCR
>> node.
>> A bundle persistence manager is expected to be able to atomically
>> update not just a single bundle at a time, but an arbitrarily large
>> ChangeLog of created, updated and deleted bundles. This has so far
>> been a big problem for NoSQL-style persistence managers that only
>> support locking at the level of individual rows.
>I think this is one of the biggest reasons why JCR 1.0 and 2.0 do not
>match "nicely" to most popular NoSQL stores. Imo it's not just a
>Jackrabbit issue. The other big problem would be the search. While you
>can scale out nicely to huge numbers with some NoSQL stores, the
>search will not. This is partly an issue with the Lucene
>implementation in Jackrabbit, but also the spec doesn't really "help".
>In a big NoSQL deployment you might want to defer the searches to an
>external clustered search engine (something Solr-like), but that
>would/could mean that the search updates lag behind the content. Aka
>save first, index later. Another problem could be the current
>clustering implementation which requires a global write lock (which is
>handled through the database or shared filesystem). Especially in a
>multi-geolocation deployment a global write lock is not an option.
>I don't think these issues can be easily "solved" by just implementing
>a different persistence manager. It would be interesting to see if we
>can come up with some kind of design plan of how JCR could work with a
>NoSQL store. Maybe some of that work already started with the
>JR3/microkernel prototyping? It could also be that you need to choose
>one NoSQL solution and then leverage all the
>facilities/services/functionality provided by the store. So fully use
>and exploit something like the Hadoop stack, the Amazon stack or even
>the GAE stack.
>We do see more and more people that expect everything to work smoothly
>in the cloud and that everything scales nicely and elastically over
>multiple datacenters. In the coming years this will become a
>requirement and Jackrabbit should be ready for that.
