jackrabbit-oak-dev mailing list archives

From Ian Boston <...@tfd.co.uk>
Subject Re: Elastic Search and OAK comparisons
Date Wed, 18 Dec 2013 14:14:46 GMT
On 18 December 2013 09:25, Lukas Kahwe Smith <smith@pooteeweet.org> wrote:
>
> On Dec 18, 2013, at 10:16 , Ian Boston <ieb@tfd.co.uk> wrote:
>
>> Hi,
>>
>> On 17 December 2013 22:43, Reza Jalili <jalili@adobe.com> wrote:
>>> Forwarding to the open group
>>>
>>>
>>>> Hi Toby,
>>>>
>>>> I've just started to take a look at elasticsearch.org / .com
>>>>
>>>> Do you know:
>>>> How does oak compare with elasticsearch open source
>>>> search/data store?
>>
>> ElasticSearch is only a distributed, elastic search index based on
>> Lucene, so comparing it with Oak as a whole is not a like-for-like
>> comparison. It is not a data store.
>
> From what I gather, ElasticSearch core devs are no longer opposing the idea of
> people using ES as a data store.
>
>> However:
>> Many large applications especially in the OpenData field have used it
>> as a data store since its resilience to unforeseen failures is high
>> mainly due to:
>> * close to real time with a data update latency often around 50ms
>> between update and availability in the index.
>> * replication and sharding with no single point of failure
>> * write ahead log on write giving it automated recovery.
>> * True elasticity.
>>
>> The datastore that results from an ElasticSearch deployment can be
>> considered a flat datastore with no inherent structure and no
>> versioning, i.e. billions of documents in a bucket.
>
> Right .. ES can work as a document data store, but it lacks the CMS-specific
> capabilities of JCR like versioning, ACLs, tree structure, native support for
> references etc.
>
>> If you were brave, you could write an ElasticSearchMK.
>
> Indeed this would be great! One of the biggest deficiencies of JCR/Jackrabbit
> is the lack of faceted search. Oak at least makes the search indexer pluggable,
> and afaik there is work being done to write a Solr plugin for Oak. Once you
> have the data in Solr (or ElasticSearch), I envision it should be possible to
> write queries directed at them, including faceting. However, one will then
> obviously lose all the CMS capabilities I mentioned above. It might be cool if
> someone provided some plugin-specific tools to easily build a faceted search
> query that still takes into account ACLs and the tree structure.

http://blog.tfd.co.uk/2012/02/14/search-acls-part-2-simple-is-always-best/

It's a QueryComponent that gets installed in Lucene (so it works with
any Lucene core) and encodes the principals that can read a
document in the index. It will make sparse queries dense, and will
scale to large numbers of principals (IIRC tests were done with up to
2000 principals per query with no signs of degradation).

The disadvantage is that if root ACLs are changed, subtrees have to be
re-indexed, which can be expensive.
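
To make the idea concrete, here is a minimal in-memory sketch of the approach. It is not the QueryComponent from the post above and uses no Lucene API; TinyIndex and reindex_subtree are hypothetical names invented for illustration. It assumes each document is indexed together with the set of principals allowed to read it, so a search only has to intersect term hits with the documents visible to the caller's principals.

```python
class TinyIndex:
    """Toy inverted index that stores reader principals next to each document."""

    def __init__(self):
        self.docs = {}      # path -> (text, frozenset of reader principals)
        self.terms = {}     # term -> set of paths containing the term
        self.readers = {}   # principal -> set of paths the principal may read

    def remove(self, path):
        # Drop a document and all of its postings, if it is indexed.
        if path not in self.docs:
            return
        text, readers = self.docs.pop(path)
        for term in text.lower().split():
            self.terms[term].discard(path)
        for principal in readers:
            self.readers[principal].discard(path)

    def add(self, path, text, readers):
        # (Re-)index a document; the ACL is encoded at index time.
        self.remove(path)
        self.docs[path] = (text, frozenset(readers))
        for term in text.lower().split():
            self.terms.setdefault(term, set()).add(path)
        for principal in readers:
            self.readers.setdefault(principal, set()).add(path)

    def search(self, term, principals):
        # Term hits intersected with everything the principals can read.
        hits = self.terms.get(term.lower(), set())
        visible = set()
        for principal in principals:
            visible |= self.readers.get(principal, set())
        return sorted(hits & visible)


def reindex_subtree(index, root, readers):
    """Apply a changed ACL at `root` by re-indexing every document below it.

    Returns the number of documents touched, which is the cost that grows
    with the size of the subtree.
    """
    prefix = root.rstrip("/") + "/"
    affected = [p for p in index.docs if p == root or p.startswith(prefix)]
    for path in affected:
        text, _ = index.docs[path]
        index.add(path, text, readers)
    return len(affected)
```

A query never consults the ACLs at read time, which is why it stays cheap with many principals; the price is paid in reindex_subtree, which touches every document under the changed node.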

Best Regards
Ian

>
> regards,
> Lukas Kahwe Smith
> smith@pooteeweet.org
>
>
>
