lucene-dev mailing list archives

From "J. Delgado" <joaquin.delg...@gmail.com>
Subject Re: Realtime Search for Social Networks Collaboration
Date Sun, 07 Sep 2008 17:14:22 GMT
BTW, quoting Marcelo Ochoa (the developer behind the Oracle/Lucene
implementation) the three minimal features a transactional DB should support
for Lucene integration are:

  1) The ability to define new functions (e.g. lcontains(), lscore()) that
bind queries to Lucene and return document/row scores.
  2) An API that allows DML intercepts, like Oracle's ODCI.
  3) The ability to extend and/or implement new types of "domain" indexes
that the engine's query evaluation and execution/optimization planner can
use efficiently.
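
As a rough illustration of points 1) and 2), here is a minimal Python sketch.
It is not Marcelo's implementation and uses neither Oracle nor Lucene: ToyIndex
stands in for a Lucene index, ToyTable for a database table, and every name in
it (ToyIndex, ToyTable, select_lcontains) is hypothetical. The idea it tries to
show is that each insert/update/delete is intercepted so the index stays in
step with the rows, and that a query function returns row ids with scores.

```python
from collections import defaultdict


class ToyIndex:
    """Hypothetical stand-in for a Lucene index: term -> {row_id: term count}."""

    def __init__(self):
        self.postings = defaultdict(dict)

    def add(self, row_id, text):
        # Naive whitespace tokenizer; a real analyzer would do far more.
        for term in text.lower().split():
            rows = self.postings[term]
            rows[row_id] = rows.get(row_id, 0) + 1

    def remove(self, row_id):
        # Drop the row from every posting list it appears in.
        for rows in self.postings.values():
            rows.pop(row_id, None)


class ToyTable:
    """Hypothetical table whose DML operations are intercepted to update the index."""

    def __init__(self, index):
        self.rows = {}
        self.index = index

    def insert(self, row_id, text):
        self.rows[row_id] = text
        self.index.add(row_id, text)       # intercept: INSERT

    def update(self, row_id, text):
        self.index.remove(row_id)          # intercept: UPDATE (delete + re-add)
        self.rows[row_id] = text
        self.index.add(row_id, text)

    def delete(self, row_id):
        del self.rows[row_id]
        self.index.remove(row_id)          # intercept: DELETE

    def select_lcontains(self, term):
        """Return (row_id, score) pairs, best score first; score = term count."""
        hits = self.index.postings.get(term.lower(), {})
        return sorted(hits.items(), key=lambda kv: (-kv[1], kv[0]))
```

The raw term count used as a score is only a placeholder; Lucene's own scoring
is considerably richer. Point 3), letting the query planner use such an index,
is not covered by this sketch.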

Thanks Marcelo.

-- Joaquin

On Sun, Sep 7, 2008 at 8:16 AM, J. Delgado <joaquin.delgado@gmail.com> wrote:

> On Sun, Sep 7, 2008 at 2:41 AM, mark harwood <markharw00d@yahoo.co.uk> wrote:
>
>> >>for example joins are not possible using SOLR).
>>
>> It's largely *because* Lucene doesn't do joins that it can be made to
>> scale out. I've replaced two large-scale database systems this year with
>> distributed Lucene solutions because this scale-out architecture provided
>> significantly better performance. These were "semi-structured" systems too.
>> Lucene's comparatively simplistic data model/query model is both a weakness
>> and a strength in this regard.
>>
>
>  Hey, maybe the right way to go for a truly scalable and high-performance
> semi-structured database is to marry HBase (Bigtable-like data storage)
> with SOLR/Lucene. I concur with you in the sense that simplistic data models
> coupled with high performance are the killer.
>
> Let me quote this from the original Bigtable paper from Google:
>
> "Bigtable does not support a full relational data model; instead, it
> provides clients with a simple data model that supports dynamic control over
> data layout and format, and allows clients to reason about the locality
> properties of the data represented in the underlying storage. Data is
> indexed using row and column names that can be arbitrary strings. Bigtable
> also treats data as uninterpreted strings, although clients often serialize
> various forms of structured and semi-structured data into these strings.
> Clients can control the locality of their data through careful choices in
> their schemas. Finally, Bigtable schema parameters let clients dynamically
> control whether to serve data out of memory or from disk."
>
>
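
To make the quoted data model a little more concrete, here is a minimal
Python sketch. It is an assumption-laden toy, not Bigtable or HBase code:
ToyBigtable and its put/get/scan methods are hypothetical names. It only tries
to show the three properties the quote mentions: cells addressed by row and
column names that are arbitrary strings, values kept as uninterpreted bytes
that the client serializes itself, and locality controlled by the choice of
row key (rows are kept sorted, so keys with a common prefix sit together).

```python
import json
from bisect import bisect_left


class ToyBigtable:
    """Hypothetical in-memory table: (row, column) -> uninterpreted bytes."""

    def __init__(self):
        self._cells = {}   # (row key, column name) -> bytes
        self._rows = []    # row keys, kept in sorted order

    def put(self, row, column, value):
        # The store never interprets values; the client must hand it bytes.
        if not isinstance(value, bytes):
            raise TypeError("value must be bytes; serialize it on the client")
        i = bisect_left(self._rows, row)
        if i == len(self._rows) or self._rows[i] != row:
            self._rows.insert(i, row)
        self._cells[(row, column)] = value

    def get(self, row, column):
        return self._cells.get((row, column))

    def scan(self, prefix):
        """Return row keys starting with prefix, in sorted order."""
        i = bisect_left(self._rows, prefix)
        out = []
        while i < len(self._rows) and self._rows[i].startswith(prefix):
            out.append(self._rows[i])
            i += 1
        return out


# Illustrative client-side serialization of semi-structured data (made-up keys).
table = ToyBigtable()
table.put("com.example/a", "meta:json", json.dumps({"lang": "en"}).encode("utf-8"))
table.put("com.example/b", "contents:", b"<html>b</html>")
```

Because row keys sort lexicographically, a prefix scan over "com.example/"
touches only neighbouring rows, which is the kind of locality a client can
arrange through its choice of keys.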
