lucene-java-user mailing list archives

From "Sertic Mirko, Bedag" <>
Subject AW: Lucene as a primary datastore
Date Wed, 20 Jan 2010 14:26:23 GMT

Did you ever think about a content repository, like Jackrabbit or Alfresco?
Alfresco also generates a Lucene index for the documents stored in its
repository. The content repository itself is backed by a database or a
filesystem...


-----Original Message-----
From: Darren Hartford []
Sent: Wednesday, January 20, 2010 15:12
Subject: RE: Lucene as a primary datastore

My two cents is no, don't use Lucene as a primary datastore.  Although
some datastores that look similar to Lucene define themselves as
primary datastores (the 'NoSQL'-style datastores), I would put Lucene
beside the likes of RRD and other special-purpose information stores
that are about providing information and functionality, but are not
necessarily the gatekeeper of your raw (gold) data.

Caveat: My definition of raw, or 'gold', data is the detailed data plus
the audit/transaction history that identifies the origin of that data.
So it's not just the end result (often called 'current') data; it's all
the data on how it got to the current state and how it fluctuates over
time.
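
To make the distinction between 'current' and 'gold' data concrete, here
is a minimal sketch in Python. It assumes a hypothetical append-only
change log; the names `Change` and `current_state` and the sample records
are illustrative only and do not come from this thread or from Lucene.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class Change:
    """One audit record (hypothetical shape): who set which field to what, and when."""
    timestamp: str  # ISO-8601 string, so lexical order matches time order
    source: str     # origin of the change, e.g. a user or an import job
    key: str
    value: Any


def current_state(history: List[Change]) -> Dict[str, Any]:
    """Replay the history in time order; the last write per key wins."""
    state: Dict[str, Any] = {}
    for change in sorted(history, key=lambda c: c.timestamp):
        state[change.key] = change.value
    return state


# Illustrative data only.
history = [
    Change("2010-01-18T09:00:00Z", "import-job", "price", 10),
    Change("2010-01-19T14:30:00Z", "alice", "price", 12),
    Change("2010-01-19T15:00:00Z", "alice", "status", "active"),
]

# The 'current' view is derived from the gold data.
snapshot = current_state(history)
```

The snapshot can always be rebuilt from the history, but the history
(origin, earlier values, timing) cannot be rebuilt from the snapshot. In
this sketch the history plays the role of the 'gold' data.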

My two coppers,


-----Original Message-----
From: Erick Erickson [] 
Sent: Wednesday, January 20, 2010 8:59 AM
Subject: Re: Lucene as a primary datastore

My preference is to put the effort into preserving the original
source on the theory that I'm sure no information is lost that
way. So the suitability of Lucene to store it varies depending
upon the source IMO.

If it's raw text, then storing all the raw text in an unindexed
field in Lucene might suit you well. But if it's a database,
there may be lots of metadata codified in the design
(e.g. foreign keys, required fields, etc.) that's hard to
preserve outside a DB, and that you may need someday....

So that's the first question I'd explore....


On Tue, Jan 19, 2010 at 10:58 PM, Guido Bartolucci <> wrote:

> I know that the primary use case for Lucene is as an index of data
> that can be reconstructed (e.g., from a relational database or from
> spidering your corporate intranet).
> But, I'm curious if anyone uses Lucene as their primary datastore for
> their gold data. Is it good enough?
> Would anyone consider (or do people already) store data in Lucene
> that, if it was lost, would destroy their business? And no, I'm not
> suggesting that you don't back up this data, I'm just curious if there
> are problems with using Lucene in this way. Are there subtle
> corruptions that might show up in Lucene that wouldn't show up in
> Oracle or MySQL?
> I'm considering using Lucene in this way but I haven't been able to
> find any documentation describing this use case. Are there any studies
> of Lucene vs MySQL running for N years comparing the corruptions and
> recovery times?
> Am I just ignorant and scared of Lucene and too trusting of Oracle and
> MySQL?
> Thanks.
> -guido.
> (BTW, I did find a similar question asked back in 2007 in the archives
> but it doesn't really answer my question)
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:
