lucene-dev mailing list archives

From Chuck Williams <>
Subject Re: Lucene Gdata -- the best way to store the feeds / entries
Date Sun, 28 May 2006 00:36:32 GMT

Storing content in a Lucene index is a common approach and works well.
I use a patch, LUCENE-362, to boost performance: compress and
decompress the field externally, and store just the byte[] in the Lucene
index.  The patch eliminates all the copying of the byte[] that Lucene
otherwise does, at the cost of supporting only one such field per Document.
As the patch is a bit old, you may need to "help" it apply to the latest
source if the patch tool doesn't do it for you.
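
A minimal sketch of the external compress/decompress step, assuming the
JDK's java.util.zip classes. The class name EntryCodec and its method
names are hypothetical, chosen only for illustration; none of this is
taken from LUCENE-362 itself. The byte[] returned by compress() is what
would go into a binary stored field of the Document.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper: compresses an entry outside of Lucene so that the
// index only ever sees an opaque byte[].
public class EntryCodec {

    // Compress the entry text (UTF-8) with DEFLATE.
    public static byte[] compress(String entryXml) {
        byte[] input = entryXml.getBytes(StandardCharsets.UTF_8);
        Deflater deflater = new Deflater(Deflater.BEST_SPEED);
        try {
            deflater.setInput(input);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream(input.length / 2 + 16);
            byte[] buf = new byte[4096];
            while (!deflater.finished()) {
                int n = deflater.deflate(buf);
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native zlib resources
        }
    }

    // Restore the entry text from the byte[] read back from the index.
    public static String decompress(byte[] stored) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(stored);
            ByteArrayOutputStream out = new ByteArrayOutputStream(stored.length * 2 + 16);
            byte[] buf = new byte[4096];
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    // No progress is possible: the data is truncated or not DEFLATE.
                    throw new IllegalArgumentException("truncated or invalid compressed entry");
                }
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("invalid compressed entry", e);
        } finally {
            inflater.end();
        }
    }
}
```

Deflater.BEST_SPEED is just one possible trade-off; a higher level gives
smaller stored fields at more CPU cost when indexing.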


Simon Willnauer wrote on 05/27/2006 01:33 PM:
> For those who haven't heard about the GData project, please check
> today's mailing list.
> The Lucene Indexer is supposed to be used as the search component of
> this implementation. GData is an extension to the Atom/RSS format that
> adds search and a kind of versioning, and this project is a server-side
> implementation of the protocol. The problem is that the
> incoming feed entries and their updates have to be stored somewhere in
> persistent storage. The easiest approach would be flat-file
> storage, which is not sufficient in my eyes. I thought about using an
> approach similar to the Nutch distributed file system: index the
> incoming entries in a "searchable" index and store the whole entry in
> an associated index, to prevent the searchable index from growing too fast.
> To keep the index small I would create a separate index for each feed
> instance, organized in the local file system.
> I would be interested to hear whether anybody has experience with
> retrieving large data, like whole feed entries, out of a "storage"
> Lucene index. Should I expect any performance problems with this approach?
> As far as I know, Lucene doesn't support any versioning, or did that
> change by any chance? The protocol description doesn't say
> anything about retrieving old versions (the documentation only covers
> optimistic locking / updating versions).
> regards Simon
