db-derby-user mailing list archives

From Jørgen Løland <Jorgen.Lol...@Sun.COM>
Subject Re: Lucene integration
Date Mon, 16 Mar 2009 14:00:43 GMT
Geoffrey Hendrey wrote:
> Would it be possible for the Derby team to implement Lucene support in the
> following way? Hook into the asynchronous replication protocol to send
> committed transactions to a Lucene receiver. I think it is acceptable for
> the free text search to only "see" committed data.
>
> An alternative to opening the protocol would be to create an abstract
> ReceiverServer for asynchronous data; LuceneReceiver would then just be a
> subclass.
>
> Thoughts?

What does Lucene expect as input? I doubt that the replication code can 
be easily integrated with Lucene because...

  1) The information replication sends from a master to a slave is a 
physical transaction log, which is in a Derby-internal format. It is not 
human readable. To get an idea of what it looks like, you can take a 
look at logN.dat in one of your databases' log/ directories.
  2) Replication does not distinguish between committed and uncommitted 
data; the log records for all transactions, committed or not, are sent 
to the slave.

This means that before anything is fed into Lucene, the log has to be 
processed to pick out the changes made by committed transactions. That 
processing is effectively what Derby's crash recovery code does, and it 
is non-trivial to extract.
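
To illustrate the kind of filtering a receiver would need, here is a minimal Java sketch. It is not Derby code: the LogRecord class, the Kind enum and the string payloads are hypothetical stand-ins, and Derby's real log is a physical, binary format that would first have to be decoded. The sketch only shows the idea of buffering changes per transaction and releasing them when a commit record is seen.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these types exist in Derby.
final class CommittedOnlyFilter {

    enum Kind { UPDATE, COMMIT, ABORT }

    // Simplified, logical log record (assumption: the real log is physical and binary).
    static final class LogRecord {
        final long txId;
        final Kind kind;
        final String payload; // null for COMMIT and ABORT records

        LogRecord(long txId, Kind kind, String payload) {
            this.txId = txId;
            this.kind = kind;
            this.payload = payload;
        }
    }

    // Returns the payloads of committed transactions, in commit order.
    // Changes of aborted or still-open transactions are never returned.
    static List<String> committedPayloads(List<LogRecord> log) {
        Map<Long, List<String>> pending = new HashMap<Long, List<String>>();
        List<String> committed = new ArrayList<String>();
        for (LogRecord r : log) {
            switch (r.kind) {
                case UPDATE:
                    List<String> changes = pending.get(r.txId);
                    if (changes == null) {
                        changes = new ArrayList<String>();
                        pending.put(r.txId, changes);
                    }
                    changes.add(r.payload);
                    break;
                case COMMIT:
                    List<String> done = pending.remove(r.txId);
                    if (done != null) {
                        committed.addAll(done);
                    }
                    break;
                case ABORT:
                    pending.remove(r.txId);
                    break;
            }
        }
        return committed;
    }

    public static void main(String[] args) {
        List<LogRecord> log = Arrays.asList(
            new LogRecord(1, Kind.UPDATE, "a"),
            new LogRecord(2, Kind.UPDATE, "b"),
            new LogRecord(1, Kind.COMMIT, null),
            new LogRecord(2, Kind.ABORT, null),
            new LogRecord(3, Kind.UPDATE, "c")); // transaction 3 never commits
        System.out.println(committedPayloads(log)); // prints [a]
    }
}
```

Even this toy version has to hold back every change until the outcome of its transaction is known. Doing the same against the real log would additionally require translating physical page changes back into rows, which is the part that overlaps with recovery.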

Note that I'm not familiar with Lucene.

Jørgen Løland
