lucene-dev mailing list archives

From <>
Subject Concurrency in Lucene (repost with attachments)
Date Tue, 25 Nov 2003 22:20:56 GMT
Reposting with attachments. 

This email is regarding my previous post:
I have attached the solution to which I referred before (sorry for the delay).

This solution puts a transactional wrapper around the Lucene index and
provides a new set of APIs with update, insert, delete, fetch and query
functionality. All functions work as if the index were a database. The
Record interface wraps the Document interface and requires documents to
have unique ids. All writes are transactional (all or nothing); the
solution guarantees that all writes are committed by writing them to
disk and later propagating them to the index. This also implies that
once a document is deleted, it will not show up in search results. One
caveat is that inserts have a lag time before they make it to the Lucene
index and thus will not be queryable right away. Fetch, however, will
retrieve an inserted document right away.
The solution does all of the bookkeeping of Lucene's IndexWriters,
IndexReaders and of its internal log files.
The solution ensures that writes and reads are non-blocking.
The solution handles computer crashes gracefully. As stated before, all
writes are done to the log first and are "all or none". If a crash
occurs while indexing is being performed, the solution ensures that the
indexes are not corrupted by keeping the internal state of the work.
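
To make the "log first, all or none" idea concrete, here is a minimal
sketch in plain Java. It is not the attached code: the TxLog class, its
BEGIN/COMMIT markers and the in-memory list standing in for the on-disk
log are all assumptions made for illustration. A transaction whose
COMMIT marker never reached the log is skipped on replay.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for the on-disk log; not the attached implementation. */
class TxLog {
    private final List<String> entries = new ArrayList<String>();

    /** Appends all writes of one transaction, then a COMMIT marker unless a crash is simulated. */
    void append(List<String> writes, boolean crashBeforeCommit) {
        entries.add("BEGIN");
        entries.addAll(writes);
        if (!crashBeforeCommit) {
            entries.add("COMMIT");
        }
    }

    /** Replays the log and returns only the writes of committed transactions. */
    List<String> replay() {
        List<String> applied = new ArrayList<String>();
        List<String> pending = new ArrayList<String>();
        for (String entry : entries) {
            if (entry.equals("BEGIN")) {
                pending.clear();          // drop leftovers of an uncommitted transaction
            } else if (entry.equals("COMMIT")) {
                applied.addAll(pending);  // the whole transaction becomes visible at once
                pending.clear();
            } else {
                pending.add(entry);
            }
        }
        return applied;                   // writes still in 'pending' were never committed
    }

    public static void main(String[] args) {
        TxLog log = new TxLog();
        log.append(Arrays.asList("insert PK_1", "delete PK_2"), false);
        log.append(Arrays.asList("insert PK_3"), true); // simulated crash before commit
        System.out.println(log.replay()); // [insert PK_1, delete PK_2]
    }
}
```

Only committed transactions would then be propagated to the index, so a
crash in the middle of a write leaves nothing half-applied.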
Internally, the solution keeps two sets of indexes and logs and
hot-swaps them in round-robin fashion. I will not go into the details of
how it does this; read the IndexManager class comments if you want to
know more.
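
For readers who want a rough picture before opening IndexManager, the
hot-swap idea can be sketched as below. This is only an illustration
under stated assumptions: the HotSwap class and its method names are
hypothetical, lists of strings stand in for the index sets, and the real
IndexManager may work quite differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReferenceArray;

/** Hypothetical two-slot holder; each slot stands in for one index set. */
class HotSwap {
    private final AtomicReferenceArray<List<String>> slots =
            new AtomicReferenceArray<List<String>>(2);
    private final AtomicInteger active = new AtomicInteger(0);

    HotSwap() {
        slots.set(0, Collections.<String>emptyList());
        slots.set(1, Collections.<String>emptyList());
    }

    /** Readers never block: they take whichever slot is active right now. */
    List<String> read() {
        return slots.get(active.get());
    }

    int activeSlot() {
        return active.get();
    }

    /** Rebuilds the standby slot off to the side, then makes it the active one. */
    synchronized void applyAndSwap(List<String> pendingDocs) {
        int current = active.get();
        int standby = (current + 1) % slots.length();   // round-robin over the slots
        List<String> rebuilt = new ArrayList<String>(slots.get(current));
        rebuilt.addAll(pendingDocs);
        slots.set(standby, Collections.unmodifiableList(rebuilt));
        active.set(standby);                            // readers switch over in one step
    }

    public static void main(String[] args) {
        HotSwap index = new HotSwap();
        List<String> snapshot = index.read();           // taken before any swap
        index.applyAndSwap(Arrays.asList("doc1"));
        index.applyAndSwap(Arrays.asList("doc2"));
        System.out.println(snapshot);                   // []
        System.out.println(index.read());               // [doc1, doc2]
    }
}
```

A reader that obtained a snapshot before a swap keeps working on the old
set, which is what lets reads continue while the other set is updated.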

Quick How To Use It guide.

First off, this code is about one year old and is our first prototype,
so you will find a bunch of TODOs and logging in the code.

I will write sample code here with comments; I think this is the best
way since the API is straightforward.

import com.epiphany.know.*;
import com.epiphany.know.server.Service;

class Test {
static public void main(String args[]) throws Exception {
	// First start service

	// Connection is the interface you want to use to update,
	// delete, fetch, insert and query
	Connection conn = Service.get().getConnection();

	// Record is the structure Connection uses. All records must
	// have a primary key
	// let's construct the first record
	Record rec = new Record();
	rec.setField(Record.PKID, "PK_abc123");
	// names of the record fields are defined in the Properties class
	rec.setField("title_field", "this is title");
	rec.setField("text_field", "some text here");

	// let's insert the record

	// let's fetch and modify the record
	rec = conn.fetch("PK_abc123");
	rec.setField("text_field", "new text");
	Thread.sleep(20000); // wait until the record makes it to the index

	RecordSet rs = conn.query("text");
	rec =;
	while(rec != null)
		rec =;
} // end of main
} // end of class Test

I did not try to compile the code above, but you can run ConnectionTest to
see it work.
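
The sample sleeps before querying because of the insert lag described
earlier. One way the "fetch works at once, query lags, deletes vanish
from search" behavior could be obtained is sketched below. This is my
own illustration, not the attached code: the PendingOverlay class, its
maps and the propagate() method are hypothetical, the substring match
stands in for a real Lucene query, and the sketch is not thread-safe.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical overlay of logged-but-unindexed writes on top of an index. */
class PendingOverlay {
    private final Map<String, String> indexed = new TreeMap<String, String>();       // stands in for the Lucene index
    private final Map<String, String> pendingWrites = new TreeMap<String, String>(); // logged, not yet indexed
    private final Set<String> pendingDeletes = new HashSet<String>();

    void insert(String id, String text) {
        pendingDeletes.remove(id);
        pendingWrites.put(id, text);
    }

    void delete(String id) {
        pendingWrites.remove(id);
        pendingDeletes.add(id);
    }

    /** Fetch by id checks pending writes first, so a new record is visible at once. */
    String fetch(String id) {
        if (pendingDeletes.contains(id)) {
            return null;
        }
        String pending = pendingWrites.get(id);
        return pending != null ? pending : indexed.get(id);
    }

    /** Query only sees indexed records, minus those with a pending delete. */
    List<String> query(String term) {
        List<String> ids = new ArrayList<String>();
        for (Map.Entry<String, String> e : indexed.entrySet()) {
            if (!pendingDeletes.contains(e.getKey()) && e.getValue().contains(term)) {
                ids.add(e.getKey());
            }
        }
        return ids;
    }

    /** Moves pending writes and deletes into the index (the lagging step). */
    void propagate() {
        indexed.putAll(pendingWrites);
        indexed.keySet().removeAll(pendingDeletes);
        pendingWrites.clear();
        pendingDeletes.clear();
    }

    public static void main(String[] args) {
        PendingOverlay store = new PendingOverlay();
        store.insert("PK_abc123", "some text here");
        System.out.println(store.fetch("PK_abc123")); // some text here
        System.out.println(store.query("text"));      // []
        store.propagate();
        System.out.println(store.query("text"));      // [PK_abc123]
        store.delete("PK_abc123");
        System.out.println(store.query("text"));      // []
    }
}
```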
There is also an HTTP implementation of Connection. This
implementation forwards all requests to Connection over HTTP. To set that
up, download Tomcat and under webapps create the directory tree.
Under lib, put kms_server.jar (all classes from the solution jarred up)
and Lucene-1.2-rc3.jar (you can try the latest too, but I used that one).
Under WEB-INF, put the web.xml file (attached).
Start Tomcat.

On the client side, use this code to get Connection:
Connection con = new com.epiphany.know.transport.HTTPConnectionClient(
	serverName, serverPort, "km/conn");

Lucene's index and log file will be stored in the c:\lucene directory, so
you need to create it. If you want to change the location, go to the
Properties class and change it there.

I think this should get things started.

Kiril Zack

These changes are provided "AS IS" and any express or implied
warranties, including, but not limited to, the implied warranties of
merchantability and fitness for a particular purpose are disclaimed. In
no event shall E.piphany, Inc. be liable for any direct, indirect,
incidental, special, exemplary or consequential damages (including, but
not limited to, procurement of substitute goods or services; loss of
use, data, or profits; or business interruption) however caused and on
any theory of liability, whether in contract, strict liability, or tort
(including negligence or otherwise) arising in any way out of the use of
this software or the changes made thereto, even if advised of the
possibility of such damage.
