lucene-java-user mailing list archives

From "Johannes.Lichtenberger" <Johannes.Lichtenber...@uni-konstanz.de>
Subject native, versioned XML-DBMS (that is full text search in versioned document collections)
Date Wed, 28 Nov 2012 01:09:50 GMT
Hello,

As I posted some time ago, I'm working on a native, versioned XML DBMS [1].
I'd like to provide a full-text index, and I recently read about
custom Codecs, which can be plugged in. Usually data (for instance
XML nodes) is stored on RecordPages. I'm still not sure whether it is
possible, and whether it makes sense, to implement a PostingsFormat and
possibly a Directory.

What I want to achieve is to be able to use my infrastructure for
transaction-safe versioning. That is, I need some kind of record for each
of the different types (I think fields, terms, documents and term
positions), with a simple record ID to retrieve the record from disk and a
marker for which kind of record it is. Beyond that, all I need is a
serialization/deserialization mechanism for each record type. Probably I
can simply reuse the default serialization/deserialization routines. I'm
also not sure whether it would be worthwhile to provide a B+-tree
implementation which always clusters, for instance, the fields, then the
terms, then the documents and then the term positions. I don't know what
index structure Lucene uses by default, but I think it must be something
that performs well on any kind of disk (reading/writing blocks of data).
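
To make the record idea more concrete, here is a minimal sketch of what
such a record could look like. Everything in it is an assumption made for
illustration: the names IndexRecordSketch, IndexRecord and RecordKind are
hypothetical and belong to neither Lucene nor Sirix, and the byte layout
(8-byte ID, 1-byte kind, length-prefixed payload) is just one possible
choice. The payload is treated as opaque bytes, so it says nothing about
how Lucene actually encodes fields, terms or positions.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class IndexRecordSketch {

    /** Hypothetical record kinds, mirroring the types listed above. */
    enum RecordKind { FIELD, TERM, DOCUMENT, TERM_POSITION }

    /** Hypothetical record: an ID, a kind, and an opaque payload. */
    static final class IndexRecord {
        final long id;
        final RecordKind kind;
        final byte[] payload;

        IndexRecord(long id, RecordKind kind, byte[] payload) {
            this.id = id;
            this.kind = kind;
            this.payload = payload;
        }

        /** Assumed layout: 8-byte ID, 1-byte kind, 4-byte length, payload. */
        byte[] serialize() {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                out.writeLong(id);
                out.writeByte(kind.ordinal());
                out.writeInt(payload.length);
                out.write(payload);
            } catch (IOException e) {
                // Not expected for an in-memory stream.
                throw new UncheckedIOException(e);
            }
            return bytes.toByteArray();
        }

        /** Reads a record back from the layout written by serialize(). */
        static IndexRecord deserialize(byte[] data) {
            try (DataInputStream in =
                     new DataInputStream(new ByteArrayInputStream(data))) {
                long id = in.readLong();
                RecordKind kind = RecordKind.values()[in.readUnsignedByte()];
                byte[] payload = new byte[in.readInt()];
                in.readFully(payload);
                return new IndexRecord(id, kind, payload);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    public static void main(String[] args) {
        IndexRecord term = new IndexRecord(
            42L, RecordKind.TERM, "lucene".getBytes(StandardCharsets.UTF_8));
        IndexRecord copy = IndexRecord.deserialize(term.serialize());
        // prints "42 TERM lucene"
        System.out.println(copy.id + " " + copy.kind + " "
            + new String(copy.payload, StandardCharsets.UTF_8));
    }
}
```

With a layout like this, the versioning layer would only have to deal with
(record ID, kind, bytes) and would not need to know what the payload means.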

Any hints and suggestions would be nice.

kind regards,
Johannes

[1] https://github.com/JohannesLichtenberger/sirix
