lucene-dev mailing list archives

From: Simon Cozens
Subject: Re: Language neutral index format representation (was Re: Confused by writePostings/
Date: Tue, 02 Dec 2003 18:56:13 GMT
Erik Hatcher:
> Very nice!  At FOO you mentioned you were going to probably write a 
> Perl version - glad you're getting the time to do it now. 

Yep, thanks to Kasei, who are also cleaning up and documenting the code I
write. For the interested, what I'm doing is at and I hope to sync back over
the docs/tests once they're completed.

> I've been dragging my feet on RubyLucene (@ - I've gotten
> some low-level file I/O Directory implementations working, but nothing above
> that yet.

My version's almost there, thanks to a month of basically full-time work on it.

> Speaking of language implementations of Lucene's index format and 
> associated searching/indexing API, I think it would be cool if we 
> represent the directory and file formats in a computer-readable 
> (probably XML) format which could be used to code generate the 
> low-level language-specific code for the various implementations.  

That would be quite nifty; I'll have a think about how it might look.

> What do folks think of this idea?  Any drawbacks?  Could the Java I/O 
> code be code generated without affecting the design at that level if 
> such a representation existed?

I believe so. You'd generate, conceptually, an ObjectSerializer class of
some sort which has read and write methods, which is overloaded to do
the right thing with the right object type.
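
As a rough illustration of that idea, here is a minimal Python sketch of a
serializer driven by a declarative record description. The layout table
(FIELD_INFO), the codec names and the record shape are all hypothetical,
made up for this example rather than taken from any Lucene implementation.
The variable-length integer follows the usual seven-bits-per-byte scheme
with the high bit as a continuation flag; the string codec is simplified to
a byte length plus plain UTF-8, which is an assumption and not necessarily
what the real format does.

```python
import io

def write_vint(out, value):
    """Write a non-negative int, 7 bits per byte, low-order bits first."""
    while value > 0x7F:
        out.write(bytes([(value & 0x7F) | 0x80]))  # high bit set: more bytes follow
        value >>= 7
    out.write(bytes([value]))

def read_vint(inp):
    """Read an int written by write_vint."""
    shift, result = 0, 0
    while True:
        b = inp.read(1)[0]
        result |= (b & 0x7F) << shift
        if not (b & 0x80):
            return result
        shift += 7

def write_string(out, s):
    """Simplified string codec (assumption): byte length, then UTF-8 bytes."""
    data = s.encode("utf-8")
    write_vint(out, len(data))
    out.write(data)

def read_string(inp):
    n = read_vint(inp)
    return inp.read(n).decode("utf-8")

# One (writer, reader) pair per primitive type named in a layout.
CODECS = {
    "vint": (write_vint, read_vint),
    "string": (write_string, read_string),
}

# Hypothetical machine-readable description of one record type.
FIELD_INFO = [("name", "string"), ("number", "vint")]

def write_record(out, layout, record):
    """Write each field of record using the codec its layout entry names."""
    for field, kind in layout:
        CODECS[kind][0](out, record[field])

def read_record(inp, layout):
    """Read one record back as a dict, field by field, in layout order."""
    return {field: CODECS[kind][1](inp) for field, kind in layout}
```

A generator for another language would emit the equivalent of write_record
and read_record from the same layout table, instead of interpreting it at
run time as this sketch does.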

However, I can imagine some snags, such as the one which prompted this
thread: how would you represent sequences of objects with their properties
delta-encoded, for instance, or the cunning buffer-substring trick used to
store the terms in the .tis file?
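
To make the snag concrete, here is a small Python sketch of the two
encodings in question. It is an illustration under assumptions, not the
actual .tis layout: the function names are invented, and the term encoding
is reduced to "length of the prefix shared with the previous term, plus the
remaining suffix". The point is that each record can only be decoded given
the record before it, which a purely field-by-field description has no
obvious way to express.

```python
import os

def delta_encode(sorted_ids):
    """Replace each value with its gap from the previous one (first gap is from 0)."""
    prev, gaps = 0, []
    for value in sorted_ids:
        gaps.append(value - prev)
        prev = value
    return gaps

def delta_decode(gaps):
    """Rebuild the original values by keeping a running sum of the gaps."""
    total, values = 0, []
    for gap in gaps:
        total += gap
        values.append(total)
    return values

def prefix_encode(sorted_terms):
    """Store each term as (chars shared with the previous term, remaining suffix)."""
    prev, pairs = "", []
    for term in sorted_terms:
        shared = len(os.path.commonprefix([prev, term]))
        pairs.append((shared, term[shared:]))
        prev = term
    return pairs

def prefix_decode(pairs):
    """Rebuild each term from the previous term's prefix plus the stored suffix."""
    prev, terms = "", []
    for shared, suffix in pairs:
        prev = prev[:shared] + suffix
        terms.append(prev)
    return terms
```

Both decoders carry state (total, prev) from one record to the next, so a
format description would need some notion of "relative to the previous
record" on top of plain field types.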

BASH is great, it dumps core and has clear documentation.  -Ari Suntioinen
