lucene-dev mailing list archives

From Andrzej Bialecki
Subject Re: IndexFileFormat documentation / specification
Date Tue, 04 Oct 2011 10:55:06 GMT
On 04/10/2011 12:44, Michael McCandless wrote:
> Can we stop trying to document the file format?
> Is it really needed?  It has been an error-prone process over time...
> Can't the source code be the definitive resource one reads to
> determine how a codec stores stuff....?

I'm more or less indifferent to maintaining detailed format docs for
users' consumption. However, the absolute minimum IMHO is to document
which files can belong to an index, under which codec, and what the
purpose of each file is. In many situations this is crucial for
troubleshooting, e.g. "your prx file is large, because...". The
annotation-based method that I suggested would do fine, with a
post-processing step to collect the documentation in one place.
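
As a rough illustration of the annotation-based idea, here is a minimal
sketch in Java. Everything in it is hypothetical: the @IndexFile
annotation, its container @IndexFiles, the ExampleCodec class and the
collect() helper are invented for this example and are not part of
Lucene. The purpose strings are illustrative summaries only. The
post-processing step is done here with plain reflection at runtime; a
build-time annotation processor would be another way to do it.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Repeatable;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.List;

final class IndexFileDocs {

    // Hypothetical annotation (not a Lucene API): declares one file
    // extension that a codec may write, plus a one-line purpose.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @Repeatable(IndexFiles.class)
    public @interface IndexFile {
        String extension();
        String purpose();
    }

    // Container required by @Repeatable so that a codec class can
    // carry several @IndexFile annotations.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    public @interface IndexFiles {
        IndexFile[] value();
    }

    // Stand-in for a real codec class; the descriptions are
    // illustrative, not authoritative format documentation.
    @IndexFile(extension = "prx", purpose = "term positions within each document")
    @IndexFile(extension = "frq", purpose = "documents containing each term, with frequencies")
    public static class ExampleCodec {}

    // Post-processing step: gather the annotations from the given
    // codec classes into one listing, in declaration order.
    public static List<String> collect(Class<?>... codecs) {
        List<String> lines = new ArrayList<>();
        for (Class<?> codec : codecs) {
            for (IndexFile f : codec.getAnnotationsByType(IndexFile.class)) {
                lines.add("." + f.extension() + " (" + codec.getSimpleName() + "): " + f.purpose());
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        collect(ExampleCodec.class).forEach(System.out::println);
    }
}
```

Keeping the description next to the code that writes the file is the
point of the approach: whoever changes the codec sees the doc line in
the same place, and the collected listing is regenerated rather than
maintained by hand.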

Still, if a developer wants to study the source code then at least 
having a per-codec doc that explains the principles behind a particular 
format would be a big help. It's not always obvious from the code what 
the code is trying to do, and why, and as we move towards more and more 
esoteric codecs this will become even harder.

Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration  Contact: info at sigram dot com
