lucene-dev mailing list archives

From Andrzej Bialecki
Subject Re: IndexFileFormat documentation / specification
Date Tue, 04 Oct 2011 15:23:59 GMT
On 04/10/2011 16:37, Robert Muir wrote:
> On Tue, Oct 4, 2011 at 10:33 AM, Andrzej Bialecki wrote:
>> So far the list of possible file names was relatively small and well-known,
>> e.g. people knew that a prx file contained postings, and its size would
>> indicate this or that. We are going to have dozens of codecs soon, and if I
>> come up with a codec that creates, say, abc and xyz files then without
>> knowing what they logically correspond to, it will be difficult to
>> troubleshoot. Similarly, if I discover files abc and xyz in my Directory I
>> should be able to tell whether they belong there.
> I'm not sure about the latter part? This is not really possible unless
> we make all of our codecs use "unique" extensions? Currently the preflex
> codec uses some of the same extensions as standard (e.g. .frq),
> but it's definitely a different codec.

I think that's ok as long as I can say that a .frq file could plausibly 
end up in my Directory because it's documented to belong to a codec I 
was using.
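
A minimal sketch of that check, assuming a hand-maintained table of
documented extensions per codec. Nothing here is Lucene API: the table,
the function names, the codec labels and the "ExampleCodec" entry are
all hypothetical. Only the fact that standard and preflex share .frq
comes from this thread.

```python
from pathlib import PurePath

# Hypothetical registry: extension -> codecs documented as writing it.
# Only the shared .frq entry is taken from the thread; "ExampleCodec"
# and its abc/xyz extensions are invented for illustration.
DOCUMENTED_EXTENSIONS = {
    "frq": {"Standard", "PreFlex"},
    "abc": {"ExampleCodec"},
    "xyz": {"ExampleCodec"},
}


def plausible_owners(file_name, codecs_in_use):
    """Return the codecs in use that are documented to write this extension."""
    ext = PurePath(file_name).suffix.lstrip(".")
    return DOCUMENTED_EXTENSIONS.get(ext, set()) & set(codecs_in_use)


def unexplained_files(file_names, codecs_in_use):
    """List files that no codec in use is documented to produce."""
    return [n for n in file_names if not plausible_owners(n, codecs_in_use)]
```

An extension shared by several codecs is fine under this scheme: the
question is only whether at least one codec I was using is documented
to write it, not which codec actually wrote the file.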

(And also maybe we shouldn't insist on using the same extensions over 
and over again. I guess the original motivation for 3-letter extension 
names came from a certain stone age operating system, but I think we are 
free to use even 4+ letters now ;) )

Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration  Contact: info at sigram dot com
