lucene-dev mailing list archives

From robert engels <reng...@ix.netcom.com>
Subject Re: Writing out the term count when merging
Date Mon, 19 Mar 2007 23:17:20 GMT
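Write the term count at the end of the file, uncompressed. The body can then still be written serially, and a reader can find the count by seeking to the fixed-size trailer.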
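
A minimal sketch of that idea in Java, in case it helps. This is not Lucene's actual file format or API: the class name TermCountTrailer, the method names, and the layout (a gzip-compressed body of terms followed by an 8-byte uncompressed count) are all assumptions made for illustration. It writes to an in-memory buffer so the example is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical layout: [gzip-compressed terms][8-byte big-endian term count]
final class TermCountTrailer {
    static final int TRAILER_BYTES = Long.BYTES;

    // Streams the already-merged (unique) terms out serially and counts them
    // on the way; the count is only known, and only written, at the end.
    static byte[] writeTerms(Iterator<String> mergedTerms) throws IOException {
        ByteArrayOutputStream file = new ByteArrayOutputStream();
        GZIPOutputStream gzip = new GZIPOutputStream(file);
        DataOutputStream body = new DataOutputStream(gzip);
        long count = 0;
        while (mergedTerms.hasNext()) {
            body.writeUTF(mergedTerms.next());
            count++;
        }
        body.flush();
        gzip.finish(); // complete the compressed body without closing 'file'

        // Trailer goes to the raw stream, bypassing compression.
        new DataOutputStream(file).writeLong(count);
        return file.toByteArray();
    }

    // Reads the count from the last 8 bytes without touching the body.
    static long readTermCount(byte[] data) {
        return ByteBuffer.wrap(data, data.length - TRAILER_BYTES, TRAILER_BYTES).getLong();
    }

    // Decompresses the body (everything before the trailer) and reads the terms.
    static List<String> readTerms(byte[] data) throws IOException {
        long count = readTermCount(data);
        List<String> terms = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new GZIPInputStream(
                new ByteArrayInputStream(data, 0, data.length - TRAILER_BYTES)))) {
            for (long i = 0; i < count; i++) {
                terms.add(in.readUTF());
            }
        }
        return terms;
    }
}
```

With a real file, the reader would seek to (file length - 8) to get the count, for example with java.io.RandomAccessFile, and then open the compressed stream over the bytes before it. Nothing is ever overwritten, so the writer stays strictly sequential.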

On Mar 19, 2007, at 6:13 PM, Matt Chaput wrote:

> Hi all!
>
> I'm reimplementing a very Lucene-like search library as a learning  
> experience and I've run into a snag. Before I go deep code diving,  
> I thought I'd ask here in case someone has the time to answer.
>
> The term dictionary file includes the term count in a header. But  
> when I'm merging segments, I can't know the collected number of  
> UNIQUE terms in the merging segments before I've read them, so I  
> can't write the header before I start merging the segments.
>
> The ways I can see to do this are (a) to scan the term lists of the  
> segments first and build the collected term list in memory before  
> merging, (b) leave space in the file for the term count and go back  
> and overwrite it later, or (c) something much more clever that  
> Lucene does but I haven't figured out yet.
>
> (b) is undesirable for me, because I'd like the option of using  
> compressed streams in the backend, which must be written serially.
>
> Anyway, if someone more familiar with the code could point me in  
> the right direction, I'd appreciate it very much.
>
> Thanks!
>
> Matt
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>



