lucene-dev mailing list archives

From "Michael McCandless" <>
Subject Re: DocumentsWriter questions
Date Sat, 03 Nov 2007 13:15:57 GMT
"Leon" <> wrote:

>   1:  Why not extract the ThreadState management code into a new
>   internal class such as ThreadStatePool?
>   At present, ThreadState management code is scattered throughout
>   DocumentsWriter.

I'm not sure what you mean by "ThreadState management"?

Currently there are two methods that manage allocating
(getThreadState) and freeing (finishDocument) a ThreadState for the
processing of one document.  Are you proposing making a new class that
would do what these two methods do now?
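
For concreteness, here is a minimal sketch of what such a class might
look like. Everything in it is hypothetical: ThreadStatePool, acquire,
release, allocatedCount and the docsProcessed field are illustrative
names, not the actual DocumentsWriter code. acquire stands in for what
getThreadState does now and release for what finishDocument does now;
the sketch only shows state reuse and ignores how threads are bound to
states.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch only; not the actual DocumentsWriter code. */
class ThreadStatePool {

    /** Stand-in for the per-document state; the field is illustrative. */
    static final class ThreadState {
        int docsProcessed;
    }

    // Idle states waiting to be reused by the next document.
    private final Deque<ThreadState> free = new ArrayDeque<>();
    private int allocated;

    /** Plays the role of getThreadState: reuse an idle state or create one. */
    synchronized ThreadState acquire() {
        ThreadState state = free.poll();
        if (state == null) {
            state = new ThreadState();
            allocated++;
        }
        return state;
    }

    /** Plays the role of finishDocument: hand the state back for reuse. */
    synchronized void release(ThreadState state) {
        state.docsProcessed++;
        free.push(state);
    }

    /** Number of states created so far (idle or in use). */
    synchronized int allocatedCount() {
        return allocated;
    }
}
```

A caller would wrap the processing of one document between acquire()
and release(), ideally with release() in a finally block.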

>   2:  Why not extract the hash method to something like LuceneHashMap?

Well the hashing that DocumentsWriter does is fairly specifically
tailored to what DocumentsWriter needs.  It's not a general hash map:
you can't remove entries; it relies on specific packed block storage
of the char[] for a term; the internal hash array gets compacted &
sorted & nulled out in bulk to write a segment; etc.  Maybe if we
factored it out and called it DocumentsWriterHashMap this could work?
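
To illustrate the properties listed above, here is a rough sketch of
an add-only term hash. It is not the DocumentsWriter code: the class
name TermHashSketch, its methods, the initial sizes and the hash
function are all assumptions made for the example. It keeps term text
packed back to back in one char[], has no remove, and on flush walks
the hash array skipping empty slots, sorts the terms, and clears
everything in bulk.

```java
import java.util.Arrays;

/** Hypothetical sketch of an add-only term hash; not Lucene's actual code. */
class TermHashSketch {
    private char[] pool = new char[1024]; // packed term text, back to back
    private int poolUsed;
    private int[] starts = new int[16];   // per term id: offset into pool
    private int[] lengths = new int[16];  // per term id: length in chars
    private int[] freqs = new int[16];    // per term id: occurrence count
    private int[] table = new int[32];    // open-addressed slots: term id or -1
    private int count;

    TermHashSketch() { Arrays.fill(table, -1); }

    int add(String term) {
        char[] c = term.toCharArray();
        return add(c, 0, c.length);
    }

    /** Records one occurrence of the term and returns its id. No remove exists. */
    int add(char[] text, int off, int len) {
        int mask = table.length - 1;
        int slot = hash(text, off, len) & mask;
        while (table[slot] != -1) {       // linear probing
            int id = table[slot];
            if (sameTerm(id, text, off, len)) { freqs[id]++; return id; }
            slot = (slot + 1) & mask;
        }
        if (poolUsed + len > pool.length) {
            pool = Arrays.copyOf(pool, Math.max(pool.length * 2, poolUsed + len));
        }
        if (count == starts.length) {
            starts = Arrays.copyOf(starts, count * 2);
            lengths = Arrays.copyOf(lengths, count * 2);
            freqs = Arrays.copyOf(freqs, count * 2);
        }
        System.arraycopy(text, off, pool, poolUsed, len);
        int id = count++;
        starts[id] = poolUsed;
        lengths[id] = len;
        freqs[id] = 1;
        poolUsed += len;
        table[slot] = id;
        if (count * 2 > table.length) rehash(); // keep load factor at or below 1/2
        return id;
    }

    int freq(int id) { return freqs[id]; }

    int size() { return count; }

    /** Collects live slots, sorts the terms, then clears the hash in bulk. */
    String[] flush() {
        String[] terms = new String[count];
        int n = 0;
        for (int slot = 0; slot < table.length; slot++) { // skip empty slots
            int id = table[slot];
            if (id != -1) terms[n++] = new String(pool, starts[id], lengths[id]);
        }
        Arrays.sort(terms);
        Arrays.fill(table, -1);
        count = 0;
        poolUsed = 0;
        return terms;
    }

    private static int hash(char[] text, int off, int len) {
        int h = 0;
        for (int i = 0; i < len; i++) h = 31 * h + text[off + i];
        return h;
    }

    private boolean sameTerm(int id, char[] text, int off, int len) {
        if (lengths[id] != len) return false;
        for (int i = 0; i < len; i++) {
            if (pool[starts[id] + i] != text[off + i]) return false;
        }
        return true;
    }

    private void rehash() {
        int[] bigger = new int[table.length * 2];
        Arrays.fill(bigger, -1);
        int mask = bigger.length - 1;
        for (int id = 0; id < count; id++) {
            int slot = hash(pool, starts[id], lengths[id]) & mask;
            while (bigger[slot] != -1) slot = (slot + 1) & mask;
            bigger[slot] = id;
        }
        table = bigger;
    }
}
```

Because entries are never removed, probing needs no tombstones and the
packed char[] never needs compaction between flushes, which is part of
why a general-purpose map is a poor fit here.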

> Making the code easier to understand is a good way to attract more
> people to get involved.

Absolutely!  Could you boil these ideas down into a patch?
Simplifying DocumentsWriter (without losing too much performance)
would be awesome.

