lucene-dev mailing list archives

From Leon <>
Subject Re: DocumentsWriter questions
Date Sat, 03 Nov 2007 13:44:00 GMT
> "Leon" wrote:
> > 1: Why not extract the ThreadState management code into a new
> > internal class such as ThreadStatePool?
> >
> > At present, there is a lot of ThreadState management code scattered
> > throughout DocumentsWriter.
> I'm not sure what you mean by "ThreadState management"? 
> Currently there are two methods that manage allocating
> (getThreadState) and freeing (finishDocument) a ThreadState for the
> processing of one document. Are you proposing making a new class that
> would do what these two methods do now?
Hmm, yes, I am proposing a new class to manage these ThreadStates. I have already extracted
this code into a class called ThreadStatePool in my workspace, which looks something like:

 private class ThreadStatePool {
   // Max # ThreadState instances; if there are more threads
   // than this they share ThreadStates
   private final static int MAX_THREAD_STATE = 5;
   private ThreadState[] threadStates = new ThreadState[0];
   private final HashMap threadBindings = new HashMap();
   private int numWaiting;
   private ThreadState[] waitingThreadStates = new ThreadState[1];

   // Outline only: method bodies are omitted.
   public void resetPostings() throws IOException { /* omitted */ }
   public synchronized boolean allThreadsIdle() { /* omitted */ }
   // Forcefully remove waiting ThreadStates from line
   public void forceRemoveWaiting() { /* omitted */ }
   public ArrayList gatherHasPostings() { /* omitted */ }
   // Clear vectors & fields from
   public void clearVectorsAndFields() throws IOException { /* omitted */ }
   // If any states were waiting on me, sweep through
   // and flush those that are enabled by my write.
   public void sweep() throws IOException { /* omitted */ }
   synchronized ThreadState getThreadState(Document doc, Term delTerm) throws IOException { /* omitted */ }
 }
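
To make the idea concrete, here is a minimal sketch of the allocate/free part only. It is not the DocumentsWriter code: SketchThreadState, ThreadStatePoolSketch, acquire() and release() are hypothetical names, the round-robin sharing rule is an assumption, and the flushing, waiting-queue and postings logic is left out.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for DocumentsWriter.ThreadState; it carries only
// what this sketch needs.
class SketchThreadState {
    final int id;
    boolean isIdle = true;
    SketchThreadState(int id) { this.id = id; }
}

// Hypothetical pool: hands out at most maxThreadStates instances; once the
// cap is reached, additional threads share an existing state.
class ThreadStatePoolSketch {
    private final int maxThreadStates;
    private SketchThreadState[] threadStates = new SketchThreadState[0];
    private final Map<Thread, SketchThreadState> threadBindings =
        new HashMap<Thread, SketchThreadState>();

    ThreadStatePoolSketch(int maxThreadStates) {
        if (maxThreadStates < 1) {
            throw new IllegalArgumentException("maxThreadStates must be >= 1");
        }
        this.maxThreadStates = maxThreadStates;
    }

    // Bind the calling thread to a state (creating one while under the cap),
    // wait until that state is idle, then mark it busy and return it.
    synchronized SketchThreadState acquire() {
        Thread current = Thread.currentThread();
        SketchThreadState state = threadBindings.get(current);
        if (state == null) {
            if (threadStates.length < maxThreadStates) {
                state = new SketchThreadState(threadStates.length);
                SketchThreadState[] grown = new SketchThreadState[threadStates.length + 1];
                System.arraycopy(threadStates, 0, grown, 0, threadStates.length);
                grown[threadStates.length] = state;
                threadStates = grown;
            } else {
                // Assumed sharing rule: pick a state round-robin.
                state = threadStates[threadBindings.size() % threadStates.length];
            }
            threadBindings.put(current, state);
        }
        while (!state.isIdle) {
            try {
                wait();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for a ThreadState", e);
            }
        }
        state.isIdle = false;
        return state;
    }

    // Mark the state idle again and wake any threads waiting to use it.
    synchronized void release(SketchThreadState state) {
        state.isIdle = true;
        notifyAll();
    }

    synchronized boolean allThreadsIdle() {
        for (SketchThreadState s : threadStates) {
            if (!s.isIdle) return false;
        }
        return true;
    }
}
```

In this sketch acquire() and release() play the roles that getThreadState and finishDocument play today, so DocumentsWriter would only call the pool and would no longer touch the bindings map directly.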

Another question: there are lots of similar operations, such as resizing
allFieldDataArray/docFieldDataArray and so on.
Why not add a method like:

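 FieldData[] reSize(FieldData[] orig, int newSize) {
     FieldData[] newArray = new FieldData[newSize];
     System.arraycopy(orig, 0, newArray, 0, Math.min(orig.length, newSize));
     // Java passes references by value, so assigning to the parameter would
     // not change the caller's array; return the new array instead.
     return newArray;
 }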

All I am proposing is meant to make the code easier to read.
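
Since a helper like this has to return the new array (Java passes array references by value), one possible shape is a single generic method. This is only a sketch: ArrayResizeSketch and resize are hypothetical names, and it assumes Java 6 or later for java.util.Arrays.copyOf.

```java
import java.util.Arrays;

// Hypothetical helper class; not part of Lucene.
class ArrayResizeSketch {
    // Returns a copy of 'original' with length newSize. Extra slots are
    // null; if newSize is smaller, the copy is truncated. The caller's
    // array is left untouched.
    static <T> T[] resize(T[] original, int newSize) {
        return Arrays.copyOf(original, newSize);
    }
}
```

A call site would then assign the result back, for example docFieldDataArray = ArrayResizeSketch.resize(docFieldDataArray, newSize);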

> > 2: Why not extract the hash method into something like LuceneHashMap?
> Well the hashing that DocumentsWriter does is fairly specifically
> tailored to what DocumentsWriter needs. It's not a general hash map:
> you can't remove entries; it relies on specific packed block storage
> of the char[] for a term; the internal hash array gets compacted &
> sorted & nulled out in bulk to write a segment; etc. Maybe if we
> factored it out and called it DocumentsWriterHashMap this could work?
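
To illustrate the properties listed above (add-only, packed char[] storage, bulk sort and clear), here is a small sketch of what a factored-out class might look like. It is not the DocumentsWriter hashing code: AppendOnlyTermHashSketch and all of its members are hypothetical, and it sorts by materializing Strings for brevity, which the real code would presumably avoid.

```java
import java.util.Arrays;

// Hypothetical add-only term table; entries can never be removed one by one.
class AppendOnlyTermHashSketch {
    private char[] pool = new char[1024]; // packed characters of all terms
    private int poolUsed = 0;
    private int[] starts = new int[16];   // offset of term i inside pool
    private int[] lengths = new int[16];  // length of term i
    private int[] slots = new int[32];    // open addressing: termId + 1, 0 = empty
    private int count = 0;

    int size() { return count; }

    // Returns the id of the term, adding it if it is not present yet.
    int add(char[] text, int off, int len) {
        int mask = slots.length - 1;
        int pos = hash(text, off, len) & mask;
        while (slots[pos] != 0) {
            int id = slots[pos] - 1;
            if (equalsAt(id, text, off, len)) return id;
            pos = (pos + 1) & mask; // linear probing
        }
        if (poolUsed + len > pool.length) {
            pool = Arrays.copyOf(pool, Math.max(pool.length * 2, poolUsed + len));
        }
        System.arraycopy(text, off, pool, poolUsed, len);
        if (count == starts.length) {
            starts = Arrays.copyOf(starts, count * 2);
            lengths = Arrays.copyOf(lengths, count * 2);
        }
        starts[count] = poolUsed;
        lengths[count] = len;
        poolUsed += len;
        slots[pos] = count + 1;
        count++;
        if (count * 2 > slots.length) rehash(); // keep the table at most half full
        return count - 1;
    }

    // Returns all terms in sorted order and clears the table in bulk.
    String[] drainSorted() {
        String[] terms = new String[count];
        for (int i = 0; i < count; i++) {
            terms[i] = new String(pool, starts[i], lengths[i]);
        }
        Arrays.sort(terms);
        Arrays.fill(slots, 0);
        poolUsed = 0;
        count = 0;
        return terms;
    }

    private void rehash() {
        int[] bigger = new int[slots.length * 2];
        int mask = bigger.length - 1;
        for (int id = 0; id < count; id++) {
            int pos = hash(pool, starts[id], lengths[id]) & mask;
            while (bigger[pos] != 0) pos = (pos + 1) & mask;
            bigger[pos] = id + 1;
        }
        slots = bigger;
    }

    private boolean equalsAt(int id, char[] text, int off, int len) {
        if (lengths[id] != len) return false;
        int start = starts[id];
        for (int i = 0; i < len; i++) {
            if (pool[start + i] != text[off + i]) return false;
        }
        return true;
    }

    private static int hash(char[] text, int off, int len) {
        int h = 0;
        for (int i = off; i < off + len; i++) h = 31 * h + text[i];
        return h;
    }
}
```

The point of the sketch is only that such a class can hide the probing and storage details behind two calls (add and a bulk drain), while staying far narrower than a general-purpose map.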
> > Making the code easier to understand is a good way to attract more
> > people to get involved.
> Absolutely! Could you boil these ideas down into a patch?
> Simplifying DocumentsWriter (without losing too much performance)
> would be awesome.
> Mike
