lucene-dev mailing list archives

From Leon <lucen...@yahoo.cn>
Subject Re: DocumentsWriter questions
Date Sat, 03 Nov 2007 13:44:00 GMT
> "Leon" wrote:
> 
> > 1: Why not extract the code of ThreadState management to a new
> > internal class such as ThreadStatePool.
> >
> > At present, ThreadState management code is scattered
> > throughout DocumentsWriter.
> 
> I'm not sure what you mean by "ThreadState management"? 
> Currently there are two methods that manage allocating
> (getThreadState) and freeing (finishDocument) a ThreadState for the
> processing of one document. Are you proposing making a new class that
> would do what these two methods do now?
  
Yes, I am proposing a new class to manage these ThreadStates. In my local workspace I have
already extracted that code into a class called ThreadStatePool, which looks something like:

 private class ThreadStatePool {
   // Max # ThreadState instances; if there are more threads
   // than this they share ThreadStates
   private final static int MAX_THREAD_STATE = 5;
   private ThreadState[] threadStates = new ThreadState[0];
   private final HashMap threadBindings = new HashMap();
   private int numWaiting;
   private ThreadState[] waitingThreadStates = new ThreadState[1];

   // (method bodies omitted)
   public void resetPostings() throws IOException {}
   public synchronized boolean allThreadsIdle() {}
   // Forcefully remove waiting ThreadStates from line
   public void forceRemoveWaiting() {}
   public ArrayList gatherHasPostings() {}
   // Clear vectors & fields from ThreadStates
   public void clearVectorsAndFields() throws IOException {}
   // If any states were waiting on me, sweep through
   // and flush those that are enabled by my write.
   public void sweep() throws IOException {}
   synchronized ThreadState getThreadState(Document doc, Term delTerm) throws IOException {}
 }
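
To make the sharing rule in the comment above concrete, here is a minimal, self-contained sketch of how such a pool could bind threads to states. It is not the DocumentsWriter code: SketchThreadState, SketchThreadStatePool and the "pick the least-loaded state" rule are assumptions for illustration, and unlike the real getThreadState it never blocks waiting for a busy state.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for DocumentsWriter.ThreadState; the real class holds far more.
class SketchThreadState {
    final int id;
    int numThreads;          // how many threads are bound to this state
    boolean isIdle = true;   // false while a document is being processed

    SketchThreadState(int id) { this.id = id; }
}

// Hypothetical pool: each thread gets its own state until the cap is reached,
// after which new threads share an existing state.
class SketchThreadStatePool {
    private final int maxThreadStates;
    private final List<SketchThreadState> threadStates = new ArrayList<SketchThreadState>();
    private final Map<Thread, SketchThreadState> threadBindings =
        new HashMap<Thread, SketchThreadState>();

    SketchThreadStatePool(int maxThreadStates) { this.maxThreadStates = maxThreadStates; }

    // Returns the state bound to the given thread, binding one first if needed.
    synchronized SketchThreadState getThreadState(Thread thread) {
        SketchThreadState state = threadBindings.get(thread);
        if (state == null) {
            if (threadStates.size() < maxThreadStates) {
                state = new SketchThreadState(threadStates.size());
                threadStates.add(state);
            } else {
                // Cap reached: share the state with the fewest bound threads (assumed policy).
                state = threadStates.get(0);
                for (SketchThreadState candidate : threadStates) {
                    if (candidate.numThreads < state.numThreads) state = candidate;
                }
            }
            state.numThreads++;
            threadBindings.put(thread, state);
        }
        state.isIdle = false;
        return state;
    }

    // Marks the state as free again once its document is done.
    synchronized void finishDocument(SketchThreadState state) { state.isIdle = true; }

    synchronized boolean allThreadsIdle() {
        for (SketchThreadState state : threadStates) {
            if (!state.isIdle) return false;
        }
        return true;
    }

    synchronized int size() { return threadStates.size(); }
}
```

The point is only that allocation, binding and idle tracking can live behind one small interface instead of being spread through DocumentsWriter.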

Another question: there are many similar operations, such as resizing allFieldDataArray,
docFieldDataArray and so on.
Why not factor them into a method like:

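 FieldData[] reSize(FieldData[] orig, int newSize) {
     FieldData[] newArray = new FieldData[newSize];
     // Copy only what fits, so shrinking is safe too
     System.arraycopy(orig, 0, newArray, 0, Math.min(orig.length, newSize));
     // Return the new array: assigning to the parameter would not change the caller's reference
     return newArray;
 }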
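
One thing to watch with a helper like this: Java passes array references by value, so the caller only sees the resized array if the helper returns it. A small standalone sketch of the difference follows. ResizeSketch and its method names are made up for illustration, and Arrays.copyOf needs Java 6, so on an older JDK the copy would have to be done with System.arraycopy instead.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of Lucene.
class ResizeSketch {
    // Has no visible effect: this only rebinds the local parameter.
    static void resizeByAssignment(String[] array, int newSize) {
        array = new String[newSize];
    }

    // Returns a copy of the given length, truncated or padded with nulls.
    static <T> T[] resize(T[] array, int newSize) {
        return Arrays.copyOf(array, newSize);
    }
}
```

A caller would then write something like `fields = ResizeSketch.resize(fields, fields.length * 2);` at each place that currently grows an array by hand.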

Everything I am proposing is only meant to make the code easier to read.


> > 2: Why not extract the hash method to something like LuceneHashMap 
> 
> Well the hashing that DocumentsWriter does is fairly specifically
> tailored to what DocumentsWriter needs. It's not a general hash map:
> you can't remove entries; it relies on specific packed block storage
> of the char[] for a term; the internal hash array gets compacted &
> sorted & nulled out in bulk to write a segment; etc. Maybe if we
> factored it out and called it DocumentsWriterHashMap this could work?
> 
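
To check that I understand those constraints, here is a rough standalone sketch of an add-only term hash: entries cannot be removed, all term text is packed into one shared char[], and the table is read out sorted and cleared in bulk. Everything in it (SketchTermHash, linear probing, the flush method) is my own assumption for illustration and is much simpler than what DocumentsWriter actually does.

```java
import java.util.Arrays;

// Hypothetical sketch, not Lucene code.
class SketchTermHash {
    private char[] pool = new char[64];  // text of all terms, packed end to end
    private int poolUpto;                // next free position in pool
    private int[] starts = new int[8];   // per term ID: offset of its text in pool
    private int[] lengths = new int[8];  // per term ID: length of its text
    private int[] counts = new int[8];   // per term ID: number of occurrences
    private int numTerms;
    private int[] table = new int[16];   // open addressing; holds term ID + 1, 0 means empty

    // Records one occurrence of term and returns its term ID. There is no remove().
    int add(String term) {
        int mask = table.length - 1;
        int slot = term.hashCode() & mask;
        while (table[slot] != 0) {
            int id = table[slot] - 1;
            if (matches(id, term)) { counts[id]++; return id; }
            slot = (slot + 1) & mask;    // linear probing
        }
        if (numTerms == starts.length) {
            starts = Arrays.copyOf(starts, numTerms * 2);
            lengths = Arrays.copyOf(lengths, numTerms * 2);
            counts = Arrays.copyOf(counts, numTerms * 2);
        }
        while (poolUpto + term.length() > pool.length) {
            pool = Arrays.copyOf(pool, pool.length * 2);
        }
        term.getChars(0, term.length(), pool, poolUpto);
        int id = numTerms++;
        starts[id] = poolUpto;
        lengths[id] = term.length();
        counts[id] = 1;
        poolUpto += term.length();
        table[slot] = id + 1;
        if (numTerms * 2 > table.length) rehash();  // keep the table at most half full
        return id;
    }

    int count(int id) { return counts[id]; }

    int size() { return numTerms; }

    // Returns all terms sorted, then clears the whole table in bulk.
    String[] flush() {
        String[] terms = new String[numTerms];
        for (int id = 0; id < numTerms; id++) terms[id] = text(id);
        Arrays.sort(terms);
        Arrays.fill(table, 0);
        numTerms = 0;
        poolUpto = 0;
        return terms;
    }

    private String text(int id) { return new String(pool, starts[id], lengths[id]); }

    private boolean matches(int id, String term) {
        if (lengths[id] != term.length()) return false;
        for (int i = 0; i < term.length(); i++) {
            if (pool[starts[id] + i] != term.charAt(i)) return false;
        }
        return true;
    }

    private void rehash() {
        int[] newTable = new int[table.length * 2];
        int mask = newTable.length - 1;
        for (int id = 0; id < numTerms; id++) {
            int slot = text(id).hashCode() & mask;
            while (newTable[slot] != 0) slot = (slot + 1) & mask;
            newTable[slot] = id + 1;
        }
        table = newTable;
    }
}
```

If this is roughly the right shape, then a separate class named DocumentsWriterHashMap seems workable to me, since nothing here needs to behave like a general-purpose map.
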
> > Making the code easier to understand is a good way to attract more
> > people to get involved.
> 
> Absolutely! Could you boil these ideas down into a patch?
> Simplifying DocumentsWriter (without losing too much performance)
> would be awesome.
> 
> Mike
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail:
> java-dev-help@lucene.apache.org
>

