lucene-dev mailing list archives

From "Andrew C. Oliver" <>
Subject RE: Patches and samples
Date Sat, 19 Jan 2002 23:36:09 GMT
> The contribution process for Lucene is not very mature.  I am the lead
> developer, but I have not had the time recently to work on Lucene.  I have
> also not been involved with other Apache projects, and hence am not that
> familiar with the processes.
> I think your patches look good and that they should be integrated--I'm "+1"
> in Apache-speak.  However I may not get to applying them right away.

No problem. I didn't mean to imply anything by my query; I was just
pinging. :-)

> > One thing I was about to work on was a few slightly more universal web
> > examples as well as an ant build target for them.  I'm also working on a
> > short tutorial and walk-through for the demos.  However, I noticed that
> > there have already been submissions for some of these such as a JSP
> > version of the demos on the user list.  Is there some reason these were
> > not included/added?  I'd not like to duplicate someone else's mistake.
> The only reason that these have not been added is that I have not had a
> chance to inspect and test them before integrating them.  However I think
> improving this sort of stuff for Lucene is of vital importance--improving
> the initial experience.
Great.  I just didn't want to work on something that was unwelcome and
had no chance of being accepted.  The current development cycle for POI
is very pressing.

> > 2. create a "getting started" document in xdocs for how one builds and
> > installs Lucene and the demos.
> That would be great.

Great, I'm nearly done.

> > 3. create a template web app and ant target including war file
> > deployable in Tomcat.
> That would be marvelous.
> > 4. investigate the issue of file handles as mentioned earlier.
> The file handle issue is basically this: all active index files that are not
> entirely read into memory must be kept open in case another thread or
> process removes them while updating the index.  A single handle is kept for
> each file per IndexReader.  The number of files is proportional to the
> number of segments, which, worst-case, is b*log-base-b(doc-count), where b
> is IndexWriter.mergeFactor, 10 by default, and doc-count is the number of
> documents in the index.  There are a few files per segment which must be
> kept open.  So with a million documents, the maximum number of files is
> around 200.  If you increase IndexWriter.mergeFactor this will drop.  An
> optimized index contains only a single segment and thus requires only a few
> files open.

Okay, that makes sense.  I misunderstood an earlier email about this.  (I
was thinking of a Solaris 8 box where serious problems happen above 8k
file handles and, worse, when WebSphere for instance is installed, nearly
all of them are used.)
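
For anyone who wants to plug their own numbers into the estimate quoted
above, here is a minimal sketch of the arithmetic.  It is not part of the
Lucene API: the class and method names are made up for illustration, and
"a few files per segment" is assumed to mean 3, which is only a guess.

```java
/**
 * Back-of-the-envelope estimate of open files per IndexReader, using the
 * worst-case bound quoted above: segments <= b * log-base-b(doc-count).
 * Illustration only; none of these names exist in Lucene.
 */
public class FileHandleEstimate {

    /** Smallest L such that mergeFactor^L >= docCount (integer ceiling of log-base-b). */
    public static int mergeLevels(int mergeFactor, long docCount) {
        if (mergeFactor < 2 || docCount < 1) {
            throw new IllegalArgumentException("need mergeFactor >= 2 and docCount >= 1");
        }
        int levels = 0;
        long capacity = 1;
        // Integer loop instead of Math.log to avoid floating-point rounding.
        // (Not overflow-safe for absurdly large docCount; fine for a sketch.)
        while (capacity < docCount) {
            capacity *= mergeFactor;
            levels++;
        }
        return levels;
    }

    /** Worst-case segment count: b * log-base-b(doc-count), at least 1. */
    public static long worstCaseSegments(int mergeFactor, long docCount) {
        return Math.max(1L, (long) mergeFactor * mergeLevels(mergeFactor, docCount));
    }

    /** Worst-case open files, given an ASSUMED number of files kept open per segment. */
    public static long worstCaseOpenFiles(int mergeFactor, long docCount, int filesPerSegment) {
        return worstCaseSegments(mergeFactor, docCount) * filesPerSegment;
    }

    public static void main(String[] args) {
        int mergeFactor = 10;        // the default mentioned above
        long docCount = 1_000_000L;  // the example size mentioned above
        int filesPerSegment = 3;     // assumption: "a few files per segment"

        System.out.println("segments:   " + worstCaseSegments(mergeFactor, docCount));
        System.out.println("open files: " + worstCaseOpenFiles(mergeFactor, docCount, filesPerSegment));
    }
}
```

With the default mergeFactor of 10 and a million documents this gives 60
segments and, under the 3-files-per-segment assumption, 180 open files,
which is in line with the "around 200" figure.  An optimized index
corresponds to a single segment, so only filesPerSegment files.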

> > 5. create a set of interfaces a/o classes for attaching other document
> > filters.  
> Sounds cool.

Great.  I should have something to look at on all of these by the end of
next week.  Thanks for taking the time to respond.  BTW, I'm sorry if my
query came off wrong, I'm extra-multi-tasked at the moment :-).



> Doug
-- - port of Excel format to java 
			- fix java generics!

The avalanche has already started. It is too late for the pebbles to
vote.
-Ambassador Kosh
