lucene-dev mailing list archives

From "Erik Hatcher" <>
Subject [SUBMIT] docweb demo app
Date Mon, 11 Feb 2002 04:32:12 GMT

Attached is the recently discussed Ant <index> task code and the patches
necessary to build a web application containing an index of Lucene's
documentation (the docs in CVS and the generated Javadocs).

It's certainly not perfect, and no doubt is in need of refactoring, but if I
waited until it was completely polished it'd never see the light of day!

I patched build.xml to reuse the existing demo web application so that I
could get this developed quickly.  My actual application uses
Struts for the web app, and that is of course a bit too heavy to toss into
this demo, especially the first pass.

One change to configuration.jsp is required (and since this would typically
be changed in the other demo web app, perhaps it can be modified to this in
CVS or cloned for this demo app, or use a filtered <copy> with a token in
    String indexLocation =

This is a proof-of-concept, and the build and web app will need some
polishing.  But it works "out of the box" with the patches applied (see the
docweb-patches.txt file in the .zip attached).

To build, simply run the docweb-war target:

    ant docweb-war

It churns for a bit on indexing the documentation (but not long), and there
will then be a bin/docweb/lucenedocweb.war file (mine was about 2MB in size
because of the embedded index).

Here are some known issues (the ones I can think of off the top of my head):

- It hooks into the existing demo web app pages, so searching works, but the
results page is broken because it's not adding the same fields the original
demo expects.  Again, this is proof-of-concept that a web app demo can be
delivered that is fully functional with an embedded index and does something
useful with Lucene's docs.

- The <index> task does some dependency checking, but after seeing the 'uid'
features in the Lucene demo code more closely I can see that it needs to be
refactored to take advantage of this kind of thing.

- How should file content be handled?  Embedded in the index as
"rawcontents"?  Copied to a directory inside the WAR statically?  It's
currently being embedded, which has its share of issues, as would copying

- I'm still a "newbie" to Lucene's API.  I built this thing with very little
effort (thanks lucene-dev!) several months ago, and haven't really touched
it since, as it's not the direct focus of my current work.  It has simply
worked solidly since built, so I haven't had a need to dig into it more to
clean it up or tweak it.  There is of course much work to be done to have a
really solid <index> task, but I feel this is a good start.
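
To make the 'uid' point above concrete, here is a minimal sketch of what
uid-style dependency checking might look like.  It assumes the uid encodes the
file path plus its last-modified time (my reading of the demo code, not a
verified description of it).  The class and method names are hypothetical and
are not part of Lucene or of the attached IndexTask, and the index is modeled
as a plain map so the example stands alone.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not Lucene's API and not IndexTask.
class UidSketch {

    // What the indexer should do with a given file.
    enum Action { ADD, SKIP, REINDEX }

    // Assumption: a uid combines path and last-modified time, so any change
    // to the file's timestamp produces a different uid.
    static String uid(String path, long lastModified) {
        return path + '\u0000' + Long.toString(lastModified, Character.MAX_RADIX);
    }

    // The "index" is modeled as a map from path to the uid stored at index time.
    static Action decide(Map<String, String> indexedUids, String path, long lastModified) {
        String stored = indexedUids.get(path);
        if (stored == null) {
            return Action.ADD;          // never indexed before
        }
        if (stored.equals(uid(path, lastModified))) {
            return Action.SKIP;         // unchanged since it was indexed
        }
        return Action.REINDEX;          // timestamp differs: delete and re-add
    }

    public static void main(String[] args) {
        Map<String, String> indexed = new HashMap<>();
        indexed.put("docs/index.html", uid("docs/index.html", 1000L));

        System.out.println(decide(indexed, "docs/index.html", 1000L)); // unchanged file
        System.out.println(decide(indexed, "docs/index.html", 2000L)); // modified file
        System.out.println(decide(indexed, "docs/new.html", 1000L));   // new file
    }
}
```

In a real task the map lookup would be replaced by a query against the uid
field of the existing index, and REINDEX would mean deleting the stale
document before adding the fresh one.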

Let me know if there are any questions or problems with incorporating this.
And like I mentioned before, I'm looking for expert eyes on the IndexTask code
to make it the best it can be.  Folks that contribute to it will earn
acknowledgements in my upcoming Ant book for sure (if you desire; I certainly
wouldn't print anyone's name without permission).

Think about Gump or some other process indexing all of Jakarta's docs
(perhaps during release builds only?) with fields for product, version, etc.
Wow!  There is already a Jakarta search engine, but I'm sure it doesn't hold
a candle to Lucene's capabilities. I feel that building an index of static
content during an Ant build is an important capability for Lucene to have -
docs are typically static until the next release, so it makes sense to index
them at build time, right?

Comments, suggestions, criticisms, etc. are all welcome.

