lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Brian Cuttler <br...@wadsworth.org>
Subject Re: lucene question, examples
Date Fri, 04 Mar 2005 16:38:31 GMT
Otis,

On Fri, Mar 04, 2005 at 11:31:22AM -0500, Brian Cuttler wrote:
> 
> Otis,
> 
> > If by shtml you mean HTML with server-side includes, then note that you
> > will not be able to do this with Lucene alone, as server-side includes
> > are not static.
> 
> My understanding is that our webmaster is hoping to have very simple
> files created by the data-suppliers, ie, depatmental people that generate
> web pages. Then use server-side-includes to include site or departement
> comment headers/footers/backgrounds, so that ultimately it will be
> the .shtml with the interesting data.

Still not sure I used the right vocabulary to express myself.

I don't believe that there is much in the included files that will
need to be indexed, that will be common header/footer data. It is
the static information in the unprocessed shtml that I believe will
be the data people wanting to search the sites will want to find.

> Is this something that in your experience works well or is there some
> other methology I can recommend to her ?
> 
> 						thank you,
> 
> 						Brian
> 
> On Thu, Mar 03, 2005 at 02:40:41PM -0800, Otis Gospodnetic wrote:
> > Brian,
> > 
> > It sounds like you are using a little demo application that comes with
> > Lucene.  This is really just a demo that shows how you can use Lucene. 
> > Lucene in a toolkit for building search applications, so you would
> > really want to develop something of your own around Lucene.  Sure, you
> > can use a demo, but that little demo is not perfect.  Since v 1.2 there
> > have been some changes in the demo area, so you could try grabbing the
> > latest version of Lucene and trying its demo (same application, it's
> > just that it may be a bit better).  If that fails, you can try the file
> > indexing framework from Lucene in Action (c.f.
> > http://www.lucenebook.com ) - source code is freely available.
> > 
> > If by shtml you mean HTML with server-side includes, then note that you
> > will not be able to do this with Lucene alone, as server-side includes
> > are not static.
> > 
> > Otis
> > 
> > 
> > --- Brian Cuttler <brian@wadsworth.org> wrote:
> > 
> > > I've sorry if this is the wrong forum, I was trying for lucene-user
> > > and been unable to subscribe (but seem to see lucene items here).
> > > 
> > > We have been runing apache on our internal sites for a while, with
> > > tomcat and lucene. Plugged in the demo index build and search 
> > > features... and for a long time life has been good.
> > > 
> > > Now we are looking to implement apache on our external site with
> > > tomcat and lucene.
> > > 
> > > System is Solaris 9
> > > Apache/1.3.29 (Unix) ApacheJserv/1.1.2 mod_perl/1.25 configured
> > > Apache, with Tomcat included, from Solaris freeware site.
> > > 
> > > I didn't see Lucene on the Sun FW site so just replicated the
> > > installation
> > > from the internal to the (future) external website.
> > > 
> > > Lucene is currently v 1.2 (at least that is the version number of the
> > > demo package).
> > > 
> > > The index we are building (org.apache.lucene.demo.IndexHTML) seems to
> > > capture the tags from the "ALT" text, where really we need it to pick
> > > up not image texts but content and keyword fields, or perhaps even
> > > plain
> > > text that is outside of the ALT tags.
> > > 
> > > We also suspect that we are not picking up all documents, ie, not
> > > both
> > > html, htm. We'd like to extend the range of documents we index, soon
> > > to include shtml unless I'm mistaken.
> > > 
> > > I strongly suspect that newer demos might already do this, or that
> > > with
> > > some basic instruction I could modify the document extentions if not
> > > the
> > > (I suspect rather complex) target strings.
> > > 
> > > Unfortunately while the implement demo docs are great, I've so far
> > > not
> > > found (or simply not understood) the docs that might give the options
> > > we are hoping to implement.
> > > 
> > > 						Thank you in advance,
> > > 
> > > 						Brian
> > > 
> > > ---
> > >    Brian R Cuttler                 brian.cuttler@wadsworth.org
> > >    Computer Systems Support        (v) 518 486-1697
> > >    Wadsworth Center                (f) 518 473-6384
> > >    NYS Department of Health        Help Desk 518 473-0773
> > > 
> > > 
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > For additional commands, e-mail: java-user-help@lucene.apache.org
> > > 
> > > 
> > 
> > 
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> > 
> ---
>    Brian R Cuttler                 brian.cuttler@wadsworth.org
>    Computer Systems Support        (v) 518 486-1697
>    Wadsworth Center                (f) 518 473-6384
>    NYS Department of Health        Help Desk 518 473-0773
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
---
   Brian R Cuttler                 brian.cuttler@wadsworth.org
   Computer Systems Support        (v) 518 486-1697
   Wadsworth Center                (f) 518 473-6384
   NYS Department of Health        Help Desk 518 473-0773


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message