forrest-dev mailing list archives

From Juan Jose Pablos <che...@che-che.com>
Subject Re: about lucent and exist
Date Fri, 12 Sep 2003 00:36:12 GMT
Ramon Prades wrote:
> 
>>Which make me realize that lucene is a *text search engine*.
> 
> 
> That's the main advantage about lucene: it's language independent. In fact,
> Forrest isn't concerned at all about the input documents: you have to write
> an indexer for each format you want to use, i.e. if you want to search in
> Microsoft Word documents, you have to write a class to open and process
> them.
> 

I am not worried about fixing just one issue. Being XML aware means that 
you can do a:

(after using forms to create this XPath query)
//faqs/part[@id='general']/faq/question[contains(.,'xsl')]

So you would search for "xsl" within a collection of FAQ XML documents 
that have a faq part called 'general'.
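
A minimal sketch of what such a query selects, written in Python. The sample
document, its element names, and the id attribute on part are assumptions
made for illustration. The standard library's ElementTree does not implement
contains(), so the substring test is done in plain Python here; a full XPath
engine such as the one in eXist would evaluate the whole expression itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical FAQ document; element and attribute names are assumptions.
FAQ_XML = """<faqs>
  <part id="general">
    <faq><question>How do I change the xsl stylesheet?</question></faq>
    <faq><question>Where are the docs?</question></faq>
  </part>
  <part id="install">
    <faq><question>Which xsl processor is required?</question></faq>
  </part>
</faqs>"""


def search_questions(xml_text, part_id, term):
    """Return the text of questions in the given part that contain term."""
    root = ET.fromstring(xml_text)  # root is the <faqs> element
    # ElementTree understands the [@attr='value'] predicate but not
    # contains(), so the path narrows by part and Python filters by term.
    questions = root.findall(f"part[@id='{part_id}']/faq/question")
    return [
        "".join(q.itertext())
        for q in questions
        if term in "".join(q.itertext())
    ]


matches = search_questions(FAQ_XML, "general", "xsl")
print(matches)
```

Only the question inside the 'general' part is returned; the "xsl" question
in the other part is skipped because the path restricts the search first.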

I am not sure how difficult it is to get there with Lucene, but eXist seems 
to support it already.

> You can do the same with Lucene, it's all down to the Indexer. In mine, I
> index forrest documents by mixing all the text. This is because I don't
> think queries like "p:lucene" (read: "search all docs with the word 'lucene'
> inside a 'p' tag") are a good idea (especially for non-programmers).

I do not think that users should deal with that; for them the query 
language is hidden.

> 
> Having said that, I think certain tags with a very strong meaning can be
> used. For example "authors" and "title" (both working in my code): this can
> be useful, especially if we have radio buttons for "search in authors only"
> and "search in title only".

Semantic searching (I thought about something similar before I knew 
the name) is about using tags to limit the search and get better results.
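
To make the idea concrete, here is a toy sketch in Python of an index that
keeps each word under the field (tag) it came from, plus a mixed "all" field.
This is not Lucene's API and not Ramon's indexer; the class name, the field
names, and the sample documents are all hypothetical.

```python
from collections import defaultdict


class FieldIndex:
    """Toy inverted index keyed by (field, word); not Lucene's API."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, doc_id, fields):
        """Index a document given as {field_name: text}."""
        for field, text in fields.items():
            for word in text.lower().split():
                self._postings[(field, word)].add(doc_id)
                # "all" mixes every field, like indexing the whole text.
                self._postings[("all", word)].add(doc_id)

    def search(self, word, field="all"):
        """Return sorted ids of documents whose field contains word."""
        return sorted(self._postings.get((field, word.lower()), set()))


# Hypothetical sample documents.
index = FieldIndex()
index.add("faq.xml", {
    "title": "Frequently Asked Questions",
    "authors": "Alice",
    "body": "Lucene indexes plain text",
})
index.add("search.xml", {
    "title": "Searching with Lucene",
    "authors": "Bob",
    "body": "How the search form works",
})

everywhere = index.search("lucene")
title_only = index.search("lucene", field="title")
by_author = index.search("alice", field="authors")
print(everywhere, title_only, by_author)
```

A "search in title only" radio button would simply pick the field argument,
so the user never sees a query language.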

> 
> I wanted to do all this a few weeks ago, but I've been awfully busy (who
> isn't?). I plan to start again in 2 or 3 weeks.
> 

I will help you as I promised; I got that bug assigned to me. Using 
Lucene within Forrest and having eXist support are compatible tasks, and you 
have the first one almost done. Spain..go..go..go!

Cheers,
Cheche




