lucene-dev mailing list archives

From hg...@cswebmail.com
Subject Re: Search Expansion - more
Date Mon, 05 Apr 2004 18:53:53 GMT
Hi Erik,

I am frustrated that I have not managed to explain the
problem clearly to you, and I am really desperate for
help now as well.
Creating a sample application would be possible (and
is the next step). I call Lucene as a web service (I
could, however, try to wrap the WS function with a
main() and create an application for you to run from
the command line).

However please allow me once again to try to explain:

I have lots of small xml files that I want to show only
depending on whether their <subject> tag contains
certain keywords / keyphrases.

They have been indexed using StandardAnalyzer.

As the search criterion I pass in terms from a domain
ontology to see which XML files match these terms
within <subject>.

I started using QueryParser:
    Query query = QueryParser.parse(line, "name", analyzer);
where 'line' was simply a whitespace-delimited line of
concepts.

This worked fine; I could even search for keyphrases by
linking the words with an underscore, e.g. host_defense.

It did produce an error, however, if the user chose a
very high concept level in the domain ontology, which
resulted in more than 200 terms being put into the
query string.

As you pointed out, the limitation was obviously in
QueryParser (which I could reproduce), so you suggested
bypassing QueryParser by constructing a boolean query
from TermQuery clauses.

This worked and could take more than 800 (!) terms
without errors (could not test more) but because of
using TermQuery I lost the functionality to search for
phrases, e.g. 'host defense'.

After your last response the only question that remains
for me is the syntax for adding a PhraseQuery on the
field <subject>. I could not make sense of the sparse
description in the API docs for that.

Why am I using the array myquery[]? Well, it is simply
what passes the massive number of query terms on to the
web service. I thought that by using a string array I
could keep each search term intact, especially when it
represents a phrase and not a single term, e.g.
myquery[n]="host defense"

I would need something that recognises whether the
entry in myquery[n] is a single term (then adding it to
the boolean query with TermQuery as usual) OR whether
it is a phrase (then adding it with PhraseQuery, for
which I do not know the syntax).
Maybe PhraseQuery can also handle single terms; then I
would only need that.
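
To make the single-term-versus-phrase decision concrete,
here is a minimal sketch in plain Java with no Lucene
dependency. The class and method names are hypothetical.
It assumes that words inside an entry are separated by
whitespace or underscores, and that index terms are
lower-case because StandardAnalyzer lower-cases tokens
(TermQuery and PhraseQuery do not run the analyzer). The
comments mention the Lucene calls PhraseQuery.add(Term)
and TermQuery(Term) only to show where they would go;
the field name "subject" is taken from the description
above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper (not part of Lucene): decides for each entry of
// myquery[] whether it would become a TermQuery or a PhraseQuery.
class QueryEntryClassifier {

    // Splits one entry into lower-cased words.
    // Assumption: whitespace and underscores separate the words.
    static List<String> words(String entry) {
        List<String> result = new ArrayList<String>();
        for (String w : entry.trim().split("[\\s_]+")) {
            if (!w.isEmpty()) {
                result.add(w.toLowerCase(Locale.ROOT));
            }
        }
        return result;
    }

    // An entry counts as a phrase when it contains more than one word.
    static boolean isPhrase(String entry) {
        return words(entry).size() > 1;
    }

    public static void main(String[] args) {
        String[] myquery = { "host defense", "antigen", "immune_response" };
        for (String entry : myquery) {
            List<String> w = words(entry);
            if (w.size() > 1) {
                // With Lucene: create a PhraseQuery and call
                // add(new Term("subject", word)) once per word, in order.
                System.out.println("PhraseQuery on subject: " + w);
            } else if (w.size() == 1) {
                // With Lucene: new TermQuery(new Term("subject", w.get(0))).
                System.out.println("TermQuery on subject: " + w.get(0));
            }
        }
    }
}
```

Either kind of query would then be added as one clause of
the boolean query, so the array keeps one entry per
concept, whether it is a word or a phrase.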

Thanks for your help, Erik

-Holger
