lucene-java-user mailing list archives

From AmitShu...@Freightliner.com
Subject RE: Question regarding using Lucene or not
Date Mon, 04 Oct 2004 20:22:16 GMT
Thanks, Daniel.
Can you tell me two more things?
1. How difficult is it to implement our own Similarity class that can do the
things we want?
2. If more than one field is a percentage match like HP, can we also specify
which field gets preference during the search?
For example, in the search, the model "has to be" Cargo, the HP value should
be 55,000 or near it (tolerance of 5,000), and the GVWR value should be 10,000
or near it (tolerance of 1,000). Also, GVWR gets preference over HP. So if one
file contains
	Cargo, HP=54,000 and GVWR=9,800
and a second file contains
	Cargo, HP=55,000 and GVWR=9,200
then the first file should get a better rating, even though the second one
matches HP exactly, because GVWR carries more weight than HP.
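
To make the desired ranking concrete, here is a minimal sketch of the arithmetic in plain Java. It is not Lucene code and not a Similarity implementation. The class name, the linear fall-off, and the weights (2.0 for GVWR, 1.0 for HP) are assumptions chosen only for illustration; the targets and tolerances are the ones from the example above.

```java
// Hypothetical helper, not part of Lucene: scores a record by how close
// its numeric fields are to the requested targets.
class ProximityScore {
    // Targets and tolerances taken from the example in this message.
    static final double HP_TARGET = 55000, HP_TOLERANCE = 5000;
    static final double GVWR_TARGET = 10000, GVWR_TOLERANCE = 1000;
    // Assumed weights: GVWR counts twice as much as HP.
    static final double GVWR_WEIGHT = 2.0, HP_WEIGHT = 1.0;

    // Closeness in [0, 1]: 1.0 at the target, falling linearly to 0.0
    // at the edge of the tolerance band and staying 0.0 beyond it.
    static double closeness(double value, double target, double tolerance) {
        double distance = Math.abs(value - target);
        return distance >= tolerance ? 0.0 : 1.0 - distance / tolerance;
    }

    // Weighted sum of the per-field closeness values.
    static double score(double hp, double gvwr) {
        return GVWR_WEIGHT * closeness(gvwr, GVWR_TARGET, GVWR_TOLERANCE)
             + HP_WEIGHT * closeness(hp, HP_TARGET, HP_TOLERANCE);
    }

    public static void main(String[] args) {
        System.out.println("first file:  " + score(54000, 9800));
        System.out.println("second file: " + score(55000, 9200));
    }
}
```

Under these assumed weights the first file scores roughly 2.4 and the second roughly 1.4, so the first file ranks higher even though the second matches HP exactly.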

Thanks in advance.

-----Original Message-----
From: Daniel Naber [mailto:daniel.naber@t-online.de] 
Sent: Saturday, October 02, 2004 6:37 AM
To: Lucene Users List
Subject: Re: Question regarding using Lucene or not


On Saturday 02 October 2004 02:06, AmitShukla@Freightliner.com wrote:

> The parameters are both string and numeric. For example, the model 
> should be Cargo and its HP value should be 55,000 or near it. If we 
> specify a tolerance value of 5000 then it should search for all the data 
> files where the model node is Cargo (definitive match) and the HP value is 
> between 50,000 and 60,000, with the one having 55,000 coming as the 100% 
> match.

That's possible with Lucene. You'll need to parse the XML files and put the 
required data into the Lucene index. Then you can search with a query like 
this:

+model:cargo^0 +hp:[50000 TO 60000] hp:55000^10

This will match all documents which contain "cargo" in the model field and a 
value of 50000 to 60000 in the hp field. Matches with hp 55000 will be 
boosted so they appear on top. However, matches from 50000 to 54999 and from 
55001 to 60000 will all have the same ranking. To change that you will need 
to implement your own variation of Lucene's Similarity class.
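
As a rough illustration of how a query string of this shape could be assembled for more than one numeric field, here is a minimal Java sketch. It only builds the query text and does not call Lucene. The class and method names, the gvwr field name, the boost values, and the padding width are all assumptions made for the example. Values are zero-padded on the assumption that range queries compare terms as strings, in which case the indexed values would need the same padding.

```java
import java.util.Locale;

// Hypothetical helper, not part of Lucene: builds query text only.
class ToleranceQuery {
    // Zero-pad so that string order matches numeric order,
    // e.g. "009000" sorts before "011000".
    static String pad(long value, int width) {
        return String.format(Locale.ROOT, "%0" + width + "d", value);
    }

    // One required range clause plus one optional, boosted exact-match
    // clause for a numeric field.
    static String numericClause(String field, long target, long tolerance,
                                int width, int boost) {
        String low = pad(target - tolerance, width);
        String high = pad(target + tolerance, width);
        return "+" + field + ":[" + low + " TO " + high + "] "
             + field + ":" + pad(target, width) + "^" + boost;
    }

    public static void main(String[] args) {
        // Assumed boosts: gvwr gets a larger boost than hp so that an
        // exact gvwr match counts for more than an exact hp match.
        String query = "+model:cargo^0 "
            + numericClause("gvwr", 10000, 1000, 6, 20) + " "
            + numericClause("hp", 55000, 5000, 6, 10);
        System.out.println(query);
    }
}
```

Boosts set this way only prefer exact matches on one field over exact matches on another. Near misses inside each range still rank the same, which is the limitation described above.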

Regards
 Daniel

-- 
http://www.danielnaber.de

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org



