lucene-java-user mailing list archives

From Adrien Grand <jpou...@gmail.com>
Subject Re: Index-boosting not working in 5.2.1?
Date Wed, 01 Jul 2015 22:31:51 GMT
Hi Markus,

Tetra* will be parsed as a PrefixQuery, which returns a constant score
by default. You can enable scoring by calling
MultiTermQuery.setRewriteMethod(SCORING_BOOLEAN_REWRITE) on it[1].
However, this will have the downside that not all possible terms
matching this prefix will be considered.
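
To illustrate the difference between the two behaviours, here is a small
toy model in plain Java. It does not use Lucene and it is not Lucene's
scoring formula: the Doc class and the constantScore and boostedScore
helpers are hypothetical names made up for this sketch. It only shows why
a constant-score prefix match gives every hit the same score, while a
scoring rewrite lets the index-time boost separate the hits. In Lucene 5.x
itself, the switch is the rewrite method named above; when the query comes
from a QueryParser, it can presumably be set on the parser with
setMultiTermRewriteMethod before calling parse.

```java
import java.util.Arrays;
import java.util.List;

public class PrefixScoringSketch {

    /** Hypothetical stand-in for one indexed TERM value and its index-time boost. */
    static final class Doc {
        final String id;
        final String term;
        final float boost;

        Doc(String id, String term, float boost) {
            this.id = id;
            this.term = term;
            this.boost = boost;
        }
    }

    /** Constant-score behaviour: every document matching the prefix scores the same. */
    static float constantScore(Doc doc, String prefix) {
        return doc.term.startsWith(prefix) ? 1.0f : 0.0f;
    }

    /** Scoring behaviour (simplified): a match is weighted by the document's boost. */
    static float boostedScore(Doc doc, String prefix) {
        return doc.term.startsWith(prefix) ? doc.boost : 0.0f;
    }

    public static void main(String[] args) {
        // Same term, different boosts, mirroring documents 5, 6 and 7 of the test code.
        List<Doc> docs = Arrays.asList(
            new Doc("5", "tetrachloroethane", 1f),
            new Doc("6", "tetrachloroethane", 10f),
            new Doc("7", "tetrachloroethane", 0.1f));

        for (Doc doc : docs) {
            System.out.println("id: " + doc.id
                + " - constant: " + constantScore(doc, "tetra")
                + " - boosted: " + boostedScore(doc, "tetra"));
        }
    }
}
```

Running main prints one line per document, with the constant-score column
identical for all three and the boosted column following the boosts.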

If you want to build a suggester, you could also have a look at the
lucene/suggest module.

[1] https://lucene.apache.org/core/5_2_0/core/org/apache/lucene/search/MultiTermQuery.html#SCORING_BOOLEAN_REWRITE

On Wed, Jul 1, 2015 at 11:18 PM, Markus Hegi - Nagavkar
<markushegi@gmail.com> wrote:
> Thanks Adrien for the quick response - that's a good hint -
>
> I simplified my code and realized, that the sorting DOES work, if I use
> full words. With "*", index boosting is not taken into account - see below
> my code, I used the two queries for testing:
>
> "Tetrachloroethane" - scoring works fine
> "Tetra*" - here, I get all the same scores.
>
> I am building an auto-suggest, based on ontology terms. Scoring is crucial
> there, and also, that I find parts of words.
>
> Markus
>
> Simplified test code:
>
> public void simple(String inp) throws IOException
> {
>     try
>     {
>         FSDirectory dir = FSDirectory.open(
>             FileSystems.getDefault().getPath(Co.folder()+"SEARCH\\ontoTest\\"));
>         analyzer = new EnglishAnalyzer();
>         IndexWriterConfig config =
>             new IndexWriterConfig(analyzer).setOpenMode(OpenMode.CREATE);
>         IndexWriter writer = new IndexWriter(dir, config);
>
>         // Same term indexed with different index-time boosts
>         index(writer, "4", "1,1,2,2-Tetrachloroethane", 1);
>         index(writer, "5", "Tetrachloroethane", 1);
>         index(writer, "6", "Tetrachloroethane", 10);
>         index(writer, "7", "Tetrachloroethane", 0.1f);
>         writer.close();
>
>         QueryParser parser = new QueryParser("TERM", analyzer);
>         Query query = parser.parse(inp);
>         IndexReader reader = DirectoryReader.open(dir);
>         IndexSearcher searcher = new IndexSearcher(reader);
>         ScoreDoc[] hits = searcher.search(query, 100).scoreDocs;
>         out("found: "+hits.length+" results!");
>         for(int i=0; i<hits.length; i++)
>         {
>             ScoreDoc sd=hits[i];
>             if(sd==null) continue;
>             int docId = sd.doc;
>             Document d = searcher.doc(docId);
>             if(d==null) continue;
>             out("<br>id: "+d.get("ID")+" - score: "+sd.score+" - "+d.get("TERM"));
>         }
>         reader.close();
>     }
>     catch(Exception e) {
>         e.printStackTrace();
>     }
> }
>
> public void index(IndexWriter writer, String id, String str, float boost)
>     throws IOException
> {
>     Document doc=new Document();
>
>     // Store the ID as well; without it d.get("ID") returns null and
>     // updateDocument(new Term("ID", id), ...) has nothing to match
>     doc.add(new StringField("ID", id, Field.Store.YES));
>
>     FieldType ft=new FieldType();
>     ft.setIndexOptions(IndexOptions.DOCS_AND_FREQS_AND_POSITIONS);
>     ft.setStored(true);
>     ft.setTokenized(true);
>     Field f=new Field("TERM", str, ft);
>     f.setBoost(boost);   // index-time boost under test
>     doc.add(f);
>
>     writer.updateDocument(new Term("ID", id), doc);
> }
>
>
>
>
> On Wed, Jul 1, 2015 at 9:57 PM, Adrien Grand <jpountz@gmail.com> wrote:
>
>> What query did you run? Not all queries take index-time boosts into
>> account for scoring.
>>
>> On Wed, Jul 1, 2015 at 7:30 PM, Markus Hegi - Nagavkar
>> <markushegi@gmail.com> wrote:
>> > Hello
>> >
>> > Downloaded & imported latest 5.2.1 version, but Index-scoring seems not to
>> > work for me:
>> >
>> > I index two types of documents:
>> > - For one, I boost every field with a factor 1
>> > - For the other one, I boost every field with 0.01
>> >
>> > When I search, I get documents of both types, but for ALL document an
>> > identical score of:
>> > 1.4142135
>> >
>> > What could be the problem?
>> > Some of my code:
>> > ...
>> > FieldType ft=new FieldType();
>> > ft.setIndexOptions(IndexOptions.DOCS_AND_FREQS_AND_POSITIONS);
>> > ft.setStored(true);
>> > ft.setTokenized(true);
>> >
>> > Field f=new Field(name, value, ft);
>> > f.setBoost(0.001f);
>> > doc.add(f);
>> > ...
>> >
>> > Markus
>>
>>
>>
>> --
>> Adrien
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>



-- 
Adrien


