Hi all,

I have two questions about ranking in Lucene.

1) Does anyone know how the posting lists (term -> doc1 doc2 doc3) in the
index are sorted?
Are the documents (doc1 doc2 doc3) sorted by a TFxIDF value, by the boost
value, or by neither? Does Lucene compute the score for all the documents
in the posting lists, or only for some of them?

2) Does anyone know how to add more ranking features to Lucene's ranking
function (e.g. PageRank, BM25)?
Extending Lucene's DefaultSimilarity class is not enough to achieve this,
since it is only designed for changing the TFxIDF function.

Thanks in advance.

--
Miguel Costa
http://xldb.fc.ul.pt/~mcosta/

FCCN - Fundação para a Computação Científica Nacional
Av. do Brasil, n.º 101
1700-066 Lisboa
Tel.: +351 21 8440190
Fax: +351 218472167
www.fccn.pt