lucene-dev mailing list archives

From blazingwolf7 <blazingwo...@gmail.com>
Subject Untokenized URL
Date Fri, 04 Jul 2008 08:19:21 GMT

Hi,

I am currently working on retrieving the url and contentLength of each
document found during a search. I want to retrieve them while the score is
being calculated so that I can use them to influence the score in some
other way.

I used the methods of TermDocs and TermEnum to get this information.
However, as most of you know, the url is tokenized: the terms I get back
are only its separate parts, and I would have to rejoin them. Can anyone
help me with this? I am stuck wondering how to get the whole url back
without using a Reader.
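
A minimal sketch of the symptom described above, in plain Java rather than
the Lucene API. Everything in it is an assumption made for illustration: the
class TokenizedUrlSketch, its methods, and the toy "analyzer" that splits on
non-alphanumeric characters are hypothetical stand-ins, not the real TermEnum
or TermDocs classes. It only shows why walking a term dictionary for a
tokenized field yields fragments in sorted order, while a field stored as a
single term yields the whole value.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical toy model of a per-field term dictionary; NOT Lucene code.
class TokenizedUrlSketch {

    // Toy analyzer (an assumption): split on non-alphanumerics and lowercase.
    static List<String> tokenize(String value) {
        List<String> terms = new ArrayList<String>();
        for (String part : value.split("[^A-Za-z0-9]+")) {
            if (!part.isEmpty()) {
                terms.add(part.toLowerCase());
            }
        }
        return terms;
    }

    // Build term -> document ids. If tokenized is false, the whole field
    // value is indexed as one single term.
    static SortedMap<String, SortedSet<Integer>> indexField(
            Map<Integer, String> docs, boolean tokenized) {
        SortedMap<String, SortedSet<Integer>> postings =
                new TreeMap<String, SortedSet<Integer>>();
        for (Map.Entry<Integer, String> doc : docs.entrySet()) {
            List<String> terms = tokenized
                    ? tokenize(doc.getValue())
                    : Collections.singletonList(doc.getValue());
            for (String term : terms) {
                postings.computeIfAbsent(term, k -> new TreeSet<Integer>())
                        .add(doc.getKey());
            }
        }
        return postings;
    }

    // Enumerate every term in sorted order and keep those whose posting
    // list contains docId. The original token order is not recoverable here.
    static List<String> termsForDoc(
            SortedMap<String, SortedSet<Integer>> postings, int docId) {
        List<String> found = new ArrayList<String>();
        for (Map.Entry<String, SortedSet<Integer>> e : postings.entrySet()) {
            if (e.getValue().contains(docId)) {
                found.add(e.getKey());
            }
        }
        return found;
    }

    public static void main(String[] args) {
        Map<Integer, String> docs = new HashMap<Integer, String>();
        docs.put(0, "http://example.com/a/b.html"); // made-up sample url

        System.out.println("tokenized:   "
                + termsForDoc(indexField(docs, true), 0));
        System.out.println("untokenized: "
                + termsForDoc(indexField(docs, false), 0));
    }
}
```

In this toy model the tokenized case cannot be rejoined reliably, because
the separators are discarded and the terms come back sorted rather than in
their original positions. That suggests the whole url has to be kept
somewhere as a single value, instead of being reconstructed from its terms.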

I also tried to retrieve the contentLength, but the result returned is
null. Why is that? When I open the index in Luke the contentLength is
there, but when I try to get it this way, the result is null.

Can anyone help me with both of these problems? Any help will be
appreciated. Thanks.
-- 
View this message in context: http://www.nabble.com/Untokenized-URL-tp18275048p18275048.html
Sent from the Lucene - Java Developer mailing list archive at Nabble.com.

