lucene-java-user mailing list archives

From "DECAFFMEYER MATHIEU" <MATHIEU.DECAFFMA...@fortis.lu>
Subject RE: Low hits
Date Tue, 23 Jan 2007 13:33:51 GMT
 
Actually, I am using Regain on top of Lucene for URL indexing, and the
latest stable release of Regain uses Lucene 1.4.3.

When I index the whole website and then search for the title of a
document, I get a score of about 60 to 70%.
When I index only one page and then search for its title, I get a score
of about 2%.
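
One factor that may contribute to the gap described above, assuming Regain displays Lucene's raw hit score as a percentage (an assumption; nothing in this thread confirms it): Lucene's default similarity weights a term by an inverse document frequency that depends on the total number of documents in the index. The sketch below reproduces that classic formula, idf = ln(numDocs / (docFreq + 1)) + 1, in plain Java without any Lucene dependency. The class name and the 1000-page figure are hypothetical.

```java
// Minimal sketch of the classic Lucene idf formula (not Regain or Lucene code):
//   idf = ln(numDocs / (docFreq + 1)) + 1
final class IdfSketch {

    // docFreq: number of documents containing the term
    // numDocs: total number of documents in the index
    static float idf(int docFreq, int numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }

    public static void main(String[] args) {
        // One-page index: the term occurs in the only document.
        System.out.println("1-page index:    " + idf(1, 1));
        // Hypothetical 1000-page index where a single page contains the term.
        System.out.println("1000-page index: " + idf(1, 1000));
    }
}
```

With these inputs the one-page index gives an idf of roughly 0.31 and the larger index roughly 7.2. Other factors, such as field length norms and score normalization, also affect the final score, so this is only part of the picture.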

This is very annoying for me because, for various reasons, I have to
index all the pages separately.

That's why I am looking through the Regain code. I will paste some more
code, unless one of you already has an idea of what causes this
behaviour based on this explanation ...

Thank you for any further help.

__________________________________

   Mathieu Decaffmeyer

    

-----Original Message-----
From: Erick Erickson [mailto:erickerickson@gmail.com] 
Sent: Tuesday, January 23, 2007 2:01 PM
To: java-user@lucene.apache.org
Subject: Re: Low hits


What version of Lucene are you using? 2.0 doesn't have a doc.add like
that. You'd do something like

doc.add(new Field("title", title, Field.Store.YES,
                  Field.Index.TOKENIZED));

So I really don't understand what you're trying to do. Nor do I
understand what "2%" means in this context....

But there are two things you should be aware of.
1> The analyzer you use when you create your index AND when you query
your index should be the same, at least when you start. There are
reasons why you might want to use different ones, but certainly not when
you are starting. Otherwise you, say, index something and it's
automatically folded into lowercase (StandardAnalyzer certainly does
this) but your query could stay uppercase if you use an analyzer that
preserves case, say WhitespaceAnalyzer, for the query phase.

2> Get Luke (google "lucene luke"). It allows you to examine your index
and see what's actually in it. Otherwise you're flying blind. It also
allows you to enter queries manually (see Lucene in Action for Lucene's
query syntax). Really, really get Luke. It'll make your life much
easier.
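
Point 1 above can be illustrated without Lucene at all. The following is a minimal sketch in plain Java, not Lucene code: the `tokenize` helper and the class name are hypothetical stand-ins for an analyzer, assuming the index-time analyzer lowercases terms and the mismatched query-time analyzer preserves case.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Toy model of an index: a set of terms produced by an "analyzer".
final class AnalyzerMismatchSketch {

    // Hypothetical stand-in for an analyzer: splits on whitespace and
    // optionally folds each token to lowercase.
    static List<String> tokenize(String text, boolean lowercase) {
        List<String> terms = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            terms.add(lowercase ? token.toLowerCase(Locale.ROOT) : token);
        }
        return terms;
    }

    // A query matches only if every query term is present in the index
    // exactly as written; term lookup is case-sensitive.
    static boolean matches(Set<String> indexedTerms, List<String> queryTerms) {
        return !queryTerms.isEmpty() && indexedTerms.containsAll(queryTerms);
    }

    public static void main(String[] args) {
        // Index time: the lowercasing analyzer stores "constructions".
        Set<String> index = new HashSet<>(tokenize("Constructions", true));

        // Query time with a case-preserving analyzer: looks up "Constructions".
        System.out.println(matches(index, tokenize("Constructions", false)));

        // Query time with the same lowercasing analyzer: looks up "constructions".
        System.out.println(matches(index, tokenize("Constructions", true)));
    }
}
```

The first lookup misses and the second one hits, which is why using the same analyzer for indexing and querying is the safer starting point.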

You probably want to post bigger snippets of actual code for folks to
look at, since one line of code and "it doesn't work" don't give us much
to go on <G>.

Best
Erick

On 1/23/07, DECAFFMEYER MATHIEU <MATHIEU.DECAFFMAYER@fortis.lu> wrote:
>
>
> Hi,
>
> I'm pretty new to Lucene and I am trying to find some help here.
> I added the title of the document:
> doc.add(Field.Text("title", title));
> e.g. the title is "Constructions"
> When I do a search on this title, I get 2% as the result.
> Can someone help me understand what I am doing wrong?
>
> Thank you.
>
> *__________________________________*
>
> *   Mathieu Decaffmeyer*
>
>
> ============================================
> Internet communications are not secure and therefore Fortis Banque
> Luxembourg S.A. does not accept legal responsibility for the contents
> of this message. The information contained in this e-mail is
> confidential and may be legally privileged. It is intended solely for
> the addressee. If you are not the intended recipient, any disclosure,
> copying, distribution or any action taken or omitted to be taken in
> reliance on it, is prohibited and may be unlawful. Nothing in the
> message is capable or intended to create any legally binding
> obligations on either party and it is not intended to provide legal
> advice.
> ============================================
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>





