lucene-java-user mailing list archives

From "Bertrand VENZAL" <bertrand.ven...@cirso.fr>
Subject Réf. : Re: IndexSearcher and number of occurence
Date Thu, 13 Jan 2005 15:17:24 GMT



Hi,

Thanks for your quick answer. I understand what you meant about using the
IndexSearcher to get the TermFreqVector. But you use an int as an id to
look up the term frequency, so I suppose that it is the document's position
number in the IndexReader vector.
My problem is this: during the indexing phase I can store the id, but if a
document is deleted and recreated later on (as in an update), this will
change my vector, and all the ids previously stored will no longer be correct.
Am I right on this point, or am I missing something?

Thanks.
From: Erik Hatcher <erik@ehatchersolutions.com>
Sent by: lucene-user-return-12431-bertrand.venzal=cirso.fr@jakarta.apache.org
Date: 13/01/2005 11:28
Please reply to: "Lucene Users List"
To: "Lucene Users List" <lucene-user@jakarta.apache.org>
cc:
Subject: Re: IndexSearcher and number of occurence

On Jan 13, 2005, at 5:03 AM, Bertrand VENZAL wrote:

> Hi all,
>
> I'm quite new to this mailing list. I am having difficulty finding the
> number of occurrences of a word in a document. I need to use IndexSearcher
> because of the query, but the score it returns is not what I'm looking
> for.
> I found the class TermDocs mentioned on the mailing list, but it seems to
> work only with IndexReader.
>
> If anyone can give me a hand with this one, I would appreciate it.

Perhaps this technique is what you're looking for: set the field(s)
whose term frequencies you want to capture to be vectored.  You'll see
that flag in additional overloaded methods on Field.  You'll still
need an IndexReader, but that is no problem: construct an
IndexReader and use it to construct the IndexSearcher that you'll also
use.  Here are some snippets of code:

    // During indexing, the "subject" field was added like this
    // (the third argument, true, stores the term vector):
    doc.add(Field.UnStored("subject", subject, true));

... // now during searching...

    IndexReader reader = IndexReader.open(directory);

    ...
    // from your Hits, get the document id
    // (hits.id(i) returns the int id; hits.doc(i) returns the Document)
    int id = hits.id(i);

    TermFreqVector vector =
        reader.getTermFreqVector(id, "subject");

Now read up on the TermFreqVector API to get at the frequency of a
specific term.

                Erik
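
As a rough illustration of the last step: TermFreqVector exposes its data as two parallel arrays, getTerms() and getTermFrequencies(), so finding the count for one term means finding its position in the first array and reading the same position in the second. The sketch below is plain Java and does not call Lucene. The class name TermFrequencyLookup, the helper frequencyOf and the sample data are all hypothetical stand-ins for the arrays a real vector would return. In real code, also check for a null vector, which getTermFreqVector returns when no term vector was stored for that field.

```java
public class TermFrequencyLookup {

    // Returns how often `term` occurs, given the two parallel arrays that a
    // term vector exposes. Returns 0 when the term is not present.
    public static int frequencyOf(String[] terms, int[] freqs, String term) {
        for (int i = 0; i < terms.length; i++) {
            if (terms[i].equals(term)) {
                return freqs[i];
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        // Hypothetical data standing in for vector.getTerms() and
        // vector.getTermFrequencies() of one document's "subject" field.
        String[] terms = {"index", "lucene", "searcher"};
        int[] freqs = {2, 5, 1};

        System.out.println(frequencyOf(terms, freqs, "lucene"));   // prints 5
        System.out.println(frequencyOf(terms, freqs, "missing"));  // prints 0
    }
}
```

With a real vector, the call would look like frequencyOf(vector.getTerms(), vector.getTermFrequencies(), "lucene"). TermFreqVector also offers indexOf(String), which gives the array position directly.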


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org






