lucene-java-user mailing list archives

From hrishim <smarthr...@yahoo.co.in>
Subject Re: Term Frequency for phrases
Date Fri, 08 Jan 2010 17:37:42 GMT

@All: Elaborating on the problem.

The phrase is being indexed as a single token.
I have a Gene tag in the XML document, like this:
<Gene>brain natriuretic peptide </Gene>
This phrase is also present in the abstract text of the given document.

The code is as follows:

doc.add(new Field("Gene", geneName, Field.Store.YES,
    Field.Index.ANALYZED, Field.TermVector.YES));

doc.add(new Field("Token", abstractText.toString().toLowerCase(),
    Field.Store.YES, Field.Index.ANALYZED, Field.TermVector.YES));

When I retrieve all tokens and genes for a given doc and calculate the tf
for each of them, a null exception is thrown.
The failing term is: brain natriuretic peptide

TermDocs termDocs = indexReader.termDocs(term);
termDocs.next();
double tf = termDocs.freq();
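
To make the goal concrete: a field added with Field.Index.ANALYZED is passed through the analyzer, so with a typical tokenizing analyzer a multi-word value ends up as separate terms rather than one. A phrase count then means counting places where the phrase's tokens appear one after another. Below is a minimal plain-Java sketch of that idea. It does not use any Lucene API; the class name, the method names, the sample text and the simple tokenizer (lowercase, split on anything that is not a letter or digit) are assumptions for illustration only, not what a Lucene analyzer does exactly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class PhraseFrequency {

    // Hypothetical stand-in for an analyzer: lowercases the text and
    // splits it on every run of characters that are not letters or digits.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<String>();
        for (String piece : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    // Counts how often the phrase tokens occur at consecutive positions
    // in the document tokens. Overlapping occurrences are counted too.
    static int phraseFreq(List<String> docTokens, List<String> phraseTokens) {
        int n = phraseTokens.size();
        if (n == 0 || n > docTokens.size()) {
            return 0;
        }
        int count = 0;
        for (int start = 0; start + n <= docTokens.size(); start++) {
            if (docTokens.subList(start, start + n).equals(phraseTokens)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Sample abstract text invented for this sketch.
        String abstractText = "Brain natriuretic peptide (BNP) is elevated; "
                + "brain natriuretic peptide levels fall after treatment.";
        String geneName = "brain natriuretic peptide ";

        int tf = phraseFreq(tokenize(abstractText), tokenize(geneName));
        System.out.println("tf for phrase = " + tf);
    }
}
```

The phrase itself is run through the same tokenizer as the text, so the trailing space in the Gene value and the case of the abstract do not affect the match.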

Regards,
Hrishi


Grant Ingersoll-6 wrote:
> 
> When do you detect that they are phrases?  During indexing or during
> search?
> 
> On Jan 8, 2010, at 5:16 AM, hrishim wrote:
> 
>> 
>> Hi.
>> I have phrases like brain natriuretic peptide indexed as a single token
>> using Lucene.
>> When I calculate the term frequency for such a phrase, the count is 0
>> because the tokens from the text are indexed separately, i.e. brain,
>> natriuretic, peptide.
>> Is there a way to solve this problem and get the term frequency for the
>> entire phrase?
>> 
>> Regards,
>> Hrishi
>> -- 
>> View this message in context:
>> http://old.nabble.com/Term-Frequency-for-phrases-tp27073866p27073866.html
>> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>> 
> 
> --------------------------
> Grant Ingersoll
> http://www.lucidimagination.com/
> 
> Search the Lucene ecosystem using Solr/Lucene:
> http://www.lucidimagination.com/search
> 

-- 
View this message in context: http://old.nabble.com/Term-Frequency-for-phrases-tp27073866p27079648.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.

