lucene-java-user mailing list archives

From "Kunzman, Douglas *" <Douglas.Kunz...@fda.hhs.gov>
Subject RE: Wildcard Terms and total word or phrase count
Date Sun, 29 Nov 2015 11:43:12 GMT
Everyone -

Thanks for getting back to me. Unfortunately, in the sample code, even when I
pass a term with no wildcards that occurs multiple times in my document,
the totalHitCount is never more than one. Does anyone have any ideas what
I could be doing wrong?

Thanks, 
Doug
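
One likely reading of the symptom above, offered as an assumption rather than a confirmed diagnosis: a search hit count reports how many documents match, not how many times the term occurs, so an index holding a single document can never report more than one hit. The sketch below is a toy model in plain Java and does not use Lucene. The class and method names (HitsVersusOccurrences, matchingDocs, totalOccurrences) and the sample text are hypothetical, and the tokenizer is a simple lowercase split on non-word characters.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Toy model only: plain Java, no Lucene classes involved.
final class HitsVersusOccurrences {

    // Split text into lowercase tokens on runs of non-word characters.
    static List<String> tokens(String text) {
        return Arrays.asList(text.toLowerCase(Locale.ROOT).split("\\W+"));
    }

    // Number of documents that contain the term at least once.
    // This is the kind of number a document hit count reports.
    static long matchingDocs(List<String> docs, String term) {
        return docs.stream().filter(d -> tokens(d).contains(term)).count();
    }

    // Total number of occurrences of the term across all documents.
    static long totalOccurrences(List<String> docs, String term) {
        return docs.stream()
                .flatMap(d -> tokens(d).stream())
                .filter(term::equals)
                .count();
    }

    public static void main(String[] args) {
        // Hypothetical single-document "index".
        List<String> docs = Arrays.asList(
                "Quartz is a scheduler. Quartz jobs run on Quartz triggers.");
        System.out.println("matching docs     = " + matchingDocs(docs, "quartz"));
        System.out.println("total occurrences = " + totalOccurrences(docs, "quartz"));
    }
}
```

If the whole user's guide was indexed as one document, a hit count of one would be consistent with this model, and a per-occurrence statistic would be the number to look at instead.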

-----Original Message-----
From: Michael Wilkowski [mailto:mw@silenteight.com] 
Sent: Sunday, November 29, 2015 3:38 AM
To: java-user@lucene.apache.org
Subject: Re: Wildcard Terms and total word or phrase count

It is because your index does not contain the term quar*, and this statistics
function is not a query (you have to pass the exact form of the term). To count
terms that meet the search criteria, you may run a search query with a custom
collector and count the results. Or use a normal search query returning TopDocs
and just check totalHitCount (however, the first option is faster because no
results are gathered and sorted).

MW
Sent from Mi phone
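
To illustrate the point about exact term forms, here is a toy model in plain Java. It is not the Lucene API and is not taken from the original message. A sorted map stands in for one field's term dictionary, mapping each term to its total number of occurrences. The class name, method names, and counts are all hypothetical. An exact lookup treats "quar*" as a literal string and finds nothing, while expanding the prefix over the sorted terms and summing their counts gives a total across every matching term.

```java
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Toy model only: plain Java, no Lucene classes involved.
final class WildcardTermCount {

    // Stand-in for a term dictionary: term -> total occurrences in all documents.
    // The terms and counts are made up for illustration.
    static final SortedMap<String, Long> TERMS = new TreeMap<>();
    static {
        TERMS.put("quarter", 2L);
        TERMS.put("quartz", 5L);
        TERMS.put("queue", 3L);
    }

    // Exact lookup: the argument is compared literally, so "quar*" is simply
    // a term that was never indexed and the result is zero.
    static long exactTotalTermFreq(String term) {
        return TERMS.getOrDefault(term, 0L);
    }

    // Prefix expansion: walk the sorted terms starting at the prefix and sum
    // the counts of every term that begins with it.
    static long prefixTotalTermFreq(String prefix) {
        long sum = 0;
        for (Map.Entry<String, Long> e : TERMS.tailMap(prefix).entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break; // sorted order: no later term can match the prefix
            }
            sum += e.getValue();
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println("exact  quar*  = " + exactTotalTermFreq("quar*"));
        System.out.println("exact  quartz = " + exactTotalTermFreq("quartz"));
        System.out.println("prefix quar   = " + prefixTotalTermFreq("quar"));
    }
}
```

The sketch covers only a trailing-star prefix. General wildcard patterns would need a pattern matcher applied to each candidate term instead of startsWith.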
On 27 Nov 2015 22:06, "Kunzman, Douglas *" <Douglas.Kunzman@fda.hhs.gov>
wrote:

> Hi -
>
> This is my first Lucene project; my other search projects have used Solr.
> I would like to find the total number of occurrences of terms matching a
> wildcard across a set of documents, with 0-N matches per document.
> I would prefer not to have to open each document where a match is found.  I
> need to be able to support wildcards, but my requirements are somewhat
> flexible regarding phrase search support.
> Whatever is easier.
>
> This is what I have so far.
>
>    public static void main(String args[]) throws IOException, ParseException {
>        Directory idx = FSDirectory.open(path);
>        index("C:\\Users\\Douglas.Kunzman\\Desktop\\test_index");
>
>        Term term = new Term("Doc", "quar*");
>
>        WildcardQuery wc = new WildcardQuery(term);
>
>        SpanQuery spanTerm = new SpanMultiTermQueryWrapper<WildcardQuery>(wc);
>        IndexReader indexReader = DirectoryReader.open(idx);
>
>        System.out.println("Term freq=" + indexReader.totalTermFreq(term));
>        System.out.println("Term freq=" + indexReader.getSumTotalTermFreq("Doc"));
>
>        IndexSearcher isearcher = new IndexSearcher(indexReader);
>
>        IndexReaderContext indexReaderContext = isearcher.getTopReaderContext();
>        TermContext context = TermContext.build(indexReaderContext, term);
>        TermStatistics termStatistics = isearcher.termStatistics(term, context);
>        System.out.println("termStatics=" + termStatistics.totalTermFreq());
>    }
>
> Does anyone have any suggestions?  totalTermFreq is zero, but when we search
> using quartz we find matches.
> I'm searching the Quartz user's guide as an example.
>
> Thanks,
> Doug