lucene-java-user mailing list archives

From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: Lucene Search Capabilities.
Date Fri, 13 May 2005 09:46:45 GMT

On May 12, 2005, at 10:24 AM, Goel, Nikhil wrote:
> 1) Lucene does inverted indexing, by which we mean it keeps track of how
> many times a particular token is used. Is there a way to get the list of
> the most frequently used words in descending order?

Have a look at Luke's code to see how it does this - it has a view of  
the most frequent terms.  http://www.getopt.org/luke/
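
As a rough illustration of the general idea (not Luke's actual code), here
is a minimal sketch of picking the top N terms with a bounded min-heap. It
assumes you have already read each term and its count out of the index into
a map; the TopTerms class and the map are hypothetical stand-ins, and no
Lucene calls are made.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

public class TopTerms {

    /**
     * Returns the n most frequent terms, highest count first.
     * The counts map is an assumption: a stand-in for whatever
     * term/frequency pairs you have pulled out of the index.
     */
    public static List<Map.Entry<String, Integer>> topN(Map<String, Integer> counts, int n) {
        List<Map.Entry<String, Integer>> result = new ArrayList<>();
        if (n <= 0) {
            return result;
        }
        Comparator<Map.Entry<String, Integer>> byCount =
                (a, b) -> Integer.compare(a.getValue(), b.getValue());
        // Min-heap: the head is always the weakest candidate kept so far,
        // so the heap never holds more than n entries.
        PriorityQueue<Map.Entry<String, Integer>> heap = new PriorityQueue<>(n, byCount);
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (heap.size() < n) {
                heap.add(e);
            } else if (e.getValue() > heap.peek().getValue()) {
                heap.poll();   // drop the current weakest
                heap.add(e);   // keep the stronger term
            }
        }
        result.addAll(heap);
        result.sort(byCount.reversed()); // descending by count
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("the", 120);
        counts.put("lucene", 45);
        counts.put("index", 30);
        counts.put("btn", 7);
        for (Map.Entry<String, Integer> e : topN(counts, 2)) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

The bounded heap avoids sorting every term in the index, which matters when
the term dictionary is large and you only want the top handful.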

> 2) I have a number of documents with a BTN (a 10-digit numeric string) in
> their content. I want to do the following:
> a) What query can I write to find the documents that contain a BTN?
> I think a wildcard search will help, but I have not been able to work out
> the exact query.
> b) More importantly, will it tell us exactly which BTN is in the
> document? For example, let's say I search with java* and 2 documents
> match. One of the documents has "javaspace" in it and the second has
> "javaworld" in it.
> Is it possible to get these matched phrases through some API?

One option to consider is extracting these BTN values from the  
original documents and indexing them into a separate field.
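
The extraction step might look something like the sketch below. It assumes
a BTN is exactly 10 consecutive digits that are not part of a longer digit
run; the BtnExtractor class and that pattern are illustrative assumptions,
so adjust them to however BTNs really appear in your documents. Only
java.util.regex is used here. Each value it returns would then be added to
the document under its own untokenized field (say, a field named "btn")
using whatever field API your Lucene version provides.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BtnExtractor {

    // Assumption: a BTN is exactly 10 digits, with no digit immediately
    // before or after it (so an 11-digit number does not count).
    private static final Pattern BTN = Pattern.compile("(?<!\\d)\\d{10}(?!\\d)");

    /** Returns every 10-digit run found in the text, in order of appearance. */
    public static List<String> extract(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = BTN.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String body = "Call 4155551234 or 12345678901 re: 2125550000.";
        for (String btn : extract(body)) {
            System.out.println(btn);
        }
    }
}
```

With the values in a dedicated field, finding documents that contain a BTN
no longer needs a wildcard over the body text, and if the field is stored,
each hit can tell you which BTN it holds.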

     Erik



