lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ted Chen <>
Subject keywords in a document
Date Mon, 09 Apr 2007 22:36:04 GMT
    I'm new to Lucene, but I did spend quite some time trying to find an answer to the problem
before turning to you gurus for help.
    I'm given a text file.  My task is to extract some top keywords from the file so that
these words can describe this document.  Ideally, terms with the highest tfidf should be returned.
  I noticed lucene provides a method called QueryTermExtractor.getIdfWeightedTerms(), but
to use this method, I need to provide a query and an IndexReader.  In my case, I only have
a text file, and don't have a handle on any IndexReader.  Is there any way to indicate I want
to use the "default IndexReader", where the default reader represents a typical collection
of documents.  Also, do I need to construct a Query out of the text file?  If so, is the best
choice a multi-term query?
    Any suggestions?


Sucker-punch spam with award-winning protection. 
Try the free Yahoo! Mail Beta.
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message