lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Erick Erickson <erickerick...@gmail.com>
Subject Re: IndexSearcher
Date Thu, 05 Mar 2009 14:02:16 GMT
I think your root problem is that you're indexing UN_TOKENIZED, which
means that the tokens you're adding to your index are NOT run through
the analyzer.

So your terms are exactly "111", "222 333" and "111 222 333", none of which
match "222". I expect you wanted your tokens to be "111", "222", and "333",
each appearing twice in your index.

Try indexing them tokenized. Although note that I don't remember what
StandardAnalyzer does with numbers. WhitespaceAnalyzer does the
more intuitive thing, but beware that it doesn't fold case. But it might be
an easier place for you to start until you get more comfortable with what
various analyzers do.

Also, I *strongly* advise that you get a copy of Luke. It is a wonderful
tool
that allows you to examine your index, analyze queries, test queries, etc.

But be aware that the site that maintains Luke was having problems
yesterday,
look over the user list messages from yesterday if you have problems.

Best
Erick

On Thu, Mar 5, 2009 at 8:40 AM, liat oren <oren.liat@gmail.com> wrote:

> Hi,
>
> I would like to do a search that will return documents that contain a given
> word.
> For example, I created the following index:
>
> IndexWriter writer = new IndexWriter("C:/TryIndex", new
> StandardAnalyzer());
> Document doc = new Document();
>  doc.add(new Field(WordIndex.FIELD_WORLDS, "111 222 333", Field.Store.YES,
> Field.Index.UN_TOKENIZED));
> writer.addDocument(doc);
> doc = new Document();
> doc.add(new Field(WordIndex.FIELD_WORLDS, "111", Field.Store.YES,
> Field.Index.UN_TOKENIZED));
> writer.addDocument(doc);
>  doc = new Document();
>  doc.add(new Field(WordIndex.FIELD_WORLDS, "222 333", Field.Store.YES,
> Field.Index.UN_TOKENIZED));
>  writer.addDocument(doc);
> writer.optimize();
>  writer.close();
>
> now I want to get all the documents that contain the word "222".
>
> I tried to run  the following code but it doesn;t return any doc
>
>  IndexSearcher searcher = new IndexSearcher(indexPath);
>
> //  //  TermQuery mapQuery = new TermQuery(new Term(FIELD_WORLDS,
> worldNum)); - this one also didn't word
> Analyzer analyzer = new StandardAnalyzer();
> QueryParser parser = new QueryParser(FIELD_WORLDS, analyzer);
>  Query query = parser.parse(worldNum);
>  Hits mapHits = searcher.search(query);
>
>
> Thanks a lot,
> Liat
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message