lucene-java-user mailing list archives

From Adrien Grand <jpou...@gmail.com>
Subject Re: Beginner's questions
Date Thu, 28 Mar 2013 00:12:53 GMT
On Wed, Mar 27, 2013 at 9:04 PM, Paul Bell <arachweb@gmail.com> wrote:
> Thanks Adrien.
>
> I've scraped together a simple program in the Lucene 4.2 idiom (see below).
> Does this illustrate what you meant by your last sentence?
>
> The code adds/indexes 5 documents all of whose content is identical, but
> whose 'id' field is unique ("v1" through "v5"). It then queries the 'id'
> field for the pattern "v*".

Even if your program works, there is something "dangerous" in it: you
index your id field as a StringField, meaning that the field is not
analyzed, and then query it using a query parser, which analyzes the
text it is given. So if you gave any of your documents the id "ABC",
you would never be able to find it, since StandardAnalyzer lowercases
tokens with a LowerCaseFilter. You could simply create the query
manually:

Query query = new PrefixQuery(new Term("id", "v" + id));

without help from a query parser.
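
To make the mismatch concrete, here is a minimal sketch in plain Java.
It is a toy model and not Lucene code: the class IdLookupSketch and its
methods are hypothetical names, a sorted set stands in for the terms of
an unanalyzed field, and analyze() stands in for a parser that
lowercases the query text before searching.

```java
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

/** Toy model (NOT Lucene code) of an unanalyzed id field queried two ways. */
class IdLookupSketch {
    // Terms are kept exactly as supplied, as an unanalyzed field would keep them.
    private final TreeSet<String> terms = new TreeSet<>();

    public void index(String id) {
        terms.add(id);
    }

    /** Hypothetical stand-in for a query parser that lowercases its input. */
    public static String analyze(String text) {
        return text.toLowerCase(Locale.ROOT);
    }

    /** Case-sensitive prefix lookup, similar in spirit to a hand-built prefix query. */
    public Set<String> prefixSearch(String prefix) {
        Set<String> hits = new TreeSet<>();
        // In sorted order, all terms sharing a prefix are contiguous from the prefix on.
        for (String term : terms.tailSet(prefix, true)) {
            if (!term.startsWith(prefix)) {
                break;
            }
            hits.add(term);
        }
        return hits;
    }

    public static void main(String[] args) {
        IdLookupSketch index = new IdLookupSketch();
        index.index("ABC");
        index.index("v1");
        // Raw prefix: matches the term as it was indexed.
        System.out.println(index.prefixSearch("AB"));
        // Lowercased prefix: "ab" no longer matches "ABC".
        System.out.println(index.prefixSearch(analyze("AB")));
        // An all-lowercase id is unaffected, which is why "v*" happens to work.
        System.out.println(index.prefixSearch(analyze("v")));
    }
}
```

Ids that are already lowercase survive the lowercasing step unchanged,
so the problem only shows up once an id contains an upper-case letter.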

To ensure that your id field is unique across documents, you could replace

writer.addDocument(createDocument("This is a test; for the next 60 seconds..."));

with

Document doc = createDocument("This is a test; for the next 60 seconds...");
writer.updateDocument(new Term("id", doc.get("id")), doc);
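
The reason this keeps ids unique is that updateDocument first deletes
any documents containing the given term and then adds the new one. A
minimal sketch of that delete-then-add behaviour in plain Java follows.
It is a toy model and not Lucene code: the class UniqueIdSketch and its
methods are hypothetical, and each document is reduced to an id and a
content string.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (NOT Lucene code) contrasting plain add with delete-then-add. */
class UniqueIdSketch {
    // Each "document" is reduced to an {id, content} pair.
    private final List<String[]> docs = new ArrayList<>();

    /** Appends unconditionally, so repeated ids accumulate. */
    public void addDocument(String id, String content) {
        docs.add(new String[] {id, content});
    }

    /** Removes every document with this id, then appends the new one. */
    public void updateDocument(String id, String content) {
        docs.removeIf(d -> d[0].equals(id));
        addDocument(id, content);
    }

    /** Counts the documents currently carrying this id. */
    public int count(String id) {
        int n = 0;
        for (String[] d : docs) {
            if (d[0].equals(id)) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        UniqueIdSketch index = new UniqueIdSketch();
        index.addDocument("v1", "first");
        index.addDocument("v1", "second");
        System.out.println(index.count("v1")); // prints 2: duplicates accumulate
        index.updateDocument("v1", "third");
        System.out.println(index.count("v1")); // prints 1: older copies were removed
    }
}
```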

> While we're at it, what method should I be using to obtain merely the
> original document itself after a query?

You can load each hit with IndexSearcher.doc(int) and then println
document.get("id"), provided the field is stored.

-- 
Adrien


