lucene-java-user mailing list archives

From Ahmet Arslan <iori...@yahoo.com.INVALID>
Subject Re: How to escape URL at indexing time
Date Sun, 27 Dec 2015 22:17:03 GMT
Hi Daniel,

The exception you posted is a ParseException from the query parser.
It is thrown at query time, not at indexing time.

Some characters, such as ':' and '/', are part of the query parser syntax.
You need to escape them before passing the URL to the query parser.
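
As a rough illustration of what escaping means here, the sketch below puts a backslash in front of each character the classic query parser treats as syntax. The class name QueryEscapeSketch is hypothetical and exists only for this example. In real code you would normally call the static QueryParser.escape(String) helper that ships with Lucene rather than roll your own. For an exact match on a StringField such as "id", another option is to skip the parser and build the query directly, e.g. new TermQuery(new Term("id", url)).

```java
// Hypothetical stand-alone sketch; it only depends on the JDK.
// It is meant to mirror what Lucene's QueryParser.escape(String) does,
// under the assumption that the special characters are the ones listed
// in the classic query parser syntax documentation.
class QueryEscapeSketch {

    // Characters assumed to have special meaning to the classic query parser.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&/";

    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\'); // prefix the special character with a backslash
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String url = "http://www.yahoo.com";
        // Build the query string that would be handed to the parser.
        String queryText = "id:" + escape(url);
        System.out.println(queryText); // prints id:http\:\/\/www.yahoo.com
    }
}
```

Only the query string needs this treatment. The value passed to StringField at indexing time is stored as a single un-analyzed term and does not need to be escaped or URL-encoded.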

Ahmet




On Sunday, December 27, 2015 10:53 PM, Daniel Valdivia <hola@danielvaldivia.com> wrote:
Hi

I'm trying to index documents that have a URL in one of their fields. However, as soon as I try to index
a URL like "http://yahoo.com" I get this error:

org.apache.lucene.queryparser.classic.ParseException: Cannot parse 'id:'http://www.yahoo.com'':
Encountered " ":" ": "" at line 1, column 8.

I assume I need to escape the URL, but I'm not sure if encoding the URL is the right way to go.

My indexing code:

Document doc = new Document();

doc.add(new StringField("id", url, Field.Store.YES));
doc.add(new StringField("domain", domain, Field.Store.NO));
doc.add(new StringField("title", pageTitle, Field.Store.NO));
doc.add(new TextField("body", pageBody, Field.Store.NO));
w.addDocument(doc);

Any ideas on how I can avoid the parsing issue?

I’m using Lucene 5.4.0
