lucene-java-user mailing list archives

From Daniel Valdivia <h...@danielvaldivia.com>
Subject Re: How to escape URL at indexing time
Date Sun, 27 Dec 2015 22:59:26 GMT
Thanks for your feedback,

As Ahmet pointed out, the error occurred at query time, not at indexing time. Before
inserting, I was checking that the id was unique, and I was not escaping the URL in that
query. I added QueryParser.escape() to my validator and the error went away.

Thanks a lot!
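
For reference, here is a minimal sketch of what escaping does to the URL from the error message. It is a standalone stand-in written against the Java standard library only, not Lucene's own code: the class name QueryEscapeSketch is hypothetical, and the list of special characters is an assumption based on the classic query parser syntax. In real code the call would simply be QueryParser.escape(url), as in the fix described above.

```java
// Illustrative stand-in for QueryParser.escape(); not Lucene's implementation.
final class QueryEscapeSketch {

    // Characters assumed to be special in the classic query syntax.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&/";

    // Prefix every special character with a backslash so that a query parser
    // would read it as literal text instead of as an operator.
    static String escape(String raw) {
        StringBuilder sb = new StringBuilder(raw.length() + 8);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String url = "http://www.yahoo.com";
        // Build the uniqueness-check query string with only the value escaped.
        String queryText = "id:" + escape(url);
        System.out.println(queryText);
    }
}
```

Running it prints id:http\:\/\/www.yahoo.com. Only the value is escaped; the colon after the field name is left alone so it still acts as the field separator.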

> On Dec 27, 2015, at 2:17 PM, Ahmet Arslan <iorixxx@yahoo.com.INVALID> wrote:
> 
> Hi Daniel,
> 
> The exception you posted is a parse exception.
> It is thrown during querying, not indexing.
> 
> There are some special characters that are part of query parsing syntax.
> You need to escape them.
> 
> Ahmet
> 
> 
> 
> 
> On Sunday, December 27, 2015 10:53 PM, Daniel Valdivia <hola@danielvaldivia.com> wrote:
> Hi
> 
> I'm trying to index documents that have a URL in one of their fields. However, as soon as I try
> to index a URL like "http://yahoo.com", I get this error:
> 
> org.apache.lucene.queryparser.classic.ParseException: Cannot parse 'id:'http://www.yahoo.com'':
> Encountered " ":" ": "" at line 1, column 8.
> 
> I assume I need to escape the URL, but I'm not sure whether encoding the URL is the right way to
> go.
> 
> My indexing code:
> 
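> // w is presumably an open IndexWriter; url, domain, pageTitle and pageBody are Strings
> Document doc = new Document();
> 
> // StringField values are indexed verbatim as a single token (no analysis)
> doc.add(new StringField("id", url, Field.Store.YES));
> doc.add(new StringField("domain", domain, Field.Store.NO));
> doc.add(new StringField("title", pageTitle, Field.Store.NO));
> // TextField values are tokenized by the analyzer
> doc.add(new TextField("body", pageBody, Field.Store.NO));
> w.addDocument(doc);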
> 
> Any ideas on how I can avoid the parsing issue?
> 
> I’m using Lucene 5.4.0
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 

