lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Juan Pablo Morales" <jpmora...@ingenian.com>
Subject Re: Storing special characters in Lucene
Date Fri, 22 Aug 2008 00:15:51 GMT
It was, after all an XML issue, the servlets creating the content that was
being indexed were not sending UTF but the XML declaration stated the code
WAS UTF, so it really was not a Lucene issue after all. Thanks for all the
help.

On Thu, Aug 21, 2008 at 6:18 PM, Juan Pablo Morales
<jpmorales@ingenian.com>wrote:

> You are right, it does work. I'll look into my example to see where the
> difference is.
>
> On Thu, Aug 21, 2008 at 5:30 PM, Grant Ingersoll <gsingers@apache.org>wrote:
>
>> Here's a unit test:
>> import junit.framework.TestCase;
>> import org.apache.lucene.analysis.snowball.SnowballAnalyzer;
>> import org.apache.lucene.analysis.standard.StandardAnalyzer;
>> import org.apache.lucene.document.Document;
>> import org.apache.lucene.document.Field;
>> import org.apache.lucene.index.IndexWriter;
>> import org.apache.lucene.queryParser.QueryParser;
>> import org.apache.lucene.search.Hits;
>> import org.apache.lucene.search.IndexSearcher;
>> import org.apache.lucene.search.Query;
>> import org.apache.lucene.store.RAMDirectory;
>>
>>
>> public class SpanishTest extends TestCase {
>>
>>  public void testSpanish() throws Exception {
>>    RAMDirectory directory = new RAMDirectory();
>>    String content = "niños";
>>    IndexWriter writer = new IndexWriter(directory, new StandardAnalyzer(),
>> true);
>>    Document document = new Document();
>>    document.add(new Field("name", content, Field.Store.YES,
>> Field.Index.TOKENIZED));
>>    SnowballAnalyzer snowballAnalyzer = new SnowballAnalyzer("Spanish");
>>    writer.addDocument(document, snowballAnalyzer);
>>    writer.close();
>>
>>    IndexSearcher searcher = new IndexSearcher(directory);
>>    QueryParser parser = new QueryParser("name", snowballAnalyzer);
>>    Query query = parser.parse(content);
>>    System.out.println("Query: " + query);
>>    Hits hits = searcher.search(query);
>>    assertTrue("hits Size: " + hits.length() + " is not: " + 1,
>> hits.length() == 1);
>>    Document theDoc = hits.doc(0);
>>    String nombre = theDoc.get("name");
>>    System.out.println("Nombre: " + nombre);
>>  }
>> }
>>
>>
>> When I run this in IntelliJ, I get:
>>
>> Query: name:niñ
>> Nombre: niños
>>
>> Process finished with exit code 0
>>
>>
>> Are you by chance indexing XML?
>
> Indirectly, yes
>
>>
>>
> --
> Juan Pablo Morales
> Ingenian Software ltda
>



-- 
Juan Pablo Morales
Ingenian Software ltda

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message