lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Harpreet S Walia" <harpr...@sansuisoftware.com>
Subject Problem in unicode field value retrival
Date Mon, 10 Jun 2002 11:12:31 GMT
Hi

I am trying to index and search unicode (utf - 8) . the code i am using to index the documents
is as follows :

/**************************************************************************************************************************************/
IndexWriter iw = new IndexWriter("d:\\jakarta-tomcat3.2.3\\webapps\\lucene\\index", new SimpleAnalyzer(),
true); 
String dirBase = "d:\\jakarta-tomcat3.2.3\\webapps\\lucene\\docs";
File docDir = new File(dirBase);
String[] docFiles  = docDir.list();
InputStreamReader isr;
InputStream is;
Document doc;
for(int i=0;i<docFiles.length;i++)
   { 
  File tempFile = new File(dirBase + "\\" + docFiles[i]);
  if(tempFile.isFile()==true)
    {
    System.out.println("Indexing File :" + docFiles[i]);
    is = new FileInputStream(tempFile);
    isr=new InputStreamReader(is,"utf-8");
       doc= new Document();
       doc.add(Field.UnIndexed("path",tempFile.toString()));
       doc.add(Field.Text("abc",(Reader)isr));
       doc.add(Field.Text("all","sansui"));
       iw.addDocument(doc);
       is.close();
       isr.close();
      doc=null;
          }
    }
     iw.close();
     is=null;
     isr=null;
     iw=null;
     docDir=null;
 
     System.out.println("Indexing Complete");

/**************************************************************************************************************************************/

Now when i try to search the contents and get the field called abc by using the method doc.get("abc")
, i get null as the output.

Can anyone please tell me where i am going wrong .

Thanks And Regards
Harpreet

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message