lucene-java-user mailing list archives

From "Harpreet S Walia" <harpr...@sansuisoftware.com>
Subject Re: Problem in unicode field value retrival
Date Mon, 10 Jun 2002 12:21:57 GMT
Hi,

That was the problem, thanks :-). I am still struggling to get Lucene to
search non-English Unicode content. It works partially with SimpleAnalyzer
but doesn't return any results with StandardAnalyzer. Is there a way to
output the exact contents that are going into the index?

Thanks and regards,
Harpreet
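
One rough way to see what a letter-based analyzer would put into the index is to tokenize a sample string outside Lucene. The sketch below only approximates what SimpleAnalyzer does (split at anything Character.isLetter rejects, then lowercase); it is not Lucene code, the class name LetterTokenSketch is made up for illustration, and it walks code points where Lucene 1.x worked on chars.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: approximates letter-based tokenizing.
class LetterTokenSketch {

    // Split text at every non-letter code point and lowercase the pieces.
    static List<String> tokens(String text) {
        List<String> out = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (Character.isLetter(cp)) {
                current.appendCodePoint(Character.toLowerCase(cp));
            } else if (current.length() > 0) {
                // A non-letter ends the current token.
                out.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            out.add(current.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        String sample = args.length > 0 ? args[0] : "Hello, World 42";
        for (String t : tokens(sample)) {
            System.out.println("[" + t + "]");
        }
    }
}
```

If the content is in a script whose vowel signs are combining marks rather than letters (for example Devanagari U+093F, which Character.isLetter rejects), words get cut at those marks, which could explain partial matches. That is only a guess, since the thread does not say which script is involved.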


----- Original Message -----
From: "Ian Lea" <ian@digimem.net>
To: "Harpreet S Walia" <harpreet@sansuisoftware.com>
Cc: "Lucene Users List" <lucene-user@jakarta.apache.org>
Sent: Monday, June 10, 2002 5:15 PM
Subject: Re: Problem in unicode field value retrival


> I don't think you can retrieve the contents of Fields that have
> been loaded by a Reader.  From the javadoc for Field:
>
> Text(String name, Reader value)
>
>    Constructs a Reader-valued Field that is tokenized and indexed, but is
>    not stored in the index verbatim.
>
>
> --
> Ian.
> ian@digimem.net
>
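A possible workaround, sketched under the assumption that the files are small enough to hold in memory: read the whole file into a String first and pass that to Field.Text(String, String), which the Field javadoc describes as stored in the index. The helper below is plain JDK code; the class name ReadAllSketch is hypothetical, and it uses try-with-resources, which needs a much newer Java than was current when this thread was written.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Lucene.
class ReadAllSketch {

    // Decode the stream as UTF-8 and return its full contents as a String.
    static String readAll(InputStream in) throws IOException {
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a FileInputStream over one of the documents.
        byte[] bytes = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        String contents = readAll(new ByteArrayInputStream(bytes));
        System.out.println(contents);
        // In the indexing loop this would replace the Reader-valued field:
        //   doc.add(Field.Text("abc", contents));
    }
}
```

With the String-valued field, doc.get("abc") should return the text instead of null, at the cost of a larger index because the contents are stored as well as indexed.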
>
> > harpreet@sansuisoftware.com (Harpreet S Walia) wrote
> >
> > Hi
> >
> > I am trying to index and search Unicode (UTF-8) content. The code I am using to index the documents is as follows:
> >
> >
> > /**************************************************************************/
> > IndexWriter iw = new IndexWriter("d:\\jakarta-tomcat3.2.3\\webapps\\lucene\\index", new SimpleAnalyzer(), true);
> > String dirBase = "d:\\jakarta-tomcat3.2.3\\webapps\\lucene\\docs";
> > File docDir = new File(dirBase);
> > String[] docFiles  = docDir.list();
> > InputStreamReader isr;
> > InputStream is;
> > Document doc;
> > for(int i=0;i<docFiles.length;i++)
> > {
> >     File tempFile = new File(dirBase + "\\" + docFiles[i]);
> >     if(tempFile.isFile()==true)
> >     {
> >         System.out.println("Indexing File :" + docFiles[i]);
> >         is = new FileInputStream(tempFile);
> >         isr = new InputStreamReader(is,"utf-8");
> >         doc = new Document();
> >         doc.add(Field.UnIndexed("path",tempFile.toString()));
> >         doc.add(Field.Text("abc",(Reader)isr));
> >         doc.add(Field.Text("all","sansui"));
> >         iw.addDocument(doc);
> >         is.close();
> >         isr.close();
> >         doc=null;
> >     }
> > }
> > iw.close();
> > is=null;
> > isr=null;
> > iw=null;
> > docDir=null;
> >
> >      System.out.println("Indexing Complete");
> >
> >
> > /**************************************************************************/
> >
> > Now when I try to search the contents and get the field called abc by using the method doc.get("abc"), I get null as the output.
> >
> > Can anyone please tell me where I am going wrong?
> >
> > Thanks And Regards
> > Harpreet
> >
> ----------------------------------------------------------------------
> Searchable personal storage and archiving from http://www.digimem.net/
>
>


--------------------------------------------------------------------------------


> --
> To unsubscribe, e-mail: <mailto:lucene-user-unsubscribe@jakarta.apache.org>
> For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>

