lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From nachi <nachit...@gmail.com>
Subject Re: indexing going wrong
Date Sun, 12 Aug 2007 01:32:18 GMT
oops...that was a control-c control-v error. Im indexing directory "c:\test"
and using the
index in c:\test for searching. I found the problem to be this - Im reusing
the same document
object in the for loop. I solved it by creating new document each time the
loop runs...
actually when if statement becomes true.

-nachi




On 8/11/07, Erick Erickson <erickerickson@gmail.com> wrote:
>
> A couple of things come to mind. But before I get to them, really, really,
> really get a copy of Luke. It'll allow you to examine your index and
> see if what's in there is really what you expect. It'll save you a world
> of hurt <G>.... Google luke lucene....
>
> Also, use query.toString to see what the query actually looks like.
>
> Your index and the files you're trying to put into that index are both
> "c:/test". Are there really files to start with in that directory?
> And Lucene creates a new index when you specify true in the
> IndexWriter, and I'm not sure how many files it blows away in the process.
>
>
> But none of that matters, because your searcher opens "d:/test" and
> you're indexing into "c:/test" <G>.....
>
>
> Best
> Erick
>
>
> On 8/11/07, nachi <nachitech@gmail.com> wrote:
> >
> > all,
> >
> > No sure if earlier mail went thru..so resending...
> >
> > Im new lucene and Im trying to develope a textual search module. I have
> > written the following code ( this is research code) -
> >
> >
> > File dir = new File("c:/test");
> >   IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(),
> true);
> >   Document doc = new Document();
> >   File[] file = dir.listFiles();
> >   for (File f: file) {
> >    if (f.isFile() && f.canRead()) {
> >     System.out.println(f.getName());
> >     doc.add(new Field("filename",f.getName(),Field.Store.YES,
> > Field.Index.UN_TOKENIZED));
> >     doc.add(new Field("contents", new FileReader(f)));
> >     writer.addDocument(doc);
> >    }
> >   }
> >
> >   System.out.println("count=" + writer.docCount());
> >      writer.optimize();
> >   writer.close();
> >
> >
> >
> > I'm trying to index the contents of test diretory which has only txt
> > files.
> >
> > When I search the index for an particular word, I get the same filename
> > everytime.
> >
> > Here is the code for searching -
> >
> > File dir = new File("D:\\test");
> >   FSDirectory fsdir = FSDirectory.getDirectory(dir);
> >   IndexSearcher d = new IndexSearcher(fsdir);
> >   QueryParser p = new QueryParser("contents",new StandardAnalyzer());
> >   Query q = p.parse("ERROR");
> >   Hits hits = d.search(q);
> >
> >   for (int i = 0; i < hits.length(); i++) {
> >    Document doc = hits.doc(i);
> >    System.out.println(doc.get("filename"));
> >    }
> >   d.close();
> > }
> >
> > Can somebody tell me what I'm doing wrong ? I suspect that there is
> > something wrong in the way I index.
> >
> >
> >
> >
> > --
> > -nachi
> >
>



-- 
-nachi

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message