lucene-java-user mailing list archives

From "Allel BenBrahim" <abenbra...@object-ive.com>
Subject Information about index fields created by Lucene when using Nutch
Date Wed, 04 May 2011 09:09:51 GMT
Hello

I'm using Lucene and Nutch, but I don't know which fields Nutch creates in
the documents it indexes. I wrote this Java program to list them:

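import java.io.File;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Fieldable;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

Directory dir = FSDirectory.open(new File("C:/Users/MyWebPage/index"));
IndexSearcher search = new IndexSearcher(dir);
int numberDoc = search.maxDoc();
System.out.println("number of doc " + numberDoc);
for (int i = 0; i < numberDoc; i++)
{
      System.out.println("Document numero " + i);
      Document doc = search.doc(i);
      // getFields() already returns every stored field, including repeated
      // names, so no inner loop over getFieldables() is needed.
      for (Fieldable f : doc.getFields())
      {
            System.out.println("\t" + f.name() + " " + f.stringValue());
      }
      System.out.println("*******************");
}
search.close();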

 

I get this result:

....
number of doc 1907

Document numero 0
      title Convention and Visitors Office
      segment 20110502142927
      boost 0.13529637
      digest d07c6f19b2efaa8739754e9e9ff75fcc
      tstamp 20110502122931566
      url http://ar.info.com/
.....
...
Document numero 90
      title Who are we? - Presentation of the Paris Convention Bureau
      segment 20110502144050
      boost 0.0016601664
      digest 62ee8c0ff6c2ab7c91599f3c3ff18735
      tstamp 20110502125316832
      url http://convention.info.com/en/about-us/

 

 

My question is:

What are segment, boost, digest, and tstamp, and how can I read them?
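
For tstamp at least, the values above look like a timestamp in the form yyyyMMddHHmmssSSS. The sketch below reads one under that assumption, and under the further assumption that the value is in GMT; neither is confirmed here. The class name TstampDemo and the method parseTstamp are made up for illustration and are not part of Lucene or Nutch.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class TstampDemo {

    // Assumption: tstamp is yyyyMMddHHmmssSSS (millisecond resolution) in GMT.
    static Date parseTstamp(String tstamp) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyyMMddHHmmssSSS");
        fmt.setTimeZone(TimeZone.getTimeZone("GMT"));
        fmt.setLenient(false); // reject out-of-range values such as month 13
        try {
            return fmt.parse(tstamp);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unexpected tstamp: " + tstamp, e);
        }
    }

    public static void main(String[] args) {
        // Value copied from "Document numero 0" in the output above.
        Date fetched = parseTstamp("20110502122931566");
        System.out.println(fetched.toInstant());
    }
}
```

If the assumption holds, the same idea should apply to the segment value with the shorter pattern yyyyMMddHHmmss, although its time zone may differ.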

 

Thanks for your help.

 

 

 

