lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: Errors while running LIA code.
Date Thu, 06 Dec 2007 14:13:47 GMT
Wow, sure enough there is a bug in LIA's SAXXMLHandler!   After all  
these years!   We did not have it registered to run by default in the  
examples - it uses the Digester implementation instead of SAX.

Mike's suggested fix works fine for me, changing the attributeMap  
declaration to be this:

   private HashMap attributeMap = new HashMap();

Here's how I ran this:

   1) Downloaded http://www.ehatchersolutions.com/downloads/ 
LuceneInAction.zip - sorry, lucenebook.com is broken at the moment :(

   2) Unzipped it, ran "ant" to build the base indexes.  Ran "ant  
test" to verify all was working fine.

   3) Ran "ant ExtensionHandler" to run the handling test, and put in  
"src/lia/handlingtypes/data/addressbook.xml".   That works because  
the XML handler is set to Digester.  If you change src/lia/ 
handlingtypes/framework/handler.properties to have

    xml  = lia.handlingtypes.xml.SAXXMLHandler

instead, it'll fail until you add the above "new HashMap()" to the mix.

	Erik

p.s. Otis!!!!!!  :)


On Dec 6, 2007, at 5:06 AM, Liaqat Ali wrote:

> Michael McCandless wrote:
>> See this thread for one suggestion:
>>
>>     http://www.gossamer-threads.com/lists/lucene/java-user/55465
>>
>> Mike
>>
>> "Liaqat Ali" <liaqatalimian@gmail.com> wrote:
>>
>>> Hi
>>>
>>> I am trying to run a code from Lucene In Action, but it generate  
>>> some errors.There is one one warning at compilation time and the  
>>> errors generate at run time. Given below the code and errors.  
>>> Kindly give me some clue. thanks...
>>>
>>> *_Code:_*
>>>
>>> ///package lia.handlingtypes.xml;
>>> import lia.handlingtypes.framework.DocumentHandler;
>>> import lia.handlingtypes.framework.DocumentHandlerException;
>>> import org.xml.sax.helpers.DefaultHandler;
>>> import org.xml.sax.SAXException;
>>> import org.xml.sax.Attributes;
>>> import javax.xml.parsers.SAXParser;
>>> import javax.xml.parsers.SAXParserFactory;
>>> import javax.xml.parsers.ParserConfigurationException;
>>> import org.apache.lucene.document.Document;
>>> import org.apache.lucene.document.Field;
>>> import java.io.File;
>>> import java.io.IOException;
>>> import java.io.InputStream;
>>> import java.io.FileInputStream;
>>> import java.util.Iterator;
>>> import java.util.HashMap;
>>>
>>> public class SAXXMLHandler
>>>   extends DefaultHandler implements DocumentHandler {
>>>
>>>   /** A buffer for each XML element */
>>>   private StringBuffer elementBuffer = new StringBuffer();
>>>   private HashMap attributeMap;
>>>
>>>   private Document doc;
>>>
>>>   public Document getDocument(InputStream is)
>>>     throws DocumentHandlerException {
>>>
>>>     SAXParserFactory spf = SAXParserFactory.newInstance();
>>>     try {
>>>       SAXParser parser = spf.newSAXParser();
>>>       parser.parse(is, this);
>>>     }
>>>     catch (IOException e) {
>>>       throw new DocumentHandlerException(
>>>         "Cannot parse XML document", e);
>>>     }
>>>     catch (ParserConfigurationException e) {
>>>       throw new DocumentHandlerException(
>>>         "Cannot parse XML document", e);
>>>     }
>>>     catch (SAXException e) {
>>>       throw new DocumentHandlerException(
>>>         "Cannot parse XML document", e);
>>>     }
>>>
>>>     return doc;
>>>   }
>>>
>>>   public void startDocument() {
>>>     doc = new Document();
>>>   }
>>>
>>>   public void startElement(String uri, String localName,
>>>     String qName, Attributes atts)
>>>     throws SAXException {
>>>
>>>     elementBuffer.setLength(0);
>>>     attributeMap.clear();
>>>     if (atts.getLength() > 0) {
>>>       attributeMap = new HashMap();
>>>       for (int i = 0; i < atts.getLength(); i++) {
>>>         attributeMap.put(atts.getQName(i), atts.getValue(i));
>>>       }
>>>     }
>>>   }
>>>
>>>   public void characters(char[] text, int start, int length) {
>>>     elementBuffer.append(text, start, length);
>>>   }
>>>
>>>   public void endElement(String uri, String localName, String qName)
>>>     throws SAXException {
>>>     if (qName.equals("address-book")) {
>>>       return;
>>>     }
>>>     else if (qName.equals("contact")) {
>>>       Iterator iter = attributeMap.keySet().iterator();
>>>       while (iter.hasNext()) {
>>>         String attName = (String) iter.next();
>>>         String attValue = (String) attributeMap.get(attName);
>>>         doc.add(new Field(attName,  
>>> attValue,Field.Store.YES,Field.Index.TOKENIZED));
>>>       }
>>>     }
>>>     else {
>>>       doc.add(new Field(qName, elementBuffer.toString 
>>> (),Field.Store.YES,Field.Index.TOKENIZED));
>>>     }
>>>   }
>>>
>>>   public static void main(String args[]) throws Exception {
>>>     SAXXMLHandler handler = new SAXXMLHandler();
>>>
>>>     //File file = new File ("d:\\addressbook.xml");
>>>
>>>     Document doc = handler.getDocument(new FileInputStream(new  
>>> File(args[0])));
>>>
>>>     //Document doc = handler.getDocument(new FileInputStream(file));
>>>
>>>     System.out.println(doc);
>>>   }
>>> }
>>> /
>>>
>>> _*Errors:
>>>
>>> *_/D:\>java SAXXMLHandler d:\addressbook.xml
>>>
>>> Exception in thread "main" java.lang.NullPointerException
>>>         at SAXXMLHandler.startElement(SAXXMLHandler.java:66)
>>>         at  
>>> com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.startEl 
>>> e
>>> ment(Unknown Source)
>>>         at  
>>> com.sun.org.apache.xerces.internal.impl.dtd.XMLDTDValidator.startEle 
>>> m
>>> ent(Unknown Source)
>>>         at  
>>> com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerIm 
>>> p
>>> l.scanStartElement(Unknown Source)
>>>         at  
>>> com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl 
>>> $Conten
>>> tDriver.scanRootElementHook(Unknown Source)
>>>         at  
>>> com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerIm 
>>> p
>>> l$FragmentContentDriver.next(Unknown Source)
>>>         at  
>>> com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl 
>>> $Prolog
>>> Driver.next(Unknown Source)
>>>         at  
>>> com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next( 
>>> U
>>> nknown Source)
>>>         at  
>>> com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerIm 
>>> p
>>> l.scanDocument(Unknown Source)
>>>         at  
>>> com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse( 
>>> U
>>> nknown Source)
>>>         at  
>>> com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse( 
>>> U
>>> nknown Source)
>>>         at  
>>> com.sun.org.apache.xerces.internal.parsers.XMLParser.parse 
>>> (Unknown So
>>> urce)
>>>         at  
>>> com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse 
>>> (Un
>>> known Source)
>>>         at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl 
>>> $JAXPSAXParser.p
>>> arse(Unknown Source)
>>>         at javax.xml.parsers.SAXParser.parse(Unknown Source)
>>>         at javax.xml.parsers.SAXParser.parse(Unknown Source)
>>>         at SAXXMLHandler.getDocument(SAXXMLHandler.java:39)
>>>         at SAXXMLHandler.main(SAXXMLHandler.java:102)
>>> /_*
>>> *_
>>>
>>> -------------------------------------------------------------------- 
>>> -
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>>
> Hello, Still facing the same problem after making the suggested  
> change.  What would be a solution for this?
>
> Liaqat
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message