lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From David Kendig <dken...@gcmd.nasa.gov>
Subject Re: Parsing and Indexing XML Docs
Date Tue, 18 Mar 2003 22:11:45 GMT
Bummer, I get the same thing with Xerces.  I do not suspect the XML file
itself since it is from a separate app that has been operational for
over a year.  Does anyone maintain the sandbox contributions?

Dave


Traceback (innermost last):
  File "./indexTest.py", line 22, in ?
java.lang.StringIndexOutOfBoundsException: String index out of range:
200
        at
org.apache.xerces.framework.XMLParser.parse(XMLParser.java:1111)
        at
org.xml.sax.helpers.XMLReaderAdapter.parse(XMLReaderAdapter.java:223)
        at javax.xml.parsers.SAXParser.parse(SAXParser.java:314)
        at javax.xml.parsers.SAXParser.parse(SAXParser.java:253)
        at
org.apache.lucenesandbox.xmlindexingdemo.XMLDocumentHandlerSAX.<init>(XMLDocumentHandlerSAX.java:34)
        at
org.apache.lucenesandbox.xmlindexingdemo.IndexFiles.indexDocs(IndexFiles.java:104)


    Doesn't that look like an error in Crimson?
    If I were you I'd use Xerces instead, I always had a better feeling
    about Xerces, and I think that demo code doesn't have anything
    Crimson-specific hard-coded in it.
    
    Otis
    
    --- David Kendig <dkendig@gcmd.nasa.gov> wrote:
    > I am having problems with the
    > lucene-sandbox/contributions/XML-Indexing-Demo.  I get the following
    > error when I index my XML documents with the SAX parser in Java 1.4.1
    > 
    > java.lang.StringIndexOutOfBoundsException: String index out of range:
    > 200
    >         at
    > org.apache.crimson.parser.Parser2.parseInternal(Parser2.java:524)
    >         at org.apache.crimson.parser.Parser2.parse(Parser2.java:305)
    >         at
    > org.apache.crimson.parser.XMLReaderImpl.parse(XMLReaderImpl.java:442)
    >         at
    > org.xml.sax.helpers.XMLReaderAdapter.parse(XMLReaderAdapter.java:223)
    >         at javax.xml.parsers.SAXParser.parse(SAXParser.java:314)
    >         at javax.xml.parsers.SAXParser.parse(SAXParser.java:253)
    >         at
    >
    org.apache.lucenesandbox.xmlindexingdemo.XMLDocumentHandlerSAX.<init>(XMLDocumentHandlerSAX.java:34)
    > 
    > 
    > I thought it may be related to the depricated messages I get when I
    > build the XML demo so I replaced the depricated calls.  This was
    > mostly
    > by extending from DefaultHandler instead of BaseHandler.  Now my XML
    > doc
    > is parsed but there are no events generated that call startElement()
    > and
    > stopElement(). I need stopElement() to be called to add the field to
    > my
    > Lucene document.  Any one else had any problems like this?
    > 
    > Thanks,
    > 
    > Dave Kendig
    > 
    
    
    __________________________________________________
    Do you Yahoo!?
    Yahoo! Platinum - Watch CBS' NCAA March Madness, live on your desktop!
    http://platinum.yahoo.com
    
    ---------------------------------------------------------------------
    To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
    For additional commands, e-mail: lucene-user-help@jakarta.apache.org
    
    

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message