commons-user mailing list archives

From Patrick Diviacco <patrick.divia...@gmail.com>
Subject Re: [digester] java.lang.NullPointerException only for a specific file
Date Tue, 29 Mar 2011 09:20:12 GMT
hey Simone,

I'm now wondering whether it would be better to import my XML document into a
database and work with MySQL.

I guess it is faster to scan a MySQL database from Java than an XML
document. What do you think?

I'm using Digester combined with Apache Lucene to run queries (65MB in total,
stored in one XML file) against a collection (again 65MB of XML).

thanks



On 28 March 2011 17:20, Simone Tripodi <simonetripodi@apache.org> wrote:

> Hi Patrick,
> take a look at this example[1]: all you have to do is obtain a
> ContentHandler instance as shown, then fire SAX events on it while
> parsing the original document.
> It's more efficient and consumes less memory than a StringBuffer.
> Simo
>
> [1] http://www.stylusstudio.com/xmldev/200502/post20440.html
>
> http://people.apache.org/~simonetripodi/
> http://www.99soft.org/
>
>
>
> On Mon, Mar 28, 2011 at 4:56 PM, Patrick Diviacco
> <patrick.diviacco@gmail.com> wrote:
> > hi!
> >
> > What should I use instead of StringBuffer?
> >
> > Is there any example or tutorial?
> >
> > thanks
> > Patrick
> >
> > On 28 March 2011 16:53, Simone Tripodi <simonetripodi@apache.org> wrote:
> >
> >> Hi Patrick,
> >> nice to know you quickly fixed the issue before anybody could
> >> offer help! :)
> >>
> >> As a side note, I would suggest considering a different approach
> >> to the XML generation than the StringBuffer: since you're parsing
> >> a large dataset, streaming the data while parsing would improve
> >> performance and reduce memory consumption.
> >>
> >> Just my 2 cents, have a nice day,
> >> Simo
> >>
> >> http://people.apache.org/~simonetripodi/
> >> http://www.99soft.org/
> >>
> >>
> >>
> >> On Mon, Mar 28, 2011 at 2:28 PM, Patrick Diviacco
> >> <patrick.diviacco@gmail.com> wrote:
> >> > I've solved it. The issue was a row in the train.xml file. To find it,
> >> > I printed the source file rows while processing. However, that was
> >> > feasible only because the parsing takes 4 minutes.
> >> >
> >> > I'm wondering how to debug such issues with a much bigger text file.
> >> >
> >> > thanks
> >> >
> >> > On 28 March 2011 14:14, Patrick Diviacco <patrick.diviacco@gmail.com> wrote:
> >> >
> >> >> And these are the files:
> >> >>
> >> >> http://dl.dropbox.com/u/72686/test.xml
> >> >>
> >> >> http://dl.dropbox.com/u/72686/train.xml
> >> >>
> >> >> thanks
> >> >>
> >> >>
> >> >> On 28 March 2011 14:13, Patrick Diviacco <patrick.diviacco@gmail.com> wrote:
> >> >>
> >> >>> Hi,
> >> >>>
> >> >>> I have a 74MB XML document that I've split into 2 documents: 52MB and
> >> >>> 22MB respectively.
> >> >>>
> >> >>> I'm parsing the files using the Commons Digester library, and everything
> >> >>> works perfectly for the small file, but I get a NullPointerException with
> >> >>> the big one.
> >> >>>
> >> >>> I don't think the issue is the code, because it works for the small
> >> >>> file... I guess the problem is with the file itself.
> >> >>>
> >> >>> I've parsed the files with the same parser, so I don't think the files
> >> >>> have issues either.
> >> >>>
> >> >>> In conclusion, I don't know where the issue is. This is the code:
> >> >>> http://pastie.org/1726063
> >> >>>
> >> >>> This is the exception
> >> >>> SEVERE: End event threw exception
> >> >>> java.lang.reflect.InvocationTargetException
> >> >>>     at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
> >> >>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >> >>>     at java.lang.reflect.Method.invoke(Method.java:597)
> >> >>>     at org.apache.commons.beanutils.MethodUtils.invokeMethod(MethodUtils.java:216)
> >> >>>     at org.apache.commons.digester.SetNextRule.end(SetNextRule.java:220)
> >> >>>     at org.apache.commons.digester.Rule.end(Rule.java:257)
> >> >>>     at org.apache.commons.digester.Digester.endElement(Digester.java:1345)
> >> >>>     at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(AbstractSAXParser.java:601)
> >> >>>     at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanEndElement(XMLDocumentFragmentScannerImpl.java:1782)
> >> >>>     at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2938)
> >> >>>     at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648)
> >> >>>     at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:511)
> >> >>>     at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:808)
> >> >>>     at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
> >> >>>     at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:119)
> >> >>>     at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1205)
> >> >>>     at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:522)
> >> >>>     at org.apache.commons.digester.Digester.parse(Digester.java:1871)
> >> >>>     at CentroidGenerator.main(CentroidGenerator.java:137)
> >> >>> Caused by: java.lang.NullPointerException
> >> >>>     at CentroidGenerator.nextItem(CentroidGenerator.java:62)
> >> >>>     ... 19 more
> >> >>> Exception in thread "main" java.lang.NullPointerException
> >> >>>     at org.apache.commons.digester.Digester.createSAXException(Digester.java:3363)
> >> >>>     at org.apache.commons.digester.Digester.createSAXException(Digester.java:3389)
> >> >>>     at org.apache.commons.digester.Digester.endElement(Digester.java:1348)
> >> >>>     at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(AbstractSAXParser.java:601)
> >> >>>     at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanEndElement(XMLDocumentFragmentScannerImpl.java:1782)
> >> >>>     at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2938)
> >> >>>     at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648)
> >> >>>     at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:511)
> >> >>>     at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:808)
> >> >>>     at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
> >> >>>     at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:119)
> >> >>>     at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1205)
> >> >>>     at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:522)
> >> >>>     at org.apache.commons.digester.Digester.parse(Digester.java:1871)
> >> >>>     at CentroidGenerator.main(CentroidGenerator.java:137)
> >> >>> Caused by: java.lang.NullPointerException
> >> >>>     at CentroidGenerator.nextItem(CentroidGenerator.java:62)
> >> >>>     at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
> >> >>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >> >>>     at java.lang.reflect.Method.invoke(Method.java:597)
> >> >>>     at org.apache.commons.beanutils.MethodUtils.invokeMethod(MethodUtils.java:216)
> >> >>>     at org.apache.commons.digester.SetNextRule.end(SetNextRule.java:220)
> >> >>>     at org.apache.commons.digester.Rule.end(Rule.java:257)
> >> >>>     at org.apache.commons.digester.Digester.endElement(Digester.java:1345)
> >> >>>     ... 12 more
> >> >>>
> >> >>> thanks
> >> >>>
> >> >>
> >> >>
> >> >
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: user-unsubscribe@commons.apache.org
> >> For additional commands, e-mail: user-help@commons.apache.org
> >>
> >>
> >
>
>
>
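Simone's suggestion in the thread (obtain a ContentHandler and fire SAX events at it instead of appending to a StringBuffer) might look roughly like the sketch below. It uses only the JDK's JAXP classes; the class name StreamingXmlWriter, the method writeItems, and the items/item element names are made up for illustration and do not come from the original code on pastie.org.

```java
import java.io.PrintWriter;
import java.io.Writer;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical example class: streams XML to a Writer element by element,
// so the whole document never has to sit in memory as one big string.
class StreamingXmlWriter {

    static void writeItems(Writer out, String[] texts) throws Exception {
        // A TransformerHandler is a ContentHandler that serializes the SAX
        // events it receives to the configured Result.
        SAXTransformerFactory factory =
                (SAXTransformerFactory) TransformerFactory.newInstance();
        TransformerHandler handler = factory.newTransformerHandler();
        handler.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        handler.setResult(new StreamResult(out));

        handler.startDocument();
        handler.startElement("", "items", "items", new AttributesImpl());
        for (int i = 0; i < texts.length; i++) {
            AttributesImpl attrs = new AttributesImpl();
            attrs.addAttribute("", "id", "id", "CDATA", Integer.toString(i));
            handler.startElement("", "item", "item", attrs);
            char[] chars = texts[i].toCharArray();
            // Special characters such as '&' are escaped by the serializer.
            handler.characters(chars, 0, chars.length);
            handler.endElement("", "item", "item");
        }
        handler.endElement("", "items", "items");
        handler.endDocument();
    }

    public static void main(String[] args) throws Exception {
        PrintWriter out = new PrintWriter(System.out);
        writeItems(out, new String[] {"first row", "second & third"});
        out.flush();
    }
}
```

In a real Digester setup the same handler calls would presumably be made from the methods that Digester invokes per element, writing to a FileWriter rather than System.out.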

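On the question raised in the thread of how to find an offending row in a much bigger file without printing every row: one option is to keep the SAX Locator and attach it to any failure, so the exception carries the line number. The sketch below uses plain SAX rather than Digester to stay self-contained; LocatingHandler, process, failingLine, and the item element name are hypothetical, and the "empty item" failure only stands in for whatever caused the NullPointerException in the original nextItem method.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: remembers the parser's Locator so that a failure
// while processing an element can be reported with its position in the file.
class LocatingHandler extends DefaultHandler {
    private Locator locator;
    private final StringBuilder text = new StringBuilder();

    @Override
    public void setDocumentLocator(Locator locator) {
        this.locator = locator;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        text.setLength(0); // start collecting text for the new element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if (!"item".equals(qName)) {
            return;
        }
        try {
            process(text.toString());
        } catch (RuntimeException e) {
            // Wrap the failure together with the current line and column.
            throw new SAXParseException("Failed processing <item>", locator, e);
        }
    }

    // Stand-in for the application logic; fails on an empty row.
    void process(String value) {
        if (value.trim().isEmpty()) {
            throw new NullPointerException("empty item");
        }
    }

    // Returns the line number of the first failing item, or -1 if none fails.
    static int failingLine(String xml) throws Exception {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new LocatingHandler());
            return -1;
        } catch (SAXParseException e) {
            return e.getLineNumber();
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<items>\n<item>ok</item>\n<item></item>\n</items>";
        System.out.println("First failing line: " + failingLine(xml));
    }
}
```

With this approach the cost of locating a bad row no longer depends on how long the parse takes, since the position is captured at the moment of failure.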