xerces-j-users mailing list archives

From "Jean Georges PERRIN" <...@jgp.net>
Subject RE: Enhancing parsing performance
Date Mon, 13 Jan 2003 23:00:01 GMT
Hi,

Thanks for the encouraging message!

I was timing the whole method before; I now time parser creation and parsing
separately.

I changed my code to:
  public void load () {
    DOMParser parser;
    Logger log = ThinStructureConfiguration.getInstance().getLogger();
    
    try {
      long start = System.currentTimeMillis();
      parser = new DOMParser();
      long stop = System.currentTimeMillis();
      log.finest ("Creating parser took " + (stop - start) + " ms");
    }
    catch (Exception e) {
      log.severe ("Error: Unable to instantiate parser");
      return;
    }

    try {
      long start = System.currentTimeMillis();
      parser.parse(m_file.toURI().toString());
      long stop = System.currentTimeMillis();
      log.finest ("Parsing of " + m_file.getName() + " took "
          + (stop - start) + " ms");
      m_document = parser.getDocument();
    }
    catch (SAXParseException e) {
      // ignore
    }
    catch (Exception e) {
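      String msg = "Error: Parse error occurred, " + e.getMessage();
      // A SAXException may wrap the real cause, but getException() can
      // also return null, so only unwrap when there is something inside.
      if (e instanceof SAXException
          && ((SAXException) e).getException() != null) {
        e = ((SAXException) e).getException();
      }
      msg += '\n' + e.toString();
      log.severe (msg);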
    }
  }

Results are:
Jan 13, 2003 11:52:20 PM com.awoma.ts.ui.impl.XHTML11Window load
FINEST: Creating parser took 251 ms
Jan 13, 2003 11:52:25 PM com.awoma.ts.ui.impl.XHTML11Window load
FINEST: Parsing of emailpassword.xhtml took 5227 ms
Jan 13, 2003 11:52:25 PM com.awoma.ts.ui.Store add
INFO: Window definition emailpassword.xhtml added.
Jan 13, 2003 11:52:25 PM com.awoma.ts.ui.impl.XHTML11Window load
FINEST: Creating parser took 10 ms
Jan 13, 2003 11:52:29 PM com.awoma.ts.ui.impl.XHTML11Window load
FINEST: Parsing of emailpassword2.xhtml took 3085 ms
Jan 13, 2003 11:52:29 PM com.awoma.ts.ui.Store add
INFO: Window definition emailpassword2.xhtml added.
Jan 13, 2003 11:52:29 PM com.awoma.ts.ui.impl.XHTML11Window load
FINEST: Creating parser took 0 ms
Jan 13, 2003 11:52:29 PM com.awoma.ts.ui.impl.XHTML11Window load
FINEST: Parsing of emailpassword3.xhtml took 10 ms
Jan 13, 2003 11:52:29 PM com.awoma.ts.ui.Store add
INFO: Window definition emailpassword3.xhtml added.
Jan 13, 2003 11:52:29 PM com.awoma.ts.ui.impl.XHTML11Window load
FINEST: Creating parser took 0 ms
Jan 13, 2003 11:52:31 PM com.awoma.ts.ui.impl.XHTML11Window load
FINEST: Parsing of emailpassword4.xhtml took 2774 ms

All files are identical except #3, from which I removed all references to
external resources.

I use Xerces J 2.2.1 (according to build.xml).

Conclusions & questions:
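1/ Creating a DOMParser is slow the first time (251 ms) but negligible
afterwards (0 to 10 ms), so there is little to gain there.
2/ My parser seems to fetch external resources over the network, presumably
to check validity: the files that keep their external references take roughly
3 to 5 s to parse, while file #3 takes 10 ms. How can I switch that off
without modifying all my files?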
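
To make question 2 concrete, here is a minimal sketch of one approach that might work: registering a SAX EntityResolver that answers every external entity request locally, so the DTD named in the DOCTYPE is never downloaded. The sketch uses the JAXP DocumentBuilder so that it runs without extra jars; DOMParser also has a setEntityResolver method that should accept the same resolver. The class name OfflineParse and the constant EMPTY_RESOLVER are made up for illustration. Xerces also documents a feature, http://apache.org/xml/features/nonvalidating/load-external-dtd, which might achieve the same through setFeature; I have not checked how 2.2.1 behaves with it.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

class OfflineParse {
  // Hypothetical resolver: every external entity (including the DTD named
  // in the DOCTYPE) is answered with empty content, so nothing is fetched.
  static final EntityResolver EMPTY_RESOLVER =
      (publicId, systemId) -> new InputSource(new StringReader(""));

  static Document parse(String xml) throws Exception {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    factory.setValidating(false); // well-formedness only, no DTD validation
    DocumentBuilder builder = factory.newDocumentBuilder();
    builder.setEntityResolver(EMPTY_RESOLVER);
    return builder.parse(new InputSource(new StringReader(xml)));
  }

  public static void main(String[] args) throws Exception {
    String xml = "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.1//EN\" "
        + "\"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd\">"
        + "<html><body><p>hi</p></body></html>";
    Document doc = parse(xml);
    // Print the root element name to show that the parse completed.
    System.out.println(doc.getDocumentElement().getNodeName());
  }
}
```

One caveat with an empty replacement: entities that the XHTML DTD declares (for example &nbsp;) would no longer be defined, so files that use them may fail to parse. A resolver that maps the public ID to a local copy of the DTD would avoid both the download and that problem.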

jgp 

> -----Original Message-----
> From: Simon Kitching [mailto:simon@ecnetwork.co.nz]
> Sent: Monday, January 13, 2003 23:24
> To: jgp@jgp.net
> Cc: xerces-j-user@xml.apache.org
> Subject: Re: Enhancing parsing performance
> 
> Hi Jean Georges,
> 
> Firstly, does the document you are parsing contain a DTD or schema
> reference? If it uses http://acme.com/xyz.dtd, then much of your parsing
> time may actually be in retrieval of the remote dtd. And if the
> dtd/schema is large then time will be spent processing it. If this is
> the case, there are optimisations available for both these problems.
> 
> Secondly, you don't say exactly what you are timing. Is it the complete
> application time, or the time taken by the method you include below, or
> just the time for the parse method?
> 
> Thirdly, you don't mention which version of Xerces you are using...
> 
> Providing information on the above would allow people to provide better
> suggestions for you..
> 
> I certainly see better performance than you do, so there is hope :-)
> 
> Regards,
> 
> Simon
> 
> On Tue, 2003-01-14 at 10:56, Jean Georges PERRIN wrote:
> > Hi,
> >
> > Thanks for those who helped me with cloning...
> >
> > I am a little surprised by the performance. Maybe I am doing some basic
> > things wrong.
> >
> > I am parsing a 3 KB XHTML file and it takes about 4 s, while cloning the
> > tree takes a negligible amount of time (about 10 ms). This is on an Athlon
> > XP 1800+ running Windows XP (sure, I could switch to Linux, but that is
> > not planned for now :) ).
> >
> > My code for parsing:
> >   protected void load () {
> >     DOMParser parser;
> >
> >     try {
> >       parser = new DOMParser();
> >     }
> >     catch (Exception e) {
> >       log.severe ("Error: Unable to instantiate parser");
> >       return;
> >     }
> >
> >     try {
> >       parser.parse(m_file.toURI().toString());
> >       m_document = parser.getDocument();
> >     }
> >     catch (SAXParseException e) {
> >       // ignore
> >     }
> >     catch (Exception e) {
> >       String msg;
> >       msg = ("Error: Parse error occurred, " + e.getMessage());
> >       if (e instanceof SAXException) {
> >         e = ((SAXException)e).getException();
> >       }
> >       msg += '\n' + e.toString();
> >       log.severe (msg);
> >     }
> >   }
> >
> > Questions:
> > 1/ Would making my parser static speed things up?
> > 2/ Can I pre-create some objects and reuse them?
> > 3/ Are there any checks I can turn off?
> >
> > My code for cloning:
> >   public Object clone() {
> >     XHTML11Window win = new XHTML11Window(m_file);
> >     win.m_document = new DocumentImpl();
> >     // importNode only returns a copy; it must still be attached
> >     win.m_document.appendChild(
> >         win.m_document.importNode(m_document.getDocumentElement(), true));
> >
> >     return win;
> >   }
> >
> > I haven't checked that they really were cloned, but it looks as if they
> > were...
> >
> > Any tips are more than welcome!
> >
> > Jean Georges PERRIN
> >
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: xerces-j-user-unsubscribe@xml.apache.org
> > For additional commands, e-mail: xerces-j-user-help@xml.apache.org
> >
> >



