lucene-java-user mailing list archives

From "Chuck Williams" <ch...@manawiz.com>
Subject RE: Parsing issue
Date Wed, 05 Jan 2005 00:20:04 GMT
I use it and have yet to have a problem with it.  It uses the Xerces API,
so you parse and access HTML files just like XML files.  Very cool.

Chuck
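
To illustrate the "just like XML" point, here is a minimal sketch of pulling
indexable text out of a DOM tree. It is an illustration, not code from this
thread: the class and method names (DomTextSketch, parseWellFormed,
extractText) are made up, and the JDK's own XML parser stands in for NekoHTML
so the example runs without extra jars. With NekoHTML on the classpath, the
Document would instead come from org.cyberneko.html.parsers.DOMParser (call
parse(InputSource), then getDocument()), and the same walker should apply
because the result is a standard org.w3c.dom.Document.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class DomTextSketch {

    /**
     * Parses well-formed markup with the JDK parser. This is only a stand-in:
     * real-world HTML would go through NekoHTML's DOMParser instead.
     */
    public static Document parseWellFormed(String markup) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(markup)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse markup", e);
        }
    }

    /** Collects the text under a node, skipping script and style elements. */
    public static String extractText(Node node) {
        StringBuilder out = new StringBuilder();
        collect(node, out);
        // Collapse runs of whitespace so the result is one clean line.
        return out.toString().trim().replaceAll("\\s+", " ");
    }

    private static void collect(Node node, StringBuilder out) {
        if (node.getNodeType() == Node.TEXT_NODE) {
            out.append(node.getNodeValue()).append(' ');
            return;
        }
        // Compare case-insensitively: NekoHTML reports element names in
        // upper case by default, while the JDK parser keeps them as written.
        String name = node.getNodeName();
        if ("script".equalsIgnoreCase(name) || "style".equalsIgnoreCase(name)) {
            return;
        }
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            collect(c, out);
        }
    }

    public static void main(String[] args) {
        Document doc = parseWellFormed(
                "<html><head><title>Hi</title><style>p{}</style></head>"
                + "<body><p>Hello <b>world</b></p></body></html>");
        System.out.println(extractText(doc));
    }
}
```

The extracted string is what you would then hand to a Lucene field in place
of the text IndexHTML produces.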

  > -----Original Message-----
  > From: Otis Gospodnetic [mailto:otis_gospodnetic@yahoo.com]
  > Sent: Tuesday, January 04, 2005 2:05 PM
  > To: Lucene Users List
  > Subject: Re: Parsing issue
  > 
  > That's the correct place to look and it includes code samples.
  > Yes, it's a Jar file that you add to the CLASSPATH and use ... hm,
  > normally.... programmatically, yes :).
  > 
  > Otis
  > 
  > --- Hetan Shah <Hetan.Shah@Sun.COM> wrote:
  > 
  > > Has anyone used NekoHTML? If so, how do I use it? Is it a standalone
  > > jar file that I include in my classpath and start using just like
  > > IndexHTML?
  > > Can someone share syntax and/or code if it is supposed to be used
  > > programmatically? I am looking at
  > > http://www.apache.org/~andyc/neko/doc/html/ for more information. Is
  > > that the correct place to look?
  > >
  > > Thanks,
  > > -H
  > >
  > >
  > > Erik Hatcher wrote:
  > >
  > > > Sure... clean up your HTML and it'll parse fine :)   Perhaps use
  > > > JTidy to clean up the HTML.  Or switch to using a more forgiving
  > > > parser like NekoHTML.
  > > >
  > > >     Erik
  > > >
  > > > On Jan 4, 2005, at 3:59 PM, Hetan Shah wrote:
  > > >
  > > >> Hello All,
  > > >>
  > > >> Does anyone know how to handle the following parsing error?
  > > >>
  > > >> thanks for pointers/code snippets.
  > > >>
  > > >> -H
  > > >>
  > > >> While trying to parse an HTML file using IndexHTML I get:
  > > >>
  > > >> Parse Aborted: Encountered "\"" at line 8, column 1162.
  > > >> Was expecting one of:
  > > >>     <ArgName> ...
  > > >>     "=" ...
  > > >>     <TagEnd> ...
  > > >>
  > > >>
  > > >>
  > > >>
  > > >> ---------------------------------------------------------------------
  > > >> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
  > > >> For additional commands, e-mail: lucene-user-help@jakarta.apache.org



