Date: Tue, 13 Aug 2013 19:22:49 +0000 (UTC)
From: "Ruben Verborgh (JIRA)"
To: dev@any23.apache.org
Reply-To: dev@any23.apache.org
Subject: [jira] [Updated] (ANY23-166) Parsing fails with missing quote in HTML attribute

     [ https://issues.apache.org/jira/browse/ANY23-166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ruben Verborgh updated ANY23-166:
---------------------------------

    Description:
When trying http://ruben.verborgh.org/tmp/og-test.html in the validator, it fails with:

Internal error.
================================================================
java.lang.IllegalArgumentException: Invalid content ''
    at org.apache.any23.extractor.microdata.ItemPropValue.<init>(ItemPropValue.java:89)
    at org.apache.any23.extractor.microdata.MicrodataParser.getPropertyValue(MicrodataParser.java:341)
    at org.apache.any23.extractor.microdata.MicrodataParser.getItemProps(MicrodataParser.java:394)
    at org.apache.any23.extractor.microdata.MicrodataParser.getItemScope(MicrodataParser.java:471)
    at org.apache.any23.extractor.microdata.MicrodataParser.getMicrodata(MicrodataParser.java:186)
    at org.apache.any23.extractor.microdata.MicrodataParser.getMicrodata(MicrodataParser.java:203)
    at org.apache.any23.extractor.microdata.MicrodataExtractor.run(MicrodataExtractor.java:100)
    at org.apache.any23.extractor.microdata.MicrodataExtractor.run(MicrodataExtractor.java:62)
    at org.apache.any23.extractor.SingleDocumentExtraction.runExtractor(SingleDocumentExtraction.java:477)
    at org.apache.any23.extractor.SingleDocumentExtraction.run(SingleDocumentExtraction.java:260)
    at org.apache.any23.Any23.extract(Any23.java:294)
    at org.apache.any23.Any23.extract(Any23.java:446)
    at org.apache.any23.servlet.WebResponder.runExtraction(WebResponder.java:113)
    at org.apache.any23.servlet.Servlet.doGet(Servlet.java:74)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
    at com.googlecode.psiprobe.Tomcat60AgentValve.invoke(Tomcat60AgentValve.java:30)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
    at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:602)
    at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
    at java.lang.Thread.run(Thread.java:662)
================================================================
Source: http://any23.org/any23/?format=best&uri=http%3A%2F%2Fruben.verborgh.org%2Ftmp%2Fog-test.html&validation-mode=validate-fix

Note how a quote is missing in the prefix attribute of the html tag.
The strange thing is that editing the document body makes the error disappear, depending on what you remove. For instance:
http://any23.org/any23/?format=best&uri=http%3A%2F%2Fruben.verborgh.org%2Ftmp%2Fog-test2.html&validation-mode=validate-fix

  was:
When trying http://ruben.verborgh.org/tmp/og-test.html in the validator, it fails with:

Internal error.
================================================================
java.lang.IllegalArgumentException: Invalid content ''
    at org.apache.any23.extractor.microdata.ItemPropValue.<init>(ItemPropValue.java:89)
    at org.apache.any23.extractor.microdata.MicrodataParser.getPropertyValue(MicrodataParser.java:341)
    at org.apache.any23.extractor.microdata.MicrodataParser.getItemProps(MicrodataParser.java:394)
    at org.apache.any23.extractor.microdata.MicrodataParser.getItemScope(MicrodataParser.java:471)
    at org.apache.any23.extractor.microdata.MicrodataParser.getMicrodata(MicrodataParser.java:186)
    at org.apache.any23.extractor.microdata.MicrodataParser.getMicrodata(MicrodataParser.java:203)
    at org.apache.any23.extractor.microdata.MicrodataExtractor.run(MicrodataExtractor.java:100)
    at org.apache.any23.extractor.microdata.MicrodataExtractor.run(MicrodataExtractor.java:62)
    at org.apache.any23.extractor.SingleDocumentExtraction.runExtractor(SingleDocumentExtraction.java:477)
    at org.apache.any23.extractor.SingleDocumentExtraction.run(SingleDocumentExtraction.java:260)
    at org.apache.any23.Any23.extract(Any23.java:294)
    at org.apache.any23.Any23.extract(Any23.java:446)
    at org.apache.any23.servlet.WebResponder.runExtraction(WebResponder.java:113)
    at org.apache.any23.servlet.Servlet.doGet(Servlet.java:74)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
    at com.googlecode.psiprobe.Tomcat60AgentValve.invoke(Tomcat60AgentValve.java:30)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
    at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:602)
    at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
    at java.lang.Thread.run(Thread.java:662)
================================================================
Source: http://any23.org/any23/?format=best&uri=http%3A%2F%2Fruben.verborgh.org%2Ftmp%2Fog-test.html&validation-mode=validate-fix

        Summary: Parsing fails with missing quote in HTML attribute  (was: Parsing fails on RDFa/Microdata combination)

> Parsing fails with missing quote in HTML attribute
> --------------------------------------------------
>
>                 Key: ANY23-166
>                 URL: https://issues.apache.org/jira/browse/ANY23-166
>             Project: Apache Any23
>          Issue Type: Bug
>            Reporter: Ruben Verborgh
>
> When trying http://ruben.verborgh.org/tmp/og-test.html in the validator, it fails with:
> Internal error.
> ================================================================
> java.lang.IllegalArgumentException: Invalid content ''
>     at org.apache.any23.extractor.microdata.ItemPropValue.<init>(ItemPropValue.java:89)
>     at org.apache.any23.extractor.microdata.MicrodataParser.getPropertyValue(MicrodataParser.java:341)
>     at org.apache.any23.extractor.microdata.MicrodataParser.getItemProps(MicrodataParser.java:394)
>     at org.apache.any23.extractor.microdata.MicrodataParser.getItemScope(MicrodataParser.java:471)
>     at org.apache.any23.extractor.microdata.MicrodataParser.getMicrodata(MicrodataParser.java:186)
>     at org.apache.any23.extractor.microdata.MicrodataParser.getMicrodata(MicrodataParser.java:203)
>     at org.apache.any23.extractor.microdata.MicrodataExtractor.run(MicrodataExtractor.java:100)
>     at org.apache.any23.extractor.microdata.MicrodataExtractor.run(MicrodataExtractor.java:62)
>     at org.apache.any23.extractor.SingleDocumentExtraction.runExtractor(SingleDocumentExtraction.java:477)
>     at org.apache.any23.extractor.SingleDocumentExtraction.run(SingleDocumentExtraction.java:260)
>     at org.apache.any23.Any23.extract(Any23.java:294)
>     at org.apache.any23.Any23.extract(Any23.java:446)
>     at org.apache.any23.servlet.WebResponder.runExtraction(WebResponder.java:113)
>     at org.apache.any23.servlet.Servlet.doGet(Servlet.java:74)
>     at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
>     at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
>     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
>     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
>     at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
>     at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
>     at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
>     at com.googlecode.psiprobe.Tomcat60AgentValve.invoke(Tomcat60AgentValve.java:30)
>     at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
>     at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
>     at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
>     at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
>     at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:602)
>     at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
>     at java.lang.Thread.run(Thread.java:662)
> ================================================================
> Source: http://any23.org/any23/?format=best&uri=http%3A%2F%2Fruben.verborgh.org%2Ftmp%2Fog-test.html&validation-mode=validate-fix
> Note how a quote is missing in the prefix attribute of the html tag.
> The strange thing is that editing the document body makes the error disappear, depending on what you remove. For instance: http://any23.org/any23/?format=best&uri=http%3A%2F%2Fruben.verborgh.org%2Ftmp%2Fog-test2.html&validation-mode=validate-fix

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira