any23-dev mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ANY23-318) ExtractionException handling in BaseRDFExtractor.java kills entire extraction
Date Sat, 30 Dec 2017 17:22:00 GMT

    [ https://issues.apache.org/jira/browse/ANY23-318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16306881#comment-16306881 ]

Hudson commented on ANY23-318:
------------------------------

SUCCESS: Integrated in Jenkins build Any23-trunk #1515 (See [https://builds.apache.org/job/Any23-trunk/1515/])
ANY23-318 ExtractionException handling in BaseRDFExtractor.java kills (lewis.mcgibbney: rev 4c81edde390b6b6e91566f490ca5d915ca0b0945)
* (edit) core/src/test/java/org/apache/any23/validator/XMLValidationReportSerializerTest.java
* (edit) core/src/main/java/org/apache/any23/extractor/rdf/BaseRDFExtractor.java
* (edit) core/src/main/java/org/apache/any23/extractor/SingleDocumentExtraction.java
* (edit) service/src/main/java/org/apache/any23/servlet/Servlet.java
* (edit) core/src/main/java/org/apache/any23/validator/rule/MissingOpenGraphNamespaceRule.java
* (edit) core/src/main/java/org/apache/any23/validator/rule/OpenGraphNamespaceFix.java
* (edit) cli/src/main/java/org/apache/any23/cli/Rover.java
* (edit) core/src/main/java/org/apache/any23/validator/ValidationReport.java
* (edit) test-resources/src/test/resources/microdata/microdata-basic.html
* (edit) api/src/main/java/org/apache/any23/extractor/ExtractionParameters.java
* (edit) core/src/test/java/org/apache/any23/validator/DefaultValidatorTest.java
* (edit) core/src/main/java/org/apache/any23/validator/rule/MetaNameMisuseRule.java
* (edit) core/src/main/java/org/apache/any23/validator/rule/MissingItemscopeAttributeValueRule.java
* (edit) core/src/test/java/org/apache/any23/Any23Test.java
* (edit) service/src/main/java/org/apache/any23/servlet/RedirectServlet.java
* (edit) core/src/main/java/org/apache/any23/validator/rule/AboutNotURIRule.java
* (edit) core/src/main/java/org/apache/any23/validator/DefaultValidationReportBuilder.java
* (edit) core/src/main/java/org/apache/any23/validator/rule/MissingItemscopeAttributeValueFix.java
* (edit) core/src/main/java/org/apache/any23/validator/rule/MetaNameMisuseFix.java
* (edit) service/src/main/java/org/apache/any23/servlet/WebResponder.java
ANY23-318 ExtractionException handling in BaseRDFExtractor.java kills (lewis.mcgibbney: rev 15571d45f89e8c63b8da6a699b345131d4433ad9)
* (edit) core/src/main/java/org/apache/any23/validator/DefaultValidator.java
* (edit) core/src/test/java/org/apache/any23/validator/DefaultValidatorTest.java
* (edit) core/src/main/java/org/apache/any23/validator/rule/MissingItemscopeAttributeValueFix.java


> ExtractionException handling in BaseRDFExtractor.java kills entire extraction
> -----------------------------------------------------------------------------
>
>                 Key: ANY23-318
>                 URL: https://issues.apache.org/jira/browse/ANY23-318
>             Project: Apache Any23
>          Issue Type: Bug
>          Components: core, extractors
>    Affects Versions: 2.1
>            Reporter: Lewis John McGibbney
>            Assignee: Lewis John McGibbney
>            Priority: Blocker
>             Fix For: 2.2
>
>
> Right now the following snippet of code in BaseRDFExtractor.java aborts the entire extraction. I propose that we merely log the error and continue with the extraction.
> {code}
>          } catch (RDFParseException ex) {
> -            throw new ExtractionException("Error while parsing RDF document.", ex, extractionResult);
> +            LOG.error("Error while parsing RDF document.", ex, extractionResult);
>          }
>      }
> {code}
> The parsing strictness is inherited from the underlying Semargl parsers, which expect syntactically perfect input data. In the wild, however, this is unfortunately not realistic.
> The solution is for us to log the Exception, issues, etc. and carry on with the extraction.
> Patch coming up.
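
The log-and-continue behaviour proposed above can be illustrated with a minimal, self-contained sketch. It is not the Any23 patch: the class `LenientExtractionSketch`, the exception `ParseFailure`, and the methods `strictParse` and `extractAll` are hypothetical stand-ins, and `java.util.logging` is used only so the example runs without extra dependencies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

public class LenientExtractionSketch {

    private static final Logger LOG =
            Logger.getLogger(LenientExtractionSketch.class.getName());

    /** Hypothetical stand-in for a parser's checked syntax exception. */
    public static class ParseFailure extends Exception {
        public ParseFailure(String message) {
            super(message);
        }
    }

    /** Hypothetical strict parser: rejects any input that does not end with '.'. */
    public static String strictParse(String document) throws ParseFailure {
        if (!document.endsWith(".")) {
            throw new ParseFailure("Unterminated statement: " + document);
        }
        // Drop the trailing '.' and surrounding whitespace.
        return document.substring(0, document.length() - 1).trim();
    }

    /** Parses every document; a failure is logged and skipped, not rethrown. */
    public static List<String> extractAll(List<String> documents) {
        List<String> results = new ArrayList<>();
        for (String document : documents) {
            try {
                results.add(strictParse(document));
            } catch (ParseFailure ex) {
                // Log with the exception attached, then carry on with the next input.
                LOG.log(Level.SEVERE, "Error while parsing document, skipping it.", ex);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        List<String> parsed = extractAll(Arrays.asList("a b c .", "broken", "d e f ."));
        System.out.println(parsed); // the malformed middle input is skipped
    }
}
```

One caveat on the logging call itself: if `LOG` is an SLF4J logger, a Throwable is only logged with its stack trace when it is the last argument, so a call shaped like `LOG.error(message, ex, extractionResult)` may silently drop the exception details.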



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
