any23-dev mailing list archives

From "Lewis John McGibbney (JIRA)" <j...@apache.org>
Subject [jira] [Created] (ANY23-318) ExtractionException handling in BaseRDFExtractor.java kills entire extraction
Date Wed, 27 Dec 2017 20:05:00 GMT
Lewis John McGibbney created ANY23-318:
------------------------------------------

             Summary: ExtractionException handling in BaseRDFExtractor.java kills entire extraction
                 Key: ANY23-318
                 URL: https://issues.apache.org/jira/browse/ANY23-318
             Project: Apache Any23
          Issue Type: Bug
          Components: core, extractors
    Affects Versions: 2.1
            Reporter: Lewis John McGibbney
            Assignee: Lewis John McGibbney
            Priority: Blocker
             Fix For: 2.2


Right now the following catch block in BaseRDFExtractor.java aborts the entire extraction
whenever a single RDF parse error occurs. I propose to merely log the error and continue
with the extraction.

{code}
         } catch (RDFParseException ex) {
-            throw new ExtractionException("Error while parsing RDF document.", ex, extractionResult);
+            LOG.error("Error while parsing RDF document.", ex, extractionResult);
         }
     }
{code}

The parsing strictness is inherited from the underlying semargl parsers, which expect perfect
syntax in the input data. In the 'wild', however, this is unfortunately not realistic.
The solution is for us to log the exception and any related issues, then carry on with the
extraction. Patch coming up.
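
As a rough illustration of the log-and-continue behaviour proposed above (this is not the actual patch), the sketch below runs several extractors over one input and records a failure instead of rethrowing it. Every type here (LenientExtraction, ParseFailure, Extractor, Report) is a hypothetical stand-in rather than an Any23 class, and java.util.logging is used only to keep the example dependency-free.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

// Illustrative sketch only: none of these types are Any23 classes.
class LenientExtraction {
    private static final Logger LOG =
            Logger.getLogger(LenientExtraction.class.getName());

    /** Hypothetical stand-in for a parser failure such as RDFParseException. */
    static class ParseFailure extends Exception {
        ParseFailure(String message) {
            super(message);
        }
    }

    /** Hypothetical stand-in for a single extractor. */
    interface Extractor {
        List<String> extract(String input) throws ParseFailure;
    }

    /** Output of the extractors that succeeded, plus the recorded issues. */
    static class Report {
        final List<String> statements = new ArrayList<>();
        final List<String> issues = new ArrayList<>();
    }

    /** Runs every extractor; a failing one is logged and skipped. */
    static Report runAll(List<Extractor> extractors, String input) {
        Report report = new Report();
        for (Extractor extractor : extractors) {
            try {
                report.statements.addAll(extractor.extract(input));
            } catch (ParseFailure ex) {
                // Log with the exception as the throwable argument so the
                // stack trace is kept, record the issue, then carry on.
                LOG.log(Level.SEVERE, "Error while parsing RDF document.", ex);
                report.issues.add(ex.getMessage());
            }
        }
        return report;
    }
}
```

With this shape, one strict parser rejecting malformed input costs only that extractor's output. The caller can still inspect the recorded issues to decide whether the overall result is usable.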




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
