drill-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-4653) Malformed JSON should not stop the entire query from progressing
Date Tue, 20 Sep 2016 01:03:21 GMT

    [ https://issues.apache.org/jira/browse/DRILL-4653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15505198#comment-15505198 ]

ASF GitHub Bot commented on DRILL-4653:
---------------------------------------

Github user ssriniva123 commented on the issue:

    https://github.com/apache/drill/pull/518
  
    Apologies for getting back to this thread late; I got tied up with some issues at work.
    
    Paul,
    The JSON parser is not just a tokenizer; it keeps track of the JSON structure and
    understands various aspects of it, such as the root and array/object contexts, and
    all parsing is done under that context.
    
    - We cannot keep track of {} accurately. For example, the counting JSON processor
      calls parser.skipChildren(), which tries to skip to the end of the JSON object, but
      this can roll over to the next line when the malformed JSON is in the bottom-most
      JSON sub-object; see the example below (missing " in the last JSON structure).
      JsonReader behaves the same way.
    
    {"balance": 1000.0,"num": 100,"is_vip": true,"name": "foo3","curr":{"denom":"pound","test":{"value :false}}}
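
To illustrate the rollover, here is a minimal sketch of a structure-aware skip. It is not Jackson's or Drill's actual code: the class name SkipRollover and the method skipObject are hypothetical, and the scanner is a deliberate simplification that only tracks braces and string literals. It shows why a missing quote in the innermost object makes the closing braces invisible, so the skip runs past the end of the line.

```java
class SkipRollover {
    /**
     * Hypothetical helper: returns the index just past the '}' that closes the
     * object starting at 'start', or -1 if the input ends first. Braces inside
     * string literals are ignored, as a structure-aware parser would do.
     */
    static int skipObject(String input, int start) {
        int depth = 0;
        boolean inString = false;
        for (int i = start; i < input.length(); i++) {
            char c = input.charAt(i);
            if (inString) {
                if (c == '\\') {
                    i++; // skip the escaped character
                } else if (c == '"') {
                    inString = false;
                }
            } else if (c == '"') {
                inString = true;
            } else if (c == '{') {
                depth++;
            } else if (c == '}') {
                depth--;
                if (depth == 0) {
                    return i + 1;
                }
            }
        }
        return -1; // ran off the end without closing the object
    }

    public static void main(String[] args) {
        String wellFormed = "{\"curr\":{\"denom\":\"pound\",\"test\":{\"value\":false}}}";
        // Same record with the closing quote of "value" missing.
        String malformed = "{\"curr\":{\"denom\":\"pound\",\"test\":{\"value :false}}}";
        String nextLine = "{\"num\": 101}";

        System.out.println(skipObject(wellFormed + "\n" + nextLine, 0) == wellFormed.length());
        System.out.println(skipObject(malformed + "\n" + nextLine, 0));
    }
}
```

With the well-formed record the skip stops exactly at the end of the first line. With the malformed record the unterminated string swallows the closing braces and the start of the next record, so the scan never finds the end of the first object.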
    
    - One possible solution is to rewind the input source to reset the stream, which is
      not recommended, and there is no guarantee that all streams support mark/reset
      semantics.
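
The mark/reset caveat can be checked with the standard java.io API. The sketch below is only an illustration, not part of the pull request: the class MarkResetCheck and the method canRewind are hypothetical names. It uses InputStream.markSupported(), mark(int) and reset() to show that some streams can be rewound and others cannot.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class MarkResetCheck {
    /**
     * Hypothetical helper: returns true if the stream can be rewound to its
     * current position after reading one byte, false if it has no mark support.
     */
    static boolean canRewind(InputStream in) {
        if (!in.markSupported()) {
            return false; // reset() would fail on this stream
        }
        try {
            in.mark(16);            // remember the current position
            int first = in.read();  // consume one byte
            in.reset();             // go back to the mark
            int again = in.read();  // should see the same byte
            in.reset();             // leave the stream where it started
            return first == again;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = "{\"num\": 100}".getBytes(StandardCharsets.UTF_8);

        InputStream inMemory = new ByteArrayInputStream(data);
        // PushbackInputStream is documented as not supporting mark/reset.
        InputStream noMark = new PushbackInputStream(new ByteArrayInputStream(data));
        // Wrapping in BufferedInputStream adds mark/reset at the cost of buffering.
        InputStream wrapped = new BufferedInputStream(
                new PushbackInputStream(new ByteArrayInputStream(data)));

        System.out.println(canRewind(inMemory));
        System.out.println(canRewind(noMark));
        System.out.println(canRewind(wrapped));
    }
}
```

Wrapping in a BufferedInputStream works only while the bad record fits within the read limit passed to mark, which is one reason rewinding is hard to rely on for arbitrary input.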
    
    Given where we are, I think the proposed solution works perfectly for almost all
    malformed JSON.
    
    



> Malformed JSON should not stop the entire query from progressing
> ----------------------------------------------------------------
>
>                 Key: DRILL-4653
>                 URL: https://issues.apache.org/jira/browse/DRILL-4653
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Storage - JSON
>    Affects Versions: 1.6.0
>            Reporter: subbu srinivasan
>             Fix For: Future
>
>
> Currently, a Drill query terminates on the first invalid JSON line it encounters.
> Drill should ignore the bad records and continue processing. A setting along the
> lines of ignore.malformed.json would help.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
