drill-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-4653) Malformed JSON should not stop the entire query from progressing
Date Tue, 20 Sep 2016 01:03:21 GMT

    [ https://issues.apache.org/jira/browse/DRILL-4653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15505198#comment-15505198 ]

ASF GitHub Bot commented on DRILL-4653:

Github user ssriniva123 commented on the issue:

    Apologies for getting back to this thread late; I got tied up with some issues at work.
    The JSON parser is not just a tokenizer. It keeps track of the JSON structure and understands
various aspects of it, such as the root and array/object contexts, and all parsing is done under that context.
    - We cannot keep track of {} accurately. For example, the counting JSON processor calls
parser.skipChildren, which tries to skip to the end of the JSON, but this can roll over to the next line
if there is malformed JSON in the bottom-most JSON sub-object; see the example below (missing
" in the last JSON structure). JsonReader behaves similarly.
    {"balance": 1000.0,"num": 100,"is_vip": true,"name": "foo3","curr":{"denom":"pound","test":{"value
    - One possible solution is to rewind the input source to reset the stream, which is not
recommended, and there is no guarantee that all streams support mark/reset semantics.
    Given where we are, I think the solution proposed works perfectly for almost all malformed
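
To illustrate the mark/reset concern raised in the comment, here is a minimal Java sketch that uses only java.io. The class and method names (MarkResetSketch, rewindable, peek, nonRewindable) are hypothetical and are not part of Drill or Jackson. It only shows that a raw InputStream may report markSupported() as false, and that wrapping it in a BufferedInputStream adds rewind support at the cost of holding the re-readable bytes in memory.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not Drill or Jackson code.
class MarkResetSketch {

    /** Returns the stream unchanged if it can rewind, otherwise wraps it in a buffer. */
    public static InputStream rewindable(InputStream in) {
        return in.markSupported() ? in : new BufferedInputStream(in);
    }

    /** Reads up to n bytes, then resets so the same bytes can be read again. */
    public static String peek(InputStream in, int n) {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream does not support mark/reset");
        }
        try {
            in.mark(n);                       // remember the current position
            byte[] buf = new byte[n];
            int read = in.read(buf, 0, n);    // may return fewer than n bytes
            in.reset();                       // rewind to the marked position
            return read <= 0 ? "" : new String(buf, 0, read, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Simulates a raw source that inherits InputStream's default markSupported() == false. */
    public static InputStream nonRewindable(byte[] data) {
        ByteArrayInputStream inner = new ByteArrayInputStream(data);
        return new InputStream() {
            @Override
            public int read() {
                return inner.read();
            }
        };
    }
}
```

The buffered wrapper can only rewind within the read limit passed to mark(), so recovering from a parser that has already skipped far past a malformed record would require a correspondingly large buffer.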

> Malformed JSON should not stop the entire query from progressing
> ----------------------------------------------------------------
>                 Key: DRILL-4653
>                 URL: https://issues.apache.org/jira/browse/DRILL-4653
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Storage - JSON
>    Affects Versions: 1.6.0
>            Reporter: subbu srinivasan
>             Fix For: Future
> Currently, a Drill query terminates upon first encountering an invalid JSON line.
> Drill should continue progressing after ignoring the bad records. Something
> similar to a setting such as (ignore.malformed.json) would help.
