drill-dev mailing list archives

From paul-rogers <...@git.apache.org>
Subject [GitHub] drill issue #594: DRILL-4842: SELECT * on JSON data results in NumberFormatE...
Date Tue, 21 Feb 2017 00:10:02 GMT
Github user paul-rogers commented on the issue:

    https://github.com/apache/drill/pull/594
  
    Given all the above, there is a very simple fix for the particular case that this bug covers.
    
    {code}
      private void writeDataAllText(MapWriter map, FieldSelection selection,
        ...
          case VALUE_NULL:
            // Here we do have a type. This is a null VarChar.
            handleNullString(map, fieldName);
            break;
        ...
    
      /**
       * Create a VarChar column. No need to explicitly set a
       * null value; nulls are the default.
       * <p>
       * Note: This only works for all-text mode because we can
       * predict that, if we ever see an actual value, it will be
       * treated as a VarChar. This trick <b>will not</b> work for the
       * general case because we cannot predict the actual column
       * type.
       * @param writer the map writer for the enclosing map
       * @param fieldName name of the field whose value is null
       */
      private void handleNullString(MapWriter writer, String fieldName) {
        writer.varChar(fieldName);
      }
    {code}
    
    The above simply leverages the existing mechanism for mapping columns to types and for filling in missing null values.
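
As a rough illustration of that mechanism, here is a self-contained toy model. The names {{ToyMapWriter}}, {{ToyVarCharColumn}} and {{readBatch}} are hypothetical stand-ins invented for this sketch; they are not Drill's actual writer or vector classes, and the real implementation differs in detail. The sketch assumes only what is described above: looking up a field by name creates a nullable VarChar column if it does not already exist, and rows that are never written are reported as null.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model only; these are NOT Drill's MapWriter or vector classes. */
public class AllTextNullSketch {

  /** Hypothetical stand-in for a nullable VarChar column: unset rows stay null. */
  static final class ToyVarCharColumn {
    private final List<String> values = new ArrayList<>();

    /** Write a value at the given row, filling any skipped rows with null. */
    void set(int row, String value) {
      padTo(row + 1);
      values.set(row, value);
    }

    /** Fill rows that were never written with null, up to rowCount. */
    void padTo(int rowCount) {
      while (values.size() < rowCount) {
        values.add(null);
      }
    }

    List<String> values() {
      return values;
    }
  }

  /** Hypothetical stand-in for a map writer that creates columns on first lookup. */
  static final class ToyMapWriter {
    private final Map<String, ToyVarCharColumn> columns = new LinkedHashMap<>();

    /** Look up the column by name, creating it as VarChar if it is missing. */
    ToyVarCharColumn varChar(String fieldName) {
      return columns.computeIfAbsent(fieldName, k -> new ToyVarCharColumn());
    }

    boolean hasColumn(String fieldName) {
      return columns.containsKey(fieldName);
    }

    /** At the end of a batch, every column is padded with nulls to the row count. */
    void endBatch(int rowCount) {
      for (ToyVarCharColumn column : columns.values()) {
        column.padTo(rowCount);
      }
    }
  }

  /** Read one batch of all-text values for field "c1"; null entries model JSON nulls. */
  static ToyMapWriter readBatch(String[] rows) {
    ToyMapWriter map = new ToyMapWriter();
    for (int row = 0; row < rows.length; row++) {
      if (rows[row] == null) {
        // Mirrors handleNullString(): declare the column, write nothing.
        map.varChar("c1");
      } else {
        map.varChar("c1").set(row, rows[row]);
      }
    }
    map.endBatch(rows.length);
    return map;
  }

  public static void main(String[] args) {
    ToyMapWriter batch = readBatch(new String[] {null, null, "Hello World"});
    System.out.println(batch.varChar("c1").values()); // [null, null, Hello World]
  }
}
```

In this toy model, a batch that contains only nulls still ends up with a {{c1}} column of the same type that a later batch with real values would produce, which is the property the fix relies on in all-text mode.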
    
    Output, when printing {{tooManyNulls.json}} to CSV:
    
    {code}
    4096 row(s):
    c1
    null
    ...
    null
    1 row(s):
    c1
    Hello World
    Total rows returned : 4097.  Returned in 242ms.
    {code}
    
    Performance here will be slower than master because we now do a field lookup for each null column, where in the past we did not. The performance of null columns, however, should be identical to that of non-null columns. The performance of the above fix should also be identical to that of the fix proposed in this PR, but the code here is simpler.

