drill-dev mailing list archives

From paul-rogers <...@git.apache.org>
Subject [GitHub] drill issue #916: DRILL-5377: Five-digit year dates are displayed incorrectl...
Date Sat, 09 Sep 2017 22:43:08 GMT
Github user paul-rogers commented on the issue:

    https://github.com/apache/drill/pull/916
  
    Back to my original question. The premise of this bug seems to be that we corrupt Parquet
dates and convert perfectly valid 4-digit years into invalid 5-digit years. That is clearly
a data corruption bug that should never occur. Why don't we fix that?
    
    Given that we've accepted the data corruption, we need to display five-digit years, which
the Java date and time classes don't support in `toString()`. The code relies on `toString()`
instead of formatting dates properly with the formatting classes Java provides. Date display
should honor the format preferences provided by the user, not the defaults built into
`toString()`. That's the second bug.
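
    To illustrate the formatting point only, here is a minimal sketch, not Drill's code. It
assumes the value is available as a `java.time.LocalDate` (a `java.sql.Date` can be converted
with `toLocalDate()`). The class name `WideYearDateFormat`, the fixed `yyyy-MM-dd`-style layout,
and the year 15356 are made up for the example; a real fix would take the layout from the
user's format preference.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.SignStyle;
import java.time.temporal.ChronoField;

// Hypothetical helper, named for this example only.
final class WideYearDateFormat {

    // Year is zero-padded to at least 4 digits, may grow to 10 digits,
    // and gets a sign only when negative (SignStyle.NORMAL).
    static final DateTimeFormatter WIDE_YEAR = new DateTimeFormatterBuilder()
        .appendValue(ChronoField.YEAR, 4, 10, SignStyle.NORMAL)
        .appendLiteral('-')
        .appendValue(ChronoField.MONTH_OF_YEAR, 2)
        .appendLiteral('-')
        .appendValue(ChronoField.DAY_OF_MONTH, 2)
        .toFormatter();

    static String format(LocalDate date) {
        return WIDE_YEAR.format(date);
    }

    public static void main(String[] args) {
        // Made-up five-digit year standing in for a corrupted value.
        LocalDate fiveDigitYear = LocalDate.of(15356, 3, 4);

        // LocalDate.toString() uses the ISO-8601 form, which prefixes
        // years above 9999 with '+': prints "+15356-03-04"
        System.out.println(fiveDigitYear);

        // The explicit formatter controls the layout: prints "15356-03-04"
        System.out.println(format(fiveDigitYear));
    }
}
```

    The point of the sketch is that an explicit formatter controls the output of the standard
classes, so no custom date subclass is needed just to change how a year is printed.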
    
    Now, given the above two bugs, we introduce a third by creating ad-hoc, Drill-specific
date/time classes, violating the JDBC standard, to display the corrupt five-digit years. So,
no longer will Drill return the `java.sql.Date` class as specified by the standard, but rather
our own subclass. How will this affect client code that relies on standard behavior?
    
    I feel we are compounding error upon error. Can we go back and fix the original problem:
that users might prefer that we don't corrupt dates in their data? That is, the problem is
not so much that we don't format corrupt data correctly, but rather that we do, in fact, corrupt
data.

