phoenix-dev mailing list archives

From "James Taylor (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (PHOENIX-1127) Cannot load Timestamp(6) in CSV bulk loader
Date Fri, 19 Dec 2014 18:51:14 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-1127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253769#comment-14253769 ]

James Taylor edited comment on PHOENIX-1127 at 12/19/14 6:50 PM:
-----------------------------------------------------------------

[~gabriel.reid] - I think this is a nice improvement that gives power users more control
over the CSV parsing. It'd be good to solve the "simple" case too, though. IMO, Java's
SimpleDateFormat is pretty weak: it can't handle an optional time component, and the parser
is not thread safe. I think Joda time is far superior.

Here's a quick example of using Joda time to parse an optional time component. If we do something
along these lines, then we can make the default parse string accept time and milliseconds
optionally and still maintain backward compatibility. That way, for the user who just wants to
import a CSV that has date and time components in their dates, it'll just work.

I think it'd just be a matter of tweaking DateUtil to use Joda time (plus we can get rid of
the ThreadLocal BS, which is always a good thing). Optionally, we can say that our date/time
parse string follows ISO 8601, which IMO is better than saying that it's like the Java
date/time parsing. We could control this with a config parameter if we think there are
backward-compatibility issues.

We'd need to parse the date first, get the millis from it, and create a java.sql.Date, java.sql.Time
or java.sql.Timestamp from that, but I think it'd be worth it.

{code}
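    import static org.junit.Assert.assertEquals;
    import org.joda.time.DateTime;
    import org.joda.time.DateTimeZone;
    import org.joda.time.format.DateTimeFormatter;
    import org.joda.time.format.ISODateTimeFormat;
    import org.junit.Test;

    // Class name is illustrative; the test needs Joda-Time and JUnit 4 on the classpath.
    public class OptionalTimeParseTest {
        @Test
        public void testOptionalMilliseconds() {
            String dateStr1 = "2006-06-07";
            String dateStr2 = "2006-06-07T10:30:10";
            String dateStr3 = "2006-06-07T10:30:10.123";
            // One parser accepts date-only, date+time, and date+time+millis; pin it to UTC once.
            DateTimeFormatter dateTimeFormat = ISODateTimeFormat.dateOptionalTimeParser().withZone(DateTimeZone.UTC);
            DateTime date1 = dateTimeFormat.parseDateTime(dateStr1);
            DateTime date2 = dateTimeFormat.parseDateTime(dateStr2);
            // A missing time component defaults to midnight, so the difference is exactly 10:30:10.
            long diff2 = date2.getMillis() - date1.getMillis();
            assertEquals(10 * 60 * 60 * 1000 + 30 * 60 * 1000 + 10 * 1000, diff2);
            DateTime date3 = dateTimeFormat.parseDateTime(dateStr3);
            long diff3 = date3.getMillis() - date1.getMillis();
            assertEquals(diff2 + 123, diff3);
        }
    }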
{code}
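
The last step mentioned above (take the millis from the parsed value and build a java.sql.Date, java.sql.Time or java.sql.Timestamp from them) could look roughly like the sketch below. This is only an illustration: SqlTemporalSketch and its method names are made up for this example and are not Phoenix's DateUtil, and the epochMillis argument is assumed to come from something like DateTime.getMillis() on the parsed value.

```java
import java.sql.Date;
import java.sql.Time;
import java.sql.Timestamp;

// Hypothetical helper, not Phoenix's DateUtil: covers only the millis -> java.sql step.
final class SqlTemporalSketch {
    private SqlTemporalSketch() {}

    // epochMillis is assumed to be the UTC epoch value taken from the parsed date/time.
    static Date toSqlDate(long epochMillis) {
        return new Date(epochMillis);
    }

    static Time toSqlTime(long epochMillis) {
        return new Time(epochMillis);
    }

    static Timestamp toSqlTimestamp(long epochMillis) {
        // Timestamp(long) moves the millisecond fraction into the nanos field.
        return new Timestamp(epochMillis);
    }
}
```

Because the input here is a long of epoch milliseconds, anything finer than a millisecond in the CSV value would not survive this step. Timestamp.setNanos could carry more digits, but that would need a parser that keeps them, which this sketch does not cover.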


> Cannot load Timestamp(6) in CSV bulk loader
> -------------------------------------------
>
>                 Key: PHOENIX-1127
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1127
>             Project: Phoenix
>          Issue Type: Improvement
>    Affects Versions: 4.0.0
>         Environment: PROD
>            Reporter: Deepak Gattala
>             Fix For: 5.0.0
>
>         Attachments: PHOENIX-1127.patch
>
>   Original Estimate: 360h
>  Remaining Estimate: 360h
>
> cannot do any date manipulation because of this issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
