Date: Mon, 3 Feb 2014 21:51:07 +0000 (UTC)
From: "Nick Dimiduk (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-10385) ImportTsv to parse date time from typical loader formats

    [ https://issues.apache.org/jira/browse/HBASE-10385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889961#comment-13889961 ]

Nick Dimiduk commented on HBASE-10385:
--------------------------------------

Sorry [~ericavijay], it looks like your heroic patch slipped through the cracks. After attaching a patch file vs trunk, the next step is to click the "submit patch" button. This will queue it up for our QABot to do the due diligence. Patches are generally not accepted unless they pass the precommit verification that the bot performs. I've clicked the button; we'll see if your patch still applies cleanly.

As for exposing ParsedLine, I vaguely remember running into something similar in a previous life. My workaround was to place my custom mapper in the org.apache.hadoop.hbase namespace. I think this is tedious and should not be necessary, so I'm fine with exposing it as a public class. I think that'll facilitate general reuse. I think that's a separate issue, right?

> ImportTsv to parse date time from typical loader formats
> --------------------------------------------------------
>
>                 Key: HBASE-10385
>                 URL: https://issues.apache.org/jira/browse/HBASE-10385
>             Project: HBase
>          Issue Type: New Feature
>          Components: mapreduce
>    Affects Versions: 0.96.1.1
>            Reporter: Vijay Sarvepali
>            Priority: Minor
>              Labels: importtsv
>         Attachments: HBASE-10385.patch
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> Simple patch to enable parsing of standard date time fields from TSV files into Hbase.
> ***************
> *** 57,62 ****
> --- 57,70 ----
>   import com.google.common.base.Splitter;
>   import com.google.common.collect.Lists;
>
> + //2013-08-19T04:39:07
> + import java.text.DateFormat;
> + import java.util.*;
> + import java.text.SimpleDateFormat;
> + import java.text.ParseException;
> +
> +
> +
>   /**
>    * Tool to import data from a TSV file.
>    *
> ***************
> *** 220,229 ****
>           getColumnOffset(timestampKeyColumnIndex),
>           getColumnLength(timestampKeyColumnIndex));
>       try {
> !       return Long.parseLong(timeStampStr);
>       } catch (NumberFormatException nfe) {
>         // treat this record as bad record
> !       throw new BadTsvLineException("Invalid timestamp " + timeStampStr);
>       }
>     }
>
> --- 228,239 ----
>           getColumnOffset(timestampKeyColumnIndex),
>           getColumnLength(timestampKeyColumnIndex));
>       try {
> !       return Long.parseLong(timeStampStr);
>       } catch (NumberFormatException nfe) {
> +       // Try this record with string to date in mseconds long
> +       return extractTimestampInput(timeStampStr);
>         // treat this record as bad record
> !       //throw new BadTsvLineException("Invalid timestamp " + timeStampStr);
>       }
>     }
>
> ***************
> *** 243,248 ****
> --- 253,274 ----
>         return lineBytes;
>       }
>     }
> +   public static long extractTimestampInput(String strDate) throws BadTsvLineException{
> +     final List<String> dateFormats = Arrays.asList("yyyy-MM-dd HH:mm:ss.SSS", "yyyy-MM-dd'T'HH:mm:ss");
> +
> +     for(String format: dateFormats){
> +       SimpleDateFormat sdf = new SimpleDateFormat(format);
> +       try{
> +         Date d= sdf.parse(strDate);
> +         long msecs = d.getTime();
> +         return msecs;
> +       } catch (ParseException e) {
> +         //intentionally empty
> +       }
> +     }
> +     // If we come here we have a problem with converting timestamps for this row.
> +     throw new BadTsvLineException("Invalid timestamp " + strDate);
> +   }
>
>   public static class BadTsvLineException extends Exception {
>     public BadTsvLineException(String err) {
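
For anyone reading the patch without an HBase checkout handy, the fallback it adds can be sketched as a standalone class, roughly as below. This is only an illustration under stated assumptions, not the code in ImportTsv: the class name TimestampFallback and the method parseTimestamp are hypothetical, IllegalArgumentException stands in for BadTsvLineException, and pinning the time zone to UTC and turning off lenient parsing are additions that the patch itself does not make (the patch uses the JVM default zone).

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Arrays;
import java.util.List;
import java.util.TimeZone;

// Hypothetical standalone sketch of the patch's fallback; not part of HBase.
final class TimestampFallback {

  // The same two patterns the patch tries, in the same order.
  private static final List<String> DATE_FORMATS =
      Arrays.asList("yyyy-MM-dd HH:mm:ss.SSS", "yyyy-MM-dd'T'HH:mm:ss");

  private TimestampFallback() {}

  /** Returns epoch milliseconds for a numeric string or a supported date string. */
  static long parseTimestamp(String value) {
    try {
      // First choice: the value is already epoch milliseconds.
      return Long.parseLong(value);
    } catch (NumberFormatException nfe) {
      // Not a plain number; fall through to the date patterns.
    }
    for (String format : DATE_FORMATS) {
      // SimpleDateFormat is not thread-safe, so build one per call.
      SimpleDateFormat sdf = new SimpleDateFormat(format);
      sdf.setLenient(false);                        // assumption: reject out-of-range fields
      sdf.setTimeZone(TimeZone.getTimeZone("UTC")); // assumption: input is in UTC
      try {
        return sdf.parse(value).getTime();
      } catch (ParseException e) {
        // This pattern did not match; try the next one.
      }
    }
    // Stand-in for BadTsvLineException in the patch.
    throw new IllegalArgumentException("Invalid timestamp " + value);
  }

  public static void main(String[] args) {
    System.out.println(parseTimestamp("1391464267000"));
    System.out.println(parseTimestamp("2013-08-19T04:39:07"));
    System.out.println(parseTimestamp("2013-08-19 04:39:07.250"));
  }
}
```

Note that SimpleDateFormat.parse(String) only needs to match a prefix of the input, so trailing characters after a valid date are silently accepted; a stricter version would likely parse with a ParsePosition and check that the whole string was consumed.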