Return-Path:
X-Original-To: apmail-hbase-issues-archive@www.apache.org
Delivered-To: apmail-hbase-issues-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
    by minotaur.apache.org (Postfix) with SMTP id 07F3E10E90
    for ; Mon, 3 Feb 2014 21:49:12 +0000 (UTC)
Received: (qmail 53409 invoked by uid 500); 3 Feb 2014 21:49:07 -0000
Delivered-To: apmail-hbase-issues-archive@hbase.apache.org
Received: (qmail 53316 invoked by uid 500); 3 Feb 2014 21:49:07 -0000
Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Delivered-To: mailing list issues@hbase.apache.org
Received: (qmail 53200 invoked by uid 99); 3 Feb 2014 21:49:06 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28)
    by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 03 Feb 2014 21:49:06 +0000
Date: Mon, 3 Feb 2014 21:49:06 +0000 (UTC)
From: "Nick Dimiduk (JIRA)"
To: issues@hbase.apache.org
Message-ID:
In-Reply-To:
References:
Subject: [jira] [Updated] (HBASE-10385) ImportTsv to parse date time from typical loader formats
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394


     [ https://issues.apache.org/jira/browse/HBASE-10385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nick Dimiduk updated HBASE-10385:
---------------------------------

    Status: Patch Available  (was: Open)

> ImportTsv to parse date time from typical loader formats
> --------------------------------------------------------
>
>                 Key: HBASE-10385
>                 URL: https://issues.apache.org/jira/browse/HBASE-10385
>             Project: HBase
>          Issue Type: New Feature
>          Components: mapreduce
>    Affects Versions: 0.96.1.1
>            Reporter: Vijay Sarvepali
>            Priority: Minor
>              Labels: importtsv
>         Attachments: HBASE-10385.patch
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> Simple patch to enable parsing of standard date time fields from
> TSV files into Hbase.
> ***************
> *** 57,62 ****
> --- 57,70 ----
>   import com.google.common.base.Splitter;
>   import com.google.common.collect.Lists;
>
> + //2013-08-19T04:39:07
> + import java.text.DateFormat;
> + import java.util.*;
> + import java.text.SimpleDateFormat;
> + import java.text.ParseException;
> +
> +
> +
>   /**
>    * Tool to import data from a TSV file.
>    *
> ***************
> *** 220,229 ****
>             getColumnOffset(timestampKeyColumnIndex),
>             getColumnLength(timestampKeyColumnIndex));
>         try {
> !         return Long.parseLong(timeStampStr);
>         } catch (NumberFormatException nfe) {
>           // treat this record as bad record
> !         throw new BadTsvLineException("Invalid timestamp " + timeStampStr);
>         }
>       }
>
> --- 228,239 ----
>             getColumnOffset(timestampKeyColumnIndex),
>             getColumnLength(timestampKeyColumnIndex));
>         try {
> !         return Long.parseLong(timeStampStr);
>         } catch (NumberFormatException nfe) {
> +         // Try this record with string to date in mseconds long
> +         return extractTimestampInput(timeStampStr);
>           // treat this record as bad record
> !         //throw new BadTsvLineException("Invalid timestamp " + timeStampStr);
>         }
>       }
>
> ***************
> *** 243,248 ****
> --- 253,274 ----
>         return lineBytes;
>       }
>     }
> +   public static long extractTimestampInput(String strDate) throws BadTsvLineException{
> +     final List<String> dateFormats = Arrays.asList("yyyy-MM-dd HH:mm:ss.SSS", "yyyy-MM-dd'T'HH:mm:ss");
> +
> +     for(String format: dateFormats){
> +       SimpleDateFormat sdf = new SimpleDateFormat(format);
> +       try{
> +         Date d= sdf.parse(strDate);
> +         long msecs = d.getTime();
> +         return msecs;
> +       } catch (ParseException e) {
> +         //intentionally empty
> +       }
> +     }
> +     // If we come here we have a problem with converting timestamps for this row.
> +     throw new BadTsvLineException("Invalid timestamp " + strDate);
> +   }
>
>     public static class BadTsvLineException extends Exception {
>       public BadTsvLineException(String err) {

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
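
For readers who want to try the fallback outside of ImportTsv, the approach in the patch can be sketched as a standalone class. This is only an illustration, not the attached patch: the class name `TimestampFallbackSketch`, the method name `parseTimestamp`, and the use of `IllegalArgumentException` in place of `BadTsvLineException` are made up for the example. Pinning the time zone to UTC and turning off lenient parsing are also assumptions added here; the patch as posted does neither, so it would interpret dates in the JVM default time zone.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Arrays;
import java.util.List;
import java.util.TimeZone;

// Hypothetical standalone sketch of the fallback described in the patch.
final class TimestampFallbackSketch {

    // The two patterns listed in the patch.
    private static final List<String> DATE_FORMATS =
        Arrays.asList("yyyy-MM-dd HH:mm:ss.SSS", "yyyy-MM-dd'T'HH:mm:ss");

    /** Returns epoch milliseconds: a plain long is tried first, then each date pattern. */
    static long parseTimestamp(String value) {
        try {
            return Long.parseLong(value);
        } catch (NumberFormatException nfe) {
            // Not a numeric timestamp; fall through to the date patterns.
        }
        for (String format : DATE_FORMATS) {
            // SimpleDateFormat is not thread-safe, so a fresh instance is created per call.
            SimpleDateFormat sdf = new SimpleDateFormat(format);
            sdf.setLenient(false);                        // assumption: reject e.g. month 13
            sdf.setTimeZone(TimeZone.getTimeZone("UTC")); // assumption: input is in UTC
            try {
                return sdf.parse(value).getTime();
            } catch (ParseException e) {
                // This pattern did not match; try the next one.
            }
        }
        // Stand-in for BadTsvLineException in the real tool.
        throw new IllegalArgumentException("Invalid timestamp " + value);
    }

    public static void main(String[] args) {
        System.out.println(parseTimestamp("1391464146000"));
        System.out.println(parseTimestamp("2013-08-19T04:39:07"));
        System.out.println(parseTimestamp("2013-08-19 04:39:07.123"));
    }
}
```

Without an explicit time zone, the same input line would produce different cell timestamps depending on where the job runs, which may or may not be what a loader wants.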