hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Igor Kravzov <igork.ine...@gmail.com>
Subject Re: Convert date in string format to timestamp in table definition
Date Sat, 04 Jun 2016 20:31:51 GMT
Thanks Dudu.
So if I need actual date I will use view.
Regarding partition column:  I can create 2 external tables based on the
same data with integer or string column partition and see which one is more
convenient for our use.

On Sat, Jun 4, 2016 at 2:20 PM, Markovitz, Dudu <dmarkovitz@paypal.com>
wrote:

> I’m not aware of an option to do what you request in the external table
> definition but you might want to that using a view.
>
>
>
> P.s.
>
> I seems to me that defining the partition column as a string would be more
> user friendly than integer, e.g. –
>
>
>
> select * from threads_test where yyyymmdd like ‘2016%’ – year 2016;
>
> select * from threads_test where yyyymmdd like ‘201603%’ –- March 2016;
>
> select * from threads_test where yyyymmdd like ‘______01’ -- first of
> every month;
>
>
>
>
>
>
>
>
>
>
>
> $ hdfs dfs -ls -R /tmp/threads_test
>
> drwxr-xr-x   - cloudera supergroup          0 2016-06-04 10:45
> /tmp/threads_test/20160604
>
> -rw-r--r--   1 cloudera supergroup        136 2016-06-04 10:45
> /tmp/threads_test/20160604/data.txt
>
>
>
> $ hdfs dfs -cat /tmp/threads_test/20160604/data.txt
>
> {"url":"www.blablabla.com
> ","pageType":"pg1","addDate":"2016-05-17T02:10:44.527","postDate":"2016-05-16T02:08:55","postText":"YadaYada"}
>
>
>
>
> ----------------------------------------------------------------------------------------------------
>
>
>
>
>
> hive> add jar /usr/lib/hive-hcatalog/share/hcatalog/hive-hcatalog-core.jar;
>
>
>
> hive>
>
> create external table threads_test
>
> (
>
>     url         string
>
>    ,pagetype    string
>
>    ,adddate     string
>
>    ,postdate    string
>
>    ,posttext    string
>
> )
>
>     partitioned by (yyyymmdd string)
>
>     row format serde 'org.apache.hive.hcatalog.data.JsonSerDe'
>
>     location '/tmp/threads_test'
>
> ;
>
>
>
> hive> alter table threads_test add partition (yyyymmdd=20160604) location
> '/tmp/threads_test/20160604';
>
>
>
> hive> select * from threads_test;
>
>
>
> www.blablabla.com    pg1    2016-05-17T02:10:44.527
> 2016-05-16T02:08:55  YadaYada      20160604
>
>
>
> hive>
>
> create view threads_test_v
>
> as
>
> select      url
>
>            ,pagetype
>
>            ,cast (concat_ws(' ',substr (adddate ,1,10),substr (adddate
> ,12)) as timestamp)  as adddate
>
>            ,cast (concat_ws(' ',substr (postdate,1,10),substr
> (postdate,12)) as timestamp)  as postdate
>
>            ,posttext
>
>
>
> from        threads_test
>
> ;
>
>
>
> hive> select * from threads_test_v;
>
>
>
> www.blablabla.com    pg1    2016-05-17 02:10:44.527    2016-05-16
> 02:08:55  YadaYada
>
>
>
>
>
> *From:* Igor Kravzov [mailto:igork.inexso@gmail.com]
> *Sent:* Saturday, June 04, 2016 8:13 PM
> *To:* user@hive.apache.org
> *Subject:* Convert date in string format to timestamp in table definition
>
>
>
> Hi,
>
>
>
> I have 2 dates in Json file defined like this
>
> "addDate": "2016-05-17T02:10:44.527",
>
>   "postDate": "2016-05-16T02:08:55",
>
>
>
> Right now I define external table based on this file like this:
>
> CREATE external TABLE threads_test
>
> (url string,
>
>  pagetype string,
>
>  adddate string,
>
>  postdate string,
>
>  posttext string)
>
> partitioned by (yyyymmdd int)
>
> ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
>
> location 'my location';
>
>
>
> is it possible to define these 2 dates as timestamp?
>
> Do I need to change date format in the file? is it possible to specify
> date format in table definition?
>
> Or I better off with string?
>
>
>
> Thanks in advance.
>

Mime
View raw message