hive-dev mailing list archives

From "Daniel Papp (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HIVE-17487) Example fails on the Hive Getting started page
Date Fri, 08 Sep 2017 15:12:00 GMT
Daniel Papp created HIVE-17487:
----------------------------------

             Summary: Example fails on the Hive Getting started page
                 Key: HIVE-17487
                 URL: https://issues.apache.org/jira/browse/HIVE-17487
             Project: Hive
          Issue Type: Bug
            Reporter: Daniel Papp
            Priority: Trivial


The [Hive Getting Started|https://cwiki.apache.org/confluence/display/Hive/GettingStarted]
page has an example that uses the MovieLens 100k dataset. The mapper is defined as a Python
script in the following way:

{code}
import sys
import datetime

for line in sys.stdin:
  line = line.strip()
  userid, movieid, rating, unixtime = line.split('\t')
  weekday = datetime.datetime.fromtimestamp(float(unixtime)).isoweekday()
  print '\t'.join([userid, movieid, rating, str(weekday)])
{code}

which is correct only if you're using the Python 2 series; under Python 3 the print statement
is a syntax error, so the example fails. The following code works with both the 2 and 3 series:

{code}
from __future__ import print_function
import sys
import datetime

for line in sys.stdin:
  line = line.strip()
  userid, movieid, rating, unixtime = line.split('\t')
  weekday = datetime.datetime.fromtimestamp(float(unixtime)).isoweekday()
  print('\t'.join([userid, movieid, rating, str(weekday)]))
{code}

I think the example on the wiki page should be corrected to the version-independent form.
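
To check the version-independent mapper outside Hive, the per-row logic can be pulled into a function and run on a sample row under both interpreters. This is only a sketch: the names map_line and run are hypothetical and not part of the wiki example, and the input row is illustrative. The weekday value depends on the local time zone, because fromtimestamp() converts to local time, just as in the original script.

```python
from __future__ import print_function
import datetime
import sys


def map_line(line):
    # Hypothetical helper: same per-row logic as the script above.
    userid, movieid, rating, unixtime = line.strip().split('\t')
    # fromtimestamp() converts using the local time zone, as in the original.
    weekday = datetime.datetime.fromtimestamp(float(unixtime)).isoweekday()
    return '\t'.join([userid, movieid, rating, str(weekday)])


def run(lines, out=sys.stdout):
    # `lines` can be sys.stdin (when used as the mapper) or any iterable of rows.
    for line in lines:
        print(map_line(line), file=out)


# Illustrative row in the userid/movieid/rating/unixtime layout.
run(['196\t242\t3\t881250949\n'])
```

When used as the actual mapper, the last line would become run(sys.stdin) instead of the sample row.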



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
