Date: Mon, 25 May 2009 16:59:11 +0900
From: Manhee Jo <jo@nttdocomo.com>
To: hive-user@hadoop.apache.org
Subject: Re: Hive and Hadoop streaming

Thank you so much!!!
----- Original Message -----
From: Zheng Shao
To: hive-user@hadoop.apache.org
Sent: Monday, May 25, 2009 4:33 PM
Subject: Re: Hive and Hadoop streaming

In this case, you just need to compile your .java into a jar file, and do:

add jar fullpath/to/myprogram.jar;
SELECT TRANSFORM(col1, col2, col3, col4)
USING "java -cp myprogram.jar WeekdayMapper"
AS (outcol1, outcol2, outcol3, outcol4)

Let us know if it works out or not.

Zheng

On Sun, May 24, 2009 at 10:50 PM, Manhee Jo <jo@nttdocomo.com> wrote:

Thank you Zheng,
Here is my WeekdayMapper.java, which is just a test that does almost the
same thing as "weekday_mapper.py" does. As you can see below, it takes
neither a WritableComparable nor a Writable class. It receives the 4
columns as plain string arguments. Any advice would be greatly appreciated.

/**
 * WeekdayMapper.java
 */

import java.io.*;
import java.util.*;

class WeekdayMapper {
    public static void main(String[] args) throws IOException {
        Scanner stdIn = new Scanner(System.in);
        String line = null;
        String[] column;
        long unixTime;
        Date d;
        GregorianCalendar cal1 = new GregorianCalendar();

        while (stdIn.hasNextLine()) {
            line = stdIn.nextLine();
            column = line.split("\t");
            unixTime = Long.parseLong(column[3]);
            d = new Date(unixTime * 1000);
            cal1.setTime(d);
            int dow = cal1.get(Calendar.DAY_OF_WEEK);
            System.out.println(column[0] + "\t" + column[1] + "\t"
                + column[2] + "\t" + dow);
        }
    }
}

Thanks,
Manhee

----- Original Message -----
From: Zheng Shao
To: hive-user@hadoop.apache.org
Sent: Monday, May 25, 2009 10:28 AM
Subject: Re: Hive and Hadoop streaming

How does your Java map function receive the 4 columns?
I assume your Java map function takes a WritableComparable key and a
Writable value.

Zheng

2009/5/24 Manhee Jo <jo@nttdocomo.com>:

I have some mappers already coded in Java, so I want to reuse them as
much as possible in the Hive environment. How can I call a Java mapper
from "select transform" in Hive? For example, what is wrong with the
query below, and why?
INSERT OVERWRITE TABLE u_data_new
SELECT
  TRANSFORM (userid, movieid, rating, unixtime)
  USING 'java WeekdayMapper'
  AS (userid, movieid, rating, weekday)
FROM u_data;

Thank you.

Regards,
Manhee

--
Yours,
Zheng
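[Editor's note: since TRANSFORM simply pipes tab-separated rows to the script's stdin and reads tab-separated rows back from stdout, the mapper logic above can be exercised locally before wiring it into Hive. Below is a sketch that factors the per-line conversion into a helper method so it can be tested without Hive. The class name `WeekdayMapperLocal`, the sample row, and the explicit UTC timezone are illustrative additions; the original code uses the JVM's default timezone, so the computed weekday can differ for timestamps near midnight.]

```java
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Scanner;
import java.util.TimeZone;

public class WeekdayMapperLocal {
    // Convert one tab-separated input row (userid, movieid, rating, unixtime)
    // into an output row where unixtime is replaced by the day of week.
    // Calendar.DAY_OF_WEEK is 1-based: Sunday = 1 ... Saturday = 7.
    static String mapLine(String line) {
        String[] column = line.split("\t");
        long unixTime = Long.parseLong(column[3]);
        // Pin the timezone so results are reproducible across machines
        // (an addition; the original relies on the JVM default timezone).
        GregorianCalendar cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        cal.setTime(new Date(unixTime * 1000L));
        int dow = cal.get(Calendar.DAY_OF_WEEK);
        return column[0] + "\t" + column[1] + "\t" + column[2] + "\t" + dow;
    }

    public static void main(String[] args) {
        // Same streaming contract Hive's TRANSFORM uses:
        // one tab-separated row in, one tab-separated row out.
        Scanner stdIn = new Scanner(System.in);
        while (stdIn.hasNextLine()) {
            System.out.println(mapLine(stdIn.nextLine()));
        }
    }
}
```

For example, piping the first row of the MovieLens u.data file through it, `196 242 3 881250949` (tab-separated), yields `196 242 3 5`, since 881250949 falls on a Thursday in UTC.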
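[Editor's note: putting Zheng's advice together with the original query, the end-to-end Hive session would look roughly like this. The jar name and path are illustrative; `add jar` ships the jar to the task nodes, and `-cp` lets the `java` command find the class there, which is what the bare `'java WeekdayMapper'` invocation was missing.]

```sql
-- Compile and package the mapper first, e.g.:
--   javac WeekdayMapper.java && jar cf myprogram.jar WeekdayMapper.class
add jar /fullpath/to/myprogram.jar;

INSERT OVERWRITE TABLE u_data_new
SELECT
  TRANSFORM (userid, movieid, rating, unixtime)
  USING 'java -cp myprogram.jar WeekdayMapper'
  AS (userid, movieid, rating, weekday)
FROM u_data;
```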