Subject: Re: Hive External Table issue
From: sanjeev sagar <sanjeev.sagar@gmail.com>
To: Nitin Pawar
Cc: user@hive.apache.org
Date: Thu, 20 Jun 2013 11:03:45 -0700

Two issues:

1. I've created external tables in Hive based on a file location before,
and it worked without any issue. It doesn't have to be a directory.

2. If there is more than one file in the directory and you create the
external table based on the directory, how does the table know which file
to read the data from? I tried creating the table based on the directory;
the table was created, but all the rows were NULL.

-Sanjeev

On Thu, Jun 20, 2013 at 10:30 AM, Nitin Pawar wrote:

> In Hive, when you create a table and use LOCATION to refer to an HDFS
> path, that path is supposed to be a directory. If the directory does
> not exist, Hive will try to create it; if the path is a file, Hive will
> throw an error because it is not a directory.
>
> That's the error you are getting: the location you referred to is a
> file. Change it to the directory and see if that works for you.
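For reference, a minimal sketch of the directory-based variant Nitin is
describing, using only the paths and the regex already quoted in this
thread. Hive reads every file placed directly under the LOCATION
directory, which also answers the "which file does the table read?"
question above; here only the LOCATION line changes, since the 13-06-13
directory already holds the FlumeData file:

hive> CREATE EXTERNAL TABLE access(
    >   host STRING, identity STRING, user STRING, time STRING,
    >   request STRING, status STRING, size STRING, referer STRING,
    >   agent STRING)
    > ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
    > WITH SERDEPROPERTIES (
    >   "input.regex" = "([^ ]*) ([^ ]*) ([^ ]*) (-|\\[[^\\]]*\\]) ([^ \"]*|\"[^\"]*\") (-|[0-9]*) (-|[0-9]*)(?: ([^ \"]*|\"[^\"]*\") ([^ \"]*|\"[^\"]*\"))?",
    >   "output.format.string" = "%1$s %2$s %3$s %4$s %5$s %6$s %7$s %8$s %9$s"
    > )
    > STORED AS TEXTFILE
    > LOCATION '/user/flume/events/request_logs/ar1.vgs.mypoints.com/13-06-13/';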
>
> On Thu, Jun 20, 2013 at 10:57 PM, sanjeev sagar wrote:
>
>> I did mention in my mail that the HDFS file exists in that location.
>> See below.
>>
>> In HDFS, the file exists:
>>
>> hadoop fs -ls
>> /user/flume/events/request_logs/ar1.vgs.mypoints.com/13-06-13/FlumeData.1371144648033
>>
>> Found 1 items
>> -rw-r--r--   3 hdfs supergroup 2242037226 2013-06-13 11:14
>> /user/flume/events/request_logs/ar1.vgs.mypoints.com/13-06-13/FlumeData.1371144648033
>>
>> So the directory and the file both exist.
>>
>> On Thu, Jun 20, 2013 at 10:24 AM, Nitin Pawar wrote:
>>
>>> MetaException(message:hdfs://h1.vgs.mypoints.com:8020/user/flume/events/request_logs/ar1.vgs.mypoints.com/13-06-13/FlumeData.1371144648033
>>> is not a directory or unable to create one)
>>>
>>> It clearly says it's not a directory. Point to the directory and it
>>> will work.
>>>
>>> On Thu, Jun 20, 2013 at 10:52 PM, sanjeev sagar wrote:
>>>
>>>> Hello everyone, I'm running into the following Hive external table
>>>> issue.
>>>>
>>>> hive> CREATE EXTERNAL TABLE access(
>>>>     >        host STRING,
>>>>     >        identity STRING,
>>>>     >        user STRING,
>>>>     >        time STRING,
>>>>     >        request STRING,
>>>>     >        status STRING,
>>>>     >        size STRING,
>>>>     >        referer STRING,
>>>>     >        agent STRING)
>>>>     >        ROW FORMAT SERDE
>>>>     >        'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
>>>>     >        WITH SERDEPROPERTIES (
>>>>     >        "input.regex" = "([^ ]*) ([^ ]*) ([^ ]*) (-|\\[[^\\]]*\\]) ([^ \"]*|\"[^\"]*\") (-|[0-9]*) (-|[0-9]*)(?: ([^ \"]*|\"[^\"]*\") ([^ \"]*|\"[^\"]*\"))?",
>>>>     >        "output.format.string" = "%1$s %2$s %3$s %4$s %5$s %6$s %7$s %8$s %9$s"
>>>>     >        )
>>>>     >        STORED AS TEXTFILE
>>>>     >        LOCATION
>>>>     >        '/user/flume/events/request_logs/ar1.vgs.mypoints.com/13-06-13/FlumeData.1371144648033';
>>>>
>>>> FAILED: Error in metadata:
>>>> MetaException(message:hdfs://h1.vgs.mypoints.com:8020/user/flume/events/request_logs/ar1.vgs.mypoints.com/13-06-13/FlumeData.1371144648033
>>>> is not a directory or unable to create one)
>>>> FAILED: Execution Error, return code 1 from
>>>> org.apache.hadoop.hive.ql.exec.DDLTask
>>>>
>>>> In HDFS, the file exists:
>>>>
>>>> hadoop fs -ls
>>>> /user/flume/events/request_logs/ar1.vgs.mypoints.com/13-06-13/FlumeData.1371144648033
>>>>
>>>> Found 1 items
>>>> -rw-r--r--   3 hdfs supergroup 2242037226 2013-06-13 11:14
>>>> /user/flume/events/request_logs/ar1.vgs.mypoints.com/13-06-13/FlumeData.1371144648033
>>>>
>>>> I've downloaded the serde2 jar file too, installed it as
>>>> /usr/lib/hive/lib/hive-json-serde-0.2.jar, and bounced all the
>>>> Hadoop services after that.
>>>>
>>>> I even added the jar file manually in Hive and ran the above SQL,
>>>> but it is still failing:
>>>>
>>>> hive> add jar /usr/lib/hive/lib/hive-json-serde-0.2.jar;
>>>> Added /usr/lib/hive/lib/hive-json-serde-0.2.jar to class path
>>>> Added resource: /usr/lib/hive/lib/hive-json-serde-0.2.jar
>>>>
>>>> Any help would be highly appreciated.
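On the all-NULL rows from the directory attempt: the contrib RegexSerDe
documents that any input line which does not match input.regex comes back
with every column NULL, so NULLs usually point at the regex rather than
at the table or its location. Also worth noting, the class named in the
DDL (org.apache.hadoop.hive.contrib.serde2.RegexSerDe) ships in the
hive-contrib jar, not in hive-json-serde-0.2.jar. A minimal check, with
the hive-contrib jar name and version here only illustrative:

hive> add jar /usr/lib/hive/lib/hive-contrib-0.10.0.jar;
hive> SELECT host, request, status FROM access LIMIT 5;
-- All-NULL output here means input.regex did not match the sample
-- lines; test the regex against one raw log line before changing the DDL.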
>>>>
>>>> -Sanjeev
>>>>
>>>> --
>>>> Sanjeev Sagar
>>>>
>>>> "Separate yourself from everything that separates you from others!"
>>>> - Nirankari Baba Hardev Singh ji
>>>
>>> --
>>> Nitin Pawar
>>
>> --
>> Sanjeev Sagar
>
> --
> Nitin Pawar

--
Sanjeev Sagar

"Separate yourself from everything that separates you from others!"
- Nirankari Baba Hardev Singh ji