hive-user mailing list archives

From Sanjay Subramanian
Subject Re: S3/EMR Hive: Load contents of a single file
Date Tue, 26 Mar 2013 17:21:33 GMT
Hi Tony

Can you create the table without any location?

After that you could run an ALTER TABLE to add the partition and its location:

ALTER TABLE myData ADD PARTITION (partitionColumn1='$value1' , partitionColumn2='$value2')
LOCATION '/path/to/your/directory/in/hdfs';

An example Without Partitions

When specifying a location, you have to point to a directory. You cannot point to a file (IMHO).
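
For the non-partitioned case, a minimal sketch might look like the following. It assumes myData was created as a non-partitioned table with no LOCATION clause, and it reuses the S3 directory from the original question as the target. ALTER TABLE ... SET LOCATION is standard HiveQL, but whether it behaves as desired on a given EMR Hive version is an assumption worth checking.

```sql
-- Sketch only: assumes myData is NOT partitioned and was created without LOCATION.
-- The location must be a directory (prefix), not a single file.
ALTER TABLE myData SET LOCATION 's3://mybucket/path/to/data/';

-- Quick check that the table now reads the files under that directory.
SELECT * FROM myData LIMIT 10;
```

Every file under that directory becomes part of the table, so this only isolates a single file if the directory contains nothing else.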

Hope that helps


From: Tony Burton
Date: Tuesday, March 26, 2013 10:11 AM
Subject: S3/EMR Hive: Load contents of a single file

Hi list,

I've been using Hive to perform queries on data hosted on AWS S3, and my tables point at data
by specifying the directory in which the data is stored, e.g.

$ create external table myData (str1 string, str2 string, count1 int) partitioned by <snip>
row format <snip> stored as textfile location 's3://mybucket/path/to/data';

where s3://mybucket/path/to/data is the "directory" that contains the files I'm interested
in. My use case now is to create a table with data pointing to a specific file in a directory:

$ create external table myData (str1 string, str2 string, count1 int) partitioned by <snip>
row format <snip> stored as textfile location 's3://mybucket/path/to/data/src1.txt';

and I get the error: "FAILED: Error in metadata: MetaException(message:Got exception:
Can't make directory for path 's3://spinmetrics/global/counter_Fixture.txt' since it is a
file.)". OK, let's try to create the table without specifying the data source:

$ create external table myData (str1 string, str2 string, count1 int) partitioned by <snip>
row format <snip> stored as textfile;

OK, no problem. Now let's load the data:

$ LOAD DATA INPATH 's3://mybucket/path/to/data/src1.txt' INTO TABLE myData;

(referring to the Hive documentation: "...filepath can refer
to a file (in which case hive will move the file into the table)")

Error message is: " FAILED: Error in semantic analysis: Line 1:17 Path is not legal ''s3://mybucket/path/to/data/src1.txt":
Move from: s3:// mybucket/path/to/data/src1.txt to: hdfs://
is not valid. Please check that values for params "" and "hive.metastore.warehouse.dir"
do not conflict."

So I check my and hive.metastore.warehouse.dir (which have never caused problems):

$ set;
$ set hive.metastore.warehouse.dir;

Clearly different, but which is correct? Is there an easier way to load a single file into
a hive table? Or should I just put each file in a directory and proceed as before?
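
The "one file per directory" option could be sketched as below. The prefix s3://mybucket/path/to/data/src1/ and the table name myData_src1 are hypothetical, the tab delimiter stands in for the row format that was snipped above, and the partition clause is left out for brevity. The sketch assumes src1.txt has already been copied into that prefix on its own.

```sql
-- Sketch only: assumes src1.txt is the sole object under the hypothetical
-- prefix s3://mybucket/path/to/data/src1/
CREATE EXTERNAL TABLE myData_src1 (str1 STRING, str2 STRING, count1 INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'  -- assumed; the real row format was snipped
STORED AS TEXTFILE
LOCATION 's3://mybucket/path/to/data/src1/';     -- a directory, not the file itself
```

Because the table is external, dropping it later removes only the metadata and leaves the file in S3.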



Tony Burton
Senior Software Engineer

