hive-user mailing list archives

From Keith Wiley
Subject Re: S3/EMR Hive: Load contents of a single file
Date Wed, 27 Mar 2013 17:02:36 GMT
Okay, I also saw your previous response, which analyzed queries against two tables built around
two files in the same directory.  I guess I was simply wrong in my understanding that a Hive
table is fundamentally associated with a directory instead of a file.  Turns out, it can be
either one.  A directory table uses all files in the directory, while a file table uses one
specific file and properly ignores sibling files.  My bad.
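
For the archive, a minimal sketch of the two cases in HiveQL.  The table name gsrc1 and
the S3 paths come from Tony's message below; the name gsrc_dir and the single-column
schema are made up for illustration.  The file-level LOCATION is reported in this thread
for EMR against S3 only, so it may not carry over to other setups.

```sql
-- File table: LOCATION names one file, so sibling files are ignored.
-- (Schema is hypothetical; table name and path are from the thread.)
CREATE EXTERNAL TABLE gsrc1 (line STRING)
LOCATION 's3://mybucket/path/to/data/src1.txt';

-- Directory table: LOCATION names the directory, so every file in it is read.
-- (gsrc_dir is a hypothetical name.)
CREATE EXTERNAL TABLE gsrc_dir (line STRING)
LOCATION 's3://mybucket/path/to/data/';

-- Check which location each table actually recorded.
DESCRIBE EXTENDED gsrc1;
DESCRIBE EXTENDED gsrc_dir;
```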

Thanks for the careful analysis and clarification.  TIL!


On Mar 27, 2013, at 02:58 , Tony Burton wrote:

> A bit more info - do an extended description of the table:
> $ desc extended gsrc1;
> And the “location” field is “location:s3://mybucket/path/to/data/src1.txt”
> Do the same on a table created with a location pointing at the directory and the same
> info gives (not surprisingly) “location:s3://mybucket/path/to/data/”

Keith Wiley

"I used to be with it, but then they changed what it was.  Now, what I'm with
isn't it, and what's it seems weird and scary to me."
                                           --  Abe (Grandpa) Simpson
