hive-user mailing list archives

From Suhail Doshi <>
Subject Re: LOAD DATA question
Date Sun, 22 Mar 2009 23:24:29 GMT

Do you know if Hive may have problems going through *lots* of log files
(each 1 MB large)? I remember reading that Hadoop sometimes has
trouble dealing with lots of small files because of the default block size it
uses.

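One common workaround for the small-files concern is to concatenate batches of rotated logs into fewer, larger files before loading them. A minimal sketch, assuming local rotated logs named log.1, log.2, ... (the directory path, batch size, and output naming are illustrative, not anything Hive prescribes):

```python
import glob
import os
import time

def merge_logs(log_dir, batch_size=64):
    """Concatenate small rotated logs (log.1, log.2, ...) into larger
    files so Hadoop sees fewer, bigger inputs. Returns the merged paths."""
    logs = sorted(glob.glob(os.path.join(log_dir, "log.*")))
    merged = []
    for i in range(0, len(logs), batch_size):
        # Unique name per batch: unix timestamp plus a batch index,
        # so repeated runs do not collide with earlier output files.
        out_name = "merged.%d.%04d" % (int(time.time()), i // batch_size)
        out_path = os.path.join(log_dir, out_name)
        with open(out_path, "wb") as out:
            for path in logs[i:i + batch_size]:
                with open(path, "rb") as f:
                    out.write(f.read())
        merged.append(out_path)
    return merged
```

With ~1 MB logs and a batch size of 64, each merged file lands near the traditional 64 MB HDFS default block size, which reduces per-file overhead on the NameNode and per-split overhead in MapReduce jobs.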

On Sun, Mar 22, 2009 at 4:16 PM, Zheng Shao <> wrote:

> For now, please append the unix timestamp to the end of the file name.
> Zheng
> On Sun, Mar 22, 2009 at 12:35 PM, Suhail Doshi <>wrote:
>> Hi there,
>> I was reading some of the documentation and I came across this statement:
>> "Note that if the target table (or partition) already has a file whose name
>> collides with any of the filenames contained in *filepath* - then the
>> existing file will be replaced with the new file."
>> I have rotating data logs that start at log.1 and go to log.512 and wrap
>> around back to log.1, does this mean that when I try to LOAD DATA log.1
>> again it's going to overwrite the other one?
>> In normal MySQL, this data is just constantly appended regardless of the
>> file name, but given how it's likely the file is being loaded in hdfs this
>> probably is different. If what I am thinking is happening, what is the
>> solution for rotating log files?
>> Thanks,
>> Suhail
> --
> Yours,
> Zheng
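Zheng's suggestion above — append the unix timestamp to the file name so a recycled name like log.1 never collides with an earlier load — can be sketched as follows (the copy-then-load approach and helper name are illustrative assumptions, not part of Hive itself):

```python
import os
import time

def timestamped_copy(src):
    """Copy a rotated log to a collision-free name by appending the
    current unix timestamp, so a later LOAD DATA of the recycled
    log.1 does not replace the earlier file of the same name.
    Copying (rather than renaming) leaves the rotating log in place."""
    dst = "%s.%d" % (src, int(time.time()))
    with open(src, "rb") as f, open(dst, "wb") as out:
        out.write(f.read())
    return dst
```

The timestamped copy is then what you hand to Hive, e.g. `LOAD DATA LOCAL INPATH '/logs/log.1.1237764269' INTO TABLE logs;` (path and table name illustrative). Since the name is unique per load, each LOAD DATA adds a new file to the table's directory instead of replacing one.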

