hive-user mailing list archives

From Edward Capriolo <edlinuxg...@gmail.com>
Subject Re: java.io.FileNotFoundException: File does not exist on modified data.
Date Sun, 26 Sep 2010 02:12:03 GMT
On Monday, September 20, 2010, Bennie Schut <bschut@ebuddy.com> wrote:
> Hi all,
>
> We are sometimes getting file not found exceptions while
> running large queries on hive. During these large queries we also import data
> on the partitions we are querying which raises a question for us. How does hive
> handle data which is being modified in the background?
>
> We use insert overwrite on the partitions so I can imagine
> the large query can be surprised with some new files and some missing old
> files.
>
> If others are experiencing this how do they work around
> this? Perhaps partition on 2 keys so you don't overwrite existing data?
>
> Thanks for any pointers on this.
>
> Bennie.

I don't think Hive/MapReduce has a great way of dealing with moving
targets, i.e. when the content changes between getSplits and task
execution.
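
The window between split planning and task execution can be illustrated
with a small local-filesystem analogy. This is only a sketch, not Hive or
HDFS code: the function names (plan_splits, overwrite_partition, run_tasks)
and the file names are made up for illustration, and a temporary directory
stands in for a partition.

```python
import os
import tempfile


def plan_splits(partition_dir):
    """Planning step: record the file paths that exist right now."""
    return sorted(
        os.path.join(partition_dir, name) for name in os.listdir(partition_dir)
    )


def overwrite_partition(partition_dir, new_files):
    """Stand-in for an insert overwrite: drop old files, write new ones."""
    for name in os.listdir(partition_dir):
        os.remove(os.path.join(partition_dir, name))
    for name, content in new_files.items():
        with open(os.path.join(partition_dir, name), "w") as f:
            f.write(content)


def run_tasks(splits):
    """Execution step: each task opens the path it was given at planning time."""
    missing = []
    for path in splits:
        try:
            with open(path) as f:
                f.read()
        except FileNotFoundError:
            missing.append(os.path.basename(path))
    return missing


def demo():
    with tempfile.TemporaryDirectory() as partition_dir:
        overwrite_partition(partition_dir, {"old_0": "a", "old_1": "b"})
        splits = plan_splits(partition_dir)                 # query is planned
        overwrite_partition(partition_dir, {"new_0": "c"})  # import lands
        return run_tasks(splits)                            # tasks start late


missing_files = demo()
```

Every path recorded at planning time is gone by the time the tasks run,
which is the same shape of failure as the reported exception.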

We have a program, file crush, which crushes up small files. We
implemented read and write locks on a per-table basis. I suspect the new
zk (ZooKeeper) locking handles this better.
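
For anyone wanting the general idea, per-table read and write locks can be
sketched roughly as below. This is not the file crush code and not the
ZooKeeper-based locking; it is a single-process illustration, and the class
name TableLocks and the table name "clicks" are hypothetical. A real setup
would need the lock state to live somewhere all clients can see.

```python
import threading
from collections import defaultdict
from contextlib import contextmanager


class TableLocks:
    """Many readers or one writer per table name (in-process only)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = defaultdict(int)  # table -> active reader count
        self._writers = set()             # tables currently being written

    @contextmanager
    def read(self, table):
        with self._cond:
            while table in self._writers:
                self._cond.wait()
            self._readers[table] += 1
        try:
            yield
        finally:
            with self._cond:
                self._readers[table] -= 1
                self._cond.notify_all()

    @contextmanager
    def write(self, table):
        with self._cond:
            while table in self._writers or self._readers[table] > 0:
                self._cond.wait()
            self._writers.add(table)
        try:
            yield
        finally:
            with self._cond:
                self._writers.discard(table)
                self._cond.notify_all()


def demo():
    locks = TableLocks()
    events = []
    reader_started = threading.Event()
    release_reader = threading.Event()

    def query():
        with locks.read("clicks"):
            events.append("query start")
            reader_started.set()
            release_reader.wait()
            events.append("query end")

    def importer():
        reader_started.wait()          # make sure the query holds its lock first
        with locks.write("clicks"):    # blocks until the query has finished
            events.append("import")

    threads = [threading.Thread(target=query), threading.Thread(target=importer)]
    for t in threads:
        t.start()
    reader_started.wait()
    release_reader.set()
    for t in threads:
        t.join()
    return events
```

Queries would wrap their work in locks.read(table) and the import or
crush job in locks.write(table), so an overwrite never starts while a
query is still reading. Note that this simple version lets a steady
stream of readers starve a writer.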
