Thanks for your positive answer.


From your answer I picked up the keyword “map join” and looked into it. Do you mean that I can do what this blog post describes:


If you don't mind, please take a look at the website.



From: Nitin Pawar []
Sent: July 16, 2013 15:29
Subject: Re: hive task fails when left semi join


Can you try a map-only join?

One of your tables has only 1k records, so a map join should run faster, and hopefully you will not hit the memory condition.
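
As a rough sketch of what this could look like in HiveQL: the table and column names (big_table, small_table, id) are placeholders and not taken from the failing job. hive.auto.convert.join and hive.mapjoin.smalltable.filesize are standard Hive settings, but whether the conversion actually happens depends on the Hive version and on the real size of the small table, so treat this as a starting point rather than a verified fix.

```sql
-- Let Hive convert the join to a map-side join when one side is small enough.
SET hive.auto.convert.join=true;
-- Size threshold in bytes below which a table is treated as "small".
SET hive.mapjoin.smalltable.filesize=25000000;

-- big_table, small_table and id are placeholder names for illustration only.
SELECT b.*
FROM big_table b
LEFT SEMI JOIN small_table s ON (b.id = s.id);
```

Running EXPLAIN on the query should show a Map Join Operator in the plan if the conversion was applied.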


On Tue, Jul 16, 2013 at 12:56 PM, <> wrote:



I am trying to filter out some records from a table in Hive.

The table has more than 4 billion rows.

I run a left semi join between this table and a small table with 1k rows.


However, after running for 3 hours, the job ended with a failed status.


My questions are as follows:

1.     How can I address this problem and finally solve it?

2.     Are there any other good methods for filtering out records with given conditions?


The following picture is a snapshot of the failed job.



Nitin Pawar