hive-user mailing list archives

From Sai Sai <saigr...@yahoo.in>
Subject hive newb questions
Date Mon, 04 Mar 2013 08:30:27 GMT
Hi
I was wondering if it is right to assume:

1. The first time we create a table in Hive, load it, and run a first query
like

Select * from Table1

an MR job will run and return the data to us.

If we run the same query a second time, the MR job will not run and Hive will just fetch the
data.

2. If the above assumption is not right, is it possible to cache the data in Hive so that the MR job
will not run again for subsequent queries and the data is just fetched right away?

3. Once we load the data into a Hive table, how many days should we keep it?

4. Is it good practice to remove the data after a certain period of time, since it may take up a lot of
space?

5. Should this really be a concern, given that memory today is not that expensive?

Any input will be appreciated.
Thanks
Sai