hive-dev mailing list archives

From Mohit <mohitsi...@huawei.com>
Subject RE: Regarding Hive History File(s).
Date Wed, 05 Jan 2011 05:41:04 GMT
Hmm, OK. I think creating and cleaning up these resources should be part of
the same system; let's not hand the cleanup over to cron. Users might not
know, and should not need to know, which files to delete, when to delete
them, or where to delete them from.

 

What about a timer task that cleans up files older than a configured age,
say deleting files that are an hour old or a week old?
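
A minimal sketch of what such a timer task could look like in Java, using
java.util.Timer and java.nio.file. This is an illustration only, not code
from Hive or from any patch: the class name HistoryFileCleaner and its
methods are hypothetical, and the directory passed in is assumed to be
wherever the history files are written.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;
import java.util.Timer;
import java.util.TimerTask;

/** Hypothetical helper (not part of Hive): periodically deletes old files in one directory. */
final class HistoryFileCleaner extends TimerTask {
    private final Path dir;
    private final Duration maxAge;

    HistoryFileCleaner(Path dir, Duration maxAge) {
        this.dir = dir;
        this.maxAge = maxAge;
    }

    /**
     * Deletes regular files directly under dir whose last-modified time is
     * earlier than (now - maxAge). Returns the number of files deleted.
     */
    public static int deleteOlderThan(Path dir, Duration maxAge, Instant now) throws IOException {
        Instant cutoff = now.minus(maxAge);
        int deleted = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
            for (Path file : files) {
                if (Files.isRegularFile(file)
                        && Files.getLastModifiedTime(file).toInstant().isBefore(cutoff)) {
                    Files.delete(file);
                    deleted++;
                }
            }
        }
        return deleted;
    }

    @Override
    public void run() {
        try {
            deleteOlderThan(dir, maxAge, Instant.now());
        } catch (IOException e) {
            // Swallow the error so one failed sweep does not cancel the timer.
            System.err.println("History cleanup failed: " + e.getMessage());
        }
    }

    /** Starts a daemon timer that sweeps dir immediately and then once per period (must be positive). */
    public static Timer schedule(Path dir, Duration maxAge, Duration period) {
        Timer timer = new Timer("history-cleaner", true);
        timer.schedule(new HistoryFileCleaner(dir, maxAge), 0L, period.toMillis());
        return timer;
    }
}
```

For example, HistoryFileCleaner.schedule(dir, Duration.ofDays(7),
Duration.ofHours(1)) would sweep hourly and remove files last modified more
than a week ago. Both durations would presumably come from configuration.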

 

I'm raising a new JIRA for this and will provide a patch.

 

OK, you are talking about HIVE-1708. Well, if it is about changing the file
location, one can do that by overriding the property hive.querylog.location
in hive-default.xml. I will comment on that.

 

 

-Mohit

****************************************************************************
***********

This e-mail and attachments contain confidential information from HUAWEI,
which is intended only for the person or entity whose address is listed
above. Any use of the information contained herein in any way (including,
but not limited to, total or partial disclosure, reproduction, or
dissemination) by persons other than the intended recipient(s) is
prohibited. If you receive this e-mail in error, please notify the sender by
phone or email immediately and delete it!

 

-----Original Message-----
From: Edward Capriolo [mailto:edlinuxguru@gmail.com] 
Sent: Tuesday, January 04, 2011 8:03 PM
To: mohitsikri@huawei.com
Cc: hive-dev@hadoop.apache.org; carl@cloudera.com
Subject: Re: Regarding Hive History File(s).

 

On Tue, Jan 4, 2011 at 7:03 AM, Mohit <mohitsikri@huawei.com> wrote:

> Hello All,
>
> What is the purpose of maintaining Hive history files, which contain session
> information such as session start, query start, query end, task start, task
> end, etc.? Are they used later (say, by a tool) for some purpose?
>
> I don't see these files getting deleted from the system. Does any
> configuration need to be set to enable deletion, or is there a design
> strategy/decision/rationale for not deleting them at all?
>
> Also, I don't see the session end message being logged in these files. Is it
> reserved for future use?
>
> -Mohit
 

HiveHistory was added a while ago, between 0.3.0 and 0.4.0 (iirc). A tool
to view the files is HiveHistoryViewer in the API. I am not exactly sure
who is doing what with that data. The Web Interface does use it to
provide links to the JobTracker, so it is helpful for trying to trace all
the dependent jobs of a query after the fact.

 

There is a ticket open to customize the file location. I was also
thinking we should allow the user to supply 'none' to turn off the
feature. As for cleanup and management, cron and rm seem like a good
fit.

