hive-dev mailing list archives

From Edward Capriolo <edlinuxg...@gmail.com>
Subject Re: Regarding Hive History File(s).
Date Thu, 06 Jan 2011 16:17:11 GMT
Mohit,

Please try not to open too many issues that you do not plan to work on
directly within some time frame. We have a road map and a list of
feature requests on the wiki; please use those instead. We have 600
unscheduled open tickets at this point. If there is no immediate plan
to work on an issue, the clutter of so many wishes and possible
enhancements makes finding actual issues much more difficult.

Edward

On Wed, Jan 5, 2011 at 11:37 PM, Mohit <mohitsikri@huawei.com> wrote:
> Rightly said, Carl. Maybe the Java code here is not the best or most
> optimized solution for all use cases. OK, I will raise a documentation JIRA
> and provide documentation on the transaction log behavior, along with sample
> cron and logrotate usage for the same.
>
> ****************************************************************************
> ***********
> This e-mail and attachments contain confidential information from HUAWEI,
> which is intended only for the person or entity whose address is listed
> above. Any use of the information contained herein in any way (including,
> but not limited to, total or partial disclosure, reproduction, or
> dissemination) by persons other than the intended recipient's) is
> prohibited. If you receive this e-mail in error, please notify the sender by
> phone or email immediately and delete it!
>
> -----Original Message-----
> From: Carl Steinbach [mailto:carl@cloudera.com]
> Sent: Wednesday, January 05, 2011 12:42 PM
> To: mohitsikri@huawei.com
> Cc: dev@hive.apache.org; hive-dev@hadoop.apache.org; edlinuxguru@gmail.com
> Subject: Re: Regarding Hive History File(s).
>
> Hi Mohit,
>
> Usually it's the Ops/IT staff that ends up managing things like a production
> HiveServer instance, and in a UNIX shop I suspect that most of these folks
> are already going to be familiar with using cron and logrotate (
> http://linuxcommand.org/man_pages/logrotate8.html) to manage the logs
> produced by their other server systems.
>
> Building a log rotation feature into HiveServer defies this convention and
> will force people to learn how to configure a new log rotation system
> specific to HiveServer. It also requires us to write, debug, document and
> maintain code that isn't really necessary. I think the best approach is to
> take advantage of what already exists by documenting Hive's logging behavior
> in the Admin manual and providing a sample logrotate configuration file.
>
> Thanks.
>
> Carl
>
> On Tue, Jan 4, 2011 at 9:41 PM, Mohit <mohitsikri@huawei.com> wrote:
>
>> Hmm, OK. I think creating and cleaning up resources should be part of
>> the same system; let's not hand it over to the cron utility. Users might
>> not know, or need to know, which files to delete, when to delete them, or
>> where to delete them from.
>>
>> What about a timer task that cleans up files older than a configured
>> elapsed time, say deleting files that are an hour old or a week old?
>>
>> I'm raising a new JIRA for this and will provide the patch.
>>
>> OK, you are talking about HIVE-1708. Well, if it is about changing the
>> file location, one can do that by overriding the property
>> hive.querylog.location in hive-default.xml. I will comment on that.
>>
>> -Mohit
>>
>> -----Original Message-----
>> From: Edward Capriolo [mailto:edlinuxguru@gmail.com]
>> Sent: Tuesday, January 04, 2011 8:03 PM
>> To: mohitsikri@huawei.com
>> Cc: hive-dev@hadoop.apache.org; carl@cloudera.com
>> Subject: Re: Regarding Hive History File(s).
>>
>> On Tue, Jan 4, 2011 at 7:03 AM, Mohit <mohitsikri@huawei.com> wrote:
>>
>> > Hello All,
>> >
>> > What is the purpose of maintaining Hive history files, which contain
>> > session information like session start, query start, query end, task
>> > start, task end, etc.? Are they used later (say, by a tool) for some
>> > purpose?
>> >
>> > I don't see these files getting deleted from the system. Does any
>> > configuration need to be set to enable deletion, or is there a design
>> > strategy/decision/rationale for not deleting them at all?
>> >
>> > Also, in these files I don't see the session end message being logged.
>> > Is it reserved for future use?
>> >
>> > -Mohit
>> >
>>
>> HiveHistory was added a while ago, between 3.0 and 4.0 (iirc). A tool
>> to view the files is HiveHistoryViewer in the API. I am not exactly sure
>> who is doing what with that data. The Web Interface does use it to
>> provide links to the JobTracker, so it is helpful for trying to trace
>> all the dependent jobs of a query after the fact.
>>
>> There is a ticket open to customize the file location. I was also
>> thinking we should allow the user to supply a 'none' to turn off the
>> feature. As for cleanup and management, cron and rm seem like a good
>> fit.
>>
>
>
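
For readers comparing the two approaches discussed in this thread, the
timer task Mohit proposes (delete history files older than a configured
age) might look roughly like the sketch below. This is only an
illustration, not code from Hive or from any patch mentioned here. The
class name HistoryFileCleaner and its methods are hypothetical, and the
directory is assumed to be whatever hive.querylog.location points to.
Files are selected by last-modified time only, with no assumption about
how Hive names them.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: not part of Hive.
public class HistoryFileCleaner {

    // Deletes regular files directly under dir whose last-modified time is
    // older than maxAge. Returns the number of files deleted.
    public static int deleteOlderThan(Path dir, Duration maxAge) throws IOException {
        Instant cutoff = Instant.now().minus(maxAge);
        int deleted = 0;
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                if (Files.isRegularFile(entry)
                        && Files.getLastModifiedTime(entry).toInstant().isBefore(cutoff)) {
                    Files.delete(entry);
                    deleted++;
                }
            }
        }
        return deleted;
    }

    // Runs the cleanup immediately and then once per interval on a single
    // background thread. The caller should shut the returned scheduler down.
    public static ScheduledExecutorService schedule(Path dir, Duration maxAge, Duration interval) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            try {
                deleteOlderThan(dir, maxAge);
            } catch (IOException e) {
                // Report and keep the schedule alive for the next run.
                System.err.println("History cleanup failed: " + e.getMessage());
            }
        }, 0, interval.toMillis(), TimeUnit.MILLISECONDS);
        return scheduler;
    }

    // One-shot usage: java HistoryFileCleaner <directory> [maxAgeDays]
    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.out.println("usage: HistoryFileCleaner <directory> [maxAgeDays]");
            return;
        }
        long days = args.length > 1 ? Long.parseLong(args[1]) : 7; // assumed default
        int n = deleteOlderThan(Paths.get(args[0]), Duration.ofDays(days));
        System.out.println("Deleted " + n + " file(s)");
    }
}
```

Run once from the command line, this does roughly what the cron-and-rm
approach would do. Calling schedule(...) from a long-running process
corresponds to the in-process timer task, which is the part Carl argues
would have to be written, documented and maintained separately from the
tools Ops staff already use.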
