hive-dev mailing list archives

From "Joydeep Sen Sarma (JIRA)" <>
Subject [jira] Commented: (HIVE-176) structured log for obtaining query stats/info
Date Thu, 08 Jan 2009 02:28:44 GMT


Joydeep Sen Sarma commented on HIVE-176:

looks very good in general.

Code comments:

- HIVEHISTORTFILELOC - spelling error.
- Why a separate init() call - why not put this in a constructor? Also, the init() call and
  HiveHistory.get() both call setHistory on the SessionState. I am actually unsure why there
  is a HiveHistory.get() call in the first place. It seems like all you have to do is
  initialize the history object and store it in the SessionState, both of which can be done
  by SessionState.getHiveHistory(). You probably don't need a setHiveHistory() function in
  SessionState either.
- Should move the TASK_NAME setting up into the startTask code, since having the task name
  would be good for building any kind of UI on top of this query log.
- setJobProperty(command, Keys.TASK_NUM_REDUCE_TASKS, String.valueOf(hasReduce)); - confused
  by this - this is a task property, right? (ExecDriver sets it as well.)
- HAS_REDUCE_TASKS is questionable from the point of view of history (not sure why it's
  relevant).
- JOB_ERROR_MSG is defined but not implemented - just wanted to make sure that this is what
  you wanted.
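
To illustrate the getHiveHistory() suggestion above, here is a minimal sketch of lazy initialization inside the session object. The classes SessionStateSketch and HiveHistorySketch are simplified, hypothetical stand-ins, not the classes in the patch, and the constructor argument is an assumed history file location.

```java
// Hypothetical stand-in for the history object; it only remembers its file location.
class HiveHistorySketch {
    private final String historyFile;

    // All setup happens in the constructor, so no separate init() call is needed.
    HiveHistorySketch(String historyFile) {
        this.historyFile = historyFile;
    }

    String getHistoryFile() {
        return historyFile;
    }
}

// Hypothetical stand-in for the session object that owns the history.
class SessionStateSketch {
    private final String historyFileLocation;
    private HiveHistorySketch hiveHistory;

    SessionStateSketch(String historyFileLocation) {
        this.historyFileLocation = historyFileLocation;
    }

    // Creates the history object on first use and stores it in the session,
    // so callers need neither a static get() nor a setter.
    synchronized HiveHistorySketch getHiveHistory() {
        if (hiveHistory == null) {
            hiveHistory = new HiveHistorySketch(historyFileLocation);
        }
        return hiveHistory;
    }
}
```

With this shape, every caller goes through getHiveHistory() and gets the same instance for the lifetime of the session.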

We need test cases. I would definitely like to see some end-to-end tests, e.g. making sure
that for both positive and negative test queries the history files are properly generated
and contain some data. Doing diffs on such files may be hard, but we should be able to assert
that we see a start and end for the job, the expected number of tasks, etc.
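
A rough sketch of what such an assertion could look like, instead of diffing files. It assumes the ^A-separated key=value line format suggested in the issue description, plus a hypothetical "event" key with the values job_start, job_end, task_start and task_end; none of these names come from the patch.

```java
import java.util.Arrays;
import java.util.List;

class HistoryLogCheck {
    // Ctrl-A, the field separator suggested in the issue description.
    static final String SEP = "\u0001";

    // Counts the lines that carry the field event=<event> (hypothetical key name).
    static long countEvents(List<String> lines, String event) {
        String wanted = "event=" + event;
        return lines.stream()
                .filter(line -> Arrays.asList(line.split(SEP)).contains(wanted))
                .count();
    }

    // Checks structure only: one job start, one job end, and the expected number of tasks.
    static void assertWellFormed(List<String> lines, int expectedTasks) {
        check(countEvents(lines, "job_start") == 1, "expected exactly one job_start");
        check(countEvents(lines, "job_end") == 1, "expected exactly one job_end");
        check(countEvents(lines, "task_start") == expectedTasks, "unexpected task_start count");
        check(countEvents(lines, "task_end") == expectedTasks, "unexpected task_end count");
    }

    private static void check(boolean ok, String message) {
        if (!ok) {
            throw new AssertionError(message);
        }
    }
}
```

In a real test the lines would come from the generated history file, for example via java.nio.file.Files.readAllLines, after running a positive or a negative query.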

> structured log for obtaining query stats/info
> ---------------------------------------------
>                 Key: HIVE-176
>                 URL:
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Logging
>    Affects Versions: 0.2.0
>            Reporter: Joydeep Sen Sarma
>            Assignee: Suresh Antony
>             Fix For: 0.2.0
>         Attachments: patch_176.txt, patch_176.txt
> Josh <> wrote:
> When launching off hive queries using hive -e is there a way to get the job id so that
> I can just queue them up and go check their statuses later? What's the general pattern for
> queueing and monitoring without using the libraries directly?
> I'm gonna throw my vote in for a structured log format. Users could tail it and use whatever
> queuing or monitoring they wish. It's also probably just a 30 minute project for someone already
> familiar with the code. I suggest ^A separated key=value pairs per log line.
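
As an illustration of the suggested line format, the following sketch writes and reads one log line of ^A-separated key=value pairs. The class name StructuredLogLine and the field names in the usage note are hypothetical; this is not the format implemented by the attached patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class StructuredLogLine {
    // Ctrl-A (code point 1), written ^A in the suggestion above.
    static final char SEP = '\u0001';

    // Joins the fields into one line, keeping the map's iteration order.
    // Assumes keys contain no '=' and neither keys nor values contain Ctrl-A or newlines.
    static String format(Map<String, String> fields) {
        StringJoiner joiner = new StringJoiner(String.valueOf(SEP));
        for (Map.Entry<String, String> entry : fields.entrySet()) {
            joiner.add(entry.getKey() + "=" + entry.getValue());
        }
        return joiner.toString();
    }

    // Splits a line back into fields; pairs without a key before '=' are skipped.
    static Map<String, String> parse(String line) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String pair : line.split(String.valueOf(SEP))) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                // Split on the first '=' only, so values may themselves contain '='.
                fields.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return fields;
    }
}
```

A consumer tailing the log could call StructuredLogLine.parse on each line and look up a field such as a job id by key; which keys exist is up to the writer of the log.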

