hadoop-hive-dev mailing list archives

From "Joydeep Sen Sarma (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-30) Hive web interface
Date Fri, 21 Nov 2008 17:31:44 GMT

    [ https://issues.apache.org/jira/browse/HIVE-30?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12649712#action_12649712 ]

Joydeep Sen Sarma commented on HIVE-30:
---------------------------------------

Blockers from my side:

hwi shell script: i would like to see this merged with the hive cli shell script and written
as a generic harness to launch hive utilities. given that the bulk of the libraries are common,
it seems perfectly fine to add the extra jars and pick the class name to execute based on the
actual utility name (cli vs. hwi).

also - i think it will be fairly critical to take in userids and propagate them to hive/hadoop
(by setting the user.name property). why don't we just replace 'sessionname' with 'userid'? that
should also automatically generate a separate log file for each user on the hwi server - so
it will be somewhat easy to dig through the logs if required.

Another thing i just noticed - Hive's current runtime assumes a singleton SessionState object.
That's just not going to work here (since each execution thread now needs its own session). There
are in fact some comments to this effect in SessionState.java - we need to make it a thread-local
singleton. This has to be fixed - otherwise concurrent queries/sessions would trample over each
other. (we can do this in a separate jira - although it would be a blocker for this one)
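
To make the thread-local singleton idea concrete, here is a minimal sketch. The class SessionSketch and its methods are hypothetical stand-ins, not the actual SessionState code; the only point is that each thread looks up its own instance instead of sharing one static object.

```java
// Hypothetical stand-in for SessionState; names are illustrative only.
final class SessionSketch {
    // One slot per thread instead of a single static instance.
    private static final ThreadLocal<SessionSketch> CURRENT = new ThreadLocal<>();

    private final String userId;

    SessionSketch(String userId) {
        this.userId = userId;
    }

    String getUserId() {
        return userId;
    }

    /** Binds the given session to the calling thread and returns it. */
    static SessionSketch start(SessionSketch session) {
        CURRENT.set(session);
        return session;
    }

    /** Returns the session bound to the calling thread, or null if none. */
    static SessionSketch get() {
        return CURRENT.get();
    }

    /** Unbinds the session; needed when worker threads are pooled and reused. */
    static void detach() {
        CURRENT.remove();
    }
}
```

With this shape, a query running on one thread cannot see or overwrite the session that belongs to another thread, as long as each worker calls start() before executing and detach() when it is done.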

regarding ss.out: in order to capture only data in the results file - please set the session
to silent mode. otherwise the output will be polluted with informational messages. (perhaps
this highlights that informational messages should go to a different stream than the actual
results - which is very doable - but it is not the way things are set up now)
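
A rough sketch of what separating the two kinds of output could look like. SessionOutputSketch and its method names are assumptions made up for illustration, not the existing session API: results go to one stream, informational messages go to another, and silent mode drops the informational ones entirely.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

// Hypothetical sketch; field and method names are not taken from the Hive code.
final class SessionOutputSketch {
    private final PrintStream out;   // query results only
    private final PrintStream info;  // progress and informational messages
    private final boolean silent;    // when true, informational messages are dropped

    SessionOutputSketch(PrintStream out, PrintStream info, boolean silent) {
        this.out = out;
        this.info = info;
        this.silent = silent;
    }

    /** Writes one row of query output to the results stream. */
    void printResult(String row) {
        out.println(row);
    }

    /** Writes an informational message, unless the session is silent. */
    void printInfo(String message) {
        if (!silent) {
            info.println(message);
        }
    }

    public static void main(String[] args) {
        ByteArrayOutputStream results = new ByteArrayOutputStream();
        SessionOutputSketch session =
            new SessionOutputSketch(new PrintStream(results, true), System.err, true);
        session.printInfo("launching job");  // suppressed in silent mode
        session.printResult("42");           // the only thing captured in results
        System.out.print(results.toString());
    }
}
```

Even without silent mode, the results stream stays clean in this arrangement because informational messages never share it.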

all of these are really asking the question: how was this tested? both of the last two issues
are fairly major.

other usability issues that are going to be very important (based on observing hipal): one
cannot destroy a running session - but one of the most common operations that users will want
is to monitor the map-reduce tasks that have been spawned by a query and kill them (for
example - if the job is taking too long or the jobconf parameter settings need to be fixed).


Good to have things (in decreasing order of importance):
- regarding reloading HiveConf - if schema browsing is not associated with a session - then
the same hiveconf can be cached and re-used. minor point - but loading the hiveconf is expensive
enough that i think you won't be happy if this tool becomes really popular :-)
- any reason why QUERY_SET etc. should not be an enum type?
- spell check clientDestory
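
On the enum point, a small sketch of the idea. QUERY_SET is the only name taken from the patch; the other constants and the canRun() helper are hypothetical and only there to show the shape.

```java
// QUERY_SET comes from the patch; NEW, QUERY_RUNNING and FINISHED are assumed names.
enum QueryStateSketch {
    NEW,
    QUERY_SET,
    QUERY_RUNNING,
    FINISHED;

    /** Hypothetical rule: a session can be run only once a query has been set. */
    boolean canRun() {
        return this == QUERY_SET;
    }
}
```

Compared with int constants, the compiler rejects values outside the declared set, and the states print as readable names in logs.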

> Hive web interface
> ------------------
>
>                 Key: HIVE-30
>                 URL: https://issues.apache.org/jira/browse/HIVE-30
>             Project: Hadoop Hive
>          Issue Type: Bug
>            Reporter: Jeff Hammerbacher
>            Assignee: Edward Capriolo
>            Priority: Minor
>         Attachments: HIVE-30.patch
>
>
> Hive needs a web interface. The initial checkin should have:
> * simple schema browsing
> * query submission
> * query history (similar to MySQL's SHOW PROCESSLIST)
> A suggested feature: the ability to have a query notify the user when it's completed.
> Edward Capriolo has expressed some interest in driving this process.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

