hive-dev mailing list archives

From "Ratandeep Ratti (JIRA)" <>
Subject [jira] [Created] (HIVE-10301) Enhancing View registration and access with dynamic dependency artifact resolution
Date Fri, 10 Apr 2015 15:51:13 GMT
Ratandeep Ratti created HIVE-10301:

             Summary: Enhancing View registration and access with dynamic dependency artifact resolution
                 Key: HIVE-10301
             Project: Hive
          Issue Type: New Feature
            Reporter: Ratandeep Ratti
            Assignee: Ratandeep Ratti

Since we now have dynamic dependency artifact resolution in Hive (HIVE-9664), I think we
can improve the process of creating (and accessing) views whose definitions involve UDFs.
An example will illustrate what I'm suggesting.

Say we have a simple view definition which involves a UDF:
> add jar udf-0.0.1.jar;
> create temporary function fn as 'examples.FunctionUDF';
> create view v as select fn(*) from db.table;

Once the session is closed, the view will still exist, but the temporary function will not.
In a new session, querying the view will fail because Hive cannot find the function,
unless we manually re-register the dependency and the function.

I suggest the following improvement for registering views that involve functions/UDFs (provided
the UDFs are available in some Ivy/Maven repo):
> create view v tblproperties (
   'dependencies' = 'ivy://',
   'functions'    = 'fn:examples.FunctionUDF'
)  as select fn(*) from db.table;

The view's metadata now contains the artifact coordinates and also the function-to-class mapping.
Hive can make use of this information during view registration and view access.

Before a view is created or accessed, Hive will download the artifact from the Ivy
coordinates specified and register a temporary function with the specified class mapping.
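
As a rough illustration of that pre-access step (a Python sketch, not Hive code), the view
properties could be turned into the statements the user would otherwise type by hand. The
property keys come from the example above. The helper names, the comma-separated format for
multiple entries, and the coordinates used in the usage note are assumptions made for this
sketch, as is the idea that "add jar" is handed the ivy:// URI directly.

```python
def parse_functions(value):
    """Parse 'fn:examples.FunctionUDF[,fn2:pkg.Other]' into {name: class}.

    The comma-separated multi-entry form is an assumption of this sketch.
    """
    mapping = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        name, sep, class_name = entry.partition(":")
        if not sep or not name.strip() or not class_name.strip():
            raise ValueError(f"malformed function entry: {entry!r}")
        mapping[name.strip()] = class_name.strip()
    return mapping


def statements_for_view(properties):
    """Return the statements to run before the view is created or accessed."""
    statements = []
    # One "add jar" per artifact URI listed under 'dependencies'.
    for uri in properties.get("dependencies", "").split(","):
        uri = uri.strip()
        if uri:
            statements.append(f"add jar {uri};")
    # One temporary function per name-to-class entry under 'functions'.
    functions = parse_functions(properties.get("functions", ""))
    for name, class_name in functions.items():
        statements.append(
            f"create temporary function {name} as '{class_name}';"
        )
    return statements
```

For example, calling statements_for_view with 'dependencies' set to the hypothetical value
'ivy://org.example:udf:0.0.1' and 'functions' set to 'fn:examples.FunctionUDF' yields one
"add jar" statement followed by one "create temporary function" statement.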

Note that the user no longer has to enter the "add jar" and "create temporary function" commands.

There are a few things to think through, though:

1. What if a view 'v' with function and dependency metadata depends on another view
which also has its own function and dependency metadata?
In this case we can traverse the view dependency chain/graph and register the functions and
dependencies for all views before the said view 'v' is accessed. Note that there could be
function name conflicts in this case. Maybe we could resolve this by prefixing the db and
table name to function names during view registration?

2. Certain queries may require some configuration to be set before the query is executed.
Could we also specify the configuration setting as part of the table properties?
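
For point 1 above, the traversal could look roughly like the following Python sketch (again
not Hive code). The in-memory 'views' dictionary, its keys, and the underscore-joined prefix
format are all assumptions made for illustration. How the view's query text would refer to
the prefixed names is left open here.

```python
def registration_plan(view, views):
    """Walk the view dependency graph depth-first, starting at `view`.

    `views` maps a qualified view name (e.g. 'db.v') to a dict with the
    optional keys 'depends_on' (list of view names), 'dependencies'
    (list of artifact URIs) and 'functions' ({name: class}).
    Returns (artifacts, functions) to register before `view` is accessed.
    """
    artifacts = []    # artifact URIs in first-seen order, without duplicates
    functions = {}    # prefixed function name -> class name
    visited = set()

    def visit(name):
        if name in visited:    # already handled; also guards against cycles
            return
        visited.add(name)
        meta = views[name]
        for upstream in meta.get("depends_on", []):
            visit(upstream)    # upstream views are registered first
        for uri in meta.get("dependencies", []):
            if uri not in artifacts:
                artifacts.append(uri)
        # Prefix with db and view name so identical names do not collide.
        prefix = name.replace(".", "_")
        for fn, class_name in meta.get("functions", {}).items():
            functions[f"{prefix}_{fn}"] = class_name

    visit(view)
    return artifacts, functions
```

With two views that both define a function called 'fn' against different classes, the plan
keeps both under distinct prefixed names instead of letting one overwrite the other.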

I'll update the bug again with the formal metadata parameters that could be supported and the
complete view creation syntax.

Please do update with your thoughts and concerns.

