hive-issues mailing list archives

From "Sarath Subramanian (Jira)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-22301) Hive lineage is not generated for insert overwrite queries on partitioned tables
Date Tue, 08 Oct 2019 02:23:00 GMT

     [ https://issues.apache.org/jira/browse/HIVE-22301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sarath Subramanian updated HIVE-22301:
--------------------------------------
    Description: 
Problem: When I run the queries below, the last query should produce proper Hive lineage
info (through HookContext) from table_b to table_t, but no lineage is generated.
 * Create table table_t (id int) partitioned by (dob date);
 * Create table table_b (id int) partitioned by (dob date);
 * from table_b a insert overwrite table table_t select a.id,a.dob;

Note: this issue is not seen for a CTAS query from a partitioned table. It is seen only for
insert queries such as insert into <table> select * from <table> and queries like the one
above.

 

Technical Observations:

The HookContext (passed from hive.ql.Driver to the Atlas Hive hook through the
hookRunner.runPostExecHooks call) contains no outputs. See the IntelliJ screenshot below.

!ScreenShot RunPostExecHook.png|width=728,height=427!

 

I found that the PrivateHookContext is initially created with the proper outputs value, as
shown below:

  !ScreenShot HookContext.png|width=714,height=541!

The same is passed properly to runBeforeExecutionHook as shown below:

!ScreenShot runBeforeExecution.png|width=719,height=620!

 

Later, when the HookContext is passed to runPostExecHooks, no outputs are populated. Kindly
check the reason and let me know if you need any further information from my end.
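
One way to narrow down where the outputs disappear is to record the output entity names seen before execution and diff them against what the post-execution hook receives. The following is only a minimal sketch of that comparison, not the reporter's or Hive's code: the class name HookOutputDiff, the method missingAfterExecution, and the entity name default@table_t are all hypothetical, and plain strings stand in for Hive's WriteEntity objects so that the example runs without Hive on the classpath. In an actual hook the names would presumably be collected from HookContext.getOutputs().

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.TreeSet;

/**
 * Stand-alone sketch: compares output entity names captured before and after
 * query execution. Plain strings stand in for Hive's WriteEntity objects.
 */
class HookOutputDiff {

    /** Returns the names present in the pre-execution snapshot but absent afterwards. */
    public static Set<String> missingAfterExecution(Set<String> preExecOutputs,
                                                    Set<String> postExecOutputs) {
        // Work on a sorted copy so that the caller's sets are left untouched.
        Set<String> missing = new TreeSet<>(preExecOutputs);
        missing.removeAll(postExecOutputs);
        return missing;
    }

    public static void main(String[] args) {
        // Hypothetical entity name; real names depend on the database and table.
        Set<String> pre = new LinkedHashSet<>(Arrays.asList("default@table_t"));
        // Empty set, mirroring what is reported above for runPostExecHooks.
        Set<String> post = new LinkedHashSet<>();

        System.out.println("Outputs lost between hooks: " + missingAfterExecution(pre, post));
    }
}
```

A non-empty result for the insert overwrite query, alongside an empty result for the CTAS query, would be consistent with the behaviour described above.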

  was:
Problem: When I run the queries below, the last query should produce proper Hive lineage
info (through HookContext) from table_b to table_t, but no lineage is generated.
 * Create table table_t (id int) partitioned by (dob date);
 * Create table table_b (id int) partitioned by (dob date);
 * from table_b a insert overwrite table table_t select a.id,a.dob;

Note: this issue is not seen for a CTAS query from a partitioned table. It is seen only for
insert queries such as insert into <table> select * from <table> and queries like the one
above.

This issue is seen in the latest HDP builds as well.

 

Technical Observations:

The HookContext (passed from hive.ql.Driver to the Atlas Hive hook through the
hookRunner.runPostExecHooks call) contains no outputs. See the IntelliJ screenshot below.

!ScreenShot RunPostExecHook.png|width=728,height=427!

 

I found that the PrivateHookContext is initially created with the proper outputs value, as
shown below:

  !ScreenShot HookContext.png|width=714,height=541!

The same is passed properly to runBeforeExecutionHook as shown below:

!ScreenShot runBeforeExecution.png|width=719,height=620!

 

Later, when the HookContext is passed to runPostExecHooks, no outputs are populated. Kindly
check the reason and let me know if you need any further information from my end.


> Hive lineage is not generated for insert overwrite queries on partitioned tables
> --------------------------------------------------------------------------------
>
>                 Key: HIVE-22301
>                 URL: https://issues.apache.org/jira/browse/HIVE-22301
>             Project: Hive
>          Issue Type: Bug
>          Components: lineage
>    Affects Versions: 3.1.2
>            Reporter: Sidharth Kumar Mishra
>            Priority: Major
>         Attachments: ScreenShot HookContext.png, ScreenShot RunPostExecHook.png, ScreenShot runBeforeExecution.png
>
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
