hive-dev mailing list archives

From Sahil Takiar <takiar.sa...@gmail.com>
Subject Re: Review Request 64193: HIVE-18054: Make Lineage work with concurrent queries on a Session
Date Sat, 02 Dec 2017 00:22:10 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64193/#review192601
-----------------------------------------------------------



Since we touch the `LoadSemanticAnalyzer`, could we add a q-test (it could go in one of
the existing `lineage*.q` files) for `LOAD` statements? Same for import / export statements
(as far as I can tell there are no existing ones; correct me if I am wrong).

If you have time, it would be great to run some of the lineage tests for HoS too, but since
that's a bit orthogonal to this JIRA, it can be done in a follow-up JIRA.


ql/src/java/org/apache/hadoop/hive/ql/Driver.java
Lines 365 (patched)
<https://reviews.apache.org/r/64193/#comment270856>

    Sounds good. Just curious, is there any way to know for sure where code run by a `Driver`
creates another `Driver`? How did you determine when this is necessary?



ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java
Line 401 (original), 401 (patched)
<https://reviews.apache.org/r/64193/#comment270854>

    Ok, but do we need a check like `if (queryState.getLineageState() != null)` to ensure an NPE
isn't thrown? That seems to be what the old code does.
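
    A minimal sketch of the kind of guard being asked about. The nested `LineageState` and
`QueryState` classes and the `recordIfPresent` helper are hypothetical stand-ins written for
this example, not the real Hive classes or the code in the patch; the sketch only assumes that
`getLineageState()` may return null.

```java
import java.util.ArrayList;
import java.util.List;

public class LineageNullGuardSketch {

    // Hypothetical stand-in for Hive's LineageState: it only collects entry names.
    static class LineageState {
        private final List<String> entries = new ArrayList<>();

        void record(String entry) {
            entries.add(entry);
        }

        int size() {
            return entries.size();
        }
    }

    // Hypothetical stand-in for QueryState; the lineage state is allowed to be null.
    static class QueryState {
        private final LineageState lineageState;

        QueryState(LineageState lineageState) {
            this.lineageState = lineageState;
        }

        LineageState getLineageState() {
            return lineageState;
        }
    }

    // Records the entry only when a LineageState is present.
    // Returns true if the entry was recorded, false if it was skipped.
    static boolean recordIfPresent(QueryState queryState, String entry) {
        LineageState state = queryState.getLineageState();
        if (state != null) {
            state.record(entry);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        QueryState withLineage = new QueryState(new LineageState());
        QueryState withoutLineage = new QueryState(null);

        System.out.println(recordIfPresent(withLineage, "default.t1"));
        System.out.println(recordIfPresent(withoutLineage, "default.t1"));
    }
}
```

    Without the guard, the second call would dereference null and throw an NPE.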



ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java
Lines 111 (patched)
<https://reviews.apache.org/r/64193/#comment270857>

    Doesn't a `TaskCompiler` already have a `QueryState` object? Why do we need to explicitly
pass in a `LineageState`?
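
    A rough sketch of the alternative the question implies, assuming the compiler really does
hold a `QueryState` and that `QueryState` exposes `getLineageState()`. All classes below are
hypothetical stand-ins for illustration, not Hive's actual `TaskCompiler` or `QueryState`.

```java
public class CompilerLineageSketch {

    // Hypothetical stand-in for LineageState: counts how often it was touched.
    static class LineageState {
        private int recorded;

        void record() {
            recorded++;
        }

        int getRecorded() {
            return recorded;
        }
    }

    // Hypothetical stand-in for QueryState: owns one LineageState per query.
    static class QueryState {
        private final LineageState lineageState = new LineageState();

        LineageState getLineageState() {
            return lineageState;
        }
    }

    // Hypothetical compiler that reaches lineage through its QueryState,
    // so no separate LineageState argument has to be threaded through.
    static class TaskCompiler {
        private final QueryState queryState;

        TaskCompiler(QueryState queryState) {
            this.queryState = queryState;
        }

        void compile() {
            queryState.getLineageState().record();
        }
    }

    public static void main(String[] args) {
        QueryState queryState = new QueryState();
        new TaskCompiler(queryState).compile();
        System.out.println(queryState.getLineageState().getRecorded());
    }
}
```

    If the compiler can be handed a `QueryState` that belongs to a different query (for
example a child `Driver`), an explicit parameter may still be justified.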


- Sahil Takiar


On Nov. 30, 2017, 1:22 a.m., Andrew Sherman wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64193/
> -----------------------------------------------------------
> 
> (Updated Nov. 30, 2017, 1:22 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> A Hive Session can contain multiple concurrent SQL Operations.
> Lineage is currently tracked in SessionState and is cleared when a query
> completes. This results in Lineage for other running queries being lost.
> 
> To fix this, move LineageState from SessionState to QueryState.
> In MoveTask/MoveWork use the LineageState from the MoveTask's QueryState
> rather than trying to use it from MoveWork.
> Add a test which runs multiple JDBC queries in a thread pool
> against the same connection and shows that Vertices are not lost from Lineage.
> As part of this test, add ReadableHook, an ExecuteWithHookContext that stores
> HookContexts in memory and makes them available for reading.
> Make LineageLogger methods static so they can be used elsewhere.
> 
> Sometimes a running query (originating in a Driver) will instantiate
> another Driver to run or compile another query. Because these Drivers
> shared a Session, the child Driver would accumulate Lineage information
> along with that of the parent Driver. For consistency a LineageState is
> passed to these child Drivers and stored in the new Driver's QueryState.
> 
> 
> Diffs
> -----
> 
>   itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniHS2.java f5ed735c1ec14dfee338e56020fa2629b168389d
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java af9f193dc94e2e05caa88d965a34f4483c9d7069
>   ql/src/java/org/apache/hadoop/hive/ql/QueryState.java 7d5aa8b179e536e25c41a8946e667f8dd5669e0f
>   ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java e7af5e004fb560b574b82f6d1b60517511802f37
>   ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java e2f8c1f8012ad25114e279747e821b291c7f4ca6
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java 1f0487f4f72ab18bcf876f45ad5758d83a7f001b
>   ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadPartitions.java 262225fc202d4627652acfd77350e44b0284b3da
>   ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadTable.java bb1f4e50509e57a9d0b9e6793c1fc08baa4d2981
>   ql/src/java/org/apache/hadoop/hive/ql/hooks/HookContext.java 7b617309f6b0d8a7ce0dea80ab1f790c2651b147
>   ql/src/java/org/apache/hadoop/hive/ql/hooks/LineageLogger.java 2f764f8a29a9d41a7db013a949ffe3a8a9417d32
>   ql/src/java/org/apache/hadoop/hive/ql/hooks/ReadableHook.java PRE-CREATION
>   ql/src/java/org/apache/hadoop/hive/ql/index/AggregateIndexHandler.java 68709b4d3baf15d78e60e948ccdef3df84f28cec
>   ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java 1e577da82343a1b7361467fb662661f9c6642ec0
>   ql/src/java/org/apache/hadoop/hive/ql/index/TableBasedIndexHandler.java 29886ae7f97f8dae7116f4fc9a2417ab8f9dac0a
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java 7b067a0d45e33bc3347c43b050af933c296a9227
>   ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 504b0623142a6fa6cdb45a26b49f146e12ec2d7a
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRFileSink1.java d7a83f775abca39b219f71aff88173a14ffaee9f
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRProcContext.java 4387c4297fee48d4c03e95d5a2fcb822ab480eeb
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 67739a1db9fc52a67f4f5ea7dba80fe0e95750c8
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/IndexUtils.java 338c1856672f09bb7da35d2336ebb5b6f3fdc5a6
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/lineage/Generator.java e6c07713b24df719315d804f006151106eea9aed
>   ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 1fd634c928a5384b09d97322c3ea785f518d73fe
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ExplainSemanticAnalyzer.java 065c7e50986872cd35386feee712f3452597d643
>   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezProcContext.java 0c160acf46eb1eb07c5f04091099c1024e166638
>   ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java b6f1139fe1a78283277bf4d0c5224ab1d718c634
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java cd75130d7c5f0b402f1b4331c57edc611eb4b2ed
>   ql/src/java/org/apache/hadoop/hive/ql/parse/IndexUpdater.java f31775ed942160da73344c4dca707da7b8c658a6
>   ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 238fbd60572ee5f7f8f6c4d5b2abce8f66c7e495
>   ql/src/java/org/apache/hadoop/hive/ql/parse/MapReduceCompiler.java d7a56e5846d5754dec5070d8c44443543a3695e4
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java 498b6741c3f40b92ce3fb218e91e7809a17383f0
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java b66817f65f65b6aaf8dbc339a969b8e9e0565e9e
>   ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java 7b2937032ab8dd57f8923e0a9e7aab4a92de55ee
>   ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java be33f380030ea8b416a4549c3947d767bba66356
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkProcContext.java 4d2bcfa285dc08811106f3c234346efff22afd99
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java 604c8aee151a45cf942852a3644b5e79f779f353
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 965044d9253585eeaeef50d7fe4fc4d818042df8
>   ql/src/java/org/apache/hadoop/hive/ql/plan/MoveWork.java 28a33740b30b7be0057ce91de55a0407dd2f2cbf
>   ql/src/java/org/apache/hadoop/hive/ql/session/LineageState.java 056d6141d6239816699ed5f730cbd14e48d8d9bb
>   ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java bb6ddc6fa4667ac0e30994d0f9ee8b969542383c
>   ql/src/test/org/apache/hadoop/hive/ql/optimizer/TestGenMapRedUtilsCreateConditionalTask.java 340689255c738ea497bcd269463b8b8bc38cf34c
>   ql/src/test/org/apache/hadoop/hive/ql/parse/TestGenTezWork.java 2c28c398ca49ba661df460c9f3e6d578c785d3ce
> 
> 
> Diff: https://reviews.apache.org/r/64193/diff/1/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Andrew Sherman
> 
>

