hive-issues mailing list archives

From "ASF GitHub Bot (Jira)" <j...@apache.org>
Subject [jira] [Work logged] (HIVE-23123) Disable export/import of views and materialized views
Date Wed, 15 Apr 2020 17:02:00 GMT

     [ https://issues.apache.org/jira/browse/HIVE-23123?focusedWorklogId=422843&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422843 ]

ASF GitHub Bot logged work on HIVE-23123:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 15/Apr/20 17:01
            Start Date: 15/Apr/20 17:01
    Worklog Time Spent: 10m 
      Work Description: miklosgergely commented on pull request #969: HIVE-23123 Disable export/import of views and materialized views
URL: https://github.com/apache/hive/pull/969#discussion_r408995668
 
 

 ##########
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplLoadTask.java
 ##########
 @@ -163,10 +170,34 @@ a database ( directory )
               the dbTracker /  tableTracker are setup correctly always.
            */
           TableContext tableContext = new TableContext(dbTracker, work.dbNameToLoadIn);
-          TableEvent tableEvent = (TableEvent) next;
-          LoadTable loadTable = new LoadTable(tableEvent, loadContext, iterator.replLogger(),
-                                              tableContext, loadTaskTracker);
-          tableTracker = loadTable.tasks(work.isIncrementalLoad());
+          FSTableEvent tableEvent = (FSTableEvent) next;
+          if (TableType.VIRTUAL_VIEW.name().equals(tableEvent.getMetaData().getTable().getTableType())) {
+            tableTracker = new TaskTracker(1);
+            tableTracker.addTask(createViewTask(tableEvent.getMetaData(), work.dbNameToLoadIn, conf));
+          } else {
+            LoadTable loadTable = new LoadTable(tableEvent, loadContext, iterator.replLogger(), tableContext,
+                loadTaskTracker);
+            tableTracker = loadTable.tasks(work.isIncrementalLoad());
+
+            /*
+              For table replication, if we reach the max number of tasks, the next run will try to
+              reload the same table again. This is mainly for ease of understanding the code, as it
+              avoids distinguishing between "load partitions for this table, whose creation hit the
+              max-task limit" and "load the next table, since the current one has no partitions".
+             */
+
+            // For a table we explicitly try to load partitions, as there are no separate partition events.
+            LoadPartitions loadPartitions =
+                new LoadPartitions(loadContext, iterator.replLogger(), loadTaskTracker, tableEvent,
+                        work.dbNameToLoadIn, tableContext);
+            TaskTracker partitionsTracker = loadPartitions.tasks();
+            partitionsPostProcessing(iterator, scope, loadTaskTracker, tableTracker,
 
 Review comment:
   Modified it, please confirm that it is OK like this.
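
 For readers skimming the quoted diff: it replaces the unconditional LoadTable with a branch on the table type. A simplified, hypothetical sketch of that branching follows; the enum and method names are illustrative only, not Hive's actual APIs:

```java
// Simplified, hypothetical sketch of the branch added in the diff above.
// A virtual view carries no data, so replication only needs a single task
// that recreates it from metadata; a table needs a load task plus an
// explicit partition-loading pass, since the dump contains no separate
// partition events. All names here are illustrative, not Hive APIs.
public class ReplLoadSketch {

    enum TableType { MANAGED_TABLE, EXTERNAL_TABLE, VIRTUAL_VIEW }

    // Decide which kind of load work a replicated entity requires.
    static String loadKind(TableType type) {
        if (type == TableType.VIRTUAL_VIEW) {
            return "create-view-task";            // metadata-only recreation
        }
        return "load-table-then-partitions";      // data load + partition load
    }

    public static void main(String[] args) {
        System.out.println(loadKind(TableType.VIRTUAL_VIEW));
        System.out.println(loadKind(TableType.MANAGED_TABLE));
    }
}
```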
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 422843)
    Time Spent: 50m  (was: 40m)

> Disable export/import of views and materialized views
> -----------------------------------------------------
>
>                 Key: HIVE-23123
>                 URL: https://issues.apache.org/jira/browse/HIVE-23123
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>            Reporter: Miklos Gergely
>            Assignee: Miklos Gergely
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-23123.01.patch, HIVE-23123.02.patch
>
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> According to [https://cwiki.apache.org/confluence/display/Hive/LanguageManual+ImportExport], import and export can be done by using the
> {code:java}
> export table ...
> import table ... 
> {code}
> commands. The document doesn't mention views or materialized views at all, and in fact we don't support commands like
> {code:java}
> export view ...
> import view ...
> export materialized view ...
> import materialized view ... 
> {code}
> they cannot be parsed at all. The word "table" is often used in a broader sense, though, meaning all table-like entities, including views and materialized views. For example, the various Table classes may represent any of these as well.
> If I try to export a view with the export table ... command, it goes fine. A _metadata file will be created, but no data directory, which is what we'd expect. If I try to import it back, an exception is thrown due to the lack of the data dir:
> {code:java}
> java.lang.AssertionError: null==getPath() for exim_view
>  at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:3088)
>  at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:419)
>  at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213)
>  at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105)
>  at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:364)
>  at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:335)
>  at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246)
>  at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109)
>  at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:722)
>  at org.apache.hadoop.hive.ql.Driver.run(Driver.java:491)
>  at org.apache.hadoop.hive.ql.Driver.run(Driver.java:485) 
> {code}
> Still the view gets imported successfully, as data movement wasn't even necessary.
> If we try to export a materialized view which is transactional, then this exception occurs:
> {code:java}
> org.apache.hadoop.hive.ql.parse.SemanticException: org.apache.hadoop.hive.ql.metadata.InvalidTableException: Table not found exim_materialized_view_da21d41a_9fe4_4446_9c72_d251496abf9d
>  at org.apache.hadoop.hive.ql.parse.AcidExportSemanticAnalyzer.analyzeAcidExport(AcidExportSemanticAnalyzer.java:163)
>  at org.apache.hadoop.hive.ql.parse.AcidExportSemanticAnalyzer.analyze(AcidExportSemanticAnalyzer.java:71)
>  at org.apache.hadoop.hive.ql.parse.RewriteSemanticAnalyzer.analyzeInternal(RewriteSemanticAnalyzer.java:72)
>  at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:289)
>  at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:220)
>  at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:104)
>  at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:183)
>  at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:601)
>  at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:547)
>  at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:541) 
> {code}
> So the export process cannot handle it, as the temporary table does not get created.
>  
> The import command handling has a lot of code dedicated to importing views and materialized views, which suggests that we support the importing (and thus also implicitly the exporting) of views and materialized views.
>  
> So the conclusion is that we have to decide whether we support exporting/importing of views and materialized views.
> If we decide not to support them then:
>  - the export process should throw an exception if a view or materialized view is the subject
>  - the code specific to view imports should be removed
> If we decide to support them, then:
>  - the commands mentioned above should be introduced
>  - an exception should be thrown if the wrong command is used (e.g. export view on a table)
>  - the exceptions mentioned above should be fixed
>  
> I prefer not to support them; I don't think we should support the exporting/importing of views. The point of exporting/importing is the transfer of data, not DDL, and it causes more issues than it solves. Our current documentation also suggests that it is only supported for tables.
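
If the preferred option (no support) is taken, the first bullet above amounts to an early guard on the export path. A minimal, hypothetical sketch; the class name, method, and exception choice are illustrative, not the actual Hive semantic-analyzer code:

```java
// Minimal, hypothetical sketch of option 1: reject EXPORT early when the
// subject is a view or materialized view. The real check would live in
// Hive's export semantic analyzer; everything here is illustrative.
public class ExportGuard {

    enum TableType { MANAGED_TABLE, EXTERNAL_TABLE, VIRTUAL_VIEW, MATERIALIZED_VIEW }

    // Throw before any export work starts if the entity has no exportable data.
    static void checkExportable(String name, TableType type) {
        if (type == TableType.VIRTUAL_VIEW || type == TableType.MATERIALIZED_VIEW) {
            throw new IllegalArgumentException(
                "EXPORT/IMPORT is not supported for views or materialized views: " + name);
        }
    }

    public static void main(String[] args) {
        checkExportable("plain_table", TableType.MANAGED_TABLE);  // passes silently
        try {
            checkExportable("exim_view", TableType.VIRTUAL_VIEW);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Failing fast here would avoid the late AssertionError and InvalidTableException shown in the stack traces above, surfacing a clear error at analysis time instead.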



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
