hive-issues mailing list archives

From "Eugene Koifman (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-17691) Miscellaneous List
Date Wed, 11 Oct 2017 23:22:01 GMT

     [ https://issues.apache.org/jira/browse/HIVE-17691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eugene Koifman updated HIVE-17691:
----------------------------------
    Description: 
# DDLSemanticAnalyzer.alterTableOutput is unused
# DDLTask.generateAddMmTasks(Table) - stmtId should probably come from TransactionManager
# DDLTask.createTable(Hive db, CreateTableDesc crtTbl) has _Long mmWriteId = crtTbl.getInitialMmWriteId();_ - the logic is unclear; this ID is only set in one place.
# FileSinkOperator has multiple places that look like _conf.getWriteType() == AcidUtils.Operation.NOT_ACID || conf.isMmTable()_ - what is the writeType for MM tables?  It seems that Wei opted for "work.getLoadTableWork().getWriteType() != AcidUtils.Operation.NOT_ACID && !tbd.isMmTable()" to mean MM, e.g. the MoveTask.handleStaticParts() call to Hive.loadPartition()
# HiveConf.HIVE_TXN_OPERATIONAL_PROPERTIES - the doc/explanation there is obsolete
# Compactor Initiator likely doesn't work for MM tables.  It's triggered by info in TXN_COMPONENTS/COMPLETED_TXN_COMPONENTS.  MM tables don't write to either because DbTxnManager.acquireLocks() does _compBuilder.setIsAcid(AcidUtils.isFullAcidTable(t));_ i.e. it treats MM tables as non-acid tables
# In general, integration with full Acid seems confused wrt MM and seems to treat MM as a special table type rather than a subtype of Acid table (mostly, but not always).
## e.g. _SemanticAnalyzer.genBucketingSortingDest(String dest, Operator input, QB qb, TableDesc table_desc, Table dest_tab, SortBucketRSCtx ctx)_
## _SemanticAnalyzer.validate()_ has _if (tbl != null && (AcidUtils.isFullAcidTable(tbl) || MetaStoreUtils.isInsertOnlyTable(tbl.getParameters()))) {_
# LoadSemanticAnalyzer.analyzeInternal(ASTNode) sets statementId to 0 rather than from TM
# ImportCommitTask - doesn't currently do anything.  It used to commit mmID.  Need to verify
we properly commit the txn in the Driver
# As far as I can tell all the mm_*.q tests run on TestCliDriver which means MR.  This doesn't
exercise some code specifically for dealing with writes from Union All queries (CTAS, Insert
into).  On MR this requires "hive.optimize.union.remove=true" (false by default)
# Remove MoveWork().setNoop(boolean) and usages per todo in _GenMapRedUtils.createMRWorkForMergingFiles(FileSinkOperator fsInput, Path finalName, DependencyCollectionTask dependencyTask, List<Task<MoveWork>> mvTasks, HiveConf conf, Task<? extends Serializable> currTask)_
# PartialScanWork.tblDesc - unused
# _Partition.getBucketPath(int bucketNum)_ has "// Note: this makes assumptions that won't
work with MM tables, unions, etc.".  File Jira?
# _PartitionDesc.LOG_ is unused
# Insert Overwrite for MM is incomplete - see comments in HIVE-15212 regarding IOW and multi
IOW
# mm_bucket_convert.q - doesn't install DbTxnManager, doesn't write any data - not sure what
it tests in practice
# There are no concurrency tests that check locking
# There are no tests with aborted txns
# Tests don't run on Tez/LLAP - this affects some optimizations like Union All writes
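
To make the acid/MM classification items above concrete, here is a minimal Java sketch. It is a toy model and not Hive code: the class MmTableClassification, the enum TableKind and every method in it are hypothetical names invented for illustration, and they only mimic the intent of AcidUtils.isFullAcidTable() and MetaStoreUtils.isInsertOnlyTable(). It shows how a flag derived from a full-acid-only check comes out false for MM tables, whereas a check that treats MM as a subtype of Acid would not.

```java
/**
 * Toy model, NOT Hive code. All names here are hypothetical and exist only
 * to illustrate the classification question raised in the list above.
 */
class MmTableClassification {

    /** Hypothetical stand-in for the table metadata Hive would inspect. */
    enum TableKind { NON_ACID, INSERT_ONLY_MM, FULL_ACID }

    /** Mimics the intent of a full-acid-only check: MM tables are excluded. */
    static boolean isFullAcid(TableKind kind) {
        return kind == TableKind.FULL_ACID;
    }

    /** Mimics the intent of an insert-only (MM) check. */
    static boolean isInsertOnly(TableKind kind) {
        return kind == TableKind.INSERT_ONLY_MM;
    }

    /** Treats MM as a subtype of Acid: both kinds count as transactional. */
    static boolean isTransactional(TableKind kind) {
        return isFullAcid(kind) || isInsertOnly(kind);
    }

    public static void main(String[] args) {
        for (TableKind kind : TableKind.values()) {
            // A flag set via isFullAcid() makes MM tables look non-acid;
            // a flag set via isTransactional() would mark MM tables as acid too.
            System.out.println(kind
                + " flaggedByFullAcidCheck=" + isFullAcid(kind)
                + " flaggedBySubtypeCheck=" + isTransactional(kind));
        }
    }
}
```

Whether the real fix belongs in the lock component flag or elsewhere is not settled by this sketch; it only illustrates why call sites that use the full-acid check alone skip MM tables.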





  was:
# DDLSemanticAnalyzer.alterTableOutput is unused
# DDLTask.generateAddMmTasks(Table) - stmtId should probably come from TransactionManager
# DDLTask.createTable(Hive db, CreateTableDesc crtTbl) has _Long mmWriteId = crtTbl.getInitialMmWriteId();_ - the logic is unclear; this ID is only set in one place.
# FileSinkOperator has multiple places that look like _conf.getWriteType() == AcidUtils.Operation.NOT_ACID || conf.isMmTable()_ - what is the writeType for MM tables?  It seems that Wei opted for "work.getLoadTableWork().getWriteType() != AcidUtils.Operation.NOT_ACID && !tbd.isMmTable()" to mean MM, e.g. the MoveTask.handleStaticParts() call to Hive.loadPartition()
# HiveConf.HIVE_TXN_OPERATIONAL_PROPERTIES - the doc/explanation there is obsolete
# Compactor Initiator likely doesn't work for MM tables.  It's triggered by info in TXN_COMPONENTS/COMPLETED_TXN_COMPONENTS.  MM tables don't write to either because DbTxnManager.acquireLocks() does _compBuilder.setIsAcid(AcidUtils.isFullAcidTable(t));_ i.e. it treats MM tables as non-acid tables
# In general, integration with full Acid seems confused wrt MM and seems to treat MM as a special table type rather than a subtype of Acid table (mostly, but not always).
## e.g. _SemanticAnalyzer.genBucketingSortingDest(String dest, Operator input, QB qb, TableDesc table_desc, Table dest_tab, SortBucketRSCtx ctx)_
## _SemanticAnalyzer.validate()_ has _if (tbl != null && (AcidUtils.isFullAcidTable(tbl) || MetaStoreUtils.isInsertOnlyTable(tbl.getParameters()))) {_
# LoadSemanticAnalyzer.analyzeInternal(ASTNode) sets statementId to 0 rather than from TM
# ImportCommitTask - doesn't currently do anything.  It used to commit mmID.  Need to verify
we properly commit the txn in the Driver
# As far as I can tell all the mm_*.q tests run on TestCliDriver which means MR.  This doesn't
exercise some code specifically for dealing with writes from Union All queries (CTAS, Insert
into).  On MR this requires "hive.optimize.union.remove=true" (false by default)
# Remove MoveWork().setNoop(boolean) and usages per todo in _GenMapRedUtils.createMRWorkForMergingFiles(FileSinkOperator fsInput, Path finalName, DependencyCollectionTask dependencyTask, List<Task<MoveWork>> mvTasks, HiveConf conf, Task<? extends Serializable> currTask)_
# PartialScanWork.tblDesc - unused
# _Partition.getBucketPath(int bucketNum)_ has "// Note: this makes assumptions that won't
work with MM tables, unions, etc.".  File Jira?
# _PartitionDesc.LOG_ is unused
# Insert Overwrite for MM is incomplete - see comments in HIVE-15212 regarding IOW and multi
IOW





> Miscellaneous List
> ------------------
>
>                 Key: HIVE-17691
>                 URL: https://issues.apache.org/jira/browse/HIVE-17691
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Transactions
>            Reporter: Eugene Koifman



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
