hive-issues mailing list archives

From "ASF GitHub Bot (Jira)" <j...@apache.org>
Subject [jira] [Work logged] (HIVE-24332) Make AbstractSerDe Superclass of all Classes
Date Mon, 14 Dec 2020 00:00:00 GMT

     [ https://issues.apache.org/jira/browse/HIVE-24332?focusedWorklogId=523665&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-523665 ]

ASF GitHub Bot logged work on HIVE-24332:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 13/Dec/20 23:59
            Start Date: 13/Dec/20 23:59
    Worklog Time Spent: 10m 
      Work Description: miklosgergely commented on a change in pull request #1634:
URL: https://github.com/apache/hive/pull/1634#discussion_r542031088



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/TablePropertyEnrichmentOptimizer.java
##########
@@ -115,13 +115,13 @@ public Object process(Node nd, Stack<Node> stack, NodeProcessorCtx procCtx, Obje
       String deserializerClassName = null;
       try {
         deserializerClassName = tableScanDesc.getTableMetadata().getSd().getSerdeInfo().getSerializationLib();
-        Deserializer deserializer = ReflectionUtil.newInstance(
+        AbstractSerDe deserializer = ReflectionUtil.newInstance(

Review comment:
       Please call the variable serDe.

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DynamicValueRegistryTez.java
##########
@@ -104,8 +105,8 @@ public void init(RegistryConf conf) throws Exception {
       RuntimeValuesInfo runtimeValuesInfo = rct.baseWork.getInputSourceToRuntimeValuesInfo().get(inputSourceName);
 
       // Setup deserializer/obj inspectors for the incoming data source
-      Deserializer deserializer = ReflectionUtils.newInstance(runtimeValuesInfo.getTableDesc().getDeserializerClass(), null);
-      deserializer.initialize(rct.conf, runtimeValuesInfo.getTableDesc().getProperties());
+      AbstractSerDe deserializer = ReflectionUtils.newInstance(runtimeValuesInfo.getTableDesc().getSerDeClass(), null);

Review comment:
       Please call this serDe.

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkReduceRecordHandler.java
##########
@@ -132,10 +133,12 @@ public void init(JobConf job, OutputCollector output, Reporter reporter) throws
     isTagged = gWork.getNeedsTagging();
     try {
       keyTableDesc = gWork.getKeyDesc();
-      inputKeyDeserializer = ReflectionUtils.newInstance(keyTableDesc
-        .getDeserializerClass(), null);
-      SerDeUtils.initializeSerDe(inputKeyDeserializer, null, keyTableDesc.getProperties(), null);
-      keyObjectInspector = inputKeyDeserializer.getObjectInspector();
+      AbstractSerDe serde = ReflectionUtils.newInstance(keyTableDesc
+        .getSerDeClass(), null);
+      serde.initialize(null, keyTableDesc.getProperties(), null);
+      keyObjectInspector = serde.getObjectInspector();
+
+      inputKeyDeserializer = serde;

Review comment:
       Please call this inputKeySerDe. Also, there is no need for the interim serde variable; it can be created directly under its final name.
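As an illustrative sketch of the shape the review asks for (simplified stand-in types, not the actual Hive classes): the SerDe can be created and initialized directly under its final name, with no interim local.

```java
import java.util.Properties;

public class Example {
    // Hypothetical, simplified stand-in for AbstractSerDe; the real Hive
    // initialize signature takes a Configuration as well.
    static class KeySerDe {
        private Properties props;
        void initialize(Properties tableProperties) { this.props = tableProperties; }
        boolean isInitialized() { return props != null; }
    }

    KeySerDe inputKeySerDe;

    void init(Properties tableProps) {
        // Assign straight to the field rather than going through an interim
        // local (e.g. "KeySerDe serde = ...; inputKeySerDe = serde;").
        inputKeySerDe = new KeySerDe();
        inputKeySerDe.initialize(tableProps);
    }
}
```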

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java
##########
@@ -179,18 +178,18 @@ protected void initializeOp(Configuration hconf) throws HiveException {
     }
     try {
       TableDesc keyTableDesc = conf.getKeyTblDesc();
-      AbstractSerDe keySerde = (AbstractSerDe) ReflectionUtils.newInstance(keyTableDesc.getDeserializerClass(),
+      AbstractSerDe keySerde = (AbstractSerDe) ReflectionUtils.newInstance(keyTableDesc.getSerDeClass(),

Review comment:
       Nit: please call this keySerDe for consistency.

##########
File path: hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/InternalUtil.java
##########
@@ -143,18 +142,16 @@ private static ObjectInspector getObjectInspector(TypeInfo type) throws IOExcept
   //TODO this has to find a better home, it's also hardcoded as default in hive would be nice
   // if the default was decided by the serde
   static void initializeOutputSerDe(AbstractSerDe serDe, Configuration conf, OutputJobInfo jobInfo)
-    throws SerDeException {
-    SerDeUtils.initializeSerDe(serDe, conf,
-                               getSerdeProperties(jobInfo.getTableInfo(),
-                                                  jobInfo.getOutputSchema()),
-                               null);
+      throws SerDeException {
+    serDe.initialize(conf, getSerdeProperties(jobInfo.getTableInfo(), jobInfo.getOutputSchema()), null);
   }
 
   static void initializeDeserializer(Deserializer deserializer, Configuration conf,
                      HCatTableInfo info, HCatSchema schema) throws SerDeException {
     Properties props = getSerdeProperties(info, schema);
    LOG.info("Initializing " + deserializer.getClass().getName() + " with properties " + props);
-    SerDeUtils.initializeSerDe(deserializer, conf, props, null);
+    AbstractSerDe serde = (AbstractSerDe)deserializer;

Review comment:
       Nit: please call this serDe for consistency.

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/tez/ReduceRecordSource.java
##########
@@ -152,10 +152,11 @@ void init(JobConf jconf, Operator<?> reducer, boolean vectorized, TableDesc keyT
     this.tag = tag;
 
     try {
-      inputKeyDeserializer = ReflectionUtils.newInstance(keyTableDesc
-          .getDeserializerClass(), null);
-      SerDeUtils.initializeSerDe(inputKeyDeserializer, null, keyTableDesc.getProperties(), null);
-      keyObjectInspector = inputKeyDeserializer.getObjectInspector();
+      AbstractSerDe serde = ReflectionUtils.newInstance(keyTableDesc.getSerDeClass(), null);
+      serde.initialize(null, keyTableDesc.getProperties(), null);
+
+      inputKeyDeserializer = serde; 

Review comment:
       I suggest calling it inputKeySerDe; it can be created under that name directly, with no need for an interim serde variable.

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/tez/ReduceRecordSource.java
##########
@@ -164,10 +165,8 @@ void init(JobConf jconf, Operator<?> reducer, boolean vectorized, TableDesc keyT
 
       // We should initialize the SerDe with the TypeInfo when available.
       this.valueTableDesc = valueTableDesc;
-      inputValueDeserializer = (AbstractSerDe) ReflectionUtils.newInstance(
-          valueTableDesc.getDeserializerClass(), null);
-      SerDeUtils.initializeSerDe(inputValueDeserializer, null,
-          valueTableDesc.getProperties(), null);
+      inputValueDeserializer = (AbstractSerDe) ReflectionUtils.newInstance(valueTableDesc.getSerDeClass(), null);

Review comment:
       Please call this inputValueSerDe.

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkReduceRecordHandler.java
##########
@@ -154,10 +157,11 @@ public void init(JobConf job, OutputCollector output, Reporter reporter) throws
       for (int tag = 0; tag < gWork.getTagToValueDesc().size(); tag++) {
         // We should initialize the SerDe with the TypeInfo when available.
         valueTableDesc[tag] = gWork.getTagToValueDesc().get(tag);
-        inputValueDeserializer[tag] = ReflectionUtils.newInstance(
-            valueTableDesc[tag].getDeserializerClass(), null);
-        SerDeUtils.initializeSerDe(inputValueDeserializer[tag], null,
-            valueTableDesc[tag].getProperties(), null);
+
+        AbstractSerDe sd = ReflectionUtils.newInstance(valueTableDesc[tag].getSerDeClass(), null);
+        sd.initialize(null, valueTableDesc[tag].getProperties(), null);
+
+        inputValueDeserializer[tag] = sd;

Review comment:
       Please call this inputValueSerDe.

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/ScriptOperator.java
##########
@@ -273,15 +273,14 @@ protected void initializeOp(Configuration hconf) throws HiveException {
     try {
       this.hconf = hconf;
 
-      scriptOutputDeserializer = conf.getScriptOutputInfo()
-          .getDeserializerClass().newInstance();
-      SerDeUtils.initializeSerDe(scriptOutputDeserializer, hconf,
-                                 conf.getScriptOutputInfo().getProperties(), null);
+      AbstractSerDe outputSerde = conf.getScriptOutputInfo().getSerDeClass().newInstance();

Review comment:
       Nit: outputSerDe for consistency.

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionDesc.java
##########
@@ -171,9 +171,9 @@ public String getDeserializerClassName() {
   public Deserializer getDeserializer(Configuration conf) throws Exception {
     Properties schema = getProperties();
     String clazzName = getDeserializerClassName();
-    Deserializer deserializer = ReflectionUtil.newInstance(conf.getClassByName(clazzName)
-        .asSubclass(Deserializer.class), conf);
-    SerDeUtils.initializeSerDe(deserializer, conf, getTableDesc().getProperties(), schema);
+    AbstractSerDe deserializer = ReflectionUtil.newInstance(conf.getClassByName(clazzName)

Review comment:
       Please call the variable serDe, and the function getSerDeClass, to be consistent with
TableDesc.

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java
##########
@@ -775,7 +775,7 @@ public static void setTaskPlan(Path path, String alias,
     if (topOp instanceof TableScanOperator) {
       try {
         Utilities.addSchemaEvolutionToTableScanOperator(
-          (StructObjectInspector) tt_desc.getDeserializer().getObjectInspector(),
+          (StructObjectInspector) tt_desc.getSerDe().getObjectInspector(),

Review comment:
       Nit: please rename tt_desc so that it follows the java naming conventions.

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/MapOperator.java
##########
@@ -407,7 +407,7 @@ public void initEmptyInputChildren(List<Operator<?>> children, Configuration hco
       StructObjectInspector soi = null;
       PartitionDesc partDesc = conf.getAliasToPartnInfo().get(tsOp.getConf().getAlias());
       Configuration newConf = tableNameToConf.get(partDesc.getTableDesc().getTableName());
-      Deserializer serde = partDesc.getTableDesc().getDeserializer();
+      Deserializer serde = partDesc.getTableDesc().getSerDe();

Review comment:
       Nit: AbstractSerDe serDe for consistency.

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecReducer.java
##########
@@ -100,20 +101,19 @@ public void configure(JobConf job) {
     isTagged = gWork.getNeedsTagging();
     try {
       keyTableDesc = gWork.getKeyDesc();
-      inputKeyDeserializer = ReflectionUtils.newInstance(keyTableDesc
-          .getDeserializerClass(), null);
-      SerDeUtils.initializeSerDe(inputKeyDeserializer, null, keyTableDesc.getProperties(), null);
+      AbstractSerDe serDe = ReflectionUtils.newInstance(keyTableDesc
+          .getSerDeClass(), null);
+      serDe.initialize(null, keyTableDesc.getProperties(), null);
+      inputKeyDeserializer = serDe;
       keyObjectInspector = inputKeyDeserializer.getObjectInspector();
       valueTableDesc = new TableDesc[gWork.getTagToValueDesc().size()];
       for (int tag = 0; tag < gWork.getTagToValueDesc().size(); tag++) {
         // We should initialize the SerDe with the TypeInfo when available.
         valueTableDesc[tag] = gWork.getTagToValueDesc().get(tag);
-        inputValueDeserializer[tag] = ReflectionUtils.newInstance(
-            valueTableDesc[tag].getDeserializerClass(), null);
-        SerDeUtils.initializeSerDe(inputValueDeserializer[tag], null,
-                                   valueTableDesc[tag].getProperties(), null);
-        valueObjectInspector[tag] = inputValueDeserializer[tag]
-            .getObjectInspector();
+        AbstractSerDe sd = ReflectionUtils.newInstance(valueTableDesc[tag].getSerDeClass(), null);
+        sd.initialize(null, valueTableDesc[tag].getProperties(), null);
+        inputValueDeserializer[tag] = sd;
+        valueObjectInspector[tag] = inputValueDeserializer[tag].getObjectInspector();

Review comment:
       Please call this valueObjectSerDe.

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecReducer.java
##########
@@ -100,20 +101,19 @@ public void configure(JobConf job) {
     isTagged = gWork.getNeedsTagging();
     try {
       keyTableDesc = gWork.getKeyDesc();
-      inputKeyDeserializer = ReflectionUtils.newInstance(keyTableDesc
-          .getDeserializerClass(), null);
-      SerDeUtils.initializeSerDe(inputKeyDeserializer, null, keyTableDesc.getProperties(), null);
+      AbstractSerDe serDe = ReflectionUtils.newInstance(keyTableDesc
+          .getSerDeClass(), null);
+      serDe.initialize(null, keyTableDesc.getProperties(), null);
+      inputKeyDeserializer = serDe;

Review comment:
       Please call this inputKeySerDe; also, there is no need for the interim serde variable.

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableDummyOperator.java
##########
@@ -45,8 +44,8 @@ protected void initializeOp(Configuration hconf) throws HiveException {
     super.initializeOp(hconf);
     TableDesc tbl = this.getConf().getTbl();
     try {
-      Deserializer serde = tbl.getDeserializerClass().newInstance();
-      SerDeUtils.initializeSerDe(serde, hconf, tbl.getProperties(), null);
+      AbstractSerDe serde = tbl.getSerDeClass().newInstance();

Review comment:
       Nit: please call this serDe for consistency.

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java
##########
@@ -352,8 +351,8 @@ public void generateMapMetaData() throws HiveException {
     try {
       TableDesc keyTableDesc = conf.getKeyTblDesc();
       AbstractSerDe keySerializer = (AbstractSerDe) ReflectionUtil.newInstance(

Review comment:
       Please call this keySerDe.

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/ScriptOperator.java
##########
@@ -273,15 +273,14 @@ protected void initializeOp(Configuration hconf) throws HiveException {
     try {
       this.hconf = hconf;
 
-      scriptOutputDeserializer = conf.getScriptOutputInfo()
-          .getDeserializerClass().newInstance();
-      SerDeUtils.initializeSerDe(scriptOutputDeserializer, hconf,
-                                 conf.getScriptOutputInfo().getProperties(), null);
+      AbstractSerDe outputSerde = conf.getScriptOutputInfo().getSerDeClass().newInstance();
+      outputSerde.initialize(hconf, conf.getScriptOutputInfo().getProperties(), null);
 
-      scriptInputSerializer = (Serializer) conf.getScriptInputInfo()
-          .getDeserializerClass().newInstance();
-      scriptInputSerializer.initialize(hconf, conf.getScriptInputInfo()
-          .getProperties());
+      AbstractSerDe inputSerde = conf.getScriptInputInfo().getSerDeClass().newInstance();

Review comment:
       Nit: inputSerDe for consistency.

##########
File path: serde/src/java/org/apache/hadoop/hive/serde2/SerDe.java
##########
@@ -0,0 +1,32 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.serde2;
+
+/**
+ * A Hive Serializer/Deserializer.
+ */
+public interface SerDe {
+
+  /**
+   * Returns statistics collected when serializing

Review comment:
       Modify the comment to "(de)serializing".




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 523665)
    Time Spent: 40m  (was: 0.5h)

> Make AbstractSerDe Superclass of all Classes
> --------------------------------------------
>
>                 Key: HIVE-24332
>                 URL: https://issues.apache.org/jira/browse/HIVE-24332
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: David Mollitor
>            Assignee: David Mollitor
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> Rework how the {{AbstractSerDe}}, {{Deserializer}}, and {{Serializer}} classes are designed.
> Simplify, and consolidate more functionality into {{AbstractSerDe}}. Remove functionality that is not commonly used. Remove methods that were deprecated in 3.x (or maybe even older).
> Make it like Java's {{ByteChannel}}, which combines the {{ReadableByteChannel}} and {{WritableByteChannel}} interfaces.
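For reference, the ByteChannel pattern the description points to can be sketched in plain JDK terms: a single type satisfies both the read-side and write-side interfaces, so one object can be handed to code that needs either capability. This illustrative in-memory channel is not part of the Hive change; in the proposed design, AbstractSerDe plays the role of ByteChannel, with Deserializer and Serializer as the two side interfaces.

```java
import java.nio.ByteBuffer;
import java.nio.channels.ByteChannel;

// ByteChannel extends both ReadableByteChannel and WritableByteChannel,
// so one implementation serves both roles.
public class InMemoryChannel implements ByteChannel {
    private final ByteBuffer buf = ByteBuffer.allocate(1024);
    private boolean open = true;

    @Override
    public int write(ByteBuffer src) {
        // Write side: append the source bytes to the internal buffer.
        int n = src.remaining();
        buf.put(src);
        return n;
    }

    @Override
    public int read(ByteBuffer dst) {
        // Read side: flip to read mode, copy what fits, compact back.
        buf.flip();
        int n = Math.min(buf.remaining(), dst.remaining());
        for (int i = 0; i < n; i++) {
            dst.put(buf.get());
        }
        buf.compact();
        return n;
    }

    @Override
    public boolean isOpen() { return open; }

    @Override
    public void close() { open = false; }
}
```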



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
