GitHub user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/20116#discussion_r159470325
--- Diff: sql/core/src/main/java/org/apache/spark/sql/vectorized/ColumnVector.java ---
@@ -14,32 +14,39 @@
* See the License for the specific language governing permissions and
* limitations under the License.
*/
-package org.apache.spark.sql.execution.vectorized;
+package org.apache.spark.sql.vectorized;
import org.apache.spark.sql.catalyst.util.MapData;
import org.apache.spark.sql.types.DataType;
import org.apache.spark.sql.types.Decimal;
import org.apache.spark.unsafe.types.UTF8String;
/**
- * This class represents in-memory values of a column and provides the main APIs to access the data.
- * It supports all the types and contains get APIs as well as their batched versions. The batched
- * versions are considered to be faster and preferable whenever possible.
+ * An interface representing in-memory columnar data in Spark. This interface defines the main APIs
+ * to access the data, as well as their batched versions. The batched versions are considered to be
+ * faster and preferable whenever possible.
*
- * To handle nested schemas, ColumnVector has two types: Arrays and Structs. In both cases these
- * columns have child columns. All of the data are stored in the child columns and the parent column
- * only contains nullability. In the case of Arrays, the lengths and offsets are saved in the child
- * column and are encoded identically to INTs.
+ * Most of the APIs take the rowId as a parameter. This is the batch local 0-based row id for values
+ * in this ColumnVector.
*
- * Maps are just a special case of a two field struct.
+ * ColumnVector supports all the data types including nested types. To handle nested types,
+ * ColumnVector can have children and is a tree structure. For struct type, it stores the actual
+ * data of each field in the corresponding child ColumnVector, and only store null information in
--- End diff --
`store` -> `stores`
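
For readers following the Javadoc under review, the struct layout it describes (field data in child vectors, the parent holding only null information) could be sketched roughly as below. This is an illustration only, not Spark's implementation: `SketchVector`, `IntSketchVector`, `StructSketchVector`, and `SketchDemo` are hypothetical names, and the method names merely resemble the kind of row-indexed and batched accessors the comment mentions.

```java
import java.util.Arrays;

// Hypothetical sketch; none of these classes exist in Spark.
abstract class SketchVector {
    // nulls[rowId] == true means the value at that batch-local row is null.
    protected final boolean[] nulls;

    SketchVector(int numRows) {
        this.nulls = new boolean[numRows];
    }

    boolean isNullAt(int rowId) {
        return nulls[rowId];
    }

    void putNull(int rowId) {
        nulls[rowId] = true;
    }
}

// Leaf vector: stores the actual int values.
final class IntSketchVector extends SketchVector {
    private final int[] data;

    IntSketchVector(int numRows) {
        super(numRows);
        this.data = new int[numRows];
    }

    void putInt(int rowId, int value) {
        data[rowId] = value;
    }

    int getInt(int rowId) {
        return data[rowId];
    }

    // Batched accessor: copies `count` values starting at `rowId` in one call.
    int[] getInts(int rowId, int count) {
        return Arrays.copyOfRange(data, rowId, rowId + count);
    }
}

// Struct vector: keeps only null information itself; each field lives in a child.
final class StructSketchVector extends SketchVector {
    private final SketchVector[] children;

    StructSketchVector(int numRows, SketchVector... children) {
        super(numRows);
        this.children = children;
    }

    SketchVector getChild(int ordinal) {
        return children[ordinal];
    }
}

final class SketchDemo {
    public static void main(String[] args) {
        IntSketchVector x = new IntSketchVector(3);
        IntSketchVector y = new IntSketchVector(3);
        for (int rowId = 0; rowId < 3; rowId++) {
            x.putInt(rowId, rowId);
            y.putInt(rowId, rowId * 10);
        }
        StructSketchVector point = new StructSketchVector(3, x, y);
        point.putNull(1); // the struct at row 1 is null; the children are untouched

        for (int rowId = 0; rowId < 3; rowId++) {
            if (point.isNullAt(rowId)) {
                System.out.println("row " + rowId + ": null");
            } else {
                int xv = ((IntSketchVector) point.getChild(0)).getInt(rowId);
                int yv = ((IntSketchVector) point.getChild(1)).getInt(rowId);
                System.out.println("row " + rowId + ": (" + xv + ", " + yv + ")");
            }
        }
    }
}
```

In this sketch the same `rowId` indexes both the parent and its children, which is one simple way to read the "batch local 0-based row id" wording; the real class may organize this differently.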
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org