drill-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-5657) Implement size-aware result set loader
Date Tue, 14 Nov 2017 17:56:01 GMT

    [ https://issues.apache.org/jira/browse/DRILL-5657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16251830#comment-16251830 ]

ASF GitHub Bot commented on DRILL-5657:

Github user paul-rogers commented on a diff in the pull request:

    --- Diff: exec/vector/src/main/java/org/apache/drill/exec/vector/accessor/writer/BaseScalarWriter.java
    @@ -0,0 +1,264 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.drill.exec.vector.accessor.writer;
    +import java.math.BigDecimal;
    +import org.apache.drill.exec.vector.accessor.ColumnWriterIndex;
    +import org.apache.drill.exec.vector.accessor.impl.HierarchicalFormatter;
    +import org.joda.time.Period;
    +/**
    + * Column writer implementation that acts as the basis for the
    + * generated, vector-specific implementations. All set methods
    + * throw an exception; subclasses simply override the supported
    + * method(s).
    + * <p>
    + * The only tricky part to this class is understanding the
    + * state of the write indexes as the write proceeds. There are
    + * two pointers to consider:
    + * <ul>
    + * <li>lastWriteIndex: The position in the vector at which the
    + * client last asked us to write data. This index is maintained
    + * in this class because it depends only on the actions of this
    + * class.</li>
    + * <li>vectorIndex: The position in the vector at which we will
    + * write if the client chooses to write a value at this time.
    + * The vector index is shared by all columns at the same repeat
    + * level. It is incremented as the client steps through the write
    + * and is observed in this class each time a write occurs.</li>
    + * </ul>
    + * A repeat level is defined as any of the following:
    + * <ul>
    + * <li>The set of top-level scalar columns, or those within a
    + * top-level, non-repeated map, or nested to any depth within
    + * non-repeated maps rooted at the top level.</li>
    + * <li>The values for a single scalar array.</li>
    + * <li>The set of scalar columns within a repeated map, or
    + * nested within non-repeated maps within a repeated map.</li>
    + * </ul>
    + * Items at a repeat level index together and share a vector
    + * index. However, the columns within a repeat level
    + * <i>do not</i> share a last write index: some can lag further
    + * behind than others.
    + * <p>
    + * Let's illustrate the states. Let's focus on one column and
    + * illustrate the three states that can occur during write:
    + * <ul>
    + * <li><b>Behind</b>: the last write index is more than one position
    + * behind the vector index. Zero-filling will be needed to catch up to
    + * the vector index.</li>
    + * <li><b>Written</b>: the last write index is the same as the vector
    + * index because the client wrote data at this position (and previous
    + * values were back-filled with nulls, empties or zeros.)</li>
    + * <li><b>Unwritten</b>: the last write index is one behind the vector
    + * index. This occurs when the column was written, then the client
    + * moved to the next row or array position.</li>
    + * <li><b>Restarted</b>: The current row is abandoned (perhaps filtered
    + * out) and is to be rewritten. The last write position moves
    + * back one position. Note that the Restarted state is
    + * indistinguishable from the unwritten state: the only real
    + * difference is that the current slot (pointed to by the
    + * vector index) contains the previous written value that must
    + * be overwritten or back-filled. But, this is fine, because we
    + * assume that unwritten values are garbage anyway.</li>
    + * </ul>
    + * To illustrate:<pre><code>
    + *      Behind      Written    Unwritten    Restarted
    + *       |X|          |X|         |X|          |X|
    + *   lw >|X|          |X|         |X|          |X|
    + *       | |          |0|         |0|     lw > |0|
    + *    v >| |  lw, v > |X|    lw > |X|      v > |X|
    + *                            v > | |
    + * </code></pre>
    + * The illustrated state transitions are:
    + * <ul>
    + * <li>Suppose the state starts in Behind.<ul>
    + *   <li>If the client writes a value, then the empty slot is
    + *       back-filled and the state moves to Written.</li>
    + *   <li>If the client does not write a value, the state stays
    + *       at Behind, and the gap of unfilled values grows.</li></ul></li>
    + * <li>When in the Written state:<ul>
    + *   <li>If the client saves the current row or array position,
    + *       the vector index increments and we move to the Unwritten
    + *       state.</li>
    + *   <li>If the client abandons the row, the last write position
    + *       moves back one to recreate the unwritten state. We've
    + *       shown this state separately above just to illustrate
    + *       the two transitions from Written.</li></ul></li>
    + * <li>When in the Unwritten (or Restarted) states:<ul>
    + *   <li>If the client writes a value, then the writer moves back to the
    + *       Written state.</li>
    + *   <li>If the client skips the value, then the vector index increments
    + *       again, leaving a gap, and the writer moves to the
    + *       Behind state.</li></ul></li>
    + * </ul>
    + * <p>
    + * We've already noted that the Restarted state is identical to
    + * the Unwritten state (and was discussed just to make the flow a bit
    + * clearer.) The astute reader will have noticed that the Behind state is
    + * the same as the Unwritten state if we define the combined state as
    + * when the last write position is behind the vector index.
    + * <p>
    + * Further, if
    + * one simply treats the gap between last write and the vector indexes
    + * as the amount (which may be zero) to back-fill, then there is just
    + * one state. This is, in fact, how the code works: it always writes
    + * to the vector index (and can do so multiple times for a single row),
    + * back-filling as necessary.
    + * <p>
    + * The states, then, are more for our use in understanding the algorithm.
    + * They are also very useful when working through the logic of performing
    + * a roll-over when a vector overflows.
    + */
    +public abstract class BaseScalarWriter extends AbstractScalarWriter {
    +  public static final int MIN_BUFFER_SIZE = 256;
    +  /**
    +   * Indicates the position in the vector to write. Set via an object so that
    +   * all writers (within the same subtree) can agree on the write position.
    +   * For example, all top-level, simple columns see the same row index.
    +   * All columns within a repeated map see the same (inner) index, etc.
    +   */
    +  protected ColumnWriterIndex vectorIndex;
    +  /**
    +   * Listener invoked if the vector overflows. If not provided, then the writer
    +   * does not support vector overflow.
    +   */
    +  protected ColumnWriterListener listener;
    +  /**
    +   * Cached direct memory location of the start of data for the vector
    +   * being written. Updated each time the buffer is reallocated.
    +   */
    +  protected long bufAddr;
    --- End diff --
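The Javadoc in the diff above ends by observing that the four states collapse into one if the writer always writes at the vector index and back-fills whatever gap lies behind it. A minimal sketch of that idea follows. It is not the code from the pull request: the class `BackFillWriterSketch`, its methods, the `int[]` standing in for a vector, and the `GARBAGE` marker are all hypothetical and exist only to make the index arithmetic concrete.

```java
// Hypothetical sketch only; this is not Drill's BaseScalarWriter.
class BackFillWriterSketch {
  private static final int GARBAGE = -99;  // marks slots never written or filled
  private final int[] vector;              // stands in for the vector's buffer
  private int lastWriteIndex = -1;         // -1 means nothing written yet

  BackFillWriterSketch(int capacity) {
    vector = new int[capacity];
    java.util.Arrays.fill(vector, GARBAGE);
  }

  /** Writes at the shared vector index, zero-filling any skipped slots first. */
  void setInt(int vectorIndex, int value) {
    // The gap is empty in the Written and Unwritten states and
    // non-empty in the Behind state; one loop covers all three.
    for (int i = lastWriteIndex + 1; i < vectorIndex; i++) {
      vector[i] = 0;
    }
    vector[vectorIndex] = value;
    lastWriteIndex = vectorIndex;
  }

  /** Abandons the row at vectorIndex so that it can be rewritten. */
  void restartRow(int vectorIndex) {
    if (lastWriteIndex == vectorIndex) {
      lastWriteIndex--;  // the old value stays in the slot but is treated as garbage
    }
  }

  int get(int index) { return vector[index]; }
  int lastWriteIndex() { return lastWriteIndex; }

  public static void main(String[] args) {
    BackFillWriterSketch w = new BackFillWriterSketch(8);
    w.setInt(0, 7);
    w.setInt(3, 9);  // positions 1 and 2 were skipped, so they are zero-filled
    System.out.println(w.get(1) + " " + w.get(2) + " " + w.get(3));  // prints 0 0 9
  }
}
```

Writing twice at the same index simply overwrites the slot, since the gap is then empty; that matches the note that the writer "can do so multiple times for a single row".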
    Very good question that requires a longer answer than can be explained here. Basically,
the thought is that these accessors are the primary interface between users of vectors and
the backing memory buffers. `DrillBuf`, like the `ByteBuf` from which it derives, and the
`ByteBuffer` on which it is modeled, assume a serialization model. Here we assume more of
a DB buffer model.
    The model used in the code is that `DrillBuf` handles allocation, reference counting,
freeing and so on. The column accessors handle writes to, and reads from, the buffer using
    Calling `DrillBuf` methods without bounds checks is really little different from using
`PlatformDependent` directly. Avoiding those extra calls has a performance benefit.
    FWIW, the text reader has long used memory addresses; that work is now isolated here
and removed (in a later PR) from the text reader (and other places.)
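To make the division of labor concrete, here is a rough sketch under loud assumptions. Plain Java cannot write to a raw address without `Unsafe`-style calls, so an `int[]` stands in for direct memory and a cached array reference plays the role of `bufAddr`. The names `CachedBufferWriterSketch` and `BufferOwner` are invented for illustration and are not Drill classes. The JVM still bounds-checks the array store, so this shows only who is responsible for what, not the actual performance effect.

```java
// Hypothetical sketch only; an int[] stands in for direct memory and the
// class names are invented. This is not how DrillBuf is implemented.
class CachedBufferWriterSketch {

  /** Plays the role of the buffer: owns allocation and reallocation. */
  static class BufferOwner {
    private int[] memory = new int[4];
    int reallocCount = 0;

    int[] memory() { return memory; }

    /** Doubles the backing store until it holds at least minCapacity values. */
    int[] realloc(int minCapacity) {
      int newSize = memory.length;
      while (newSize < minCapacity) {
        newSize *= 2;
      }
      memory = java.util.Arrays.copyOf(memory, newSize);
      reallocCount++;
      return memory;
    }
  }

  private final BufferOwner owner;
  private int[] cached;  // plays the role of bufAddr: refreshed on each realloc
  private int capacity;

  CachedBufferWriterSketch(BufferOwner owner) {
    this.owner = owner;
    this.cached = owner.memory();
    this.capacity = cached.length;
  }

  /** One capacity check per write, then a store through the cached reference. */
  void setInt(int index, int value) {
    if (index >= capacity) {
      cached = owner.realloc(index + 1);
      capacity = cached.length;
    }
    cached[index] = value;
  }

  int getInt(int index) { return cached[index]; }

  public static void main(String[] args) {
    BufferOwner owner = new BufferOwner();
    CachedBufferWriterSketch writer = new CachedBufferWriterSketch(owner);
    for (int i = 0; i < 10; i++) {
      writer.setInt(i, i * i);
    }
    // prints: 81 after 2 reallocations
    System.out.println(writer.getInt(9) + " after " + owner.reallocCount + " reallocations");
  }
}
```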

> Implement size-aware result set loader
> --------------------------------------
>                 Key: DRILL-5657
>                 URL: https://issues.apache.org/jira/browse/DRILL-5657
>             Project: Apache Drill
>          Issue Type: Improvement
>    Affects Versions: Future
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>             Fix For: Future
> A recent extension to Drill's set of test tools created a "row set" abstraction to allow
> us to create, and verify, record batches with very few lines of code. Part of this work involved
> creating a set of "column accessors" in the vector subsystem. Column readers provide a uniform
> API to obtain data from columns (vectors), while column writers provide a uniform writing
>
> DRILL-5211 discusses a set of changes to limit value vectors to 16 MB in size (to avoid
> memory fragmentation due to Drill's two memory allocators.) The column accessors have proven
> to be so useful that they will be the basis for the new, size-aware writers used by Drill's
> record readers.
>
> A step in that direction is to retrofit the column writers to use the size-aware {{setScalar()}}
> and {{setArray()}} methods introduced in DRILL-5517.
>
> Since the test framework row set classes are (at present) the only consumer of the accessors,
> those classes must also be updated with the changes.
>
> This then allows us to add a new "row mutator" class that handles size-aware vector writing,
> including the case in which a vector fills in the middle of a row.
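The case of a vector filling in the middle of a row may be easier to follow with a toy example. The sketch below is an assumption-laden illustration, not the proposed design: `RowMutatorSketch`, its methods, the two-column layout and the tiny byte budget (standing in for the 16 MB limit) are all hypothetical. When the name column would exceed its budget, the rows saved so far are harvested as a batch, and the value already written for the in-flight row is carried into a fresh batch. Each column is assumed to be written at most once per row.

```java
// Hypothetical sketch only: the class, its methods and the tiny byte budget
// are invented for illustration and are not Drill APIs.
class RowMutatorSketch {
  private final int maxNameBytes;  // stands in for the 16 MB vector limit
  private java.util.List<Integer> ids = new java.util.ArrayList<>();
  private java.util.List<String> names = new java.util.ArrayList<>();
  private int nameBytes = 0;       // bytes consumed by the name column so far
  private int rowCount = 0;        // rows saved in the current batch
  final java.util.List<Integer> harvestedBatchSizes = new java.util.ArrayList<>();

  RowMutatorSketch(int maxNameBytes) { this.maxNameBytes = maxNameBytes; }

  void setId(int id) { setAt(ids, rowCount, id); }

  void setName(String name) {
    int length = name.getBytes(java.nio.charset.StandardCharsets.UTF_8).length;
    if (nameBytes + length > maxNameBytes && rowCount > 0) {
      rollOver();  // the column filled in the middle of the current row
    }
    setAt(names, rowCount, name);
    nameBytes += length;
  }

  void saveRow() { rowCount++; }

  /** Harvests the saved rows and moves the in-flight row to a new batch. */
  private void rollOver() {
    harvestedBatchSizes.add(rowCount);
    java.util.List<Integer> newIds = new java.util.ArrayList<>();
    if (ids.size() > rowCount) {
      newIds.add(ids.get(rowCount));  // carry the in-flight row's id forward
    }
    ids = newIds;
    names = new java.util.ArrayList<>();
    nameBytes = 0;
    rowCount = 0;
  }

  private static <T> void setAt(java.util.List<T> column, int index, T value) {
    while (column.size() <= index) {
      column.add(null);  // pad skipped rows with nulls
    }
    column.set(index, value);
  }

  int rowCount() { return rowCount; }
  Integer idAt(int row) { return ids.get(row); }
  String nameAt(int row) { return names.get(row); }

  public static void main(String[] args) {
    RowMutatorSketch mutator = new RowMutatorSketch(10);
    String[] values = {"aaaa", "bbbb", "cccc"};
    for (int i = 0; i < values.length; i++) {
      mutator.setId(i + 1);
      mutator.setName(values[i]);
      mutator.saveRow();
    }
    // prints: [2] 3 cccc
    System.out.println(mutator.harvestedBatchSizes + " " + mutator.idAt(0) + " " + mutator.nameAt(0));
  }
}
```

In the usage above, the third name would push the column past its 10-byte budget, so the first two rows are harvested and the third row (id 3, already written) becomes row 0 of the next batch.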

This message was sent by Atlassian JIRA
