drill-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jacques Nadeau" <jacques.dr...@gmail.com>
Subject Re: Review Request 28471: DRILL-1781: For complex functions, don't return until schema is known
Date Sun, 30 Nov 2014 21:14:27 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28471/#review63323
-----------------------------------------------------------


Generally like the reduction in code.  Treating certain operators special rather than all
seems cleaner. Two main pieces of feedback: 1. You should stop introducing additional boolean
state variables.  Once you move beyond a single boolean variable around state, please switch
over to using an enumeration.  2. You need to add a lot of comments.


exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/StreamingAggBatch.java
<https://reviews.apache.org/r/28471/#comment105544>

    This code seems to be repeating.  Can you factor it out somehow?



exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/filter/FilterRecordBatch.java
<https://reviews.apache.org/r/28471/#comment105545>

    comment please.



exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/HashJoinBatch.java
<https://reviews.apache.org/r/28471/#comment105546>

    Please move this to a single state variable.  INIT, SCHEMA_BUILD, DONE, etc.



exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/HashJoinBatch.java
<https://reviews.apache.org/r/28471/#comment105547>

    please comment.



exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/JoinStatus.java
<https://reviews.apache.org/r/28471/#comment105548>

    please comment.



exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/MergeJoinBatch.java
<https://reviews.apache.org/r/28471/#comment105549>

    please move to a single state variable.



exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/mergereceiver/MergingRecordBatch.java
<https://reviews.apache.org/r/28471/#comment105550>

    please add comments.



exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/mergereceiver/MergingRecordBatch.java
<https://reviews.apache.org/r/28471/#comment105551>

    please add comment.



exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/mergereceiver/MergingRecordBatch.java
<https://reviews.apache.org/r/28471/#comment105552>

    At least use DrillRuntimeException.  Also add short message.



exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionSenderRootExec.java
<https://reviews.apache.org/r/28471/#comment105553>

    add comment.



exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/svremover/RemovingRecordBatch.java
<https://reviews.apache.org/r/28471/#comment105539>

    Please add comment here.  This doesn't look right.



exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/svremover/RemovingRecordBatch.java
<https://reviews.apache.org/r/28471/#comment105540>

    del



exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/union/UnionAllRecordBatch.java
<https://reviews.apache.org/r/28471/#comment105541>

    same, please add comment.



exec/java-exec/src/main/java/org/apache/drill/exec/record/AbstractSingleRecordBatch.java
<https://reviews.apache.org/r/28471/#comment105542>

    please comment on logic.



exec/java-exec/src/main/java/org/apache/drill/exec/work/batch/UnlimitedRawBatchBuffer.java
<https://reviews.apache.org/r/28471/#comment105543>

    please rework to be a single state variable rather than two.  RUNNING, KILLED, FINISHED
or similar.


- Jacques Nadeau


On Nov. 26, 2014, 12:07 p.m., Steven Phillips wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/28471/
> -----------------------------------------------------------
> 
> (Updated Nov. 26, 2014, 12:07 p.m.)
> 
> 
> Review request for drill and Jacques Nadeau.
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> In the case of complex output functions, it is impossible to determine the output schema
until the actual data is consumed. For example, with convert_form(VARCHAR, 'json'), unlike
most other functions, it is not sufficient to know that the incoming data type is VARCHAR,
we actually need to decode the contents of the record before we can determine what the output
type is, whether it be map, list, or primitive type.
> For fast schema return, we worked around this problem by simply assuming the type was
Map, and if it happened to be different, there would be a schema change. This solution is
not satisfactory, as it ends up breaking other functions, like flatten.
> The solution is to continue returning a schema whenever possible, but when it is not
possible, drill will wait until it is.
> For non-blocking operators, drill will immediately consume the incoming batch, and thus
will not return empty schema batches if there is data to consume. Blocking operators will
return an empty schema batch. If a flattten function occurs downstream from a blocking operator,
it will not be able to return a schema, and thus fast schema return will not happen in this
case.
> In the cases where the complex function is not downstream from a blocking operator, fast
schema return should continue to work.
> 
> 
> Diffs
> -----
> 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/RootExec.java d9c4e5b

>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/ScanBatch.java 4ed1180

>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/ScreenCreator.java
3a843ea 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/SingleSenderCreator.java
b638de0 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/TopN/TopNBatch.java
400a867 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/WriterRecordBatch.java
cb0de02 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggBatch.java
a0b8d3f 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/StreamingAggBatch.java
17aaae8 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/broadcastsender/BroadcastSenderRootExec.java
4e7d222 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/filter/FilterRecordBatch.java
7d68e07 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/flatten/FlattenRecordBatch.java
78c1c50 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/HashJoinBatch.java
7f4d03c 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/JoinStatus.java
3bc8daa 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/MergeJoinBatch.java
518971d 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/limit/LimitRecordBatch.java
02e1a92 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/mergereceiver/MergingRecordBatch.java
8da8f96 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionSenderRootExec.java
9e3cfe5 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionerTemplate.java
a16e29f 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/producer/ProducerConsumerBatch.java
132c41e 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/project/ProjectRecordBatch.java
27cb1f2 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/svremover/RemovingRecordBatch.java
f2c1e89 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/union/UnionAllRecordBatch.java
42492ab 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/unorderedreceiver/UnorderedReceiverBatch.java
25fec41 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/validate/IteratorValidatorBatchIterator.java
7f5ab2a 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/ExternalSortBatch.java
3f2692e 
>   exec/java-exec/src/main/java/org/apache/drill/exec/record/AbstractRecordBatch.java
f77ae3d 
>   exec/java-exec/src/main/java/org/apache/drill/exec/record/AbstractSingleRecordBatch.java
1ef0345 
>   exec/java-exec/src/main/java/org/apache/drill/exec/record/RecordBatch.java 318600f

>   exec/java-exec/src/main/java/org/apache/drill/exec/work/batch/UnlimitedRawBatchBuffer.java
ffa4e2c 
>   exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/FragmentExecutor.java
f76dfcd 
>   exec/java-exec/src/test/java/org/apache/drill/exec/nested/TestFastComplexSchema.java
PRE-CREATION 
>   exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/SimpleRootExec.java
0581411 
>   exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/TestCastFunctions.java
7baf7c4 
>   exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/TestDecimal.java 77301f0

>   exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/TestExtractFunctions.java
fae9390 
>   exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/TestSimpleFragmentRun.java
2576e16 
>   exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/join/TestMergeJoin.java
a6a1866 
>   exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/window/TestWindowFrame.java
1a4f998 
>   exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestWriter.java
82a8bfd 
>   exec/java-exec/src/test/java/org/apache/drill/exec/record/vector/TestDateTypes.java
8152ed3 
>   exec/java-exec/src/test/java/org/apache/drill/exec/sql/TestWindowFunctions.java 780a7ce

>   exec/java-exec/src/test/java/org/apache/drill/exec/vector/complex/fn/TestJsonReaderWithSparseFiles.java
3cfdc1d 
>   exec/java-exec/src/test/java/org/apache/drill/exec/vector/complex/writer/TestJsonReader.java
bf81ba2 
> 
> Diff: https://reviews.apache.org/r/28471/diff/
> 
> 
> Testing
> -------
> 
> TestFastComplexSchema.java
> 
> 
> Thanks,
> 
> Steven Phillips
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message