drill-issues mailing list archives

From "Arina Ielchiieva (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-5612) Random failure in TestMergeJoinWithSchemaChanges
Date Mon, 24 Jul 2017 10:03:00 GMT

    [ https://issues.apache.org/jira/browse/DRILL-5612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16098169#comment-16098169 ]

Arina Ielchiieva commented on DRILL-5612:
-----------------------------------------

[~Paul.Rogers] maybe we should add an ignore annotation to this test? During the latest batch
commits this test failed several times, which made me re-run the unit tests ...

> Random failure in TestMergeJoinWithSchemaChanges
> ------------------------------------------------
>
>                 Key: DRILL-5612
>                 URL: https://issues.apache.org/jira/browse/DRILL-5612
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.11.0
>            Reporter: Paul Rogers
>
> The unit test {{org.apache.drill.exec.physical.impl.join.TestMergeJoinWithSchemaChanges#testMissingAndNewColumns}} is subject to random failures, perhaps due to changes in file order in readers.
> The test builds a number of input files, then executes queries against them. On most runs, the output is fine:
> {code}
> Running org.apache.drill.exec.physical.impl.join.TestMergeJoinWithSchemaChanges#testMissingAndNewColumns
> /home/.../target/1498606483211-0/mergejoin-schemachanges-left
> /home/.../target/1498606483211-1/mergejoin-schemachanges-right
> {code}
> But, on occasion, the query fails:
> {code}
> org.apache.drill.exec.physical.impl.join.TestMergeJoinWithSchemaChanges
> testMissingAndNewColumns(org.apache.drill.exec.physical.impl.join.TestMergeJoinWithSchemaChanges) Time elapsed: 0.569 sec  <<< ERROR!
> ...: UNSUPPORTED_OPERATION ERROR: Sort doesn't currently support sorts with changing schemas
> Fragment 0:0
>   (org.apache.drill.exec.exception.SchemaChangeException) Sort currently only supports a single schema.
>     org.apache.drill.exec.physical.impl.sort.SortRecordBatchBuilder.build():152
>     org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.innerNext():476
> ...
> {code}
> The line in the exception above:
> {code}
>   public void build(VectorContainer outputContainer) throws SchemaChangeException {
>     outputContainer.clear();
>     if (batches.keySet().size() > 1) {
>       throw new SchemaChangeException("Sort currently only supports a single schema.");
>     }
> {code}
> The above code has not changed in quite some time. The failure is in the "legacy" external sort.
> Although the external sort does support schema changes, it only does so in the form of a union vector, which must be enabled. (Other tests validate that schema changes work.)
> What is likely happening here is that the sort sometimes sees two files with differing schemas, while at other times multiple threads run so that a single sort sees only one file. This speculation can be verified by looking at a log file (not available in the test run that failed) to see if the scan under the sort read more than one file.
> Or, perhaps the order of the JSON files matters. Perhaps file order varies across machines (since the Linux command to list directories does not guarantee order).
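
If file order is indeed a factor, one way to probe the hypothesis is to make the listing order deterministic and see whether the failure becomes reproducible. Below is a minimal sketch using plain java.nio, not Drill's own file system layer. The class name SortedListing and the method listSorted are hypothetical and are not part of Drill. It relies only on the documented behaviour that Java directory streams return entries in no specific order, so the sort is done explicitly.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class SortedListing {

  // Hypothetical helper (not Drill code): returns the regular files directly
  // under dir, sorted by file name so that the order does not depend on the
  // underlying file system.
  public static List<Path> listSorted(Path dir) throws IOException {
    try (Stream<Path> entries = Files.list(dir)) {
      return entries
          .filter(Files::isRegularFile)
          .sorted(Comparator.comparing((Path p) -> p.getFileName().toString()))
          .collect(Collectors.toList());
    }
  }

  public static void main(String[] args) throws IOException {
    // Build a scratch directory with two JSON files written out of name order.
    // The file names and contents are made up for illustration.
    Path dir = Files.createTempDirectory("mergejoin-demo");
    Files.write(dir.resolve("b.json"), "{\"kl\": 1}\n".getBytes(StandardCharsets.UTF_8));
    Files.write(dir.resolve("a.json"), "{\"kl\": 2, \"extra\": 3}\n".getBytes(StandardCharsets.UTF_8));

    // Print the files in the deterministic, name-sorted order.
    for (Path p : listSorted(dir)) {
      System.out.println(p.getFileName());
    }
  }
}
```

Sorting only fixes the order in which a caller sees the files. It would not by itself change how many files a single sort operator receives, so the threading explanation above would still need to be checked against the logs.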



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)