drill-issues mailing list archives

From "Arina Ielchiieva (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-5612) Random failure in TestMergeJoinWithSchemaChanges
Date Mon, 24 Jul 2017 10:03:00 GMT

    [ https://issues.apache.org/jira/browse/DRILL-5612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16098169#comment-16098169 ]

Arina Ielchiieva commented on DRILL-5612:

[~Paul.Rogers] maybe we should add an ignore annotation to this test? During the latest batch
commits this test failed several times, which forced me to re-run the unit tests ...

> Random failure in TestMergeJoinWithSchemaChanges
> ------------------------------------------------
>                 Key: DRILL-5612
>                 URL: https://issues.apache.org/jira/browse/DRILL-5612
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.11.0
>            Reporter: Paul Rogers
> The unit test {{org.apache.drill.exec.physical.impl.join.TestMergeJoinWithSchemaChanges#testMissingAndNewColumns}}
is subject to random failures, perhaps due to changes in file order in readers.
> The test builds a number of input files, then executes queries against them. On most
runs, the output is fine:
> {code}
> Running org.apache.drill.exec.physical.impl.join.TestMergeJoinWithSchemaChanges#testMissingAndNewColumns
> /home/.../target/1498606483211-0/mergejoin-schemachanges-left
> /home/.../target/1498606483211-1/mergejoin-schemachanges-right
> {code}
> But, on occasion, the query fails:
> {code}
> org.apache.drill.exec.physical.impl.join.TestMergeJoinWithSchemaChanges
> testMissingAndNewColumns(org.apache.drill.exec.physical.impl.join.TestMergeJoinWithSchemaChanges)
 Time elapsed: 0.569 sec  <<< ERROR!
> ...: UNSUPPORTED_OPERATION ERROR: Sort doesn't currently support sorts with changing
> Fragment 0:0
>   (org.apache.drill.exec.exception.SchemaChangeException) Sort currently only supports
a single schema.
>     org.apache.drill.exec.physical.impl.sort.SortRecordBatchBuilder.build():152
>     org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.innerNext():476
> ...
> {code}
> The line in the exception above:
> {code}
>   public void build(VectorContainer outputContainer) throws SchemaChangeException {
>     outputContainer.clear();
>     if (batches.keySet().size() > 1) {
>       throw new SchemaChangeException("Sort currently only supports a single schema.");
>     }
> {code}
> The above code has not changed in quite some time. The failure is in the "legacy" external sort.
> Although the external sort does support schema changes, it only does so in the form of
a union vector, which must be enabled. (Other tests validate that schema changes work.)
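The check quoted above can be illustrated with a minimal, self-contained sketch. This is not Drill's code: the class, the helper methods, and the column names are hypothetical, and a schema is modeled simply as a set of column names. It only shows why a second, different schema among the incoming batches is enough to trigger the exception.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class SingleSchemaCheckSketch {

  // Hypothetical stand-in for a batch schema: the set of column names.
  static Set<String> schemaOf(String... columns) {
    return new TreeSet<>(Arrays.asList(columns));
  }

  // Counts how many incoming batches carry each distinct schema.
  static Map<Set<String>, Integer> groupBySchema(List<Set<String>> batchSchemas) {
    Map<Set<String>, Integer> batches = new HashMap<>();
    for (Set<String> schema : batchSchemas) {
      batches.merge(schema, 1, Integer::sum);
    }
    return batches;
  }

  // Same shape as the quoted check: more than one schema key is rejected.
  static void build(List<Set<String>> batchSchemas) {
    Map<Set<String>, Integer> batches = groupBySchema(batchSchemas);
    if (batches.keySet().size() > 1) {
      throw new IllegalStateException("Sort currently only supports a single schema.");
    }
  }

  public static void main(String[] args) {
    // Every batch has the same columns: accepted.
    build(Arrays.asList(schemaOf("kl", "vl"), schemaOf("kl", "vl")));
    // One batch is missing a column and has a new one: rejected.
    try {
      build(Arrays.asList(schemaOf("kl", "vl"), schemaOf("kl", "vl1")));
    } catch (IllegalStateException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

In this model, a sort that only ever receives batches from one file never sees a second key, which is consistent with the test passing on most runs.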
> What is likely happening here is that the sort sometimes sees two files with differing
schemas, while at other times multiple threads run so that a single sort sees only one file. This speculation
can be verified by looking at a log file (not available in the test run that failed) to see
if the scan under the sort read more than one file.
> Or, perhaps the order of the JSON files matters. Perhaps file order varies across machines
(since the Linux command to list directories does not guarantee order).
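If file order does turn out to matter, one way to take it out of the picture in a test is to sort the directory listing before using it. The sketch below is only an illustration under that assumption: the class name, method name, and file names are hypothetical, and nothing here is taken from Drill's readers. It relies on the documented behavior of `java.io.File.list()`, which returns names in no guaranteed order.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Arrays;

class SortedListingSketch {

  // Returns the entry names of a directory in lexicographic order.
  // File.list() gives no ordering guarantee, so the result is sorted explicitly.
  static String[] listSorted(File dir) {
    String[] names = dir.list();
    if (names == null) {
      // Not a directory, or an I/O error occurred.
      return new String[0];
    }
    Arrays.sort(names);
    return names;
  }

  public static void main(String[] args) throws IOException {
    // Hypothetical input files, created in reverse name order.
    File dir = Files.createTempDirectory("listing-sketch").toFile();
    new File(dir, "right.json").createNewFile();
    new File(dir, "left.json").createNewFile();
    System.out.println(Arrays.toString(listSorted(dir)));
  }
}
```

With a fixed order, a failure that depends on which file is read first would either reproduce on every run or never, which makes the speculation above easier to confirm or rule out.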
