From Paul Rogers <prog...@mapr.com>
Subject Re: Jars for BaseTestQuery
Date Thu, 20 Apr 2017 23:11:56 GMT
Hi François,

You raised two issues; I’ll address both.

First, it is true that Maven’s model is that test code is not packaged; it is visible only
to the Maven module in which the test code resides. As you point out, this is an inconvenience
in multiple-module projects such as Drill. Drill gets around the problem by minimizing unit
testing; most testing outside of java-exec is done via system tests: running all of Drill
and throwing queries at it.

Drill, at present, has no support for reusing tests outside of their home module. It would
be great if someone volunteers to solve the problem. Here are two references: [1], [2].
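
The approach described in [2] boils down to roughly the following. This is only a sketch of
standard Maven configuration: the com.example coordinates and the version are placeholders,
not real Drill artifacts. The module that owns the tests attaches them as a second jar, and
the consuming module depends on that jar in test scope.

```xml
<!-- In the pom.xml of the module that owns the test classes -->
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-jar-plugin</artifactId>
      <executions>
        <execution>
          <goals>
            <!-- Packages src/test classes into an extra "tests" jar -->
            <goal>test-jar</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>

<!-- In the <dependencies> section of the module that reuses those tests.
     groupId, artifactId and version are placeholders. -->
<dependency>
  <groupId>com.example</groupId>
  <artifactId>module-with-tests</artifactId>
  <version>1.0.0</version>
  <classifier>tests</classifier>
  <type>test-jar</type>
  <scope>test</scope>
</dependency>
```

One caveat with this setup: the test-scoped dependencies of the module that owns the tests
are not pulled in transitively, so the consuming module has to declare them itself.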

Second, you mentioned you want to unit test a storage plugin. Here, it is necessary to understand
how Drill’s usage of the term “unit test” differs from common industry usage. In the industry,
a “unit test” would be one where you test your reader in isolation. Specifically, you give it
an operator definition (the so-called “physical operator” or “sub scan POP” in Drill.)
You’d then grab data and verify that the returned data batches are correct.

Similarly, for the planning side of the plugin, you’d let Drill plan the query, then verify
that the plan JSON is as you expect it to be.

Drill, however, uses “unit test” to mean a system-level test written using JUnit. That
is, most Drill tests run a query and examine the results. The BaseTestQuery class you mentioned
is a JUnit test, but it is a system-level test: it starts up an embedded Drillbit to which
you can send queries. It has helper classes that let you examine results of the entire query
(not just of your reader.) If you construct the correct SQL, your query can include nothing
but a scan and the screen operator. Still, this approach introduces many layers between your
test and your reader. (I call it trying to fix a watch while wearing oven mitts.)

There are two recent additions to Drill’s test tools that may be of interest. First, we
have a simpler way to run system tests based on a “cluster test fixture”. BaseTestQuery
provides very poor control over boot-time configuration, but the test fixture gives you much
better control. Plus, the new fixture lets you reuse the “TestBuilder” classes from BaseTestQuery
while also providing very easy ways to run queries, time results and so on. Check out the
package-info in [3] and the example test in [4]. Unfortunately, this code has the same Maven
packaging issues as described above.

Of course, even the simplified test fixture is still a system test. We are in the process
of checking in a new set of “sub-operator” unit test fixtures that enable true unit tests:
you test only your code. See DRILL-5323 and DRILL-5318. Those PRs will be followed by a complete
set of tests for the sort operator. I can point you to my personal dev branch if you want
a preview.

With these tools, you can set up to run just your own reader, then set up expected results
and validate that things work as expected. Unit tests let you verify behavior at a very fine
grain: verify each kind of column data type, verify filters you wish to push and so on. This
is important because Drill suffers from a very large number of minor bugs: bugs that are hard
to find using system tests, but which become obvious when using true unit tests.

The in-flight version of the test framework was built for an “internal” operator (the
sort.) Some work will be required to extend the tests to work with a reader (and to refactor
the reader so it does not depend on a running Drillbit.) This is a worthwhile effort that
I can help with if you want to go this route.


- Paul

[1] http://stackoverflow.com/questions/14722873/sharing-src-test-classes-between-modules-in-a-multi-module-maven-project
[2] http://maven.apache.org/guides/mini/guide-attached-tests.html
[3] https://github.com/apache/drill/blob/master/exec/java-exec/src/test/java/org/apache/drill/test/package-info.java
[4] https://github.com/apache/drill/blob/master/exec/java-exec/src/test/java/org/apache/drill/test/ExampleTest.java

> On Apr 20, 2017, at 11:23 AM, François Méthot <fmethot78@gmail.com> wrote:
> Hi,
>   I need to develop unit test of our storage plugins and if possible I
> would like to borrow from the tests done in "TestCsvHeader.java" and other
> classes in that package.
> Those tests depends on  BaseTestQuery, DrillTest and ExecTest classes which
> are not packaged in the Drill release (please correct me if I am wrong).
> Are those jar shared somewhere for Storage Plugin Developer that rely on
> the pre-built jar?
> Thanks
> Francois

