impala-dev mailing list archives

From Jim Apple <jbap...@cloudera.com>
Subject Re: New Impala contributors: IMPALA-5754
Date Wed, 06 Sep 2017 17:35:22 GMT
I have posted a link on the ticket to
https://lists.apache.org/thread.html/6fbcfa650cbb920e2b517ae643bcd0859f1ba0368451d2949eda274d@%3Cdev.impala.apache.org%3E.
I hope to write some more of these, after which perhaps I should make
a space on the wiki to hold them all.

On Wed, Sep 6, 2017 at 10:08 AM, Todd Lipcon <todd@cloudera.com> wrote:
> Hey Jim,
>
> This is a great tutorial, thanks for posting it. One thought: would be
> great to put this somewhere on the web -- either as a blog post or wiki
> entry, so if someone googles they are more likely to find it. (sometimes
> mailing list archives are harder to bring up in google results)
>
> On Wed, Sep 6, 2017 at 10:05 AM, Jim Apple <jbapple@cloudera.com> wrote:
>
>> If you'd like to contribute a patch to Impala, but aren't sure what
>> you want to work on, you can look at Impala's newbie issues:
>> https://issues.apache.org/jira/issues/?filter=12341668. You can find
>> detailed instructions on submitting patches at
>> https://cwiki.apache.org/confluence/display/IMPALA/Contributing+to+Impala.
>> This is a walkthrough of a ticket a new contributor could take on,
>> with hopefully enough detail to get you going but not so much to take
>> away the fun.
>>
>> How can we fix https://issues.apache.org/jira/browse/IMPALA-5754,
>> "rand() algorithm is very non-random"? This is a partial walk-through
>> of how to get started.
>>
>> Set up your development environment. Then, look for where we might
>> first write a failing test. The test case given in the ticket is
>> "select count(distinct(rand(867-5309))), count(*) from alltypes a,
>> alltypes b;". Tests that run a full query are considered "end-to-end
>> tests".
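To make concrete what the ticket's query measures, here is a minimal Python sketch. It is not Impala code: distinct_ratio, weak_lcg, and strong_values are hypothetical names, and the small-modulus generator is only a stand-in for a poor rand(), not Impala's actual algorithm. The idea is that count(distinct rand(seed)) should come out close to count(*) when the generator is good.

```python
import random

SEED = 867 - 5309  # the seed expression used in the ticket's query


def distinct_ratio(values):
    """Return (number of distinct values) / (number of values)."""
    values = list(values)
    if not values:
        raise ValueError("need at least one value")
    return len(set(values)) / len(values)


def weak_lcg(seed, n, modulus=2**10):
    """Hypothetical weak generator: only `modulus` possible states,
    so it can never produce more than `modulus` distinct outputs."""
    state = seed % modulus
    for _ in range(n):
        state = (state * 5 + 3) % modulus
        yield state / modulus


def strong_values(seed, n):
    """Draw n values from Python's own generator, for comparison."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


if __name__ == "__main__":
    n = 10000
    print("weak:  ", distinct_ratio(weak_lcg(SEED, n)))
    print("strong:", distinct_ratio(strong_values(SEED, n)))
```

A regression test for the ticket could assert something similar on the query's two output columns, although the exact threshold is a judgment call for the patch author.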
>>
>> End-to-end tests are described in two ways: .test files and .py files.
>>
>> .test files contain queries and their expected results. For example:
>>
>> ====
>> ---- QUERY
>> # Regression test for IMPALA-938
>> select smallint_col, int_col, (cast("1970-01-01" as timestamp) +
>> interval smallint_col days)
>> from functional.alltypes where smallint_col = 1 limit 1
>> ---- RESULTS
>> 1,1,1970-01-02 00:00:00
>> ---- TYPES
>> smallint, int, timestamp
>> ====
>>
>> That is taken from
>> testdata/workloads/functional-query/queries/QueryTest/exprs.test.
>> That's a good test file to add a test case to, since it is testing
>> "exprs", and the bug is in MathFunctions::Rand, which is defined in
>> be/src/exprs.
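The layout of the .test example quoted above can be sketched in Python as follows. This is an illustration of the format as shown in this message only: parse_test_cases is a hypothetical helper, not the parser the Impala test framework uses, and real .test files may contain sections or syntax this sketch does not handle.

```python
SAMPLE = """\
====
---- QUERY
# Regression test for IMPALA-938
select smallint_col, int_col, (cast("1970-01-01" as timestamp) +
interval smallint_col days)
from functional.alltypes where smallint_col = 1 limit 1
---- RESULTS
1,1,1970-01-02 00:00:00
---- TYPES
smallint, int, timestamp
====
"""


def parse_test_cases(text):
    """Split .test-style text into a list of {section_name: body} dicts.

    Assumes a line of "====" separates test cases and a line starting
    with "---- " opens a named section, as in the sample above.
    """
    cases, sections, name = [], {}, None

    def flush():
        # Store the sections collected so far as one test case.
        if sections:
            cases.append({k: "\n".join(v) for k, v in sections.items()})

    for line in text.splitlines():
        if line.strip() == "====":
            flush()
            sections, name = {}, None
        elif line.startswith("---- "):
            name = line[len("---- "):].strip()
            sections[name] = []
        elif name is not None:
            sections[name].append(line)
    flush()  # handle a final case with no closing separator
    return cases


if __name__ == "__main__":
    for case in parse_test_cases(SAMPLE):
        for section, body in case.items():
            print(section, "->", body)
```

Seen this way, adding a test case for the ticket means appending one more QUERY/RESULTS/TYPES group between "====" separators.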
>>
>> First, let's run all of the exprs tests to see that they pass. You can
>> see them called in tests/query_test/test_exprs.py. The Python scripts
>> in tests/ can run these .test files by calling ImpalaTestSuite's
>> run_test_case() method with an abbreviated name of the .test file. In
>> test_exprs.py, this looks like
>>
>> self.run_test_case('QueryTest/exprs', vector)
>>
>> That call is in the method TestExprs.test_exprs(); you can invoke it with:
>>
>> ./bin/impala-py.test tests/query_test/test_exprs.py::TestExprs::test_exprs --sanity
>>
>> This should take about 40 seconds and should pass, indicated by a
>> return value of 0 and a green line printed to the terminal reading:
>>
>> ...====== 1 passed in 39.85 seconds ======...
>>
>> Now add a test case, following the example from the ticket and the
>> format in exprs.test. Run the test again; it should fail.
>>
>> Fix the bug and run the test again. Once the test is passing, follow
>> the instructions on the wiki to send your patch for code review:
>> https://cwiki.apache.org/confluence/display/IMPALA/Contributing+to+Impala
>>
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
