hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "Hive/TipsForAddingNewTests" by JohnSichi
Date Wed, 23 Jun 2010 23:55:20 GMT
Dear Wiki user,

The "Hive/TipsForAddingNewTests" page has been changed by JohnSichi.
http://wiki.apache.org/hadoop/Hive/TipsForAddingNewTests?action=diff&rev1=2&rev2=3

--------------------------------------------------

  #pragma section-numbers 2
  = Tips for Adding New Tests in Hive =
  
- Following are a few rules of thumb that should be followed when adding new test cases in
Hive that require the introduction of new query file(s). Of course, these rules should not
be applied if they invalidate the purpose of your test to begin with. These are genrally helpful
in - keeping the test queries concise, minimizing the redundancies where possible, and ensuring
that cascading failures due to a single test failure do not occur. 
+ Following are a few rules of thumb that should be followed when adding new test cases in
Hive that require the introduction of new query file(s). Of course, these rules should not
be applied if they invalidate the purpose of your test to begin with. These are generally
helpful in keeping the test queries concise, minimizing the redundancies where possible, and
ensuring that cascading failures due to a single test failure do not occur. 
  
   * Instead of creating your own data file for loading into a new table, use existing data
from staged tables like {{{src}}}.
-  * If your test requires a {{{SELECT}}} query, limit it to a single {{{SELECT}}} statement
per table as these are generally heavily exercised by a majority of tests.
+  * If your test requires a {{{SELECT}}} query, keep it as simple as possible, and minimize
the number of queries to keep overall test time down; avoid repeating scenarios which are
already covered by existing tests.
-  * If you must use a {{{SELECT}}} statement, make sure you use the {{{ORDER BY}}} clause
to minimize the chances of spurious diffs due to output order differences leading to test
failures.
+  * When you do need to use a {{{SELECT}}} statement, make sure you use the {{{ORDER BY}}}
clause to minimize the chances of spurious diffs due to output order differences leading to
test failures.
   * Limit your test to one table unless you require multiple tables specifically.
   * Start the query specification with an explicit {{{DROP TABLE}}} directive to make sure
that any upstream test failures that could not clean up do not cause your test to fail.
   * End the query specification with explicit {{{DROP TABLE}}} directive to drop the table(s)
you may have created during the course of the test.
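
Taken together, the tips above suggest a query file shaped roughly like the sketch below. It is only an illustration: the table name `tips_example` and the `key < 10` filter are hypothetical, and it assumes the staged `src` table exposes `key` and `value` string columns, as it does in the standard Hive test data.

```sql
-- Hypothetical query file; tips_example is an illustrative table name.

-- Drop any leftover table first, in case an upstream test failed before cleaning up.
DROP TABLE tips_example;

-- Use a single table and populate it from the staged src table
-- rather than loading a new data file.
CREATE TABLE tips_example (key STRING, value STRING);

INSERT OVERWRITE TABLE tips_example
SELECT key, value FROM src WHERE key < 10;

-- Keep the SELECT simple, and use ORDER BY so the output order is deterministic
-- and does not produce spurious diffs against the expected results.
SELECT key, value FROM tips_example ORDER BY key, value;

-- Drop the table created by this test.
DROP TABLE tips_example;
```

Ordering by both columns is a deliberate choice here: `src` is assumed to contain repeated keys, so ordering by `key` alone could still leave ties in an unspecified order.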
