hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "Hive/HowToContribute" by Ning Zhang
Date Thu, 29 Oct 2009 06:24:25 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Hive/HowToContribute" page has been changed by Ning Zhang.
http://wiki.apache.org/hadoop/Hive/HowToContribute?action=diff&rev1=8&rev2=9

--------------------------------------------------

  = How to Contribute to Apache Hive =
- 
  This page describes the mechanics of ''how'' to contribute software to Apache Hive.  For
ideas about ''what'' you might contribute, please see open tickets in [[https://issues.apache.org/jira/browse/HIVE|Jira]].
  
  <<TableOfContents(3)>>
  
  == Getting the source code ==
- 
  First of all, you need the Hive source code.<<BR>>
  
  Get the source code on your local drive using [[http://hadoop.apache.org/hive/version_control.html|SVN]].
 Most development is done on the "trunk":
@@ -15, +13 @@

  {{{
  svn checkout http://svn.apache.org/repos/asf/hadoop/hive/trunk hive-trunk
  }}}
- 
  == Setting up Eclipse Development Environment (Optional) ==
  This is an optional step.  Eclipse has many advanced features for Java development,
and it makes life much easier for Hive developers as well.
  
  [[Hive/GettingStarted/EclipseSetup|How to set up Eclipse for Hive development]]
  
  == Making Changes ==
- 
  Before you start, send a message to the [[http://hadoop.apache.org/hive/mailing_lists.html#Developers|Hive
developer mailing list]], or file a bug report in [[https://issues.apache.org/jira/browse/HIVE|Jira]].
 Describe your proposed changes and check that they fit in with what others are doing and
have planned for the project. Be patient, it may take folks a while to understand your requirements.
  
  Modify the source code and add some (very) nice features using your favorite IDE.<<BR>>
  
  But pay attention to the following points:
+ 
   * All public classes and methods should have informative [[http://java.sun.com/j2se/javadoc/writingdoccomments/|Javadoc
comments]].
    * Do not use @author tags.
   * Code should be formatted according to [[http://java.sun.com/docs/codeconv/|Sun's conventions]],
with one exception:
@@ -39, +36 @@

    * You can run all the unit tests with the command {{{ant test}}}, or you can run a specific
unit test with the command {{{ant -Dtestcase=<class name without package prefix> test}}}
(for example {{{ant -Dtestcase=TestFileSystem test}}}).
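
Because -Dtestcase expects the class name without its package prefix, a small shell helper can strip the prefix from a fully qualified name. This is only a sketch: the function name and the example package are hypothetical and not part of Hive.

```shell
# Hypothetical helper (not part of Hive): strip the package prefix from a
# fully qualified Java class name, since -Dtestcase expects the bare name.
simple_class_name() {
    # ${1##*.} removes everything up to and including the last dot.
    printf '%s\n' "${1##*.}"
}

# Example use (the package name here is only illustrative):
# ant -Dtestcase="$(simple_class_name org.example.TestFileSystem)" test
```

A name that has no package prefix passes through unchanged.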
  
  === Understanding Ant ===
- 
- Hive is built by Ant, a Java building tool.  
+ Hive is built by Ant, a Java building tool.
  
   * Good Ant tutorial: http://i-proving.ca/space/Technologies/Ant+Tutorial
  
  === Unit Tests ===
+ Please make sure that all unit tests succeed before and after applying your patch and that
no new javac compiler warnings are introduced by your patch. You can specify the hadoop version
with -Dhadoop.version="<your-hadoop-version>" if your hadoop version is not the default.
- 
- Please make sure that all unit tests succeed before and after applying your patch and that
no new javac compiler warnings are introduced by your patch.
- You can specify the hadoop version with -Dhadoop.version="<your-hadoop-version>" if
your hadoop version is not the default.
  
  {{{
  > cd hive-trunk
  > ant clean test tar -logfile ant.log
  }}}
  After a while, if you see
+ 
  {{{
  BUILD SUCCESSFUL
  }}}
  all is ok, but if you see
+ 
  {{{
  BUILD FAILED
  }}}
  then you should fix things before proceeding. Running
+ 
  {{{
  > ant testreport
  }}}
@@ -79, +76 @@

     * Run "ant test -Dtestcase=TestCliDriver -Dqfile=XXXXXX.q -Doverwrite=true -Dtest.silent=false".
This will generate a new XXXXXX.q.out file in ql/src/test/results/clientpositive.
    * If the feature is added in contrib
     * Do the steps above, replacing "ql" with "contrib", and "TestCliDriver" with "TestContribCliDriver".
+ === Debugging Hive code ===
+ 
+ Hive code includes both client-side code (the compiler, semantic analyzer, and optimizer of HiveQL)
and server-side code (operator and task implementations). The client-side code runs
on your local machine, so you can debug it in Eclipse the same way you debug any
regular local Java code.  The server-side code is distributed and runs on the Hadoop cluster,
so debugging it is somewhat more complicated. Nonetheless, we can still
attach the debugger to a separate JVM under unit test (single-machine mode). Below are the
steps for debugging server-side code.
+ 
+ 
+  * Compile the Hive code with javac.debug=on. From the Hive checkout directory, run:
+ {{{
+     > ant -Djavac.debug=on package
+ }}}
+ If you have already built Hive without javac.debug=on, you can clean the build and then
run the above command.
+ {{{
+     > ant clean  # not necessary if this is the first build
+     > ant -Djavac.debug=on package
+ }}}
+  
+  
+  * Run ant test with additional options that tell the Java VM to wait for a debugger
to attach.
+ First define some convenient variables for debugging. You can put them in your .bashrc or .cshrc (in .cshrc, use setenv rather than export).

+ {{{
+     > export HIVE_DEBUG_PORT=8000
+     > export HIVE_DEBUG="-Xdebug -Xrunjdwp:transport=dt_socket,address=${HIVE_DEBUG_PORT},server=y,suspend=y"
+ }}}
+ In particular, HIVE_DEBUG_PORT is the port on which the JVM listens and to which the debugger
should attach. Then run the unit test as follows:
+ {{{
+     > export HADOOP_OPTS="$HIVE_DEBUG"; ant test -Dtestcase=TestCliDriver -Dqfile=<mytest>.q
+ }}}
+ 
+ The unit test will run until it shows:
+ {{{
+      [junit] Listening for transport dt_socket at address: 8000
+ }}}
+ 
+   * Now you can attach jdb to port 8000 to start debugging:
+ {{{
+     > jdb -attach 8000  
+ }}}
+ or, better, if you are running Eclipse and the projects are already imported, you can debug
with Eclipse. Under Run -> Debug Configurations, find "Remote Java Application"
at the bottom of the left panel. There should already be a MapRedTask configuration. If there
is no such configuration, you can create one with the following properties:
+ 
+ Project:  the Hive project that you imported.
+ Connection Type: Standard (Socket Attach)
+ Connection Properties: Host: localhost  Port: 8000
+ 
+ Then hit the "Debug" button; Eclipse will attach to the JVM listening on port 8000, and the test will continue running.
You can set breakpoints in the source code before hitting "Debug" so that execution stops there.
The rest is the same as debugging client-side Hive code.
+ 
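
The debugger options described in the steps above can also be assembled by a small shell function, which avoids retyping the JDWP string when the port changes. This is a minimal sketch: the function name hive_debug_opts is hypothetical and not part of Hive, and it assumes a POSIX shell.

```shell
# Hypothetical helper (not part of Hive): build the JDWP options string
# from the steps above for a given port. Defaults to port 8000.
hive_debug_opts() {
    port="${1:-8000}"
    printf '%s\n' "-Xdebug -Xrunjdwp:transport=dt_socket,address=${port},server=y,suspend=y"
}

# Example use: pass the options to the test JVM for a single ant invocation.
# HADOOP_OPTS="$(hive_debug_opts 8000)" ant test -Dtestcase=TestCliDriver -Dqfile=<mytest>.q
```

Setting HADOOP_OPTS only for the one command keeps later ant runs from also suspending while they wait for a debugger.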
  
  === Creating a patch ===
  Check to see what files you have modified with:
+ 
  {{{
  svn stat
  }}}
- 
  Add any new files with:
+ 
  {{{
  svn add .../MyNewClass.java
  svn add .../TestMyNewClass.java
  svn add .../XXXXXX.q
  svn add .../XXXXXX.q.out
  }}}
- 
  In order to create a patch, type (from the base directory of hive):
  
  {{{
  svn diff > HIVE-1234.patch
  }}}
- 
- This will report all modifications done on Hive sources on your local disk and save them
into the ''HIVE-1234.patch'' file.  Read the patch file.  
+ This will report all modifications done on Hive sources on your local disk and save them
into the ''HIVE-1234.patch'' file.  Read the patch file.   Make sure it includes ONLY the
modifications required to fix a single issue.
- Make sure it includes ONLY the modifications required to fix a single issue.
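
One quick way to check that a patch touches only the intended files is to list the paths it modifies. The sketch below relies on the "Index: <path>" line that svn diff writes at the start of each file's section. The function name patch_files is hypothetical.

```shell
# Hypothetical helper: list the files touched by a patch produced by
# 'svn diff', which starts each file's section with "Index: <path>".
patch_files() {
    sed -n 's/^Index: //p' "$1"
}

# Example use:
# patch_files HIVE-1234.patch
```

If the list contains files unrelated to the issue, revert those changes or move them into a separate patch.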
  
  Please do not:
+ 
   * reformat code unrelated to the bug being fixed: formatting changes should be separate
patches/commits.
-  * comment out code that is now obsolete: just remove it.  
+  * comment out code that is now obsolete: just remove it.
   * insert comments around each change, marking the change: folks can use subversion to figure
out what's changed and by whom.
   * make things public which are not required by end users.
  
  Please do:
+ 
   * try to adhere to the coding style of files you edit;
   * comment code whose function or rationale is not obvious;
   * update documentation (e.g., ''package.html'' files, this wiki, etc.)
  
  If you need to rename files in your patch:
+ 
   1. Write a shell script that uses 'svn mv' to rename the original files.
   1. Edit files as needed (e.g., to change package names).
   1. Create a patch file with 'svn diff --no-diff-deleted --notice-ancestry'.
   1. Submit both the shell script and the patch file.
+ 
  This way other developers can preview your change by running the script and then applying
the patch.
  
- 
  === Applying a patch ===
- 
- To apply a patch either you generated or found from JIRA, you can issue 
+ To apply a patch either you generated or found from JIRA, you can issue
+ 
  {{{
  patch -p0 < cool_patch.patch
  }}}
  If you just want to check whether the patch applies, you can run patch with the --dry-run option:
+ 
  {{{
  patch -p0 --dry-run < cool_patch.patch
  }}}
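
The two commands above can be combined so that a patch is applied only when the dry run succeeds. This is a sketch that assumes a patch implementation supporting --dry-run, as used above. The function name apply_if_clean is hypothetical.

```shell
# Hypothetical helper: apply a patch only if a dry run succeeds first,
# so a patch that does not apply leaves the working copy untouched.
apply_if_clean() {
    patch -p0 --dry-run < "$1" > /dev/null || {
        echo "patch does not apply cleanly: $1" >&2
        return 1
    }
    patch -p0 < "$1"
}

# Example use (run from the base directory of hive):
# apply_if_clean cool_patch.patch
```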
- 
- If you are an Eclipse user, you can apply a patch by : 1. Right click project name in Package
Explorer , 2. Team -> Apply Patch 
+ If you are an Eclipse user, you can apply a patch by : 1. Right click project name in Package
Explorer , 2. Team -> Apply Patch
  
  == Contributing your work ==
- 
- Finally, patches should be ''attached'' to an issue report in [[http://issues.apache.org/jira/browse/HIVE|Jira]]
via the '''Attach File''' link on the issue's Jira. Please add a comment that asks for a code
review following our [[CodeReviewChecklist| code review checklist]]. Please note that the
attachment should be granted license to ASF for inclusion in ASF works (as per the [[http://www.apache.org/licenses/LICENSE-2.0|Apache
License]] §5). 
+ Finally, patches should be ''attached'' to an issue report in [[http://issues.apache.org/jira/browse/HIVE|Jira]]
via the '''Attach File''' link on the issue's Jira. Please add a comment that asks for a code
review following our [[CodeReviewChecklist|code review checklist]]. Please note that the attachment
should be granted license to ASF for inclusion in ASF works (as per the [[http://www.apache.org/licenses/LICENSE-2.0|Apache
License]] §5).
  
  When you believe that your patch is ready to be committed, select the '''Submit Patch'''
link on the issue's Jira.
  
+ Folks should run {{{ant clean test}}} before selecting '''Submit Patch'''.  Tests should
all pass. If your patch involves performance optimizations, they should be validated by benchmarks
that demonstrate an improvement.
- Folks should run {{{ant clean test}}} before selecting '''Submit Patch'''.  Tests should
all pass.
- If your patch involves performance optimizations, they should be validated by benchmarks
that demonstrate an improvement.
  
  If your patch creates an incompatibility with the latest major release, then you must set
the '''Incompatible change''' flag on the issue's Jira 'and' fill in the '''Release Note'''
field with an explanation of the impact of the incompatibility and the necessary steps users
must take.
  
@@ -156, +196 @@

  
  Committers: for non-trivial changes, it is best to get another committer to review your
patches before commit.  Use '''Submit Patch''' link like other contributors, and then wait
for a "+1" from another committer before committing.  Please also try to frequently review
things in the patch queue.
  
+ 
  == Jira Guidelines ==
- 
  Please comment on issues in Jira, making your concerns known.  Please also vote for issues
that are a high priority for you.
  
  Please refrain from editing descriptions and comments if possible, as edits spam the mailing
list and clutter Jira's "All" display, which is otherwise very useful.  Instead, preview descriptions
and comments using the preview button (on the right) before posting them.  Keep descriptions
brief and save more elaborate proposals for comments, since descriptions are included in Jira's
automatically sent messages.  If you change your mind, note this in a new comment, rather
than editing an older comment.  The issue should preserve this history of the discussion.
  
  == Stay involved ==
- 
  Contributors should join the [[http://hadoop.apache.org/hive/mailing_lists.html|Hive mailing
lists]].  In particular the dev list (to join discussions of changes) and the user list (to
help others).
  
  == See Also ==
- 
   * [[http://www.apache.org/dev/contributors.html|Apache contributor documentation]]
   * [[http://www.apache.org/foundation/voting.html|Apache voting documentation]]
  
