hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "Hive/DeveloperGuide" by AshishThusoo
Date Mon, 15 Dec 2008 19:58:45 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by AshishThusoo:

The comment on the change is:
Start filling out the developer guide.

  = Developer Guide =
  == Code Organization and a brief architecture ==
  === Introduction ===
+ Hive comprises three main components:
+  * Serializers/Deserializers (trunk/serde) - This component contains the framework libraries that allow users to develop serializers and deserializers for their own data formats. It also contains some built-in serialization/deserialization families.
+  * MetaStore (trunk/metastore) - This component implements the metadata server, which holds all the information about the tables and partitions in the warehouse.
+  * Query Processor (trunk/ql) - This component implements the processing framework that converts SQL into a graph of map/reduce jobs, and the execution-time framework that runs those jobs in dependency order.
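To illustrate what running jobs "in the order of dependencies" means, here is a minimal Python sketch. It is not Hive's implementation: the job names and the `execution_order` helper are hypothetical, and a plain topological sort from the standard library stands in for whatever scheduling the query processor actually performs.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def execution_order(dependencies):
    """Return job names ordered so each job follows all jobs it depends on.

    `dependencies` maps a job name to the set of jobs that must finish first.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical plan: two scan jobs feed a join, which feeds an aggregation.
plan = {
    "scan_a": set(),
    "scan_b": set(),
    "join": {"scan_a", "scan_b"},
    "aggregate": {"join"},
}

order = execution_order(plan)
```

In this sketch both scan jobs come before `join`, and `aggregate` comes last. The relative order of the two independent scan jobs is not significant.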
+ Apart from these major components, Hive also contains the following components:
+  * Command Line Interface (trunk/cli) - This component contains all the Java code used by the Hive command line interface.
+  * Hive Server (trunk/service) - This component implements all the APIs that can be used
by other clients (such as JDBC drivers) to talk to Hive.
+  * Common (trunk/common) - This component contains common infrastructure needed by the rest of the code. Currently, it contains all the Java sources for managing Hive configurations (HiveConf) and passing them to the other code components.
+  * Ant Utilities (trunk/ant) - This component contains the implementation of some Ant tasks that are used by the build infrastructure.
+  * Scripts (trunk/bin) - This component contains all the scripts provided in the distribution, including the scripts to run the Hive CLI (bin/hive).
+ The following top-level directories contain helper libraries, packaged configuration files, and other supporting files:
+  * trunk/conf - This directory contains the packaged hive-default.xml and hive-site.xml.
+  * trunk/data - This directory contains some data sets and configurations used in the hive
+  * trunk/ivy - This directory contains the Ivy files used by the build infrastructure to manage dependencies on different Hadoop versions.
+  * trunk/lib - This directory contains the runtime libraries needed by Hive.
+  * trunk/testlibs - This directory contains the junit.jar used by the junit target in the
build infrastructure.
+  * trunk/testutils (Deprecated)
  === SerDe ===
  === MetaStore ===
  === Query Processor ===
