hadoop-common-issues mailing list archives

From "Arun C Murthy (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-6332) Large-scale Automated Test Framework
Date Sun, 25 Oct 2009 06:57:59 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-6332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12769758#action_12769758 ]

Arun C Murthy commented on HADOOP-6332:
---------------------------------------

Some utility APIs to give a flavour of what we are trying to accomplish:

{noformat}
  /**
   * Sources of logs and outputs.
   */
  public enum LogSource {
    NAMENODE,
    DATANODE,
    JOBTRACKER,
    TASKTRACKER,
    TASK
  }

  /**
   * Set up a Hadoop cluster.
   * @param conf {@link Configuration} for the cluster
   * @throws IOException
   */
  public static void setupCluster(Configuration conf) throws IOException;
  
  /**
   * Tear down the Hadoop cluster.
   * @param conf {@link Configuration} for the cluster
   * @throws IOException
   */
  public static void tearDownCluster(Configuration conf) throws IOException;

  /**
   * Kill all Hadoop daemons running on the given rack.
   * @param cluster Map-Reduce {@link Cluster}
   * @param rackId rack on which all map-reduce daemons should be killed
   * @throws IOException
   * @throws InterruptedException
   */
  public static void killRack(Cluster cluster, String rackId) 
  throws IOException, InterruptedException;

  /**
   * Fetch logs from the hadoop daemon from <code>startTime</code> to 
   * <code>endTime</code> and place them in <code>dst</code>.
   * @param cluster Map-Reduce {@link Cluster}
   * @param daemon hadoop daemon from which to fetch logs
   * @param startTime start time
   * @param endTime end time
   * @param dst destination for storing fetched logs
   * @throws IOException
   */
  public static void fetchDaemonLogs(Cluster cluster, Testable daemon, 
                                     long startTime, long endTime, 
                                     Path dst) 
  throws IOException;

  /**
   * Fetch daemon logs and check if they contain the <code>pattern</code>.
   * @param cluster map-reduce <code>Cluster</code>
   * @param source log source
   * @param startTime start time
   * @param endTime end time
   * @param pattern pattern to check
   * @param fetch if <code>true</code> fetch the logs into <code>dir</code>,
   *              else do not fetch
   * @param dir directory to place the fetched logs
   * @return <code>true</code> if the logs contain <code>pattern</code>,
   *         <code>false</code> otherwise
   * @throws IOException
   */
  public static boolean checkDaemonLogs(Cluster cluster, 
                                        LogSource source,
                                        long startTime, long endTime,
                                        String pattern,
                                        boolean fetch, Path dir)
  throws IOException;

{noformat}
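
To make the contract of checkDaemonLogs more concrete, here is a minimal sketch of only the time-window and pattern check, run over log lines that have already been fetched. Everything in it is an assumption for illustration: the LogScan class and containsPattern method are hypothetical names, the pattern is treated as a regular expression, and each line is assumed to start with an epoch-millisecond timestamp followed by a space (real daemon logs use a different timestamp format and would need their own parser).

```java
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of the proposed API. */
final class LogScan {
  private LogScan() {}

  /**
   * Returns true if any line stamped within [startTime, endTime] contains a
   * match for the regular expression.
   * Assumed line format: "<epochMillis> <message>".
   */
  static boolean containsPattern(List<String> lines, long startTime,
                                 long endTime, String pattern) {
    Pattern p = Pattern.compile(pattern);
    for (String line : lines) {
      int space = line.indexOf(' ');
      if (space <= 0) {
        continue; // no timestamp prefix, skip the line
      }
      long ts;
      try {
        ts = Long.parseLong(line.substring(0, space));
      } catch (NumberFormatException e) {
        continue; // e.g. a stack-trace continuation line
      }
      if (ts >= startTime && ts <= endTime
          && p.matcher(line.substring(space + 1)).find()) {
        return true;
      }
    }
    return false;
  }
}
```

A test would call something like LogScan.containsPattern(lines, startTime, endTime, "ERROR") after the logs for the chosen LogSource have been collected.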

----

Each of these utility methods will very likely delegate to shell scripts or similar to do the
actual work. The point is that the person implementing a specific test case need not worry
about those details and can keep working in the junit environment that Hadoop developers
already know.
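
As a rough sketch of that delegation, a utility method could wrap a script invocation with the standard java.lang.ProcessBuilder and surface failures as an IOException, which matches the throws clauses in the signatures above. The ShellRunner class and its run method are hypothetical names, and the choice to merge stderr into stdout is an assumption, not something the proposal specifies.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

/** Hypothetical helper for illustration; not part of the proposed API. */
final class ShellRunner {
  private ShellRunner() {}

  /**
   * Runs the command and returns its combined stdout/stderr.
   * Throws IOException if the command exits with a non-zero code.
   */
  static String run(List<String> command)
      throws IOException, InterruptedException {
    ProcessBuilder pb = new ProcessBuilder(command);
    pb.redirectErrorStream(true); // merge stderr into stdout
    Process proc = pb.start();
    // Drain the output before waiting so a chatty script cannot block
    // on a full pipe buffer.
    String output = new String(proc.getInputStream().readAllBytes(),
                               StandardCharsets.UTF_8);
    int exit = proc.waitFor();
    if (exit != 0) {
      throw new IOException("Command " + command + " exited with " + exit
                            + ": " + output);
    }
    return output;
  }
}
```

Under this sketch, killRack might reduce to a single call such as ShellRunner.run(List.of("kill-rack.sh", rackId)), where kill-rack.sh is a made-up script name.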


> Large-scale Automated Test Framework
> ------------------------------------
>
>                 Key: HADOOP-6332
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6332
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: test
>            Reporter: Arun C Murthy
>             Fix For: 0.21.0
>
>
> Hadoop would benefit from having a large-scale, automated test framework. This jira
> is meant to be a master-jira to track relevant work.
> ----
> The proposal is a junit-based, large-scale test framework which would run against _real_
> clusters.
> There are several pieces we need to achieve this goal:
> # A set of utilities we can use in junit-based tests to work with real, large-scale hadoop
> clusters. E.g. utilities to deploy, start & stop clusters, bring down tasktrackers,
> datanodes, entire racks of both etc.
> # Enhanced control-ability and inspect-ability of the various components in the system,
> e.g. daemons such as namenode, jobtracker should expose their data-structures for query/manipulation
> etc. Tests would be much more relevant if we could, for example, query for specific states of the
> jobtracker, scheduler etc. Clearly these apis should _not_ be part of the production clusters,
> hence the proposal is to use aspectj to weave these new apis into debug-deployments.
> ----
> Related note: we should break up our tests into at least 3 categories:
> # src/test/unit -> Real unit tests using mock objects (e.g. HDFS-669 & MAPREDUCE-1050).
> # src/test/integration -> Current junit tests with Mini* clusters etc.
> # src/test/system -> HADOOP-6332 and its children

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

