hadoop-pig-dev mailing list archives

From "Pradeep Kamath" <prade...@yahoo-inc.com>
Subject Use of Minicluster in unit tests
Date Wed, 31 Dec 2008 00:53:15 GMT
Hi,

   MiniCluster creates a Hadoop cluster on the machine running the unit
tests so that scripts can be tested end to end. Currently, the unit
tests that use MiniCluster create a temporary file on the local file
system, rather than on the MiniCluster's DFS, and pass it to the load
statement with a "file:" prefix. To be more correct, we should use the
MiniCluster's DFS so that we truly test end-to-end execution in a Hadoop
execution environment. As part of
https://issues.apache.org/jira/browse/PIG-580, I have added a couple of
static functions to test/org/apache/pig/test/Util.java that create and
delete input files on the MiniCluster's DFS. Here is a snippet
illustrating their use. Patch submitters should consider creating input
files this way in their unit tests if they plan to use MiniCluster.

 

== code snippet begins ==

String input[] = {
                "pig1\t18\t2.1",
                "pig2\t24\t3.3",
                "pig5\t45\t2.4",
                "pig1\t18\t2.1",
                "pig1\t19\t2.1",
                "pig2\t24\t4.5",
                "pig1\t20\t3.1" };

Util.createInputFile(cluster, "input.txt", input);

// ... use input.txt in the load statement as is

Util.deleteFile(cluster, "input.txt");

== code snippet ends ==

 

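One practical consequence of the helpers: createInputFile refuses to overwrite an existing file, so a test that fails before reaching deleteFile can make the next test that uses the same file name fail with "already exists". Doing the delete in a finally block avoids that. Below is a minimal sketch of the create / use / delete cycle. It is NOT the Pig code: it assumes a temporary local directory (via java.nio.file) as a stand-in for the MiniCluster's DFS so that it runs without Hadoop, and the class name InputFileLifecycle and the method runOnce are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

public class InputFileLifecycle {

    // Stand-in for Util.createInputFile: refuses to overwrite an existing
    // file and writes one line per array element in UTF-8.
    static void createInputFile(Path dir, String fileName, String[] inputData)
            throws IOException {
        Path file = dir.resolve(fileName);
        if (Files.exists(file)) {
            throw new IOException("File " + fileName + " already exists");
        }
        Files.write(file, Arrays.asList(inputData), StandardCharsets.UTF_8);
    }

    // Stand-in for Util.deleteFile: removes the file if it is present.
    static void deleteFile(Path dir, String fileName) throws IOException {
        Files.deleteIfExists(dir.resolve(fileName));
    }

    // One create / use / delete cycle. The "use" step here just reads the
    // file back and returns the line count; in a real test it would be the
    // load statement and the assertions on its output.
    static int runOnce(Path dir, String[] input) throws IOException {
        createInputFile(dir, "input.txt", input);
        try {
            List<String> lines =
                Files.readAllLines(dir.resolve("input.txt"), StandardCharsets.UTF_8);
            return lines.size();
        } finally {
            // Runs even if the "use" step throws, so the next test
            // does not trip over a leftover input.txt.
            deleteFile(dir, "input.txt");
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("minicluster-standin");
        String[] input = { "pig1\t18\t2.1", "pig2\t24\t3.3" };
        System.out.println(runOnce(dir, input));
        // A second run only succeeds because the first one cleaned up.
        System.out.println(runOnce(dir, input));
        Files.delete(dir);
    }
}
```

In a real unit test the same shape applies with Util.createInputFile(cluster, ...) before the try and Util.deleteFile(cluster, ...) in the finally.
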
Thanks,

Pradeep

 

P.S: For people wanting to use these functions BEFORE
https://issues.apache.org/jira/browse/PIG-580 is committed, here is the
code:

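In MiniCluster.java:

    // Expose the MiniCluster's DFS handle so tests can create files on it.
    public FileSystem getFileSystem() {
        return m_fileSys;
    }

In Util.java:

    // Imports needed by the two helpers below.
    import java.io.IOException;
    import java.io.OutputStreamWriter;
    import java.io.PrintWriter;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Creates fileName on the MiniCluster's DFS, one line per array element.
    // Fails rather than overwriting an existing file.
    static public void createInputFile(MiniCluster miniCluster, String fileName,
                                       String[] inputData)
    throws IOException {
        FileSystem fs = miniCluster.getFileSystem();
        if (fs.exists(new Path(fileName))) {
            throw new IOException("File " + fileName
                    + " already exists on the minicluster");
        }
        FSDataOutputStream stream = fs.create(new Path(fileName));
        PrintWriter pw = new PrintWriter(new OutputStreamWriter(stream, "UTF-8"));
        try {
            for (int i = 0; i < inputData.length; i++) {
                pw.println(inputData[i]);
            }
        } finally {
            // Closing the writer also flushes and closes the DFS stream.
            pw.close();
        }
    }

    // Deletes fileName from the MiniCluster's DFS (recursively if it is a
    // directory).
    static public void deleteFile(MiniCluster miniCluster, String fileName)
    throws IOException {
        FileSystem fs = miniCluster.getFileSystem();
        fs.delete(new Path(fileName), true);
    }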

 

 

