hadoop-common-user mailing list archives

From Ted Dunning <tdunn...@veoh.com>
Subject Re: HDFS access/Jython examples
Date Mon, 07 Apr 2008 14:58:52 GMT


Here is the implementation I use in Groovy.  Jython should be nearly as
concise, except that Jython may expect more out of a file reader.  These are
methods on my HadoopFile abstraction, where `name` holds the file's path.

    void withPrintWriter(Closure action) {
        OutputStream os = outputStream()
        PrintWriter pw = new PrintWriter(os)
        try {
            action(pw)
        } finally {
            pw?.close()
        }
    }

    private OutputStream outputStream() {
        Configuration conf = new Configuration()
        //        conf.set("fs.default.name", "metricsapp4:50020")
        FileSystem fs = org.apache.hadoop.fs.FileSystem.get(conf)
        if (fs instanceof LocalFileSystem) {
            new File(name).deleteOnExit()
        }
        // the stream is the method's return value
        return fs.create(new Path(name))
    }

    /**
     * Same as for File
     */
    public void eachLine(Closure action) {
        Hadoop.local {
            Configuration conf = new Configuration()
            //        conf.set("fs.default.name", "metricsapp4:50020")
            FileSystem fs = org.apache.hadoop.fs.FileSystem.get(conf)
            def read = {part ->
                def f = new BufferedReader(new InputStreamReader(fs.open(part)))
                try {
                    f.eachLine(action)
                } finally {
                    f?.close()
                }
            }
            // sometimes the file is really a directory with part-* files,
            // sometimes it is a file.
            if (fs.isFile(new Path(name))) {
                for (part in fs.globPaths(new Path(name))) {read(part)}
            } else {
                for (part in fs.globPaths(new Path(name, "part-*"))) {read(part)}
            }
        }
    }
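
For the Jython side, the file-or-directory handling in eachLine might look
roughly like the sketch below.  It is plain Python against the local
filesystem only, so it shows the shape of the logic rather than real HDFS
access: the function name `each_line` and the use of `glob` and `os.path` are
assumptions made for illustration, and in Jython the `open` and `glob` calls
would have to be replaced by the corresponding Hadoop FileSystem calls used in
the Groovy code.

```python
import glob
import os


def each_line(name, action):
    """Call action(line) for every line found under `name`.

    `name` may be a single file or a directory holding part-* files,
    which are the two cases the Groovy eachLine distinguishes.
    Local filesystem only; this is a stand-in for the HDFS calls.
    """
    if os.path.isfile(name):
        parts = [name]
    else:
        # sorted() gives part-00000, part-00001, ... a stable order
        parts = sorted(glob.glob(os.path.join(name, "part-*")))

    for part in parts:
        f = open(part)
        try:
            for line in f:
                # hand over the line without its trailing newline
                action(line.rstrip("\n"))
        finally:
            # always release the handle, even if action raises
            f.close()
```

Calling `each_line("output", handler)` with a hypothetical `output` directory
would pass every line of every part file to `handler`; pointing it at a single
file reads just that file.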

On 4/7/08 7:48 AM, "Andreas Kostyrka" <andreas@kostyrka.org> wrote:

> Hi!
> 
> I just wondered if there is some Jython example that shows how to access
> the HDFS from Jython, without running a mapreduce?
> 
> Andreas

