hadoop-common-user mailing list archives

From "Ralf Heyde" <ralf.he...@gmx.de>
Subject Native HDFS Write Text & JAQL Execution
Date Fri, 09 Sep 2011 15:41:21 GMT
Hello again,


I think I have misunderstood something about writing files to HDFS and
processing them in JAQL.


I have some sample data, represented as a set of objects.

I transform these objects into a JSON string.


I'm writing the JSON data directly to an HDFS file through my HDFS client code:



Configuration config = new Configuration();

// add the hadoop configuration files residing in the installation path of

config.addResource(new Path("core-site.xml"));

// pass the username and password required to access the HDFS (set up on the

config.set("hadoop.job.ugi", "hadoop, password");

FileSystem fs = FileSystem.get(config);


Path path = new Path("/sampledata");

fs.mkdirs( path );


Path file = new Path( path, "samplefile.json" );


FSDataOutputStream fos = fs.create( file );


// collect sample data

Collection<Entry> entries = MockFactory.createEntries();

// build JSON

JSONArray jsonArray = JSONBuilder.buildSomeTwitterJSON(entries);


// write JSON to HDFS

fos.writeBytes( jsonArray.toString() );

// close the stream so the data is flushed to HDFS

fos.close();
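
A side note on the write step, separate from the SequenceFile error: FSDataOutputStream extends java.io.DataOutputStream, and DataOutputStream.writeBytes(String) writes only the low eight bits of each character. JSON that contains non-ASCII text may therefore not survive the fos.writeBytes(...) call above. The following is a minimal sketch of the difference, using an in-memory stream in place of HDFS. The class name, method names, and sample string are hypothetical and not part of the original code.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Illustrative only: compares two ways of pushing a JSON string through a DataOutputStream.
final class JsonBytesSketch {

    // Mirrors fos.writeBytes(json): only the low byte of each char is written.
    static byte[] viaWriteBytes(String json) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeBytes(json);
        }
        return buffer.toByteArray();
    }

    // Encodes the string as UTF-8 first, then writes the raw bytes.
    static byte[] viaUtf8(String json) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.write(json.getBytes(StandardCharsets.UTF_8));
        }
        return buffer.toByteArray();
    }

    // Reads the bytes back the way a UTF-8 text reader would.
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample record containing one non-ASCII character.
        String json = "[{\"authorurl\":\"http://example.org/m\u00fcller\",\"status_id\":1}]";

        System.out.println("UTF-8 round trip intact:      " + json.equals(decode(viaUtf8(json))));
        System.out.println("writeBytes round trip intact: " + json.equals(decode(viaWriteBytes(json))));
    }
}
```

If the data is pure ASCII, both variants produce the same bytes. Otherwise, assuming the reader expects UTF-8, replacing the call with fos.write( jsonArray.toString().getBytes("UTF-8") ) would be the equivalent change in the HDFS client code.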






Now I would like to run a JAQL script, but I get an error saying that the
input file is not a SequenceFile.


// Read

$sampledata = read(hdfs("/sampledata/samplefile.json"));


// Query 1: filter and transform

$sampledata

  -> filter $.status_id == 1

  -> transform { $.authorurl, $.datum };



Can someone give me a hint to correct my misunderstanding?





