hadoop-common-user mailing list archives

From Pat Ferrel <pat.fer...@gmail.com>
Subject Hadoop 101
Date Wed, 12 Dec 2012 00:49:45 GMT
Stupid question for the day…

I have a file created by a Mahout job, of the form:

0	[356:0.3481597,359:0.3481597,358:0.3481597,361:0.3481597,360:0.3481597]
8	[356:0.34786037,359:0.34786037,358:0.34786037,361:0.34786037,360:0.34786037]
25	[284:0.34821576,286:0.34821576,287:0.34821576,288:0.34821576,289:0.34821576]
28	[452:0.34802154,454:0.34802154,453:0.34802154,456:0.34802154,455:0.34802154]

If this were a SequenceFile I could read it and be merrily on my way, but it's a text file.
The classes written are key/value pairs of <LongWritable, VectorWritable>, but the file
itself is tab-delimited text.

I was hoping to do something like:

SequenceFile.Reader reader = new SequenceFile.Reader(fs, inputFile, conf);
Writable userId = new LongWritable();
VectorWritable recommendations = new VectorWritable();
while (reader.next(userId, recommendations)) {
	// do something with each pair
}
reader.close();

But alas, Google fails me. How do you read key/value pairs from a text file outside of
a map or reduce task?
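
For reference, here is a rough sketch of the kind of thing I am after, written with plain Java I/O and no Hadoop or Mahout classes at all. It assumes each line is a long key, a tab, and a bracketed list of index:value elements, as in the sample above. The class and method names (TextVectorReader, parseVector, readAll) are made up for illustration, and a plain Map stands in for the Mahout vector. Presumably the same loop would work on HDFS by wrapping the stream returned by FileSystem.open(path) in an InputStreamReader, but I have not verified that against this file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: parses tab-delimited "key<TAB>[index:value,...]" lines.
public class TextVectorReader {

    // Turns "[356:0.34,359:0.34]" into an ordered index -> value map.
    public static Map<Integer, Double> parseVector(String text) {
        String body = text.trim();
        if (body.startsWith("[") && body.endsWith("]")) {
            body = body.substring(1, body.length() - 1);
        }
        Map<Integer, Double> vector = new LinkedHashMap<>();
        if (body.trim().isEmpty()) {
            return vector; // empty vector "[]"
        }
        for (String element : body.split(",")) {
            String[] parts = element.split(":", 2);
            if (parts.length != 2) {
                throw new IllegalArgumentException("Bad element: " + element);
            }
            vector.put(Integer.parseInt(parts[0].trim()),
                       Double.parseDouble(parts[1].trim()));
        }
        return vector;
    }

    // Reads every non-blank line and returns key -> vector, in file order.
    public static Map<Long, Map<Integer, Double>> readAll(Reader source)
            throws IOException {
        Map<Long, Map<Integer, Double>> result = new LinkedHashMap<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // skip blank lines
                }
                String[] fields = line.split("\t", 2);
                if (fields.length != 2) {
                    throw new IllegalArgumentException("No tab in line: " + line);
                }
                result.put(Long.parseLong(fields[0].trim()),
                           parseVector(fields[1]));
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Two shortened lines in the same shape as the Mahout output above.
        String sample = "0\t[356:0.3481597,359:0.3481597]\n"
                      + "8\t[356:0.34786037,359:0.34786037]\n";
        Map<Long, Map<Integer, Double>> pairs = readAll(new StringReader(sample));
        for (Map.Entry<Long, Map<Integer, Double>> pair : pairs.entrySet()) {
            System.out.println(pair.getKey() + " -> " + pair.getValue());
        }
    }
}
```

This gets the data out, but it means hand-parsing the vector text instead of getting a VectorWritable back, which is the part I was hoping to avoid.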
