hadoop-common-user mailing list archives

From "Xavier Stevens" <Xavier.Stev...@fox.com>
Subject RE: What's the best way to get to a single key?
Date Tue, 11 Mar 2008 00:09:07 GMT
So I read some more of the Javadocs. My original job had 11 reducers, which left me with
11 MapFile directories. I am passing their parent directory in here as "outDir".

MapFile.Reader[] readers = MapFileOutputFormat.getReaders(fileSys, outDir, defaults);
Partitioner part = (Partitioner)ReflectionUtils.newInstance(conf.getPartitionerClass(), conf);
Text entryValue = (Text)MapFileOutputFormat.getEntry(readers, part, new Text("mykey"), null);
System.out.println("My Entry's Value: ");
System.out.println(entryValue.toString());

But I am getting an exception:

Exception in thread "main" java.lang.ArithmeticException: / by zero
        at org.apache.hadoop.mapred.lib.HashPartitioner.getPartition(HashPartitioner.java:35)
        at org.apache.hadoop.mapred.MapFileOutputFormat.getEntry(MapFileOutputFormat.java:85)
        at mypackage.MyClass.main(ProfileReader.java:110)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:155)

I am assuming I am doing something wrong, but I'm not sure what it is yet.  Any ideas?
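
The trace shows the division failing inside HashPartitioner.getPartition, which getEntry calls. One plausible reading, which is an assumption and not something confirmed in this thread, is that the partition count passed in was zero, for example because getReaders returned an empty array for "outDir". The sketch below reproduces only that arithmetic. PartitionSketch and checkedPartition are hypothetical names, the code is not Hadoop's source, and String.hashCode stands in for the key's real hash.

```java
// Sketch of the arithmetic a hash partitioner typically performs.
// Assumption: "non-negative hash modulo partition count"; not copied from Hadoop.
final class PartitionSketch {

    // Maps a key to a partition index in [0, numPartitions).
    // Throws ArithmeticException ("/ by zero") when numPartitions is 0.
    static int getPartition(Object key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    // Hypothetical guard: fail with a readable message when no readers were opened,
    // instead of letting the modulo blow up.
    static int checkedPartition(Object key, int readerCount) {
        if (readerCount == 0) {
            throw new IllegalStateException(
                "no MapFile readers were opened; check the directory passed to getReaders");
        }
        return getPartition(key, readerCount);
    }

    public static void main(String[] args) {
        // With 11 partitions the result is some index between 0 and 10.
        System.out.println(getPartition("mykey", 11));

        // With 0 partitions the modulo throws, matching the message in the trace.
        try {
            getPartition("mykey", 0);
        } catch (ArithmeticException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If this reading is right, printing readers.length before calling getEntry would show 0, and the thing to look at would be the path and file system handed to getReaders.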


-Xavier


-----Original Message-----
From: Xavier Stevens
Sent: Mon 3/10/2008 3:49 PM
To: core-user@hadoop.apache.org
Subject: RE: What's the best way to get to a single key?
 
I was thinking a single file would be easier because I would only have to
search a single index, unless I don't have to worry about that and Hadoop
searches all my indexes at the same time.  Is this the case?

-Xavier
 

-----Original Message-----
From: Doug Cutting [mailto:cutting@apache.org] 
Sent: Monday, March 10, 2008 3:45 PM
To: core-user@hadoop.apache.org
Subject: Re: What's the best way to get to a single key?

Xavier Stevens wrote:
> Thanks for everything so far.  It has been really helpful.  I have one
> more question.  Is there a way to merge MapFile index/data files?

No.

To append text files you can use 'bin/hadoop fs -getmerge'.

To merge sorted SequenceFiles (like MapFile/index files) you can use:

http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/io/SequenceFile.Sorter.html#merge(org.apache.hadoop.fs.Path[], org.apache.hadoop.fs.Path, boolean)

But this doesn't generate a MapFile.
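
As a rough illustration of what merging already-sorted inputs means, here is a minimal in-memory sketch. It is not the SequenceFile.Sorter implementation; SortedMergeSketch and its merge method are hypothetical names, and plain string lists stand in for sorted files. The output is a single sorted run with no index built alongside it.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Illustrative k-way merge of sorted inputs; hypothetical, not Hadoop code.
final class SortedMergeSketch {

    // One cursor per input: the current smallest unread key plus the remaining keys.
    private static final class Cursor {
        String head;
        final Iterator<String> rest;

        Cursor(Iterator<String> it) {
            this.rest = it;
            this.head = it.next();
        }
    }

    // Merges inputs that are each already sorted into one sorted list.
    static List<String> merge(List<List<String>> sortedInputs) {
        // The queue always exposes the cursor with the smallest head key.
        PriorityQueue<Cursor> queue =
            new PriorityQueue<>((a, b) -> a.head.compareTo(b.head));
        for (List<String> input : sortedInputs) {
            if (!input.isEmpty()) {
                queue.add(new Cursor(input.iterator()));
            }
        }

        List<String> out = new ArrayList<>();
        while (!queue.isEmpty()) {
            Cursor c = queue.poll();
            out.add(c.head);
            // Advance the cursor and put it back if its input has more keys.
            if (c.rest.hasNext()) {
                c.head = c.rest.next();
                queue.add(c);
            }
        }
        return out;
    }
}
```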

Why is a single file preferable?

Doug




 
