hadoop-common-user mailing list archives

From Blanca Hernandez <Blanca.Hernan...@willhaben.at>
Subject AW: ClassCastException on running map-reduce jobs + tests on Windows (mongo-hadoop)
Date Thu, 18 Sep 2014 14:34:24 GMT
Thanks a million for your support!

From: Shahab Yunus [mailto:shahab.yunus@gmail.com]
Sent: Thursday, 18 September 2014 13:40
To: user@hadoop.apache.org
Subject: Re: ClassCastException on running map-reduce jobs + tests on Windows (mongo-hadoop)

You will have to convert BSONWritable to BSONObject yourself. You can abstract this parsing
into a separate class/object model and reuse it, but as far as I understand, objects being
serialized or deserialized have to be Writable (conforming to the interface that Hadoop defines,
and Comparable if they are going to act as keys).

So given that, you will either have to do the parsing yourself or design your downstream modules
that expect BSONObject to accept BSONWritable instead. That way you won't need to parse. The
downside is that your downstream users will then be tied to the Hadoop API, resulting in a
potentially undesirable dependency.
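A minimal sketch of such a conversion helper, assuming mongo-hadoop's BSONWritable exposes the wrapped document via getDoc() and accepts a BSONObject in its constructor (check the API of your mongo-hadoop version; the class name BsonConversions is made up for illustration):

```java
import org.bson.BSONObject;
import com.mongodb.hadoop.io.BSONWritable;

// Hypothetical helper that centralizes the BSONWritable <-> BSONObject
// conversion, so downstream code never touches the Hadoop API directly.
public final class BsonConversions {
    private BsonConversions() {}

    // Unwrap the Hadoop-side writable into the plain BSON document.
    public static BSONObject toBsonObject(BSONWritable writable) {
        return writable.getDoc();
    }

    // Wrap a plain BSON document for use as a Hadoop key/value.
    public static BSONWritable toWritable(BSONObject obj) {
        return new BSONWritable(obj);
    }
}
```

Keeping both directions in one place means only this class depends on mongo-hadoop, which limits the Hadoop coupling mentioned above.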

Some links that might be helpful regarding this design and its background:

http://learnhadoopwithme.wordpress.com/tag/writablecomparable/

Page 93 and onwards from Tom White's book, Hadoop: The Definitive Guide.

Regarding your Windows experience, I don't have much knowledge in that area. Sorry :(

Regards,
Shahab

On Thu, Sep 18, 2014 at 2:58 AM, Blanca Hernandez <Blanca.Hernandez@willhaben.at> wrote:
Thanks,

I made the changes and everything works fine!! Many thanks!!

Now I am having problems converting BSONWritable to BSONObject and vice versa. Is there an
automatic way to do it, or should I write a parser myself?

And regarding the tests on windows, any experience?

Thanks again!!

Best regards,

Blanca


From: Shahab Yunus [mailto:shahab.yunus@gmail.com]
Sent: Wednesday, 17 September 2014 17:20
To: user@hadoop.apache.org
Subject: Re: ClassCastException on running map-reduce jobs + tests on Windows (mongo-hadoop)

You set String as the output key (setOutputKey). java.lang.String is not Writable. Change it
to Text, just like you did for the mapper.
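Applied to the configuration posted earlier, the fix would look roughly like this (a sketch; Text is org.apache.hadoop.io.Text):

```java
// All key/value classes must be Writable (and keys WritableComparable):
config.setMapperOutputKey(Text.class);          // was already Text
config.setMapperOutputValue(BSONObject.class);
config.setOutputKey(Text.class);                // was String.class -- not Writable
config.setOutputValue(BSONWritable.class);
```

The sorting collector looks up a comparator for the key class, which is why a non-WritableComparable key like String fails with a ClassCastException at job setup.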

Regards,
Shahab

On Wed, Sep 17, 2014 at 10:43 AM, Blanca Hernandez <Blanca.Hernandez@willhaben.at> wrote:
Thanks for answering:

hadoop jar /tmp/hadoop-test.jar at.willhaben.hadoop.AveragePriceCalculationJob

In the AveragePriceCalculationJob I have my configuration:


private static class AveragePriceCalculationJob extends MongoTool {
    private AveragePriceCalculationJob(AveragePriceNode currentNode, String currentId, int nodeNumber) {
        Configuration conf = new Configuration();
        MongoConfig config = new MongoConfig(conf);
        setConf(conf);
        // change for my values
        config.setInputFormat(MongoInputFormat.class);
        config.setOutputFormat(MongoOutputFormat.class);

        config.setMapperOutputKey(Text.class);
        config.setMapperOutputValue(BSONObject.class);
        config.setOutputKey(String.class);
        config.setOutputValue(BSONWritable.class);

        config.setInputURI("myUrl");
        config.setOutputURI("myUrl");
        config.setMapper(AveragePriceMapper.class);
        config.setReducer(AveragePriceReducer.class);
    }
}


And the main method:


public static void main(String[] args) throws InterruptedException, IOException, ClassNotFoundException {
    // … some code

    try {
        ToolRunner.run(new AveragePriceCalculationJob(currentNode, currentId, nodeNumber), args);
    } catch (Exception e) {
        e.printStackTrace();
    }
}


Best regards,

Blanca

From: Shahab Yunus [mailto:shahab.yunus@gmail.com]
Sent: Wednesday, 17 September 2014 16:37
To: user@hadoop.apache.org
Subject: Re: ClassCastException on running map-reduce jobs + tests on Windows (mongo-hadoop)

Can you provide the driver code for this job?

Regards,
Shahab

On Wed, Sep 17, 2014 at 10:28 AM, Blanca Hernandez <Blanca.Hernandez@willhaben.at> wrote:
Hi again, I changed the String objects to org.apache.hadoop.io.Text objects (why is String
not accepted?), and now I get another exception, so I don't really know whether I solved
something or broke something:


java.lang.Exception: java.lang.NullPointerException
        at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.NullPointerException
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.init(MapTask.java:988)
        at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:391)
        at org.apache.hadoop.mapred.MapTask.access$100(MapTask.java:80)
        at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:675)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:747)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
        at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)

If I could debug it in my IDE, I think I could work faster, but I have the problems already
described. How am I testing now? Building a jar, copying it to the server and running a hadoop
jar command (not a very efficient approach…).

Could you give me a hand with this? Anyone out there using Windows + IntelliJ IDEA? Many thanks!



From: Blanca Hernandez [mailto:Blanca.Hernandez@willhaben.at]
Sent: Wednesday, 17 September 2014 15:27
To: user@hadoop.apache.org
Subject: ClassCastException on running map-reduce jobs + tests on Windows (mongo-hadoop)

Hi!

I am getting a ClassCastException and don't really understand why…

Here my mapper:

public class AveragePriceMapper extends Mapper<String, BSONObject, String, BSONObject> {
    @Override
    public void map(final String key, final BSONObject val, final Context context) throws IOException, InterruptedException {
        String id = "result_of_making_some_operations";
        context.write(id, val);
    }
}

And in my configuration:

config.setMapperOutputKey(String.class);
config.setMapperOutputValue(BSONObject.class);


When running my generated jar on the server, everything seems to work fine until:

14/09/17 15:20:36 INFO mapred.MapTask: Processing split: MongoInputSplit{URI=mongodb://user:pass@host:27017/my_db.my_collection,
authURI=null, min={ "_id" : { "$oid" : "541666d8e4b07265e257a42e"}}, max={ }, query={ }, sort={
}, fields={ }, notimeout=false}
14/09/17 15:20:36 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
14/09/17 15:20:36 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
14/09/17 15:20:36 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
14/09/17 15:20:36 INFO mapred.MapTask: soft limit at 83886080
14/09/17 15:20:36 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
14/09/17 15:20:36 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
14/09/17 15:20:36 INFO mapred.LocalJobRunner: map task executor complete.
14/09/17 15:20:36 WARN mapred.LocalJobRunner: job_local1701078621_0001
java.lang.Exception: java.lang.ClassCastException: class java.lang.String
        at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.ClassCastException: class java.lang.String
        at java.lang.Class.asSubclass(Class.java:3126)
        at org.apache.hadoop.mapred.JobConf.getOutputKeyComparator(JobConf.java:885)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.init(MapTask.java:981)
        at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:391)
        at org.apache.hadoop.mapred.MapTask.access$100(MapTask.java:80)
        at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:675)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:747)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
        at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)


Did I miss something?


Another issue I am worried about: working on a Windows system makes everything quite complicated
with Hadoop. I have it installed and running, as well as my MongoDB database (I am using the
connector they provide). When I run the same main class I use in the hadoop jar call on the
server (in the example above), but from my IDE, I get this exception:

PriviledgedActionException as:hernanbl cause:java.io.IOException: Failed to set permissions
of path: \tmp\hadoop-hernanbl\mapred\staging\hernanbl1600842219\.staging to 0700

How could I make it run?


Many thanks!!

Best regards,

Blanca


