crunch-user mailing list archives

From Micah Whitacre <mkwhita...@gmail.com>
Subject Re: DataBaseSource configuration issue
Date Wed, 21 May 2014 12:27:24 GMT
It's a workaround, but you should be able to set the missing property
yourself with the Source.inputConf(...) [1] method, i.e. set
"mapred.jdbc.driver.class" to your JDBC driver class.

[1] http://crunch.apache.org/apidocs/0.8.2/org/apache/crunch/Source.html#inputConf(java.lang.String, java.lang.String)
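
To make the mismatch and the workaround concrete, here is a minimal sketch.
It uses java.util.Properties as a stand-in for the Hadoop configuration so
that it runs without Hadoop or Crunch on the classpath. The class name, the
mirrorDriverKey helper, and the "com.example.Driver" value are all
hypothetical; only the two property names come from this thread. In a real
pipeline the equivalent step would presumably be a single call along the
lines of source.inputConf("mapred.jdbc.driver.class", driverClassName).

```java
import java.util.Properties;

class DriverPropertyWorkaround {
    // Key that DataBaseSource sets, according to the report in this thread.
    static final String MR2_DRIVER_KEY = "mapreduce.jdbc.driver.class";
    // Key that the MR1 DBConfiguration#getConnection reads, per the same report.
    static final String MR1_DRIVER_KEY = "mapred.jdbc.driver.class";

    // Hypothetical helper: returns a copy of conf in which the driver class
    // is also available under the MR1 key, if only the MR2 key was set.
    static Properties mirrorDriverKey(Properties conf) {
        Properties fixed = new Properties();
        fixed.putAll(conf);
        String driver = fixed.getProperty(MR2_DRIVER_KEY);
        if (driver != null && fixed.getProperty(MR1_DRIVER_KEY) == null) {
            fixed.setProperty(MR1_DRIVER_KEY, driver);
        }
        return fixed;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // Placeholder driver class name, not a real driver.
        conf.setProperty(MR2_DRIVER_KEY, "com.example.Driver");

        // Before the workaround, an MR1-style lookup finds nothing.
        System.out.println("before: " + conf.getProperty(MR1_DRIVER_KEY));

        // After the workaround, both keys resolve to the same driver class.
        Properties fixed = mirrorDriverKey(conf);
        System.out.println("after:  " + fixed.getProperty(MR1_DRIVER_KEY));
    }
}
```

Setting the extra key alongside the existing one, rather than replacing
it, should keep the job working whichever property the Hadoop version on
the classpath ends up reading.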


On Tue, May 20, 2014 at 6:06 PM, Josh Wills <jwills@cloudera.com> wrote:

> crunch-contrib was likely compiled against the Hadoop 2.0.0 APIs, not the MR1 APIs. I
> didn't realize there was a difference between the two in the value of that
> property, which is certainly my bad. I rarely (ever?) read anything from
> databases as part of MR jobs, and hadn't run into that one before.
>
>
> On Tue, May 20, 2014 at 3:21 PM, Nathan Schile <nathan.schile@gmail.com> wrote:
>
>> I am having trouble using the DataBaseSource class from crunch-contrib. I
>> am using version 0.8.2+32-cdh4.4.0 of crunch-contrib and 2.0.0-mr1-cdh4.4.0
>> of hadoop-core. The DataBaseSource class sets the property
>> "mapreduce.jdbc.driver.class" on the Hadoop configuration [1] to specify
>> the JDBC driver to use, but DBConfiguration#getConnection [2] reads the
>> property "mapred.jdbc.driver.class" to find the driver class when it
>> opens a database connection. This property mismatch prevents the
>> connection from being established. I would
>> have expected the "mapred.jdbc.driver.class" property to be used within
>> DataBaseSource since MR1 is being used. I decompiled
>> crunch-contrib:2.0.0-mr1-cdh4.4.0 jar using [3] and looked at the
>> DataBaseSource class and it was using "mapreduce.jdbc.driver.class". It
>> makes me think that crunch-contrib:2.0.0-mr1-cdh4.4.0 was compiled with a
>> hadoop-core version that was not 2.0.0-mr1-cdh4.4.0. Has anyone run into
>> this issue before? Thanks.
>>
>>
>> [1]
>> https://github.com/apache/crunch/blob/master/crunch-contrib/src/main/java/org/apache/crunch/contrib/io/jdbc/DataBaseSource.java#L55
>>
>> [2]
>> https://repository.cloudera.com/cloudera/public/org/apache/hadoop/hadoop-core/2.0.0-mr1-cdh4.4.0/
>>
>> [3] http://jd.benow.ca/
>>
>
>
>
> --
> Director of Data Science
> Cloudera <http://www.cloudera.com>
> Twitter: @josh_wills <http://twitter.com/josh_wills>
>
