crunch-dev mailing list archives

From "Chao Shi (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CRUNCH-340) HCatSource
Date Sat, 15 Feb 2014 04:59:19 GMT

    [ https://issues.apache.org/jira/browse/CRUNCH-340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902331#comment-13902331 ]

Chao Shi commented on CRUNCH-340:
---------------------------------

bq. Does it run on Hadoop 2.2 as well?

I found that I had to recompile hcatalog-core against hadoop2, even though HIVE-4460 said it
is possible to use the same jar for both hadoop1 and hadoop2.

Without the recompile, job submission fails with:
{code}
Exception in thread "Thread-4" java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
        at org.apache.hive.hcatalog.mapreduce.HCatBaseInputFormat.getSplits(HCatBaseInputFormat.java:102)
        at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:491)
        at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:508)
        at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:392)
        at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)
        at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)
        at org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchControlledJob.submit(CrunchControlledJob.java:307)
        at org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchJobControl.startReadyJobs(CrunchJobControl.java:201)
        at org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchJobControl.pollJobStatusAndStartNewOnes(CrunchJobControl.java:231)
        at org.apache.crunch.impl.mr.exec.MRExecutor.monitorLoop(MRExecutor.java:112)
        at org.apache.crunch.impl.mr.exec.MRExecutor.access$000(MRExecutor.java:55)
        at org.apache.crunch.impl.mr.exec.MRExecutor$1.run(MRExecutor.java:83)
        at java.lang.Thread.run(Thread.java:744)
{code}
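
The error is consistent with the hadoop1/hadoop2 difference in `JobContext`: it is a class in Hadoop 1 and an interface in Hadoop 2, so bytecode compiled against one fails to link against the other. A minimal sketch for checking which variant is on the runtime classpath follows. The class name `JobContextCheck` and the helper `describe` are hypothetical and not part of the patch; the lookup is done by name so the sketch compiles without any Hadoop dependency.

```java
public class JobContextCheck {

    /**
     * Reports whether the named type is a "class" or an "interface" on the
     * current classpath, or "missing" if it cannot be found.
     * (Hypothetical helper, for illustration only.)
     */
    static String describe(String className) {
        try {
            // Load without initializing; we only need the type's kind.
            Class<?> type = Class.forName(
                    className, false, JobContextCheck.class.getClassLoader());
            return type.isInterface() ? "interface" : "class";
        } catch (ClassNotFoundException e) {
            return "missing";
        }
    }

    public static void main(String[] args) {
        String name = "org.apache.hadoop.mapreduce.JobContext";
        // An hcatalog-core jar built against hadoop1 expects "class" here;
        // on a Hadoop 2 classpath this should report "interface".
        System.out.println(name + " -> " + describe(name));
    }
}
```

Running this with the same classpath as the failing job should show which Hadoop line the hcatalog-core jar needs to be built against.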

> HCatSource
> ----------
>
>                 Key: CRUNCH-340
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-340
>             Project: Crunch
>          Issue Type: New Feature
>            Reporter: Chao Shi
>         Attachments: crunch-340-v2.patch, crunch-340-v3.patch, crunch-340.patch
>
>
> This patch adds HCatSource, which enables Crunch pipelines to read from Hive tables. This
> is the very first version and leaves a few TODOs in the code.
> It adds a new dependency from crunch-core on hcatalog (as well as several Hive components).
> Perhaps we should create a new subproject (e.g. crunch-hcatalog) rather than add it
> to crunch-core.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
