crunch-user mailing list archives

From Gabriel Reid <gabriel.r...@gmail.com>
Subject Re: Exception with AvroPathPerKeyTarget
Date Fri, 28 Mar 2014 14:44:07 GMT
On Fri, Mar 28, 2014 at 3:19 PM, Jeremy Lewi <jeremy@lewi.us> wrote:
> No luck. I get the same error even when using a single reducer. I'm
> attaching the job configuration as shown in the web ui.
>
> When I look at the job tracker for the job, it has no map tasks. Is that
> expected? I've never heard of a reduce only job.
>

Nope, a job with no map tasks doesn't sound right to me. I noticed
that you're effectively doing a materialize at [1], and then
using a BloomFilterJoinStrategy. While this should work fine, I'm
thinking that it could also potentially lead to some issues such as
the one you're having (i.e. a job with no map tasks).

Could you try using the default join strategy there to see what
happens? I'm thinking that the AvroPathPerKeyTarget issue could just be a
consequence of something else going wrong earlier on.

1. https://code.google.com/p/contrail-bio/source/browse/src/main/java/contrail/scaffolding/FilterReads.java?name=dev_read_filtering#156

>
> On Fri, Mar 28, 2014 at 6:45 AM, Jeremy Lewi <jeremy@lewi.us> wrote:
>>
>> This is my first time on a cluster. I'll try what Josh suggests now.
>>
>> J
>>
>>
>> On Fri, Mar 28, 2014 at 3:41 AM, Josh Wills <josh.wills@gmail.com> wrote:
>>>
>>>
>>> On Fri, Mar 28, 2014 at 1:22 AM, Gabriel Reid <gabriel.reid@gmail.com>
>>> wrote:
>>>>
>>>> Hi Jeremy,
>>>>
>>>> On Thu, Mar 27, 2014 at 3:26 PM, Jeremy Lewi <jeremy@lewi.us> wrote:
>>>> > Hi
>>>> >
>>>> > I'm hitting the exception pasted below when using
>>>> > AvroPathPerKeyTarget.
>>>> > Interestingly, my code works just fine when I run on a small dataset
>>>> > using
>>>> > the LocalJobTracker. However, when I run on a large dataset using a
>>>> > hadoop
>>>> > cluster I hit the exception.
>>>> >
>>>>
>>>> Have you ever been able to successfully use the AvroPathPerKeyTarget
>>>> on a real cluster, or is this the first try with it?
>>>>
>>>> I'm wondering if this could be a problem that's always been around (as
>>>> the integration test for AvroPathPerKeyTarget also runs in the local
>>>> jobtracker), or if this could be something new.
>>>
>>>
>>> +1-- Jeremy, if you force the job to run w/a single reducer on the
>>> cluster (i.e., via groupByKey(1)), does it work?
>>>
>>>>
>>>>
>>>> - Gabriel
>>>
>>>
>>
>
