avro-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Updated: (AVRO-513) java mapreduce api should pass iterator of matching objects to reduce
Date Mon, 14 Jun 2010 22:52:15 GMT

     [ https://issues.apache.org/jira/browse/AVRO-513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated AVRO-513:

    Attachment: AVRO-513.patch

> I was thinking that you only need to copy at the beginning of the group, since you can
> compare subsequent values to the copy, until they differ, at which point you make a new copy.

But they might differ in fields that are not compared, e.g., count.  So all objects in the
queue must be unique.

> I think it's possible that the interrupt occurs between the check on "done" and the call
> to take(), so the call to take() would go ahead and cause a deadlock.

interrupt() sets the thread's interrupt flag, and the flag remains set until a method that throws
InterruptedException is called.  So if interrupt() is called before or after take() starts, that's
fine, since take() will throw either way when the queue is empty.
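
To illustrate the point about the flag, here is a minimal sketch, not code from the patch. The
class and method names are made up for the example, and it assumes an ArrayBlockingQueue, which
checks the interrupt flag on entry to take(). The thread interrupts itself before calling take()
on an empty queue, and take() throws instead of blocking.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical demo class, not part of AVRO-513.patch.
public class InterruptFlagDemo {

    /**
     * Sets the interrupt flag first, then calls take() on an empty queue.
     * Returns true if take() threw InterruptedException and the flag was
     * cleared as part of throwing.
     */
    static boolean takeThrowsWhenFlagAlreadySet() {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1);
        Thread.currentThread().interrupt();   // flag is now set and stays set
        try {
            queue.take();                     // empty queue: would block without the flag
            return false;
        } catch (InterruptedException e) {
            // Throwing InterruptedException clears the flag.
            return !Thread.currentThread().isInterrupted();
        }
    }

    public static void main(String[] args) {
        System.out.println(takeThrowsWhenFlagAlreadySet());
    }
}
```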

The risk that user code swallows the InterruptedException can be addressed by setting 'done=true'
before calling interrupt().  Then if user code swallows the interrupt and the queue is empty,
we'd never call queue.take(): the interrupt was swallowed in user code, so the thread wasn't
between checking 'done' and calling 'take()' when it arrived, and the next check sees 'done'
already set.  Does that sound right?
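
The ordering described above can be sketched roughly as follows. This is an illustration under
assumptions, not the code in the attached patch: the class, the put/finish/next methods, and the
single-consumer setup are all invented for the example. The simulated "user code" swallows any
pending interrupt after every value, and the consumer still terminates because 'done' is
published before interrupt() is called.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch, not part of AVRO-513.patch. Assumes one consumer thread.
public class DoneBeforeInterrupt {
    private final BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
    private volatile boolean done = false;

    void put(int v) { queue.add(v); }

    /** Producer side: publish 'done' first, then interrupt the consumer. */
    void finish(Thread consumer) {
        done = true;
        consumer.interrupt();
    }

    /** Consumer side: next value, or null once finished and drained. */
    Integer next() {
        while (true) {
            if (done && queue.isEmpty()) return null;   // never reaches take()
            try {
                return queue.take();
            } catch (InterruptedException e) {
                // Fall through and re-check 'done' at the top of the loop.
            }
        }
    }

    /** Sums three values; returns -1 if the consumer failed to stop. */
    static int demo() {
        DoneBeforeInterrupt d = new DoneBeforeInterrupt();
        int[] sum = {0};
        Thread consumer = new Thread(() -> {
            Integer v;
            while ((v = d.next()) != null) {
                sum[0] += v;
                Thread.interrupted();   // simulate user code swallowing an interrupt
            }
        });
        consumer.start();
        for (int i = 1; i <= 3; i++) d.put(i);
        d.finish(consumer);
        try {
            consumer.join(5000);
        } catch (InterruptedException e) {
            return -1;
        }
        return consumer.isAlive() ? -1 : sum[0];
    }
}
```

If finish() interrupted first and set 'done' afterwards, a consumer whose interrupt was swallowed
could see 'done' still false and block in take() on an empty queue.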

> java mapreduce api should pass iterator of matching objects to reduce
> ---------------------------------------------------------------------
>                 Key: AVRO-513
>                 URL: https://issues.apache.org/jira/browse/AVRO-513
>             Project: Avro
>          Issue Type: Improvement
>          Components: java
>            Reporter: Doug Cutting
>            Assignee: Doug Cutting
>             Fix For: 1.4.0
>         Attachments: AVRO-513.patch, AVRO-513.patch, AVRO-513.patch
> The Java mapreduce API added in AVRO-493 requires reducer implementations to explicitly
> detect sequences of matching data.
> Rather the reduce method might better look something like:
>    void reduce(Iterator<IN>, Collector<OUT>);
> Where all equal values are passed in a single call.
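
As a rough illustration of the proposed calling convention, the sketch below groups a sorted list
into runs of equal values and hands each run to reduce() as an iterator. It is not the Avro API:
the Collector and Reducer interfaces, the run() driver, and the in-memory list are all
assumptions made for the example, and a real implementation would stream values rather than
hold them in a list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch of the proposed signature, not the Avro implementation.
public class GroupedReduceSketch {

    /** Stand-in for the Collector in the proposed signature. */
    interface Collector<OUT> { void collect(OUT out); }

    /** Reducer shape modeled on: void reduce(Iterator<IN>, Collector<OUT>). */
    interface Reducer<IN, OUT> { void reduce(Iterator<IN> values, Collector<OUT> out); }

    /** Calls reduce() once per run of values that compare as equal. */
    static <IN, OUT> void run(List<IN> sorted, Comparator<IN> cmp,
                              Reducer<IN, OUT> reducer, Collector<OUT> out) {
        int start = 0;
        for (int i = 1; i <= sorted.size(); i++) {
            // A run ends at the end of input or where the next value differs.
            if (i == sorted.size() || cmp.compare(sorted.get(start), sorted.get(i)) != 0) {
                reducer.reduce(sorted.subList(start, i).iterator(), out);
                start = i;
            }
        }
    }

    /** Word count over already-sorted input. */
    static List<String> demo() {
        List<String> words = Arrays.asList("a", "a", "b", "c", "c", "c");
        List<String> result = new ArrayList<>();
        GroupedReduceSketch.<String, String>run(words, Comparator.naturalOrder(),
            (values, out) -> {
                String word = null;
                int n = 0;
                while (values.hasNext()) { word = values.next(); n++; }
                out.collect(word + "=" + n);
            },
            result::add);
        return result;   // [a=2, b=1, c=3]
    }
}
```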

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
