avro-dev mailing list archives

From "Jeremy Lewi (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AVRO-512) define and implement mapreduce connector protocol
Date Tue, 28 Jun 2011 15:15:16 GMT

    [ https://issues.apache.org/jira/browse/AVRO-512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13056566#comment-13056566 ]

Jeremy Lewi commented on AVRO-512:

I think a deadlock can occur if the subprocess fails to start (e.g., if the executable is specified
incorrectly). The constructor for TetheredProcess starts the subprocess
and then calls outputService.inputPort(). inputPort() blocks until the child process
sends a configure message to the parent, so if the child process was never started, I think
the parent deadlocks.

At a minimum we could check that the subprocess hasn't exited yet. This probably won't prevent
all possible deadlocks but it might help. 

Below is some code for checking whether the process has exited.
      // Is there a better way to check whether the process has exited than the roundabout way below?
      // "subprocess" is assumed to be the java.lang.Process started by the constructor.
      boolean hasexited = false;
      try {
        // exitValue() throws an exception if the process hasn't exited
        subprocess.exitValue();
        hasexited = true;
      } catch (IllegalThreadStateException e) {
        // it hasn't exited yet
      }
      if (hasexited) {
        // What's the best way to log this?
        System.out.println("Error: Could not start subprocess");
        throw new RuntimeException("Error: Could not start subprocess");
      }
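A one-time check still leaves a window where the child exits after the check but before it sends the configure message. As a rough sketch of one way to narrow that window, the wait itself could poll with a timeout and re-check the child between polls. Everything named here is hypothetical and not part of Avro: the HandshakeWait class, the awaitConfigure method, and the assumption that the configure handler counts down a CountDownLatch. The liveness check is passed in as a BooleanSupplier, so on Java 8 or later it could be subprocess::isAlive.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Hypothetical helper for illustration; not part of Avro. */
public class HandshakeWait {

  /**
   * Waits for the child's configure message without blocking forever.
   *
   * @param configured    latch that the (assumed) configure handler counts down
   * @param childAlive    returns true while the child process is still running
   * @param pollMillis    how long to wait on the latch between liveness checks
   * @param timeoutMillis overall limit on the wait
   * @throws IllegalStateException if the child exits first, the wait times out,
   *                               or the waiting thread is interrupted
   */
  public static void awaitConfigure(CountDownLatch configured,
                                    BooleanSupplier childAlive,
                                    long pollMillis,
                                    long timeoutMillis) {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    try {
      while (true) {
        // Returns true as soon as the configure message has been seen.
        if (configured.await(pollMillis, TimeUnit.MILLISECONDS)) {
          return;
        }
        if (!childAlive.getAsBoolean()) {
          // The child may have sent configure just before exiting; check once more.
          if (configured.getCount() == 0) {
            return;
          }
          throw new IllegalStateException("subprocess exited before sending configure");
        }
        if (System.nanoTime() - deadline >= 0) {
          throw new IllegalStateException("timed out waiting for configure");
        }
      }
    } catch (InterruptedException e) {
      // Preserve the interrupt status for callers further up the stack.
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for configure", e);
    }
  }
}
```

The overall timeout covers the case where the child is alive but never sends configure, which the exit check alone would not catch.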

> define and implement mapreduce connector protocol
> -------------------------------------------------
>                 Key: AVRO-512
>                 URL: https://issues.apache.org/jira/browse/AVRO-512
>             Project: Avro
>          Issue Type: New Feature
>          Components: java
>            Reporter: Doug Cutting
>            Assignee: Doug Cutting
>             Fix For: 1.4.0
>         Attachments: AVRO-512.patch, AVRO-512.patch, AVRO-512.patch, AVRO-512.patch
> Avro should provide Hadoop Mapper and Reducer implementations that connect to a subprocess
> in another programming language, transmitting raw binary values to and from that process.
> This should be modeled after Hadoop Pipes.  It would allow one to easily write efficient
> mapreduce programs in non-Java languages that process Avro-format data.

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

