hama-dev mailing list archives

From "Thomas Jungblut (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HAMA-601) Hama Streaming
Date Wed, 04 Jul 2012 16:46:34 GMT

     [ https://issues.apache.org/jira/browse/HAMA-601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Jungblut updated HAMA-601:
---------------------------------

    Description: 
We can also do a Streaming job to allow other languages to use Hama's BSP API.

Basically, you fork a new process in the BSP method and set up an input stream for the
process that it can read very simply.
The output stream of the child process can then be read to give it the following abilities:

- get a received message
- send a new message
- sync
- read a line from input
- write to output
- reset the input to reread
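
The fork-and-read loop described above might be sketched as follows. This is only a sketch under stated assumptions: the class StreamingChild and its methods pump and run are hypothetical names, not part of Hama; only ProcessBuilder and the standard java.io classes are existing APIs. The handler stands in for whatever code would map a line to one of the abilities listed above.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper, not part of Hama: forks a child process and hands
// every line the child prints to a handler on the Java side.
final class StreamingChild {

  // Reads lines until the stream ends and passes each one to the handler.
  // Returns the number of lines read.
  static int pump(Reader childOutput, Consumer<String> handler) throws IOException {
    BufferedReader reader = new BufferedReader(childOutput);
    int count = 0;
    String line;
    while ((line = reader.readLine()) != null) {
      handler.accept(line);
      count++;
    }
    return count;
  }

  // Forks the command, writes the input lines to the child's stdin, closes
  // stdin, then pumps the child's stdout through the handler.
  // Writing everything before reading is only safe for small inputs; a real
  // implementation would read and write concurrently to avoid a full pipe.
  static int run(List<String> command, List<String> inputLines, Consumer<String> handler)
      throws IOException, InterruptedException {
    Process child = new ProcessBuilder(command).redirectErrorStream(true).start();
    try (BufferedWriter toChild = new BufferedWriter(
        new OutputStreamWriter(child.getOutputStream(), StandardCharsets.UTF_8))) {
      for (String inputLine : inputLines) {
        toChild.write(inputLine);
        toChild.newLine();
      }
    }
    int lines = pump(
        new InputStreamReader(child.getInputStream(), StandardCharsets.UTF_8), handler);
    child.waitFor();
    return lines;
  }
}
```

The pump method takes a plain Reader so that the line handling can be exercised without forking anything, for example with a StringReader.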

Those actions must have a constant prefix; for example, sending a message could look like this:

%SEND_MESSAGE%=this is the message

or sync:

$SYNC$=

The logic behind it is that we can simply split on "=" in the Java code: the left-hand side
is the action and the right-hand side is the value of this action.
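
A minimal sketch of that split, assuming a hypothetical helper class ActionLine (not part of Hama). It splits on the first "=" only, so that a message value which itself contains "=" is not truncated, and so that an empty value such as the one after $SYNC$= is preserved.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;

// Hypothetical helper, not part of Hama: parses one "ACTION=value" line.
final class ActionLine {

  // Returns the action as the key and the value as the entry value.
  // The limit of 2 splits on the first '=' only and keeps an empty
  // trailing value instead of dropping it.
  static Map.Entry<String, String> parse(String line) {
    String[] parts = line.split("=", 2);
    if (parts.length != 2) {
      throw new IllegalArgumentException("No '=' found in line: " + line);
    }
    return new SimpleImmutableEntry<>(parts[0], parts[1]);
  }
}
```

For instance, ActionLine.parse("%SEND_MESSAGE%=this is the message") yields the action %SEND_MESSAGE% with the value "this is the message", and ActionLine.parse("$SYNC$=") yields the action $SYNC$ with an empty value.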

Between the peers the messages are Text, which has some overhead but is easier to implement,
and the communication between the BSP task and the forked process is based on text/strings
anyway.

This time I do not advise copying the whole streaming implementation from Hadoop itself.
However, the parts that repack the jar with the needed execution scripts and the option
handling seem good to reuse.
The input and output stream handling must be written from scratch because we want to take
actions into account.

  was:
We can also do a Streaming job to allow other languages to use Hama's BSP API.

Basically, you fork a new process in the BSP method and set up an input stream for the
process that it can read very simply.
The output stream of the child process can then be read to give it the following abilities:

- get a received message
- send a new message
- sync
- read a line from input
- write to output
- reset the input to reread

Those actions must have a constant prefix; for example, sending a message could look like this:

%SEND_MESSAGE%=this is the message

or sync:

$SYNC$=

The logic behind it is that we can simply split on "=" in the Java code: the left-hand side
is the action and the right-hand side is the value of this action.

Between the peers the messages are Text, which has some overhead but is easier to implement,
and the communication between the BSP task and the forked process is based on text/strings
anyway.

    
> Hama Streaming
> --------------
>
>                 Key: HAMA-601
>                 URL: https://issues.apache.org/jira/browse/HAMA-601
>             Project: Hama
>          Issue Type: New Feature
>          Components: bsp core, messaging
>    Affects Versions: 0.6.0
>            Reporter: Thomas Jungblut
>             Fix For: 0.6.0
>
>

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
