hama-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Apache Wiki <wikidi...@apache.org>
Subject [Hama Wiki] Update of "StreamingProtocol" by thomasjungblut
Date Thu, 13 Sep 2012 19:08:26 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hama Wiki" for change notification.

The "StreamingProtocol" page has been changed by thomasjungblut:
http://wiki.apache.org/hama/StreamingProtocol?action=diff&rev1=3&rev2=4

  [[http://people.apache.org/~tjungblut/downloads/streaming_documentation_0.pdf|Immutable
Version *.pdf]]
  
  
- >> will clean this up tomorrow.
- 
  == Abstract ==
  
  The Hama Streaming Protocol is a protocol which enables non-Java applications to write BSP
functions just like in pure Java code. The communication takes place with plain strings and
is therefore highly portable. Also this is the only difference to Hama Pipes, which is a native
implementation for C/C++ fully based on low level binary communications and is therefore faster
and has less overhead. In the following chapters, we will introduce the text based protocol
and its bidirectional communication, also with the practical example of a Python implementation.

@@ -27, +25 @@

  Hama now redirects the input and output streams of that child process to streams that are
used for the protocol. Therefore you can’t simply print by writing to STDOUT, but you can
use the LOG function that redirects special log statements to the log of the Java task, this
will be explained later. 
  
  The protocol is designed to mimic the Java API, for obvious reasons of documentation and
convenience, so we suggest to make yourself comfortable with the Java API first and then get
into coding the protocol. For additional information about the design and programming model,
have a look at our Getting Started section in our wiki and the user documentation as a PDF.
- http://wiki.apache.org/hama/GettingStarted#User_documentation_as_PDF
+ 
+ [[GettingStarted#User_documentation_as_PDF| Getting Started - User Documentation as PDF]]
  
  If you want to know how to run a user code within other environments, have a look in the
python example at the end.
  
@@ -42, +41 @@

  
  Here is a table of all op codes and their corresponding unique identifier, note that not
all of them are in use, because the streaming protocol is working on top of the binary pipes
protocol.
  
- Operation Code Name	Operation Code identifier	Comment
+ ||Operation Code Name	||Operation Code identifier||	Comment||
- START	0	First op code after fork
+ ||START	||0||	First op code after fork||
- SET_BSP_JOB_CONF	1	Get configuration values
+ ||SET_BSP_JOB_CONF	||1||	Get configuration values||
- SET_INPUT_TYPES	2	Not used
+ ||SET_INPUT_TYPES	||2||	Not used||
- RUN_SETUP	3	Start of the setup function
+ ||RUN_SETUP	||3||	Start of the setup function||
- RUN_BSP	4	Start of the bsp function
+ ||RUN_BSP	||4||	Start of the bsp function||
- RUN_CLEANUP	5	Start of the cleanup function
+ ||RUN_CLEANUP	||5||	Start of the cleanup function||
- READ_KEYVALUE	6	Reads a key/value pair from input (Text Only)
+ ||READ_KEYVALUE	||6||	Reads a key/value pair from input (Text Only)||
- WRITE_KEYVALUE	7	Writes a key/value pair to output (Text Only)
+ ||WRITE_KEYVALUE	||7||	Writes a key/value pair to output (Text Only)||
- GET_MSG	8	Gets the next message in the queue
+ ||GET_MSG	||8||	Gets the next message in the queue||
- GET_MSG_COUNT	9	Get how many messages are in the queue
+ ||GET_MSG_COUNT	||9||	Get how many messages are in the queue||
- SEND_MSG	10	Send a message
+ ||SEND_MSG	||10||	Send a message||
- SYNC	11	Start the barrier synchronization
+ ||SYNC	||11||	Start the barrier synchronization||
- GET_ALL_PEERNAME	12	Get all peer names
+ ||GET_ALL_PEERNAME	||12||	Get all peer names||
- GET_PEERNAME	13	Get the peer name of the current peer
+ ||GET_PEERNAME	||13||	Get the peer name of the current peer||
- GET_PEER_INDEX	14	Get a peer name via index
+ ||GET_PEER_INDEX	||14||	Get a peer name via index||
- GET_PEER_COUNT	15	Get how many peers are there
+ ||GET_PEER_COUNT	||15||	Get how many peers are there||
- GET_SUPERSTEP_COUNT	16	Get the current Superstep counter
+ ||GET_SUPERSTEP_COUNT	||16||	Get the current Superstep counter||
- REOPEN_INPUT	17	Reopens the input to reread
+ ||REOPEN_INPUT	||17||	Reopens the input to reread||
- CLEAR	18	Clears the messaging queue
+ ||CLEAR	||18||	Clears the messaging queue||
- CLOSE	19	Closes the protocol
+ ||CLOSE	||19||	Closes the protocol||
- ABORT	20	Not used
+ ||ABORT	||20||	Not used||
- DONE	21	Closes the protocol if task is done
+ ||DONE	||21||	Closes the protocol if task is done||
- TASK_DONE	22	Yet another task is done op code
+ ||TASK_DONE	||22||	Yet another task is done op code||
- REGISTER_COUNTER	23	Not used (please create a new issue for that)
+ ||REGISTER_COUNTER	||23||	Not used (please create a new issue for that)||
- INCREMENT_COUNTER	24	Not used (please create a new issue for that)
+ ||INCREMENT_COUNTER	||24||	Not used (please create a new issue for that)||
- LOG	25	Not implemented in Pipes, but in streaming it sends child logging to the Java task.
+ ||LOG	||25||	Not implemented in Pipes, but in streaming it sends child logging to the Java
task.||
  
  
  == Acknowledgements ==
@@ -89, +88 @@

  
  
  A sample communication could look like this // are comments and are not parts of the protocol:
- 
+ {{{
  %0%= 			// start op code from java task
  0 			// protocol number
  %1%= 			// set bsp configuration op code
  2 			/ number of configuration items to read
  hama.bsp.child.opts 	// sample key from configuration
  -Xmx512m 		// sample value from configuration
+ }}}
  
  Now the forked child needs to acknowledge the start OP code by writing:
  
+ {{{
  %ACK_0%=		// ACK’d start code
+ }}}
  
  After the acknowledgement, we immediately start with the setup function.
+ 
+ 
- Setup/BSP/Cleanup Sequence
+ == Setup/BSP/Cleanup Sequence ==
  
  A normal BSP task has three major steps:
  
@@ -118, +122 @@

  
  Of course here is the example with the setup function:
  
+ {{{
  %3%= 			// setup op code from java task
  // now call the users setup function, any communication can happen here
  %ACK_3%=		// ACK’d setup code
+ }}}
  
  The three steps are called in sequential order, so after you have ACK’d the end of the
setup, the Java code will immediately start telling you about that you need to start the bsp
function. This applies also for the transition between bsp and cleanup function.
  
@@ -130, +136 @@

  
  Here is the table for the communication of the currently implemented BSP functionality:
  
- BSP Primitive	Operation sequence	Response
+ ||BSP Primitive	||Operation sequence||	Response||
+ ||Send	|| %10%= \n Destination peer name \n Message as text	||||
+ ||Sync	||%11%=||	After sync a special ACK will be in the next line: “%11%=_SUCCESS”.
So you should block until you received this.||
- Send	-	%10%=
- -	Destination peer name
- -	Message as text	
- Sync	-	%11%=	After sync a special ACK will be in the next line: “%11%=_SUCCESS”
- So you should block until you received this.
- getCurrentMessage	-	%8%=	“%%-1%%” if no message was found, else the message as text
+ ||getCurrentMessage	||%8%=||	“%%-1%%” if no message was found, else the message as text||
- getSuperstepCount	-	%16%=	The raw Superstep number
+ ||getSuperstepCount	||%16%=||	The raw Superstep number||
- getMessageCount	-	%9%=	The raw message count
+ ||getMessageCount	||%9%=||	The raw message count||
- getPeerName	-	%13%=
- -	Followed by an index (plain integer), -1 for the name of this peer.	The name of the peer
as text
+ ||getPeerName	||%13%= \n Followed by an index (plain integer), -1 for the name of this peer.
||The name of the peer as text||
- getPeerIndex	-	%14%=	The raw index in the peer array
+ ||getPeerIndex	||%14%=||	The raw index in the peer array||
- getAllPeerNames	-	%12%=	A line how many peers are there (plain number), then n-lines with
the peer names
+ ||getAllPeerNames	||%12%=||	A line how many peers are there (plain number), then n-lines
with the peer names||
- getPeerCount	-	%15%=	Plain number of how many peers are there
+ ||getPeerCount	||%15%=||	Plain number of how many peers are there||
+ ||writeKeyValue	||%7%= \n Key as line \n Value as line	||||
- writeKeyValue	-	%7%=
- -	Key as line
- -	Value as line	
- readKeyValue	-	%6%=	Key and Value in two separate following lines. If no input available,
both are equal to “%%-1%%”.
+ ||readKeyValue	||%6%=||	Key and Value in two separate following lines. If no input available,
both are equal to “%%-1%%”.||
- reopenInput	-	%17%=	
+ ||reopenInput||%17%=	||||
  
  == Closing Sequences ==
  
@@ -157, +157 @@

  After receiving this, the Java process will do the normal cleanup and finish the task itself,
after sending these codes you can gracefully shutdown your process by letting it exit with
a zero status.
  
  The end sequence looks like this:
- 
+ {{{
  %22%= 			// task done
  %21%= 			// completely done
- 
+ }}}
  
  Congratulation, you are now able to implement Hama streaming in other languages.
  
@@ -178, +178 @@

  
  Looks basically like that:
  
+ {{{
  className = sys.argv[1]
  module = __import__(className)
  class_ = getattr(module, className)
  bspInstance = class_()
+ }}}
  
  Now you can pass this instance to the handler that calls back the user functions in the
python implementation this is done by the peer itself.
  

Mime
View raw message