From: Apache Wiki
To: Apache Wiki
Date: Sun, 11 Dec 2011 00:07:42 -0000
Message-ID: <20111211000742.99403.43346@eos.apache.org>
Subject: [Hama Wiki] Update of "IOSystem" by thomasjungblut
Reply-To: hama-dev@incubator.apache.org
Mailing-List: contact hama-commits-help@incubator.apache.org; run by ezmlm

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hama Wiki" for change notification.

The "IOSystem" page has been changed by thomasjungblut:
http://wiki.apache.org/hama/IOSystem?action=diff&rev1=1&rev2=2

+ <>
+
+ == General Information ==
+
+ Since Hama 0.4.0 we provide an input and output system for BSP jobs.
+
+ TODO: Describe the key/value model.
+ The behavior when no input is configured, and similar cases, should be documented here.
+
+
+ == Input ==
+
+ === Configuring Input ===
+
+ When setting up a {{{BSPJob}}}, you can provide an {{{InputFormat}}} and a {{{Path}}} that points to the input.
+
+ {{{
+   BSPJob job = new BSPJob();
+   // other job setup omitted
+   job.setInputPath(new Path("/tmp/test.seq"));
+   job.setInputFormat(org.apache.hama.bsp.SequenceFileInputFormat.class);
+ }}}
+
+ Another way to add an input path is the following:
+ {{{
+   SequenceFileInputFormat.addInputPath(job, new Path("/tmp/test.seq"));
+ }}}
+
+ You can also add multiple paths by using this method:
+
+ {{{
+   SequenceFileInputFormat.addInputPaths(job, "/tmp/test.seq,/tmp/test2.seq,/tmp/test3.seq");
+ }}}
+
+ '''Note that these paths must be separated by commas.'''
+
+ In the case of a {{{SequenceFileInputFormat}}}, the key and value types are read from the file header.
+
+ When you want to read a plain text file with {{{TextInputFormat}}}, the key is always a {{{LongWritable}}} that contains how many bytes have been read, and the value is a {{{Text}}} that contains one line of your input.
+
+
+ === Using Input ===
+
+ You can now read the input from each of the functions in the {{{BSP}}} class that take a {{{BSPPeer}}} as a parameter (e.g.
setup / bsp / cleanup).
+
+ In this case we read a normal text file:
+ {{{
+   @Override
+   public final void bsp(
+       BSPPeer peer)
+       throws IOException, InterruptedException, SyncException {
+
+       // this method reads the next key/value record from the input
+       KeyValuePair pair = peer.readNext();
+
+       // alternatively, read the next record into existing key and value objects:
+       LongWritable key = new LongWritable();
+       Text value = new Text();
+       peer.readNext(key, value);
+   }
+ }}}
+
+ Consult the API docs for details on events such as end of file.
+
+ There is also a function that allows you to re-read the input from the beginning.
+
+ This snippet reads the input five times:
+
+ {{{
+   for (int i = 0; i < 5; i++) {
+     LongWritable key = new LongWritable();
+     Text value = new Text();
+     while (peer.readNext(key, value)) {
+       // read everything
+     }
+     // reopens the input so the next pass starts at the beginning
+     peer.reopenInput();
+   }
+ }}}
+
+ === Custom Inputformat ===
+
+ TODO: Describe how to implement your own input format.
+
+ == Output ==
+
+ === Configuring Output ===
+
+ === Using Output ===
+
+ === Custom Outputformat ===
+
+ == Implementation notes ==
+
+ === Internal implementation details ===
+
  BSPJobClient

  1. Create the splits for the job
@@ -12, +108 @@

  1. Receives splitFile
  2. Add split argument to TaskInProgress constructor

+ Task
+
+ 1. Gets its split from the Groom
+ 2. Initializes everything in BSPPeerImpl
+
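
To make the record shape described for {{{TextInputFormat}}} more concrete, the following standalone Java sketch turns a text into (key, value) records. It does not use the Hama API: the class {{{LineRecordSketch}}} and its {{{Record}}} type are hypothetical names chosen for illustration. It also assumes that the key is the byte offset at which each line starts (that is, the number of bytes read before the line), that the text is UTF-8 encoded, and that lines end with "\n".

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not part of Hama.
final class LineRecordSketch {

    // One record: key = bytes read before this line, value = the line itself.
    static final class Record {
        final long key;
        final String value;

        Record(long key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    // Splits the text on "\n" and pairs each line with its starting byte offset.
    static List<Record> toRecords(String text) {
        List<Record> records = new ArrayList<>();
        if (text.isEmpty()) {
            return records; // no input, no records
        }
        long offset = 0;
        for (String line : text.split("\n")) {
            records.add(new Record(offset, line));
            // Advance past the line's bytes plus the one-byte newline.
            offset += line.getBytes(StandardCharsets.UTF_8).length + 1;
        }
        return records;
    }

    public static void main(String[] args) {
        for (Record r : toRecords("first line\nsecond line\nthird")) {
            System.out.println(r.key + "\t" + r.value);
        }
    }
}
```

In a real BSP job the framework produces these records for you; the sketch only shows which key and value to expect when calling {{{readNext}}} on text input, under the assumptions stated above.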