Return-Path:
X-Original-To: apmail-hama-dev-archive@www.apache.org
Delivered-To: apmail-hama-dev-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id B8D7410969 for ; Sat, 23 Nov 2013 10:55:40 +0000 (UTC)
Received: (qmail 19070 invoked by uid 500); 23 Nov 2013 10:55:40 -0000
Delivered-To: apmail-hama-dev-archive@hama.apache.org
Received: (qmail 18963 invoked by uid 500); 23 Nov 2013 10:55:38 -0000
Mailing-List: contact dev-help@hama.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: dev@hama.apache.org
Delivered-To: mailing list dev@hama.apache.org
Received: (qmail 18950 invoked by uid 99); 23 Nov 2013 10:55:36 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Sat, 23 Nov 2013 10:55:36 +0000
Date: Sat, 23 Nov 2013 10:55:36 +0000 (UTC)
From: "Martin Illecker (JIRA)"
To: dev@hama.apache.org
Message-ID:
In-Reply-To:
References:
Subject: [jira] [Resolved] (HAMA-815) Hama Pipes uses C++ templates
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

[ https://issues.apache.org/jira/browse/HAMA-815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Martin Illecker resolved HAMA-815.
----------------------------------
Resolution: Fixed

> Hama Pipes uses C++ templates
> -----------------------------
>
> Key: HAMA-815
> URL: https://issues.apache.org/jira/browse/HAMA-815
> Project: Hama
> Issue Type: New Feature
> Components: bsp core, pipes
> Affects Versions: 0.6.3
> Reporter: Martin Illecker
> Assignee: Martin Illecker
> Fix For: 0.7.0
>
> Attachments: HAMA-815.patch
>
>
> *Extending Hama Pipes to use C++ templates*
> Currently all messages are converted to *strings* before they are transferred over a socket communication between C++ and Java and vice versa.
> To take advantage of the binary socket communication we will serialize and deserialize basic types like *int, long, float, double* directly without converting to strings. This will minimize the risk of type conversion errors. All other types are still transferred as strings.
> It's also possible to create custom *Writables* that serialize the object to a string and deserialize it again by overriding the following methods (e.g., *PipesVectorWritable* and *PipesKeyValueWritable*):
> {code}
> @Override public void readFields(DataInput in) throws IOException
> @Override public void write(DataOutput out) throws IOException
> {code}
> Hama Streaming, which depends on Hama Pipes, is still using strings.
> The following methods change
> {{virtual void sendMessage(const string& peerName, const string& msg)}}
> {{virtual const string& getCurrentMessage()}}
> {{virtual void write(const string& key, const string& value)}}
> {{virtual bool readNext(string& key, string& value)}}
> to support C++ templates:
> {{virtual void sendMessage(const string& peer_name, *const M& msg*)}}
> {{virtual *M* getCurrentMessage()}}
> {{virtual void write(*const K2& key, const V2& value*)}}
> {{virtual bool readNext(*K1& key, V1& value*)}}
> The *SequenceFile* functions also use templates:
> {{bool sequenceFileReadNext(int32_t file_id, *K& key, V& value*)}}
> {{bool sequenceFileAppend(int32_t file_id, *const K& key, const V& value*)}}
> And the native *Partitioner* supports them as well:
> {code}
> template <class K1, class V1>
> class Partitioner {
> public:
>   virtual int partition(const K1& key, const V1& value, int32_t num_tasks) = 0;
>   virtual ~Partitioner() {}
> };
> {code}
> This will minimize type conversion errors, but it also changes the compilation procedure. Because of the nature of C++ templates, static libraries are no longer possible: the compiler substitutes all templates at compile time.
> The compile command will look like:
> {code}
> g++ -m64 -Ic++/src/main/native/utils/api \
>     -Ic++/src/main/native/pipes/api \
>     -Lc++/target/native \
>     -lhadooputils -lpthread \
>     PROGRAM.cc \
>     -o PROGRAM \
>     -g -Wall -O2
> {code}
> Finally, the job configuration supports the following properties:
> {code}
> <property>
>   <name>bsp.input.format.class</name>
>   <value>org.apache.hama.bsp.KeyValueTextInputFormat</value>
> </property>
> <property>
>   <name>bsp.input.key.class</name>
>   <value>org.apache.hadoop.io.Text</value>
> </property>
> <property>
>   <name>bsp.input.value.class</name>
>   <value>org.apache.hadoop.io.Text</value>
> </property>
> <property>
>   <name>bsp.output.format.class</name>
>   <value>org.apache.hama.bsp.SequenceFileOutputFormat</value>
> </property>
> <property>
>   <name>bsp.output.key.class</name>
>   <value>org.apache.hadoop.io.Text</value>
> </property>
> <property>
>   <name>bsp.output.value.class</name>
>   <value>org.apache.hadoop.io.DoubleWritable</value>
> </property>
> <property>
>   <name>bsp.message.class</name>
>   <value>org.apache.hadoop.io.DoubleWritable</value>
> </property>
> {code}

--
This message was sent by Atlassian JIRA
(v6.1#6144)
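
To illustrate the templated Partitioner interface quoted in the issue description, here is a minimal, self-contained sketch. It does not include the real Hama Pipes headers: the Partitioner declaration is copied from the description, with the template parameter list <class K1, class V1> assumed, and ModuloPartitioner is a hypothetical name chosen only for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Interface as quoted in the issue description. The template parameter
// list <class K1, class V1> is an assumption made for this sketch.
template <class K1, class V1>
class Partitioner {
 public:
  virtual int partition(const K1& key, const V1& value, int32_t num_tasks) = 0;
  virtual ~Partitioner() {}
};

// Hypothetical example (not part of Hama): assigns integer keys to
// tasks by taking the key modulo the number of tasks.
class ModuloPartitioner : public Partitioner<int32_t, std::string> {
 public:
  int partition(const int32_t& key, const std::string& /*value*/,
                int32_t num_tasks) override {
    assert(num_tasks > 0);  // precondition: at least one task
    // In C++ the sign of % follows the dividend, so shift negative
    // remainders back into the range [0, num_tasks).
    int32_t r = key % num_tasks;
    return r < 0 ? r + num_tasks : r;
  }
};
```

Because the key type is fixed to int32_t when the class is declared, passing a key of an incompatible type is rejected by the compiler rather than failing during a string conversion at run time, which is the kind of error the issue aims to reduce. With four tasks, for example, ModuloPartitioner().partition(7, "a", 4) selects task 3.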