Date: Wed, 31 Jul 2013 19:41:48 +0000 (UTC)
From: "Hudson (JIRA)"
To: giraph-dev@incubator.apache.org
Reply-To: dev@giraph.apache.org
Subject: [jira] [Commented] (GIRAPH-717) HiveJythonRunner with support for pure Jython value types.

    [ https://issues.apache.org/jira/browse/GIRAPH-717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13725641#comment-13725641 ]

Hudson commented on GIRAPH-717:
-------------------------------

SUCCESS: Integrated in Giraph-trunk-Commit #1215 (See [https://builds.apache.org/job/Giraph-trunk-Commit/1215/])
GIRAPH-717: HiveJythonRunner with support for pure Jython value types (nitay) (nitay: http://git-wip-us.apache.org/repos/asf?p=giraph.git&a=commit&h=d419f8f4f84723e1eadce96168414de9f3ce5677)
* findbugs-exclude.xml
* giraph-core/src/main/java/org/apache/giraph/conf/GiraphClasses.java
* giraph-core/src/main/java/org/apache/giraph/factories/EdgeValueFactory.java
* giraph-core/src/main/java/org/apache/giraph/types/DoubleToDoubleWritableWrapper.java
* giraph-core/src/main/java/org/apache/giraph/types/LongToLongWritableWrapper.java
* giraph-hive/src/test/java/org/apache/giraph/hive/GiraphHiveTestBase.java
* giraph-core/src/main/java/org/apache/giraph/jython/JythonUtils.java
* giraph-core/src/test/java/org/apache/giraph/master/TestComputationCombinerTypes.java
* giraph-core/src/main/java/org/apache/giraph/jython/JythonComputationFactory.java
* giraph-hive/src/main/java/org/apache/giraph/hive/jython/JythonHiveToVertex.java
* giraph-core/src/main/java/org/apache/giraph/utils/ReflectionUtils.java
* giraph-hive/src/main/java/org/apache/giraph/hive/types/TypedValueWriter.java
* giraph-hive/src/main/java/org/apache/giraph/hive/types/HiveVertexIdWriter.java
* giraph-core/src/main/java/org/apache/giraph/factories/DefaultEdgeValueFactory.java
* giraph-hive/src/test/resources/org/apache/giraph/jython/fake-label-propagation-worker.py
* giraph-hive/src/main/java/org/apache/giraph/hive/jython/JythonVertexToHive.java
* giraph-hive/src/main/java/org/apache/giraph/hive/primitives/PrimitiveValueReader.java
* giraph-core/src/main/java/org/apache/giraph/jython/factories/JythonVertexValueFactory.java
* giraph-hive/src/test/resources/org/apache/giraph/jython/count-edges-launcher.py
* giraph-core/src/main/java/org/apache/giraph/factories/MessageValueFactory.java
* giraph-hive/src/main/java/org/apache/giraph/hive/input/vertex/AbstractHiveToVertex.java
* giraph-hive/src/main/java/org/apache/giraph/hive/types/package-info.java
* giraph-hive/src/test/java/org/apache/giraph/hive/Helpers.java
* giraph-core/src/main/java/org/apache/giraph/jython/JythonComputation.java
* giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java
* giraph-core/src/main/java/org/apache/giraph/types/ByteToIntWritableWrapper.java
* giraph-hive/src/test/java/org/apache/giraph/hive/jython/TestHiveJythonPrimitives.java
* giraph-core/src/main/java/org/apache/giraph/conf/PerGraphTypeBooleanConfOption.java
* giraph-core/src/test/java/org/apache/giraph/io/TestEdgeInput.java
* giraph-hive/src/main/java/org/apache/giraph/hive/input/edge/TypedHiveToEdge.java
* giraph-core/src/main/java/org/apache/giraph/jython/wrappers/JythonWrapperBase.java
* giraph-core/src/main/java/org/apache/giraph/jython/factories/JythonFactoryBase.java
* giraph-hive/src/test/java/org/apache/giraph/hive/jython/TestJythonLabelInfluence.java
* giraph-core/src/main/java/org/apache/giraph/comm/messages/out_of_core/SequentialFileMessageStore.java
* giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java
* giraph-hive/src/main/java/org/apache/giraph/hive/input/vertex/TypedHiveToVertex.java
* giraph-core/src/main/java/org/apache/giraph/comm/messages/ByteArrayMessagesPerVertexStore.java
* giraph-core/src/main/java/org/apache/giraph/factories/ValueFactory.java
* pom.xml
* giraph-hive/src/test/java/org/apache/giraph/hive/input/HiveEdgeInputTest.java
* giraph-core/src/main/java/org/apache/giraph/types/ShortToLongWritableWrapper.java
* giraph-core/src/main/java/org/apache/giraph/jython/JythonGiraphComputation.java
* giraph-core/src/main/java/org/apache/giraph/comm/messages/InMemoryMessageStoreFactory.java
* giraph-core/src/main/java/org/apache/giraph/utils/ByteArrayVertexIdMessages.java
* giraph-hive/src/main/java/org/apache/giraph/hive/types/HiveValueWriter.java
* giraph-hive/src/main/java/org/apache/giraph/hive/types/HiveValueReader.java
* giraph-hive/src/main/java/org/apache/giraph/hive/primitives/package-info.java
* giraph-core/src/main/java/org/apache/giraph/comm/messages/OneMessagePerVertexStore.java
* giraph-core/src/test/java/org/apache/giraph/jython/TestJythonWritableWrapper.java
* giraph-hive/src/main/java/org/apache/giraph/hive/jython/package-info.java
* giraph-examples/src/test/java/org/apache/giraph/TestBspBasic.java
* giraph-core/src/main/java/org/apache/giraph/jython/factories/package-info.java
* giraph-core/src/main/java/org/apache/giraph/factories/VertexValueFactory.java
* giraph-hive/src/main/java/org/apache/giraph/hive/jython/JythonHiveWriter.java
* giraph-core/src/main/java/org/apache/giraph/graph/GraphType.java
* giraph-core/src/main/java/org/apache/giraph/conf/PerGraphTypeBoolean.java
* giraph-core/src/main/java/org/apache/giraph/jython/factories/JythonMessageValueFactory.java
* giraph-core/src/main/java/org/apache/giraph/types/BooleanToBooleanWritableWrapper.java
* giraph-core/src/main/java/org/apache/giraph/types/ByteToByteWritableWrapper.java
* giraph-core/src/main/java/org/apache/giraph/jython/JythonJob.java
* giraph-core/src/main/java/org/apache/giraph/factories/AbstractMessageValueFactory.java
* giraph-core/src/test/resources/org/apache/giraph/jython/count-edges.py
* giraph-hive/src/main/java/org/apache/giraph/hive/types/TypedVertexIdReader.java
* giraph-core/src/main/java/org/apache/giraph/factories/TestMessageValueFactory.java
* giraph-hive/src/main/java/org/apache/giraph/hive/common/GiraphHiveConstants.java
* giraph-core/src/main/java/org/apache/giraph/types/FloatToDoubleWritableWrapper.java
* giraph-examples/src/test/java/org/apache/giraph/vertex/TestComputationTypes.java
* giraph-hive/src/test/java/org/apache/giraph/hive/jython/TestHiveJythonComplexTypes.java
* giraph-hive/src/main/java/org/apache/giraph/hive/column/package-info.java
* giraph-core/src/main/java/org/apache/giraph/types/IntToLongWritableWrapper.java
* giraph-hive/src/main/java/org/apache/giraph/hive/types/TypedVertexIdWriter.java
* giraph-hive/src/main/java/org/apache/giraph/hive/column/HiveReadableColumn.java
* giraph-core/src/main/java/org/apache/giraph/graph/AbstractComputation.java
* giraph-core/src/main/java/org/apache/giraph/master/MasterCompute.java
* giraph-core/src/test/java/org/apache/giraph/master/TestSwitchClasses.java
* giraph-core/src/main/java/org/apache/giraph/jython/JythonOptions.java
* giraph-hive/src/main/java/org/apache/giraph/hive/types/HiveVertexIdReader.java
* giraph-core/src/main/java/org/apache/giraph/jython/factories/JythonOutgoingMessageValueFactory.java
* giraph-core/src/main/java/org/apache/giraph/jython/factories/JythonVertexIdFactory.java
* giraph-hive/src/main/java/org/apache/giraph/hive/types/TypedValueReader.java
* giraph-core/src/main/java/org/apache/giraph/conf/PerGraphTypeEnumConfOption.java
* giraph-hive/src/main/java/org/apache/giraph/hive/input/edge/AbstractHiveToEdge.java
* giraph-core/src/main/java/org/apache/giraph/scripting/ScriptLoader.java
* giraph-core/src/main/java/org/apache/giraph/types/FloatToFloatWritableWrapper.java
* giraph-core/src/main/java/org/apache/giraph/jython/wrappers/package-info.java
* giraph-hive/src/main/java/org/apache/giraph/hive/values/package-info.java
* giraph-hive/src/main/java/org/apache/giraph/hive/jython/JythonHiveReader.java
* giraph-hive/src/test/java/org/apache/giraph/hive/input/HiveVertexInputTest.java
* giraph-hive/src/main/java/org/apache/giraph/hive/jython/JythonHiveToEdge.java
* giraph-hive/src/main/java/org/apache/giraph/hive/common/LanguageAndType.java
* giraph-hive/src/main/java/org/apache/giraph/hive/values/HiveValueWriter.java
* giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java
* giraph-core/src/main/java/org/apache/giraph/utils/ByteArrayOneToAllMessages.java
* giraph-hive/src/main/java/org/apache/giraph/hive/jython/HiveJythonUtils.java
* giraph-hive/src/main/java/org/apache/giraph/hive/common/HiveUtils.java
* giraph-hive/src/main/java/org/apache/giraph/hive/primitives/PrimitiveValueWriter.java
* giraph-hive/src/test/resources/org/apache/giraph/jython/fake-label-propagation-launcher.py
* giraph-hive/src/main/java/org/apache/giraph/hive/output/TypedVertexToHive.java
* giraph-core/src/main/java/org/apache/giraph/factories/ValueFactoryBase.java
* giraph-core/src/main/java/org/apache/giraph/types/WritableWrapper.java
* giraph-core/src/main/java/org/apache/giraph/factories/DefaultVertexValueFactory.java
* giraph-core/src/main/java/org/apache/giraph/graph/Computation.java
* giraph-hive/src/test/java/org/apache/giraph/hive/output/HiveOutputTest.java
* giraph-core/src/main/java/org/apache/giraph/jython/factories/JythonEdgeValueFactory.java
* giraph-core/src/main/java/org/apache/giraph/conf/PerGraphTypeEnum.java
* giraph-core/src/main/java/org/apache/giraph/factories/DefaultVertexIdFactory.java
* giraph-core/src/main/java/org/apache/giraph/graph/BasicComputation.java
* giraph-hive/src/main/java/org/apache/giraph/hive/values/HiveValueReader.java
* giraph-core/src/main/java/org/apache/giraph/types/ByteToLongWritableWrapper.java
* giraph-hcatalog/src/main/java/org/apache/giraph/io/hcatalog/HCatGiraphRunner.java
* giraph-core/src/main/java/org/apache/giraph/comm/messages/MessagesIterable.java
* giraph-core/src/main/java/org/apache/giraph/types/ShortToIntWritableWrapper.java
* giraph-core/pom.xml
* giraph-hive/src/main/java/org/apache/giraph/hive/column/HiveWritableColumn.java
* CHANGELOG
* giraph-core/src/test/java/org/apache/giraph/jython/TestJythonBasic.java
* giraph-core/src/main/java/org/apache/giraph/jython/wrappers/JythonWritableWrapper.java
* giraph-hive/src/main/java/org/apache/giraph/hive/jython/HiveJythonRunner.java
* giraph-core/src/main/java/org/apache/giraph/types/IntToIntWritableWrapper.java
* giraph-core/src/test/java/org/apache/giraph/utils/TestReflectionUtils.java
* giraph-core/src/test/java/org/apache/giraph/jython/TestJython.java
* giraph-core/src/test/java/org/apache/giraph/jython/TestJythonComputation.java
* giraph-hive/src/main/java/org/apache/giraph/hive/jython/JythonColumnWriter.java
* giraph-core/src/main/java/org/apache/giraph/jython/factories/JythonIncomingMessageValueFactory.java
* giraph-core/src/main/java/org/apache/giraph/factories/VertexIdFactory.java
* giraph-hive/src/main/java/org/apache/giraph/hive/jython/JythonColumnReader.java
* giraph-hive/src/main/java/org/apache/giraph/hive/jython/JythonReadableColumn.java
* giraph-hive/src/main/java/org/apache/giraph/hive/jython/JythonHiveIO.java
* giraph-core/src/main/java/org/apache/giraph/utils/ConfigurationUtils.java
* giraph-core/src/main/java/org/apache/giraph/jython/factories/JythonComputationFactory.java


> HiveJythonRunner with support for pure Jython value types.
> ----------------------------------------------------------
>
> Key: GIRAPH-717
> URL: https://issues.apache.org/jira/browse/GIRAPH-717
> Project: Giraph
> Issue Type: Bug
> Reporter: Nitay Joffe
> Assignee: Nitay Joffe
>
> This adds support for pure Jython jobs. Currently this runner is hooked up to work with Hive. I'll make it more generic later.
> Running a Jython job is simply:
> HIVE_HOME=
> HADOOP_HOME=
> $HIVE_HOME/bin/hive --service jar org.apache.giraph.hive.jython.HiveJythonRunner jython1.py [jython2.py] ...
> You can pass in any number of scripts.
> They will be parsed in order and sent to all the workers using DistributedCache.
> There are examples and tests in the diff. Here is one example:
> launcher: https://gist.github.com/nitay/a62e0a5d369a5e701fa3
> worker: https://gist.github.com/nitay/7834fd2b059527e65a36
> There are a few pieces to a Jython job; I'll go over each part here.
> The HiveJythonRunner will call a function called "prepare(job)" from the Jython scripts. This is the entry point for configuring your job.
> In this function you set up everything: your graph types (those IVEMM writables) and the Hive vertex/edge inputs and output. Each graph type is one of the following:
> 1) A Java type. For example, the user can simply specify IntWritable.
> 2) A Jython type that implements Writable. In the example above the message value implements Writable.
> 3) A pure Jython type. The Java code will wrap these objects in a Writable wrapper that serializes Jython values using Pickle (the Jython serialization module).
> Your computation must implement JythonComputation. Note that JythonComputation does not actually implement Computation; it is a separate class so that we can wrap all the types passed in with a wrapper that implements Writable. The methods are named the same so that the user does not notice anything.
> For Hive usage: if your value type is a primitive, e.g. IntWritable or LongWritable, then you need not do anything. The Java code will automatically read/write the Hive table specified and convert between Hive types and the primitive Writable. The vertex_id type in the example works like this.
> If your value is a custom Jython type, you must create classes which implement JythonHiveReader/JythonHiveWriter (or JythonHiveIO, which is both). These objects read/write Jython types from Hive. There are wrappers in the Java code which take HiveIO data normally used in giraph-hive and turn it into Jython types.
> This means, for example, that getMap() will return a Jython dictionary instead of a Java Map.
> There is also a PageRankBenchmark (from previous diff) implemented in Jython. Here's a run for comparison / sanity check:
> PageRankBenchmark with 10 workers, 100M vertices, 10B edges, 10 compute threads
> trunk:
> https://gist.github.com/nitay/3170fa3b575d4d2e22a9
> total time: 302466
> with this diff:
> https://gist.github.com/nitay/a52b6d1d64e50ab9829e
> total time: 306517
> in jython:
> https://gist.github.com/nitay/3f2e758b2933c3521727
> total time: 434730
> So we see that existing things are not affected (is there something else I should test?) and that Jython has around 40% overhead.
> ReviewBoard: https://reviews.apache.org/r/12543/ (Sorry it's a big one, hard to split up :/)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
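
The third graph-type option in the quoted description (a pure Jython value wrapped in a Writable that serializes through Pickle) can be pictured with a small sketch. This is not Giraph's JythonWritableWrapper: PickleWritableSketch, write, read_fields and round_trip are hypothetical names, and the 4-byte length prefix is an assumption made only for illustration. It runs on plain Python with the standard library.

```python
import io
import pickle
import struct


class PickleWritableSketch(object):
    """Hypothetical stand-in for a Writable wrapper around a pure Python value."""

    def __init__(self, value=None):
        self.value = value

    def write(self, out):
        # Pickle the wrapped value, then emit a length prefix (an assumption
        # of this sketch) so a reader knows how many bytes belong to it.
        payload = pickle.dumps(self.value, 2)
        out.write(struct.pack(">i", len(payload)))
        out.write(payload)

    def read_fields(self, inp):
        # Read the 4-byte big-endian length, then unpickle exactly that many bytes.
        (length,) = struct.unpack(">i", inp.read(4))
        self.value = pickle.loads(inp.read(length))


def round_trip(value):
    """Serialize a value through the wrapper and read it back from the bytes."""
    buf = io.BytesIO()
    PickleWritableSketch(value).write(buf)
    buf.seek(0)
    copy = PickleWritableSketch()
    copy.read_fields(buf)
    return copy.value


if __name__ == "__main__":
    print(round_trip({"label": "a", "weight": 0.5}))
```

The point of the wrapper shape is that user code only ever handles the plain value, while the framework side sees an object with write/read methods. How the real wrapper frames its bytes is not stated in the message above.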
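
The "around 40% overhead" figure in the quoted benchmark can be checked directly from the three total times. The sketch below only does that arithmetic; relative_overhead and the constant names are introduced here for illustration, and the times are used in whatever unit the runs reported.

```python
def relative_overhead(baseline, measured):
    """Return the fractional slowdown of measured relative to baseline."""
    return (measured - baseline) / float(baseline)


# Total times copied from the quoted benchmark runs.
TRUNK = 302466
WITH_DIFF = 306517
IN_JYTHON = 434730

diff_vs_trunk = relative_overhead(TRUNK, WITH_DIFF)
jython_vs_diff = relative_overhead(WITH_DIFF, IN_JYTHON)
jython_vs_trunk = relative_overhead(TRUNK, IN_JYTHON)

if __name__ == "__main__":
    print("diff vs trunk:   %.1f%%" % (100 * diff_vs_trunk))
    print("jython vs diff:  %.1f%%" % (100 * jython_vs_diff))
    print("jython vs trunk: %.1f%%" % (100 * jython_vs_trunk))
```

With these inputs the Java path with the diff is about 1.3% slower than trunk, and the Jython run is about 41.8% slower than the Java run with the diff (about 43.7% slower than trunk), which matches the rough 40% quoted. These are single runs, so small differences should be read with that in mind.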