From: dkuppitz@apache.org
To: commits@tinkerpop.incubator.apache.org
Reply-To: dev@tinkerpop.apache.org
Date: Fri, 27 May 2016 18:34:07 -0000
In-Reply-To: <46106f1ef53f4e92b3d902ac7b4ce887@git.apache.org>
Subject: [18/20] incubator-tinkerpop git commit: Added script and configuration file that can be used for OLAP CSV exports.

Added script and configuration file that can be used for OLAP CSV exports.

Project: http://git-wip-us.apache.org/repos/asf/incubator-tinkerpop/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-tinkerpop/commit/e13f91aa
Tree: http://git-wip-us.apache.org/repos/asf/incubator-tinkerpop/tree/e13f91aa
Diff: http://git-wip-us.apache.org/repos/asf/incubator-tinkerpop/diff/e13f91aa

Branch: refs/heads/TINKERPOP-1298
Commit: e13f91aacddd4cb5722ac3a6e100fdf70b0c33fd
Parents: 8ad5b62
Author: Daniel Kuppitz
Authored: Mon May 23 18:24:22 2016 +0200
Committer: Daniel Kuppitz
Committed: Tue May 24 19:51:22 2016 +0200

----------------------------------------------------------------------
 data/script-csv-export.groovy                   | 43 +++++++++++++++
 .../conf/hadoop-csv-export.properties           | 56 ++++++++++++++++++++
 2 files changed, 99 insertions(+)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/incubator-tinkerpop/blob/e13f91aa/data/script-csv-export.groovy
----------------------------------------------------------------------
diff --git a/data/script-csv-export.groovy b/data/script-csv-export.groovy
new file mode 100644
index 0000000..7a6da22
--- /dev/null
+++ b/data/script-csv-export.groovy
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+@Grab(group = 'com.opencsv', module = 'opencsv', version = '3.7')
+import com.opencsv.*
+
+import org.apache.tinkerpop.gremlin.process.computer.bulkdumping.BulkExportVertexProgram
+
+def stringify(vertex) {
+    def result = null
+    def haltedTraversers = vertex.property(TraversalVertexProgram.HALTED_TRAVERSERS)
+    if (haltedTraversers.isPresent()) {
+        def properties = vertex.value(BulkExportVertexProgram.BULK_EXPORT_PROPERTIES).split("\1")*.split("\2", 2)*.toList()
+        def writer = new StringWriter()
+        def w = new CSVWriter(writer)
+        haltedTraversers.value().each { def t ->
+            def values = []
+            properties.each { def property, def format ->
+                def value = t.path(property)
+                values << (format.isEmpty() ? value.toString() : String.format(format, value))
+            }
+            w.writeNext((String[]) values, false)
+        }
+        result = writer.toString().trim()
+        writer.close()
+    }
+    return result
+}

http://git-wip-us.apache.org/repos/asf/incubator-tinkerpop/blob/e13f91aa/hadoop-gremlin/conf/hadoop-csv-export.properties
----------------------------------------------------------------------
diff --git a/hadoop-gremlin/conf/hadoop-csv-export.properties b/hadoop-gremlin/conf/hadoop-csv-export.properties
new file mode 100644
index 0000000..3e1f8da
--- /dev/null
+++ b/hadoop-gremlin/conf/hadoop-csv-export.properties
@@ -0,0 +1,56 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
+gremlin.hadoop.graphReader=org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoInputFormat
+gremlin.hadoop.graphWriter=org.apache.tinkerpop.gremlin.hadoop.structure.io.script.ScriptOutputFormat
+gremlin.hadoop.jarsInDistributedCache=true
+gremlin.hadoop.defaultGraphComputer=org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer
+
+gremlin.hadoop.inputLocation=output
+gremlin.hadoop.scriptOutputFormat.script=script-csv-export.groovy
+gremlin.hadoop.outputLocation=export
+
+####################################
+# SparkGraphComputer Configuration #
+####################################
+spark.master=local[4]
+spark.executor.memory=1g
+spark.serializer=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoSerializer
+# spark.kryo.registrationRequired=true
+# spark.storage.memoryFraction=0.2
+# spark.eventLog.enabled=true
+# spark.eventLog.dir=/tmp/spark-event-logs
+# spark.ui.killEnabled=true
+
+#####################################
+# GiraphGraphComputer Configuration #
+#####################################
+giraph.minWorkers=2
+giraph.maxWorkers=2
+giraph.useOutOfCoreGraph=true
+giraph.useOutOfCoreMessages=true
+mapreduce.map.java.opts=-Xmx1024m
+mapreduce.reduce.java.opts=-Xmx1024m
+giraph.numInputThreads=2
+giraph.numComputeThreads=2
+# giraph.maxPartitionsInMemory=1
+# giraph.userPartitionCount=2
+## MapReduce of GiraphGraphComputer ##
+# mapreduce.job.maps=2
+# mapreduce.job.reduces=1
+
+
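
For readers of the Groovy script above, the following is a minimal Java sketch of how stringify() appears to decode the value stored under BulkExportVertexProgram.BULK_EXPORT_PROPERTIES. It assumes, based only on the split("\1") and split("\2", 2) calls in the script, that entries are separated by the control character U+0001 and that each entry holds a key and an optional format string separated by U+0002. The class name CsvExportSpecDemo, the helpers parseSpec and render, and the sample keys are hypothetical and not part of TinkerPop. Locale.ROOT is used here only to keep the output deterministic; the script itself calls String.format(format, value).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class CsvExportSpecDemo {

    // Assumed encoding (inferred from the script, not from TinkerPop docs):
    // entries are separated by U+0001; each entry is key U+0002 format.
    static List<String[]> parseSpec(String spec) {
        List<String[]> entries = new ArrayList<>();
        for (String entry : spec.split("\u0001")) {
            // A limit of 2 keeps a trailing empty format string, so every
            // entry yields exactly two elements: {key, format}.
            entries.add(entry.split("\u0002", 2));
        }
        return entries;
    }

    // Mirrors the ternary in the script: an empty format falls back to
    // the plain string form of the value.
    static String render(String format, Object value) {
        return format.isEmpty()
                ? String.valueOf(value)
                : String.format(Locale.ROOT, format, value);
    }

    public static void main(String[] args) {
        // Hypothetical spec: "name" has no format, "age" is zero-padded.
        String spec = "name\u0002" + "\u0001" + "age\u0002%03d";
        Object[] sampleValues = {"marko", 29};

        List<String[]> entries = parseSpec(spec);
        for (int i = 0; i < entries.size(); i++) {
            String[] entry = entries.get(i);
            System.out.println(entry[0] + " -> " + render(entry[1], sampleValues[i]));
        }
    }
}
```

The limit argument matters in this sketch: without it, Java's String.split drops trailing empty strings, so an entry with no format would come back as a single element and the two-parameter closure in the script (property, format) would have nothing to bind to format.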