From: Kina Winoto
To: user@accumulo.apache.org
Date: Wed, 15 Jul 2015 09:20:54 -0700
Subject: AccumuloInputFormat with pyspark?

Has anyone used the Python Spark API with AccumuloInputFormat?

Using AccumuloInputFormat in Scala and Java within Spark is straightforward, but the Python Spark API's newAPIHadoopRDD function takes its configuration via a Python dict (https://spark.apache.org/docs/1.1.0/api/python/pyspark.context.SparkContext-class.html#newAPIHadoopRDD), and there isn't an obvious set of job configuration keys to use. From looking at the Accumulo source, it seems job configuration values are stored under keys that are Java enums, and it's unclear to me what to use for the configuration keys in my Python dict.

Any thoughts as to how to do this would be helpful!

Thanks,

Kina
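For reference, a minimal sketch of the call shape being asked about, assuming a live SparkContext `sc`. The Accumulo configuration key strings below are hypothetical placeholders, not the real ones: Accumulo derives its keys from Java enum names internally, so the actual strings would have to be confirmed against the Accumulo source (for example by dumping the Hadoop Configuration that a working Java/Scala job produces).

```python
# Sketch of driving AccumuloInputFormat from pyspark's newAPIHadoopRDD.
# NOTE: the key names in build_accumulo_conf are HYPOTHETICAL placeholders,
# shown only to illustrate the flat string->string shape newAPIHadoopRDD
# expects; verify the real key strings against the Accumulo source.

def build_accumulo_conf(instance, zookeepers, user, token, table):
    """Build the flat str->str dict that newAPIHadoopRDD takes as `conf`."""
    return {
        # placeholder key names -- not confirmed Accumulo keys
        "AccumuloInputFormat.instanceName": instance,
        "AccumuloInputFormat.zooKeepers": zookeepers,
        "AccumuloInputFormat.principal": user,
        "AccumuloInputFormat.token": token,
        "AccumuloInputFormat.inputTableName": table,
    }

conf = build_accumulo_conf("dev", "zk1:2181", "root", "secret", "mytable")

# With a live SparkContext `sc`, the invocation would then look like:
# rdd = sc.newAPIHadoopRDD(
#     "org.apache.accumulo.core.client.mapreduce.AccumuloInputFormat",
#     "org.apache.accumulo.core.data.Key",
#     "org.apache.accumulo.core.data.Value",
#     conf=conf)

print(sorted(conf))
```

Every entry must be a plain string on both sides, since pyspark converts the dict into a Hadoop Configuration on the JVM side.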