From: Kina Winoto
To: user@accumulo.apache.org
Date: Wed, 15 Jul 2015 09:20:54 -0700
Subject: AccumuloInputFormat with pyspark?

Has anyone used the Python Spark API with AccumuloInputFormat?

Using AccumuloInputFormat in Scala and Java within Spark is straightforward, but the Python Spark API's newAPIHadoopRDD function takes its configuration via a Python dict (https://spark.apache.org/docs/1.1.0/api/python/pyspark.context.SparkContext-class.html#newAPIHadoopRDD), and there isn't an obvious set of job configuration keys to use. From looking at the Accumulo source, it seems job configuration values are stored under keys that are Java enums, and it's unclear to me what to use for the configuration keys in my Python dict.

Any thoughts as to how to do this would be helpful!

Thanks,

Kina
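For reference, a minimal sketch of the call shape being asked about, assuming a live SparkContext `sc`. The Accumulo configuration key strings below are hypothetical placeholders, not the real ones: Accumulo derives its keys from Java enum names internally, so the actual strings would have to be confirmed against the Accumulo source (for example by dumping the Hadoop Configuration that a working Java/Scala job produces).

```python
# Sketch of driving AccumuloInputFormat from pyspark's newAPIHadoopRDD.
# NOTE: the key names in build_accumulo_conf are HYPOTHETICAL placeholders,
# shown only to illustrate the flat string->string shape newAPIHadoopRDD
# expects; verify the real key strings against the Accumulo source.

def build_accumulo_conf(instance, zookeepers, user, token, table):
    """Build the flat str->str dict that newAPIHadoopRDD takes as `conf`."""
    return {
        # placeholder key names -- not confirmed Accumulo keys
        "AccumuloInputFormat.instanceName": instance,
        "AccumuloInputFormat.zooKeepers": zookeepers,
        "AccumuloInputFormat.principal": user,
        "AccumuloInputFormat.token": token,
        "AccumuloInputFormat.inputTableName": table,
    }

conf = build_accumulo_conf("dev", "zk1:2181", "root", "secret", "mytable")

# With a live SparkContext `sc`, the invocation would then look like:
# rdd = sc.newAPIHadoopRDD(
#     "org.apache.accumulo.core.client.mapreduce.AccumuloInputFormat",
#     "org.apache.accumulo.core.data.Key",
#     "org.apache.accumulo.core.data.Value",
#     conf=conf)

print(sorted(conf))
```

Every entry must be a plain string on both sides, since pyspark converts the dict into a Hadoop Configuration on the JVM side.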