accumulo-user mailing list archives

From Kina Winoto <winoto.kin...@gmail.com>
Subject AccumuloInputFormat with pyspark?
Date Wed, 15 Jul 2015 16:20:54 GMT
Has anyone used the Python Spark API with AccumuloInputFormat?

Using AccumuloInputFormat from Scala or Java within Spark is
straightforward, but the Python Spark API's newAPIHadoopRDD function takes
its configuration as a Python dict (
https://spark.apache.org/docs/1.1.0/api/python/pyspark.context.SparkContext-class.html#newAPIHadoopRDD)
and there isn't an obvious set of job configuration keys to use. From
looking at the Accumulo source, job configuration values are stored under
keys derived from Java enums, and it's unclear to me what strings to use
as keys in my Python dict.
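
For concreteness, here is a minimal sketch of what I'm attempting. The
conf keys are only my guesses from reading InputConfigurator and
ConfiguratorBase in the Accumulo source (enumToConfKey() looks like it
builds "<InputFormatSimpleName>.<EnumClassSimpleName>.<camelizedEnumName>"),
and the instance/table/credential values are placeholders:

from pyspark import SparkContext

sc = SparkContext(appName="accumulo-pyspark-test")

# Guessed keys -- e.g. ConnectorInfo.IS_CONFIGURED would become
# "AccumuloInputFormat.ConnectorInfo.isConfigured" if I'm reading
# enumToConfKey() correctly. Please correct me if the real keys differ.
conf = {
    "AccumuloInputFormat.ConnectorInfo.isConfigured": "true",
    "AccumuloInputFormat.ConnectorInfo.principal": "root",
    # I suspect TOKEN wants the serialized form that
    # ConfiguratorBase.setConnectorInfo() writes, not a raw password?
    "AccumuloInputFormat.ConnectorInfo.token": "...",
    "AccumuloInputFormat.InstanceOpts.type": "ZooKeeperInstance",
    "AccumuloInputFormat.InstanceOpts.name": "myinstance",
    "AccumuloInputFormat.InstanceOpts.zooKeepers": "zkhost:2181",
    "AccumuloInputFormat.ScanOpts.tableName": "mytable",
}

rdd = sc.newAPIHadoopRDD(
    "org.apache.accumulo.core.client.mapreduce.AccumuloInputFormat",
    "org.apache.accumulo.core.data.Key",
    "org.apache.accumulo.core.data.Value",
    conf=conf,
)
print(rdd.first())

(This assumes the Accumulo client jars are on the classpath, e.g. via
spark-submit --jars. I'm also not sure whether Key and Value come back
usable without custom key/value converters.)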

Any thoughts as to how to do this would be helpful!

Thanks,

Kina
