spark-dev mailing list archives

From "" <>
Subject appendix
Date Wed, 21 Jun 2017 02:19:11 GMT
My scenario is as follows:
        1. val df = hivecontext/carboncontext.sql("sql....")
        2. Iterate over the rows, extract two columns, id and mvcc, and use id as the key to scan HBase
to get the corresponding value.
            If mvcc == value, the row passes; otherwise it is dropped.
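
To make step 2 concrete, here is a minimal sketch of the per-partition check in plain Scala, with no Spark or HBase dependency. The names `MvccFilter`, `filterPartition` and `batchLookup` are hypothetical, mvcc is assumed to be a Long, and batching the lookups is an assumption made for illustration rather than something the steps above require.

```scala
// Sketch only: plain Scala collections stand in for a Spark partition and for HBase.
// All names here are hypothetical and not part of Spark, CarbonData or HBase.
object MvccFilter {
  // A row reduced to the two columns of interest: (id, mvcc). mvcc is assumed to be a Long.
  type IdMvcc = (String, Long)

  // Keeps only the rows whose mvcc equals the value stored under the same id.
  // batchLookup receives a batch of ids and returns the stored value for each id it found;
  // ids missing from the result are treated as "no match" and the row is dropped.
  def filterPartition(
      rows: Iterator[IdMvcc],
      batchLookup: Seq[String] => Map[String, Long],
      batchSize: Int = 1000): Iterator[IdMvcc] =
    rows.grouped(batchSize).flatMap { batch =>
      // One lookup call per batch instead of one per row.
      val stored = batchLookup(batch.map(_._1))
      batch.filter { case (id, mvcc) => stored.get(id).contains(mvcc) }
    }
}
```

In a Spark job a function of this shape could be passed to mapPartitions on the (id, mvcc) pairs, with batchLookup backed by an HBase multi-get opened once per partition. That wiring is not shown here and would need to be checked against the HBase client version in use.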
Is there a better way than dataframe.mapPartitions? It causes an extra stage and takes
more time.
I have put two DAGs in the appendix; please take a look.
