spark-user mailing list archives

From "lk_spark"<lk_sp...@163.com>
Subject how to add column to dataframe
Date Tue, 06 Dec 2016 09:35:51 GMT
Hi all,
   My Spark version is 2.0.
   I have a Parquet file with a single column named url, of type string. I want to extract a substring from
the url and add it to the DataFrame as a new column:
   val df = spark.read.parquet("/parquetdata/weixin/page/month=201607")
   val df2 = df.withColumn("pa_bid",when($"url".isNull,col("url").substr(3, 5)))
   df2.select("pa_bid","url").show
+------+--------------------+
|pa_bid|                 url|
+------+--------------------+
|  null|http://mp.weixin....|
|  null|http://mp.weixin....|
|  null|http://mp.weixin....|
|  null|http://mp.weixin....|
|  null|http://mp.weixin....|
|  null|http://mp.weixin....|
|  null|http://mp.weixin....|
|  null|http://mp.weixin....|

Why is pa_bid null for every row?
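
One possible explanation, offered as a sketch rather than a confirmed diagnosis: when(condition, value) without an otherwise(...) clause returns null for every row where the condition is false. The condition above is $"url".isNull, which is false for all the rows shown, and for any row where it is true the substring of a null url would be null as well. Assuming the intent was to take the substring only when url is present, the check would be isNotNull. The local session, the app name, and the two sample rows below (including the example.com URL) are made up for illustration and are not taken from the real data; the snippet is meant to be pasted into spark-shell or run as a Scala script.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, when}

// Assumption: a local session and two made-up rows stand in for the real parquet data.
val spark = SparkSession.builder().master("local[*]").appName("pa_bid_sketch").getOrCreate()
import spark.implicits._

// One row with a (hypothetical) url and one row with a null url.
val df = Seq("http://example.com/page", null).toDF("url")

// when(...) without otherwise(...) yields null wherever the condition is false.
// Testing isNotNull (instead of isNull) applies substr to the rows that do have a url.
val df2 = df.withColumn("pa_bid", when(col("url").isNotNull, col("url").substr(3, 5)))

df2.select("pa_bid", "url").show()
```

Note that substr(3, 5) uses a 1-based start position, so it takes five characters beginning at the third character of url. If a default value other than null is wanted for rows without a url, an .otherwise(...) clause can be chained onto when(...).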


2016-12-06


lk_spark 