spark-user mailing list archives

From Deepak Sharma <deepakmc...@gmail.com>
Subject Re: How to deal with string column data for spark mlib?
Date Tue, 20 Dec 2016 09:45:18 GMT
You can read the source into a DataFrame.
Then iterate over the rows with map, converting each column, something like below:
df.map(x => x(0).toString.toDouble)
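One caveat: values like "aaa" in the sample data are categorical rather than numeric, so .toDouble would throw a NumberFormatException on them. For categorical strings the usual approach is to map each distinct value to a numeric index, which is what Spark ML's StringIndexer does. A minimal sketch of that idea in plain Scala (the sample values and variable names are hypothetical, and a real pipeline would use StringIndexer on the DataFrame itself):

```scala
// Sample categorical column values, standing in for one DataFrame column.
val col1 = Seq("aaa", "aa2", "aaa", "aa3")

// Build a value -> index mapping, analogous to StringIndexer's fitted labels.
// (StringIndexer orders labels by frequency; here we use order of appearance.)
val index: Map[String, Double] =
  col1.distinct.zipWithIndex.map { case (v, i) => (v, i.toDouble) }.toMap

// Encode the column as doubles, suitable as MLlib feature input.
val encoded: Seq[Double] = col1.map(index)

println(encoded)  // List(0.0, 1.0, 0.0, 2.0)
```

If the strings really are numeric ("1.5", "42"), the plain .toDouble conversion above is enough; only categorical columns need indexing.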

Thanks
Deepak

On Tue, Dec 20, 2016 at 3:05 PM, big data <bigdatabase@outlook.com> wrote:

> Our source data is string-based, like this:
> col1   col2   col3 ...
> aaa   bbb    ccc
> aa2   bb2    cc2
> aa3   bb3    cc3
> ...     ...       ...
>
> How can we convert all of this data to double so it can be used with MLlib's algorithms?
>
> thanks.
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscribe@spark.apache.org
>
>


-- 
Thanks
Deepak
www.bigdatabig.com
www.keosha.net
