spark-user mailing list archives

From Davies Liu <dav...@databricks.com>
Subject Re: does randomSplit return a copy or a reference to the original rdd? [Python]
Date Tue, 02 Jun 2015 00:40:36 GMT
No. All RDDs, including those returned from randomSplit(), are read-only, so
scaling train_cv and test_cv creates new RDDs and leaves the original data
unchanged.
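
A minimal sketch of how this could be checked, assuming "scaling" means
multiplying each numeric element by a constant via map(). The helper names
(scale, split_and_scale, demo), the factor, the seed, and the local master
string are all hypothetical choices for illustration, not part of the
original code:

```python
def scale(rdd, factor):
    """Return a NEW RDD with every element multiplied by factor.

    map() is a transformation: it does not modify rdd in place.
    """
    return rdd.map(lambda x: x * factor)


def split_and_scale(training, factor=10.0, seed=42):
    """Split training 75/25 and scale both parts (hypothetical helper)."""
    train_cv, test_cv = training.randomSplit(weights=[0.75, 0.25], seed=seed)
    return scale(train_cv, factor), scale(test_cv, factor)


def demo():
    """Run the check on a local Spark context (requires PySpark)."""
    # Imported here so the helpers above can be used without starting Spark.
    from pyspark import SparkContext

    sc = SparkContext("local[2]", "randomSplit-demo")
    try:
        training = sc.parallelize([float(i) for i in range(100)]).cache()
        before = training.collect()

        train_scaled, test_scaled = split_and_scale(training)
        # Force evaluation of the scaled RDDs.
        train_scaled.count()
        test_scaled.count()

        # The source RDD still holds the unscaled values.
        assert training.collect() == before
    finally:
        sc.stop()
```

Calling demo() on a machine with PySpark installed should finish without an
AssertionError. Passing a seed to randomSplit() is optional; it only makes
the split repeatable across runs.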

On Mon, Apr 27, 2015 at 11:28 AM, Pagliari, Roberto
<rpagliari@appcomsci.com> wrote:
> Suppose I have something like the code below
>
>
>         for idx in xrange(0, 10):
>             train_test_split = training.randomSplit(weights=[0.75, 0.25])
>             train_cv = train_test_split[0]
>             test_cv = train_test_split[1]
>             # scale train_cv and test_cv
>
>
> by scaling train_cv and test_cv, will the original data be affected?
>
> Thanks,
>
