spark-user mailing list archives

From Bertrand Dechoux <>
Subject Re: Rename filter() into keep(), remove() or take() ?
Date Thu, 27 Feb 2014 13:55:38 GMT
I understand the explanation, but I had to ask. That said, the change could have
been made without breaking anything — but that's another story.



Bertrand Dechoux

On Thu, Feb 27, 2014 at 2:05 PM, Nick Pentreath <>wrote:

> filter comes from the Scala collection method "filter". I'd say it's best
> to keep in line with the Scala collections API, as Spark has done with RDDs
> generally (map, flatMap, take etc), so that it is easier and natural for
> developers to apply the same thinking from Scala (parallel) collections to
> Spark RDDs.
> Plus, such an API change would be a major breaking one and IMO not a good
> idea at this stage.
> def filter(p: (A) => Boolean): Seq[A]
> Selects all elements of this sequence which satisfy a predicate.
> p — the predicate used to test elements.
> returns — a new sequence consisting of all elements of this sequence that
> satisfy the given predicate p. The order of the elements is preserved.
> On Thu, Feb 27, 2014 at 2:36 PM, Bertrand Dechoux <>wrote:
>> Hi,
>> It might seem like a trivial issue, but even though filter() is a somewhat
>> standard name, it is not really explicit about how it works.
>> Sure, it makes sense to provide a filter function, but what happens when it
>> returns true? Is the current element removed or kept? It is not really
>> obvious.
>> Has another name already been discussed? It could be keep() or remove().
>> But take() could also be reused: instead of providing a number, a
>> filter function would be expected.
>>  Regards
>> Bertrand
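
The semantics Nick quotes — filter *keeps* the elements for which the predicate returns true — can be checked with a one-liner on a plain Scala collection; a minimal sketch (the RDD line assumes a live SparkContext named `sc`, which is not shown in the thread):

```scala
// Scala collections: filter KEEPS elements where the predicate is true
val nums  = Seq(1, 2, 3, 4, 5)
val evens = nums.filter(_ % 2 == 0)
println(evens)  // List(2, 4) — elements satisfying the predicate are retained

// The Spark RDD API mirrors this, so the same reasoning carries over:
// sc.parallelize(nums).filter(_ % 2 == 0).collect()  // Array(2, 4)
```

This consistency with the standard library is the crux of Nick's argument: a developer who knows `Seq.filter` already knows what `RDD.filter` does.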
