spark-user mailing list archives

From "Evo Eftimov" <>
Subject RE: Can a map function return null
Date Sun, 19 Apr 2015 20:50:00 GMT
In fact you can return “NULL” from your initial map and hence not resort to Optional<String>
at all 
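A minimal sketch of this sentinel approach (editor's illustration: java.util.stream stands in for the RDD API, and the "NULL" string is the sentinel Evo describes; note it would collide with any genuine "NULL" values in the data):

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class SentinelDemo {
    // Map rejected items to the sentinel "NULL", then drop the sentinel.
    // Caveat: a real element whose value is literally "NULL" would also be dropped.
    static List<String> dropOdd(List<String> words) {
        return words.stream()
                .map(s -> s.length() % 2 == 1 ? "NULL" : s)
                .filter(s -> !s.equals("NULL"))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(dropOdd(Arrays.asList("ab", "abc", "abcd"))); // prints [ab, abcd]
    }
}
```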


From: Evo Eftimov [] 
Sent: Sunday, April 19, 2015 9:48 PM
To: 'Steve Lewis'
Cc: 'Olivier Girardot'; ''
Subject: RE: Can a map function return null


Well, you can do another map to turn Optional<String> into String: in the cases when the
Optional is empty you can store e.g. “NULL” as the value of the RDD element.


If this is not acceptable (based on the objectives of your architecture), and if returning
plain null instead of Optional does throw a Spark exception, then as far as I am concerned, checkmate
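Evo's two-step idea - a second map that replaces an empty Optional with a sentinel String - might look like this (editor's sketch; java.util.Optional and java.util.stream stand in for the Spark types):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

public class OptionalToSentinel {
    // Turn each Optional<String> into a plain String, storing "NULL" when empty
    static List<String> unwrap(List<Optional<String>> elems) {
        return elems.stream()
                .map(o -> o.orElse("NULL"))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(unwrap(Arrays.asList(Optional.of("spark"), Optional.empty()))); // prints [spark, NULL]
    }
}
```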


From: Steve Lewis [] 
Sent: Sunday, April 19, 2015 8:16 PM
To: Evo Eftimov
Cc: Olivier Girardot;
Subject: Re: Can a map function return null



So you imagine something like this:


 JavaRDD<String> words = ...

 JavaRDD<Optional<String>> wordsFiltered = words.map(new Function<String, Optional<String>>() {
     public Optional<String> call(String s) throws Exception {
         if (s.length() % 2 == 1) // drop strings of odd length
             return Optional.empty();
         return Optional.of(s);
     }
 });

That seems to return the wrong type, a JavaRDD<Optional<String>>, which cannot
be used as a JavaRDD<String>, which is what the next step expects
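For what it's worth, the type mismatch can be repaired with one more step that keeps only the present values (editor's sketch using java.util.stream; in Spark this would be a filter on isPresent followed by a map to get):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

public class FlattenOptionals {
    // Recover a List<String> from a List<Optional<String>> by dropping the empties
    static List<String> present(List<Optional<String>> elems) {
        return elems.stream()
                .filter(Optional::isPresent)
                .map(Optional::get)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(present(Arrays.asList(Optional.of("even"), Optional.empty()))); // prints [even]
    }
}
```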


On Sun, Apr 19, 2015 at 12:17 PM, Evo Eftimov <> wrote:

I am on the move at the moment so I can't try it immediately, but from previous memory / experience
I think if you return plain null you will get a Spark exception


Anyway, you can try it and see what happens and then ask the question


If you do get an exception, try Optional instead of plain null



Sent from Samsung Mobile


-------- Original message --------

From: Olivier Girardot 

Date:2015/04/18 22:04 (GMT+00:00) 

To: Steve Lewis , 

Subject: Re: Can a map function return null 


You can return an RDD with null values inside, and afterwards filter on "item != null".
In Scala (or even in Java 8) you'd rather use Option/Optional, and in Scala they're directly
usable from Spark.

Example:

 sc.parallelize(1 to 1000).flatMap(item => if (item % 2 == 0) Some(item) else None).collect()

res0: Array[Int] = Array(2, 4, 6, ....)
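Olivier's null-then-filter suggestion, sketched in the thread's Java register (editor's illustration; java.util.stream stands in for the RDD API, and the doubling transform is just a placeholder):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

public class NullThenFilter {
    // Map rejected items to null, then filter on "item != null" as Olivier suggests
    static List<Integer> evensDoubled(List<Integer> nums) {
        return nums.stream()
                .map(n -> n % 2 == 0 ? n * 2 : null)
                .filter(Objects::nonNull)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(evensDoubled(Arrays.asList(1, 2, 3, 4))); // prints [4, 8]
    }
}
```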




On Sat, Apr 18, 2015 at 20:44, Steve Lewis <> wrote:

I find a number of cases where I have a JavaRDD and I wish to transform the data and, depending
on a test, return 0 or 1 items (don't suggest a filter - the real case is more complex). So
I currently do something like the following - perform a flatMap returning a list with 0 or
1 entries depending on the isUsed function.


     JavaRDD<Foo> original = ...

     JavaRDD<Foo> words = original.flatMap(new FlatMapFunction<Foo, Foo>() {
         public Iterable<Foo> call(final Foo s) throws Exception {
             List<Foo> ret = new ArrayList<Foo>();
             if (isUsed(s))
                 ret.add(transform(s));
             return ret; // contains 0 items if isUsed is false
         }
     });
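The flatMap-with-0-or-1-entries idiom can also be written against plain Java streams (editor's sketch; isUsed and transform here are stand-ins for the poster's functions, which are not shown in the thread):

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class FlatMapZeroOrOne {
    // Stand-ins for the isUsed / transform functions mentioned in the post
    static boolean isUsed(String s) { return !s.isEmpty(); }
    static String transform(String s) { return s.toUpperCase(); }

    // flatMap emits 0 or 1 elements per input, mirroring the FlatMapFunction idiom
    static List<String> keepUsed(List<String> items) {
        return items.stream()
                .flatMap(s -> isUsed(s) ? Stream.of(transform(s)) : Stream.<String>empty())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(keepUsed(Arrays.asList("a", "", "b"))); // prints [A, B]
    }
}
```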




My question is: can I do a map returning the transformed data, and null if nothing is to be
returned, as shown below - what does Spark do with a map function returning null?


    JavaRDD<Foo> words = original.map(new Function<Foo, Foo>() {
        public Foo call(final Foo s) throws Exception {
            if (isUsed(s))
                return transform(s);
            return null; // not used - what happens now?
        }
    });








Steven M. Lewis PhD

4221 105th Ave NE

Kirkland, WA 98033

206-384-1340 (cell)
Skype lordjoe_com
