spark-user mailing list archives

From eric wong <win19...@gmail.com>
Subject Re: Re: I think I am almost lost in the internals of Spark
Date Wed, 07 Jan 2015 02:46:39 GMT
A good starting point if you read Chinese:

https://github.com/JerryLead/SparkInternals/tree/master/markdown

2015-01-07 10:13 GMT+08:00 bit1129@163.com <bit1129@163.com>:

> Thank you, Tobias. I will look into the Spark paper, but it looks like the
> paper has been moved:
> http://www.cs.berkeley.edu/~matei/papers/2012/nsdi_spark.pdf
> returns a "Resource not found" page when I access it.
>
> ------------------------------
> bit1129@163.com
>
>
> *From:* Tobias Pfeiffer <tgp@preferred.jp>
> *Date:* 2015-01-07 09:24
> *To:* Todd <bit1129@163.com>
> *CC:* user <user@spark.apache.org>
> *Subject:* Re: I think I am almost lost in the internals of Spark
> Hi,
>
> On Tue, Jan 6, 2015 at 11:24 PM, Todd <bit1129@163.com> wrote:
>
>> I am fairly new to Spark; so far I have only tried simple things like word
>> count and the examples given in the Spark SQL programming guide.
>> Now I am investigating the internals of Spark, but I think I am almost
>> lost, because I cannot grasp the whole picture of what Spark does when it
>> executes the word count.
>>
>
> I recommend understanding what an RDD is and how it is processed, using
>
> http://spark.apache.org/docs/latest/programming-guide.html#resilient-distributed-datasets-rdds
> and probably also
>   http://www.cs.berkeley.edu/~matei/papers/2012/nsdi_spark.pdf
>   (once the server is back).
> Understanding how an RDD is processed is probably the most helpful step
> toward understanding Spark as a whole.
>
> Tobias
>
>


-- 
王海华
