spark-user mailing list archives

From Ashish Rangole <arang...@gmail.com>
Subject Re: Mutating RDD
Date Wed, 19 Feb 2014 13:44:20 GMT
You could also look at how Spark Streaming's DStream handles the case you
described.

Take a look at the implementation of StreamingContext.textFileStream in
Spark Streaming.
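
To make the directory-watching idea concrete, here is a rough sketch in plain Python of the polling step: on each interval, list the directory and keep only the files that have not been seen before. This is an illustration of the general pattern, not Spark's actual implementation; the function name find_new_files and its arguments are hypothetical.

```python
import os


def find_new_files(directory, seen):
    """Return (new_paths, updated_seen) for regular files in `directory`.

    `seen` is the set of paths returned by earlier calls. It is not
    modified; an updated frozenset is returned instead.
    """
    current = {
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if os.path.isfile(os.path.join(directory, name))
    }
    # Files present now that were not present on any earlier poll.
    new_paths = sorted(current - seen)
    return new_paths, frozenset(seen | current)
```

A driver loop would call this once per interval, starting with an empty frozenset, and hand each non-empty list of new paths to whatever builds the next batch of data.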
On Feb 18, 2014 8:02 PM, "David Thomas" <dt5434884@gmail.com> wrote:

> Perfect.
>
>
> On Tue, Feb 18, 2014 at 7:58 PM, Mayur Rustagi <mayur.rustagi@gmail.com> wrote:
>
>> An RDD is immutable, so modifying one in place is not possible. You can
>> instead generate a new RDD by unioning the RDD created from the new files
>> with the old in-memory RDD.
>> Regards
>> Mayur
>>
>> Mayur Rustagi
>> Ph: +919632149971
>> http://www.sigmoidanalytics.com
>> https://twitter.com/mayur_rustagi
>>
>>
>>
>> On Tue, Feb 18, 2014 at 6:33 PM, David Thomas <dt5434884@gmail.com> wrote:
>>
>>> Let's say I have an RDD of text files from HDFS. At runtime, is it
>>> possible to check for new files in a particular directory and, if any
>>> are present, add them to the existing RDD?
>>>
>>
>>
>
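
The union approach from the quoted reply can be sketched as follows. This is a minimal illustration that assumes `sc` is a PySpark SparkContext and `existing_rdd` is an RDD of text lines; the helper name extend_rdd is hypothetical. It relies only on SparkContext.textFile (which accepts a comma-separated list of paths) and RDD.union.

```python
def extend_rdd(sc, existing_rdd, new_paths):
    """Return a new RDD holding existing_rdd plus the lines of new_paths.

    The existing RDD is never modified. `sc` is assumed to be a
    SparkContext and `new_paths` a list of file paths (assumptions,
    not part of the original thread).
    """
    if not new_paths:
        # Nothing new to add, so keep using the RDD we already have.
        return existing_rdd
    # textFile accepts several paths joined by commas.
    new_rdd = sc.textFile(",".join(new_paths))
    # union returns a new RDD; both inputs stay unchanged.
    return existing_rdd.union(new_rdd)
```

The caller rebinds its variable to the result, for example `rdd = extend_rdd(sc, rdd, paths)`. If this runs repeatedly, the chain of unions keeps growing, so it may be worth caching or periodically checkpointing the combined RDD.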
