spark-user mailing list archives

From Tathagata Das <t...@databricks.com>
Subject Re: [Spark Streaming] Connect to Database only once at the start of Streaming job
Date Wed, 28 Oct 2015 06:17:08 GMT
Yeah, of course. Just create an RDD from JDBC (e.g. a JdbcRDD), call
cache()/persist(), then force it to be evaluated with an action such as
count(). Once it is cached, you can use it in a StreamingContext; because of
the cache it should not access JDBC any more.
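A minimal sketch of what Tathagata describes, in Scala. The JDBC URL, credentials, table, and column names below are illustrative assumptions, not part of the original thread; the point is the cache() + count() pair before the streaming job starts:

```scala
import java.sql.DriverManager
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.JdbcRDD
import org.apache.spark.streaming.{Seconds, StreamingContext}

object CachedJdbcLookup {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("CachedJdbcLookup"))

    // Read the reference table exactly once, at job start.
    // The SQL must contain two '?' placeholders for JdbcRDD's partition bounds.
    val lookup = new JdbcRDD(
      sc,
      () => DriverManager.getConnection("jdbc:mysql://dbhost/mydb", "user", "pass"),
      "SELECT id, value FROM ref_table WHERE id >= ? AND id <= ?",
      lowerBound = 1, upperBound = 1000000, numPartitions = 4,
      r => (r.getInt("id"), r.getString("value")))

    lookup.cache()  // mark the RDD for caching
    lookup.count()  // action forces evaluation, materializing the cache now

    val ssc = new StreamingContext(sc, Seconds(10))
    val stream = ssc.socketTextStream("localhost", 9999)

    // Every batch reuses the cached RDD; no further JDBC connections are made.
    stream.foreachRDD { rdd =>
      rdd.map(line => (line.toInt, line)).join(lookup).foreach(println)
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Without the count(), the JdbcRDD would stay lazy and the first streaming batch that touched it would open the JDBC connections instead.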

On Tue, Oct 27, 2015 at 12:04 PM, diplomatic Guru <diplomaticguru@gmail.com>
wrote:

> I know it uses lazy model, which is why I was wondering.
>
> On 27 October 2015 at 19:02, Uthayan Suthakar <uthayan.suthakar@gmail.com>
> wrote:
>
>> Hello all,
>>
>> What I wanted to do is configure the spark streaming job to read the
>> database using JdbcRDD and cache the results. This should occur only once
>> at the start of the job. It should not make any further connection to DB
>> afterwards. Is it possible to do that?
>>
>
>
