spark-dev mailing list archives

From Sean Owen <so...@cloudera.com>
Subject Re: Ready to talk about Spark 2.0?
Date Sun, 08 Nov 2015 11:33:09 GMT
Major releases can change APIs, yes. Although Flink is pretty similar
in broad design and goals, the APIs are quite different in
particulars. Speaking for myself, I can't imagine merging them, as it
would either mean significantly changing Spark APIs, or making Flink
use Spark APIs. It would effectively mean removing one project, which
seems infeasible.

I am not sure what difference you are describing, but I would not
describe Spark as primarily for interactive use.

Philosophically, I don't think One Big System to Rule Them All is a
good goal. One project will never get it all right even within one
niche. It's actually valuable to have many takes on important
problems. Hence any problem worth solving gets solved 10 times. Just
look at all those SQL engines and logging frameworks...

On Sun, Nov 8, 2015 at 10:53 AM, Romi Kuntsman <romi@totango.com> wrote:
> A major release usually means giving up on some API backward compatibility?
> Can this be used as a chance to merge efforts with Apache Flink
> (https://flink.apache.org/) and create the one ultimate open source big data
> processing system?
> Spark currently feels like it was made for interactive use (like Python and
> R), and when used for other purposes (batch/streaming), it feels like
> scripted interactive use rather than a complete standalone app. Maybe some
> base concepts could be adapted?
>
> (I'm not currently a committer, but as a heavy Spark user I'd love to
> participate in the discussion of what can/should be in Spark 2.0)
>
> Romi Kuntsman, Big Data Engineer
> http://www.totango.com
>
> On Fri, Nov 6, 2015 at 2:53 PM, Jean-Baptiste Onofré <jb@nanthrax.net>
> wrote:
>>
>> Hi Sean,
>>
>> Happy to see this discussion.
>>
>> I'm working on a PoC to run Camel on Spark Streaming. The purpose is to have
>> an ingestion and integration platform directly running on Spark Streaming.
>>
>> Basically, we would be able to use a Camel Spark DSL like:
>>
>>
>> from("jms:queue:foo").choice().when(predicate).to("job:bar").when(predicate).to("hdfs:path").otherwise("file:path")....
>>
>> Before a formal proposal (I have to do more work there), I'm just
>> wondering if such a framework could be a new Spark module (Spark Integration,
>> for instance, like Spark ML, Spark Streaming, etc).
>>
>> Maybe it could be a good candidate for an addition in a "major" release
>> like Spark 2.0.
>>
>> Just my $0.01 ;)
>>
>> Regards
>> JB
>>
>>
>> On 11/06/2015 01:44 PM, Sean Owen wrote:
>>>
>>> Since branch-1.6 is cut, I was going to make version 1.7.0 in JIRA.
>>> However I've had a few side conversations recently about Spark 2.0, and
>>> I know I and others have a number of ideas about it already.
>>>
>>> I'll go ahead and make 1.7.0, but thought I'd ask, how much other
>>> interest is there in starting to plan Spark 2.0? Is that even on the
>>> table as the next release after 1.6?
>>>
>>> Sean
>>
>>
>> --
>> Jean-Baptiste Onofré
>> jbonofre@apache.org
>> http://blog.nanthrax.net
>> Talend - http://www.talend.com
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
>> For additional commands, e-mail: dev-help@spark.apache.org
>>
>


