spark-user mailing list archives

From Gavin Yue <yue.yuany...@gmail.com>
Subject Re: Spark and N-tier architecture
Date Tue, 29 Mar 2016 23:57:06 GMT
N-tier, or layered, architecture is mainly a way of separating a big problem
into smaller pieces, so it is always valid.

It just means different things for different applications.

For offline analytics, or the big data ecosystem, there are numerous
ways of slicing the problem into different tiers/layers.  You could search for
"Yarn/Mesos/Spark layer" and you will find a lot of results and slide decks.
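
As a rough illustration of the layering idea, here is a minimal plain-Python
sketch of three logical tiers, where each tier talks only to the one directly
below it. All class and method names are hypothetical, and no Spark, YARN or
Mesos API is used; the comments only say which real component a tier might
stand in for.

```python
# Hypothetical three-tier sketch in plain Python (no Spark API involved).

class DataTier:
    """Storage only: stands in for something like HDFS or Hive."""

    def __init__(self, rows):
        self._rows = list(rows)

    def scan(self):
        # Hand back every stored (key, value) row.
        return iter(self._rows)


class ApplicationTier:
    """Processing only: stands in for an engine such as Spark.
    It keeps no storage of its own and reads from the data tier."""

    def __init__(self, data_tier):
        self._data = data_tier

    def total_by_key(self):
        # Sum the values for each key.
        totals = {}
        for key, value in self._data.scan():
            totals[key] = totals.get(key, 0) + value
        return totals


class PresentationTier:
    """Display only: stands in for a reporting or BI tool."""

    def __init__(self, app_tier):
        self._app = app_tier

    def render(self):
        # Format the aggregated totals as sorted text lines.
        totals = self._app.total_by_key()
        return [f"{key}: {totals[key]}" for key in sorted(totals)]


data = DataTier([("a", 1), ("b", 2), ("a", 3)])
report = PresentationTier(ApplicationTier(data)).render()
print(report)
```

The tiers here are logical: all three objects live in one process, yet any of
them could be swapped out or moved to other machines without changing the
other two.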



On Tue, Mar 29, 2016 at 4:44 PM, Mich Talebzadeh <mich.talebzadeh@gmail.com>
wrote:

> Hi Mark,
>
> I beg to differ on the interpretation of N-tier architecture.
> Agreed that 3-tier and by extrapolation N-tier have been around since the days
> of client-server architecture. However, they are as valid today as 20 years
> ago. I believe the main recent expansion of n-tier has been in horizontal
> scaling, and Spark, by means of its clustering capability, contributes to this
> model.
>
> Cheers
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
>
>
> http://talebzadehmich.wordpress.com
>
>
>
> On 30 March 2016 at 00:22, Mark Hamstra <mark@clearstorydata.com> wrote:
>
>> Yes and no.  The idea of n-tier architecture is about 20 years older than
>> Spark and doesn't really apply to Spark as n-tier was originally conceived.
>> If the n-tier model helps you make sense of some things related to Spark,
>> then use it; but don't get hung up on trying to force a Spark architecture
>> into an outdated model.
>>
>> On Tue, Mar 29, 2016 at 5:02 PM, Ashok Kumar <
>> ashok34668@yahoo.com.invalid> wrote:
>>
>>> Thank you both.
>>>
>>> So am I correct that Spark fits within the application tier in N-tier
>>> architecture?
>>>
>>>
>>> On Tuesday, 29 March 2016, 23:50, Alexander Pivovarov <
>>> apivovarov@gmail.com> wrote:
>>>
>>>
>>> Spark is a distributed data processing engine plus a distributed in-memory
>>> / disk data cache.
>>>
>>> spark-jobserver provides a REST API to your Spark applications. It allows
>>> you to submit jobs to Spark and get results in sync or async mode.
>>>
>>> It can also create a long-running Spark context to cache RDDs in memory
>>> under a name (namedRDD) and then use them to serve requests from multiple
>>> users. Because the RDD is in memory, responses should be very fast (seconds).
>>>
>>> https://github.com/spark-jobserver/spark-jobserver
>>>
>>>
>>> On Tue, Mar 29, 2016 at 2:50 PM, Mich Talebzadeh <
>>> mich.talebzadeh@gmail.com> wrote:
>>>
>>> Interesting question.
>>>
>>> The most widely used application of N-tier is the traditional three-tier
>>> architecture that has been the backbone of client-server architecture, with
>>> a presentation layer, an application layer and a data layer. This is
>>> primarily for performance, scalability and maintenance. The most profound
>>> change that the Big Data space has introduced to N-tier architecture is the
>>> concept of horizontal scaling, as opposed to the previous tiers that relied
>>> on vertical scaling. HDFS is an example of horizontal scaling at the data
>>> tier by adding more JBODs to storage. Similarly, adding more nodes to a Spark
>>> cluster should result in better performance.
>>>
>>> Bear in mind that these tiers are at the logical level, which means that
>>> there may or may not be as many physical layers. For example, multiple
>>> virtual servers can be hosted on the same physical server.
>>>
>>> With regard to Spark, it is effectively a powerful query tool that sits
>>> in between the presentation layer (say Tableau) and HDFS or Hive, as you
>>> alluded. In that sense you can think of Spark as part of the application
>>> layer that communicates with the backend via a number of protocols,
>>> including standard JDBC. The line is rather blurred here as to whether
>>> Spark is a database or a query tool. IMO it is a query tool in the sense that
>>> Spark by itself does not have its own storage concept or metastore. Thus it
>>> relies on others to provide that service.
>>>
>>> HTH
>>>
>>>
>>>
>>> Dr Mich Talebzadeh
>>>
>>> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>>
>>> http://talebzadehmich.wordpress.com
>>>
>>>
>>> On 29 March 2016 at 22:07, Ashok Kumar <ashok34668@yahoo.com.invalid>
>>> wrote:
>>>
>>> Experts,
>>>
>>> One of the terms I hear is N-tier architecture within Big Data, used
>>> for availability, performance etc. I also hear that Spark, by means of its
>>> query engine and in-memory caching, fits into the middle tier (application
>>> layer), with HDFS and Hive perhaps providing the data tier.  Can someone
>>> elaborate on the role of Spark here? For example, a Scala program that we
>>> write uses JDBC to talk to databases, so in that sense is Spark a middle-tier
>>> application?
>>>
>>> I hope that someone can clarify this and, if so, what the best
>>> practice would be for using Spark as a middle tier and within Big Data.
>>>
>>> Thanks
>>>
>>>
>>>
>>>
>>>
>>>
>>
>
