spark-user mailing list archives

From Udbhav Agarwal <udbhav.agar...@syncoms.com>
Subject RE: spark sql performance
Date Fri, 13 Mar 2015 06:52:36 GMT
Let's say I am using 4 machines with 3 GB RAM each. My data is customer records with 5 columns
each, in two tables with 0.5 million records each. I want to perform a join query on these two tables.
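For a rough feel of what the scenario above involves, here is a minimal sketch in plain Python (not Spark, so it says nothing about cluster overhead, HBase scans, or shuffle costs) of hash-joining two 0.5-million-row in-memory "tables"; all record contents are made up for illustration:

```python
import time

# Hypothetical illustration: two 0.5-million-row tables of customer
# records, joined on a customer id via a hash join (dict lookup).
N = 500_000

# Table 1: id -> four more columns (5 columns total, as in the scenario).
customers = {i: ("name", "email", "city", "phone") for i in range(N)}
# Table 2: (customer_id, reference) rows.
orders = [(i, "order_ref") for i in range(N)]

start = time.perf_counter()
# Hash join: probe the customers dict once per order row.
joined = [(cid, *customers[cid], ref) for cid, ref in orders if cid in customers]
elapsed = time.perf_counter() - start

print(len(joined))  # 500000 matched rows
print(f"join took {elapsed:.3f}s (excluding data loading)")
```

The point of the sketch: the raw in-memory join over 0.5 M rows is cheap; in a real Spark-on-HBase setup the latency is dominated by reading/caching the data and by scheduling, not by the join itself.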


Thanks,
Udbhav Agarwal

From: Akhil Das [mailto:akhil@sigmoidanalytics.com]
Sent: 13 March, 2015 12:16 PM
To: Udbhav Agarwal
Cc: user@spark.apache.org
Subject: Re: spark sql performance

The size/type of your data and your cluster configuration would be enough, I think.

Thanks
Best Regards

On Fri, Mar 13, 2015 at 12:07 PM, Udbhav Agarwal <udbhav.agarwal@syncoms.com> wrote:
Thanks Akhil,
What more info should I give so we can estimate query time in my scenario?

Thanks,
Udbhav Agarwal

From: Akhil Das [mailto:akhil@sigmoidanalytics.com]
Sent: 13 March, 2015 12:01 PM
To: Udbhav Agarwal
Cc: user@spark.apache.org
Subject: Re: spark sql performance

That totally depends on your data size and your cluster setup.

Thanks
Best Regards

On Thu, Mar 12, 2015 at 7:32 PM, Udbhav Agarwal <udbhav.agarwal@syncoms.com> wrote:
Hi,
What is the query time for a join query on HBase with Spark SQL? Say the tables in HBase have
0.5 million records each. I am expecting a query time (latency) in milliseconds with Spark SQL.
Is this possible?
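For reference, the join in question is an ordinary SQL join in Spark SQL. As a sketch of the query shape only, the same statement is run here against SQLite via Python's stdlib (table and column names are hypothetical; in Spark the tables would first have to be registered from HBase through an external data source, which this sketch does not cover):

```python
import sqlite3

# Tiny stand-in tables; in the real scenario each would hold 0.5 M rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders    (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO orders    VALUES (10, 1, 9.99), (11, 2, 5.00);
""")

# The join itself -- Spark SQL would accept the same SQL text.
rows = conn.execute("""
    SELECT c.name, o.amount
    FROM customers c
    JOIN orders o ON o.customer_id = c.id
    ORDER BY c.id
""").fetchall()

print(rows)  # [('alice', 9.99), ('bob', 5.0)]
```

Whether Spark SQL answers such a join in milliseconds depends on whether the data is already cached in memory; a cold query that scans HBase will typically take seconds, not milliseconds.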

Thanks,
Udbhav Agarwal


