spark-user mailing list archives

From Buntu Dev <buntu...@gmail.com>
Subject SparkSQL with large result size
Date Mon, 02 May 2016 04:19:15 GMT
I have a 10g memory limit on the executors and am operating on a parquet
dataset with a 70M block size and 200 blocks. I keep hitting executor memory
limits when running 'select * from t1 order by c1 limit 1000000' (i.e., a
1M-row limit). It works if I limit to, say, 100k rows. What are the options
for saving a large result set without running into memory issues?
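
For reference, a minimal sketch of the setup described above, assuming the
Spark 1.6-era SQLContext API; the input/output paths and the table name
registration are hypothetical stand-ins:

    import org.apache.spark.sql.SQLContext

    val sqlContext = new SQLContext(sc)  // sc: an existing SparkContext

    // ~200 parquet blocks of ~70M each (hypothetical path)
    val df = sqlContext.read.parquet("/path/to/t1")
    df.registerTempTable("t1")

    // OOMs on the executors at limit 1000000; succeeds at limit 100000
    val result = sqlContext.sql("select * from t1 order by c1 limit 1000000")

    // Writing the result out rather than collecting it to the driver
    result.write.parquet("/path/to/output")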

Thanks!
