ignite-user mailing list archives

From "Williams, Michael" <michael.willi...@transamerica.com>
Subject RE: Slow Group-By
Date Mon, 26 Feb 2018 16:26:41 GMT
Unfortunately, at this stage of development I'm only doing runs on one machine, and although
I am using partitioned data to get query parallelism, I seem to lose that parallelism in the
GROUP BY.  Does GROUP BY distribute at all?

Might a Spark layer on top give a better distribution path?

-----Original Message-----
From: slava.koptilin [mailto:slava.koptilin@gmail.com] 
Sent: Monday, February 26, 2018 11:17 AM
To: user@ignite.apache.org
Subject: RE: Slow Group-By

Hi Mike,

It seems that GROUP BY requires fetching the whole dataset into the Java heap (in order to
sort the data), which may lead to long GC pauses.
I think that data collocation [1] should improve GROUP BY performance.

[1] https://apacheignite.readme.io/docs/affinity-collocation
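
To illustrate why collocation should help, here is a minimal plain-Java sketch (not Ignite code). It assumes rows are routed to partitions by the same key the query groups on, so each group lives entirely in one partition and the per-partition results only need to be concatenated, with no global sort or re-aggregation. The names CollocatedGroupBy, Row, partitionByKey, groupLocally and collocatedGroupBy are hypothetical, and the hash-based routing merely stands in for Ignite's affinity function.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class CollocatedGroupBy {
    /** Hypothetical row: a grouping key and a value to sum. */
    static final class Row {
        final String key;
        final long value;

        Row(String key, long value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Routes each row to a partition by hashing its grouping key (stand-in for an affinity key). */
    static List<List<Row>> partitionByKey(List<Row> rows, int partitions) {
        List<List<Row>> out = new ArrayList<>();
        for (int i = 0; i < partitions; i++) {
            out.add(new ArrayList<>());
        }
        for (Row r : rows) {
            out.get(Math.floorMod(r.key.hashCode(), partitions)).add(r);
        }
        return out;
    }

    /** Equivalent of SELECT key, SUM(value) ... GROUP BY key over a single partition. */
    static Map<String, Long> groupLocally(List<Row> partition) {
        Map<String, Long> sums = new HashMap<>();
        for (Row r : partition) {
            sums.merge(r.key, r.value, Long::sum);
        }
        return sums;
    }

    /** Groups every partition in parallel, then concatenates the partial results. */
    static Map<String, Long> collocatedGroupBy(List<Row> rows, int partitions) {
        List<Map<String, Long>> partials = partitionByKey(rows, partitions)
            .parallelStream()
            .map(CollocatedGroupBy::groupLocally)
            .collect(Collectors.toList());

        // No key appears in two partitions, so putAll never overwrites a partial sum.
        Map<String, Long> result = new HashMap<>();
        for (Map<String, Long> partial : partials) {
            result.putAll(partial);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(new Row("a", 1), new Row("b", 2), new Row("a", 3));
        // TreeMap only makes the printed order stable.
        System.out.println(new TreeMap<>(collocatedGroupBy(rows, 4)));
    }
}
```

In Ignite itself, the corresponding hint appears to be SqlFieldsQuery.setCollocated(true), which its Javadoc describes as an optimization for GROUP BY queries whose grouping key is the primary or affinity key. Whether it helps on a single-node setup like the one described here is something to measure rather than assume.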


