Hi Calvin, I am running Spark KMeans on a 24GB dataset on a c3.2xlarge AWS instance with 30GB of physical memory.
Spark caches data off-heap to Tachyon, and the input data is also stored in Tachyon.
Tachyon is configured to use 15GB of memory and to use the tiered store.
The Tachyon underFS is /tmp.
The only Tachyon configuration I've changed is the data block size.
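For reference, my setup roughly corresponds to the sketch below. This is not a copy of my exact files; the property names are taken from the Tachyon 0.8 and Spark 1.x documentation as I understand them, so please correct me if any are off for your version:

```properties
# tachyon-site.properties (sketch, per Tachyon 0.8 docs)
tachyon.worker.memory.size=15GB        # capacity of the top (MEM) tier
tachyon.underfs.address=/tmp           # under filesystem
tachyon.user.block.size.bytes=...      # the block size setting I changed

# spark-defaults.conf (Spark 1.6 external block store pointing at Tachyon)
spark.externalBlockStore.url=tachyon://localhost:19998
```

On the Spark side, the RDDs are persisted with `StorageLevel.OFF_HEAP`, which routes cached blocks to the external block store configured above.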
The experiment above is part of a research project.
On Thursday, January 28, 2016 at 9:11:19 PM UTC-6, Calvin Jia wrote:
Thanks for the detailed information. How large is the dataset you are running against? Also, did you change any Tachyon configurations?