spark-dev mailing list archives

From ankits <>
Subject Re: Get size of rdd in memory
Date Mon, 02 Feb 2015 20:23:45 GMT
Thanks for your response. So AFAICT, calling

parallelize(1 to 1024).map(i => KV(i, i.toString)).toSchemaRDD.cache().count()

will let me see the size of the SchemaRDD in memory, and

parallelize(1 to 1024).map(i => KV(i, i.toString)).cache().count()

will show me the size of a regular RDD.

But this will not show us the size when using cacheTable(), right? For example, if I run

parallelize(1 to 1024).map(i => KV(i, i.toString)).registerTempTable("test")
sqc.cacheTable("test")
sqc.sql("SELECT COUNT(*) FROM test")

the web UI does not show us the size of the cached table.
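
In case it helps, the in-memory size of cached RDDs can also be read programmatically via SparkContext.getRDDStorageInfo, without going through the web UI. A minimal sketch, assuming a running SparkContext `sc` and a SQLContext `sqc` (the table name "test" follows the query above):

```scala
// Sketch only: assumes a live SparkContext `sc`. Each RDDInfo entry reports
// memSize (bytes held in memory) and numCachedPartitions for a cached RDD
// once it has been materialized by an action such as count().
case class KV(k: Int, v: String)

val rdd = sc.parallelize(1 to 1024).map(i => KV(i, i.toString)).cache()
rdd.count()  // force materialization so storage info is populated

sc.getRDDStorageInfo.foreach { info =>
  println(s"${info.name}: ${info.memSize} bytes in memory, " +
          s"${info.numCachedPartitions} partitions cached")
}
```

Whether the RDD backing a cacheTable()-cached table shows up here with a recognizable name I haven't verified, but the storage info list should include it once the table has been scanned.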
