spark-user mailing list archives

From Tim Chen <...@mesosphere.io>
Subject Re: spark.mesos.coarse impacts memory performance on mesos
Date Fri, 25 Sep 2015 16:17:18 GMT
Hi Utkarsh,

What does your job placement look like when you run in fine-grained mode?
You said coarse-grained mode only ran on one node, right?

And when the job is running, could you open the Spark web UI and get stats
about the heap size and other Java settings?
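
If the web UI is not reachable, the same numbers can be logged from inside the driver JVM. A minimal sketch using only the JDK; the class name JvmSettingsReport is made up for illustration, and it reports on whichever JVM runs it, so it would have to be called from the driver's main method to describe the driver:

```java
import java.lang.management.ManagementFactory;
import java.util.List;

/** Hypothetical helper: reports the heap limit and launch flags of the current JVM. */
class JvmSettingsReport {

    /** Maximum heap the JVM will attempt to use (roughly -Xmx), in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    /** Flags the JVM was launched with, for example -Xmx or GC options. */
    static List<String> jvmArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    public static void main(String[] args) {
        System.out.println("max heap (MiB): " + maxHeapMiB());
        System.out.println("JVM arguments:  " + jvmArguments());
    }
}
```

Comparing this output between a coarse-grained and a fine-grained run would show whether the two modes actually start the driver with different heap settings.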

Tim

On Thu, Sep 24, 2015 at 10:56 PM, Utkarsh Sengar <utkarsh2012@gmail.com>
wrote:

> Bumping this one up. Any suggestions on the stack trace?
> spark.mesos.coarse=true is not working and the driver crashed with the
> error below.
>
> On Wed, Sep 23, 2015 at 3:29 PM, Utkarsh Sengar <utkarsh2012@gmail.com>
> wrote:
>
>> I forgot to reply-all.
>>
>> Tim,
>>
>> spark.mesos.coarse = true doesn't work and spark.mesos.coarse = false
>> works (sorry, there was a typo in my last email; I meant: when I set
>> "spark.mesos.coarse=false", the job works like a charm).
>>
>> I get this exception with spark.mesos.coarse = true:
>>
>> 15/09/22 20:18:05 INFO MongoCollectionSplitter: Created split: min={ "_id" : "55af4bf26750ad38a444d7cf"}, max= { "_id" : "55af5a61e8a42806f47546c1"}
>> 15/09/22 20:18:05 INFO MongoCollectionSplitter: Created split: min={ "_id" : "55af5a61e8a42806f47546c1"}, max= null
>> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>>     at org.apache.spark.rdd.CartesianRDD.getPartitions(CartesianRDD.scala:60)
>>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
>>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
>>     at scala.Option.getOrElse(Option.scala:120)
>>     at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
>>     at org.apache.spark.rdd.CartesianRDD.getPartitions(CartesianRDD.scala:60)
>>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
>>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
>>     at scala.Option.getOrElse(Option.scala:120)
>>     at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
>>     at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
>>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
>>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
>>     at scala.Option.getOrElse(Option.scala:120)
>>     at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
>>     at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
>>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
>>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
>>     at scala.Option.getOrElse(Option.scala:120)
>>     at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
>>     at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
>>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
>>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
>>     at scala.Option.getOrElse(Option.scala:120)
>>     at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
>>     at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
>>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
>>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
>>     at scala.Option.getOrElse(Option.scala:120)
>>     at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
>>     at org.apache.spark.ShuffleDependency.<init>(Dependency.scala:82)
>>     at org.apache.spark.rdd.ShuffledRDD.getDependencies(ShuffledRDD.scala:78)
>> 15/09/22 20:18:17 INFO SparkContext: Invoking stop() from shutdown hook
>> 15/09/22 20:18:17 INFO BlockManagerInfo: Removed broadcast_2_piece0 on some-ip-here:37706 in memory (size: 1964.0 B, free: 2.8 GB)
>> 15/09/22 20:18:17 INFO BlockManagerInfo: Removed broadcast_2_piece0 on mesos-slave10 in memory (size: 1964.0 B, free: 5.2 GB)
>> 15/09/22 20:18:17 INFO BlockManagerInfo: Removed broadcast_1_piece0 on some-ip-here:37706 in memory (size: 17.2 KB, free: 2.8 GB)
>> 15/09/22 20:18:17 INFO BlockManagerInfo: Removed broadcast_1_piece0 on mesos-slave105 in memory (size: 17.2 KB, free: 5.2 GB)
>> 15/09/22 20:18:17 INFO BlockManagerInfo: Removed broadcast_1_piece0 on mesos-slave1 in memory (size: 17.2 KB, free: 5.2 GB)
>> 15/09/22 20:18:17 INFO BlockManagerInfo: Removed broadcast_1_piece0 on mesos-slave9 in memory (size: 17.2 KB, free: 5.2 GB)
>> 15/09/22 20:18:17 INFO BlockManagerInfo: Removed broadcast_1_piece0 on mesos-slave3 in memory (size: 17.2 KB, free: 5.2 GB)
>> 15/09/22 20:18:17 INFO SparkUI: Stopped Spark web UI at http://some-ip-here:4040
>> 15/09/22 20:18:17 INFO DAGScheduler: Stopping DAGScheduler
>> 15/09/22 20:18:17 INFO CoarseMesosSchedulerBackend: Shutting down all executors
>> 15/09/22 20:18:17 INFO CoarseMesosSchedulerBackend: Asking each executor to shut down
>> I0922 20:18:17.794598 171 sched.cpp:1591] Asked to stop the driver
>> I0922 20:18:17.794739 143 sched.cpp:835] Stopping framework '20150803-224832-1577534986-5050-1614-0016'
>> 15/09/22 20:18:17 INFO CoarseMesosSchedulerBackend: driver.run() returned with code DRIVER_STOPPED
>> 15/09/22 20:18:17 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
>> 15/09/22 20:18:17 INFO Utils: path = /tmp/spark-98801318-9c49-473b-bf2f-07ea42187252/blockmgr-0e0e1a1c-894e-4e79-beac-ead0dff43166, already present as root for deletion.
>> 15/09/22 20:18:17 INFO MemoryStore: MemoryStore cleared
>> 15/09/22 20:18:17 INFO BlockManager: BlockManager stopped
>> 15/09/22 20:18:17 INFO BlockManagerMaster: BlockManagerMaster stopped
>> 15/09/22 20:18:17 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
>> 15/09/22 20:18:17 INFO SparkContext: Successfully stopped SparkContext
>> 15/09/22 20:18:17 INFO Utils: Shutdown hook called
>> 15/09/22 20:18:17 INFO Utils: Deleting directory /tmp/spark-98801318-9c49-473b-bf2f-07ea42187252
>> 15/09/22 20:18:17 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
>> 15/09/22 20:18:17 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
>>
>>
>>
>>
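
One thing worth checking against the trace above: the OutOfMemoryError is thrown inside CartesianRDD.getPartitions, which runs on the driver while it builds the list of partitions. As far as I know, cartesian() creates one partition per pair of parent partitions, so chaining it multiplies the partition counts rather than adding them. A rough sketch of that arithmetic follows; the class name and the partition counts are hypothetical, since the thread does not say how many partitions each RDD really has:

```java
/** Hypothetical helper: estimates the partition count of chained cartesian() calls. */
class CartesianPartitionCount {

    /** Product of the parents' partition counts; throws ArithmeticException on long overflow. */
    static long product(int... partitionCounts) {
        long total = 1L;
        for (int count : partitionCounts) {
            total = Math.multiplyExact(total, (long) count);
        }
        return total;
    }

    public static void main(String[] args) {
        // Made-up partition counts for aRdd, bRdd, cRdd and dRdd (not taken from the thread).
        System.out.println("partitions after cartesian: " + product(200, 8, 8, 8));
    }
}
```

Printing rdd.partitions().size() for each of the four inputs would give the real factors. If the product turns out to be very large, repartitioning or coalescing the inputs before the cartesian might be worth a try, though that is a guess and not something confirmed in this thread.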
>> On Tue, Sep 22, 2015 at 1:26 AM, Tim Chen <tim@mesosphere.io> wrote:
>>
>>> Hi Utkarsh,
>>>
>>> Just to be sure, you originally set coarse to false but then to true? Or
>>> is it the other way around?
>>>
>>> Also, what's the exception/stack trace when the driver crashed?
>>>
>>> Coarse-grained mode pre-starts all the Spark executor backends, so it has
>>> the least overhead compared to fine-grained mode. There is no single answer
>>> for which mode you should use; it depends on your use case, otherwise we
>>> would have removed one of the modes.
>>>
>>> There are quite a few reasons why you could see huge GC pauses, but I
>>> don't think your GC pauses would go away if you switched to standalone.
>>>
>>> Tim
>>>
>>> On Mon, Sep 21, 2015 at 5:18 PM, Utkarsh Sengar <utkarsh2012@gmail.com>
>>> wrote:
>>>
>>>> I am running Spark 1.4.1 on mesos.
>>>>
>>>> The Spark job does a "cartesian" of 4 RDDs (aRdd, bRdd, cRdd, dRdd) of
>>>> size 100, 100, 7 and 1 respectively. Let's call it productRDD.
>>>>
>>>> Creating "aRdd" requires pulling data from multiple data sources, merging
>>>> it and creating a JavaRDD of tuples. In the end aRdd looks something like
>>>> this:
>>>> JavaRDD<Tuple4<A1, A2>>
>>>> bRdd, cRdd and dRdd are just List<> of values.
>>>>
>>>> Then I apply a transformation on productRDD and finally call
>>>> "saveAsTextFile" to save the result of my transformation.
>>>>
>>>> Problem:
>>>> By setting "spark.mesos.coarse=true", creation of "aRdd" works fine but
>>>> the driver crashes while doing the cartesian, but when I set
>>>> "spark.mesos.coarse=false", the job works like a charm. I am running Spark
>>>> on Mesos.
>>>>
>>>> Comments:
>>>> So I wanted to understand what role "spark.mesos.coarse=true" plays
>>>> in terms of memory and compute performance. My findings look
>>>> counterintuitive since:
>>>>
>>>>    1. "spark.mesos.coarse=true" just runs on 1 Mesos task, so there
>>>>    should be an overhead of spinning up Mesos tasks which should impact
>>>>    the performance.
>>>>    2. Which setting of "spark.mesos.coarse" is recommended for running
>>>>    Spark on Mesos? Or is there no best answer and it depends on the use
>>>>    case?
>>>>    3. Also, by setting "spark.mesos.coarse=true", I notice that I get
>>>>    huge GC pauses even with a small dataset but a long-running job (but
>>>>    this can be a separate discussion).
>>>>
>>>> Let me know if I am missing something obvious; we are learning Spark
>>>> tuning as we move forward :)
>>>>
>>>> --
>>>> Thanks,
>>>> -Utkarsh
>>>>
>>>
>>>
>>
>>
>> --
>> Thanks,
>> -Utkarsh
>>
>
>
>
> --
> Thanks,
> -Utkarsh
>
