carbondata-user mailing list archives

From Liang Chen <chenliang...@apache.org>
Subject Re: how to add RDD partition?
Date Mon, 26 Jun 2017 06:44:32 GMT
Hi,

I don't fully understand your question. Do you want to increase
parallelism?
If so, you can set Spark's parallelism parameter.
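
A minimal sketch of what setting that parameter could look like, assuming the parameter meant above is Spark's standard `spark.default.parallelism` property (the reply does not name it) and that it is passed with the `--conf key=value` flag when the application or Thrift server is launched. The helper name `conf_args` and the value 16 (chosen to match the 16 tasks in the log) are illustrative only.

```python
# Assumption: "Spark's parallelism parameter" refers to the standard Spark
# property spark.default.parallelism; the reply does not name it.
PARALLELISM_KEY = "spark.default.parallelism"


def conf_args(parallelism):
    """Build the `--conf key=value` arguments for a Spark launch command.

    `conf_args` is a hypothetical helper written for this example.
    """
    if not isinstance(parallelism, int) or parallelism < 1:
        raise ValueError("parallelism must be a positive integer")
    return ["--conf", "{}={}".format(PARALLELISM_KEY, parallelism)]


if __name__ == "__main__":
    # Print the flag to append to the launch command line.
    print(" ".join(conf_args(16)))
```

Whether this changes the `parallelism` value that CarbonData reports in its log depends on the CarbonData and Spark versions in use, so it should be verified against the ThriftServer log after a restart.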

Regards
Liang

2017-06-20 11:41 GMT+08:00 suzzy <suzzy20171922@hotmail.com>:

> Hi,
> I am running the query 'select count(1) from sunzy.datatest'.
> The job had 16 blocks and 16 tasks, but only 4 partitions.
> How can I add RDD partitions?
> Thanks
>
> CarbonData ThriftServer Log:
>
> INFO 16-06 16:14:34,039 -
>  Identified no.of.blocks: 16,
>  no.of.tasks: 16,
>  no.of.nodes: 0,
>  parallelism: 4
> INFO 16-06 16:14:34,059 - Starting job: run at AccessController.java:-2
> INFO 16-06 16:14:34,060 - Registering RDD 12 (run at AccessController.java:-2)
> INFO 16-06 16:14:34,061 - Got job 1 (run at AccessController.java:-2) with 1 output partitions
> INFO 16-06 16:14:34,061 - Final stage: ResultStage 3 (run at AccessController.java:-2)
> INFO 16-06 16:14:34,061 - Parents of final stage: List(ShuffleMapStage 2)
> INFO 16-06 16:14:34,061 - Missing parents: List(ShuffleMapStage 2)
> INFO 16-06 16:14:34,062 - Submitting ShuffleMapStage 2 (MapPartitionsRDD[12] at run at AccessController.java:-2), which has no missing parents
> INFO 16-06 16:14:34,065 - Block broadcast_2 stored as values in memory (estimated size 15.4 KB, free 62.2 KB)
> INFO 16-06 16:14:34,068 - Block broadcast_2_piece0 stored as bytes in memory (estimated size 7.6 KB, free 69.8 KB)
> INFO 16-06 16:14:34,069 - Added broadcast_2_piece0 in memory on 192.168.1.41:57617 (size: 7.6 KB, free: 71.7 GB)
> INFO 16-06 16:14:34,069 - Created broadcast 2 from broadcast at DAGScheduler.scala:1006
> INFO 16-06 16:14:34,070 - Submitting 16 missing tasks from ShuffleMapStage 2 (MapPartitionsRDD[12] at run at AccessController.java:-2)
> INFO 16-06 16:14:34,070 - Adding task set 2.0 with 16 tasks
> INFO 16-06 16:14:34,072 - Starting task 2.0 in stage 2.0 (TID 16, H4, partition 2,NODE_LOCAL, 2376 bytes)
> INFO 16-06 16:14:34,073 - Starting task 0.0 in stage 2.0 (TID 17, H3, partition 0,NODE_LOCAL, 2376 bytes)
> INFO 16-06 16:14:34,073 - Starting task 1.0 in stage 2.0 (TID 18, H1, partition 1,NODE_LOCAL, 2376 bytes)
> INFO 16-06 16:14:34,074 - Starting task 4.0 in stage 2.0 (TID 19, H2, partition 4,NODE_LOCAL, 2376 bytes)
> INFO 16-06 16:14:34,089 - Added broadcast_2_piece0 in memory on H1:57002 (size: 7.6 KB, free: 57.3 GB)
> INFO 16-06 16:14:34,096 - Added broadcast_2_piece0 in memory on H4:33086 (size: 7.6 KB, free: 57.3 GB)
> INFO 16-06 16:14:34,116 - Added broadcast_2_piece0 in memory on H2:45618 (size: 7.6 KB, free: 57.3 GB)
> INFO 16-06 16:14:34,117 - Added broadcast_2_piece0 in memory on H3:56719 (size: 7.6 KB, free: 57.3 GB)
>
>
>
> --
> View this message in context: http://apache-carbondata-user-mailing-list.3231.n8.nabble.com/how-to-add-RDD-partition-tp31.html
> Sent from the Apache CarbonData User Mailing List mailing list archive at Nabble.com.
>
