hadoop-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Fei Hu <hufe...@gmail.com>
Subject Re: RDD Location
Date Sat, 31 Dec 2016 01:48:23 GMT
That is a good idea.

I tried add the following code to get getPreferredLocations() function:

val results: Array[Array[DataChunkPartition]] = context.runJob(
      partitionsRDD, (context: TaskContext, partIter:
Iterator[DataChunkPartition]) => partIter.toArray, dd, allowLocal = true)

But it seems to be suspended when executing this function. But if I move
the code to other places, like the main() function, it runs well.

What is the reason for it?

Thanks,
Fei

On Fri, Dec 30, 2016 at 2:38 AM, Sun Rui <sunrise_win@163.com> wrote:

> Maybe you can create your own subclass of RDD and override the
> getPreferredLocations() to implement the logic of dynamic changing of the
> locations.
> > On Dec 30, 2016, at 12:06, Fei Hu <hufei68@gmail.com> wrote:
> >
> > Dear all,
> >
> > Is there any way to change the host location for a certain partition of
> RDD?
> >
> > "protected def getPreferredLocations(split: Partition)" can be used to
> initialize the location, but how to change it after the initialization?
> >
> >
> > Thanks,
> > Fei
> >
> >
>
>
>

Mime
View raw message