hadoop-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Fei Hu <hufe...@gmail.com>
Subject Re: RDD Location
Date Sat, 31 Dec 2016 06:12:58 GMT
It will be very appreciated if you can give more details about why runJob
function could not be called in getPreferredLocations()

In the NewHadoopRDD class and HadoopRDD class, they get the location
information from the inputSplit. But there may be an issue in NewHadoopRDD,
because it generates all of the inputSplits on the master node, which means
I can only use a single node to generate and filter the inputSplits even if
the number of inputSplits is huge. Will it be a performance bottleneck?

Thanks,
Fei





On Fri, Dec 30, 2016 at 10:41 PM, Sun Rui <sunrise_win@163.com> wrote:

> You can’t call runJob inside getPreferredLocations().
> You can take a look at the source  code of HadoopRDD to help you implement getPreferredLocations()
> appropriately.
>
> On Dec 31, 2016, at 09:48, Fei Hu <hufei68@gmail.com> wrote:
>
> That is a good idea.
>
> I tried add the following code to get getPreferredLocations() function:
>
> val results: Array[Array[DataChunkPartition]] = context.runJob(
>       partitionsRDD, (context: TaskContext, partIter:
> Iterator[DataChunkPartition]) => partIter.toArray, dd, allowLocal = true)
>
> But it seems to be suspended when executing this function. But if I move
> the code to other places, like the main() function, it runs well.
>
> What is the reason for it?
>
> Thanks,
> Fei
>
> On Fri, Dec 30, 2016 at 2:38 AM, Sun Rui <sunrise_win@163.com> wrote:
>
>> Maybe you can create your own subclass of RDD and override the
>> getPreferredLocations() to implement the logic of dynamic changing of the
>> locations.
>> > On Dec 30, 2016, at 12:06, Fei Hu <hufei68@gmail.com> wrote:
>> >
>> > Dear all,
>> >
>> > Is there any way to change the host location for a certain partition of
>> RDD?
>> >
>> > "protected def getPreferredLocations(split: Partition)" can be used to
>> initialize the location, but how to change it after the initialization?
>> >
>> >
>> > Thanks,
>> > Fei
>> >
>> >
>>
>>
>>
>
>

Mime
View raw message