hadoop-hdfs-issues mailing list archives

From "Virajith Jalaparti (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HDFS-12778) [READ] Report multiple locations for PROVIDED blocks
Date Thu, 16 Nov 2017 14:49:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-12778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16255428#comment-16255428 ]

Virajith Jalaparti edited comment on HDFS-12778 at 11/16/17 2:48 PM:
---------------------------------------------------------------------

Thanks for taking a look [~elgoiri]. Posting a new patch with the additional test cases ({{testNumberOfProvidedLocations}}
and {{testNumberOfProvidedLocationsManyBlocks}}). 

bq. Should we make the block locations deterministic to some degree? I can see two mappers
trying to access the same block, and in that way some caching could be done.

Yes, that is entirely possible. I agree that returning a consistent set of locations can help
with things like caching. We can fix this as part of HDFS-12809.
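
To illustrate what "a consistent set of locations" could mean, here is a minimal sketch. It is not the HDFS-12778 patch and not what HDFS-12809 implements; the class {{ProvidedLocationSketch}}, the method {{chooseLocations}}, and the Datanode names are hypothetical. It assumes the list of Datanodes that can serve PROVIDED blocks is kept in a stable order, and it derives the starting index from the block ID so that repeated calls for the same block return the same Datanodes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names come from the HDFS code base.
public class ProvidedLocationSketch {

    /**
     * Returns up to {@code count} datanodes for a block. The starting index is
     * derived from the block ID, so the result is the same on every call as
     * long as the datanode list keeps the same order (an assumption here).
     */
    static List<String> chooseLocations(long blockId, List<String> datanodes, int count) {
        List<String> chosen = new ArrayList<>();
        int n = datanodes.size();
        if (n == 0 || count <= 0) {
            return chosen;
        }
        // floorMod keeps the index non-negative even for negative block IDs.
        int start = (int) Math.floorMod(blockId, (long) n);
        // Walk the list circularly, never returning more nodes than exist.
        for (int i = 0; i < Math.min(count, n); i++) {
            chosen.add(datanodes.get((start + i) % n));
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<String> datanodes = Arrays.asList("dn1", "dn2", "dn3", "dn4", "dn5");
        // Block 7 with 5 datanodes starts at index 2.
        System.out.println(chooseLocations(7L, datanodes, 3)); // prints [dn3, dn4, dn5]
    }
}
```

With a scheme like this, two mappers asking for the same block would be pointed at the same Datanodes, which is what makes caching on those nodes useful. The trade-off is that the mapping shifts whenever the Datanode list changes, so a real implementation would need to account for that.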


was (Author: virajith):
Thanks for taking a look [~elgoiri]. Posting a new patch with the additional test cases ({{testNumberOfProvidedLocations}}
and {{testNumberOfProvidedLocationsManyBlocks}}). 

> [READ] Report multiple locations for PROVIDED blocks
> ----------------------------------------------------
>
>                 Key: HDFS-12778
>                 URL: https://issues.apache.org/jira/browse/HDFS-12778
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Virajith Jalaparti
>            Assignee: Virajith Jalaparti
>         Attachments: HDFS-12778-HDFS-9806.001.patch, HDFS-12778-HDFS-9806.002.patch
>
>
> On {{getBlockLocations}}, only one Datanode is returned as the location for all PROVIDED
blocks. This can hurt the performance of applications which typically expect 3 locations per block.
We need to return multiple Datanodes for each PROVIDED block for better application performance/resilience.




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


