hadoop-common-issues mailing list archives

From "Mahadev konar (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-6097) Multiple bugs w/ Hadoop archives
Date Wed, 19 Aug 2009 16:53:14 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12745122#action_12745122 ]

Mahadev konar commented on HADOOP-6097:
---------------------------------------

Ben,
  Koji is right. The caching in question is just filesystem caching. The filesystem cache
keeps a cache for each scheme. For the har filesystem, the cache key is built from the scheme
and the har path, so a har filesystem is uniquely identified by its har:///archivepath.
Connection caching has nothing to do with this filesystem cache. It is done in the RPC layer,
and you would not be able to cache connections at the har filesystem layer.
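
To make the keying concrete, here is a minimal sketch of a cache keyed by scheme plus archive path. It is not Hadoop's FileSystem cache code: FsCacheSketch, Key, and FakeFileSystem are hypothetical names used only for illustration, and the archive paths in the usage note are made up.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch, not Hadoop code: one cached object per (scheme, archive path).
class FsCacheSketch {

    /** Hypothetical cache key: the URI scheme plus the archive path. */
    static final class Key {
        final String scheme;
        final String archivePath;

        Key(String scheme, String archivePath) {
            this.scheme = scheme;
            this.archivePath = archivePath;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) {
                return false;
            }
            Key other = (Key) o;
            return Objects.equals(scheme, other.scheme)
                && Objects.equals(archivePath, other.archivePath);
        }

        @Override
        public int hashCode() {
            return Objects.hash(scheme, archivePath);
        }
    }

    /** Stand-in for a filesystem object; it only remembers its URI. */
    static final class FakeFileSystem {
        final URI uri;

        FakeFileSystem(URI uri) {
            this.uri = uri;
        }
    }

    private final Map<Key, FakeFileSystem> cache = new HashMap<>();

    /** Returns the cached instance for this archive, creating it on first use. */
    FakeFileSystem get(URI archiveUri) {
        Key key = new Key(archiveUri.getScheme(), archiveUri.getPath());
        return cache.computeIfAbsent(key, k -> new FakeFileSystem(archiveUri));
    }

    int size() {
        return cache.size();
    }
}
```

With this keying, two lookups of har:///user/a.har share one instance, while har:///user/b.har gets its own. No network connection is held at this layer in the sketch, which mirrors the point that connection caching lives elsewhere.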



> Multiple bugs w/ Hadoop archives
> --------------------------------
>
>                 Key: HADOOP-6097
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6097
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.18.0, 0.18.1, 0.18.2, 0.18.3, 0.19.0, 0.19.1, 0.20.0
>            Reporter: Ben Slusky
>             Fix For: 0.20.1
>
>         Attachments: HADOOP-6097.patch
>
>
> Found and fixed several bugs involving Hadoop archives:
> - In makeQualified(), the sloppy conversion from Path to URI and back mangles the path
>   if it contains an escape-worthy character.
> - It's possible that fileStatusInIndex() may have to read more than one segment of the
>   index. The LineReader and count of bytes read need to be reset for each block.
> - har:// connections cannot be indexed by (scheme, authority, username) -- the path is
>   significant as well. Caching them in this way limits a hadoop client to opening one archive
>   per filesystem. It seems to be safe not to cache them, since they wrap another connection
>   that does the actual networking.
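
The first bullet can be illustrated with java.net.URI alone. This is a sketch of the general failure mode, not the code from HADOOP-6097.patch or from makeQualified() itself: PathEscapeSketch, its two methods, and the sample file name are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical sketch: how a path containing '%' can be mangled by a
// String -> URI -> String round trip, and one way to avoid it.
class PathEscapeSketch {

    /** Lossy: parses the raw path as if it were already-escaped URI text. */
    static String sloppyRoundTrip(String rawPath) {
        // URI.create() treats "%20" as an escape, so getPath() decodes it.
        return URI.create("har://" + rawPath).getPath();
    }

    /** Lossless: the multi-argument constructor quotes '%' before storing it. */
    static String carefulRoundTrip(String rawPath) {
        try {
            // Arguments are (scheme, host, path, fragment).
            return new URI("har", null, rawPath, null).getPath();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

For a file literally named "file%20name", sloppyRoundTrip("/dir/file%20name") comes back as "/dir/file name", while carefulRoundTrip returns the path unchanged.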

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

