hadoop-hdfs-issues mailing list archives

From "Vinayakumar B (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-8468) 2 RPC calls for every file read in DFSClient#open(..) resulting in double Audit log entries
Date Mon, 29 Jun 2015 12:19:04 GMT

     [ https://issues.apache.org/jira/browse/HDFS-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinayakumar B updated HDFS-8468:
--------------------------------
    Attachment: HDFS-8468-HDFS-7285.patch

Attached the patch.

The idea is to carry {{ECSchema}} and {{stripeCellSize}} in {{LocatedBlocks}} instead of
{{HdfsFileStatus}} at the time of reading.
The fetched {{LocatedBlocks}} can then be reused inside {{DFSInputStream}} unless a refresh
is required.

So, on the NN side, only the 'open' command will be logged in the audit log per file, as before
(again, unless a refresh is required).

The main test cases for this fix are the existing 'TestAuditLogs' tests, which are currently
failing in the branch.
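
To illustrate the idea above, here is a minimal, self-contained sketch of the single-RPC open path. It is not the code in the attached patch and does not use the real HDFS classes: {{FakeNameNode}}, {{LocatedBlocksInfo}}, {{InputStreamSketch}}, the "/ec/" path convention, the schema name and the cell size are all hypothetical stand-ins chosen for illustration. It only shows that when the striping information travels with the block locations, one audited call per open is enough, and a second call happens only on refresh.

```java
import java.util.ArrayList;
import java.util.List;

public class OpenRpcSketch {

    /** Hypothetical stand-in for LocatedBlocks that also carries the EC info. */
    static final class LocatedBlocksInfo {
        final String ecSchemaName;  // null for a non-striped file (assumption)
        final int stripeCellSize;   // 0 for a non-striped file (assumption)

        LocatedBlocksInfo(String ecSchemaName, int stripeCellSize) {
            this.ecSchemaName = ecSchemaName;
            this.stripeCellSize = stripeCellSize;
        }

        boolean isStriped() {
            return ecSchemaName != null;
        }
    }

    /** Hypothetical NameNode stand-in: every call appends one audit entry. */
    static final class FakeNameNode {
        final List<String> auditLog = new ArrayList<>();

        LocatedBlocksInfo getBlockLocations(String src) {
            auditLog.add("cmd=open src=" + src);
            // Made-up rule for the sketch: files under /ec/ are striped.
            if (src.startsWith("/ec/")) {
                return new LocatedBlocksInfo("example-schema", 64 * 1024);
            }
            return new LocatedBlocksInfo(null, 0);
        }
    }

    /** Hypothetical stand-in for DFSInputStream that reuses fetched locations. */
    static final class InputStreamSketch {
        private final FakeNameNode nn;
        private final String src;
        private LocatedBlocksInfo cached;

        InputStreamSketch(FakeNameNode nn, String src, LocatedBlocksInfo fetched) {
            this.nn = nn;
            this.src = src;
            this.cached = fetched;
        }

        /** Returns the cached locations; calls the NameNode only on refresh. */
        LocatedBlocksInfo locatedBlocks(boolean refresh) {
            if (cached == null || refresh) {
                cached = nn.getBlockLocations(src);
            }
            return cached;
        }
    }

    /** One call to the NameNode answers both "striped?" and "which schema?". */
    static InputStreamSketch open(FakeNameNode nn, String src) {
        LocatedBlocksInfo blocks = nn.getBlockLocations(src);
        return new InputStreamSketch(nn, src, blocks);
    }

    public static void main(String[] args) {
        FakeNameNode nn = new FakeNameNode();
        InputStreamSketch in = open(nn, "/ec/data.bin");
        in.locatedBlocks(false);                  // served from the cached copy
        System.out.println(nn.auditLog.size());   // prints 1
        in.locatedBlocks(true);                   // explicit refresh
        System.out.println(nn.auditLog.size());   // prints 2
    }
}
```

In this sketch the audit log grows by one entry per open and by one more per refresh, which is the behaviour the existing audit log tests are described as expecting.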

> 2 RPC calls for every file read in DFSClient#open(..) resulting in double Audit log entries
> -------------------------------------------------------------------------------------------
>
>                 Key: HDFS-8468
>                 URL: https://issues.apache.org/jira/browse/HDFS-8468
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Vinayakumar B
>            Assignee: Vinayakumar B
>         Attachments: HDFS-8468-HDFS-7285.patch
>
>
> In the HDFS-7285 branch, 2 RPCs are made to the Namenode to determine whether a file is
> striped and to get the schema for the file.
> This results in double audit log entries for every file read, for both striped and
> non-striped files.
> This will have a major impact on the size of the audit logs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
