hadoop-hdfs-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Work logged] (HDDS-1496) Support partial chunk reads and checksum verification
Date Thu, 06 Jun 2019 20:41:00 GMT

     [ https://issues.apache.org/jira/browse/HDDS-1496?focusedWorklogId=255416&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-255416 ]

ASF GitHub Bot logged work on HDDS-1496:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 06/Jun/19 20:40
            Start Date: 06/Jun/19 20:40
    Worklog Time Spent: 10m 
      Work Description: hanishakoneru commented on pull request #804: HDDS-1496. Support partial chunk reads and checksum verification
URL: https://github.com/apache/hadoop/pull/804#discussion_r291362428
 
 

 ##########
 File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/KeyInputStream.java
 ##########
 @@ -146,22 +160,34 @@ public synchronized int read(byte[] b, int off, int len) throws IOException {
         // this case.
         throw new IOException(String.format(
             "Inconsistent read for blockID=%s length=%d numBytesRead=%d",
-            current.blockInputStream.getBlockID(), current.length,
-            numBytesRead));
+            current.getBlockID(), current.getLength(), numBytesRead));
       }
       totalReadLen += numBytesRead;
       off += numBytesRead;
       len -= numBytesRead;
       if (current.getRemaining() <= 0 &&
-          ((currentStreamIndex + 1) < streamEntries.size())) {
-        currentStreamIndex += 1;
+          ((blockIndex + 1) < blockStreams.size())) {
+        blockIndex += 1;
       }
     }
     return totalReadLen;
   }
 
+  /**
+   * Seeks the KeyInputStream to the specified position. This involves 2 steps:
+   *    1. Updating the blockIndex to the blockStream corresponding to the
+   *    seeked position.
+   *    2. Seeking the corresponding blockStream to the adjusted position.
+   *
+   * For example, let’s say the block size is 200 bytes and block[0] stores
+   * data from indices 0 - 199, block[1] from indices 200 - 399 and so on.
+   * Let’s say we seek to position 240. In the first step, the blockIndex
+   * would be updated to 1 as indices 200 - 399 reside in blockStream[1]. In
+   * the second step, the blockStream[1] would be seeked to position 40 (=
+   * 240 - blockOffset[1] (= 200)).
+   */
   @Override
-  public void seek(long pos) throws IOException {
+  public synchronized void seek(long pos) throws IOException {
     checkNotClosed();
 
 Review comment:
   done
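
The two-step seek described in the Javadoc of the diff above can be sketched in isolation. This is a minimal illustration, not the KeyInputStream code from the pull request: the class name SeekSketch, the method locate, and the blockOffsets array are hypothetical names chosen here, and the sketch assumes the starting offset of every block is known up front and sorted in ascending order.

```java
import java.util.Arrays;

// Hypothetical helper, not part of the Ozone client.
final class SeekSketch {

    /**
     * Maps a key-level position to {blockIndex, offsetWithinBlock}.
     * blockOffsets[i] is assumed to hold the key-level offset at which
     * block i starts, in ascending order, with blockOffsets[0] == 0.
     */
    static long[] locate(long[] blockOffsets, long pos) {
        if (blockOffsets.length == 0 || pos < 0) {
            throw new IllegalArgumentException("invalid position: " + pos);
        }
        // Step 1: find the block whose range contains pos.
        int idx = Arrays.binarySearch(blockOffsets, pos);
        if (idx < 0) {
            // Not an exact block start: binarySearch returns
            // -(insertionPoint) - 1, and the containing block is the one
            // just before the insertion point.
            idx = -idx - 2;
        }
        // Step 2: translate pos into an offset relative to that block.
        long offsetInBlock = pos - blockOffsets[idx];
        return new long[] {idx, offsetInBlock};
    }
}
```

With block starts {0, 200, 400} and a seek to position 240, the sketch yields block index 1 and an in-block offset of 40, matching the example in the Javadoc. It does not check that the position lies before the end of the last block; a real stream would also need the total key length for that.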
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 255416)
    Time Spent: 9h  (was: 8h 50m)

> Support partial chunk reads and checksum verification
> -----------------------------------------------------
>
>                 Key: HDDS-1496
>                 URL: https://issues.apache.org/jira/browse/HDDS-1496
>             Project: Hadoop Distributed Data Store
>          Issue Type: Improvement
>            Reporter: Hanisha Koneru
>            Assignee: Hanisha Koneru
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 9h
>  Remaining Estimate: 0h
>
> BlockInputStream#readChunkFromContainer() reads the whole chunk from disk even if we
> need to read only a part of the chunk.
> This Jira aims to improve readChunkFromContainer so that only the part of the chunk
> file needed by the client is read, plus the part of the chunk file required to verify
> the checksum.
> For example, let's say the client is reading from index 120 to 450 in the chunk, and let's
> say a checksum is stored for every 100 bytes in the chunk, i.e. the first checksum is for bytes
> from index 0 to 99, the next for bytes from index 100 to 199, and so on. To verify bytes
> 120 to 450, we would need to read bytes 100 to 499 so that checksum verification can
> be done.
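
The range widening in the example above amounts to rounding the start down and the end up to checksum boundaries. The following is a minimal sketch of that arithmetic only, not the readChunkFromContainer change itself: the class ChecksumRangeSketch, the method alignedRange, and its parameters are hypothetical names, and it assumes inclusive byte indices and a fixed number of bytes per checksum.

```java
// Hypothetical helper, not part of the Ozone client.
final class ChecksumRangeSketch {

    /**
     * Widens the inclusive range [start, end] so that it covers whole
     * checksum units, clamped to the chunk length.
     * Returns {alignedStart, alignedEnd}, both inclusive.
     */
    static long[] alignedRange(long start, long end,
                               long bytesPerChecksum, long chunkLength) {
        if (bytesPerChecksum <= 0 || start < 0 || end < start
                || end >= chunkLength) {
            throw new IllegalArgumentException("invalid range or checksum size");
        }
        // Round the start down to the beginning of its checksum unit.
        long alignedStart = (start / bytesPerChecksum) * bytesPerChecksum;
        // Round the end up to the last byte of its checksum unit; the final
        // unit of a chunk may be shorter, so clamp to the chunk length.
        long endExclusive = Math.min(
            (end / bytesPerChecksum + 1) * bytesPerChecksum, chunkLength);
        return new long[] {alignedStart, endExclusive - 1};
    }
}
```

For a request of bytes 120 to 450 with 100 bytes per checksum in a chunk of at least 500 bytes, the sketch returns the range 100 to 499, as in the issue description. The bytes outside the requested range would be read only for verification and then discarded before the data is returned to the caller.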



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

