hadoop-hive-dev mailing list archives

From "He Yongqiang (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (HIVE-352) Make Hive support column based storage
Date Tue, 21 Apr 2009 02:04:47 GMT

    [ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12700764#action_12700764
] 

He Yongqiang edited comment on HIVE-352 at 4/20/09 7:04 PM:
------------------------------------------------------------

While testing RCFile's read performance, I noticed severe read-performance degradation as the number of columns grows.
I tentatively suspect it is caused by the skip code in DFSClient's DFSInputStream, shown
below.
{noformat}
        int diff = (int)(targetPos - pos);
        if (diff <= TCP_WINDOW_SIZE) {
          try {
            pos += blockReader.skip(diff);
            if (pos == targetPos) {
              done = true;
            }
          } catch (IOException e) { //make following read to retry
            LOG.debug("Exception while seek to " + targetPos + " from "
                      + currentBlock + " of " + src + " from " + currentNode +
                      ": " + StringUtils.stringifyException(e));
          }
        }
{noformat}
It seems that if I remove this piece of code, everything still works correctly (in my test
code).
After an offline discussion with Zheng, we drafted several improvements:
1. Compress each column directly: keep one codec per column and write data straight into
that column's compression stream. Currently RCFile buffers all the data raw first and, when
the buffered data exceeds a configured size, compresses each column separately and flushes
it out. The direct-compression strategy can increase the compression ratio. (This is unrelated
to the severe read-performance degradation problem.)
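A minimal sketch of the per-column codec idea, using java.util.zip's Deflater as a stand-in codec. The class and method names are illustrative only, not the actual RCFile patch:
{noformat}
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.DeflaterOutputStream;

// Hypothetical sketch (not the real RCFile API): one compression stream
// per column, so values are compressed as they arrive instead of being
// buffered raw and compressed only at flush time.
public class PerColumnCompressor {
    private final ByteArrayOutputStream[] compressed;
    private final DeflaterOutputStream[] streams;

    public PerColumnCompressor(int columnCount) {
        compressed = new ByteArrayOutputStream[columnCount];
        streams = new DeflaterOutputStream[columnCount];
        for (int i = 0; i < columnCount; i++) {
            compressed[i] = new ByteArrayOutputStream();
            streams[i] = new DeflaterOutputStream(compressed[i]);
        }
    }

    // Append one serialized value directly to its column's codec stream.
    public void append(int column, byte[] value) {
        try {
            streams[column].write(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e); // cannot happen on a byte array
        }
    }

    // Finish every stream and report each column's compressed size.
    public int[] close() {
        int[] sizes = new int[streams.length];
        for (int i = 0; i < streams.length; i++) {
            try {
                streams[i].finish();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            sizes[i] = compressed[i].size();
        }
        return sizes;
    }
}
{noformat}
Because the codec sees a whole column's values back to back, repetitive column data compresses well without ever materializing the raw buffer.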
 
To overcome the bad skip performance caused by TCP_WINDOW_SIZE (128 KB, and not changeable
at all):
2. Merge consecutive skips into a single skip. This increases the number of bytes skipped
at once and so increases the probability of not executing the statements in the if block above.
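Improvement 2 can be sketched as a small helper that merges each run of unread columns into one larger skip; the names and shapes here are hypothetical, not the patch code:
{noformat}
import java.util.Arrays;

// Illustrative sketch: instead of issuing one skip per unread column,
// accumulate the lengths of consecutive unread columns and issue a single
// larger skip, which is more likely to exceed TCP_WINDOW_SIZE and avoid
// the slow read-and-discard path in DFSInputStream.
public class SkipCoalescer {
    // columnLengths[i] is the serialized size of column i in the row group;
    // wanted[i] marks the columns the query actually reads.
    public static long[] coalescedSkips(long[] columnLengths, boolean[] wanted) {
        long[] skips = new long[columnLengths.length];
        long pending = 0;
        int n = 0;
        for (int i = 0; i < columnLengths.length; i++) {
            if (wanted[i]) {
                if (pending > 0) { skips[n++] = pending; pending = 0; }
            } else {
                pending += columnLengths[i]; // merge into one bigger skip
            }
        }
        if (pending > 0) skips[n++] = pending;
        return Arrays.copyOf(skips, n);
    }
}
{noformat}
For example, skipping columns 1 and 2 between reads of columns 0 and 3 becomes one skip of their combined length rather than two smaller ones.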
3. Make the RCFile writer aware of TCP_WINDOW_SIZE and let as many columns as possible grow
larger than it.
    To do this, we need to add a SLOP variable that allows the buffer to grow beyond the
configured size (default 4 MB) up to SIZE*SLOP. With this we can let most columns' buffered
data exceed TCP_WINDOW_SIZE.
    This method runs into limits as the column count grows (we guess beyond about 100).

    Another way to do this is to set the buffer size to TCP_WINDOW_SIZE (128 KB) * columnNumber.
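A rough sketch of the SLOP-based flush decision; the SLOP value and the "most columns" threshold below are assumptions for illustration, not values from any patch:
{noformat}
// Hypothetical sketch of the SLOP idea: allow the row-group buffer to grow
// past the configured size up to size * SLOP, flushing early only once most
// columns have buffered more than TCP_WINDOW_SIZE bytes, so skipping them
// bypasses the slow small-skip path.
public class BufferSizer {
    static final int TCP_WINDOW_SIZE = 128 * 1024;
    static final double SLOP = 1.5; // assumed value for illustration

    public static boolean shouldFlush(long bufferedBytes, long configuredSize,
                                      long[] perColumnBytes) {
        if (bufferedBytes >= (long) (configuredSize * SLOP)) {
            return true; // hard cap: never exceed size * SLOP
        }
        if (bufferedBytes < configuredSize) {
            return false; // below the configured size, keep buffering
        }
        // Between size and size * SLOP: flush once at least half the
        // columns are larger than TCP_WINDOW_SIZE.
        int bigEnough = 0;
        for (long b : perColumnBytes) {
            if (b > TCP_WINDOW_SIZE) bigEnough++;
        }
        return bigEnough * 2 >= perColumnBytes.length;
    }
}
{noformat}
This also makes the limitation visible: with hundreds of columns, even a 6 MB buffer cannot give most columns more than 128 KB each.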

    In any case, all the solutions I can think of only ease the problem rather than eliminate it.

Any thoughts on this? Thanks!

> Make Hive support column based storage
> --------------------------------------
>
>                 Key: HIVE-352
>                 URL: https://issues.apache.org/jira/browse/HIVE-352
>             Project: Hadoop Hive
>          Issue Type: New Feature
>            Reporter: He Yongqiang
>            Assignee: He Yongqiang
>         Attachments: hive-352-2009-4-15.patch, hive-352-2009-4-16.patch, hive-352-2009-4-17.patch,
hive-352-2009-4-19.patch, HIve-352-draft-2009-03-28.patch, Hive-352-draft-2009-03-30.patch
>
>
> Column-based storage has been proven a better storage layout for OLAP.
> Hive does a great job on raw row-oriented storage. In this issue, we will enhance Hive
to support column-based storage.
> Actually we have done some work on column-based storage on top of HDFS; I think it will
need some review and refactoring to port it to Hive.
> Any thoughts?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

