hive-dev mailing list archives

From "Phabricator (Updated) (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-2471) Add timestamp column with index to the partition stats table.
Date Fri, 16 Mar 2012 18:21:40 GMT

     [ https://issues.apache.org/jira/browse/HIVE-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HIVE-2471:
------------------------------

    Attachment: HIVE-2471.D2367.1.patch

kevinwilfong requested code review of "HIVE-2471 [jira] Add timestamp column with index to
the partition stats table.".
Reviewers: JIRA

  https://issues.apache.org/jira/browse/HIVE-2471

  Added a timestamp column to the stats table.  It defaults to the current timestamp on inserts.
 I also changed the update query to set the timestamp explicitly, since Derby does not support
the ON UPDATE option as far as I can tell.

  I modified the insert query to name the columns it inserts into.  This is not only
necessary to keep the query from inserting into the timestamp column, it is also safer in general.
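
  The two query changes above can be sketched as follows.  This is only an illustration, not
the patch itself: it uses Python's built-in sqlite3 module in place of Derby over JDBC, and
the table and column names (partition_stats, id, row_count, ts) are hypothetical stand-ins
for the ones defined in JDBCStatsSetupConstants.java.

```python
import sqlite3

# SQLite stands in for Derby here; all table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE partition_stats (
        id        TEXT PRIMARY KEY,
        row_count INTEGER,
        ts        TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    )
    """
)

# The insert names its columns, so ts is filled in by the column default.
conn.execute(
    "INSERT INTO partition_stats (id, row_count) VALUES (?, ?)",
    ("part=1", 10),
)

# A positional insert with the old two-value shape no longer matches the table.
try:
    conn.execute("INSERT INTO partition_stats VALUES (?, ?)", ("part=2", 20))
    positional_insert_failed = False
except sqlite3.OperationalError:
    positional_insert_failed = True

# Simulate a row that was written long ago.
old_ts = "2000-01-01 00:00:00"
conn.execute("UPDATE partition_stats SET ts = ? WHERE id = ?", (old_ts, "part=1"))

# The update refreshes ts itself instead of relying on an ON UPDATE clause.
conn.execute(
    "UPDATE partition_stats SET row_count = ?, ts = CURRENT_TIMESTAMP WHERE id = ?",
    (25, "part=1"),
)

row_count, ts = conn.execute(
    "SELECT row_count, ts FROM partition_stats WHERE id = ?", ("part=1",)
).fetchone()
```

  Setting ts in the update statement keeps the behavior the same on databases that do and
do not offer an automatic on-update timestamp.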

  Occasionally, after entries are added to the partition stats table, the program is halted
(by an exception, keyboard interrupt, etc.) before it can delete them.  These entries build
up until the table becomes very large, which hurts the performance of the frequently called
update statement.  To fix this, I am adding a column to the table which is auto-populated
with the current timestamp, along with an index on that column.  This will allow us to write
scripts that run periodically and clean out old entries from the table.  The index will help
keep the runtime of these scripts short, and hence reduce the amount of time they need to
hold locks on the table and indexes.
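
  A cleanup script of the kind described above might look roughly like this.  It is a sketch
under assumptions, not part of the patch: sqlite3 again stands in for the real stats database,
and the names partition_stats, ts, partition_stats_ts_idx and purge_older_than are all
hypothetical.

```python
import sqlite3

# SQLite stands in for the real stats database; all names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE partition_stats (
        id        TEXT PRIMARY KEY,
        row_count INTEGER,
        ts        TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    )
    """
)
# Index on the timestamp column so the range delete below can avoid a full scan.
conn.execute("CREATE INDEX partition_stats_ts_idx ON partition_stats (ts)")

# One leftover row from a week ago and one row written just now.
conn.execute(
    "INSERT INTO partition_stats (id, row_count, ts) "
    "VALUES (?, ?, datetime('now', '-7 days'))",
    ("stale", 1),
)
conn.execute(
    "INSERT INTO partition_stats (id, row_count) VALUES (?, ?)",
    ("fresh", 2),
)


def purge_older_than(connection, days):
    """Delete rows whose timestamp is more than `days` days old; return the count."""
    cursor = connection.execute(
        "DELETE FROM partition_stats WHERE ts < datetime('now', ?)",
        (f"-{int(days)} days",),
    )
    connection.commit()
    return cursor.rowcount


deleted = purge_older_than(conn, 1)
remaining = [r[0] for r in conn.execute("SELECT id FROM partition_stats ORDER BY id")]
```

  Because the delete filters on ts alone, the index on that column is what keeps each run
short.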

TEST PLAN
  EMPTY

REVISION DETAIL
  https://reviews.facebook.net/D2367

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/stats/jdbc/JDBCStatsSetupConstants.java
  ql/src/java/org/apache/hadoop/hive/ql/stats/jdbc/JDBCStatsUtils.java

                
> Add timestamp column with index to the partition stats table.
> -------------------------------------------------------------
>
>                 Key: HIVE-2471
>                 URL: https://issues.apache.org/jira/browse/HIVE-2471
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Kevin Wilfong
>            Assignee: Kevin Wilfong
>         Attachments: HIVE-2471.1.patch.txt, HIVE-2471.D2367.1.patch
>
>


        
