hadoop-hive-dev mailing list archives

From "Paul Yang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-1332) Archiving partitions
Date Mon, 03 May 2010 21:38:57 GMT

    [ https://issues.apache.org/jira/browse/HIVE-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12863525#action_12863525 ]

Paul Yang commented on HIVE-1332:
---------------------------------

Yeah, the way the patch is now, concurrent operations are not supported, as it was assumed
these commands would be run via a single cron job. But that is probably not a good
assumption to make. The priority was to order the operations so as to prevent data loss
in case of any failures. The (un)archive operation is tricky to do concurrently
because there is no way to lock a partition/table (HIVE-1293) and no way to
atomically make a filesystem and metadata change. But there are ways of addressing these concurrency
issues while preserving data during failure scenarios:

Archiving a partition using a conservative approach would involve something like (as discussed
with Namit):

1. Create a copy of the partition, call it ds=1.copy
2. Alter metadata's location to point to ds=1.copy
-- At this point failures are okay as the copy is not touched
3. Make the archive of the partition directory in a tmp directory
4. Remove the directory ds=1
5. Move the tmp directory to ds=1
6. Alter metadata's location to point to har:/...ds=1
7. Delete ds=1.copy
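
The seven steps above can be sketched as a small local-filesystem simulation. This is
only an illustration, not the patch: the metastore is a plain dict, building the HAR is
replaced by a directory copy, and the names archive_conservative, readable and
fresh_partition are hypothetical. It stops after each step in turn and checks that the
location recorded in the metadata still exists.

```python
import os
import shutil
import tempfile


class InjectedFailure(Exception):
    """Raised to simulate a crash right after a given step."""


def archive_conservative(root, meta, part="ds=1", fail_after=None):
    """Walk the seven steps; raise after step `fail_after` if one is given."""
    orig = os.path.join(root, part)
    copy = orig + ".copy"
    tmp = os.path.join(root, "_tmp_archive")  # hypothetical tmp directory name

    def done(step):
        if fail_after == step:
            raise InjectedFailure(step)

    shutil.copytree(orig, copy)    # 1. create ds=1.copy
    done(1)
    meta[part] = copy              # 2. metadata -> ds=1.copy
    done(2)
    shutil.copytree(orig, tmp)     # 3. stand-in for building the HAR in tmp
    done(3)
    shutil.rmtree(orig)            # 4. remove ds=1
    done(4)
    os.rename(tmp, orig)           # 5. move tmp to ds=1
    done(5)
    meta[part] = "har:" + orig     # 6. metadata -> har:/...ds=1 (simulated)
    done(6)
    shutil.rmtree(copy)            # 7. delete ds=1.copy
    done(7)


def readable(meta, part="ds=1"):
    """True if the location recorded in the metadata exists and holds files."""
    loc = meta[part]
    if loc.startswith("har:"):
        loc = loc[len("har:"):]
    return os.path.isdir(loc) and bool(os.listdir(loc))


def fresh_partition():
    """Create a throwaway partition directory with one file in it."""
    root = tempfile.mkdtemp()
    path = os.path.join(root, "ds=1")
    os.makedirs(path)
    with open(os.path.join(path, "bucket_0"), "w") as f:
        f.write("row\n")
    return root, {"ds=1": path}


for step in range(1, 8):
    root, meta = fresh_partition()
    try:
        archive_conservative(root, meta, fail_after=step)
    except InjectedFailure:
        pass
    print(step, readable(meta))
    shutil.rmtree(root)
```

In this simulation the metadata location is readable after a failure at any of the seven
steps, because the copy is in place before the original directory is touched.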

The conservative sequence ensures that no matter when a failure occurs, subsequent queries on
the partition will continue to succeed. However, this approach incurs the overhead of having
to make a copy of the partition, which can be significant. Another approach is to:

1. Make the archive of the partition in a tmp directory
2. Move the archive folder to ds=1.copy
3. Move ds=1 to ds=1.old
4. Move ds=1.copy to ds=1
5. Alter the metadata to change the location to har:/...ds=1

The drawback to this approach is that if a failure occurs, subsequent queries will not be
able to properly access the data. However, the archive command can be run again to recover
from the situation.
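
A rough local-filesystem sketch of the second, cheaper sequence shows where the failure
window lies. As before, the dict metastore, the directory copy standing in for HAR
creation, and the names archive_in_place, location_exists and make_partition are all
assumptions for illustration. The recovery logic of re-running the command is not shown.

```python
import os
import shutil
import tempfile


def archive_in_place(root, meta, part="ds=1", fail_after=None):
    """Run the five steps; return early after step `fail_after` to mimic a crash."""
    orig = os.path.join(root, part)
    tmp = os.path.join(root, "_tmp_archive")  # hypothetical tmp directory name
    steps = [
        lambda: shutil.copytree(orig, tmp),          # 1. stand-in for building the HAR
        lambda: os.rename(tmp, orig + ".copy"),      # 2. tmp -> ds=1.copy
        lambda: os.rename(orig, orig + ".old"),      # 3. ds=1 -> ds=1.old
        lambda: os.rename(orig + ".copy", orig),     # 4. ds=1.copy -> ds=1
        lambda: meta.update({part: "har:" + orig}),  # 5. metadata -> har:/...ds=1
    ]
    for number, step in enumerate(steps, start=1):
        step()
        if fail_after == number:
            return False  # simulated crash
    return True


def location_exists(meta, part="ds=1"):
    """True if the directory recorded in the metadata exists on disk."""
    loc = meta[part]
    if loc.startswith("har:"):
        loc = loc[len("har:"):]
    return os.path.isdir(loc)


def make_partition():
    """Create a throwaway partition directory with one file in it."""
    root = tempfile.mkdtemp()
    path = os.path.join(root, "ds=1")
    os.makedirs(path)
    with open(os.path.join(path, "bucket_0"), "w") as f:
        f.write("row\n")
    return root, {"ds=1": path}


root, meta = make_partition()
archive_in_place(root, meta, fail_after=3)
print(location_exists(meta))                          # metadata still points at ds=1
print(os.path.isdir(os.path.join(root, "ds=1.old")))  # original files under ds=1.old
```

With a crash after step 3, the first check prints False and the second prints True: the
metadata points at a directory that no longer exists, but the original files are still on
disk under ds=1.old, so no data is lost.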

Also, since FileSystem.rename() does not throw an error if the destination
directory already exists, there is a small window for data duplication. However, this issue
is already present in INSERT OVERWRITE... These will be addressed with lock support.
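
The duplication window can be illustrated locally with Python's shutil.move, which
behaves similarly for directories: when the destination directory already exists, the
source is moved inside it instead of raising an error. This is a stand-in, not a call to
Hadoop's FileSystem.rename(), and the directory names run_a and run_b are hypothetical
outputs of two concurrent runs.

```python
import os
import shutil
import tempfile

root = tempfile.mkdtemp()

# Two runs each produce their own output directory for the same partition.
for name in ("run_a", "run_b"):
    d = os.path.join(root, name)
    os.makedirs(d)
    with open(os.path.join(d, "part-00000"), "w") as f:
        f.write("row\n")

dest = os.path.join(root, "ds=1")
shutil.move(os.path.join(root, "run_a"), dest)  # dest absent: a plain rename
shutil.move(os.path.join(root, "run_b"), dest)  # dest present: nested inside, no error

# List every file now reachable under ds=1.
found = sorted(
    os.path.relpath(os.path.join(dirpath, name), dest)
    for dirpath, _, files in os.walk(dest)
    for name in files
)
print(found)
```

Both copies of part-00000 end up under ds=1, one at the top level and one nested under
run_b, which is the kind of silent duplication that locking would prevent.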


> Archiving partitions
> --------------------
>
>                 Key: HIVE-1332
>                 URL: https://issues.apache.org/jira/browse/HIVE-1332
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Metastore
>    Affects Versions: 0.6.0
>            Reporter: Paul Yang
>            Assignee: Paul Yang
>         Attachments: HIVE-1332.1.patch
>
>
> Partitions and tables in Hive typically consist of many files on HDFS. An issue is that
> as the number of files increases, there will be higher memory/load requirements on the namenode.
> Partitions in bucketed tables are a particular problem because they consist of many files,
> one for each of the buckets.
> One way to drastically reduce the number of files is to use hadoop archives:
> http://hadoop.apache.org/common/docs/current/hadoop_archives.html
> This feature would introduce an ALTER TABLE <table_name> ARCHIVE PARTITION <spec>
> that would automatically put the files for the partition into a HAR file. We would also have
> an UNARCHIVE option to convert the files in the partition back to the original files. Archived
> partitions would be slower to access, but they would have the same functionality and decrease
> the number of files drastically. Typically, only seldom accessed partitions would be archived.
> Hadoop archives are still somewhat new, so we'll only put in support for the latest released
> major version (0.20). Here are some bug fixes:
> https://issues.apache.org/jira/browse/HADOOP-6591 (Important - could potentially cause
> data loss without this fix)
> https://issues.apache.org/jira/browse/HADOOP-6645
> https://issues.apache.org/jira/browse/MAPREDUCE-1585

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

