hadoop-hive-dev mailing list archives

From "Namit Jain (JIRA)" <j...@apache.org>
Subject [jira] Created: (HIVE-1293) Concurrency Model for Hive
Date Wed, 07 Apr 2010 23:39:36 GMT
Concurrency Model for Hive

                 Key: HIVE-1293
                 URL: https://issues.apache.org/jira/browse/HIVE-1293
             Project: Hadoop Hive
          Issue Type: New Feature
          Components: Query Processor
            Reporter: Namit Jain

Concurrency model for Hive:

Currently, Hive does not provide a good concurrency model. The only guarantee provided in
the case of concurrent readers and writers is that a reader will never see a mix of partial
data from the old version (before the write) and partial data from the new version (after
the write).
This has turned out to be a big problem, especially for background processes performing
maintenance.

The following possible solutions come to mind.

1. Locks: Acquire read/write locks. They can all be acquired at the beginning of the query,
or the write locks can be delayed until the move task (when the directory is actually moved).
Care must be taken to avoid deadlocks.

2. Versioning: The writer can create a new version if the current version is being read. Note
that this is not equivalent to snapshots: the old version can only be accessed by the readers
that are already using it, and it will be deleted when all of them have finished.
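
To make the locking option concrete, here is a minimal sketch in Python, not Hive code. It assumes an in-memory lock table keyed by table or partition name. ReadWriteLock, LockManager and query_locks are hypothetical names invented for this illustration. All locks are taken at the beginning of the query, and always in sorted name order, which is one common way to rule out deadlocks between two queries that touch the same tables.

```python
import threading
from contextlib import contextmanager


class ReadWriteLock:
    """Many concurrent readers or one exclusive writer (hypothetical helper)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    def acquire_read(self):
        with self._cond:
            while self._writer:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            while self._writer or self._readers > 0:
                self._cond.wait()
            self._writer = True

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()


class LockManager:
    """Hypothetical in-memory lock table keyed by table or partition name."""

    def __init__(self):
        self._guard = threading.Lock()
        self._locks = {}

    def _lock_for(self, name):
        with self._guard:
            return self._locks.setdefault(name, ReadWriteLock())

    @contextmanager
    def query_locks(self, reads=(), writes=()):
        # A name that is both read and written only needs the write lock.
        modes = {name: "read" for name in reads}
        modes.update({name: "write" for name in writes})
        held = []
        try:
            # Sorted acquisition order: no two queries can wait on each other in a cycle.
            for name in sorted(modes):
                lock = self._lock_for(name)
                if modes[name] == "write":
                    lock.acquire_write()
                else:
                    lock.acquire_read()
                held.append((lock, modes[name]))
            yield
        finally:
            # Release in reverse order of acquisition.
            for lock, mode in reversed(held):
                if mode == "write":
                    lock.release_write()
                else:
                    lock.release_read()
```

A query would wrap its execution in "with manager.query_locks(reads=[...], writes=[...]):". Delaying the write locks until the move task would shorten the time they are held, but it gives up the single sorted acquisition pass, so deadlock avoidance would need separate handling in that variant.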

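The versioning option can be sketched in the same way. This is an illustrative Python sketch only. VersionedTable and its methods are hypothetical, and a dictionary stands in for the versioned directories on disk. For simplicity the writer here always publishes a new version, and the old one is removed immediately when nobody is reading it. Otherwise it is removed when its last reader finishes.

```python
import threading


class VersionedTable:
    """Hypothetical in-memory stand-in for a table directory with versions."""

    def __init__(self, data):
        self._guard = threading.Lock()
        self._versions = {0: data}  # version id -> data
        self._readers = {0: 0}      # version id -> number of active readers
        self._current = 0

    def begin_read(self):
        """Pin the current version and return (version_id, data)."""
        with self._guard:
            vid = self._current
            self._readers[vid] += 1
            return vid, self._versions[vid]

    def end_read(self, vid):
        """Unpin a version; delete it if it is stale and has no readers left."""
        with self._guard:
            self._readers[vid] -= 1
            self._drop_if_unused(vid)

    def write(self, data):
        """Publish data as the new current version; new readers see only it."""
        with self._guard:
            old = self._current
            self._current = old + 1
            self._versions[self._current] = data
            self._readers[self._current] = 0
            # With no active readers, the old version is removed right away.
            self._drop_if_unused(old)

    def _drop_if_unused(self, vid):
        # Caller must hold self._guard.
        if vid != self._current and self._readers[vid] == 0:
            del self._versions[vid]
            del self._readers[vid]

    def live_versions(self):
        with self._guard:
            return sorted(self._versions)
```

Because begin_read only ever hands out the current version, a stale version cannot be reopened once a newer one exists. That is the difference from snapshots noted in item 2.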

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
