hadoop-pig-dev mailing list archives

From "Dmitriy V. Ryaboy (JIRA)" <j...@apache.org>
Subject [jira] Updated: (PIG-1205) Enhance HBaseStorage-- Make it support loading row key and implement StoreFunc
Date Tue, 17 Aug 2010 00:55:19 GMT

     [ https://issues.apache.org/jira/browse/PIG-1205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy V. Ryaboy updated PIG-1205:
-----------------------------------

    Attachment: PIG_1205_5.path

This patch (not really review-ready yet) introduces the Elephant-Bird improvements.

You can use the -gt, -gte, -lt, and -lte flags to filter by row-key range, specify caching and
per-region row limits, and choose the caster to use (interpret Strings, as before, or use bytes
directly for more efficient storage and communication).
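
To make the range flags concrete, here is a minimal Python sketch of the row-key test they imply.
It is not code from the patch: KeyRange and accepts are hypothetical names, and the sketch assumes
keys are ordered lexicographically by unsigned byte value, which is how HBase sorts row keys.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class KeyRange:
    """Hypothetical holder for the -gt/-gte/-lt/-lte bounds (None means unset)."""
    gt: Optional[bytes] = None
    gte: Optional[bytes] = None
    lt: Optional[bytes] = None
    lte: Optional[bytes] = None

    def accepts(self, row_key: bytes) -> bool:
        # Python compares bytes objects lexicographically by unsigned byte
        # value; this is assumed to match the table's row-key ordering.
        if self.gt is not None and not row_key > self.gt:
            return False
        if self.gte is not None and not row_key >= self.gte:
            return False
        if self.lt is not None and not row_key < self.lt:
            return False
        if self.lte is not None and not row_key <= self.lte:
            return False
        return True


# Usage: keep keys in [row100, row200).
key_range = KeyRange(gte=b"row100", lt=b"row200")
for key in (b"row099", b"row100", b"row150", b"row200"):
    print(key, key_range.accepts(key))
```

Leaving a bound unset means that side of the range is open, so a KeyRange with no bounds accepts
every key.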

The filtering is a bit off because it still spins up all the map tasks; the ones whose keys
are filtered out just finish extremely fast.
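
One way such empty map tasks could be avoided is to skip regions that cannot overlap the requested
key range before any task is created for them. The Python sketch below illustrates only that idea
and is not what the patch does: overlaps and prune_regions are hypothetical names, the range is
simplified to an inclusive start and exclusive stop, and it assumes the HBase convention that a
region covers [start_key, end_key) with an empty key meaning unbounded.

```python
from typing import List, Tuple

# (start_key inclusive, end_key exclusive); b"" means unbounded on that side.
Region = Tuple[bytes, bytes]


def overlaps(region: Region, scan_start: bytes = b"", scan_stop: bytes = b"") -> bool:
    """Return True if the region may hold keys in [scan_start, scan_stop)."""
    region_start, region_end = region
    # The region must begin before the scan stops...
    starts_before_stop = scan_stop == b"" or region_start < scan_stop
    # ...and must end after the scan starts.
    ends_after_start = region_end == b"" or scan_start < region_end
    return starts_before_stop and ends_after_start


def prune_regions(regions: List[Region],
                  scan_start: bytes = b"",
                  scan_stop: bytes = b"") -> List[Region]:
    """Keep only regions worth creating a map task for."""
    return [r for r in regions if overlaps(r, scan_start, scan_stop)]


# Usage: four regions splitting the key space at g, n and t.
regions = [(b"", b"g"), (b"g", b"n"), (b"n", b"t"), (b"t", b"")]
print(prune_regions(regions, scan_start=b"h", scan_stop=b"p"))
```

With no range given, every region is kept, which corresponds to the current behavior of one map
task per region.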

The progress reporting is a bit jittery, but better than nothing.

TODO: fix up filtering, add projection pushdown, add filter pushdown, and write better tests.



> Enhance HBaseStorage-- Make it support loading row key and implement StoreFunc
> ------------------------------------------------------------------------------
>
>                 Key: PIG-1205
>                 URL: https://issues.apache.org/jira/browse/PIG-1205
>             Project: Pig
>          Issue Type: Sub-task
>    Affects Versions: 0.7.0
>            Reporter: Jeff Zhang
>            Assignee: Dmitriy V. Ryaboy
>             Fix For: 0.8.0
>
>         Attachments: PIG_1205.patch, PIG_1205_2.patch, PIG_1205_3.patch, PIG_1205_4.patch, PIG_1205_5.path
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

