phoenix-dev mailing list archives

From "James Taylor (JIRA)" <>
Subject [jira] [Commented] (PHOENIX-4344) MapReduce Delete Support
Date Thu, 02 Nov 2017 00:51:02 GMT


James Taylor commented on PHOENIX-4344:

Here's a possible way to proceed with this:
- In PhoenixInputFormat, we drive things based on a QueryPlan. I think the first thing we'll
need is PHOENIX-4342 - providing a way of getting the underlying QueryPlan from a MutationPlan
(which is what you get when you compile a DELETE statement).
- Create a different implementation of PhoenixInputFormat.getQueryPlan() that compiles the DELETE
statement and gets the QueryPlan from the MutationPlan.
- Keep the same logic that ends up setting up one mapper per scan in the QueryPlan.
- Instead of executing each individual scan, you'd want to execute a DELETE statement bounded
by the start/stop key of each scan
- Execute code just like FormatToBytesWritableMapper to put together the list of Delete mutations
- Make sure we've got the write-to-multiple HTables working correctly (I believe MultiHfileOutputFormat
does that)
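
A rough sketch of the "one mapper per scan, each running a DELETE bounded by that scan's start/stop key" idea, in plain Java with no Phoenix or HBase classes. Everything here is an assumption made for illustration: KeyRange, deletesForRange, the String row keys and the in-memory TreeMap table are hypothetical stand-ins, not Phoenix API. The real work would go through the QueryPlan/MutationPlan as described above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.function.Predicate;

public class ScanBoundedDeleteSketch {

    /** Hypothetical stand-in for one scan: [startKey, stopKey), empty string means unbounded. */
    static final class KeyRange {
        final String startKey;
        final String stopKey;

        KeyRange(String startKey, String stopKey) {
            this.startKey = startKey;
            this.stopKey = stopKey;
        }
    }

    /**
     * What a single "mapper" would do in this toy model: look only at rows inside
     * its key range and collect the row keys matched by the DELETE's WHERE clause.
     */
    static List<String> deletesForRange(NavigableMap<String, Integer> table,
                                        KeyRange range,
                                        Predicate<Integer> where) {
        NavigableMap<String, Integer> slice = table;
        if (!range.startKey.isEmpty()) {
            slice = slice.tailMap(range.startKey, true);   // start key is inclusive
        }
        if (!range.stopKey.isEmpty()) {
            slice = slice.headMap(range.stopKey, false);   // stop key is exclusive
        }
        List<String> deletes = new ArrayList<>();
        for (Map.Entry<String, Integer> row : slice.entrySet()) {
            if (where.test(row.getValue())) {
                deletes.add(row.getKey());                 // stand-in for a Delete mutation
            }
        }
        return deletes;
    }

    public static void main(String[] args) {
        // Toy table: row key -> single column value.
        NavigableMap<String, Integer> table = new TreeMap<>();
        table.put("a", 5);
        table.put("b", 20);
        table.put("c", 7);
        table.put("d", 30);
        table.put("e", 40);
        table.put("f", 1);

        // Three non-overlapping ranges that together cover the whole key space,
        // standing in for the scans in the QueryPlan.
        List<KeyRange> scans = new ArrayList<>();
        scans.add(new KeyRange("", "c"));
        scans.add(new KeyRange("c", "e"));
        scans.add(new KeyRange("e", ""));

        // Stand-in for "DELETE ... WHERE value > 10".
        Predicate<Integer> where = value -> value > 10;

        for (KeyRange scan : scans) {
            System.out.println("[" + scan.startKey + ", " + scan.stopKey + ") -> "
                    + deletesForRange(table, scan, where));
        }
    }
}
```

Because the ranges do not overlap and cover the whole key space, each matching row ends up in exactly one mapper's list of deletes, which is the property the per-scan bounding is meant to preserve.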

> MapReduce Delete Support
> ------------------------
>                 Key: PHOENIX-4344
>                 URL:
>             Project: Phoenix
>          Issue Type: New Feature
>    Affects Versions: 4.12.0
>            Reporter: Geoffrey Jacoby
>            Assignee: Geoffrey Jacoby
>            Priority: Major
> Phoenix already has the ability to use MapReduce for asynchronous handling of long-running
SELECTs. It would be really useful to have this capability for long-running DELETEs, particularly
of tables with indexes where using HBase's own MapReduce integration would be prohibitively

This message was sent by Atlassian JIRA