hbase-issues mailing list archives

From "Vasu Mariyala (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-10416) Improvements to the import flow
Date Fri, 24 Jan 2014 22:09:43 GMT

     [ https://issues.apache.org/jira/browse/HBASE-10416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vasu Mariyala updated HBASE-10416:
----------------------------------

    Description: 
The following improvements can be made to the Import logic:

a) Make the import extensible: turn the filter from a static member of Import into an
instance variable of the mapper, and make the mappers and the variables of interest
protected.

b) Make sure that Import calls the filterRowKey method of the filter. This is useful when
filtering the data of an organization based on the row key, or when using filters such as
PrefixFilter, which filter the data in the filterRowKey method rather than in the
filterKeyValue method. The existing test case in TestImportExport#testWithFilter relies on
this behavior, but has passed so far only because a single row is inserted into the table.

c) Provide an option to specify the durability during the import. Specifying the Durability
as SKIP_WAL would improve the performance of a restore considerably. [~lhofhansl] suggested
that this should be a parameter to the import.

d) Some minor refactoring to avoid building a comma-separated string for the filter args.
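
To illustrate item b), here is a minimal, self-contained sketch of why skipping the row-key
check goes unnoticed with a single matching row but not with several rows. It does not use
the HBase API: RowFilter, PrefixRowFilter and importRows are hypothetical stand-ins, and
includeCell is a simplified placeholder for the per-cell check that filterKeyValue performs.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class RowKeyFilterSketch {

    /** Hypothetical stand-in for the two filter hooks; not the HBase Filter class. */
    interface RowFilter {
        /** Returns true when the whole row should be skipped. */
        boolean filterRowKey(byte[] rowKey);

        /** Returns true when the cell should be kept (simplified per-cell check). */
        boolean includeCell(byte[] rowKey, String value);
    }

    /** Prefix-style filter: the decision is made in filterRowKey only. */
    static final class PrefixRowFilter implements RowFilter {
        private final byte[] prefix;

        PrefixRowFilter(String prefix) {
            this.prefix = prefix.getBytes(StandardCharsets.UTF_8);
        }

        @Override
        public boolean filterRowKey(byte[] rowKey) {
            if (rowKey.length < prefix.length) {
                return true; // too short to carry the prefix
            }
            for (int i = 0; i < prefix.length; i++) {
                if (rowKey[i] != prefix[i]) {
                    return true; // prefix mismatch: skip the row
                }
            }
            return false;
        }

        @Override
        public boolean includeCell(byte[] rowKey, String value) {
            return true; // the per-cell hook never rejects anything
        }
    }

    /** Imports {rowKey, value} pairs; checkRowKey toggles the row-key check. */
    static List<String> importRows(String[][] rows, RowFilter filter, boolean checkRowKey) {
        List<String> imported = new ArrayList<>();
        for (String[] row : rows) {
            byte[] key = row[0].getBytes(StandardCharsets.UTF_8);
            if (checkRowKey && filter.filterRowKey(key)) {
                continue; // row rejected by its key
            }
            if (filter.includeCell(key, row[1])) {
                imported.add(row[0]);
            }
        }
        return imported;
    }

    public static void main(String[] args) {
        String[][] rows = {{"org1-row1", "a"}, {"org2-row1", "b"}, {"org1-row2", "c"}};
        RowFilter filter = new PrefixRowFilter("org1-");
        System.out.println(importRows(rows, filter, false)); // [org1-row1, org2-row1, org1-row2]
        System.out.println(importRows(rows, filter, true));  // [org1-row1, org1-row2]
    }
}
```

With only one row whose key matches the prefix, both calls return the same list, which is
consistent with the single-row test passing regardless of whether the row key is checked.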

  was:
The following improvements can be made to the Import logic:

a) Make the import extensible: turn the filter from a static member of Import into an
instance variable of the mapper, and make the mappers and the variables of interest
protected.

b) Make sure that Import calls the filterRowKey method of the filter. This is useful when
filtering the data of an organization based on the row key, or when using filters such as
PrefixFilter. The existing test case in TestImportExport#testWithFilter relies on this
behavior, but has passed so far only because a single row is inserted into the table.

c) Provide an option to specify the durability during the import. Specifying the Durability
as SKIP_WAL would improve the performance of a restore considerably. [~lhofhansl] suggested
that this should be a parameter to the import.

d) Some minor refactoring to avoid building a comma-separated string for the filter args.


> Improvements to the import flow
> -------------------------------
>
>                 Key: HBASE-10416
>                 URL: https://issues.apache.org/jira/browse/HBASE-10416
>             Project: HBase
>          Issue Type: New Feature
>          Components: mapreduce
>            Reporter: Vasu Mariyala
>         Attachments: HBASE-10416.patch
>
>
> The following improvements can be made to the Import logic:
> a) Make the import extensible: turn the filter from a static member of Import into an
> instance variable of the mapper, and make the mappers and the variables of interest
> protected.
> b) Make sure that Import calls the filterRowKey method of the filter. This is useful when
> filtering the data of an organization based on the row key, or when using filters such as
> PrefixFilter, which filter the data in the filterRowKey method rather than in the
> filterKeyValue method. The existing test case in TestImportExport#testWithFilter relies on
> this behavior, but has passed so far only because a single row is inserted into the table.
> c) Provide an option to specify the durability during the import. Specifying the Durability
> as SKIP_WAL would improve the performance of a restore considerably. [~lhofhansl] suggested
> that this should be a parameter to the import.
> d) Some minor refactoring to avoid building a comma-separated string for the filter args.
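
For item c), one possible shape for the durability parameter is sketched below. This is only
an illustration under stated assumptions: the Durability enum is a local mirror of the HBase
durability levels so that the sketch has no HBase dependency, the configuration is a plain
Map, and the key name "example.import.durability" and the helper durabilityFrom are
hypothetical, not taken from the patch.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class ImportDurabilitySketch {

    /** Local mirror of the HBase durability levels (assumption: no HBase on the classpath). */
    enum Durability { USE_DEFAULT, SKIP_WAL, ASYNC_WAL, SYNC_WAL, FSYNC_WAL }

    /** Hypothetical configuration key; the real name is whatever the patch defines. */
    static final String DURABILITY_KEY = "example.import.durability";

    /**
     * Reads the durability from the configuration, case-insensitively.
     * Falls back to USE_DEFAULT when the key is absent or blank, and throws
     * IllegalArgumentException for a value that names no durability level.
     */
    static Durability durabilityFrom(Map<String, String> conf) {
        String raw = conf.get(DURABILITY_KEY);
        if (raw == null || raw.trim().isEmpty()) {
            return Durability.USE_DEFAULT;
        }
        return Durability.valueOf(raw.trim().toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(durabilityFrom(conf)); // USE_DEFAULT
        conf.put(DURABILITY_KEY, "skip_wal");
        System.out.println(durabilityFrom(conf)); // SKIP_WAL
    }
}
```

In a mapper, the parsed value would then be applied to each mutation before it is written.
Note that skipping the WAL trades safety for speed: edits that have not yet been flushed can
be lost if a region server fails, so it fits restores that can simply be re-run.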



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
