sqoop-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Szabolcs Vasas <vasas.szabo...@gmail.com>
Subject Re: Review Request 64333: Incremental import to HBase deletes only last version of column
Date Tue, 20 Feb 2018 17:17:56 GMT


> On Feb. 20, 2018, 5:16 p.m., Szabolcs Vasas wrote:
> > Hi Dani,
> > 
> > Thank you for the patch, it is also great that you have documented it. I have left
only a minor comment.
> > 
> > Szabolcs

Sorry, there is one more small thing. I got the following warning when I applied the patch:
SQOOP-3267.3.patch:13: new blank line at EOF.
    +
    warning: 1 line adds whitespace errors.
    
It seems you have added extra new lines at the end of hbase-args.txt. It would be great, if
you could fix this one too.
Thanks!


- Szabolcs


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64333/#review197784
-----------------------------------------------------------


On Dec. 5, 2017, 9:25 a.m., daniel voros wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64333/
> -----------------------------------------------------------
> 
> (Updated Dec. 5, 2017, 9:25 a.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-3267
>     https://issues.apache.org/jira/browse/SQOOP-3267
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> -------
> 
> Deletes are supported since SQOOP-3149, but we're only deleting the last version of a
column when the corresponding cell was set to NULL in the source table.
> 
> This can lead to unexpected and misleading results if the row has been transferred multiple
times, which can easily happen if it's being modified on the source side.
> 
> Also SQOOP-3149 is using a new Put command for every column instead of a single Put per
row as before. This could probably lead to a performance drop for wide tables (for which HBase
is otherwise usually recommended).
> 
> 
> Diffs
> -----
> 
>   src/docs/man/hbase-args.txt afd5c5b 
>   src/docs/user/hbase-args.txt 53040f5 
>   src/docs/user/hbase.txt ab4aedc 
>   src/java/org/apache/sqoop/SqoopOptions.java 73d0757 
>   src/java/org/apache/sqoop/hbase/HBasePutProcessor.java 27d6006 
>   src/java/org/apache/sqoop/hbase/ToStringPutTransformer.java 0bd6169 
>   src/java/org/apache/sqoop/mapreduce/HBaseImportJob.java 33da487 
>   src/java/org/apache/sqoop/tool/BaseSqoopTool.java ce21918 
>   src/test/org/apache/sqoop/TestSqoopOptions.java 16901ca 
>   src/test/org/apache/sqoop/hbase/HBaseImportTest.java 2e73cf3 
>   src/test/org/apache/sqoop/hbase/HBaseTestCase.java 98f8698 
> 
> 
> Diff: https://reviews.apache.org/r/64333/diff/3/
> 
> 
> Testing
> -------
> 
> - all unit tests and thirdparty tests passed
>  - splitted previous test for incremental import into two, to test both modes (ignore,
delete)
>  - tested on a cluster with HBase (with saved jobs as well)
> 
> 
> Thanks,
> 
> daniel voros
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message