hbase-user mailing list archives

From: Jean-Marc Spaggiari <jean-m...@spaggiari.org>
Subject: Re: Loading into hbase from csv file issue
Date: Mon, 03 Oct 2016 14:10:26 GMT

Hi Mich,

As you said, it's most probably because every line has the same row key, so
they all land in the same row and only one version of each cell is kept. If
you want to be 200% sure, just alter VERSIONS => '1' to be greater (like, 10)
and scan all the versions of the cells. You should see the others.
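
To make the overwrite concrete, here is a minimal Python sketch of a toy
versioned table. It is not the HBase API: `ToyTable`, `load`, the placeholder
rows and the `name_date` style composite key are all assumptions made up for
illustration. It only models the rule that a cell is addressed by row key,
column and timestamp, and that at most VERSIONS values are kept per cell.

```python
from collections import defaultdict


class ToyTable:
    """Toy model of a versioned cell store (hypothetical, not the HBase API)."""

    def __init__(self, max_versions=1):
        self.max_versions = max_versions
        # (row_key, column) -> list of (timestamp, value), newest first
        self.cells = defaultdict(list)

    def put(self, row_key, column, value, timestamp):
        versions = self.cells[(row_key, column)]
        # A write with an already-used timestamp replaces that version.
        versions[:] = [(ts, v) for ts, v in versions if ts != timestamp]
        versions.append((timestamp, value))
        versions.sort(key=lambda tv: tv[0], reverse=True)
        # Keep only the newest max_versions entries.
        del versions[self.max_versions:]

    def row_count(self):
        return len({row for row, _ in self.cells})

    def versions(self, row_key, column):
        return [v for _, v in self.cells.get((row_key, column), [])]


# Placeholder lines (name, trade date, close); the values are invented.
rows = [
    ("Tesco PLC", "day-1", "v1"),
    ("Tesco PLC", "day-2", "v2"),
    ("Tesco PLC", "day-3", "v3"),
]


def load(table, rows, key_fn):
    # Assumption: every line gets its own, increasing timestamp.
    for ts, (name, tradedate, close) in enumerate(rows, start=1):
        table.put(key_fn(name, tradedate), "stock_daily:close", close, ts)


# Same row key for every line, one version kept.
same_key = ToyTable(max_versions=1)
load(same_key, rows, lambda name, date: name)

# Same row key, but up to 10 versions kept.
kept = ToyTable(max_versions=10)
load(kept, rows, lambda name, date: name)

# Row key made unique by appending the trade date.
composite = ToyTable(max_versions=1)
load(composite, rows, lambda name, date: f"{name}_{date}")

print(same_key.row_count(), same_key.versions("Tesco PLC", "stock_daily:close"))
print(kept.row_count(), kept.versions("Tesco PLC", "stock_daily:close"))
print(composite.row_count())
```

In this model the three lines collapse into one row under the shared key, the
older values only show up once more versions are kept, and a key that includes
the trade date keeps every line as its own row. Note that the model gives each
line its own timestamp; if every line is written with the same timestamp, `put`
here replaces the earlier value regardless of the version limit.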

JMS

2016-10-03 3:41 GMT-04:00 Mich Talebzadeh <mich.talebzadeh@gmail.com>:

> Hi,
>
> I am using the command-line utility ImportTsv to load a file into HBase.
> The table format is as follows:
>
> describe 'marketDataHbase'
> Table marketDataHbase is ENABLED
> marketDataHbase
> COLUMN FAMILIES DESCRIPTION
> {NAME => 'price_info', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY =>
> 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL
> => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE =>
> 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
> 1 row(s) in 0.0930 seconds
>
>
> hbase org.apache.hadoop.hbase.mapreduce.ImportTsv
> -Dimporttsv.separator=','
> -Dimporttsv.columns="HBASE_ROW_KEY, stock_daily:ticker,
> stock_daily:tradedate, stock_daily:open,stock_daily:high,
> stock_daily:low,stock_daily:close,stock_daily:volume" tsco
> hdfs://rhes564:9000/data/stocks/tsco.csv
>
> There are 1200 rows in the CSV file, *but it only loads the first
> row!*
>
> scan 'tsco'
> ROW          COLUMN+CELL
>  Tesco PLC   column=stock_daily:close, timestamp=1475447365118, value=325.25
>  Tesco PLC   column=stock_daily:high, timestamp=1475447365118, value=332.00
>  Tesco PLC   column=stock_daily:low, timestamp=1475447365118, value=324.00
>  Tesco PLC   column=stock_daily:open, timestamp=1475447365118, value=331.75
>  Tesco PLC   column=stock_daily:ticker, timestamp=1475447365118, value=TSCO
>  Tesco PLC   column=stock_daily:tradedate, timestamp=1475447365118, value= 3-Jan-06
>  Tesco PLC   column=stock_daily:volume, timestamp=1475447365118, value=46935045
> 1 row(s) in 0.0390 seconds
>
> Is this because the hbase_row_key --> Tesco PLC is the same for all? I
> thought that the row key can be anything.
>
> Thanks
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
>
>
> http://talebzadehmich.wordpress.com
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
