hive-user mailing list archives

From Edward Capriolo <edlinuxg...@gmail.com>
Subject Re: What does ROW__OFFSET__INSIDE__BLOCK FROM mean?
Date Wed, 03 Oct 2012 14:21:40 GMT
Make sure virtual column support is turned on in your hive-site.xml. I
have a feeling that this field is only supported by certain input
formats, because I was unable to get a non-zero number out of it. (I
think it only works with index files.)
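
A minimal sketch of what turning it on for a single session might look
like. This assumes the setting in question is hive.exec.rowoffset (to my
knowledge the property that controls the row offset virtual column, off
by default); the table and URL are the ones from the quoted query below.

```sql
-- Assumption: hive.exec.rowoffset is the "virtual column support" switch
-- referred to above. It can be set per session as here, or in hive-site.xml.
SET hive.exec.rowoffset=true;

-- Same query as in the original question, rerun with the setting enabled.
SELECT `url`,
       INPUT__FILE__NAME,
       BLOCK__OFFSET__INSIDE__FILE,
       ROW__OFFSET__INSIDE__BLOCK
FROM `testresult`
WHERE url = 'http://www.domain022.tl04/page035.html';
```

Even with the setting enabled, the column may still come back as 0 for a
plain text file if it really is only populated for certain input formats.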

On Wed, Oct 3, 2012 at 4:20 AM, afancy <groupme@gmail.com> wrote:
> Hi,
>
> Could anybody explain to me what ROW__OFFSET__INSIDE__BLOCK means?
> For example, I ran the following query, which returned two rows. But why does
> the ROW__OFFSET__INSIDE__BLOCK column show 0?
> From my understanding of the column name, it should return the line
> number of the record within the block file, but both are 0. So, what are
> the BLOCK, the BLOCK offset, and the row offset in a block?
> The Hive bitmap document is very confusing.
>
>
> hive> SELECT  `url`,  INPUT__FILE__NAME,BLOCK__OFFSET__INSIDE__FILE,
> ROW__OFFSET__INSIDE__BLOCK FROM `testresult` WHERE
> url='http://www.domain022.tl04/page035.html';
>
> http://www.domain022.tl04/page035.html
> hdfs://pc01:54310/user/hive/warehouse/testresult/testresults.csv 0 0
> http://www.domain022.tl04/page035.html
> hdfs://pc01:54310/user/hive/warehouse/testresult/testresults.csv 3200250 0
> Time taken: 19.653 seconds
> hive>
>
>
> Regards,
> afancy
>
