Mailing-List: contact user-help@hive.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@hive.apache.org
MIME-Version: 1.0
In-Reply-To: <CALKvMUXvgopGQZy8TrhBMJ6bDj2EXdEtZhyro6dpmetgfc2r1w@mail.gmail.com>
References: <CALKvMUXDFJZB=VZgUiLrbhPzv7TTOrT3K8nWCV=LqdTVOmh+4Q@mail.gmail.com>
 <D4367D0A.8380E%ekoifman@hortonworks.com> <CALKvMUXvgopGQZy8TrhBMJ6bDj2EXdEtZhyro6dpmetgfc2r1w@mail.gmail.com>
From: satyajit vegesna <satyajit.apasprk@gmail.com>
Date: Wed, 26 Oct 2016 16:26:38 -0700
Message-ID: <CALKvMUVf7WA5ZxghPLJHzXhBW_mC3x==FxfmNZCVG5=vwncf8g@mail.gmail.com>
Subject: Re: Error with flush_length File in Orc, in hive 2.1.0 and mr
 execution engine.
To: user@hive.apache.org, Eugene Koifman <ekoifman@hortonworks.com>, dev@hive.apache.org
Content-Type: multipart/alternative; boundary=001a1141b320f6e9d0053fccf4da
archived-at: Wed, 26 Oct 2016 23:26:52 -0000

--001a1141b320f6e9d0053fccf4da
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

Hi Eugene,

One more observation, in namenode logs when i run select count(*) on
individual tables i still see the same error as before,

org.apache.hadoop.ipc.Server: IPC Server handler 1 on 9000, call
org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from
192.168.120.133:47114 Call#4 Retry#0: java.io.FileNotFoundException: File
does not exist: /user/hive/warehouse/mls_public_record_association_
snapshot_orc/delta_0000002_0000002_0000/bucket_00094_flush_length

but i get the count of the tables and they match well with the source data.
So i believe the problem is with joining these tables together.

Any specific logs you want me to debug.

Regards,
Satyajit.

On Wed, Oct 26, 2016 at 4:16 PM, satyajit vegesna <
satyajit.apasprk@gmail.com> wrote:

> Hi Eugene,
>
> select count(*) from mls_public_record_association_snapshot_orc pra  left
> outer join mls_listing_snapshot_orc ml on pra.primary_listing_id =3D ml.i=
d  left
> outer join attribute a on a.id =3D ml.standard_status
>
> ran till end and threw the below exception.
>
> MapReduce Total cumulative CPU time: 0 days 1 hours 0 minutes 53 seconds
> 760 msec
> Ended Job =3D job_1477494091659_0024
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/opt/apache-hive-2.
> 1.0-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/
> StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/opt/hadoop-2.7.2/
> share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/
> impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.
> Log4jLoggerFactory]
> 2016-10-26 16:09:01 Starting to launch local task to process map join; ma=
ximum
> memory =3D 514850816
> Execution failed with exit status: 2
> Obtaining error information
>
> Task failed!
> Task ID:
>   Stage-9
>
> Logs:
>
> FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.
> exec.mr.MapredLocalTask
> MapReduce Jobs Launched:
> Stage-Stage-1: Map: 300  Reduce: 121   Cumulative CPU: 3654.02 sec   HDFS
> Read: 1771032233 HDFS Write: 1917532703 SUCCESS
> Total MapReduce CPU Time Spent: 0 days 1 hours 0 minutes 54 seconds 20 ms=
ec
>
> Explain Plan:
>
> STAGE DEPENDENCIES:
>   Stage-8 is a root stage , consists of Stage-1
>   Stage-1
>   Stage-9 depends on stages: Stage-1
>   Stage-3 depends on stages: Stage-9
>   Stage-0 depends on stages: Stage-3
>
> STAGE PLANS:
>   Stage: Stage-8
>     Conditional Operator
>
>   Stage: Stage-1
>     Map Reduce
>       Map Operator Tree:
>           TableScan
>             alias: pra
>             Statistics: Num rows: 99241216 Data size: 9924121600 Basic
> stats: COMPLETE Column stats: NONE
>             Select Operator
>               expressions: primary_listing_id (type: string)
>               outputColumnNames: _col0
>               Statistics: Num rows: 99241216 Data size: 9924121600 Basic
> stats: COMPLETE Column stats: NONE
>               Reduce Output Operator
>                 key expressions: _col0 (type: string)
>                 sort order: +
>                 Map-reduce partition columns: _col0 (type: string)
>                 Statistics: Num rows: 99241216 Data size: 9924121600 Basi=
c
> stats: COMPLETE Column stats: NONE
>           TableScan
>             alias: ml
>             Statistics: Num rows: 201432950 Data size: 20949026816 Basic
> stats: COMPLETE Column stats: NONE
>             Select Operator
>               expressions: id (type: string), standard_status (type: int)
>               outputColumnNames: _col0, _col1
>               Statistics: Num rows: 201432950 Data size: 20949026816 Basi=
c
> stats: COMPLETE Column stats: NONE
>               Reduce Output Operator
>                 key expressions: _col0 (type: string)
>                 sort order: +
>                 Map-reduce partition columns: _col0 (type: string)
>                 Statistics: Num rows: 201432950 Data size: 20949026816
> Basic stats: COMPLETE Column stats: NONE
>                 value expressions: _col1 (type: int)
>       Reduce Operator Tree:
>         Join Operator
>           condition map:
>                Left Outer Join0 to 1
>           keys:
>             0 _col0 (type: string)
>             1 _col0 (type: string)
>           outputColumnNames: _col2
>           Statistics: Num rows: 221576249 Data size: 23043929997 Basic
> stats: COMPLETE Column stats: NONE
>           File Output Operator
>             compressed: false
>             table:
>                 input format: org.apache.hadoop.mapred.
> SequenceFileInputFormat
>                 output format: org.apache.hadoop.hive.ql.io.
> HiveSequenceFileOutputFormat
>                 serde: org.apache.hadoop.hive.serde2.
> lazybinary.LazyBinarySerDe
>
>   Stage: Stage-9 --it is failing in the same mapreduce local work.
>     Map Reduce Local Work
>       Alias -> Map Local Tables:
>         $hdt$_2:a
>           Fetch Operator
>             limit: -1
>       Alias -> Map Local Operator Tree:
>         $hdt$_2:a
>           TableScan
>             alias: a
>             Statistics: Num rows: 12830 Data size: 51322 Basic stats:
> COMPLETE Column stats: NONE
>             Select Operator
>               expressions: id (type: int)
>               outputColumnNames: _col0
>               Statistics: Num rows: 12830 Data size: 51322 Basic stats:
> COMPLETE Column stats: NONE
>               HashTable Sink Operator
>                 keys:
>                   0 _col2 (type: int)
>                   1 _col0 (type: int)
>
>   Stage: Stage-3
>     Map Reduce
>       Map Operator Tree:
>           TableScan
>             Map Join Operator
>               condition map:
>                    Left Outer Join0 to 1
>               keys:
>                 0 _col2 (type: int)
>                 1 _col0 (type: int)
>               Statistics: Num rows: 243733879 Data size: 25348323546 Basi=
c
> stats: COMPLETE Column stats: NONE
>               Group By Operator
>                 aggregations: count()
>                 mode: hash
>                 outputColumnNames: _col0
>                 Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLET=
E
> Column stats: NONE
>                 Reduce Output Operator
>                   sort order:
>                   Statistics: Num rows: 1 Data size: 8 Basic stats:
> COMPLETE Column stats: NONE
>                   value expressions: _col0 (type: bigint)
>       Local Work:
>         Map Reduce Local Work
>       Reduce Operator Tree:
>         Group By Operator
>           aggregations: count(VALUE._col0)
>           mode: mergepartial
>           outputColumnNames: _col0
>           Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE
> Column stats: NONE
>           File Output Operator
>             compressed: false
>             Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE
> Column stats: NONE
>             table:
>                 input format: org.apache.hadoop.mapred.
> SequenceFileInputFormat
>                 output format: org.apache.hadoop.hive.ql.io.
> HiveSequenceFileOutputFormat
>                 serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>
>   Stage: Stage-0
>     Fetch Operator
>       limit: -1
>       Processor Tree:
>         ListSink
>
> Any suggestion in debugging this issue is appreciated.
>
>
> Regards,
> Satyajit.
>
>
>
>
> On Wed, Oct 26, 2016 at 3:34 PM, Eugene Koifman <ekoifman@hortonworks.com=
>
> wrote:
>
>> If you can run this, then it=E2=80=99s safe to ignore =E2=80=9C00094_flu=
sh_length=E2=80=9D messages
>> and the issue is somewhere else
>>
>> select count(*) from mls_public_record_association_snapshot_orc pra
>>  left outer join mls_listing_snapshot_orc ml on pra.primary_listing_id =
=3D
>> ml.id  left outer join attribute a on a.id =3D ml.standard_status
>>
>> Eugene
>>
>> From: satyajit vegesna <satyajit.apasprk@gmail.com>
>> Date: Wednesday, October 26, 2016 at 2:14 PM
>> To: "user@hive.apache.org" <user@hive.apache.org>, Eugene Koifman <
>> ekoifman@hortonworks.com>
>> Cc: "dev@hive.apache.org" <dev@hive.apache.org>
>> Subject: Re: Error with flush_length File in Orc, in hive 2.1.0 and mr
>> execution engine.
>>
>> Hi Eugene,
>>
>> PFB Transaction table in green and parquet tables in yellow,
>>
>> INSERT INTO access_logs.crawlstats_dpp PARTITION(day=3D"2016-10-23")
>> select pra.url as prUrl,pra.url_type as urlType,CAST(pra.created_at AS
>> timestamp) as prCreated, CAST(pra.updated_at AS timestamp) as prUpdated,
>> CAST(ml.created_at AS timestamp) as mlCreated, CAST(ml.updated_at AS
>> timestamp) as mlUpdated, a.name as status, pra.public_record_id as prId,
>> acl.accesstime as crawledon, pra.id as propId, pra.primary_listing_id as
>> listingId, datediff(CAST(acl.accesstime AS timestamp),CAST(ml.created_at=
 AS
>> timestamp)) as mlcreateage, datediff(CAST(acl.accesstime AS
>> timestamp),CAST(ml.updated_at AS timestamp)) as mlupdateage,
>> datediff(CAST(acl.accesstime AS timestamp),CAST(pra.created_at AS
>> timestamp)) as prcreateage, datediff(CAST(acl.accesstime AS
>> timestamp),CAST(pra.updated_at AS timestamp)) as prupdateage,  (case whe=
n
>> (pra.public_record_id is not null and TRIM(pra.public_record_id) <> '')
>>  then (case when (pra.primary_listing_id is null or
>> TRIM(pra.primary_listing_id) =3D '') then 'PR' else 'PRMLS' END)  else (=
case
>> when (pra.primary_listing_id is not null and TRIM(pra.primary_listing_id=
)
>> <> '') then 'MLS' else 'UNKNOWN' END) END) as listingType,
>>  acl.httpstatuscode,  acl.httpverb,  acl.requesttime,
>> acl.upstreamheadertime , acl.upstreamresponsetime,  acl.page_id,  userag=
ent
>> AS user_agent,  substring(split(pra.url,'/')[0],
>> 0,length(split(pra.url,'/')[0])-3) as city,
>>  substring(split(pra.url,'/')[0], length(split(pra.url,'/')[0])-1,2) as
>> state,  ml.mls_id  FROM access_logs.loadbalancer_accesslogs acl  inner
>> join mls_public_record_association_snapshot_orc pra on acl.listing_url =
=3D
>> pra.url  left outer join mls_listing_snapshot_orc ml on
>> pra.primary_listing_id =3D ml.id  left outer join attribute a on a.id =
=3D
>> ml.standard_status  WHERE acl.accesstimedate=3D"2016-10-23";
>>
>>
>> Any clue, or something that you would want me to focus on to debug the
>> issue.
>>
>> Regards,
>> Satyajit.
>>
>>
>>
>> On Tue, Oct 25, 2016 at 8:49 PM, Eugene Koifman <ekoifman@hortonworks.co=
m
>> > wrote:
>>
>>> Which of your tables are are transactional?  Can you provide the DDL?
>>>
>>> I don=E2=80=99t think =E2=80=9CFile does not exist=E2=80=9D error is ca=
using your queries to
>>> fail.  It=E2=80=99s an INFO level msg.
>>> There should be some other error.
>>>
>>> Eugene
>>>
>>>
>>> From: satyajit vegesna <satyajit.apasprk@gmail.com>
>>> Reply-To: "user@hive.apache.org" <user@hive.apache.org>
>>> Date: Tuesday, October 25, 2016 at 5:46 PM
>>> To: "user@hive.apache.org" <user@hive.apache.org>, "dev@hive.apache.org=
"
>>> <dev@hive.apache.org>
>>> Subject: Error with flush_length File in Orc, in hive 2.1.0 and mr
>>> execution engine.
>>>
>>> HI All,
>>>
>>> i am using hive 2.1.0 , hadoop 2.7.2 , but  when i try running queries
>>> like simple insert,
>>>
>>> set mapreduce.job.queuename=3Ddefault;set hive.exec.dynamic.partition=
=3Dtrue;set
>>> hive.exec.dynamic.partition.mode=3Dnonstrict;set
>>> hive.exec.max.dynamic.partitions.pernode=3D400;set
>>> hive.exec.max.dynamic.partitions=3D2000;set mapreduce.map.memory.mb=3D5=
120;set
>>> mapreduce.reduce.memory.mb=3D5120;set mapred.tasktracker.map.tasks.maxi=
mum=3D30;set
>>> mapred.tasktracker.reduce.tasks.maximum=3D20;set
>>> mapred.reduce.child.java.opts=3D-Xmx2048m;set
>>> mapred.map.child.java.opts=3D-Xmx2048m; set
>>> hive.support.concurrency=3Dtrue; set hive.txn.manager=3Dorg.apache.ha
>>> doop.hive.ql.lockmgr.DbTxnManager; set hive.compactor.initiator.on=3Dfa=
lse;
>>> set hive.compactor.worker.threads=3D1;set mapreduce.job.queuename=3Ddef=
ault;set
>>> hive.exec.dynamic.partition=3Dtrue;set hive.exec.dynamic.partition.mode=
=3Dnonstrict;INSERT
>>> INTO access_logs.crawlstats_dpp PARTITION(day=3D"2016-10-23") select pr=
a.url
>>> as prUrl,pra.url_type as urlType,CAST(pra.created_at AS timestamp) as
>>> prCreated, CAST(pra.updated_at AS timestamp) as prUpdated,
>>> CAST(ml.created_at AS timestamp) as mlCreated, CAST(ml.updated_at AS
>>> timestamp) as mlUpdated, a.name as status, pra.public_record_id as
>>> prId, acl.accesstime as crawledon, pra.id as propId,
>>> pra.primary_listing_id as listingId, datediff(CAST(acl.accesstime AS
>>> timestamp),CAST(ml.created_at AS timestamp)) as mlcreateage,
>>> datediff(CAST(acl.accesstime AS timestamp),CAST(ml.updated_at AS
>>> timestamp)) as mlupdateage, datediff(CAST(acl.accesstime AS
>>> timestamp),CAST(pra.created_at AS timestamp)) as prcreateage,
>>> datediff(CAST(acl.accesstime AS timestamp),CAST(pra.updated_at AS
>>> timestamp)) as prupdateage,  (case when (pra.public_record_id is not nu=
ll
>>> and TRIM(pra.public_record_id) <> '')  then (case when
>>> (pra.primary_listing_id is null or TRIM(pra.primary_listing_id) =3D '')=
 then
>>> 'PR' else 'PRMLS' END)  else (case when (pra.primary_listing_id is not =
null
>>> and TRIM(pra.primary_listing_id) <> '') then 'MLS' else 'UNKNOWN' END) =
END)
>>> as listingType,  acl.httpstatuscode,  acl.httpverb,  acl.requesttime,
>>> acl.upstreamheadertime , acl.upstreamresponsetime,  acl.page_id,  usera=
gent
>>> AS user_agent,  substring(split(pra.url,'/')[0],
>>> 0,length(split(pra.url,'/')[0])-3) as city,
>>>  substring(split(pra.url,'/')[0], length(split(pra.url,'/')[0])-1,2) as
>>> state,  ml.mls_id  FROM access_logs.loadbalancer_accesslogs acl  inner
>>> join mls_public_record_association_snapshot_orc pra on acl.listing_url
>>> =3D pra.url  left outer join mls_listing_snapshot_orc ml on
>>> pra.primary_listing_id =3D ml.id  left outer join attribute a on a.id =
=3D
>>> ml.standard_status  WHERE acl.accesstimedate=3D"2016-10-23";
>>>
>>> i finally end up getting below error,
>>>
>>> 2016-10-25 17:40:18,725 Stage-2 map =3D 100%,  reduce =3D 52%, Cumulati=
ve
>>> CPU 1478.96 sec
>>> 2016-10-25 17:40:19,761 Stage-2 map =3D 100%,  reduce =3D 62%, Cumulati=
ve
>>> CPU 1636.58 sec
>>> 2016-10-25 17:40:20,794 Stage-2 map =3D 100%,  reduce =3D 64%, Cumulati=
ve
>>> CPU 1764.97 sec
>>> 2016-10-25 17:40:21,820 Stage-2 map =3D 100%,  reduce =3D 69%, Cumulati=
ve
>>> CPU 1879.61 sec
>>> 2016-10-25 17:40:22,842 Stage-2 map =3D 100%,  reduce =3D 80%, Cumulati=
ve
>>> CPU 2051.38 sec
>>> 2016-10-25 17:40:23,872 Stage-2 map =3D 100%,  reduce =3D 90%, Cumulati=
ve
>>> CPU 2151.49 sec
>>> 2016-10-25 17:40:24,907 Stage-2 map =3D 100%,  reduce =3D 93%, Cumulati=
ve
>>> CPU 2179.67 sec
>>> 2016-10-25 17:40:25,944 Stage-2 map =3D 100%,  reduce =3D 94%, Cumulati=
ve
>>> CPU 2187.86 sec
>>> 2016-10-25 17:40:29,062 Stage-2 map =3D 100%,  reduce =3D 95%, Cumulati=
ve
>>> CPU 2205.22 sec
>>> 2016-10-25 17:40:30,107 Stage-2 map =3D 100%,  reduce =3D 100%, Cumulat=
ive
>>> CPU 2241.25 sec
>>> MapReduce Total cumulative CPU time: 37 minutes 21 seconds 250 msec
>>> Ended Job =3D job_1477437520637_0009
>>> SLF4J: Class path contains multiple SLF4J bindings.
>>> SLF4J: Found binding in [jar:file:/opt/apache-hive-2.1
>>> .0-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/Static
>>> LoggerBinder.class]
>>> SLF4J: Found binding in [jar:file:/opt/hadoop-2.7.2/sh
>>> are/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/im
>>> pl/StaticLoggerBinder.class]
>>> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
>>> explanation.
>>> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4
>>> jLoggerFactory]
>>> 2016-10-25 17:40:35Starting to launch local task to process map join;ma=
ximum
>>> memory =3D 514850816
>>> Execution failed with exit status: 2
>>> Obtaining error information
>>>
>>> Task failed!
>>> Task ID:
>>>   Stage-14
>>>
>>> Logs:
>>>
>>> FAILED: Execution Error, return code 2 from
>>> org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
>>> MapReduce Jobs Launched:
>>> Stage-Stage-1: Map: 106  Reduce: 45   Cumulative CPU: 3390.11 sec   HDF=
S
>>> Read: 8060555201 HDFS Write: 757253756 SUCCESS
>>> Stage-Stage-2: Map: 204  Reduce: 85   Cumulative CPU: 2241.25 sec   HDF=
S
>>> Read: 2407914653 HDFS Write: 805874953 SUCCESS
>>> Total MapReduce CPU Time Spent: 0 days 1 hours 33 minutes 51 seconds 36=
0
>>> msec
>>>
>>> Could not find any errors in logs, but when i check namenode logs , oi
>>> get the following error,
>>>
>>> 2016-10-25 17:01:51,923 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 1 on 9000, call org.apache.hadoop.hdfs.protoco
>>> l.ClientProtocol.getBlockLocations from 192.168.120.133:47114 Call#4
>>> Retry#0: java.io.FileNotFoundException: File does not exist:
>>> /user/hive/warehouse/mls_public_record_association_snapshot_
>>> orc/delta_0000002_0000002_0000/bucket_00094_flush_length
>>> 2016-10-25 17:01:52,779 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 1 on 9000, call org.apache.hadoop.hdfs.protoco
>>> l.ClientProtocol.getBlockLocations from 192.168.120.132:43008 Call#4
>>> Retry#0: java.io.FileNotFoundException: File does not exist:
>>> /user/hive/warehouse/mls_public_record_association_snapshot_
>>> orc/delta_0000002_0000002_0000/bucket_00095_flush_length
>>> 2016-10-25 17:01:52,984 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 0 on 9000, call org.apache.hadoop.hdfs.protoco
>>> l.ClientProtocol.getBlockLocations from 192.168.120.133:47260 Call#4
>>> Retry#0: java.io.FileNotFoundException: File does not exist:
>>> /user/hive/warehouse/mls_public_record_association_snapshot_
>>> orc/delta_0000002_0000002_0000/bucket_00096_flush_length
>>> 2016-10-25 17:01:53,381 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 0 on 9000, call org.apache.hadoop.hdfs.protoco
>>> l.ClientProtocol.getBlockLocations from 192.168.120.132:43090 Call#4
>>> Retry#0: java.io.FileNotFoundException: File does not exist:
>>> /user/hive/warehouse/mls_public_record_association_snapshot_
>>> orc/delta_0000002_0000002_0000/bucket_00097_flush_length
>>> 2016-10-25 17:01:53,971 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 1 on 9000, call org.apache.hadoop.hdfs.protoco
>>> l.ClientProtocol.getBlockLocations from 192.168.120.134:37444 Call#4
>>> Retry#0: java.io.FileNotFoundException: File does not exist:
>>> /user/hive/warehouse/mls_public_record_association_snapshot_
>>> orc/delta_0000002_0000002_0000/bucket_00098_flush_length
>>> 2016-10-25 17:01:54,092 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 2 on 9000, call org.apache.hadoop.hdfs.protoco
>>> l.ClientProtocol.getBlockLocations from 192.168.120.133:47300 Call#4
>>> Retry#0: java.io.FileNotFoundException: File does not exist:
>>> /user/hive/warehouse/mls_public_record_association_snapshot_
>>> orc/delta_0000002_0000002_0000/bucket_00099_flush_length
>>> 2016-10-25 17:01:55,094 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 8 on 9000, call org.apache.hadoop.hdfs.protoco
>>> l.ClientProtocol.getBlockLocations from 192.168.120.134:37540 Call#4
>>> Retry#0: java.io.FileNotFoundException: File does not exist:
>>> /user/hive/warehouse/mls_public_record_association_snapshot_
>>> orc/delta_0000002_0000002_0000/bucket_00012_flush_length
>>> 2016-10-25 17:02:11,269 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 5 on 9000, call org.apache.hadoop.hdfs.protoco
>>> l.ClientProtocol.getBlockLocations from 192.168.120.133:47378 Call#4
>>> Retry#0: java.io.FileNotFoundException: File does not exist:
>>> /user/hive/warehouse/mls_public_record_association_snapshot_
>>> orc/delta_0000002_0000002_0000/bucket_00075_flush_length
>>>
>>> i also search for find the flush_length files in the above mentioned
>>> location, but i only see buckets but no files ending with flush_length.
>>>
>>> Any clue or help would be highly appreciated.
>>>
>>> Regards,
>>> Satyajit.
>>>
>>>
>>
>

--001a1141b320f6e9d0053fccf4da
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

<div dir=3D"ltr">Hi=C2=A0<span style=3D"color:rgb(0,0,0);font-family:calibr=
i,sans-serif;font-size:14px">Eugene,</span><div><span style=3D"color:rgb(0,=
0,0);font-family:calibri,sans-serif;font-size:14px"><br></span></div><div><=
span style=3D"color:rgb(0,0,0);font-family:calibri,sans-serif;font-size:14p=
x">One more observation, in namenode logs when i run select count(*) on ind=
ividual tables i still see the same error as before,</span></div><div><span=
 style=3D"color:rgb(0,0,0);font-family:calibri,sans-serif;font-size:14px"><=
br></span></div><div><span style=3D"font-size:12.8px">org.apache.hadoop.ipc=
.Server: IPC Server handler 1 on 9000, call org.apache.hadoop.hdfs.</span><=
wbr style=3D"font-size:12.8px"><span style=3D"font-size:12.8px">protocol.Cl=
ientProtocol.</span><wbr style=3D"font-size:12.8px"><span style=3D"font-siz=
e:12.8px">getBlockLocations from=C2=A0</span><a href=3D"http://192.168.120.=
133:47114/" target=3D"_blank" style=3D"font-size:12.8px">192.168.120.133:47=
114</a><span style=3D"font-size:12.8px">=C2=A0Call#4 Retry#0: java.io.FileN=
otFoundException: File does not exist: /user/hive/warehouse/mls_</span><wbr=
 style=3D"font-size:12.8px"><span style=3D"font-size:12.8px">public_record_=
association_</span><wbr style=3D"font-size:12.8px"><span style=3D"font-size=
:12.8px">snapshot_orc/delta_0000002_</span><wbr style=3D"font-size:12.8px">=
<span style=3D"font-size:12.8px">0000002_0000/bucket_00094_</span><wbr styl=
e=3D"font-size:12.8px"><span style=3D"font-size:12.8px">flush_length</span>=
<span style=3D"color:rgb(0,0,0);font-family:calibri,sans-serif;font-size:14=
px"><br></span></div><div><span style=3D"font-size:12.8px"><br></span></div=
><div><span style=3D"font-size:12.8px">but i get the count of the tables an=
d they match well with the source data.</span></div><div><span style=3D"fon=
t-size:12.8px">So i believe the problem is with joining these tables togeth=
er.</span></div><div><span style=3D"font-size:12.8px"><br></span></div><div=
><span style=3D"font-size:12.8px">Any specific logs you want me to debug.</=
span></div><div><span style=3D"font-size:12.8px"><br></span></div><div><spa=
n style=3D"font-size:12.8px">Regards,</span></div><div><span style=3D"font-=
size:12.8px">Satyajit.</span></div></div><div class=3D"gmail_extra"><br><di=
v class=3D"gmail_quote">On Wed, Oct 26, 2016 at 4:16 PM, satyajit vegesna <=
span dir=3D"ltr">&lt;<a href=3D"mailto:satyajit.apasprk@gmail.com" target=
=3D"_blank">satyajit.apasprk@gmail.com</a>&gt;</span> wrote:<br><blockquote=
 class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;border-left:1px #ccc soli=
d;padding-left:1ex"><div dir=3D"ltr">Hi=C2=A0<span style=3D"color:rgb(0,0,0=
);font-family:calibri,sans-serif;font-size:14px">Eugene,</span><span class=
=3D""><div><span style=3D"color:rgb(0,0,0);font-family:calibri,sans-serif;f=
ont-size:14px"><br></span></div><div><div style=3D"font-family:calibri,sans=
-serif;font-size:14px;color:rgb(0,0,0)"><span style=3D"font-size:12.8px"><s=
pan style=3D"background-color:rgb(56,118,29)">select count(*) from mls_publ=
ic_record_association_</span></span><span style=3D"background-color:rgb(56,=
118,29)"></span><span style=3D"font-size:12.8px"><span style=3D"background-=
color:rgb(56,118,29)"><wbr>snapshot_orc</span>=C2=A0pra =C2=A0left outer jo=
in=C2=A0<span style=3D"background-color:rgb(106,168,79)">mls_listing_snapsh=
ot_orc</span>=C2=A0<wbr>ml on pra.primary_listing_id =3D=C2=A0</span><a hre=
f=3D"http://ml.id/" style=3D"font-size:12.8px" target=3D"_blank">ml.id</a><=
span style=3D"font-size:12.8px">=C2=A0=C2=A0left outer join=C2=A0<span styl=
e=3D"background-color:rgb(106,168,79)">attribute</span>=C2=A0a on=C2=A0</sp=
an><a href=3D"http://a.id/" style=3D"font-size:12.8px" target=3D"_blank">a.=
id</a><span style=3D"font-size:12.8px">=C2=A0=3D ml.standard_status</span><=
/div></div><div><span style=3D"font-size:12.8px"><br></span></div></span><d=
iv><span style=3D"font-size:12.8px">ran till end and threw the below except=
ion.</span></div><div><span style=3D"color:rgb(0,0,0);font-family:calibri,s=
ans-serif;font-size:14px"><br></span></div><div><div><font color=3D"#000000=
" face=3D"calibri, sans-serif"><span style=3D"font-size:14px">MapReduce Tot=
al cumulative CPU time: 0 days 1 hours 0 minutes 53 seconds 760 msec</span>=
</font></div><div><font color=3D"#000000" face=3D"calibri, sans-serif"><spa=
n style=3D"font-size:14px">Ended Job =3D job_1477494091659_0024</span></fon=
t></div><span class=3D""><div><font color=3D"#000000" face=3D"calibri, sans=
-serif"><span style=3D"font-size:14px">SLF4J: Class path contains multiple =
SLF4J bindings.</span></font></div><div><font color=3D"#000000" face=3D"cal=
ibri, sans-serif"><span style=3D"font-size:14px">SLF4J: Found binding in [j=
ar:file:/opt/apache-hive-2.<wbr>1.0-bin/lib/log4j-slf4j-impl-<wbr>2.4.1.jar=
!/org/slf4j/impl/<wbr>StaticLoggerBinder.class]</span></font></div><div><fo=
nt color=3D"#000000" face=3D"calibri, sans-serif"><span style=3D"font-size:=
14px">SLF4J: Found binding in [jar:file:/opt/hadoop-2.7.2/<wbr>share/hadoop=
/common/lib/slf4j-<wbr>log4j12-1.7.10.jar!/org/slf4j/<wbr>impl/StaticLogger=
Binder.class]</span></font></div><div><font color=3D"#000000" face=3D"calib=
ri, sans-serif"><span style=3D"font-size:14px">SLF4J: See <a href=3D"http:/=
/www.slf4j.org/codes.html#multiple_bindings" target=3D"_blank">http://www.s=
lf4j.org/codes.<wbr>html#multiple_bindings</a> for an explanation.</span></=
font></div><div><font color=3D"#000000" face=3D"calibri, sans-serif"><span =
style=3D"font-size:14px">SLF4J: Actual binding is of type [org.apache.loggi=
ng.slf4j.<wbr>Log4jLoggerFactory]</span></font></div></span><div><font colo=
r=3D"#000000" face=3D"calibri, sans-serif"><span style=3D"font-size:14px">2=
016-10-26 16:09:01<span class=3D"m_-8019037479453584833gmail-Apple-tab-span=
" style=3D"white-space:pre-wrap">	</span>Starting to launch local task to p=
rocess map join;<span class=3D"m_-8019037479453584833gmail-Apple-tab-span" =
style=3D"white-space:pre-wrap">	</span>maximum memory =3D 514850816</span><=
/font></div><span class=3D""><div><font color=3D"#000000" face=3D"calibri, =
sans-serif"><span style=3D"font-size:14px">Execution failed with exit statu=
s: 2</span></font></div><div><font color=3D"#000000" face=3D"calibri, sans-=
serif"><span style=3D"font-size:14px">Obtaining error information</span></f=
ont></div><div><font color=3D"#000000" face=3D"calibri, sans-serif"><span s=
tyle=3D"font-size:14px"><br></span></font></div><div><font color=3D"#000000=
" face=3D"calibri, sans-serif"><span style=3D"font-size:14px">Task failed!<=
/span></font></div><div><font color=3D"#000000" face=3D"calibri, sans-serif=
"><span style=3D"font-size:14px">Task ID:</span></font></div></span><div><f=
ont color=3D"#000000" face=3D"calibri, sans-serif"><span style=3D"font-size=
:14px">=C2=A0 Stage-9</span></font></div><span class=3D""><div><font color=
=3D"#000000" face=3D"calibri, sans-serif"><span style=3D"font-size:14px"><b=
r></span></font></div><div><font color=3D"#000000" face=3D"calibri, sans-se=
rif"><span style=3D"font-size:14px">Logs:</span></font></div><div><font col=
or=3D"#000000" face=3D"calibri, sans-serif"><span style=3D"font-size:14px">=
<br></span></font></div><div><font color=3D"#000000" face=3D"calibri, sans-=
serif"><span style=3D"font-size:14px">FAILED: Execution Error, return code =
2 from org.apache.hadoop.hive.ql.<wbr>exec.mr.MapredLocalTask</span></font>=
</div><div><font color=3D"#000000" face=3D"calibri, sans-serif"><span style=
=3D"font-size:14px">MapReduce Jobs Launched:=C2=A0</span></font></div></spa=
n><div><font color=3D"#000000" face=3D"calibri, sans-serif"><span style=3D"=
font-size:14px">Stage-Stage-1: Map: 300 =C2=A0Reduce: 121 =C2=A0 Cumulative=
 CPU: 3654.02 sec =C2=A0 HDFS Read: 1771032233 HDFS Write: 1917532703 SUCCE=
SS</span></font></div><div><font color=3D"#000000" face=3D"calibri, sans-se=
rif"><span style=3D"font-size:14px">Total MapReduce CPU Time Spent: 0 days =
1 hours 0 minutes 54 seconds 20 msec</span></font></div></div><div><font co=
lor=3D"#000000" face=3D"calibri, sans-serif"><span style=3D"font-size:14px"=
><br></span></font></div><div><font color=3D"#000000" face=3D"calibri, sans=
-serif"><span style=3D"font-size:14px">Explain Plan:</span></font></div><di=
v><font color=3D"#000000" face=3D"calibri, sans-serif"><span style=3D"font-=
size:14px"><br></span></font></div><div><font color=3D"#000000" face=3D"cal=
ibri, sans-serif"><div><span style=3D"font-size:14px">STAGE DEPENDENCIES:</=
span></div><div><span style=3D"font-size:14px">=C2=A0 Stage-8 is a root sta=
ge , consists of Stage-1</span></div><div><span style=3D"font-size:14px">=
=C2=A0 Stage-1</span></div><div><span style=3D"font-size:14px">=C2=A0 Stage=
-9 depends on stages: Stage-1</span></div><div><span style=3D"font-size:14p=
x">=C2=A0 Stage-3 depends on stages: Stage-9</span></div><div><span style=
=3D"font-size:14px">=C2=A0 Stage-0 depends on stages: Stage-3</span></div><=
div><span style=3D"font-size:14px"><br></span></div><div><span style=3D"fon=
t-size:14px">STAGE PLANS:</span></div><div><span style=3D"font-size:14px">=
=C2=A0 Stage: Stage-8</span></div><span class=3D""><div><span style=3D"font=
-size:14px">=C2=A0 =C2=A0 Conditional Operator</span></div><div><span style=
=3D"font-size:14px"><br></span></div><div><span style=3D"font-size:14px">=
=C2=A0 Stage: Stage-1</span></div><div><span style=3D"font-size:14px">=C2=
=A0 =C2=A0 Map Reduce</span></div><div><span style=3D"font-size:14px">=C2=
=A0 =C2=A0 =C2=A0 Map Operator Tree:</span></div><div><span style=3D"font-s=
ize:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 TableScan</span></div></span><=
div><span style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 alias: pra</span></div><div><span style=3D"font-size:14px">=C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 Statistics: Num rows: 99241216 Data size: 9=
924121600 Basic stats: COMPLETE Column stats: NONE</span></div><div><span s=
tyle=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 Select Op=
erator</span></div><div><span style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0=
 =C2=A0 =C2=A0 =C2=A0 =C2=A0 expressions: primary_listing_id (type: string)=
</span></div><div><span style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 outputColumnNames: _col0</span></div><div><span st=
yle=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 Sta=
tistics: Num rows: 99241216 Data size: 9924121600 Basic stats: COMPLETE Col=
umn stats: NONE</span></div><div><span style=3D"font-size:14px">=C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 Reduce Output Operator</span></div><=
span class=3D""><div><span style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 key expressions: _col0 (type: string)</s=
pan></div><div><span style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 =C2=A0 sort order: +</span></div><div><span style=3D"f=
ont-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 Map-=
reduce partition columns: _col0 (type: string)</span></div></span><div><spa=
n style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=
 =C2=A0 Statistics: Num rows: 99241216 Data size: 9924121600 Basic stats: C=
OMPLETE Column stats: NONE</span></div><div><span style=3D"font-size:14px">=
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 TableScan</span></div><div><span style=
=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 alias: ml</sp=
an></div><div><span style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 Statistics: Num rows: 201432950 Data size: 20949026816 Basic =
stats: COMPLETE Column stats: NONE</span></div><div><span style=3D"font-siz=
e:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 Select Operator</span></d=
iv><div><span style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 expressions: id (type: string), standard_status (type: int)</=
span></div><div><span style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 outputColumnNames: _col0, _col1</span></div><div><span=
 style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
Statistics: Num rows: 201432950 Data size: 20949026816 Basic stats: COMPLET=
E Column stats: NONE</span></div><span class=3D""><div><span style=3D"font-=
size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 Reduce Output O=
perator</span></div><div><span style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 key expressions: _col0 (type: string=
)</span></div><div><span style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 sort order: +</span></div><div><span style=
=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=
 Map-reduce partition columns: _col0 (type: string)</span></div></span><div=
><span style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 Statistics: Num rows: 201432950 Data size: 20949026816 Basic =
stats: COMPLETE Column stats: NONE</span></div><span class=3D""><div><span =
style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 value expressions: _col1 (type: int)</span></div></span><span class=
=3D""><div><span style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 Reduce Opera=
tor Tree:</span></div><div><span style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=
=A0 =C2=A0 Join Operator</span></div><div><span style=3D"font-size:14px">=
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 condition map:</span></div><div><span st=
yle=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0Left Outer Join0 to 1</span></div><div><span style=3D"font-size:14px">=
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 keys:</span></div></span><div><span styl=
e=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 0 _col0 (typ=
e: string)</span></div><div><span style=3D"font-size:14px">=C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 =C2=A0 1 _col0 (type: string)</span></div><div><span s=
tyle=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 outputColumnName=
s: _col2</span></div><div><span style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 Statistics: Num rows: 221576249 Data size: 23043929997 Ba=
sic stats: COMPLETE Column stats: NONE</span></div><span class=3D""><div><s=
pan style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 File Output=
 Operator</span></div><div><span style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 compressed: false</span></div><div><span style=3D"=
font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 table:</span></di=
v><div><span style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 input format: org.apache.hadoop.mapred.<wbr>SequenceFi=
leInputFormat</span></div><div><span style=3D"font-size:14px">=C2=A0 =C2=A0=
 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 output format: <a href=3D"http:/=
/org.apache.hadoop.hive.ql.io">org.apache.hadoop.hive.ql.io</a>.<wbr>HiveSe=
quenceFileOutputFormat</span></div><div><span style=3D"font-size:14px">=C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 serde: org.apache.hado=
op.hive.serde2.<wbr>lazybinary.LazyBinarySerDe</span></div><div><span style=
=3D"font-size:14px"><br></span></div></span><div><span style=3D"font-size:1=
4px">=C2=A0<span style=3D"background-color:rgb(255,229,153)"> Stage: Stage-=
9 --it is failing in the same mapreduce local work.</span></span></div><spa=
n class=3D""><div><span style=3D"font-size:14px">=C2=A0 =C2=A0 Map Reduce L=
ocal Work</span></div><div><span style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=
=A0 Alias -&gt; Map Local Tables:</span></div></span><div><span style=3D"fo=
nt-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 $hdt$_2:a=C2=A0</span></div><span=
 class=3D""><div><span style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0=
 =C2=A0 Fetch Operator</span></div><div><span style=3D"font-size:14px">=C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 limit: -1</span></div><div><span sty=
le=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 Alias -&gt; Map Local Operator T=
ree:</span></div></span><div><span style=3D"font-size:14px">=C2=A0 =C2=A0 =
=C2=A0 =C2=A0 $hdt$_2:a=C2=A0</span></div><div><span style=3D"font-size:14p=
x">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 TableScan</span></div><div><span styl=
e=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 alias: a</sp=
an></div><div><span style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 Statistics: Num rows: 12830 Data size: 51322 Basic stats: COM=
PLETE Column stats: NONE</span></div><span class=3D""><div><span style=3D"f=
ont-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 Select Operator</s=
pan></div><div><span style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 expressions: id (type: int)</span></div></span><div><s=
pan style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 outputColumnNames: _col0</span></div><div><span style=3D"font-size:14px=
">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 Statistics: Num rows: 12=
830 Data size: 51322 Basic stats: COMPLETE Column stats: NONE</span></div><=
div><span style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 HashTable Sink Operator</span></div><div><span style=3D"font-siz=
e:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 keys:</span=
></div><div><span style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 0 _col2 (type: int)</span></div><div><span =
style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 1 _col0 (type: int)</span></div><div><span style=3D"font-size=
:14px"><br></span></div><div><span style=3D"font-size:14px">=C2=A0 Stage: S=
tage-3</span></div><span class=3D""><div><span style=3D"font-size:14px">=C2=
=A0 =C2=A0 Map Reduce</span></div><div><span style=3D"font-size:14px">=C2=
=A0 =C2=A0 =C2=A0 Map Operator Tree:</span></div><div><span style=3D"font-s=
ize:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 TableScan</span></div><div><sp=
an style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 Map J=
oin Operator</span></div><div><span style=3D"font-size:14px">=C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 condition map:</span></div><div><span st=
yle=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0Left Outer Join0 to 1</span></div><div><span style=3D"font=
-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 keys:</span></=
div></span><div><span style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 =C2=A0 0 _col2 (type: int)</span></div><div><span styl=
e=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 1 _col0 (type: int)</span></div><div><span style=3D"font-size:14px">=C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 Statistics: Num rows: 2437338=
79 Data size: 25348323546 Basic stats: COMPLETE Column stats: NONE</span></=
div><div><span style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 Group By Operator</span></div><div><span style=3D"font-size:1=
4px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 aggregations: =
count()</span></div><div><span style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 mode: hash</span></div><div><span st=
yle=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 outputColumnNames: _col0</span></div><div><span style=3D"font-size:14px=
">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 Statistics: Num r=
ows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE</span></div><d=
iv><span style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=
 =C2=A0 =C2=A0 Reduce Output Operator</span></div><div><span style=3D"font-=
size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 s=
ort order:=C2=A0</span></div><div><span style=3D"font-size:14px">=C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 Statistics: Num rows: =
1 Data size: 8 Basic stats: COMPLETE Column stats: NONE</span></div><div><s=
pan style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 value expressions: _col0 (type: bigint)</span></div><span=
 class=3D""><div><span style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 Local =
Work:</span></div><div><span style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =
=C2=A0 Map Reduce Local Work</span></div></span><div><span style=3D"font-si=
ze:14px">=C2=A0 =C2=A0 =C2=A0 Reduce Operator Tree:</span></div><div><span =
style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 Group By Operator</spa=
n></div><div><span style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 aggregations: count(VALUE._col0)</span></div><div><span style=3D"font-s=
ize:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 mode: mergepartial</span></div=
><div><span style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 out=
putColumnNames: _col0</span></div><div><span style=3D"font-size:14px">=C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 Statistics: Num rows: 1 Data size: 8 Basic =
stats: COMPLETE Column stats: NONE</span></div><span class=3D""><div><span =
style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 File Output Ope=
rator</span></div><div><span style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 compressed: false</span></div></span><div><span style=
=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 Statistics: N=
um rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE</span></di=
v><div><span style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 table:</span></div><span class=3D""><div><span style=3D"font-size:14=
px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 input format: o=
rg.apache.hadoop.mapred.<wbr>SequenceFileInputFormat</span></div><div><span=
 style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 output format: <a href=3D"http://org.apache.hadoop.hive.ql.io">org.a=
pache.hadoop.hive.ql.io</a>.<wbr>HiveSequenceFileOutputFormat</span></div><=
/span><div><span style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 serde: org.apache.hadoop.hive.serde2.<wbr>lazy.Laz=
ySimpleSerDe</span></div><div><span style=3D"font-size:14px"><br></span></d=
iv><div><span style=3D"font-size:14px">=C2=A0 Stage: Stage-0</span></div><d=
iv><span style=3D"font-size:14px">=C2=A0 =C2=A0 Fetch Operator</span></div>=
<div><span style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 limit: -1</span></=
div><div><span style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 Processor Tree=
:</span></div><div><span style=3D"font-size:14px">=C2=A0 =C2=A0 =C2=A0 =C2=
=A0 ListSink</span></div><div><br></div><div>Any suggestion in debugging th=
is issue is appreciated.</div><div style=3D"font-size:14px"><br></div></fon=
t></div><div><font color=3D"#000000" face=3D"calibri, sans-serif"><span sty=
le=3D"font-size:14px"><br></span></font></div><div><font color=3D"#000000" =
face=3D"calibri, sans-serif"><span style=3D"font-size:14px">Regards,</span>=
</font></div><div><font color=3D"#000000" face=3D"calibri, sans-serif"><spa=
n style=3D"font-size:14px">Satyajit.</span></font></div><div><br></div><div=
><br></div><div><span style=3D"color:rgb(0,0,0);font-family:calibri,sans-se=
rif;font-size:14px"><br></span></div></div><div class=3D"HOEnZb"><div class=
=3D"h5"><div class=3D"gmail_extra"><br><div class=3D"gmail_quote">On Wed, O=
ct 26, 2016 at 3:34 PM, Eugene Koifman <span dir=3D"ltr">&lt;<a href=3D"mai=
lto:ekoifman@hortonworks.com" target=3D"_blank">ekoifman@hortonworks.com</a=
>&gt;</span> wrote:<br><blockquote class=3D"gmail_quote" style=3D"margin:0 =
0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


<div style=3D"word-wrap:break-word">
<div><font face=3D"Calibri,sans-serif"><span style=3D"font-size:12.8px">If =
you can run this, then it</span><span style=3D"font-size:13px">=E2=80=99</s=
pan><span style=3D"font-size:12.8px">s safe to ignore=C2=A0</span><span sty=
le=3D"font-size:13px">=E2=80=9C</span></font><span style=3D"font-family:Cal=
ibri,sans-serif;font-size:14px">00094_</span><span style=3D"font-family:Cal=
ibri,sans-serif;font-size:14px">flush_length</span><font face=3D"Calibri,sa=
ns-serif"><span style=3D"font-size:13px">=E2=80=9D</span><span style=3D"fon=
t-size:12.8px">=C2=A0me<wbr>ssages
 and the issue is somewhere else</span></font></div>
<div style=3D"font-family:Calibri,sans-serif;font-size:14px;color:rgb(0,0,0=
)">
<span style=3D"font-size:12.8px"><span style=3D"background-color:rgb(56,118=
,29)"><br>
</span></span></div>
<div style=3D"font-family:Calibri,sans-serif;font-size:14px;color:rgb(0,0,0=
)">
<span style=3D"font-size:12.8px"><span style=3D"background-color:rgb(56,118=
,29)">select count(*) from mls_public_record_association_</span></span><spa=
n style=3D"background-color:rgb(56,118,29)"></span><span style=3D"font-size=
:12.8px"><span style=3D"background-color:rgb(56,118,29)"><wbr>snapshot_orc<=
/span>=C2=A0pra
 =C2=A0left outer join=C2=A0<span style=3D"background-color:rgb(106,168,79)=
">mls_listing_snapshot_orc</span>=C2=A0<wbr>ml on pra.primary_listing_id =
=3D=C2=A0</span><a href=3D"http://ml.id/" style=3D"font-size:12.8px" target=
=3D"_blank">ml.id</a><span style=3D"font-size:12.8px">=C2=A0=C2=A0left oute=
r
 join=C2=A0<span style=3D"background-color:rgb(106,168,79)">attribute</span=
>=C2=A0a on=C2=A0</span><a href=3D"http://a.id/" style=3D"font-size:12.8px"=
 target=3D"_blank">a.id</a><span style=3D"font-size:12.8px">=C2=A0=3D ml.st=
andard_status</span></div>
<div style=3D"font-family:Calibri,sans-serif;font-size:14px;color:rgb(0,0,0=
)">
<br>
</div>
<div style=3D"font-family:Calibri,sans-serif;font-size:14px;color:rgb(0,0,0=
)">
Eugene</div>
<div style=3D"font-family:Calibri,sans-serif;font-size:14px;color:rgb(0,0,0=
)">
<br>
</div>
<span id=3D"m_-8019037479453584833m_2512523147321354427OLK_SRC_BODY_SECTION=
" style=3D"font-family:Calibri,sans-serif;font-size:14px;color:rgb(0,0,0)">
<div style=3D"font-family:Calibri;font-size:11pt;text-align:left;color:blac=
k;BORDER-BOTTOM:medium none;BORDER-LEFT:medium none;PADDING-BOTTOM:0in;PADD=
ING-LEFT:0in;PADDING-RIGHT:0in;BORDER-TOP:#b5c4df 1pt solid;BORDER-RIGHT:me=
dium none;PADDING-TOP:3pt">
<span style=3D"font-weight:bold">From: </span>satyajit vegesna &lt;<a href=
=3D"mailto:satyajit.apasprk@gmail.com" target=3D"_blank">satyajit.apasprk@g=
mail.com</a>&gt;<br>
<span style=3D"font-weight:bold">Date: </span>Wednesday, October 26, 2016 a=
t 2:14 PM<br>
<span style=3D"font-weight:bold">To: </span>&quot;<a href=3D"mailto:user@hi=
ve.apache.org" target=3D"_blank">user@hive.apache.org</a>&quot; &lt;<a href=
=3D"mailto:user@hive.apache.org" target=3D"_blank">user@hive.apache.org</a>=
&gt;, Eugene Koifman &lt;<a href=3D"mailto:ekoifman@hortonworks.com" target=
=3D"_blank">ekoifman@hortonworks.com</a>&gt;<br>
<span style=3D"font-weight:bold">Cc: </span>&quot;<a href=3D"mailto:dev@hiv=
e.apache.org" target=3D"_blank">dev@hive.apache.org</a>&quot; &lt;<a href=
=3D"mailto:dev@hive.apache.org" target=3D"_blank">dev@hive.apache.org</a>&g=
t;<br>
<span style=3D"font-weight:bold">Subject: </span>Re: Error with flush_lengt=
h File in Orc, in hive 2.1.0 and mr execution engine.<br>
</div><div><div class=3D"m_-8019037479453584833h5">
<div><br>
</div>
<div>
<div>
<div dir=3D"ltr">Hi=C2=A0<span style=3D"color:rgb(0,0,0);font-family:calibr=
i,sans-serif;font-size:14px">Eugene,</span>
<div><span style=3D"color:rgb(0,0,0);font-family:calibri,sans-serif;font-si=
ze:14px"><br>
</span></div>
<div><span style=3D"color:rgb(0,0,0);font-family:calibri,sans-serif;font-si=
ze:14px">PFB Transaction table in green and parquet tables in yellow,</span=
></div>
<div><span style=3D"color:rgb(0,0,0);font-family:calibri,sans-serif;font-si=
ze:14px"><br>
</span></div>
<div><span style=3D"font-size:12.8px">INSERT INTO <span style=3D"background=
-color:rgb(241,194,50)">
access_logs.crawlstats_dpp</span> PARTITION(day=3D&quot;2016-10-23&quot;) s=
elect pra.url as prUrl,pra.url_type as urlType,CAST(pra.created_at AS times=
tamp) as prCreated, CAST(pra.updated_at AS timestamp) as prUpdated, CAST(ml=
.created_at AS timestamp) as mlCreated, CAST(ml.updated_at
 AS timestamp) as mlUpdated,=C2=A0</span><a href=3D"http://a.name/" style=
=3D"font-size:12.8px" target=3D"_blank">a.name</a><span style=3D"font-size:=
12.8px">=C2=A0as status, pra.public_record_id as prId, acl.accesstime as cr=
awledon,=C2=A0</span><a href=3D"http://pra.id/" style=3D"font-size:12.8px" =
target=3D"_blank">pra.id</a><span style=3D"font-size:12.8px">=C2=A0as
 propId, pra.primary_listing_id as listingId, datediff(CAST(acl.accesstime =
AS timestamp),CAST(ml.created_at AS timestamp)) as mlcreateage, datediff(CA=
ST(acl.accesstime AS timestamp),CAST(ml.updated_at AS timestamp)) as mlupda=
teage, datediff(CAST(acl.accesstime
 AS timestamp),CAST(pra.created_at AS timestamp)) as prcreateage, datediff(=
CAST(acl.accesstime AS timestamp),CAST(pra.updated_at AS timestamp)) as pru=
pdateage, =C2=A0(case when (pra.public_record_id is not null and TRIM(pra.p=
ublic_record_id) &lt;&gt; &#39;&#39;) =C2=A0then (case when
 (pra.primary_listing_id is null or TRIM(pra.primary_listing_id) =3D &#39;&=
#39;) then &#39;PR&#39; else &#39;PRMLS&#39; END) =C2=A0else (case when (pr=
a.primary_listing_id is not null and TRIM(pra.primary_listing_id) &lt;&gt; =
&#39;&#39;) then &#39;MLS&#39; else &#39;UNKNOWN&#39; END) END) as listingT=
ype, =C2=A0acl.httpstatuscode,
 =C2=A0acl.httpverb, =C2=A0acl.requesttime, acl.upstreamheadertime , acl.up=
streamresponsetime, =C2=A0acl.page_id, =C2=A0useragent AS user_agent, =C2=
=A0substring(split(pra.url,&#39;/&#39;)[</span><span style=3D"font-size:12.=
8px"><wbr>0], 0,length(split(pra.url,&#39;/&#39;)[0]</span><span style=3D"f=
ont-size:12.8px"><wbr>)-3)
 as city, =C2=A0substring(split(pra.url,&#39;/&#39;)[</span><span style=3D"=
font-size:12.8px"><wbr>0], length(split(pra.url,&#39;/&#39;)[0])-</span><sp=
an style=3D"font-size:12.8px"><wbr>1,2) as state, =C2=A0ml.mls_id =C2=A0FRO=
M
<span style=3D"background-color:rgb(255,217,102)">access_logs.loadbalancer_=
</span></span><span style=3D"background-color:rgb(255,217,102)"></span><spa=
n style=3D"font-size:12.8px"><span style=3D"background-color:rgb(255,217,10=
2)">acces<wbr>slogs</span>
 acl =C2=A0inner join <span style=3D"background-color:rgb(56,118,29)">mls_p=
ublic_record_association_</span></span><span style=3D"background-color:rgb(=
56,118,29)"></span><span style=3D"font-size:12.8px"><span style=3D"backgrou=
nd-color:rgb(56,118,29)"><wbr>snapshot_orc</span>
 pra on acl.listing_url =3D pra.url =C2=A0left outer join <span style=3D"ba=
ckground-color:rgb(106,168,79)">
mls_listing_snapshot_orc</span> ml on pra.primary_listing_id =3D=C2=A0</spa=
n><a href=3D"http://ml.id/" style=3D"font-size:12.8px" target=3D"_blank">ml=
.id</a><span style=3D"font-size:12.8px">=C2=A0=C2=A0left outer join
<span style=3D"background-color:rgb(106,168,79)">attribute</span> a on=C2=
=A0</span><a href=3D"http://a.id/" style=3D"font-size:12.8px" target=3D"_bl=
ank">a.id</a><span style=3D"font-size:12.8px">=C2=A0=3D ml.standard_status =
=C2=A0WHERE acl.accesstimedate=3D&quot;2016-10-</span><span style=3D"font-s=
ize:12.8px">23<wbr>&quot;;</span><span style=3D"color:rgb(0,0,0);font-famil=
y:calibri,sans-serif;font-size:14px"><br>
</span></div>
<div><span style=3D"font-size:12.8px"><br>
</span></div>
<div><span style=3D"font-size:12.8px"><br>
</span></div>
<div><span style=3D"font-size:12.8px">Any clue, or something that you would=
 want me to focus on to debug the issue.</span></div>
<div><span style=3D"font-size:12.8px"><br>
</span></div>
<div><span style=3D"font-size:12.8px">Regards,</span></div>
<div><span style=3D"font-size:12.8px">Satyajit.</span></div>
<div><span style=3D"font-size:12.8px"><br>
</span></div>
<div><br>
</div>
</div>
<div class=3D"gmail_extra"><br>
<div class=3D"gmail_quote">On Tue, Oct 25, 2016 at 8:49 PM, Eugene Koifman =
<span dir=3D"ltr">
&lt;<a href=3D"mailto:ekoifman@hortonworks.com" target=3D"_blank">ekoifman@=
hortonworks.com</a>&gt;</span> wrote:<br>
<blockquote class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;border-left:1p=
x #ccc solid;padding-left:1ex">
<div style=3D"word-wrap:break-word;color:rgb(0,0,0);font-size:14px;font-fam=
ily:Calibri,sans-serif">
<div>Which of your tables are are transactional?=C2=A0 Can you provide the =
DDL?</div>
<div><br>
</div>
<div>I don=E2=80=99t think =E2=80=9CFile does not exist=E2=80=9D error is c=
ausing your queries to fail.=C2=A0 It=E2=80=99s an INFO level msg.</div>
<div>There should be some other error.</div>
<div><br>
</div>
<div>Eugene</div>
<div><br>
</div>
<div><br>
</div>
<span id=3D"m_-8019037479453584833m_2512523147321354427m_340124827789528119=
4OLK_SRC_BODY_SECTION">
<div style=3D"font-family:Calibri;font-size:11pt;text-align:left;color:blac=
k;BORDER-BOTTOM:medium none;BORDER-LEFT:medium none;PADDING-BOTTOM:0in;PADD=
ING-LEFT:0in;PADDING-RIGHT:0in;BORDER-TOP:#b5c4df 1pt solid;BORDER-RIGHT:me=
dium none;PADDING-TOP:3pt">
<span style=3D"font-weight:bold">From: </span>satyajit vegesna &lt;<a href=
=3D"mailto:satyajit.apasprk@gmail.com" target=3D"_blank">satyajit.apasprk@g=
mail.com</a>&gt;<br>
<span style=3D"font-weight:bold">Reply-To: </span>&quot;<a href=3D"mailto:u=
ser@hive.apache.org" target=3D"_blank">user@hive.apache.org</a>&quot; &lt;<=
a href=3D"mailto:user@hive.apache.org" target=3D"_blank">user@hive.apache.o=
rg</a>&gt;<br>
<span style=3D"font-weight:bold">Date: </span>Tuesday, October 25, 2016 at =
5:46 PM<br>
<span style=3D"font-weight:bold">To: </span>&quot;<a href=3D"mailto:user@hi=
ve.apache.org" target=3D"_blank">user@hive.apache.org</a>&quot; &lt;<a href=
=3D"mailto:user@hive.apache.org" target=3D"_blank">user@hive.apache.org</a>=
&gt;, &quot;<a href=3D"mailto:dev@hive.apache.org" target=3D"_blank">dev@hi=
ve.apache.org</a>&quot;
 &lt;<a href=3D"mailto:dev@hive.apache.org" target=3D"_blank">dev@hive.apac=
he.org</a>&gt;<br>
<span style=3D"font-weight:bold">Subject: </span>Error with flush_length Fi=
le in Orc, in hive 2.1.0 and mr execution engine.<br>
</div>
<div><br>
</div>
<div>
<div>
<div dir=3D"ltr"><span>HI All,
<div><br>
</div>
<div>i am using hive 2.1.0 , hadoop 2.7.2 , but =C2=A0when i try running qu=
eries like simple insert,</div>
<div><br>
</div>
<div>set mapreduce.job.queuename=3Ddefaul<wbr>t;set hive.exec.dynamic.parti=
tion=3Dtr<wbr>ue;set <a href=3D"http://hive.exec.dynamic.partition.mo" targ=
et=3D"_blank">hive.exec.dynamic.partition.mo</a><wbr>de=3Dnonstrict;set hiv=
e.exec.max.dynamic.partitio<wbr>ns.pernode=3D400;set hive.exec.max.dynamic.=
partitio<wbr>ns=3D2000;set mapreduce.map.memory.mb=3D5120;s<wbr>et
 mapreduce.reduce.memory.mb=3D512<wbr>0;set mapred.tasktracker.map.tasks.m<=
wbr>aximum=3D30;set mapred.tasktracker.reduce.task<wbr>s.maximum=3D20;set m=
apred.reduce.child.java.opts=3D<wbr>-Xmx2048m;set mapred.map.child.java.opt=
s=3D-Xm<wbr>x2048m; set hive.support.concurrency=3Dtrue;
 set hive.txn.manager=3Dorg.apache.ha<wbr>doop.hive.ql.lockmgr.DbTxnMana<wb=
r>ger; set hive.compactor.initiator.on=3Dfa<wbr>lse; set hive.compactor.wor=
ker.threads=3D<wbr>1;set mapreduce.job.queuename=3Ddefaul<wbr>t;set hive.ex=
ec.dynamic.partition=3Dtr<wbr>ue;set <a href=3D"http://hive.exec.dynamic.pa=
rtition.mo" target=3D"_blank">hive.exec.dynamic.partition.mo</a><wbr>de=3Dn=
onstrict;INSERT
 INTO access_logs.crawlstats_dpp PARTITION(day=3D&quot;2016-10-23&quot;) se=
lect pra.url as prUrl,pra.url_type as urlType,CAST(pra.created_at AS timest=
amp) as prCreated, CAST(pra.updated_at AS timestamp) as prUpdated, CAST(ml.=
created_at AS timestamp) as mlCreated, CAST(ml.updated_at
 AS timestamp) as mlUpdated, <a href=3D"http://a.name" target=3D"_blank">a.=
name</a> as status, pra.public_record_id as prId, acl.accesstime as crawled=
on,
<a href=3D"http://pra.id" target=3D"_blank">pra.id</a> as propId, pra.prima=
ry_listing_id as listingId, datediff(CAST(acl.accesstime AS timestamp),CAST=
(ml.created_at AS timestamp)) as mlcreateage, datediff(CAST(acl.accesstime =
AS timestamp),CAST(ml.updated_at AS
 timestamp)) as mlupdateage, datediff(CAST(acl.accesstime AS timestamp),CAS=
T(pra.created_at AS timestamp)) as prcreateage, datediff(CAST(acl.accesstim=
e AS timestamp),CAST(pra.updated_at AS timestamp)) as prupdateage, =C2=A0(c=
ase when (pra.public_record_id is not
 null and TRIM(pra.public_record_id) &lt;&gt; &#39;&#39;) =C2=A0then (case =
when (pra.primary_listing_id is null or TRIM(pra.primary_listing_id) =3D &#=
39;&#39;) then &#39;PR&#39; else &#39;PRMLS&#39; END) =C2=A0else (case when=
 (pra.primary_listing_id is not null and TRIM(pra.primary_listing_id) &lt;&=
gt; &#39;&#39;) then
 &#39;MLS&#39; else &#39;UNKNOWN&#39; END) END) as listingType, =C2=A0acl.h=
ttpstatuscode, =C2=A0acl.httpverb, =C2=A0acl.requesttime, acl.upstreamheade=
rtime , acl.upstreamresponsetime, =C2=A0acl.page_id, =C2=A0useragent AS use=
r_agent, =C2=A0substring(split(pra.url,&#39;/&#39;)[<wbr>0], 0,length(split=
(pra.url,&#39;/&#39;)[0]<wbr>)-3)
 as city, =C2=A0substring(split(pra.url,&#39;/&#39;)[<wbr>0], length(split(=
pra.url,&#39;/&#39;)[0])-<wbr>1,2) as state, =C2=A0ml.mls_id =C2=A0FROM acc=
ess_logs.loadbalancer_acces<wbr>slogs acl =C2=A0inner join mls_public_recor=
d_association_<wbr>snapshot_orc pra on acl.listing_url =3D pra.url =C2=A0le=
ft
 outer join mls_listing_snapshot_orc ml on pra.primary_listing_id =3D <a hr=
ef=3D"http://ml.id" target=3D"_blank">
ml.id</a> =C2=A0left outer join attribute a on <a href=3D"http://a.id" targ=
et=3D"_blank">a.id</a> =3D ml.standard_status =C2=A0WHERE acl.accesstimedat=
e=3D&quot;2016-10-23<wbr>&quot;;<br>
</div>
<div><br>
</div>
<div>i finally end up getting below error,</div>
<div><br>
</div>
</span>
<div><span>
<div>2016-10-25 17:40:18,725 Stage-2 map =3D 100%, =C2=A0reduce =3D 52%, Cu=
mulative CPU 1478.96 sec</div>
<div>2016-10-25 17:40:19,761 Stage-2 map =3D 100%, =C2=A0reduce =3D 62%, Cu=
mulative CPU 1636.58 sec</div>
<div>2016-10-25 17:40:20,794 Stage-2 map =3D 100%, =C2=A0reduce =3D 64%, Cu=
mulative CPU 1764.97 sec</div>
<div>2016-10-25 17:40:21,820 Stage-2 map =3D 100%, =C2=A0reduce =3D 69%, Cu=
mulative CPU 1879.61 sec</div>
<div>2016-10-25 17:40:22,842 Stage-2 map =3D 100%, =C2=A0reduce =3D 80%, Cu=
mulative CPU 2051.38 sec</div>
<div>2016-10-25 17:40:23,872 Stage-2 map =3D 100%, =C2=A0reduce =3D 90%, Cu=
mulative CPU 2151.49 sec</div>
<div>2016-10-25 17:40:24,907 Stage-2 map =3D 100%, =C2=A0reduce =3D 93%, Cu=
mulative CPU 2179.67 sec</div>
<div>2016-10-25 17:40:25,944 Stage-2 map =3D 100%, =C2=A0reduce =3D 94%, Cu=
mulative CPU 2187.86 sec</div>
<div>2016-10-25 17:40:29,062 Stage-2 map =3D 100%, =C2=A0reduce =3D 95%, Cu=
mulative CPU 2205.22 sec</div>
<div>2016-10-25 17:40:30,107 Stage-2 map =3D 100%, =C2=A0reduce =3D 100%, C=
umulative CPU 2241.25 sec</div>
<div>MapReduce Total cumulative CPU time: 37 minutes 21 seconds 250 msec</d=
iv>
<div>Ended Job =3D job_1477437520637_0009</div>
<div>SLF4J: Class path contains multiple SLF4J bindings.</div>
<div>SLF4J: Found binding in [jar:file:/opt/apache-hive-2.1<wbr>.0-bin/lib/=
log4j-slf4j-impl-2.<wbr>4.1.jar!/org/slf4j/impl/Static<wbr>LoggerBinder.cla=
ss]</div>
<div>SLF4J: Found binding in [jar:file:/opt/hadoop-2.7.2/sh<wbr>are/hadoop/=
common/lib/slf4j-lo<wbr>g4j12-1.7.10.jar!/org/slf4j/im<wbr>pl/StaticLoggerB=
inder.class]</div>
<div>SLF4J: See <a href=3D"http://www.slf4j.org/codes.html#multiple_binding=
s" target=3D"_blank">
http://www.slf4j.org/codes.htm<wbr>l#multiple_bindings</a> for an explanati=
on.</div>
<div>SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4<wbr>jL=
oggerFactory]</div>
</span>
<div>2016-10-25 17:40:35<span class=3D"m_-8019037479453584833m_251252314732=
1354427m_3401248277895281194gmail-Apple-tab-span" style=3D"white-space:pre-=
wrap"></span>Starting to launch local task to process map join;<span class=
=3D"m_-8019037479453584833m_2512523147321354427m_3401248277895281194gmail-A=
pple-tab-span" style=3D"white-space:pre-wrap"></span>maximum
 memory =3D 514850816</div>
<span>
<div>Execution failed with exit status: 2</div>
<div>Obtaining error information</div>
<div><br>
</div>
<div>Task failed!</div>
<div>Task ID:</div>
<div>=C2=A0 Stage-14</div>
<div><br>
</div>
<div>Logs:</div>
<div><br>
</div>
<div>FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.=
exec<wbr>.mr.MapredLocalTask</div>
<div>MapReduce Jobs Launched:=C2=A0</div>
<div>Stage-Stage-1: Map: 106 =C2=A0Reduce: 45 =C2=A0 Cumulative CPU: 3390.1=
1 sec =C2=A0 HDFS Read: 8060555201 HDFS Write: 757253756 SUCCESS</div>
<div>Stage-Stage-2: Map: 204 =C2=A0Reduce: 85 =C2=A0 Cumulative CPU: 2241.2=
5 sec =C2=A0 HDFS Read:
<a href=3D"tel:2407914653" value=3D"+12407914653" target=3D"_blank">2407914=
653</a> HDFS Write: 805874953 SUCCESS</div>
<div>Total MapReduce CPU Time Spent: 0 days 1 hours 33 minutes 51 seconds 3=
60 msec</div>
</span></div>
<span>
<div><br>
</div>
<div>Could not find any errors in logs, but when i check namenode logs , oi=
 get the following error,</div>
<div><br>
</div>
<div>
<div>2016-10-25 17:01:51,923 INFO org.apache.hadoop.ipc.Server: IPC Server =
handler 1 on 9000, call org.apache.hadoop.hdfs.protoco<wbr>l.ClientProtocol=
.getBlockLocat<wbr>ions from
<a href=3D"http://192.168.120.133:47114" target=3D"_blank">192.168.120.133:=
47114</a> Call#4 Retry#0: java.io.FileNotFoundException: File does not exis=
t: /user/hive/warehouse/mls_publi<wbr>c_record_association_snapshot_<wbr>or=
c/delta_0000002_0000002_0000<wbr>/bucket_00094_flush_length</div>
<div>2016-10-25 17:01:52,779 INFO org.apache.hadoop.ipc.Server: IPC Server =
handler 1 on 9000, call org.apache.hadoop.hdfs.protoco<wbr>l.ClientProtocol=
.getBlockLocat<wbr>ions from
<a href=3D"http://192.168.120.132:43008" target=3D"_blank">192.168.120.132:=
43008</a> Call#4 Retry#0: java.io.FileNotFoundException: File does not exis=
t: /user/hive/warehouse/mls_publi<wbr>c_record_association_snapshot_<wbr>or=
c/delta_0000002_0000002_0000<wbr>/bucket_00095_flush_length</div>
<div>2016-10-25 17:01:52,984 INFO org.apache.hadoop.ipc.Server: IPC Server =
handler 0 on 9000, call org.apache.hadoop.hdfs.protoco<wbr>l.ClientProtocol=
.getBlockLocat<wbr>ions from
<a href=3D"http://192.168.120.133:47260" target=3D"_blank">192.168.120.133:=
47260</a> Call#4 Retry#0: java.io.FileNotFoundException: File does not exis=
t: /user/hive/warehouse/mls_publi<wbr>c_record_association_snapshot_<wbr>or=
c/delta_0000002_0000002_0000<wbr>/bucket_00096_flush_length</div>
<div>2016-10-25 17:01:53,381 INFO org.apache.hadoop.ipc.Server: IPC Server =
handler 0 on 9000, call org.apache.hadoop.hdfs.protoco<wbr>l.ClientProtocol=
.getBlockLocat<wbr>ions from
<a href=3D"http://192.168.120.132:43090" target=3D"_blank">192.168.120.132:=
43090</a> Call#4 Retry#0: java.io.FileNotFoundException: File does not exis=
t: /user/hive/warehouse/mls_publi<wbr>c_record_association_snapshot_<wbr>or=
c/delta_0000002_0000002_0000<wbr>/bucket_00097_flush_length</div>
<div>2016-10-25 17:01:53,971 INFO org.apache.hadoop.ipc.Server: IPC Server =
handler 1 on 9000, call org.apache.hadoop.hdfs.protoco<wbr>l.ClientProtocol=
.getBlockLocat<wbr>ions from
<a href=3D"http://192.168.120.134:37444" target=3D"_blank">192.168.120.134:=
37444</a> Call#4 Retry#0: java.io.FileNotFoundException: File does not exis=
t: /user/hive/warehouse/mls_publi<wbr>c_record_association_snapshot_<wbr>or=
c/delta_0000002_0000002_0000<wbr>/bucket_00098_flush_length</div>
<div>2016-10-25 17:01:54,092 INFO org.apache.hadoop.ipc.Server: IPC Server =
handler 2 on 9000, call org.apache.hadoop.hdfs.protoco<wbr>l.ClientProtocol=
.getBlockLocat<wbr>ions from
<a href=3D"http://192.168.120.133:47300" target=3D"_blank">192.168.120.133:=
47300</a> Call#4 Retry#0: java.io.FileNotFoundException: File does not exis=
t: /user/hive/warehouse/mls_publi<wbr>c_record_association_snapshot_<wbr>or=
c/delta_0000002_0000002_0000<wbr>/bucket_00099_flush_length</div>
<div>2016-10-25 17:01:55,094 INFO org.apache.hadoop.ipc.Server: IPC Server =
handler 8 on 9000, call org.apache.hadoop.hdfs.protoco<wbr>l.ClientProtocol=
.getBlockLocat<wbr>ions from
<a href=3D"http://192.168.120.134:37540" target=3D"_blank">192.168.120.134:=
37540</a> Call#4 Retry#0: java.io.FileNotFoundException: File does not exis=
t: /user/hive/warehouse/mls_publi<wbr>c_record_association_snapshot_<wbr>or=
c/delta_0000002_0000002_0000<wbr>/bucket_00012_flush_length</div>
<div>2016-10-25 17:02:11,269 INFO org.apache.hadoop.ipc.Server: IPC Server =
handler 5 on 9000, call org.apache.hadoop.hdfs.protoco<wbr>l.ClientProtocol=
.getBlockLocat<wbr>ions from
<a href=3D"http://192.168.120.133:47378" target=3D"_blank">192.168.120.133:=
47378</a> Call#4 Retry#0: java.io.FileNotFoundException: File does not exis=
t: /user/hive/warehouse/mls_publi<wbr>c_record_association_snapshot_<wbr>or=
c/delta_0000002_0000002_0000<wbr>/bucket_00075_flush_length</div>
</div>
<div><br>
</div>
<div>i also search for find the flush_length files in the above mentioned l=
ocation, but i only see buckets but no files ending with flush_length.</div=
>
<div><br>
</div>
<div>Any clue or help would be highly appreciated.</div>
<div><br>
</div>
<div>Regards,</div>
<div>Satyajit.</div>
<div><br>
</div>
</span></div>
</div>
</div>
</span></div>
</blockquote>
</div>
<br>
</div>
</div>
</div>
</div></div></span>
</div>

</blockquote></div><br></div>
</div></div></blockquote></div><br></div>

--001a1141b320f6e9d0053fccf4da--