drill-dev mailing list archives

From Anup Tiwari <anup.tiw...@games24x7.com>
Subject Re: Storage Plugin for accessing Hive ORC Table from Drill
Date Fri, 20 Jan 2017 11:00:43 GMT
Hi,

Please find the CREATE TABLE statement and the resulting Drill error below:

*Table Structure :*

CREATE TABLE `logindetails_all`(
  `sid` char(40),
  `channel_id` tinyint,
  `c_t` bigint,
  `l_t` bigint)
PARTITIONED BY (
  `login_date` char(10))
CLUSTERED BY (
  channel_id)
INTO 9 BUCKETS
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION
  'hdfs://hostname1:9000/usr/hive/warehouse/logindetails_all'
TBLPROPERTIES (
  'compactorthreshold.hive.compactor.delta.num.threshold'='6',
  'compactorthreshold.hive.compactor.delta.pct.threshold'='0.5',
  'transactional'='true',
  'transient_lastDdlTime'='1484313383');

*Drill Error :*

*Query* : select * from hive.logindetails_all limit 1;

*Error :*
2017-01-20 16:21:12,625 [277e145e-c6bc-3372-01d0-6c5b75b92d73:foreman]
INFO  o.a.drill.exec.work.foreman.Foreman - Query text for query id
277e145e-c6bc-3372-01d0-6c5b75b92d73: select * from hive.logindetails_all
limit 1
2017-01-20 16:21:12,831 [277e145e-c6bc-3372-01d0-6c5b75b92d73:foreman]
ERROR o.a.drill.exec.work.foreman.Foreman - SYSTEM ERROR:
NumberFormatException: For input string: "0000004_0000"


[Error Id: 53fa92e1-477e-45d2-b6f7-6eab9ef1da35 on
prod-hadoop-101.bom-prod.aws.games24x7.com:31010]
org.apache.drill.common.exceptions.UserException: SYSTEM ERROR:
NumberFormatException: For input string: "0000004_0000"


[Error Id: 53fa92e1-477e-45d2-b6f7-6eab9ef1da35 on
prod-hadoop-101.bom-prod.aws.games24x7.com:31010]
    at
org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:543)
~[drill-common-1.9.0.jar:1.9.0]
    at
org.apache.drill.exec.work.foreman.Foreman$ForemanResult.close(Foreman.java:825)
[drill-java-exec-1.9.0.jar:1.9.0]
    at
org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:935)
[drill-java-exec-1.9.0.jar:1.9.0]
    at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:281)
[drill-java-exec-1.9.0.jar:1.9.0]
    at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
[na:1.8.0_72]
    at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
[na:1.8.0_72]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_72]
Caused by: org.apache.drill.exec.work.foreman.ForemanException: Unexpected
exception during fragment initialization: Internal error: Error while
applying rule DrillPushProjIntoScan, args
[rel#4220197:LogicalProject.NONE.ANY([]).[](input=rel#4220196:Subset#0.ENUMERABLE.ANY([]).[],sid=$0,channel_id=$1,c_t=$2,l_t=$3,login_date=$4),
rel#4220181:EnumerableTableScan.ENUMERABLE.ANY([]).[](table=[hive,
logindetails_all])]
    ... 4 common frames omitted
Caused by: java.lang.AssertionError: Internal error: Error while applying
rule DrillPushProjIntoScan, args
[rel#4220197:LogicalProject.NONE.ANY([]).[](input=rel#4220196:Subset#0.ENUMERABLE.ANY([]).[],sid=$0,channel_id=$1,c_t=$2,l_t=$3,login_date=$4),
rel#4220181:EnumerableTableScan.ENUMERABLE.ANY([]).[](table=[hive,
logindetails_all])]
    at org.apache.calcite.util.Util.newInternal(Util.java:792)
~[calcite-core-1.4.0-drill-r19.jar:1.4.0-drill-r19]
    at
org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:251)
~[calcite-core-1.4.0-drill-r19.jar:1.4.0-drill-r19]
    at
org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:808)
~[calcite-core-1.4.0-drill-r19.jar:1.4.0-drill-r19]
    at
org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:303)
~[calcite-core-1.4.0-drill-r19.jar:1.4.0-drill-r19]
    at
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform(DefaultSqlHandler.java:404)
~[drill-java-exec-1.9.0.jar:1.9.0]
    at
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform(DefaultSqlHandler.java:343)
~[drill-java-exec-1.9.0.jar:1.9.0]
    at
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:240)
~[drill-java-exec-1.9.0.jar:1.9.0]
    at
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:290)
~[drill-java-exec-1.9.0.jar:1.9.0]
    at
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:168)
~[drill-java-exec-1.9.0.jar:1.9.0]
    at
org.apache.drill.exec.planner.sql.DrillSqlWorker.getPhysicalPlan(DrillSqlWorker.java:123)
~[drill-java-exec-1.9.0.jar:1.9.0]
    at
org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:97)
~[drill-java-exec-1.9.0.jar:1.9.0]
    at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:1008)
[drill-java-exec-1.9.0.jar:1.9.0]
    at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:264)
[drill-java-exec-1.9.0.jar:1.9.0]
    ... 3 common frames omitted
Caused by: java.lang.AssertionError: Internal error: Error occurred while
applying rule DrillPushProjIntoScan
    at org.apache.calcite.util.Util.newInternal(Util.java:792)
~[calcite-core-1.4.0-drill-r19.jar:1.4.0-drill-r19]
    at
org.apache.calcite.plan.volcano.VolcanoRuleCall.transformTo(VolcanoRuleCall.java:150)
~[calcite-core-1.4.0-drill-r19.jar:1.4.0-drill-r19]
    at
org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:213)
~[calcite-core-1.4.0-drill-r19.jar:1.4.0-drill-r19]
    at
org.apache.drill.exec.planner.logical.DrillPushProjIntoScan.onMatch(DrillPushProjIntoScan.java:90)
~[drill-java-exec-1.9.0.jar:1.9.0]
    at
org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:228)
~[calcite-core-1.4.0-drill-r19.jar:1.4.0-drill-r19]
    ... 14 common frames omitted
Caused by: java.lang.reflect.UndeclaredThrowableException: null
    at com.sun.proxy.$Proxy75.getNonCumulativeCost(Unknown Source) ~[na:na]
    at
org.apache.calcite.rel.metadata.RelMetadataQuery.getNonCumulativeCost(RelMetadataQuery.java:115)
~[calcite-core-1.4.0-drill-r19.jar:1.4.0-drill-r19]
    at
org.apache.calcite.plan.volcano.VolcanoPlanner.getCost(VolcanoPlanner.java:1112)
~[calcite-core-1.4.0-drill-r19.jar:1.4.0-drill-r19]
    at
org.apache.calcite.plan.volcano.RelSubset.propagateCostImprovements0(RelSubset.java:363)
~[calcite-core-1.4.0-drill-r19.jar:1.4.0-drill-r19]
    at
org.apache.calcite.plan.volcano.RelSubset.propagateCostImprovements(RelSubset.java:344)
~[calcite-core-1.4.0-drill-r19.jar:1.4.0-drill-r19]
    at
org.apache.calcite.plan.volcano.VolcanoPlanner.addRelToSet(VolcanoPlanner.java:1827)
~[calcite-core-1.4.0-drill-r19.jar:1.4.0-drill-r19]
    at
org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1760)
~[calcite-core-1.4.0-drill-r19.jar:1.4.0-drill-r19]
    at
org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:1017)
~[calcite-core-1.4.0-drill-r19.jar:1.4.0-drill-r19]
    at
org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:1037)
~[calcite-core-1.4.0-drill-r19.jar:1.4.0-drill-r19]
    at
org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:1940)
~[calcite-core-1.4.0-drill-r19.jar:1.4.0-drill-r19]
    at
org.apache.calcite.plan.volcano.VolcanoRuleCall.transformTo(VolcanoRuleCall.java:138)
~[calcite-core-1.4.0-drill-r19.jar:1.4.0-drill-r19]
    ... 17 common frames omitted
Caused by: java.lang.reflect.InvocationTargetException: null
    at sun.reflect.GeneratedMethodAccessor63.invoke(Unknown Source) ~[na:na]
    at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
~[na:1.8.0_72]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_72]
    at
org.apache.calcite.rel.metadata.CachingRelMetadataProvider$CachingInvocationHandler.invoke(CachingRelMetadataProvider.java:132)
~[calcite-core-1.4.0-drill-r19.jar:1.4.0-drill-r19]
    ... 28 common frames omitted
Caused by: java.lang.reflect.UndeclaredThrowableException: null
    at com.sun.proxy.$Proxy75.getNonCumulativeCost(Unknown Source) ~[na:na]
    ... 32 common frames omitted
Caused by: java.lang.reflect.InvocationTargetException: null
    at sun.reflect.GeneratedMethodAccessor63.invoke(Unknown Source) ~[na:na]
    at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
~[na:1.8.0_72]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_72]
    at
org.apache.calcite.rel.metadata.ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(ChainedRelMetadataProvider.java:109)
~[calcite-core-1.4.0-drill-r19.jar:1.4.0-drill-r19]
    ... 33 common frames omitted
Caused by: java.lang.reflect.UndeclaredThrowableException: null
    at com.sun.proxy.$Proxy75.getNonCumulativeCost(Unknown Source) ~[na:na]
    ... 37 common frames omitted
Caused by: java.lang.reflect.InvocationTargetException: null
    at sun.reflect.GeneratedMethodAccessor65.invoke(Unknown Source) ~[na:na]
    at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
~[na:1.8.0_72]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_72]
    at
org.apache.calcite.rel.metadata.ReflectiveRelMetadataProvider$1$1.invoke(ReflectiveRelMetadataProvider.java:182)
~[calcite-core-1.4.0-drill-r19.jar:1.4.0-drill-r19]
    ... 38 common frames omitted
Caused by: org.apache.drill.common.exceptions.DrillRuntimeException:
java.io.IOException: Failed to get numRows from HiveTable
    at
org.apache.drill.exec.store.hive.HiveScan.getScanStats(HiveScan.java:233)
~[drill-storage-hive-core-1.9.0.jar:1.9.0]
    at
org.apache.drill.exec.physical.base.AbstractGroupScan.getScanStats(AbstractGroupScan.java:79)
~[drill-java-exec-1.9.0.jar:1.9.0]
    at
org.apache.drill.exec.planner.logical.DrillScanRel.computeSelfCost(DrillScanRel.java:159)
~[drill-java-exec-1.9.0.jar:1.9.0]
    at
org.apache.calcite.rel.metadata.RelMdPercentageOriginalRows.getNonCumulativeCost(RelMdPercentageOriginalRows.java:165)
~[calcite-core-1.4.0-drill-r19.jar:1.4.0-drill-r19]
    ... 42 common frames omitted
Caused by: java.io.IOException: Failed to get numRows from HiveTable
    at
org.apache.drill.exec.store.hive.HiveMetadataProvider.getStats(HiveMetadataProvider.java:113)
~[drill-storage-hive-core-1.9.0.jar:1.9.0]
    at
org.apache.drill.exec.store.hive.HiveScan.getScanStats(HiveScan.java:224)
~[drill-storage-hive-core-1.9.0.jar:1.9.0]
    ... 45 common frames omitted
Caused by: java.lang.RuntimeException: serious problem
    at
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1021)
~[drill-hive-exec-shaded-1.9.0.jar:1.9.0]
    at
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1048)
~[drill-hive-exec-shaded-1.9.0.jar:1.9.0]
    at
org.apache.drill.exec.store.hive.HiveMetadataProvider$1.run(HiveMetadataProvider.java:253)
~[drill-storage-hive-core-1.9.0.jar:1.9.0]
    at
org.apache.drill.exec.store.hive.HiveMetadataProvider$1.run(HiveMetadataProvider.java:241)
~[drill-storage-hive-core-1.9.0.jar:1.9.0]
    at java.security.AccessController.doPrivileged(Native Method)
~[na:1.8.0_72]
    at javax.security.auth.Subject.doAs(Subject.java:422) ~[na:1.8.0_72]
    at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
~[hadoop-common-2.7.1.jar:na]
    at
org.apache.drill.exec.store.hive.HiveMetadataProvider.splitInputWithUGI(HiveMetadataProvider.java:241)
~[drill-storage-hive-core-1.9.0.jar:1.9.0]
    at
org.apache.drill.exec.store.hive.HiveMetadataProvider.getPartitionInputSplits(HiveMetadataProvider.java:142)
~[drill-storage-hive-core-1.9.0.jar:1.9.0]
    at
org.apache.drill.exec.store.hive.HiveMetadataProvider.getStats(HiveMetadataProvider.java:105)
~[drill-storage-hive-core-1.9.0.jar:1.9.0]
    ... 46 common frames omitted
Caused by: java.util.concurrent.ExecutionException:
java.lang.NumberFormatException: For input string: "0000004_0000"
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
~[na:1.8.0_72]
    at java.util.concurrent.FutureTask.get(FutureTask.java:192)
~[na:1.8.0_72]
    at
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:998)
~[drill-hive-exec-shaded-1.9.0.jar:1.9.0]
    ... 55 common frames omitted
Caused by: java.lang.NumberFormatException: For input string: "0000004_0000"
    at
java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
~[na:1.8.0_72]
    at java.lang.Long.parseLong(Long.java:589) ~[na:1.8.0_72]
    at java.lang.Long.parseLong(Long.java:631) ~[na:1.8.0_72]
    at
org.apache.hadoop.hive.ql.io.AcidUtils.parseDelta(AcidUtils.java:310)
~[drill-hive-exec-shaded-1.9.0.jar:1.9.0]
    at
org.apache.hadoop.hive.ql.io.AcidUtils.getAcidState(AcidUtils.java:379)
~[drill-hive-exec-shaded-1.9.0.jar:1.9.0]
    at
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$FileGenerator.call(OrcInputFormat.java:634)
~[drill-hive-exec-shaded-1.9.0.jar:1.9.0]
    at
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$FileGenerator.call(OrcInputFormat.java:620)
~[drill-hive-exec-shaded-1.9.0.jar:1.9.0]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
~[na:1.8.0_72]
    ... 3 common frames omitted
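
A note on the innermost cause: the trace ends in AcidUtils.parseDelta calling Long.parseLong on "0000004_0000". One reading that fits is that the table directory contains a delta directory whose name has three numeric fields (for example delta_0000004_0000004_0000), while the Hive code shaded into Drill 1.9.0 expects only two. The Java sketch below reproduces that failure under this assumption. The directory name, the class name DeltaNameDemo and both method names are hypothetical, and the two-field parse is a simplified model inferred from the trace, not a copy of the Hive source.

```java
public class DeltaNameDemo {
    // Assumed prefix of ACID delta directories (illustrative).
    static final String DELTA_PREFIX = "delta_";

    // Simplified two-field parse: everything after the first '_' is
    // treated as a single number, so a third field breaks parseLong.
    static long[] parseTwoField(String dirName) {
        String rest = dirName.substring(DELTA_PREFIX.length());
        int split = rest.indexOf('_');
        long min = Long.parseLong(rest.substring(0, split));
        long max = Long.parseLong(rest.substring(split + 1));
        return new long[] {min, max};
    }

    // Tolerant variant: accepts an optional third field and
    // reports -1 when it is absent.
    static long[] parseWithOptionalThirdField(String dirName) {
        String[] parts = dirName.substring(DELTA_PREFIX.length()).split("_");
        long min = Long.parseLong(parts[0]);
        long max = Long.parseLong(parts[1]);
        long third = parts.length > 2 ? Long.parseLong(parts[2]) : -1L;
        return new long[] {min, max, third};
    }

    public static void main(String[] args) {
        // Two-field name parses cleanly.
        System.out.println(parseTwoField("delta_0000004_0000004")[1]);

        // Hypothetical three-field name fails the same way as the trace.
        try {
            parseTwoField("delta_0000004_0000004_0000");
        } catch (NumberFormatException e) {
            System.out.println(e.getMessage());
        }

        // The tolerant variant reads all three fields.
        System.out.println(
            parseWithOptionalThirdField("delta_0000004_0000004_0000")[2]);
    }
}
```

If this reading is right, listing the table location in HDFS should show delta directories with the extra suffix, which would point to a mismatch between the Hive version writing the table and the Hive classes bundled with Drill rather than to a problem with ORC support itself.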




Regards,
*Anup Tiwari*

On Thu, Jan 19, 2017 at 9:18 PM, Andries Engelbrecht <aengelbrecht@mapr.com>
wrote:

> I have not seen issues reading Hive ORC data with Drill.
>
>
> What is the DDL for the table in Hive?
>
>
> --Andries
>
> ________________________________
> From: Anup Tiwari <anup.tiwari@games24x7.com>
> Sent: Thursday, January 19, 2017 12:49:20 AM
> To: user@drill.apache.org
> Cc: dev@drill.apache.org
> Subject: Re: Storage Plugin for accessing Hive ORC Table from Drill
>
> We have created an ORC-format table in Hive and are trying to read it in
> Drill through the Hive plugin, but it gives us an error. With the same Hive
> plugin, we are able to read a Parquet table created in Hive.
>
> After searching a bit, I found a Drill documentation link
> <https://drill.apache.org/docs/apache-drill-contribution-ideas/> which says
> that we have to create a custom storage plugin to read ORC-format tables.
> Can you tell me how to create a custom storage plugin in this case?
>
>
>
> Regards,
> *Anup Tiwari*
>
> On Thu, Jan 19, 2017 at 1:55 PM, Nitin Pawar <nitinpawar432@gmail.com>
> wrote:
>
> > Do you want to use the ORC files created by Hive directly in Drill, or do
> > you want to use them through Hive?
> >
> > On Thu, Jan 19, 2017 at 1:40 PM, Anup Tiwari <anup.tiwari@games24x7.com>
> > wrote:
> >
> > > +Dev
> > >
> > > Can someone help me with this?
> > >
> > > Regards,
> > > *Anup Tiwari*
> > >
> > > On Sun, Jan 15, 2017 at 2:21 PM, Anup Tiwari <anup.tiwari@games24x7.com>
> > > wrote:
> > >
> > > > Hi Team,
> > > >
> > > > Can someone tell me how to configure a custom storage plugin in Drill
> > > > for accessing Hive ORC tables?
> > > >
> > > > Thanks in advance!!
> > > >
> > > > Regards,
> > > > *Anup Tiwari*
> > > >
> > >
> >
> >
> >
> > --
> > Nitin Pawar
> >
>
