hive-user mailing list archives

From: Jitendra Kumar Singh <jksingh26...@gmail.com>
Subject: Re: Hive Query having virtual column INPUT__FILE__NAME in where clause gives exception
Date: Mon, 17 Jun 2013 07:01:14 GMT
Thanks, guys, for the reply.

The following query also did not work:
hive> select count(*), filename from (select INPUT__FILE__NAME as filename
from netflow) tmp  where filename='vzb.1351794600.0' group by filename;
FAILED: SemanticException java.lang.RuntimeException: cannot find field
input__file__name from
[org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@1d264bf5,
org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@3d44d0c6

I forgot to mention that my table uses partitions.

Do you guys know any other way to filter files?
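
The only workaround I can think of so far is to copy just the files I want
into a separate HDFS directory and point a second external table at it, so
the query needs no file filter at all. A rough sketch of what I mean
(untested, and the vzb_selected directory name is made up):

# stage only the wanted file into a fresh directory
hadoop fs -mkdir /data/jk/vzb_selected
hadoop fs -cp /data/jk/vzb/vzb.1351794600.0 /data/jk/vzb_selected/

-- clone the table definition and point it at the staged directory
create external table netflow_selected like netflow;
alter table netflow_selected set location
  'hdfs://192.168.0.224:9000/data/jk/vzb_selected';
select count(*) from netflow_selected;

Since my table uses partitions, the set location step would presumably have
to become an add partition ... location per wanted partition instead.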

Thanks and Regards,
--
Jitendra Kumar Singh
Mobile: (+91) 9891314709


On Sat, Jun 15, 2013 at 12:33 PM, Navis류승우 <navis.ryu@nexr.com> wrote:

> Firstly, the exception seems to be
> https://issues.apache.org/jira/browse/HIVE-3926.
>
> Secondly, file selection on virtual columns (file name, etc.) is
> https://issues.apache.org/jira/browse/HIVE-1662.
>
> Neither of them is fixed yet.
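>
> As an untested idea (an assumption on my side, not something I have
> verified): your stack trace goes through the partition condition remover
> (PcrOpProcFactory), so disabling predicate pushdown for this query might
> sidestep the compile-time exception, though it will not make file
> filtering efficient and may just surface the still-open HIVE-1662
> limitation instead:
>
> -- sketch only, untested
> set hive.optimize.ppd=false;
> select count(*) from netflow where INPUT__FILE__NAME='vzb.1351794600.0';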
>
> 2013/6/14 Nitin Pawar <nitinpawar432@gmail.com>:
> > Jitendra,
> > I am really not sure you can use virtual columns in the where clause (I
> > have never tried it, so I may be wrong as well).
> >
> > Can you try executing your query as below?
> >
> > select count(*), filename from (select INPUT__FILE__NAME as filename from
> > netflow) tmp where filename='vzb.1351794600.0';
> >
> > Please check the query syntax; I am giving an idea and have not verified
> > the query.
> >
> >
> > On Fri, Jun 14, 2013 at 4:57 PM, Jitendra Kumar Singh
> > <jksingh26jun@gmail.com> wrote:
> >>
> >> Hi Guys,
> >>
> >> Executing a hive query with a filter on the virtual column
> >> INPUT__FILE__NAME results in the following exception.
> >>
> >> hive> select count(*) from netflow where
> >> INPUT__FILE__NAME='vzb.1351794600.0';
> >>
> >> FAILED: SemanticException java.lang.RuntimeException: cannot find field
> >> input__file__name from
> >> [org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@1d264bf5,
> >> org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@3d44d0c6,
> >> .
> >> .
> >> .
> >> org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@7e6bc5aa]
> >>
> >> This error is different from the one we get when the column name is wrong:
> >>
> >> hive> select count(*) from netflow where
> >> INPUT__FILE__NAM='vzb.1351794600.0';
> >>
> >> FAILED: SemanticException [Error 10004]: Line 1:35 Invalid table alias or
> >> column reference 'INPUT__FILE__NAM': (possible column names are: first,
> >> last, ....)
> >>
> >> But using this virtual column in the select clause works fine:
> >>
> >> hive> select INPUT__FILE__NAME from netflow group by INPUT__FILE__NAME;
> >>
> >> Total MapReduce jobs = 1
> >>
> >> Launching Job 1 out of 1
> >>
> >> Number of reduce tasks not specified. Estimated from input data size: 4
> >>
> >> In order to change the average load for a reducer (in bytes):
> >>
> >>   set hive.exec.reducers.bytes.per.reducer=<number>
> >>
> >> In order to limit the maximum number of reducers:
> >>
> >>   set hive.exec.reducers.max=<number>
> >>
> >> In order to set a constant number of reducers:
> >>
> >>   set mapred.reduce.tasks=<number>
> >>
> >> Starting Job = job_201306041359_0006, Tracking URL =
> >> http://192.168.0.224:50030/jobdetails.jsp?jobid=job_201306041359_0006
> >>
> >> Kill Command = /opt/hadoop/bin/../bin/hadoop job  -kill
> >> job_201306041359_0006
> >>
> >> Hadoop job information for Stage-1: number of mappers: 12; number of
> >> reducers: 4
> >>
> >> 2013-06-14 18:20:10,265 Stage-1 map = 0%,  reduce = 0%
> >>
> >> 2013-06-14 18:20:33,363 Stage-1 map = 8%,  reduce = 0%
> >>
> >> .
> >>
> >> .
> >>
> >> .
> >>
> >> 2013-06-14 18:21:15,554 Stage-1 map = 100%,  reduce = 100%
> >>
> >> Ended Job = job_201306041359_0006
> >>
> >> MapReduce Jobs Launched:
> >>
> >> Job 0: Map: 12  Reduce: 4   HDFS Read: 3107826046 HDFS Write: 55 SUCCESS
> >>
> >> Total MapReduce CPU Time Spent: 0 msec
> >>
> >> OK
> >>
> >> hdfs://192.168.0.224:9000/data/jk/vzb/vzb.1351794600.0
> >>
> >> Time taken: 78.467 seconds
> >>
> >> I am trying to create an external Hive table on data already present in
> >> HDFS, and the folder contains extra files that I want to ignore. This is
> >> similar to what is asked and suggested in the following Stack Overflow
> >> questions: "how to make hive take only specific files as input from hdfs
> >> folder" and "when creating an external table in hive can I point the
> >> location to specific files in a directory?"
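> >>
> >> For reference, the table is created roughly like this (a simplified
> >> sketch, not my exact DDL; the real column list is longer and the row
> >> format shown here is only illustrative):
> >>
> >> create external table netflow (first string, last string)
> >>   row format delimited fields terminated by ','
> >>   location 'hdfs://192.168.0.224:9000/data/jk/vzb';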
> >>
> >> Any help would be appreciated. The full stack trace I am getting is as
> >> follows:
> >>
> >> 2013-06-14 15:01:32,608 ERROR ql.Driver
> >> (SessionState.java:printError(401)) - FAILED: SemanticException
> >> java.lang.RuntimeException: cannot find field input__
> >>
> >> org.apache.hadoop.hive.ql.parse.SemanticException:
> >> java.lang.RuntimeException: cannot find field input__file__name from
> >> [org.apache.hadoop.hive.serde2.object
> >>         at org.apache.hadoop.hive.ql.optimizer.pcr.PcrOpProcFactory$FilterPCR.process(PcrOpProcFactory.java:122)
> >>         at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)
> >>         at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:87)
> >>         at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:124)
> >>         at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:101)
> >>         at org.apache.hadoop.hive.ql.optimizer.pcr.PartitionConditionRemover.transform(PartitionConditionRemover.java:86)
> >>         at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:102)
> >>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:8163)
> >>         at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:258)
> >>         at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:50)
> >>         at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:258)
> >>         at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:431)
> >>         at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:335)
> >>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:893)
> >>         at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
> >>         at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
> >>         at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:412)
> >>         at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:755)
> >>         at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:613)
> >>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>         at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
> >>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
> >>         at java.lang.reflect.Method.invoke(Unknown Source)
> >>         at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> >> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException:
> >> java.lang.RuntimeException: cannot find field input__file__name from
> >> [org.apache.hadoop.hive.ser
> >>         at org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner.prune(PartitionPruner.java:231)
> >>         at org.apache.hadoop.hive.ql.optimizer.pcr.PcrOpProcFactory$FilterPCR.process(PcrOpProcFactory.java:112)
> >>         ... 23 more
> >> Caused by: java.lang.RuntimeException: cannot find field input__file__name
> >> from [org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyF
> >>         at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:344)
> >>         at org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector.getStructFieldRef(UnionStructObjectInspector.java:100)
> >>         at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.init
> >>
> >>
> >> Thanks and Regards,
> >> --
> >> Jitendra Kumar Singh
> >> Mobile: (+91) 9891314709
> >
> >
> >
> >
> > --
> > Nitin Pawar
>
