hive-user mailing list archives

From "Markovitz, Dudu" <dmarkov...@paypal.com>
Subject RE: Query fails if condition placed on Parquet struct field
Date Tue, 03 May 2016 20:25:59 GMT
Hi

Can you send the execution plans of both versions?
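
In Hive, a plan can be obtained by prefixing a query with EXPLAIN. A sketch for the two versions discussed in this thread (the failing Example 2 and the subquery rewrite), reusing the table and column names from the original message:

```sql
-- Plan for the failing query (Example 2 in the original message)
EXPLAIN
SELECT device.user_agent
FROM sometable
WHERE ds >= '2016-03-30 00' AND ds <= '2016-03-30 01'
  AND device.user_agent LIKE 'Mozilla%'
LIMIT 1;

-- Plan for the subquery rewrite that runs successfully
EXPLAIN
SELECT a.user_agent
FROM (
  SELECT device.user_agent AS user_agent
  FROM sometable
  WHERE ds >= '2016-03-30 00' AND ds <= '2016-03-30 01'
) a
WHERE a.user_agent LIKE 'Mozilla%'
LIMIT 1;
```

Comparing the two outputs side by side should show where the filter on device.user_agent is applied in each plan.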

Thanks

Dudu

From: Jose Rozanec [mailto:jose.rozanec@mercadolibre.com]
Sent: Tuesday, May 03, 2016 11:13 PM
To: Haas, Nichole <Nichole.Haas@concur.com>
Cc: user@hive.apache.org
Subject: Re: Query fails if condition placed on Parquet struct field

Hi!

It is not due to memory allocation. I found that I am able to run the query successfully if
I rewrite it as:

SELECT a.user_agent FROM (SELECT device.user_agent AS user_agent FROM sometable
WHERE ds >= '2016-03-30 00' AND ds <= '2016-03-30 01') a
WHERE a.user_agent LIKE 'Mozilla%' LIMIT 1;

The number of mappers and the execution time are almost the same, but this way the query
executes successfully and returns results.
Any ideas why this may happen?



2016-05-03 17:02 GMT-03:00 Haas, Nichole <Nichole.Haas@concur.com>:
What are your memory allocations set to?  When using something as expensive as LIKE and a date
range together, I often have to increase my standard memory allocation.

Try changing your memory allocation settings to:
Key: mapreduce.map.memory.mb Value: 2048
Key: mapreduce.map.java.opts Value: -Xmx1500m

In HUE, you enter these manually on the settings tab.  I’m unsure how to set them from the
command line.
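
For a Hive CLI or Beeline session, the same two properties can be set with SET statements before running the query. A minimal sketch using the values suggested above (whether these values suit a given cluster is an assumption to verify):

```sql
-- Container memory requested for each map task, in MB
SET mapreduce.map.memory.mb=2048;
-- JVM heap for the map task; kept below the container size
SET mapreduce.map.java.opts=-Xmx1500m;
```

Settings applied this way last only for the current session.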


From: Jose Rozanec <jose.rozanec@mercadolibre.com>
Reply-To: "user@hive.apache.org" <user@hive.apache.org>
Date: Tuesday, May 3, 2016 at 12:45 PM
To: "user@hive.apache.org" <user@hive.apache.org>
Subject: Query fails if condition placed on Parquet struct field

Hello,

We are running queries on Hive against Parquet files.
In the schema definition, we have a Parquet struct called device with a string field user_agent.

If we run the query from Example 1, it returns results as expected.
If we run the query from Example 2, execution fails and exits with an error.

Did anyone face a similar case?

Thanks!

Example 1:
SELECT device.user_agent FROM sometable
WHERE ds >= '2016-03-30 00' AND ds <= '2016-03-30 01' LIMIT 1;

Example 2:
SELECT device.user_agent FROM sometable
WHERE ds >= '2016-03-30 00' AND ds <= '2016-03-30 01'
AND device.user_agent LIKE 'Mozilla%' LIMIT 1;


The error and trace we get is:

Exception from container-launch.
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
Container exited with a non-zero exit code 1

Stack trace: ExitCodeException exitCode=1:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
at org.apache.hadoop.util.Shell.run(Shell.java:456)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)


