hive-issues mailing list archives

From "Rajesh Balamohan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-13189) Consider using Joda DateTimeFormatter instead of SimpleDateFormat in GenericUDFDateAdd
Date Tue, 01 Mar 2016 15:58:18 GMT

    [ https://issues.apache.org/jira/browse/HIVE-13189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15173947#comment-15173947 ]

Rajesh Balamohan commented on HIVE-13189:
-----------------------------------------

JMH comparison of SimpleDateFormat vs. Joda DateTimeFormatter (average time per operation, single thread, JDK 1.8.0_05):

{noformat}
# JMH 1.11.2 (released 124 days ago, please consider updating!)
# VM version: JDK 1.8.0_05, VM 25.5-b02
# VM invoker: /Library/Java/JavaVirtualMachines/jdk1.8.0_05.jdk/Contents/Home/jre/bin/java
# VM options: <none>
# Warmup: 5 iterations, 10 s each
# Measurement: 5 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: org.apache.jmh.TestJodaVsSimpleDateFormat.testWithJodaTime

# Run progress: 0.00% complete, ETA 00:03:20
# Fork: 1 of 1
# Warmup Iteration   1: 395.761 ns/op
# Warmup Iteration   2: 396.304 ns/op
# Warmup Iteration   3: 388.342 ns/op
# Warmup Iteration   4: 407.058 ns/op
# Warmup Iteration   5: 392.305 ns/op
Iteration   1: 387.758 ns/op
Iteration   2: 419.816 ns/op
Iteration   3: 444.825 ns/op
Iteration   4: 435.538 ns/op
Iteration   5: 431.213 ns/op


Result "testWithJodaTime":
  423.830 ±(99.9%) 85.014 ns/op [Average]
  (min, avg, max) = (387.758, 423.830, 444.825), stdev = 22.078
  CI (99.9%): [338.817, 508.844] (assumes normal distribution)


# JMH 1.11.2 (released 124 days ago, please consider updating!)
# VM version: JDK 1.8.0_05, VM 25.5-b02
# VM invoker: /Library/Java/JavaVirtualMachines/jdk1.8.0_05.jdk/Contents/Home/jre/bin/java
# VM options: <none>
# Warmup: 5 iterations, 10 s each
# Measurement: 5 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: org.apache.jmh.TestJodaVsSimpleDateFormat.testWithSimpleDateFormat

# Run progress: 50.00% complete, ETA 00:01:40
# Fork: 1 of 1
# Warmup Iteration   1: 847.271 ns/op
# Warmup Iteration   2: 839.440 ns/op
# Warmup Iteration   3: 840.931 ns/op
# Warmup Iteration   4: 819.619 ns/op
# Warmup Iteration   5: 838.692 ns/op
Iteration   1: 845.421 ns/op
Iteration   2: 857.534 ns/op
Iteration   3: 857.405 ns/op
Iteration   4: 810.189 ns/op
Iteration   5: 808.703 ns/op


Result "testWithSimpleDateFormat":
  835.850 ±(99.9%) 94.750 ns/op [Average]
  (min, avg, max) = (808.703, 835.850, 857.534), stdev = 24.606
  CI (99.9%): [741.101, 930.600] (assumes normal distribution)


# Run complete. Total time: 00:03:20

Benchmark                                            Mode  Cnt    Score    Error  Units
TestJodaVsSimpleDateFormat.testWithJodaTime          avgt    5  423.830 ± 85.014  ns/op
TestJodaVsSimpleDateFormat.testWithSimpleDateFormat  avgt    5  835.850 ± 94.750  ns/op
{noformat}
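
To put the two scores side by side, the arithmetic can be sketched as below. The class and method names (BenchmarkSummary, speedup, intervalsOverlap) are hypothetical and not part of the benchmark; the numbers are copied from the JMH output above. The ratio comes out at roughly 1.97x in favour of Joda, and the two 99.9% confidence intervals do not overlap. This is a single fork with five measurement iterations, so the figures are probably best read as indicative rather than precise.

```java
import java.util.Locale;

final class BenchmarkSummary {

    // Ratio of baseline time to candidate time; values above 1 mean the candidate is faster.
    static double speedup(double baselineNsPerOp, double candidateNsPerOp) {
        return baselineNsPerOp / candidateNsPerOp;
    }

    // True when the closed intervals [lowA, highA] and [lowB, highB] share at least one point.
    static boolean intervalsOverlap(double lowA, double highA, double lowB, double highB) {
        return lowA <= highB && lowB <= highA;
    }

    public static void main(String[] args) {
        // Scores (ns/op) taken from the JMH summary table.
        double jodaNsPerOp = 423.830;
        double simpleDateFormatNsPerOp = 835.850;

        System.out.println(String.format(Locale.ROOT, "speedup: %.2fx",
                speedup(simpleDateFormatNsPerOp, jodaNsPerOp)));

        // 99.9% confidence intervals reported by JMH for each benchmark.
        System.out.println("CIs overlap: "
                + intervalsOverlap(338.817, 508.844, 741.101, 930.600));
    }
}
```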

> Consider using Joda DateTimeFormatter instead of SimpleDateFormat in GenericUDFDateAdd
> --------------------------------------------------------------------------------------
>
>                 Key: HIVE-13189
>                 URL: https://issues.apache.org/jira/browse/HIVE-13189
>             Project: Hive
>          Issue Type: Improvement
>          Components: Hive
>            Reporter: Rajesh Balamohan
>
> Tasks spent a significant amount of time parsing date strings in GenericUDFDateAdd.
> {noformat}
>   java.lang.Thread.State: RUNNABLE
>         at java.text.DecimalFormat.subparse(DecimalFormat.java:1467)
>         at java.text.DecimalFormat.parse(DecimalFormat.java:1268)
>         at java.text.SimpleDateFormat.subParse(SimpleDateFormat.java:2088)
>         at java.text.SimpleDateFormat.parse(SimpleDateFormat.java:1455)
>         at java.text.DateFormat.parse(DateFormat.java:355)
>         at org.apache.hadoop.hive.ql.udf.generic.GenericUDFDateAdd.evaluate(GenericUDFDateAdd.java:172)
>         at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:186)
>         at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
>         at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator$DeferredExprObject.get(ExprNodeGenericFuncEvaluator.java:87)
>         at org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPGreaterThan.evaluate(GenericUDFOPGreaterThan.java:80)
>         at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:186)
>         at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
>         at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
>         at org.apache.hadoop.hive.ql.exec.FilterOperator.process(FilterOperator.java:108)
>         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:838)
>         at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.internalForward(CommonJoinOperator.java:644)
> {noformat}
> Joda DateTimeFormatter can be considered for better performance.
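
A minimal sketch of the kind of change being proposed, under some loud assumptions: it is not the Hive code and not a patch for this issue. To stay runnable on a plain JDK 8 it uses java.time.format.DateTimeFormatter as a stand-in for Joda's DateTimeFormatter (both are immutable and thread-safe, but the APIs are not identical). The "yyyy-MM-dd" pattern and the names DateAddSketch, addDaysLegacy and addDaysImmutable are also assumptions made for illustration.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Calendar;

final class DateAddSketch {

    // SimpleDateFormat is mutable and not thread-safe, so each thread keeps its own instance.
    // The pattern is an assumption; the issue does not state which one Hive uses.
    private static final ThreadLocal<SimpleDateFormat> LEGACY_FORMAT =
            ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd"));

    // Immutable, thread-safe formatter; stand-in for a shared Joda DateTimeFormatter.
    private static final DateTimeFormatter ISO_DATE = DateTimeFormatter.ISO_LOCAL_DATE;

    // Legacy path: parse with SimpleDateFormat, shift with Calendar, format back.
    static String addDaysLegacy(String date, int days) {
        SimpleDateFormat format = LEGACY_FORMAT.get();
        try {
            Calendar calendar = Calendar.getInstance();
            calendar.setTime(format.parse(date));
            calendar.add(Calendar.DAY_OF_MONTH, days);
            return format.format(calendar.getTime());
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a yyyy-MM-dd date: " + date, e);
        }
    }

    // Alternative path: parse, shift and format with immutable value types.
    static String addDaysImmutable(String date, int days) {
        return LocalDate.parse(date, ISO_DATE).plusDays(days).format(ISO_DATE);
    }

    public static void main(String[] args) {
        System.out.println(addDaysLegacy("2016-02-28", 2));
        System.out.println(addDaysImmutable("2016-02-28", 2));
    }
}
```

Both methods should return the same string for well-formed input; any actual change to GenericUDFDateAdd would still need its own benchmark and a check of how malformed or timestamp-style inputs are handled.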



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
