nifi-issues mailing list archives

From "Pierre Villard (Jira)" <j...@apache.org>
Subject [jira] [Commented] (NIFI-7247) Unable to execute SQL
Date Thu, 12 Mar 2020 08:53:00 GMT

    [ https://issues.apache.org/jira/browse/NIFI-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17057720#comment-17057720 ]

Pierre Villard commented on NIFI-7247:
--------------------------------------

I'm not really surprised that the first run takes a long time, since the connection
is initialized/created in the pool at that point. This JIRA is not clear to me.
What you're saying is:
 * the first execution works but takes a long time
 * subsequent executions don't work (with the error shown in the attachment?)
 * after disabling/enabling the controller service, the next execution works again and
the following ones fail

Is that correct?
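
To illustrate the point about pool initialization, here is a minimal Python sketch of a lazily initialized connection pool. It is a toy model, not how DBCPConnectionPool is implemented: the names FakeConnection, LazyPool, run_query and reset are hypothetical, and treating a disable/enable of the controller service as "discard the idle connections" is an assumption. It shows only that the first borrow has to create a connection, that later borrows reuse it, and that a reset forces a new one to be created.

```python
import itertools


class FakeConnection:
    """Stand-in for a JDBC connection; a real one would open a network session."""

    _ids = itertools.count(1)

    def __init__(self):
        # Each newly created connection gets a distinct id.
        self.id = next(FakeConnection._ids)


class LazyPool:
    """Toy pool: connections are created on first borrow, then reused."""

    def __init__(self):
        self._idle = []
        self.created = 0  # how many times the expensive creation step ran

    def borrow(self):
        if self._idle:
            # Cheap path: reuse an idle connection.
            return self._idle.pop()
        # Expensive path: nothing idle, so a new connection must be created.
        self.created += 1
        return FakeConnection()

    def give_back(self, conn):
        self._idle.append(conn)

    def reset(self):
        """Discard idle connections (assumed effect of disabling/enabling the service)."""
        self._idle.clear()


def run_query(pool):
    """Borrow a connection, pretend to run a query, and return the connection's id."""
    conn = pool.borrow()
    try:
        return conn.id
    finally:
        pool.give_back(conn)


pool = LazyPool()
first = run_query(pool)   # creates a connection (slow in a real setup)
second = run_query(pool)  # reuses the same connection (fast)
pool.reset()              # simulated disable/enable of the service
third = run_query(pool)   # has to create a connection again
print(first == second, third != first, pool.created)
```

In this model only the first and third calls pay the creation cost, which would match a slow first run followed by a fast one. It does not explain why later executions would fail.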

> Unable to execute SQL
> ---------------------
>
>                 Key: NIFI-7247
>                 URL: https://issues.apache.org/jira/browse/NIFI-7247
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>    Affects Versions: 1.11.3
>         Environment: containerized environment on EC2 (amzn2-ami-hvm-2.0.20191116.0-x86_64-gp2)
>            Reporter: Martin
>            Priority: Major
>              Labels: databricks, delta, jdbc, sql
>             Fix For: 1.11.4
>
>         Attachments: Error in UI.jpg, flow_as_template.xml, nifi-app.log
>
>
> Scenario:
> We use ExecuteSQL to read Delta tables (stored in S3) via a JDBC connection to Databricks.
>  
> Temporary Fix:
> If we deactivate and reactivate the controller service, ExecuteSQL works without
> problems. What is noticeable, however, is that the first execution afterwards takes quite
> a long time, while the next one completes within 3 seconds.
>  
> Background information:
>  * How to use Databricks JDBC: [https://docs.databricks.com/integrations/bi/jdbc-odbc-bi.html]
>  * Controller Service DBCPConnectionPool 1.11.3
>  ** URL: jdbc:spark://#{databricks.host}...{databricks.cluster.id};...;PWD=#{databricks.token}
>  ** Driver Class: com.simba.spark.jdbc.Driver
>  * Table
>  ** one column with <20 entries
>  ** Created By Spark 2.4.4
>  ** Type MANAGED
>  ** Provider delta
>  ** Location s3
>  ** Serde Library org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>  ** InputFormat org.apache.hadoop.mapred.SequenceFileInputFormat
>  ** OutputFormat org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
>  * SQL 
>  ** SELECT * FROM "${db.table.schema}"."${db.table.name}"
>  ** output <20 entries
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
