beam-commits mailing list archives

From "Damien GOUYETTE (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (BEAM-2470) Inconsistent behavior on the functioning of the dataflow templates?
Date Tue, 20 Jun 2017 09:18:00 GMT

     [ https://issues.apache.org/jira/browse/BEAM-2470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Damien GOUYETTE updated BEAM-2470:
----------------------------------
    Description: 
When I create a Dataflow template, the runtime parameters are not persisted in the template file. At run time, if I try to pass a value for one of these parameters, I get a 400 error.

I'm using Scio 0.3.2 and Scala 2.11.11 with Apache Beam 0.6.

My parameters are the following:


{code:java}
import org.apache.beam.sdk.options.{PipelineOptions, ValueProvider}

trait MyParameters extends PipelineOptions {
  // A ValueProvider-typed option stays unresolved until the pipeline actually
  // runs, which is what allows it to become a runtime parameter of a template.
  def getInput: ValueProvider[String]
  def setInput(value: ValueProvider[String]): Unit
}
{code}
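For context (this is not code from the report; the class and option names are made up), the usual Beam pattern is that a ValueProvider option is only read with get() at processing time, for example inside a DoFn, so the value never has to exist while the template is being built:

{code:java}
// Sketch only: reading a ValueProvider at processing time, the pattern Dataflow
// templates rely on. PrefixFn and prefix are illustrative names.
import org.apache.beam.sdk.options.ValueProvider
import org.apache.beam.sdk.transforms.DoFn
import org.apache.beam.sdk.transforms.DoFn.ProcessElement

class PrefixFn(prefix: ValueProvider[String]) extends DoFn[String, String] {
  @ProcessElement
  def processElement(c: DoFn[String, String]#ProcessContext): Unit =
    // get() is deferred until the job runs, so the option can stay a runtime
    // parameter instead of being baked into the template.
    c.output(prefix.get() + c.element())
}
{code}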

They are registered with this code:


{code:java}
val options = PipelineOptionsFactory.fromArgs(cmdlineArgs: _*).withValidation().as[XmlImportJobParameters](classOf[XmlImportJobParameters])
PipelineOptionsFactory.register(classOf[XmlImportJobParameters])
implicit val (sc, args) = ContextAndArgs(cmdlineArgs)
{code}
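Note that the `options` value built above is not passed to `ContextAndArgs`, which parses `cmdlineArgs` again into its own `PipelineOptions` (the report also defines the trait as `MyParameters` but registers `XmlImportJobParameters`; the sketch below keeps the registered name). A minimal variant that registers the interface first, parses the args once, and hands the parsed custom options straight to the context would look like this. It is only a sketch assuming Scio's `ScioContext(options)` constructor; whether it changes what ends up in the template file is exactly the question of this issue:

{code:java}
// Sketch only (not from the report): one options object, built from the
// registered custom interface, reaches both validation and the ScioContext.
import com.spotify.scio.ScioContext
import org.apache.beam.sdk.options.PipelineOptionsFactory

PipelineOptionsFactory.register(classOf[XmlImportJobParameters])
val options = PipelineOptionsFactory
  .fromArgs(cmdlineArgs: _*)
  .withValidation()
  .as(classOf[XmlImportJobParameters])
val sc = ScioContext(options)
{code}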

To create the template, I call sbt with these parameters:


{code:java}
run-main jobs.XmlImportJob --runner=DataflowRunner --project=MyProject --templateLocation=gs://myBucket/XmlImportTemplate --tempLocation=gs://myBucket/staging --instance=myInstance
{code}

If I pass --input explicitly, it becomes a StaticValueProvider instead of a RuntimeValueProvider, and this time I can see it in the template file.
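For reference, the two flavours being contrasted here are Beam's StaticValueProvider and RuntimeValueProvider. A minimal illustration (not taken from the job; the bucket path is made up):

{code:java}
import org.apache.beam.sdk.options.ValueProvider
import org.apache.beam.sdk.options.ValueProvider.StaticValueProvider

// A value supplied at template-construction time is wrapped in a
// StaticValueProvider and serialized into the template file...
val fixedInput: ValueProvider[String] =
  StaticValueProvider.of("gs://myBucket/some-fixed-input.xml")

// ...whereas an option left unset stays a RuntimeValueProvider, whose get()
// only works once the job is running (isAccessible is false until then).
{code}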

The template is called from a Google Cloud Function watching a Cloud Storage bucket (inspired by https://shinesolutions.com/2017/03/23/triggering-dataflow-pipelines-with-cloud-functions/):


{code:java}
...
dataflow.projects.templates.create({
                projectId: projectId,
                resource: {
                    parameters: {
                        input: `gs://${file.bucket}/${file.name}`
                    },
                    jobName: jobs[job].name,
                    gcsPath: 'gs://MyBucket/MyTemplate'
                }
            }
...
{code}

The 400 error:



{code:java}
problem running dataflow template, error was: { Error: (109c1c52dc52fec7): The workflow could not be created.
Causes: (109c1c52dc52fb8e): Found unexpected parameters: ['input' (perhaps you meant 'runner')]
    at Request._callback (/user_code/node_modules/googleapis/node_modules/google-auth-library/lib/transporters.js:85:15)
    at Request.self.callback (/user_code/node_modules/googleapis/node_modules/request/request.js:188:22)
    at emitTwo (events.js:106:13)
    at Request.emit (events.js:191:7)
    at Request.<anonymous> (/user_code/node_modules/googleapis/node_modules/request/request.js:1171:10)
    at emitOne (events.js:96:13)
    at Request.emit (events.js:188:7)
    at IncomingMessage.<anonymous> (/user_code/node_modules/googleapis/node_modules/request/request.js:1091:12)
    at IncomingMessage.g (events.js:291:16)
    at emitNone (events.js:91:20)
  code: 400,
  errors: [ { message: '(109c1c52dc52fec7): The workflow could not be created. Causes: (109c1c52dc52fb8e): Found unexpected parameters: [\'input\' (perhaps you meant \'runner\')]',
      domain: 'global',
      reason: 'badRequest' } ] }
{code}





> Inconsistent behavior on the functioning of the dataflow templates?
> -------------------------------------------------------------------
>
>                 Key: BEAM-2470
>                 URL: https://issues.apache.org/jira/browse/BEAM-2470
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-core
>    Affects Versions: 0.6.0
>            Reporter: Damien GOUYETTE
>            Assignee: Kenneth Knowles
>



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
