beam-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Work logged] (BEAM-4726) Reduce ParDo per element Invoke overhead
Date Tue, 10 Jul 2018 15:43:01 GMT

     [ https://issues.apache.org/jira/browse/BEAM-4726?focusedWorklogId=121448&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-121448
]

ASF GitHub Bot logged work on BEAM-4726:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 10/Jul/18 15:42
            Start Date: 10/Jul/18 15:42
    Worklog Time Spent: 10m 
      Work Description: herohde commented on a change in pull request #5882: [BEAM-4726] Cache
fixed per function Invoke values
URL: https://github.com/apache/beam/pull/5882#discussion_r201388661
 
 

 ##########
 File path: sdks/go/pkg/beam/core/runtime/exec/fn.go
 ##########
 @@ -125,10 +125,144 @@ func Invoke(ctx context.Context, ws []typex.Window, ts typex.EventTime,
fn *func
 	return nil, nil
 }
 
+// InvokeWithoutEventTime runs the given function at time 0 in the global window.
 func InvokeWithoutEventTime(ctx context.Context, fn *funcx.Fn, opt *MainInput, extra ...interface{})
(*FullValue, error) {
 	return Invoke(ctx, window.SingleGlobalWindow, mtime.ZeroTimestamp, fn, opt, extra...)
 }
 
+// invoker is a container struct for hot path invocations of DoFns, to avoid
+// repeating fixed set up per element.
+type invoker struct {
+	fn   *funcx.Fn
+	args []interface{}
+	// TODO(lostluck):  2018/07/06 consider replacing with a slice of functions to run over
the args slice, as an improvement.
+	ctxIdx, wndIdx, etIdx int   // specialized input indexes
+	outEtIdx, errIdx      int   // specialized output indexes
+	in, out               []int // general indexes
+}
+
+func newInvoker(fn *funcx.Fn) *invoker {
+	n := &invoker{
+		fn:   fn,
+		args: make([]interface{}, len(fn.Param)),
+		in:   fn.Params(funcx.FnValue | funcx.FnIter | funcx.FnReIter | funcx.FnEmit),
+		out:  fn.Returns(funcx.RetValue),
+	}
+	var ok bool
+	if n.ctxIdx, ok = fn.Context(); !ok {
+		n.ctxIdx = -1
+	}
+	if n.wndIdx, ok = fn.Window(); !ok {
+		n.wndIdx = -1
+	}
+	if n.etIdx, ok = fn.EventTime(); !ok {
+		n.etIdx = -1
+	}
+	if n.outEtIdx, ok = fn.OutEventTime(); !ok {
+		n.outEtIdx = -1
+	}
+	if n.errIdx, ok = fn.Error(); !ok {
+		n.errIdx = -1
+	}
+	return n
+}
+
+// ClearArgs zeroes argument entries in the cached slice to allow values to be garbage collected
after the bundle ends.
+func (n *invoker) ClearArgs() {
 
 Review comment:
   nit: maybe name it something more generic like "Reset" to accommodate later improvements
that does more than caching arguments.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 121448)
    Time Spent: 3h 40m  (was: 3.5h)

> Reduce ParDo per element Invoke overhead
> ----------------------------------------
>
>                 Key: BEAM-4726
>                 URL: https://issues.apache.org/jira/browse/BEAM-4726
>             Project: Beam
>          Issue Type: Sub-task
>          Components: sdk-go
>            Reporter: Robert Burke
>            Assignee: Robert Burke
>            Priority: Major
>          Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> Each call to invoke allocates a new args []interface{}, but the common case is to run
the same ProcessElement function over and again. It should be possible to have a container
struct to retain the args slice, and avoid recomputing the indices for where to assign parameters
before calling the ProcessElementFn.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Mime
View raw message