oodt-dev mailing list archives

From "Mattmann, Chris A (388J)" <chris.a.mattm...@jpl.nasa.gov>
Subject Re: passing objects between tasks / workflows
Date Fri, 14 Dec 2012 21:24:34 GMT
Hey Lindsey,

Thanks for your email! Some comments below:

On 12/14/12 7:26 AM, "Lindsey Davis" <ldavis@nrao.edu> wrote:

>Hello,
>
>I work on the NRAO ALMA pipeline software. I have recently begun developing
>some prototype OODT workflows for use in the production pipeline.

Great to hear!

>
>I would like to be able to pass an object between workflows and tasks. The
>object is modest in size, serializable in the Java sense, and can be encoded
>to and decoded from a string. XML is not a useful option here.

There are a few different ways to do this, listed below from lowest to
highest level of complexity:

1. All Tasks within the same Workflow have the capability to pass
information to one another via a Shared workflow context, provided to the
Task during execution by this interface:

http://oodt.apache.org/components/maven/xref/org/apache/oodt/cas/workflow/structs/WorkflowTaskInstance.html#44


The Metadata object provided there is readable and writable by all of the
WorkflowTasks in a particular WorkflowInstance. So, concretely, you could
set an object that you want the downstream tasks to have access to by
doing the following (within your WorkflowTaskInstance):

/* first workflow task */
String serializedObj = yourSerializationFunc(obj);
/* arguments: key name, then value */
metadata.replaceMetadata("objectKey", serializedObj);

Then:

/* downstream workflow task */

String serializedObj = metadata.getMetadata(/* key name */ "objectKey");
// deserialize it, do something with it, etc.
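
For what it's worth, here is a minimal sketch of what yourSerializationFunc and its inverse might look like for a java.io.Serializable object, using standard Java serialization plus Base64 so the result is a plain string. The class name ObjectPassingSketch, the helpers toBase64 and fromBase64, and the HashMap standing in for the shared workflow Metadata are all assumptions made for illustration; none of them are part of OODT, and java.util.Base64 needs Java 8 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

public class ObjectPassingSketch {

    /** Hypothetical helper: encode any Serializable object as a Base64 string. */
    public static String toBase64(Serializable obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }

    /** Hypothetical helper: decode a string produced by toBase64 back into an object. */
    public static Object fromBase64(String encoded)
            throws IOException, ClassNotFoundException {
        byte[] raw = Base64.getDecoder().decode(encoded);
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the shared workflow Metadata; NOT the OODT class.
        Map<String, String> sharedContext = new HashMap<>();

        // "First task": serialize the object and store it under a key.
        ArrayList<String> payload = new ArrayList<>(Arrays.asList("scan-1", "scan-2"));
        sharedContext.put("objectKey", toBase64(payload));

        // "Downstream task": read the string back and rebuild the object.
        Object restored = fromBase64(sharedContext.get("objectKey"));
        System.out.println("round trip ok: " + restored.equals(payload));
    }
}
```

In a real task you would swap the map for the Metadata calls shown above. Base64 adds roughly a third to the size of the serialized bytes, which may matter if the object grows.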

2. If your serialized object makes more sense as a file, then you could
simply ingest that object separately (as part of an upstream workflow
task, or via some external ingestion process) into the File Manager:

http://oodt.apache.org/components/maven/filemgr/user/basic.html


Once that file is ingested, it can be referenced automatically using
CAS-PGE:

https://cwiki.apache.org/OODT/cas-pge-learn-by-example.html


There are other ways to do this, but try starting out with #1 or #2.


>
>Does OODT provide any builtin support for this? If not are there any OODT
>imposed limits on the size of strings that can be passed around, or
>any other related issues I should be aware of.

Yep, see above. As for limits on the size of strings you can pass around,
that is dictated by the underlying JVMs that the OODT daemons run inside
of. If you are running Workflow Manager (without Resource Manager), then
the combined memory footprint of all running workflow instances (the sum
of each instance's footprint) has to fit within the heap set by the -Xms
and -Xmx args sent to the JVM. If you are running in Resource Manager
mode, then you are limited not only by the above values for WM, but also
by the same JVM args sent to RM, and ultimately by the JVM args passed to
each BatchStub that executes the underlying job on a compute node.
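
As a rough illustration of that heap constraint, the sketch below compares an estimated total footprint against the maximum heap the current JVM reports. The class name HeapBudgetSketch, the helper fitsInHeap, and the instance count and per-instance size are all hypothetical numbers picked for the example; the check also ignores whatever else the daemon keeps on the heap, so treat it as a back-of-the-envelope estimate only.

```java
public class HeapBudgetSketch {

    /**
     * Hypothetical helper: true if instanceCount workflow instances of
     * perInstanceBytes each stay below maxHeapBytes.
     */
    public static boolean fitsInHeap(long instanceCount,
                                     long perInstanceBytes,
                                     long maxHeapBytes) {
        // multiplyExact throws ArithmeticException on overflow instead of wrapping.
        return Math.multiplyExact(instanceCount, perInstanceBytes) < maxHeapBytes;
    }

    public static void main(String[] args) {
        // Maximum heap this JVM will try to use (governed by -Xmx).
        long maxHeap = Runtime.getRuntime().maxMemory();

        long instances = 200;               // assumed number of running instances
        long perInstance = 512L * 1024;     // assumed 512 KiB of metadata per instance

        System.out.println("max heap bytes: " + maxHeap);
        System.out.println("fits: " + fitsInHeap(instances, perInstance, maxHeap));
    }
}
```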

Hope that helps!

If you have more questions, keep 'em coming!

Cheers,
Chris

