apex-dev mailing list archives

From: Bhupesh Chawda <bhup...@datatorrent.com>
Subject: Including meta data with input tuples
Date: Wed, 18 Nov 2015 04:20:07 GMT

Hi All,

In the design of input modules, we are running into situations where we
need to pass some meta data to the downstream modules in addition to the
actual data. Further, this meta data may need to be sent per record. One
example is sending a record together with the name of the file from which
the record was read. Another example is sending the Kafka topic along with
each message.

We are exploring the following options:

   1. Include the meta information in the data schema, so that the parser
   handles it as regular data. This involves changing the schema of the
   data.
   2. Keep meta data separate from the data, and modify the behaviour of
   the parser / converter so that it handles meta data separately as well.
   3. Use additional ports to transfer the meta data, depending on the
   module.
   4. Any other option.
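
As a rough illustration of option 2, here is a minimal sketch in plain Java of a tuple that carries meta data next to the payload, with a parser that reads only the payload and passes the meta data through untouched. All names (MetaTupleSketch, Envelope, parseCsv, the "fileName" key) are hypothetical and not part of the Apex API; it only shows the shape of the idea.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical names throughout; this is not Apex API code.
final class MetaTupleSketch {

    /** A tuple with two parts: the payload and its per-record meta data. */
    static final class Envelope<T> {
        private final T data;
        private final Map<String, String> meta;

        Envelope(T data, Map<String, String> meta) {
            this.data = data;
            // Defensive, read-only copy so later stages cannot mutate the meta data.
            this.meta = Collections.unmodifiableMap(new HashMap<>(meta));
        }

        T getData() {
            return data;
        }

        Map<String, String> getMeta() {
            return meta;
        }

        /** Replaces the payload while carrying the same meta data forward. */
        <R> Envelope<R> withData(R newData) {
            return new Envelope<>(newData, meta);
        }
    }

    /** A parser that looks only at the payload and ignores the meta data. */
    static Envelope<List<String>> parseCsv(Envelope<String> in) {
        // Limit -1 keeps trailing empty fields.
        List<String> fields = Arrays.asList(in.getData().split(",", -1));
        return in.withData(fields);
    }
}
```

With this shape the schema of the data itself stays unchanged, at the cost of every module in the chain having to accept and forward the wrapper type.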

Please comment.

Consolidating comments from another thread here:

   1. Have the tuple contain two parts, with the downstream parser
   ignoring the meta data:
      a. Data
      b. Meta data
   2. Use option 1, but there is a concern about how unifiers will treat
   the meta data, if they need to unify that as well.
   3. Have a centralized meta data repo. This may be in memory, possibly
   as a separate operator which stores the meta data and serves it to
   other operators.
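
To make the centralized repo idea a little more concrete, here is a minimal single-process sketch in plain Java. The names (MetaRepoSketch, MetaRepo, register, lookup) are hypothetical and not part of the Apex API. It assumes that each record carries only a small numeric id and that the full meta data, here just a source name such as a file name or Kafka topic, is looked up on demand. How such a repo would be shared across operators running in different containers is left open.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical names throughout; this is not Apex API code.
final class MetaRepoSketch {

    /** In-memory repo that maps a source name to a compact id and back. */
    static final class MetaRepo {
        private final Map<String, Long> idsBySource = new ConcurrentHashMap<>();
        private final Map<Long, String> sourcesById = new ConcurrentHashMap<>();
        private final AtomicLong nextId = new AtomicLong();

        /** Returns the id for a source, assigning a new one on first use. */
        long register(String source) {
            return idsBySource.computeIfAbsent(source, s -> {
                long id = nextId.incrementAndGet();
                sourcesById.put(id, s);
                return id;
            });
        }

        /** Resolves an id back to its source, if the id is known. */
        Optional<String> lookup(long id) {
            return Optional.ofNullable(sourcesById.get(id));
        }
    }

    /** A record that carries only the meta data id alongside the data. */
    static final class Record {
        final long metaId;
        final String data;

        Record(long metaId, String data) {
            this.metaId = metaId;
            this.data = data;
        }
    }
}
```

An input module would call register once per file or topic and stamp the returned id on each record; a downstream module that needs the file name calls lookup, while a parser can ignore the id entirely.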

Thanks.

-Bhupesh
