incubator-esme-dev mailing list archives

From Ethan Jewett <>
Subject Re: Metadata handling (was "Release planning")
Date Wed, 21 Jul 2010 13:26:51 GMT
Unfortunately my reply is going to overlap with Imtiaz's, but bear
with me. Responses inline:

On Tue, Jul 20, 2010 at 7:57 AM, Vassil Dichev <> wrote:
>> I'd prefer option 1 (separate attribute from text). Within this
>> separate attribute there is the question of how data is
>> stored/represented. I'm ok with either raw string or a tuple-based
>> structure like Twitter's. I kind of like the tuple (key-value)
>> approach.
> I'm not too interested in how the data is stored, because it's fairly
> trivial to implement either way. It's currently not yet clear to me
> what the requirements for the output format are.

I'm just referring to the internal representation. Sorry for talking
about "stored" as that was misleading. If we are treating the metadata
internally as a tuple, then don't we need to design the API handling
to only allow tuples (whether expressed in XML, JSON, or anything else)?

So we need to agree on this first. I think tuples are a good idea, not
least because you have shown us that Twitter is already doing it that
way. So I'm for tuples.

>> What I'm insisting on and what I was saying we got wrong is that what
>> goes in needs to be the same as what comes out. If it's tuple-based
>> and I send in a tuple, then I should get that tuple (key and value)
>> back out when I request the metadata for a message. Right now I think
>> we only get a concatenated list of values from the metadata and
>> metaData methods and we're bound to an XML format.
> I don't get it. A tuple is an abstraction which might be expressed in
> a specific format. So what goes in is not what comes out depending on
> the format. Let me quote the specific example Twitter provides.

Please look back at the original examples. What goes in to the API
currently is something like "<tag>Text</tag>" and what comes out is
"Text". The information encoded in the tag is gone!

This is all I'm referring to. All the information that goes in to the
metadata should be extractable. Maybe I'm misunderstanding how the
current implementation is supposed to work?
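
To make the complaint concrete, here is a minimal Python sketch of the difference between returning only the values and returning key-value pairs. The function names are hypothetical and this is not ESME's actual code; it only assumes the input is a single XML element like the example above.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple


def values_only(xml_fragment: str) -> str:
    """Lossy extraction: keeps the text content, drops the tag name."""
    root = ET.fromstring(xml_fragment)
    return "".join(root.itertext())


def key_value_pairs(xml_fragment: str) -> List[Tuple[str, str]]:
    """Lossless extraction for a single element: keeps tag name and text."""
    root = ET.fromstring(xml_fragment)
    return [(root.tag, root.text or "")]


fragment = "<tag>Text</tag>"
lossy = values_only(fragment)         # the key "tag" cannot be recovered
lossless = key_value_pairs(fragment)  # both key and value survive
```

With the first function the caller can no longer tell which key "Text" belonged to; with the second, both parts of the tuple are still available.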

> This comes in:
> "annotations":
>    [{"type":{"another_attribute":"value", "attribute":"value"}}]
> This comes out:
> <annotations type="array">
>  <annotation>
>    <type>foo</type>
>    <attributes>
>      <attribute>
>        <name>bar</name>
>        <value>baz</value>
>      </attribute>
>    </attributes>
>  </annotation>
> </annotations>

Right, if this were what happened, that would be OK with me.

>> As far as requiring a particular format, I think the internal format
>> should be either a raw string or an immutable hashmap with raw strings
>> as keys and values. We can handle converting this to XML or JSON in
>> the API or view code.
> Again, it's not so interesting what's internally there, let's just
> treat it as a black box. What I want to know is, do we want to have
> for instance XML in a JSON reply returned:
> "annotations":
>    [{"type":{"<attributes> <attribute> <name>bar</name>
> <value>baz</value> </attribute> </attributes>"}}]
> or, inversely, do we want JSON inside an XML reply? Something like:
> <annotations type="array">
>  <annotation>
>    <type>foo</type>
>    <attributes>
>      {"another_attribute":"value", "attribute":"value"}
>    </attributes>
>  </annotation>
> </annotations>
> because the latter will need to be escaped.
> Sorry for being too dense, but in any case, we either have to escape
> the metadata or we have to transform the structure to XML/JSON when we
> return it back to the user. None of these is "what goes in needs to be
> the same as what comes out"

Agreed. I just wanted to make clear that I would like ESME to have a
canonical internal format for the metadata so that the API code can
understand what it will be receiving. The API can then handle
transforming the internal format into XML, JSON, or whatever new
format we end up supporting.
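
As a rough illustration of that split, the sketch below keeps the metadata as an immutable tuple of (name, value) string pairs and leaves rendering to separate functions. The type alias, the function names, and the exact element and key names in the output are assumptions made for this example, loosely modeled on the Twitter-style XML quoted above, not an agreed ESME format.

```python
import json
import xml.etree.ElementTree as ET
from typing import Tuple

# Hypothetical canonical internal format: immutable (name, value) pairs.
Metadata = Tuple[Tuple[str, str], ...]


def to_json(metadata: Metadata) -> str:
    """Render the internal pairs as a JSON array of name/value objects."""
    return json.dumps([{"name": k, "value": v} for k, v in metadata])


def to_xml(metadata: Metadata) -> str:
    """Render the same pairs as an <attributes> XML fragment."""
    root = ET.Element("attributes")
    for k, v in metadata:
        attribute = ET.SubElement(root, "attribute")
        ET.SubElement(attribute, "name").text = k
        ET.SubElement(attribute, "value").text = v
    return ET.tostring(root, encoding="unicode")


metadata: Metadata = (("bar", "baz"),)
json_reply = to_json(metadata)
xml_reply = to_xml(metadata)
```

Because the API layer only ever sees the internal pairs, adding another output format would mean adding one more rendering function rather than changing how metadata is held.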

As far as "what goes in needs to be the same as what comes out",
obviously that was too vague. Hopefully it's clear by now that I meant
something more along the lines of "the information that goes in must
be the same as the information that comes out, regardless of format".
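
One way to pin that requirement down is as a round-trip check: serialize the pairs, parse them back, and compare. The Python sketch below does this for XML and JSON, including a value that itself contains markup, which the serializers escape. The helper names and the element layout are hypothetical and chosen only for this example.

```python
import json
import xml.etree.ElementTree as ET
from typing import List, Tuple

Pairs = List[Tuple[str, str]]


def xml_round_trip(pairs: Pairs) -> Pairs:
    """Serialize pairs to XML, parse the result, and return the pairs read back."""
    root = ET.Element("attributes")
    for name, value in pairs:
        attribute = ET.SubElement(root, "attribute")
        ET.SubElement(attribute, "name").text = name
        ET.SubElement(attribute, "value").text = value
    parsed = ET.fromstring(ET.tostring(root, encoding="unicode"))
    return [
        (a.findtext("name"), a.findtext("value"))
        for a in parsed.findall("attribute")
    ]


def json_round_trip(pairs: Pairs) -> Pairs:
    """Serialize pairs to JSON, parse the result, and return the pairs read back."""
    encoded = json.dumps([[name, value] for name, value in pairs])
    return [(name, value) for name, value in json.loads(encoded)]


# The second value contains markup and quotes, so escaping is exercised.
pairs = [("bar", "baz"), ("markup", '<b>"quoted" & bold</b>')]
assert xml_round_trip(pairs) == pairs
assert json_round_trip(pairs) == pairs
```

The wire representation differs between the two formats, but the keys and values recovered on the other side are identical.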

