esme-dev mailing list archives

From Ethan Jewett <esjew...@gmail.com>
Subject Re: Metadata handling (was "Release planning")
Date Mon, 19 Jul 2010 13:31:35 GMT
I'd prefer option 1 (separate attribute from text). Within this
separate attribute there is the question of how data is
stored/represented. I'm ok with either raw string or a tuple-based
structure like Twitter's. I kind of like the tuple (key-value)
approach.

What I'm insisting on and what I was saying we got wrong is that what
goes in needs to be the same as what comes out. If it's tuple-based
and I send in a tuple, then I should get that tuple (key and value)
back out when I request the metadata for a message. Right now I think
we only get a concatenated list of values from the metadata and
metaData methods and we're bound to an XML format.

As far as requiring a particular format, I think the internal format
should be either a raw string or an immutable hashmap with raw strings
as keys and values. We can handle converting this to XML or JSON in
the API or view code.
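
As a rough illustration of that split, the following Python sketch keeps the metadata as a read-only string-to-string mapping and only renders JSON or XML at the edge. The helper names (make_metadata, to_json, to_xml) and the XML element names are assumptions made for the example, not an existing ESME format.

```python
import json
from types import MappingProxyType
from xml.etree import ElementTree as ET

def make_metadata(pairs):
    # Read-only view over a private copy; keys and values are plain strings.
    return MappingProxyType({str(k): str(v) for k, v in pairs.items()})

def to_json(metadata):
    # Rendering happens in the API/view layer, not in the stored form.
    return json.dumps(dict(metadata), sort_keys=True)

def to_xml(metadata):
    # Hypothetical element names; each pair keeps both its key and its value.
    root = ET.Element("metadata")
    for key, value in metadata.items():
        item = ET.SubElement(root, "item", {"key": key})
        item.text = value
    return ET.tostring(root, encoding="unicode")

meta = make_metadata({"type": "review", "rating": "5"})
json_view = to_json(meta)
xml_view = to_xml(meta)
```

Either view can be produced from the same stored pairs, so a client could send JSON and read XML without the stored data being tied to one format.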

Ethan

On Monday, July 19, 2010, Vassil Dichev <vdichev@apache.org> wrote:
>>> I believe that this is completely wrong. The API should simply take
>>> the text assigned to the metadata parameter, store it as part of the
>>> message, and return exactly the same string in the message response.
>>> It should not care what is in the text and it should not modify it in
>>> any way.
>>
>> Exactly: The basic idea is that external applications can add
>> additional information to a particular message in whatever form (XML,
>> JSON, text, etc.) with whatever structure desired. This metadata is
>> stored without being changed in the data store.  When accessing the
>> messages via the various APIs, the metadata is returned in exactly
>> the format in which it was stored. Period.
>
> If this is completely wrong, then Twitter also got it completely
> wrong. AFAICT the annotations mechanism provides a way to include
> *structured* data in a tweet, in the form of key/value pairs. You can
> find this quote on http://apiwiki.twitter.com/Annotations-Overview:
>
>     An annotation is a tuple whose first element is a 'type' and whose
> second element is one or more attribute names with values.
>
> So what we're missing is that this is not just a text string. Having
> the data untouched breaks coupling with a specific format in the case
> when we want to submit a message in one format, but parse it in
> another, e.g. send in JSON and then read in as XML (take a look at the
> Twitter Annotation examples, this is the kind of scenario described).
>
> If we do not want Twitter's approach, we should have it spelled out
> somewhere, whether in the Jira item, in the wiki or somewhere else.
> Otherwise it's easy to misunderstand the requirement (as I did).
>
> I can think of 3 ways to avoid metadata in quoted form:
>
> 1. Include metadata as a separate attribute from the message text
> 2. Have metadata be included as XML in the message, unquoted.
> 3. Have metadata included in the message quoted, and then unquote it
> every time we return it back.
>
> I think if we want to preserve the structure (as Twitter did), 2 is
> more straightforward. 1 and 3 don't give us a significant advantage
> IMO since we'll have to process the XML anyway and convert to JSON
> before returning it. What we have to make sure when including XML in
> metadata is that the XPath doesn't contain ambiguous references which
> might resolve to something within the metadata (this might already be
> the case).
>
> It's worth noting that we cannot avoid shell escaping when using
> command-line clients and we cannot avoid form-encoding when sending
> the message from any client:
>
> http://groups.google.com/group/twitter-development-talk/browse_thread/thread/31f19d9432cc080e?pli=1
>
> So what do we want: raw string or structured data? Do we want to be
> more like how Twitter did it or not? Do we want to couple the data
> with a certain format (JSON/XML) or not?
>
> Vassil
>
