cocoon-dev mailing list archives

From "David Wagner" <>
Subject RE: [RT] i18n in Cocoon and language independent semantic contexts
Date Mon, 12 Jun 2000 16:28:34 GMT
Stefano Mazzocchi expressed a great Random Thought, including the
following issue.
> Be careful, something like "/news/today" is a perfectly
> designed URI for a website and can stand ages without requiring to
> change. But it's not human readable by non-English speakers. So
> it would be the Italian equivalent "/notizie/oggi".
> This leads to something that was already expressed on the
> list: can the sitemap allow to enforce different views of the same
> URI space based on i18n issues? What's the best manageable way to
> do this? Where does separation of concerns accounts here? What's
> the best way to scale such a thing?
I thought about this at length over the weekend and had the idea of
'normalizing' the URL: parse it into tokens on the characters [/.?&],
compare the tokens against a list of synonyms (including synonyms in
other languages), and rewrite the presented URL into one that maps
directly into the native site hierarchy.

  /notizie/oggi -> /news/today  {Note the loss of lang=it in this
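
A minimal Python sketch of the tokenize-and-look-up step might look like
this. The synonym table and the function names are hypothetical, made up
for illustration; the sketch returns the detected language alongside the
rewritten path so that it is not lost in the rewrite.

```python
import re

# Hypothetical synonym table: foreign token -> (canonical token, language code)
SYNONYMS = {
    "notizie": ("news", "it"),
    "oggi": ("today", "it"),
}

def tokenize(url):
    """Split a URL into tokens on the characters / . ? & (empty tokens dropped)."""
    return [t for t in re.split(r"[/.?&]", url) if t]

def normalize(url):
    """Replace each known synonym with its canonical token.

    Returns the rewritten path and the language of the last synonym seen
    (None if no token needed translating).
    """
    lang = None
    canonical = []
    for token in tokenize(url):
        if token in SYNONYMS:
            token, lang = SYNONYMS[token]
        canonical.append(token)
    return "/" + "/".join(canonical), lang
```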

However, in most cases the token order does not matter and the list of
tokens in any order is enough to specify a single location in the
site.  So, token matching is done unordered by default.

  /today/news -> /news/today
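
One way to get unordered matching, sketched in Python under the
assumption that each native location is identified by a unique set of
tokens, is to key the lookup table on a frozenset. The table contents and
the name `locate` are hypothetical.

```python
# Hypothetical site map: unordered token set -> native location
SITE = {
    frozenset({"news", "today"}): "/news/today",
    frozenset({"weather", "forecast"}): "/weather/forecast",
}

def locate(tokens):
    """Return the native location for a list of tokens, ignoring their order.

    Returns None when the token set does not name a known location.
    """
    return SITE.get(frozenset(tokens))
```

Ordered matching could use the same shape with tuples as keys instead of
frozensets, consulted before the unordered table.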

In some cases and in some languages the token order will matter
(perhaps having the reversed meaning if tokens are reversed), and so
there should be provisions to match tokens in sequence as well as
unordered.  Also, I considered strict separation of resource location
in the path of the URL, with parameters passed to the system in the
query portion.

  /weather/tomorrow -> /weather?day=tomorrow&lang=en
  /tempestus/cras   -> /weather?day=tomorrow&lang=la
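
The split between path and query could be sketched as follows, assuming
each vocabulary entry records whether a token names a resource or a
parameter value. The `VOCAB` table and its "role" field are assumptions
of this sketch, not part of any existing sitemap format.

```python
from urllib.parse import urlencode

# Hypothetical vocabulary: token -> (role, canonical value, language)
VOCAB = {
    "weather":   ("resource", "weather", "en"),
    "tempestus": ("resource", "weather", "la"),
    "tomorrow":  ("day", "tomorrow", "en"),
    "cras":      ("day", "tomorrow", "la"),
}

def rewrite(path):
    """Keep resource tokens in the path and move the rest into the query.

    Unknown tokens raise KeyError; the language is taken from the last token.
    """
    resource, params, lang = [], {}, None
    for token in filter(None, path.split("/")):
        role, value, lang = VOCAB[token]
        if role == "resource":
            resource.append(value)
        else:
            params[role] = value
    params["lang"] = lang
    return "/" + "/".join(resource) + "?" + urlencode(params)
```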

Note it is also possible to request a resource in a different language
simply by typing /weather/tomorrow?lang=sa (Sanskrit ;).  Further, not
parsing on [=] allows mapping name-value pairs as tokens like this.

  /newsrequestform?type=weather&day=tomorrow ->

In this weather example I also noticed retrieving archived weather
information (/weather?day=yesterday) is completely different from
forecasting (/weather?day=tomorrow), and likely handled by different
resources at different locations in the framework.

   /weather/tomorrow -> /weather/forecast?day=tomorrow&lang=en
  /weather/yesterday -> /weather/archive?day=yesterday&lang=en
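
As a rough sketch, the choice between the two resources could be driven
by the value of the day parameter. The mapping table and the function
name `route_weather` are hypothetical.

```python
# Hypothetical mapping from a day value to the sub-resource that serves it
HANDLER_FOR_DAY = {
    "yesterday": "archive",
    "today": "forecast",
    "tomorrow": "forecast",
}

def route_weather(day, lang="en"):
    """Build the final locator for a weather request on the given day."""
    handler = HANDLER_FOR_DAY[day]  # KeyError for a day value we do not know
    return f"/weather/{handler}?day={day}&lang={lang}"
```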

I then considered matching tokens against productions such as
date-time formats and US zip codes, how to encode these matching rules
(XML, XSLT, REGEX, XPointer, ...), and just how to use different token
match-replace pairs in different contexts.

     /news/today -> /news/today
  /weather/today -> /weather/forecast?day=today
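
If the rules were encoded as regular expressions, context-specific
match-replace pairs might be sketched like this in Python. The first
token selects the context; the `CONTEXTS` table, the parameter names, and
the ZIP pattern are all assumptions made for illustration.

```python
import re

# Hypothetical per-context rules: context -> (target, [(parameter, production)])
CONTEXTS = {
    "weather": ("/weather/forecast", [
        ("day", re.compile(r"today|tomorrow")),
        ("zip", re.compile(r"\d{5}(-\d{4})?")),  # US ZIP or ZIP+4
    ]),
}

def rewrite(path):
    """Apply the rules of the context named by the first token, if any."""
    tokens = [t for t in path.split("/") if t]
    if not tokens or tokens[0] not in CONTEXTS:
        return path  # no rules for this context: leave the URL alone
    target, rules = CONTEXTS[tokens[0]]
    params = []
    for token in tokens[1:]:
        for name, pattern in rules:
            if pattern.fullmatch(token):
                params.append(f"{name}={token}")
                break  # first matching production wins; unmatched tokens are dropped
    return target + ("?" + "&".join(params) if params else "")
```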

Token matching like this, done recursively until it resolves to a single
(final) resource locator (with a list of parameters to pass), seems to
provide the URL normalization needed.  I wrote a much longer and more
detailed analysis and suggested implementation of this method, though
it is not yet finished and contains many unresolved issues touched on
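
Repeated rewriting until a final locator is reached could be sketched as
a simple loop. The rule table here is hypothetical and matches whole
locators only, and the step limit is an added guard against rule sets
that never settle.

```python
# Hypothetical rewrite rules: each maps one locator to a more specific one
RULES = {
    "/tempestus/cras": "/weather/tomorrow",
    "/weather/tomorrow": "/weather?day=tomorrow",
    "/weather?day=tomorrow": "/weather/forecast?day=tomorrow",
}

def resolve(url, max_steps=10):
    """Apply rules repeatedly until no rule matches, then return the result."""
    for _ in range(max_steps):
        if url not in RULES:
            return url  # nothing left to rewrite: this is the final locator
        url = RULES[url]
    raise RuntimeError("rewrite rules did not converge")
```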

I hope this contributes to the discussion.
