forrest-dev mailing list archives

From Upayavira ...@upaya.co.uk>
Subject Re: i18n suggestion
Date Mon, 15 Mar 2004 14:30:44 GMT
Sjur Nørstebø Moshagen wrote:

> On 15 Mar 2004, at 11:19, Upayavira wrote:
>
>> Sjur Nørstebø Moshagen wrote:
>>
>>> filename_{lang}_{country}_{variant}.xml
>>>
>>> There is no need to invent a different scheme for ordinary text files.
>>
>>
>> Yes, but that's the filenaming at the source side - not necessarily 
>> for serving. But, I guess if you wanted to configure your sitemap to 
>> avoid the header's locale, that would work.
>
>
> I am talking about the source side. What you're serving should be just 
> filename.html (or .pdf). The whole i18n task is about mapping the 
> request "filename.html" to a suitable source file 
> "filename_{locale}.xml", isn't it?
>
> The only point about debugging was that it should be _possible_ 
> (without us doing anything) to request filename_{locale}.html, in the 
> same way that it is now possible to request menu_{lang}.html.

Fair enough.
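
As a rough sketch of the request-to-source mapping discussed above (Python purely for illustration; the function name `candidate_names` is made up here and none of this is Cocoon or Forrest code), the source names to try for a request could be listed from most to least specific, following the filename_{lang}_{country}_{variant}.xml scheme:

```python
def candidate_names(base, locale, ext="xml"):
    """Return source file names to try, most specific first.

    `locale` uses the lang_country_variant form from the thread,
    e.g. "de_CH_1996"; an empty string means "no locale".
    """
    parts = [p for p in locale.split("_") if p]
    names = []
    # Drop one trailing locale component at a time.
    for n in range(len(parts), 0, -1):
        names.append(f"{base}_{'_'.join(parts[:n])}.{ext}")
    # Finally, the file without any locale information.
    names.append(f"{base}.{ext}")
    return names


print(candidate_names("filename", "de_CH"))
# ['filename_de_CH.xml', 'filename_de.xml', 'filename.xml']
```

A request for filename.html would then be served from the first of these names that actually exists on the source side.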

>>> OK (but that should not stop us from trying to keep the two in sync).
>>
>>
>> Exactly. I agree that Forrest is about static not dynamic sites. But 
>> you've got to be really careful about how you define that. Dynamic to 
>> me means whether you are reading stuff from databases, etc, and every 
>> page you generate is different. As to static, I don't think it should 
>> matter whether the site is served by a servlet container or by Apache 
>> et al - a Forrest site should be independent of that, and should work 
>> on both. It should be environment independent. The discussion, as far 
>> as I remember it, was about the inclusion of stuff that would _only_ 
>> work in dynamic mode. You're suggesting adding stuff that will _only_ 
>> work in static mode - I think that is going way too far.
>
>
> To clarify what I meant by "dynamic": served by a servlet (and 
> thus changing dynamically as you change the source files of the 
> site), as opposed to static. So I agree that there should be no 
> difference between a servlet version and an Apache version.

Yup. I agree.

>>> Agreed, but what if you only have "localised" files (you don't want 
>>> to keep two identical copies of the same file in the src, one as 
>>> default without any locale info, and one with an explicit locale) - 
>>> thus we _also_ need a notion of a "default language" that could be 
>>> used to build default versions (as files without any locale in the 
>>> file name).
>>
>>
>> What do you mean by "localised"? I think that it should be the 
>> environment's responsibility to handle downgrading, not the site itself.
>
>
> Agreed, and in this case that is Cocoon. What I meant was that on the 
> source side you don't want to maintain both index.xml and index_de.xml 
> if they are identical (ie. German is the default language of the site) 
> - you want to keep one (and maintain it), and have the other generated 
> automatically (or remapped, or whatever automatically), such that when 
> you request index.html with no (applicable) locale info, you get the 
> default language, independent of whether that is found in index.xml or 
> in index_de.xml.

Okay.
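
To make the "default language" idea concrete, here is a minimal sketch, assuming the site declares one default language somewhere in its configuration (the `default_lang` parameter and the function name `resolve_source` are hypothetical). It only shows the lookup order, not how Cocoon would implement it:

```python
def resolve_source(base, locale, available, default_lang="de"):
    """Pick the source file for a request, or None if nothing fits.

    `available` is the set of file names present on the source side.
    `default_lang` is an assumed site-wide setting, not an existing
    Forrest or Cocoon option.
    """
    candidates = []
    if locale:
        parts = locale.split("_")
        # Most specific locale first, then progressively less specific.
        for n in range(len(parts), 0, -1):
            candidates.append(f"{base}_{'_'.join(parts[:n])}.xml")
    # No applicable locale: the default may live in either file.
    candidates.append(f"{base}.xml")
    candidates.append(f"{base}_{default_lang}.xml")
    for name in candidates:
        if name in available:
            return name
    return None


# Only index_de.xml is maintained; German is the default language.
print(resolve_source("index", None, {"index_de.xml"}))
# index_de.xml
```

With this ordering you maintain one file only, and a request without usable locale information ends up at the default language whichever of the two names it is stored under.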

>
>>>> There is no i18nFile generator, so it needs to be added in the 
>>>> sitemap.
>>>
>>>
>>> I checked the Cocoon samples as well, and indeed, there does not 
>>> seem to be a mechanism (or sitemap action) in place for just 
>>> choosing one of several localised _files_.
>>
>> Juan Jose Pablos wrote:
>>
>>> If you pass the locale, you are able to get this done at least for 
>>> {lang}. I think that we need an I18nFile generator that can do the 
>>> "downgrading scheme".
>>
>>
>> Why? As far as I can see, if the browser can pass a string of 
>> expected locales, e.g. pl, de, en, then those should be passed into 
>> Cocoon as the Locale, which Cocoon's I18N stuff should just handle - 
>> serving the most appropriate page it can. In the case of the CLI, it 
>> will pass one locale in at a time. Filename handling will be the 
>> CLI's concern, not Cocoon's.
>
>
> There are two issues here: whether we need an i18nFile Generator, and 
> how to name the generated files in the static site. Let's keep them 
> apart.
>
> 1) i18nFile generator:
> It *is* actually needed, there is no such thing in Cocoon, and that is 
> why Cocoon's present i18n stuff can't handle content 
> negotiation/localisation of whole files (and thus why Forrest can't). 
> That is, for a given page, Cocoon is NOT able to decide which version 
> it should pick based on locale, because there seems to be no mechanism 
> for it. The present i18n stuff in Cocoon is all about translation and 
> localisation of *single items* on a page, to be able to localise table 
> headings, date formats etc. of dynamic pages generated from, e.g., an 
> external source (a database, whatever).

I get you now.

Okay, as I said in a previous mail, I think we might do better with an 
Input Module: I18NModule. This would enable you to say something like:

<map:match pattern="**/*.html">
  <map:generate src="{1}/{i18n:{2}.{locale}.xml}"/>
  <!-- transform and serialize steps as usual -->
</map:match>

So, we'd be using the file generator still, as all we need is a filename 
conversion. And so long as we can come up with a suitable module syntax, 
it should be easier to implement.
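
To illustrate the kind of filename conversion such a module would have to do, here is a small sketch (Python for illustration only; `i18n_lookup` and its parameters are invented for this example and say nothing about the real input module API). It assumes the module receives a name template containing `{locale}`, the ordered list of locales the browser asked for, and some way to test whether a file exists:

```python
def i18n_lookup(template, preferred_locales, exists):
    """Return the first existing file name for the preferred locales.

    `template` is something like "index.{locale}.xml".
    `exists` is any callable that tells us whether a name is present.
    """
    for locale in preferred_locales:
        name = template.replace("{locale}", locale)
        if exists(name):
            return name
    # Nothing matched: try the name without any locale part.
    fallback = template.replace(".{locale}", "")
    return fallback if exists(fallback) else None


files = {"index.de.xml", "index.en.xml", "faq.xml"}
print(i18n_lookup("index.{locale}.xml", ["pl", "de", "en"], files.__contains__))
# index.de.xml
```

The file generator stays untouched in this picture; only the value of its src attribute changes.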

> The typical scenario that has been in the minds of the i18n team seems 
> to have been:
>
> -get some tabular/business data out of an RDBMS or similar;
> -present it to the user,
> -on the way, localise the presentation of it if wanted (and possible)
>
> Thus, you can localise most things, BUT NOT WHOLE PAGES ON A FILE 
> LEVEL. This is where the i18nFile generator comes in. We need this for 
> being able to serve whole pages, where the different language/locale 
> versions of the page are stored in different files.

Exactly my problem right now too!

> The funny thing is, that we're almost there;) If you look at how the 
> present i18n features of Cocoon work, you'll find something like this 
> for the menu generation (this is based on the i18n sample coming with 
> the default installation of Cocoon - I have not yet looked at the Java 
> src code of the i18n block):
>
> Request: build the menu for locale X
> - find the file menu.xml
> - in translations/, find the file menu_X.xml  <==== THIS IS WHAT WE NEED
> - replace some content in menu.xml by looking it up in menu_X.xml
> - serve the localised menu.xml
>
> That is, the present catalog-based translation/localisation system has 
> all we need, and more. What is needed, is to generalise the use of it 
> (the translations should be in the same directory as the original 
> source files), and remove the translation part, just serving the 
> localised file _in place of_ the original source file. Finally, we 
> need to make this generalised i18n file generation function available 
> with a separate name, "i18nFile" is as good as anything.

What we need to do is factor out the filename handling, so that a 
module can share that code.

> My suggestion is thus to extend the present i18n block with one more 
> function, which should be easily based on what's already there.

Exactly. Except that it isn't a block; it is core functionality.

> 2) Naming of static files
> In a servlet context we don't need more than what I've already 
> described under 1), since after we have implemented 1), content 
> negotiation and downgrading will be handled well by Cocoon even for 
> files. But for building a static site ready to be served by Apache, 
> one more thing is needed for the CLI: for each locale processed, the 
> output static files need to be given locale extensions according to 
> the conventions of Apache. Since we know the locale, this should be a 
> trivial task.

Exactly - pretty trivial.

> In the case that there is no localised version of a file, one could 
> just let the i18nFile generator pick a default file/locale, and save 
> the static file _without any locale at all_ (but that should not be 
> the case if even one letter has been localised, i.e. if the menu, or 
> the tabs, or some other included element is coming from a localised 
> src - in all these cases the generated static file should be treated 
> as if it was completely localised).

Hmm. How to handle defaults, that's interesting.
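
One possible reading of the naming rule, as a sketch only: it assumes Apache is set up to map a trailing language extension such as `.de` to a language (for instance with AddLanguage), and that the CLI can tell whether any part of the page came from a localised source. The function name and the `anything_localised` flag are hypothetical:

```python
def static_output_name(request_name, locale, anything_localised):
    """Name of the static file the CLI would write for one locale.

    If nothing on the page was localised, the file is saved without
    any locale; otherwise the locale is appended as an extension.
    """
    if not anything_localised:
        return request_name
    # Assumed convention: lower-case, hyphenated extension (pt_BR -> pt-br).
    extension = locale.lower().replace("_", "-")
    return f"{request_name}.{extension}"


print(static_output_name("index.html", "de", True))
# index.html.de
print(static_output_name("index.html", "de", False))
# index.html
```

Whether the locale goes before or after .html is a separate choice; the point is just that the decision depends on the locale and on whether anything was localised.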

> The final static site will then be made up of a set of localised 
> files, and content negotiation will now be handled by Apache.

Yup.

> Does my suggestion make sense?

It certainly does.

One remaining point: how do we handle crawling? Do we crawl a page, then 
seek all translations of it, or do we crawl from each language's 
homepage, following links that way? Make sense?

With the following sources:

a.en.html (links to b,c)
a.de.html (links to c,d)
b.en.html
c.en.html
c.de.html
d.de.html

Do we go:
a.html < en
b.html < en
c.html < en
a.html < de
d.html < de

or

a.html < en
a.html < de
b.html < en
b.html < de
c.html < en
c.html < de
d.html < en
d.html < de

WDYT?
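
The two strategies can be compared with a small sketch over the example sources above (Python for illustration; the function names and the `SOURCES` table are made up for this message and are not crawler code from Cocoon). One function follows links separately from each language's homepage; the other takes every known page and asks for it in every language:

```python
from collections import deque

# (page, language) -> pages it links to, taken from the example above.
SOURCES = {
    ("a", "en"): ["b", "c"],
    ("a", "de"): ["c", "d"],
    ("b", "en"): [],
    ("c", "en"): [],
    ("c", "de"): [],
    ("d", "de"): [],
}


def crawl_per_language(sources, start, languages):
    """Crawl from `start` once per language, following that language's links."""
    plan = []
    for lang in languages:
        seen = {start}
        queue = deque([start])
        while queue:
            page = queue.popleft()
            if (page, lang) not in sources:
                continue  # no source in this language, nothing to build
            plan.append((page, lang))
            for target in sources[(page, lang)]:
                if target not in seen:
                    seen.add(target)
                    queue.append(target)
    return plan


def crawl_all_translations(sources, languages):
    """Request every known page in every language, whether or not a source exists."""
    pages = sorted({page for page, _ in sources})
    return [(page, lang) for page in pages for lang in languages]


print(crawl_per_language(SOURCES, "a", ["en", "de"]))
print(crawl_all_translations(SOURCES, ["en", "de"]))
```

In this sketch the per-language crawl also produces c for de, because a.de links to c and c.de exists; the first list above does not show that entry. The all-translations variant asks for b in de and d in en as well, so it relies on some fallback for pages that have no source in that language.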

Regards, Upayavira


