incubator-ooo-dev mailing list archives

From Rob Weir <robw...@apache.org>
Subject Re: Crazy idea: Use Google to translate website
Date Tue, 03 Jul 2012 00:49:51 GMT
On Mon, Jul 2, 2012 at 7:18 PM, Dave Fisher <dave2wave@comcast.net> wrote:
> Sorry for the top post. I like where this is going. A lot of interesting ideas.
>
> I have one major concern. How do we manage the human created content as people replace
> and/or edit the translations. What happens when the original English (or French) page is changed?
> To me we are really discussing managing Markdown text. If the names of files are like:
>
> index.mdtext
> index.en-GB.mdtext
> index.fr.mdtext
>

Wouldn't you say that 99% of the website is HTML and Wiki Text today?
There is very little Markdown in use outside of the Podling project
pages.

In any case, it should be possible to use Pootle for this, just as we
manage changing product strings and updating those.

There are a good number of convertors for getting to/from Pootle
format:  http://translate.sourceforge.net/wiki/toolkit/index

Note html2po.

I bet writing mdtext2po (and the inverse) would be possible.

If we used Pootle for this, we'd need to define some sort of schedule,
since it is not really a "release" in the traditional sense.  But you
could imagine every month or so, doing a cycle of:

1) html2po and mdtext2po

2) Load into Pootle

3) Volunteers translate

4) At specified time run po2html and po2mdtext

5) check in the new website files
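
To make steps 1 and 4 concrete, here is a minimal sketch of what an mdtext2po / po2mdtext pair might look like. It is an assumption on my part, not an existing Translate Toolkit converter: the function names, the one-entry-per-paragraph split, and the fallback to source text are all hypothetical, and a real PO file would also need a header entry.

```python
import re


def po_escape(text):
    """Escape a string for use inside a PO double-quoted literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def split_blocks(mdtext):
    """Split Markdown source into blocks separated by blank lines."""
    parts = re.split(r"\n\s*\n", mdtext.strip("\n"))
    return [p.strip("\n") for p in parts if p.strip()]


def mdtext2po(mdtext, source_name):
    """Emit one PO entry per block, with an empty msgstr for translators.

    Hypothetical helper: the PO header entry is omitted for brevity.
    """
    entries = []
    for i, block in enumerate(split_blocks(mdtext), start=1):
        entries.append(
            f"#: {source_name}:{i}\n"
            f'msgid "{po_escape(block)}"\n'
            f'msgstr ""\n'
        )
    return "\n".join(entries)


def po2mdtext(mdtext, translations):
    """Rebuild the page from a {source block: translated block} mapping.

    Blocks with no translation (or an empty one) keep the source text.
    """
    blocks = split_blocks(mdtext)
    return "\n\n".join(translations.get(b) or b for b in blocks) + "\n"
```

Keeping untranslated blocks in the source language means a partially translated page can still be checked in at the end of each cycle.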

> We'll have some type of Apache CMS magic that can handle translated SSI elements. I need
> to write Joe / infra-dev an email...
>
> Then if we can tie together the CMS to take translations and somehow inform either or
> both the human and/or the tool translators when changes occur in other languages ... svn diff
> can be used... assuming that...
>

The issue is the average translator is not a markup (or markdown)
person.  They use Pootle or similar tools that facilitate translation.
What do we need to be translator-friendly?  Consistency between how
we translate UI and webpages might help.

> With markdown it will be easy to have a header parameter that will signal the inclusion
> of an SSI detailing the machine translated page vs. human translation situation. By making
> it an SSI and translatable it can become something different language groups can handle in
> an organic way. We'll have an objective measure of the engagement of different language communities
> based on the number of edits, number of translators and how up to date and/or responsive
> they are.
>
> I think we could start by creating a test-auto.mdtext file, and using the translate.google
> to convert it to 100 pages. Put the scripts in the ooo-site/trunk/tools/ directory. If they
> are perl scripts then in ooo-site/lib/.
>
> Regards,
> Dave
>
> On Jul 2, 2012, at 2:43 PM, Kay Schenk wrote:
>
>> On Mon, Jul 2, 2012 at 2:27 PM, Rob Weir <robweir@apache.org> wrote:
>>
>>> On Mon, Jul 2, 2012 at 4:20 PM, Kay Schenk <kay.schenk@gmail.com> wrote:
>>>> On Mon, Jul 2, 2012 at 7:14 AM, Rob Weir <robweir@apache.org> wrote:
>>>>
>>>>> On Mon, Jul 2, 2012 at 9:57 AM, Donald Whytock <dwhytock@gmail.com> wrote:
>>>>>> You don't have to use Google Translate for the entire site into a
>>>>>> given language.  Better than no page at all in a given language is a
>>>>>
>>>>> True.   To enable this integration requires adding markup to two
>>>>> places in the HTML file:
>>>>>
>>>>> 1) Load some script in the <head> section
>>>>>
>>>>> 2) Add a Google-provided <div> to wherever in the page we want the
>>>>> language selector drop down to be.
>>>>>
>>>>> It would be really easy to add this to a small number of selected pages.
>>>>>
>>>>> It would also be easy to add to all pages via the CMS template.
>>>>>
>>>>> What would be hard is managing this for a large number of pages, but
>>>>> not all pages.
>>>>>
>>>>>> page in a given language that says, "Hi there!  This is the site for
>>>>>> Apache OpenOffice.  We welcome translations of our site into your
>>>>>> language, and invite you to volunteer at the following email address:
>>>>>> <blah> Or you can submit a translation through Google Translate, which
>>>>>> was used to produce this page."
>>>>>>
>>>>>> Something as short as that is less likely to be garbled in
>>>>>> auto-translation than something technical, and it tells potential
>>>>>> contributors what to do to help out.
>>>>>>
>>>>>
>>>>> The trick would be to get people to visit that page.  Unless it was on
>>>>> the home page.
>>>>>
>>>>> -Rob
>>>>>
>>>>>> Don
>>>>>
>>>>
>>>> OK, it took me a little while to weed through Google's info on this.
>>>>
>>>> A good sample can be found at:
>>>>
>>>>
>>>> http://googleblog.blogspot.com/2009/09/translate-your-website-with-google.html
>>>>
>>>> Is there any possibility we could add the gadget to the OOo blogs site --
>>>>
>>>> https://blogs.apache.org/OOo/
>>>>
>>>> just for fun and see what we think?
>>>> This way we'd just be impacting one page and not a whole site.
>>>>
>>>
>>> If we want access to review and approve suggestions made by readers
>>> then it needs to be on a domain that we "own".  This is in common with
>>> most Google services, you need to demonstrate that you control the
>>> domain, typically by adding a special META tag to the homepage.  For
>>> *.openoffice.org this is easy, and I've already done this to enable
>>> Google Analytics.  If we want to do the same for the blog we'd need
>>> the ability to insert special markup into the <head> and <body> of the
>>> blog template.  I'm not sure whether this is possible with our Roller
>>> setup.
>>>
>>
>> oh -- well too bad. It could have been fun.
>>
>>
>>>
>>> Another way of testing this, in a quantitative way, is via what is
>>> called "A/B Testing".  With this approach we define an action a
>>> satisfied site visitor might take, like downloading AOO 3.4.  Then we
>>> randomly show some users the original home page (or download page or
>>> any other page we're testing).  This is "A".  We show other
>>> users a different version, "B".  For example, B could have the
>>> translation enabled.  Then we run this "experiment" for a period of
>>> time, like a week or two, tracking which version of the page has the
>>> higher success rate with users.
>>>
>>
>> hmmmm...interesting
>>
>> OK, I've looked at the rest of your post here and will think about this for
>> a bit.
>>
>>
>>>
>>> If the machine translated page confuses users, or makes
>>> them suspect the page, then the download %'s will be lower than the
>>> original page.  And if the translated page is helpful then the
>>> download numbers would be higher.
>>>
>>> You could imagine other success indicators.  Pretty much anything that
>>> has a URL can be measured.   For example, imagine we add a link, "This
>>> page solved my problem" to the bottom of every documentation page.
>>> Even though the link would just go to a "thanks" page, we could use
>>> that action to measure the success of translated versus untranslated
>>> pages.
>>>
>>> Of course, we don't need to do this all at once.  But I'd recommend we
>>> think of ways of quantifying success.  The website serves our users.
>>> How do we know what is working well and what isn't?  How can we design
>>> experiments to test alternative approaches?
>>>
>>>
>>> Possible successes for users might be:
>>>
>>> - downloaded AOO
>>>
>>> - found answer to their question
>>>
>>> - signed up for our announcement list
>>>
>>> - entered their first bug report
>>>
>>> - signed up for one of the project lists
>>>
>>> - made their first wiki contribution
>>>
>>> - followed/liked/+1'ed us on one of our social networking sites
>>>
>>> Measure, improve, repeat.   Constant improvement and optimization.
>>>
>>> We can debate what will improve the website for the users.  Or we can
>>> test and measure.  A/B testing is a new option for us, a technique
>>> that once was used only by the largest commercial websites, but is now
>>> available to everyone via Google's "content experiments" support in
>>> Google Analytics.
>>>
>>> -Rob
>>>
>>>> I think that might be a perfect application for something like this.
>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> ----------------------------------------------------------------------------------------
>>>> MzK
>>>>
>>>> "I would rather have a donkey that takes me there
>>>> than a horse that will not fare."
>>>>                                          -- Portuguese proverb
>>>
>>
>>
>>
>> --
>> ----------------------------------------------------------------------------------------
>> MzK
>>
>> "I would rather have a donkey that takes me there
>> than a horse that will not fare."
>>                                          -- Portuguese proverb
>
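
On the A/B testing idea quoted above, the mechanics can be sketched in a few lines. This is only an illustration under my own assumptions, not how Google's content experiments work internally: the function names, the experiment label, and the hash-based split are hypothetical, and the z statistic is the standard two-proportion test rather than anything Google Analytics reports.

```python
import hashlib
import math


def assign_variant(visitor_id, experiment="home-translate"):
    """Deterministically place a visitor in 'A' or 'B' with roughly equal odds.

    Hashing the visitor id (hypothetical, e.g. a cookie value) means the
    same visitor always sees the same version of the page.
    """
    digest = hashlib.sha256(f"{experiment}:{visitor_id}".encode("utf-8")).digest()
    return "A" if digest[0] % 2 == 0 else "B"


def conversion_rate(conversions, visitors):
    """Fraction of visitors who took the success action (e.g. a download)."""
    return conversions / visitors if visitors else 0.0


def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """z statistic for the difference in conversion rates (B minus A)."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0
    return (conversion_rate(conv_b, n_b) - conversion_rate(conv_a, n_a)) / se
```

A positive z means the translated page (B) converted better than the original (A). As a rough rule of thumb, an absolute value above about 1.96 suggests the difference is unlikely to be chance at the usual 5% level.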
