[Ltru] Re: Progressing beyond borders-making subtags inclusive

Nicholas Shanks <contact@nickshanks.com> Mon, 07 January 2008 15:02 UTC

Message-Id: <700092AE-4A06-4A67-A34F-65F2B91C0561@nickshanks.com>
From: Nicholas Shanks <contact@nickshanks.com>
To: ltru@ietf.org
In-Reply-To: <003801c84f10$615b3ee0$6801a8c0@oemcomputer>
Mime-Version: 1.0 (Apple Message framework v915)
X-Priority: 3
Date: Mon, 07 Jan 2008 15:02:37 +0000
References: <OF2F4FEC1A.1A1EA98A-ON882573C5.006AD28E-882573C5.006BAABC@spe.sony.com> <8F16F2FB-10D0-4BCA-A4A4-A6C04B7B4852@nickshanks.com> <003801c84f10$615b3ee0$6801a8c0@oemcomputer>

On 4 Jan 2008, at 20:28, Randy Presuhn wrote:

>> I say this because en-US represents a cluster of
>> dialects and accents, with a unified orthography, and en-GB represents
>> a cluster of accents and dialects (some overlapping with en-US),
>
> Could you give an example of such an overlap?  The divergence in
> pronunciation was already marked in the 1700s.

I have personally noticed a lot of convergence occurring here in the
UK, given the large quantity of US sitcoms and music consumed. US
speech patterns and pronunciations are being picked up, especially in
specific phrases, but also in general patterns of speech. I don't
believe this causes people to gradually start speaking with a different
accent, but it certainly can lead to blended speech in the young which
differs from that of their parents. I have not looked for academic
evidence for this, though.

>> I believe that having a subtag registered is at present too difficult
>> (requirement for dictionaries!? what if it's mostly just an accent
>> with only phonemic changes relative to surrounding accents). A
>> relaxation of the barriers would lead to more de facto recognised
>> dialects being available to choose from.
>
> I'm not able to figure out what you're trying to say here.

I want to say "Here are some subtags for English that I think should be
registered. Anyone disagree?" and just list them by name, using
commonly understood terms that lay people wanting to use the tags will
be able to identify with ease.
By lay people I mean the tens of millions of folks like my girlfriend,
who wants to make a website about dolphins, and uses one ov dem HTML
programs to do so.

>> As an example, things like the supposedly "British English" speech
>> synthesizer voices on my computer (which the OS processes using the
>> tag "en_GB" from the voice's property list) sound nothing like most of
>> the accents of the United Kingdom, they would be better marked as
>> "en-received" or similar.
>
> This is not a tagging problem.  It's a complaint about a speech
> synthesizer

I disagree here. It's not the synthesizer's and/or website's fault that
the palette of choices is so restricted. If those creating the voice
had a wider choice of subtags, they could mark it up more accurately.
The synth was just one example of electronic consumption of content;
the consumer could equally be a search engine or something as yet
undreamt of. Age and the rural/urban split also have significant
influences on language, though those are at present not dealt with.

> and could be made for any language not tagged right
> down to the level of some person's idiolect.

Agreed. The line has to be drawn somewhere. I just think it's too
coarse at present. We have the facility for creating 'approved'
subtags without anything breaking. We might as well use it to the
maximum :-)
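
A minimal sketch of why a finer-grained tag need not break a consumer
that only knows coarser ones, assuming the consumer falls back by
truncating subtags from the right (in the spirit of the RFC 4647
"lookup" scheme, much simplified). The function name lookup, the list
of available voices and the variant "scouse" are illustrative only;
nothing here consults the registry.

```python
def lookup(requested, available, default=None):
    """Return the closest available tag by truncating from the right.

    Simplified illustration: compares case-insensitively and drops one
    subtag at a time, also dropping a trailing single-character subtag
    (such as 'x') so a dangling singleton is never matched on its own.
    """
    by_lower = {tag.lower(): tag for tag in available}
    parts = requested.lower().split("-")
    while parts:
        candidate = "-".join(parts)
        if candidate in by_lower:
            return by_lower[candidate]
        parts.pop()
        while parts and len(parts[-1]) == 1:
            parts.pop()
    return default


# A synthesizer that only ships coarse voices still finds a match.
voices = ["en-GB", "en-US"]
print(lookup("en-GB-scouse", voices))
```

With a more specific voice installed, the same request would pick it
instead; without one, the content degrades to the plain en-GB voice.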

>> I'm sure we can all agree on commonly recognised dialects for English,
>
> I'd be surprised.  The "cowboy" dialects spoken by my relatives in
> South Dakota differ from what the ones in Wyoming speak, and
> neither sounds much like Bush-speak.  With variation seemingly on
> the rise in US English, compiling an agreed list might be harder
> than you think.

Should have used 'dixie' as was pointed out, but as I am not familiar  
with US dialects in general I wouldn't be suggesting any :)

>> as it is a first language for many people on this list, and familiar
>> for many others. For other languages compiling a list might involve
>> asking a scholar for suggestions.
>
> That's not how ietf-languages@iana.org is supposed to work.

Okay. You referred to the mailing list. I was more referring to how  
the standard should be created.
I presume there are people out there (and on these lists) who get paid  
to create these things. These people would conduct or locate relevant  
research and create a map, rather like a barometric or elevation map,  
with contours encircling different dialects.

> Rather, someone (anyone) who has a need of a subtag for a
> particular dialect submits a registration request, the request is
> discussed, and the Language Subtag Reviewer decides whether
> to accept the registration.


The thing is, most people using the tags in my field (web authors) have
no idea this list exists, nor that they even need a new subtag when
they are creating new content. Quite often the tags are added without
the author even knowing.
The tags have to be created beforehand and given as options on a
platter to these people by the software.
If you expect users to register their own codes, would you like to see
dialog boxes like this when people press 'save as HTML'?

Please choose the most appropriate language tag for this page:
[ ]  en    English (language WizzoWebWhacker is running in)
[ ]  en-IE Hibernian English (taken from your time zone)
[x]  en-enteryourdialecthere (automatically sends a registration email to IETF)

[Cancel] [Okay]


(I write that only half in jest)

>> It occurred to me while writing this that perhaps a good solution
>> would be to use country codes for written content that uses the
>> national orthography, and dialect tags when transcribing spoken
>> content or for audio data. You would only combine the two if you were
>> transcribing the speech of someone with that dialect into the
>> orthography of a country (maybe not the country of the speaker).
>
> Interesting idea.  Discussion of such a proposal belongs on
> ltru@ietf.org, not here.

Moved. No comments on this? Obviously changes like this would have to  
be best practice suggestions, and not rules, for compatibility.
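
To make the suggestion concrete, here is a rough sketch of how software
might compose a tag under it: a region subtag when the content follows
a national orthography, a variant subtag when it records a spoken
dialect, and both for a transcription. The function make_tag and its
parameter names are made up for illustration, and "scouse" is used only
as an example dialect name. The patterns check the shape of region and
variant subtags as given in the RFC 4646 grammar; they do not check
whether a subtag is actually registered.

```python
import re

# Shapes only: 2-3 letter language, 2-letter or 3-digit region,
# variant of 5-8 alphanumerics or 4 starting with a digit.
_LANGUAGE = re.compile(r"^[a-z]{2,3}$")
_REGION = re.compile(r"^(?:[A-Z]{2}|[0-9]{3})$")
_VARIANT = re.compile(r"^(?:[a-z0-9]{5,8}|[0-9][a-z0-9]{3})$")


def make_tag(language, orthography_region=None, dialect=None):
    """Build a tag: region for orthography, variant for spoken dialect."""
    lang = language.lower()
    if not _LANGUAGE.match(lang):
        raise ValueError(f"not a 2-3 letter language subtag: {language!r}")
    parts = [lang]
    if orthography_region is not None:
        region = orthography_region.upper()
        if not _REGION.match(region):
            raise ValueError(f"not a region subtag: {orthography_region!r}")
        parts.append(region)
    if dialect is not None:
        variant = dialect.lower()
        if not _VARIANT.match(variant):
            raise ValueError(f"not shaped like a variant: {dialect!r}")
        parts.append(variant)
    return "-".join(parts)


print(make_tag("en", orthography_region="GB"))                    # written text
print(make_tag("en", dialect="scouse"))                           # audio
print(make_tag("en", orthography_region="US", dialect="scouse"))  # transcript
```

The three calls correspond to the three cases in the quoted paragraph:
written content, spoken or audio content, and speech in one dialect
transcribed into another country's orthography.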

- Nicholas.