
8-bit characters

<smg4a2$boh$1@gioia.aioe.org>


https://www.novabbs.com/computers/article-flat.php?id=18630&group=comp.os.vms#18630

Path: i2pn2.org!i2pn.org!aioe.org!cI9PH+Y/yCUymCTCKdsldQ.user.46.165.242.75.POSTED!not-for-mail
From: hel...@asclothestro.multivax.de (Phillip Helbig (undress to reply)
Newsgroups: comp.os.vms
Subject: 8-bit characters
Date: Wed, 10 Nov 2021 09:44:34 -0000 (UTC)
Organization: Multivax C&R
Message-ID: <smg4a2$boh$1@gioia.aioe.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-15
Content-Transfer-Encoding: 8bit
Injection-Info: gioia.aioe.org; logging-data="12049"; posting-host="cI9PH+Y/yCUymCTCKdsldQ.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
X-Notice: Filtered by postfilter v. 0.9.2
 by: Phillip Helbig (undr - Wed, 10 Nov 2021 09:44 UTC

Having to write some Icelanding words in a DECterm (as one does), I
notice that COMPOSE-T-H and COMPOSE-t-h create upper and lower case
thorn (Þ þ if those characters get through). If entered by c
both create the character, unless it is at the beginning of a line, in
which case one sees <XDE> or <XFE> (one character, displayed as
several). ASCII values are 222 and 254. Refreshing the screen also
causes the mnemonics to appear. Also, they are not displayed via
HELP FORTRAN CHAR DEC.

Any deeper reason or just flaky instrumentation?

I also notice that × (COMPOSE-x-x) works fine in a DECterm but not on a
real VT220 (where most or all other composed characters work). Again,
deeper meaning or just flaky?

Re: 8-bit characters

<smh1k8$ul5$1@dont-email.me>


https://www.novabbs.com/computers/article-flat.php?id=18641&group=comp.os.vms#18641

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: seaoh...@hoffmanlabs.invalid (Stephen Hoffman)
Newsgroups: comp.os.vms
Subject: Re: 8-bit characters
Date: Wed, 10 Nov 2021 13:04:56 -0500
Organization: HoffmanLabs LLC
Lines: 39
Message-ID: <smh1k8$ul5$1@dont-email.me>
References: <smg4a2$boh$1@gioia.aioe.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: reader02.eternal-september.org; posting-host="c0a3398fa37ffe8bb533ab2b6296f8fa";
logging-data="31397"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+rQ9YgGppNSo1jxwRjHD9Lqdm4qhtY81w="
User-Agent: Unison/2.2
Cancel-Lock: sha1:dZLlcd1luks8+o44Q8KQlFj6Ofk=
 by: Stephen Hoffman - Wed, 10 Nov 2021 18:04 UTC

On 2021-11-10 09:44:34 +0000, Phillip Helbig (undress to reply said:

> Having to write some Icelanding words in a DECterm (as one does), I
> notice that COMPOSE-T-H and COMPOSE-t-h create upper and lower case
> thorn (Þ þ if those characters get through). If entered by cboth
> create the character, unless it is at the beginning of a line, in which
> case one sees <XDE> or <XFE> (one character, displayed as several).
> ASCII values are 222 and 254. Refreshing the screen also causes the
> mnemonics to appear. Also, they are not displayed via HELP FORTRAN
> CHAR DEC.
>
> Any deeper reason or just flaky instrumentation?
>
> I also notice that × (COMPOSE-x-x) works fine in a DECterm but not on a
> real VT220 (where most or all other composed characters work). Again,
> deeper meaning or just flaky?

You're definitely not looking at ASCII, and AFAIK Þ and þ aren't in DEC
MCS, which likely means you're looking at inconsistent handling of or
inconsistent configuration of ISO 8859-1 among your apps and OS and
hardware; I'd guess some here is MCS, and some 8859-1.
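
A minimal Python sketch of the point above, not something taken from the thread: it assumes the bytes in question are the values 222 and 254 reported in the original post, and checks them against ASCII and ISO 8859-1. The variable names are illustrative only.

```python
import unicodedata

# The two byte values reported in the original post.
raw = bytes([222, 254])  # 0xDE, 0xFE

# ASCII only covers 0..127, so decoding these bytes as ASCII fails.
try:
    raw.decode("ascii")
    is_ascii = True
except UnicodeDecodeError:
    is_ascii = False

# ISO 8859-1 assigns both positions: capital and small thorn.
names = [unicodedata.name(ch) for ch in raw.decode("iso-8859-1")]

print(is_ascii)
print(names)
```

Printing the Unicode names rather than the characters keeps the output readable even on a terminal that cannot display them.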

You've asked variations of this question over the years too, usually
involving trying to use EDT past ASCII or maybe past DEC MCS.

https://groups.google.com/g/comp.os.vms/c/QAQAyRo9BPM/m/IrmCw1UJBQAJ
https://groups.google.com/g/comp.os.vms/c/Yji2Tufvv7k/m/mhUy-zKXAAAJ
etc.

This is part of the (lack of) UTF-8 and Unicode support in OpenVMS and
its tooling that I've grumbled about. Not that adding UTF-8 and Unicode
support is ever going to be a small overhaul.

--
Pure Personal Opinion | HoffmanLabs LLC

Re: 8-bit characters

<smhnmj$2d8$1@dont-email.me>


https://www.novabbs.com/computers/article-flat.php?id=18648&group=comp.os.vms#18648

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: jan-erik...@telia.com (Jan-Erik Söderholm)
Newsgroups: comp.os.vms
Subject: Re: 8-bit characters
Date: Thu, 11 Nov 2021 01:21:38 +0100
Organization: A noiseless patient Spider
Lines: 61
Message-ID: <smhnmj$2d8$1@dont-email.me>
References: <smg4a2$boh$1@gioia.aioe.org> <smh1k8$ul5$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 11 Nov 2021 00:21:39 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="29c300edc1d68295e32d39043d3ad3ce";
logging-data="2472"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX189/Wbmai7gYE7JBXU2brCG"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.2.1
Cancel-Lock: sha1:kyyhFw5mmdKUrcp84ymij9oIeTw=
In-Reply-To: <smh1k8$ul5$1@dont-email.me>
Content-Language: sv
 by: Jan-Erik Söderholm - Thu, 11 Nov 2021 00:21 UTC

Den 2021-11-10 kl. 19:04, skrev Stephen Hoffman:
> On 2021-11-10 09:44:34 +0000, Phillip Helbig (undress to reply said:
>
>> Having to write some Icelanding words in a DECterm (as one does), I
>> notice that COMPOSE-T-H and COMPOSE-t-h create upper and lower case thorn
>> (Þ þ if those characters get through).  If entered by cboth create the
>> character, unless it is at the beginning of a line, in which case one
>> sees <XDE> or <XFE> (one character, displayed as several). ASCII values
>> are 222 and 254.  Refreshing the screen also causes the mnemonics to
>> appear.  Also, they are not displayed via HELP FORTRAN CHAR DEC.
>>
>> Any deeper reason or just flaky instrumentation?
>>
>> I also notice that × (COMPOSE-x-x) works fine in a DECterm but not on a
>> real VT220 (where most or all other composed characters work).  Again,
>> deeper meaning or just flaky?
>
> You're definitely not looking at ASCII, and AFAIK Þ and þ aren't in DEC
> MCS, which likely means you're looking at inconsistent handling of or
> inconsistent configuration of ISO 8859-1 among your apps and OS and
> hardware; I'd guess some here is MCS, and some 8859-1.
>
> You've asked variations of this question over the years too, usually
> involving trying to use EDT past ASCII or maybe past DEC MCS.
>
> https://groups.google.com/g/comp.os.vms/c/QAQAyRo9BPM/m/IrmCw1UJBQAJ
> https://groups.google.com/g/comp.os.vms/c/Yji2Tufvv7k/m/mhUy-zKXAAAJ
> etc.
>
> This is part of the (lack of) UTF-8 and Unicode support in OpenVMS and its
> tooling that I've grumbled about. Not that adding UTF-8 and Unicode support is
> ever going to be a small overhaul.
>

Now, UTF8 is just a "row of bytes", so if you use (as an example) Putty
in its default setup using UTF8, you can type (or copy/paste) any UTF8
character into Putty and it will be stored using whatever editor you
are using. It is just a row of bytes, so there is no specific need for
any "UTF8 support" for doing just that.

Later on, if you send the same text to some UTF8-compatible display (like
another Putty session using the default UTF8 setup, or a web browser using
UTF8 encoding) the Icelandic characters would be displayed just fine.

But if you are using some display tool that doesn't support UTF8, you
will get garbled text, of course. But that is not the fault of OpenVMS.
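
A small Python sketch of that round trip, under the assumption that the "display tool that doesn't support UTF8" interprets the stored bytes as ISO 8859-1 (the thread does not say which encoding such a tool would use):

```python
text = "\u00fe"  # small thorn

# Storing the character: UTF-8 turns it into two plain bytes.
utf8_bytes = text.encode("utf-8")

# A UTF-8 aware display decodes the same bytes back to one character.
round_trip = utf8_bytes.decode("utf-8")

# A display that assumes ISO 8859-1 shows one character per byte instead.
garbled = utf8_bytes.decode("iso-8859-1")

print(utf8_bytes)       # the two stored bytes
print(ascii(round_trip))
print(ascii(garbled))   # two unrelated Latin-1 characters
```

The stored bytes are identical in both cases; only the interpretation at display time differs.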

It is unclear if ISO/IEC 646 has/had support for Icelandic characters;
the Wiki page has an entry for "IS" in some tables but no real data.
https://en.wikipedia.org/wiki/ISO/IEC_646

Then of course, it is another matter entirely if you are talking about
UTF8 support for symbols/variables in compilers or in file/directory
names, but that is a totally different area from just storing and
displaying some "data" that happens to include UTF8 sequences.
But that isn't in the scope of the question asked.

I would not expect tools like DECterm or VT220 (really?) to handle
UTF8 or anything else outside the DEC-MCS range of characters. If you
need that, simply use a modern tool from the last 20 years or so.

Re: 8-bit characters

<smhrjq$nee$1@dont-email.me>


https://www.novabbs.com/computers/article-flat.php?id=18650&group=comp.os.vms#18650

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: seaoh...@hoffmanlabs.invalid (Stephen Hoffman)
Newsgroups: comp.os.vms
Subject: Re: 8-bit characters
Date: Wed, 10 Nov 2021 20:28:26 -0500
Organization: HoffmanLabs LLC
Lines: 31
Message-ID: <smhrjq$nee$1@dont-email.me>
References: <smg4a2$boh$1@gioia.aioe.org> <smh1k8$ul5$1@dont-email.me> <smhnmj$2d8$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: reader02.eternal-september.org; posting-host="b0e94a38523dd05255dc81656ae88882";
logging-data="24014"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/VdpMNla1sF4JUHqZSBqZrSLIEhq0hdWs="
User-Agent: Unison/2.2
Cancel-Lock: sha1:DNfloVxtspyXjmnilUxxSgpQdG0=
 by: Stephen Hoffman - Thu, 11 Nov 2021 01:28 UTC

On 2021-11-11 00:21:38 +0000, Jan-Erik Söderholm said:

> Now, UTF8 is just a "row of bytes", so if you use (as an example) Putty
> in its default setup using UTF8, you can type (or copy/paste) any UTF8
> character into Putty and it will be stored using whatever editor you
> are using. It is just a row of bytes, so there is no specific need for
> any "UTF8 support" for doing just that.

You're quite possibly headed for a few surprises within OpenVMS apps,
even if cutting-and-pasting wads of bytes around. Not the least of
which involves counting characters/code points/clusters (one byte is no
longer one character, so is the app looking for the buffer size or the
character/code point/cluster length, and what to do with the zero-width
stuff?), the fun that is directionality (there's a recent CVE related
to this), and identifying the string encoding and the string language
for each string, normalization, and the inherent language-sensitivity
of strings for purposes such as sorting. In aggregate, some baked-in
app and OpenVMS API assumptions—and developers' own assumptions—about
strings can and will break. Sure, UTF-8 is a "row of bytes", with some
caveats. Sort of. Mostly.
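
A hedged Python sketch of a few of those surprises. The sample strings are made up for illustration and nothing here is OpenVMS-specific; it only shows how byte counts, code point counts, normalization, and naive sorting can disagree.

```python
import unicodedata

composed = "\u00e9"      # e-acute as a single code point
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

# (code points, UTF-8 bytes) for two strings that look the same on screen.
counts = {
    "composed": (len(composed), len(composed.encode("utf-8"))),
    "decomposed": (len(decomposed), len(decomposed.encode("utf-8"))),
}

# Without normalization the two spellings compare unequal.
same_raw = composed == decomposed
same_nfc = unicodedata.normalize("NFC", decomposed) == composed

# A zero-width space takes storage but no visible column.
with_zwsp = "a\u200bb"
zwsp_counts = (len(with_zwsp), len(with_zwsp.encode("utf-8")))

# Plain sorting goes by code point and ignores any language's rules.
by_code_point = sorted(["\u00e4", "z"])

print(counts, same_raw, same_nfc, zwsp_counts)
print([ascii(s) for s in by_code_point])
```

Neither count matches what a reader would call "one character" in every case, which is why a buffer size and a display width need to be tracked separately.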

There are a few corners of OpenVMS that have some UTF-8 support, one is
the XQP. Another is C including the I18N bits. Java. I'd expect that
pervasive support is at least a decade away.

--
Pure Personal Opinion | HoffmanLabs LLC

Re: 8-bit characters

<618c75e8$0$704$14726298@news.sunsite.dk>


https://www.novabbs.com/computers/article-flat.php?id=18652&group=comp.os.vms#18652

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!feeder1.feed.usenet.farm!feed.usenet.farm!news.uzoreto.com!dotsrc.org!filter.dotsrc.org!news.dotsrc.org!not-for-mail
Date: Wed, 10 Nov 2021 20:46:10 -0500
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.3.0
Subject: Re: 8-bit characters
Content-Language: en-US
Newsgroups: comp.os.vms
References: <smg4a2$boh$1@gioia.aioe.org> <smh1k8$ul5$1@dont-email.me>
<smhnmj$2d8$1@dont-email.me>
From: arn...@vajhoej.dk (Arne Vajhøj)
In-Reply-To: <smhnmj$2d8$1@dont-email.me>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Lines: 36
Message-ID: <618c75e8$0$704$14726298@news.sunsite.dk>
Organization: SunSITE.dk - Supporting Open source
NNTP-Posting-Host: 36fffe1e.news.sunsite.dk
X-Trace: 1636595176 news.sunsite.dk 704 arne@vajhoej.dk/68.9.63.232:59099
X-Complaints-To: staff@sunsite.dk
 by: Arne Vajhøj - Thu, 11 Nov 2021 01:46 UTC

On 11/10/2021 7:21 PM, Jan-Erik Söderholm wrote:
> Den 2021-11-10 kl. 19:04, skrev Stephen Hoffman:
>> This is part of the (lack of) UTF-8 and Unicode support in OpenVMS and
>> its tooling that I've grumbled about. Not that adding UTF-8 and Unicode
>> support is ever going to be a small overhaul.
>
> Now, UTF8 is just a "row of bytes", so if you use (as an example) Putty
> in its default setup using UTF8, you can type (or copy/paste) any UTF8
> character into Putty and it will be stored using whatever editor you
> are using. It is just a row of bytes, so there is no specific need for
> any "UTF8 support" for doing just that.
>
> Later on, if you send the same text to some UTF8-compatible display (like
> another Putty session using the default UTF8 setup, or a web browser using
> UTF8 encoding) the Icelandic characters would be displayed just fine.
>
> But if you are using some display tool that doesn't support UTF8, you
> will get garbled text, of course. But that is not the fault of OpenVMS.

The biggest problems with UTF-8 are that the byte length is not
necessarily the character length and that byte index i is not character
index i (and worse byte index i may not even point to a character at all
if it hits in the middle of a multi-byte sequence).
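
A minimal Python sketch of the indexing problem, using a made-up four-letter sample string that starts with a capital thorn:

```python
text = "\u00deing"            # capital thorn followed by "ing"
data = text.encode("utf-8")

char_len = len(text)          # length in characters (code points)
byte_len = len(data)          # length in bytes

# Character index 1 is "i", but byte index 1 is the second byte of the thorn.
char_at_1 = text[1]
byte_at_1 = data[1]
is_continuation = (byte_at_1 & 0xC0) == 0x80  # UTF-8 continuation bytes are 10xxxxxx

# Cutting the byte string at that index leaves something that is not valid UTF-8.
try:
    data[1:].decode("utf-8")
    cut_is_valid = True
except UnicodeDecodeError:
    cut_is_valid = False

print(char_len, byte_len, char_at_1, hex(byte_at_1), is_continuation, cut_is_valid)
```

Code that truncates or splits a fixed-size byte buffer has to back up to a byte that is not a continuation byte before cutting.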

> It is unclear if ISO/IEC 646 has/had support for Icelandic characters;
> the Wiki page has an entry for "IS" in some tables but no real data.
> https://en.wikipedia.org/wiki/ISO/IEC_646

http://www.kreativekorp.com/charset/encoding/ISO646IS/

https://www.freeutils.net/source/jcharset/ &
https://www.mvndoc.com/c/net.freeutils/jcharset/net/freeutils/charset/iso646/ISO646ISCharset.html

seem to indicate that it exists.

Arne

Re: 8-bit characters

<b5b56879-05bb-4d17-9242-2fcb582cfec8n@googlegroups.com>


https://www.novabbs.com/computers/article-flat.php?id=18654&group=comp.os.vms#18654

X-Received: by 2002:a37:b5c4:: with SMTP id e187mr3858462qkf.27.1636606120096;
Wed, 10 Nov 2021 20:48:40 -0800 (PST)
X-Received: by 2002:a0c:f8cc:: with SMTP id h12mr4327360qvo.6.1636606119914;
Wed, 10 Nov 2021 20:48:39 -0800 (PST)
Path: i2pn2.org!rocksolid2!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.os.vms
Date: Wed, 10 Nov 2021 20:48:39 -0800 (PST)
In-Reply-To: <618c75e8$0$704$14726298@news.sunsite.dk>
Injection-Info: google-groups.googlegroups.com; posting-host=118.92.47.103; posting-account=Rx7iEQoAAACMdczcZGHsDFakQWn8-8-t
NNTP-Posting-Host: 118.92.47.103
References: <smg4a2$boh$1@gioia.aioe.org> <smh1k8$ul5$1@dont-email.me>
<smhnmj$2d8$1@dont-email.me> <618c75e8$0$704$14726298@news.sunsite.dk>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <b5b56879-05bb-4d17-9242-2fcb582cfec8n@googlegroups.com>
Subject: Re: 8-bit characters
From: lawrence...@gmail.com (Lawrence D’Oliveiro)
Injection-Date: Thu, 11 Nov 2021 04:48:40 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Lines: 5
 by: Lawrence D’Oliveir - Thu, 11 Nov 2021 04:48 UTC

On Thursday, November 11, 2021 at 3:33:33 PM UTC+13, Arne Vajhøj wrote:
> The biggest problems with UTF-8 are that the byte length is not
> necessarily the character length ...

That would be true of any Unicode encoding, even UCS-4.
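
One way to read that remark, sketched in Python under the assumption that "character" means what a reader perceives as a single letter: even with a fixed four bytes per code point, a letter built from a base character plus a combining mark still spans more than one unit.

```python
# "e" plus COMBINING ACUTE ACCENT: displayed as one letter.
s = "e\u0301"

# UTF-32 (the encoding form usually equated with UCS-4) uses 4 bytes per code point.
utf32 = s.encode("utf-32-be")   # big-endian, no byte order mark

byte_len = len(utf32)
code_points = byte_len // 4

print(byte_len, code_points)
```

Python's standard library counts code points, not user-perceived characters, so getting the "one letter" answer needs a grapheme segmentation step that is not shown here.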

Re: 8-bit characters

<smiagb$1u1f$1@gioia.aioe.org>


https://www.novabbs.com/computers/article-flat.php?id=18655&group=comp.os.vms#18655

Path: i2pn2.org!i2pn.org!aioe.org!cI9PH+Y/yCUymCTCKdsldQ.user.46.165.242.75.POSTED!not-for-mail
From: hel...@asclothestro.multivax.de (Phillip Helbig (undress to reply)
Newsgroups: comp.os.vms
Subject: Re: 8-bit characters
Date: Thu, 11 Nov 2021 05:42:35 -0000 (UTC)
Organization: Multivax C&R
Message-ID: <smiagb$1u1f$1@gioia.aioe.org>
References: <smg4a2$boh$1@gioia.aioe.org> <smh1k8$ul5$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-15
Content-Transfer-Encoding: 8bit
Injection-Info: gioia.aioe.org; logging-data="63535"; posting-host="cI9PH+Y/yCUymCTCKdsldQ.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
X-Notice: Filtered by postfilter v. 0.9.2
 by: Phillip Helbig (undr - Thu, 11 Nov 2021 05:42 UTC

In article <smh1k8$ul5$1@dont-email.me>, Stephen Hoffman
<seaohveh@hoffmanlabs.invalid> writes:

> On 2021-11-10 09:44:34 +0000, Phillip Helbig (undress to reply said:
>
> > Having to write some Icelanding words in a DECterm (as one does), I

"Icelandic" of course.

> > notice that COMPOSE-T-H and COMPOSE-t-h create upper and lower case
> > thorn (Þ þ if those characters get through). If entered by cboth
> > create the character, unless it is at the beginning of a line, in which
> > case one sees <XDE> or <XFE> (one character, displayed as several).
> > ASCII values are 222 and 254. Refreshing the screen also causes the
> > mnemonics to appear. Also, they are not displayed via HELP FORTRAN
> > CHAR DEC.
> >
> > Any deeper reason or just flaky instrumentation?
> >
> > I also notice that × (COMPOSE-x-x) works fine in a DECterm but not on a
> > real VT220 (where most or all other composed characters work). Again,
> > deeper meaning or just flaky?
>
> You're definitely not looking at ASCII, and AFAIK Þ and þ aren't
> in DEC
> MCS,

At least HELP FORTRAN CHAR DEC doesn't show them.

> which likely means you're looking at inconsistent handling of or
> inconsistent configuration of ISO 8859-1 among your apps and OS and
> hardware; I'd guess some here is MCS, and some 8859-1.

Only LK411, Alpha hardware and DECterm (under CDE, but that's probably
irrelevant). Maybe they are inconsistent. :-|

> You've asked variations of this question over the years too, usually
> involving trying to use EDT past ASCII or maybe past DEC MCS.

Yes. :-)

Re: 8-bit characters

<smialu$1u1f$2@gioia.aioe.org>


https://www.novabbs.com/computers/article-flat.php?id=18656&group=comp.os.vms#18656

Path: i2pn2.org!i2pn.org!aioe.org!cI9PH+Y/yCUymCTCKdsldQ.user.46.165.242.75.POSTED!not-for-mail
From: hel...@asclothestro.multivax.de (Phillip Helbig (undress to reply)
Newsgroups: comp.os.vms
Subject: Re: 8-bit characters
Date: Thu, 11 Nov 2021 05:45:34 -0000 (UTC)
Organization: Multivax C&R
Message-ID: <smialu$1u1f$2@gioia.aioe.org>
References: <smg4a2$boh$1@gioia.aioe.org> <smh1k8$ul5$1@dont-email.me> <smhnmj$2d8$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-15
Content-Transfer-Encoding: 8bit
Injection-Info: gioia.aioe.org; logging-data="63535"; posting-host="cI9PH+Y/yCUymCTCKdsldQ.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
X-Notice: Filtered by postfilter v. 0.9.2
 by: Phillip Helbig (undr - Thu, 11 Nov 2021 05:45 UTC

In article <smhnmj$2d8$1@dont-email.me>,
Jan-Erik Söderholm <jan-erik.soderholm@telia.com>
writes:

> Now, UTF8 is just a "row of bytes", so if you use (as an example) Putty
> in its default setup using UTF8, you can type (or copy/paste) any UTF8
> character into Putty and it will be stored using whatever editor you
> are using. It is just a row of bytes, so there is no specific need for
> any "UTF8 support" for doing just that.
>
> Later on, if you send the same text to some UTF8-compatible display (like
> another Putty session using the default UTF8 setup, or a web browser using
> UTF8 encoding) the Icelandic characters would be displayed just fine.
>
> But if you are using some display tool that doesn't support UTF8, you
> will get garbled text, of course. But that is not the fault of OpenVMS.
>
> It is unclear if ISO/IEC 646 has/had support for Icelandic characters;
> the Wiki page has an entry for "IS" in some tables but no real data.
> https://en.wikipedia.org/wiki/ISO/IEC_646
>
> Then of course, it is another matter entirely if you are talking about
> UTF8 support for symbols/variables in compilers or in file/directory
> names, but that is a totally different area from just storing and
> displaying some "data" that happens to include UTF8 sequences.
> But that isn't in the scope of the question asked.

Right; just a text file.

> I would not expect tools like DECterm or VT220 (really?)

Sorry, VT320. :-)

> to handle
> UTF8 or anything else outside the DEC-MCS range of characters. If you
> need that, simply use a modern tool from the last 20 years or so.

I don't expect anything more than MCS. I'm just wondering why in a
DECterm it is sometimes displayed correctly and sometimes not.

Re: 8-bit characters

<smicj8$h9r$1@gioia.aioe.org>


https://www.novabbs.com/computers/article-flat.php?id=18657&group=comp.os.vms#18657

Path: i2pn2.org!i2pn.org!aioe.org!Uh3cGLv3BUP05xA/L7flqA.user.46.165.242.75.POSTED!not-for-mail
From: moro...@world.std.spaamtrap.com (Michael Moroney)
Newsgroups: comp.os.vms
Subject: Re: 8-bit characters
Date: Thu, 11 Nov 2021 01:18:17 -0500
Organization: Aioe.org NNTP Server
Message-ID: <smicj8$h9r$1@gioia.aioe.org>
References: <smg4a2$boh$1@gioia.aioe.org> <smh1k8$ul5$1@dont-email.me>
<smiagb$1u1f$1@gioia.aioe.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-15; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: gioia.aioe.org; logging-data="17723"; posting-host="Uh3cGLv3BUP05xA/L7flqA.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:78.0) Gecko/20100101
Thunderbird/78.14.0
X-Notice: Filtered by postfilter v. 0.9.2
Content-Language: en-US
 by: Michael Moroney - Thu, 11 Nov 2021 06:18 UTC

On 11/11/2021 12:42 AM, Phillip Helbig (undress to reply) wrote:
> In article <smh1k8$ul5$1@dont-email.me>, Stephen Hoffman
> <seaohveh@hoffmanlabs.invalid> writes:
>
>> On 2021-11-10 09:44:34 +0000, Phillip Helbig (undress to reply said:
>>
>>> Having to write some Icelanding words in a DECterm (as one does), I
>
> "Icelandic" of course.
>
>>> notice that COMPOSE-T-H and COMPOSE-t-h create upper and lower case
>>> thorn (Þ þ if those characters get through). If entered by cboth
>>> create the character, unless it is at the beginning of a line, in which
>>> case one sees <XDE> or <XFE> (one character, displayed as several).
>>> ASCII values are 222 and 254. Refreshing the screen also causes the
>>> mnenonics to appear. Also, they are not displayed via HELP FORTRAN
>>> CHAR DEC.
>>>
>>> Any deeper reason or just flaky instrumentation?
>>>
>>> I also notice that × (COMPOSE-x-x) works fine in a DECterm but not on a
>>> real VT220 (where most or all other composed characters work). Again,
>>> deeper meaning or just flaky?
>>
>> You're definitely not looking at ASCII, and AFAIK Þ and þ aren't in DEC
>> MCS,
>
> At least HELP FORTRAN CHAR DEC doesn't show them.
>
>> which likely means you're looking at inconsistent handling of or
>> inconsistent configuration of ISO 8859-1 among your apps and OS and
>> hardware; I'd guess some here is MCS, and some 8859-1.
>
> Only LK411, Alpha hardware and DECterm (under CDE, but that's probably
> irrelevant). Maybe they are inconsistent. :-|
>
>> You've asked variations of this question over the years too, usually
>> involving trying to use EDT past ASCII or maybe past DEC MCS.
>
> Yes. :-)
>
The character set ISO-8859-1 is almost the same as DEC-MCS, with some of
the undefined DEC-MCS positions being defined in ISO-8859-1. The
exceptions are a few rarely used characters such as Œ and Ÿ, which
DEC-MCS has and ISO-8859-1 lacks. Specifically, ISO-8859-1 has Icelandic
Þ and þ; these positions are undefined in DEC-MCS. 99% of the time one
can use ISO-8859-1 instead of DEC-MCS and get away with it.
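
As a quick check of the byte values discussed here, a minimal Python sketch can decode them as ISO-8859-1. Python ships no DEC-MCS codec as far as I know, so the `DEC_MCS_NOTES` table below is a hand-written assumption based on the published DEC-MCS chart, and the helper name `latin1_char` is only illustrative.

```python
# Byte values mentioned in this thread, decoded as ISO-8859-1 (Latin-1).
THREAD_BYTES = [0xDE, 0xFE, 0xD7]  # 222, 254 and the COMPOSE-x-x character


def latin1_char(byte_value: int) -> str:
    """Return the ISO-8859-1 character stored at one byte value."""
    return bytes([byte_value]).decode("iso8859_1")


# Assumption: copied by hand from the DEC-MCS chart, not from any library.
DEC_MCS_NOTES = {
    0xDE: "undefined",
    0xFE: "undefined",
    0xD7: "capital OE ligature",
}

for value in THREAD_BYTES:
    print(
        f"0x{value:02X} ({value}): ISO-8859-1 {latin1_char(value)!r}, "
        f"DEC-MCS {DEC_MCS_NOTES[value]}"
    )
```

If the table is right, it would also be consistent with × working in a DECterm but not on a real VT220: the same byte 0xD7 is a different character in the two sets.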

There is an EDT patch which makes it more ISO-8859-1 friendly, actually
prompted by a customer who used EDT for strictly ASCII except for a
character at the 'þ' position (but not þ). EDT fans may want the patch
for its ability to understand terminals with more than 24 lines.

Re: 8-bit characters

<smidul$sua$1@gioia.aioe.org>

https://www.novabbs.com/computers/article-flat.php?id=18658&group=comp.os.vms#18658

Newsgroups: comp.os.vms
Path: i2pn2.org!i2pn.org!aioe.org!Uh3cGLv3BUP05xA/L7flqA.user.46.165.242.75.POSTED!not-for-mail
From: moro...@world.std.spaamtrap.com (Michael Moroney)
Newsgroups: comp.os.vms
Subject: Re: 8-bit characters
Date: Thu, 11 Nov 2021 01:41:27 -0500
Organization: Aioe.org NNTP Server
Message-ID: <smidul$sua$1@gioia.aioe.org>
References: <smg4a2$boh$1@gioia.aioe.org> <smh1k8$ul5$1@dont-email.me>
<smhnmj$2d8$1@dont-email.me> <smialu$1u1f$2@gioia.aioe.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-15; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Info: gioia.aioe.org; logging-data="29642"; posting-host="Uh3cGLv3BUP05xA/L7flqA.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:78.0) Gecko/20100101
Thunderbird/78.14.0
X-Notice: Filtered by postfilter v. 0.9.2
Content-Language: en-US
 by: Michael Moroney - Thu, 11 Nov 2021 06:41 UTC

On 11/11/2021 12:45 AM, Phillip Helbig (undress to reply) wrote:
> In article <smhnmj$2d8$1@dont-email.me>,
> Jan-Erik Söderholm <jan-erik.soderholm@telia.com>
> writes:

>> to handle
>> UTF8 or anything else outside the DEC-MCS range of characters. If you
>> need that, simply use modern tool from the last 20 years or so.
>
> I don't expect anything more than MCS. I'm just wondering why in a
> DECterm it is sometimes displayed correctly and sometimes not.
>

If you're talking about how EDT behaves, it's because EDT was
inconsistent about whether that character position was printable. When
it considered the position not printable, it treated it much like a
control code (displaying it as <Xnn>). See my other reply.
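
The <Xnn> display described above can be sketched in a few lines of Python. This is not EDT's code: the function `render` and the two "printable" sets are hypothetical, chosen only to show how the same byte can appear either as a character or as a hex mnemonic depending on which table is consulted.

```python
def render(data: bytes, printable: set) -> str:
    """Show printable bytes as Latin-1 characters and the rest as <Xnn>."""
    parts = []
    for value in data:
        if value in printable:
            parts.append(bytes([value]).decode("iso8859_1"))
        else:
            parts.append(f"<X{value:02X}>")
    return "".join(parts)


# Hypothetical tables: ASCII graphics plus 0xA1-0xFF, with and without
# the two thorn positions marked as printable.
ASCII_GRAPHICS = set(range(0x20, 0x7F))
WITH_THORN = ASCII_GRAPHICS | set(range(0xA1, 0x100))
WITHOUT_THORN = WITH_THORN - {0xDE, 0xFE}

line = b"\xfeing"  # small thorn followed by "ing"
print(render(line, WITH_THORN))
print(render(line, WITHOUT_THORN))
```

With the first table the thorn is shown as a character; with the second the line starts with <XFE>, one byte displayed as five characters.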

Re: 8-bit characters

<smij6j$ndd$1@gioia.aioe.org>

https://www.novabbs.com/computers/article-flat.php?id=18659&group=comp.os.vms#18659

Newsgroups: comp.os.vms
Path: i2pn2.org!i2pn.org!aioe.org!cI9PH+Y/yCUymCTCKdsldQ.user.46.165.242.75.POSTED!not-for-mail
From: hel...@asclothestro.multivax.de (Phillip Helbig (undress to reply)
Newsgroups: comp.os.vms
Subject: Re: 8-bit characters
Date: Thu, 11 Nov 2021 08:10:59 -0000 (UTC)
Organization: Multivax C&R
Message-ID: <smij6j$ndd$1@gioia.aioe.org>
References: <smg4a2$boh$1@gioia.aioe.org> <smh1k8$ul5$1@dont-email.me> <smiagb$1u1f$1@gioia.aioe.org> <smicj8$h9r$1@gioia.aioe.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-15
Content-Transfer-Encoding: 8bit
Injection-Info: gioia.aioe.org; logging-data="23981"; posting-host="cI9PH+Y/yCUymCTCKdsldQ.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
X-Notice: Filtered by postfilter v. 0.9.2
 by: Phillip Helbig (undr - Thu, 11 Nov 2021 08:10 UTC

In article <smicj8$h9r$1@gioia.aioe.org>, Michael Moroney
<moroney@world.std.spaamtrap.com> writes:

> >>> notice that COMPOSE-T-H and COMPOSE-t-h create upper and lower case
> >>> thorn (Þ þ if those characters get through). If entered by both
> >>> create the character, unless it is at the beginning of a line, in which
> >>> case one sees <XDE> or <XFE> (one character, displayed as several).
> >>> ASCII values are 222 and 254. Refreshing the screen also causes the
> >>> mnenonics to appear. Also, they are not displayed via HELP FORTRAN
> >>> CHAR DEC.
> >>>
> >>> Any deeper reason or just flaky instrumentation?
> >>>
> >>> I also notice that × (COMPOSE-x-x) works fine in a DECterm but not on a
> >>> real VT220 (where most or all other composed characters work). Again,
> >>> deeper meaning or just flaky?
> >>
> >> You're definitely not looking at ASCII, and AFAIK Þ and þ aren't in DEC
> >> MCS,
> >
> > At least HELP FORTRAN CHAR DEC doesn't show them.
> >
> >> which likely means you're looking at inconsistent handling of or
> >> inconsistent configuration of ISO 8859-1 among your apps and OS and
> >> hardware; I'd guess some here is MCS, and some 8859-1.
> >
> > Only LK411, Alpha hardware and DECterm (under CDE, but that's probably
> > irrelevant). Maybe they are inconsistent. :-|
> >
> >> You've asked variations of this question over the years too, usually
> >> involving trying to use EDT past ASCII or maybe past DEC MCS.
> >
> > Yes. :-)
> >
> The character set ISO-8859-1 is almost the same as DEC-MCS with some of
> the undefined DEC-MCS characters being defined in ISO-8859-1. The
> exceptions are a few rarely used characters such as Œ and Ÿ.
> Specifically, ISO-8859-1 has Icelandic Þ and þ, these positions are
> undefined in DEC-MCS. 99% of the time one can use ISO-8859-1 instead of
> DEC-MCS and get away with it.

Right. And ISO-8859-15 is also similar. I routinely write € in EDT to
get the Euro sign when most people read that text.

> There is an EDT patch which makes it more ISO-8859-1 friendly, actually
> prompted by a customer who used EDT for strictly ASCII except for a
> character at the 'þ' position (but not þ).

So the patch causes the wanted characters to be displayed? Of course,
one can enter any value in EDT.

> EDT fans may want the patch
> for its ability to understand terminals with more than 24 lines.

Will the patch become standard? Not that I need a terminal with more
than 24 lines. :-)

Re: 8-bit characters

<ad878819-1dc3-4f60-8e3d-8d844673235dn@googlegroups.com>

https://www.novabbs.com/computers/article-flat.php?id=18660&group=comp.os.vms#18660

Newsgroups: comp.os.vms
X-Received: by 2002:ac8:5711:: with SMTP id 17mr5815171qtw.138.1636618586444;
Thu, 11 Nov 2021 00:16:26 -0800 (PST)
X-Received: by 2002:ac8:5f46:: with SMTP id y6mr5606848qta.93.1636618586275;
Thu, 11 Nov 2021 00:16:26 -0800 (PST)
Path: i2pn2.org!i2pn.org!paganini.bofh.team!news.dns-netz.com!news.freedyn.net!newsreader4.netcologne.de!news.netcologne.de!peer02.ams1!peer.ams1.xlned.com!news.xlned.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.os.vms
Date: Thu, 11 Nov 2021 00:16:25 -0800 (PST)
In-Reply-To: <smicj8$h9r$1@gioia.aioe.org>
Injection-Info: google-groups.googlegroups.com; posting-host=118.92.47.103; posting-account=Rx7iEQoAAACMdczcZGHsDFakQWn8-8-t
NNTP-Posting-Host: 118.92.47.103
References: <smg4a2$boh$1@gioia.aioe.org> <smh1k8$ul5$1@dont-email.me>
<smiagb$1u1f$1@gioia.aioe.org> <smicj8$h9r$1@gioia.aioe.org>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <ad878819-1dc3-4f60-8e3d-8d844673235dn@googlegroups.com>
Subject: Re: 8-bit characters
From: lawrence...@gmail.com (Lawrence D’Oliveiro)
Injection-Date: Thu, 11 Nov 2021 08:16:26 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 1393
 by: Lawrence D’Oliveir - Thu, 11 Nov 2021 08:16 UTC

On Thursday, November 11, 2021 at 7:18:24 PM UTC+13, Michael Moroney wrote:
> Specifically, ISO-8859-1 has Icelandic Þ and þ ...

Don’t they also use Ð and ð?

Re: 8-bit characters

<618d4319$0$699$14726298@news.sunsite.dk>

https://www.novabbs.com/computers/article-flat.php?id=18661&group=comp.os.vms#18661

Newsgroups: comp.os.vms
Path: i2pn2.org!rocksolid2!news.neodome.net!feeder1.feed.usenet.farm!feed.usenet.farm!news.uzoreto.com!dotsrc.org!filter.dotsrc.org!news.dotsrc.org!not-for-mail
Date: Thu, 11 Nov 2021 11:21:42 -0500
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.3.0
Subject: Re: 8-bit characters
Content-Language: en-US
Newsgroups: comp.os.vms
References: <smg4a2$boh$1@gioia.aioe.org> <smh1k8$ul5$1@dont-email.me>
<smhnmj$2d8$1@dont-email.me> <618c75e8$0$704$14726298@news.sunsite.dk>
<b5b56879-05bb-4d17-9242-2fcb582cfec8n@googlegroups.com>
From: arn...@vajhoej.dk (Arne Vajhøj)
In-Reply-To: <b5b56879-05bb-4d17-9242-2fcb582cfec8n@googlegroups.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Lines: 21
Message-ID: <618d4319$0$699$14726298@news.sunsite.dk>
Organization: SunSITE.dk - Supporting Open source
NNTP-Posting-Host: 3cad2be6.news.sunsite.dk
X-Trace: 1636647706 news.sunsite.dk 699 arne@vajhoej.dk/68.9.63.232:55770
X-Complaints-To: staff@sunsite.dk
 by: Arne Vajhøj - Thu, 11 Nov 2021 16:21 UTC

On 11/10/2021 11:48 PM, Lawrence D’Oliveiro wrote:
> On Thursday, November 11, 2021 at 3:33:33 PM UTC+13, Arne Vajhøj wrote:
>> The biggest problems with UTF-8 is that the byte length is not
>> necessarily the character length ...
>
> That would be true of any Unicode encoding, even UCS-4.

No.

It is a practical problem in UTF-8 as everything not in ASCII is more
than 1 byte.

It is a theoretical problem in UTF-16 because there are defined Unicode
code points that become more than 2 bytes (they are just extremely
rare).

It is not a problem for UTF-32 as everything is 4 bytes.

Arne
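
The three cases in the post above can be checked with a short Python sketch. The sample characters and the helper name `encoded_lengths` are my own choices for illustration; the little-endian codec names are used only so that no byte order mark is added to the counts.

```python
# One ASCII letter, two non-ASCII BMP characters, one emoji outside the BMP.
SAMPLES = ["A", "\u00f8", "\u20ac", "\U0001F600"]


def encoded_lengths(text: str) -> dict:
    """Byte length of `text` in three Unicode encoding forms (no BOM)."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }


for ch in SAMPLES:
    print(f"U+{ord(ch):04X}", encoded_lengths(ch))
```

Every sample takes 4 bytes in UTF-32, while the UTF-8 length varies from 1 to 4 bytes and the UTF-16 length is 2 bytes except for the emoji, which needs 4.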

Re: 8-bit characters

<smjhmk$1q0u$1@gioia.aioe.org>

https://www.novabbs.com/computers/article-flat.php?id=18662&group=comp.os.vms#18662

Newsgroups: comp.os.vms
Path: i2pn2.org!i2pn.org!aioe.org!Uh3cGLv3BUP05xA/L7flqA.user.46.165.242.75.POSTED!not-for-mail
From: moro...@world.std.spaamtrap.com (Michael Moroney)
Newsgroups: comp.os.vms
Subject: Re: 8-bit characters
Date: Thu, 11 Nov 2021 11:51:34 -0500
Organization: Aioe.org NNTP Server
Message-ID: <smjhmk$1q0u$1@gioia.aioe.org>
References: <smg4a2$boh$1@gioia.aioe.org> <smh1k8$ul5$1@dont-email.me>
<smiagb$1u1f$1@gioia.aioe.org> <smicj8$h9r$1@gioia.aioe.org>
<ad878819-1dc3-4f60-8e3d-8d844673235dn@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: gioia.aioe.org; logging-data="59422"; posting-host="Uh3cGLv3BUP05xA/L7flqA.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:78.0) Gecko/20100101
Thunderbird/78.14.0
Content-Language: en-US
X-Notice: Filtered by postfilter v. 0.9.2
 by: Michael Moroney - Thu, 11 Nov 2021 16:51 UTC

On 11/11/2021 3:16 AM, Lawrence D’Oliveiro wrote:
> On Thursday, November 11, 2021 at 7:18:24 PM UTC+13, Michael Moroney wrote:
>> Specifically, ISO-8859-1 has Icelandic Þ and þ ...
>
> Don’t they also use Ð and ð?
>

Yes. ISO-8859-1 has many defined characters which are not defined in
DEC-MCS, including Ð and ð. I only mentioned þ because of the original
post and the EDT patch background.

Re: 8-bit characters

<smji9o$4vb$1@gioia.aioe.org>

https://www.novabbs.com/computers/article-flat.php?id=18663&group=comp.os.vms#18663

Newsgroups: comp.os.vms
Path: i2pn2.org!i2pn.org!aioe.org!Uh3cGLv3BUP05xA/L7flqA.user.46.165.242.75.POSTED!not-for-mail
From: moro...@world.std.spaamtrap.com (Michael Moroney)
Newsgroups: comp.os.vms
Subject: Re: 8-bit characters
Date: Thu, 11 Nov 2021 12:01:46 -0500
Organization: Aioe.org NNTP Server
Message-ID: <smji9o$4vb$1@gioia.aioe.org>
References: <smg4a2$boh$1@gioia.aioe.org> <smh1k8$ul5$1@dont-email.me>
<smiagb$1u1f$1@gioia.aioe.org> <smicj8$h9r$1@gioia.aioe.org>
<smij6j$ndd$1@gioia.aioe.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-15; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: gioia.aioe.org; logging-data="5099"; posting-host="Uh3cGLv3BUP05xA/L7flqA.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:78.0) Gecko/20100101
Thunderbird/78.14.0
Content-Language: en-US
X-Notice: Filtered by postfilter v. 0.9.2
 by: Michael Moroney - Thu, 11 Nov 2021 17:01 UTC

On 11/11/2021 3:10 AM, Phillip Helbig (undress to reply) wrote:
> In article <smicj8$h9r$1@gioia.aioe.org>, Michael Moroney
> <moroney@world.std.spaamtrap.com> writes:
>
>>>>> notice that COMPOSE-T-H and COMPOSE-t-h create upper and lower case
>>>>> thorn (Þ þ if those characters get through). If entered by both
>>>>> create the character, unless it is at the beginning of a line, in which
>>>>> case one sees <XDE> or <XFE> (one character, displayed as several).
>>>>> ASCII values are 222 and 254. Refreshing the screen also causes the
>>>>> mnenonics to appear. Also, they are not displayed via HELP FORTRAN
>>>>> CHAR DEC.
>>>>>
>>>>> Any deeper reason or just flaky instrumentation?
>>>>>
>>>>> I also notice that × (COMPOSE-x-x) works fine in a DECterm but not on a
>>>>> real VT220 (where most or all other composed characters work). Again,
>>>>> deeper meaning or just flaky?
>>>>
>>>> You're definitely not looking at ASCII, and AFAIK Þ and þ aren't in DEC
>>>> MCS,
>>>
>>> At least HELP FORTRAN CHAR DEC doesn't show them.
>>>
>>>> which likely means you're looking at inconsistent handling of or
>>>> inconsistent configuration of ISO 8859-1 among your apps and OS and
>>>> hardware; I'd guess some here is MCS, and some 8859-1.
>>>
>>> Only LK411, Alpha hardware and DECterm (under CDE, but that's probably
>>> irrelevant). Maybe they are inconsistent. :-|
>>>
>>>> You've asked variations of this question over the years too, usually
>>>> involving trying to use EDT past ASCII or maybe past DEC MCS.
>>>
>>> Yes. :-)
>>>
>> The character set ISO-8859-1 is almost the same as DEC-MCS with some of
>> the undefined DEC-MCS characters being defined in ISO-8859-1. The
>> exceptions are a few rarely used characters such as Œ and Ÿ.
>> Specifically, ISO-8859-1 has Icelandic Þ and þ, these positions are
>> undefined in DEC-MCS. 99% of the time one can use ISO-8859-1 instead of
>> DEC-MCS and get away with it.
>
> Right. And ISO-8859-15 is also similar. I routinely write € in EDT to
> get the Euro sign when most people read that text.
>
>> There is an EDT patch which makes it more ISO-8859-1 friendly, actually
>> prompted by a customer who used EDT for strictly ASCII except for a
>> character at the 'þ' position (but not þ).
>
> So the patch causes the wanted characters to be displayed? Of course,
> one can enter any value in EDT.

Yes, since all characters in the range xA0-xFF are defined and printable.
In theory it's compatible with any ISO-8859-x character set, since EDT
doesn't care what the actual 8-bit characters are. It's up to the user
and their program to interpret things correctly.
>
>> EDT fans may want the patch
>> for its ability to understand terminals with more than 24 lines.
>
> Will the patch become standard? Not that I need a terminal with more
> than 24 lines. :-)
>
It's a regular patch for VSI V8.4-2x and is already part of 9.X.
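
To illustrate the point above that the editor only stores bytes and the reader's character set decides what they mean, here is a minimal Python sketch comparing ISO-8859-1 with ISO-8859-15. The helper name `differing_positions` is hypothetical; the codec names are Python's standard ones.

```python
HIGH_BYTES = bytes(range(0xA0, 0x100))  # the 96 positions xA0-xFF


def differing_positions(codec_a: str, codec_b: str) -> dict:
    """Map each high byte value to its two decodings where they differ."""
    text_a = HIGH_BYTES.decode(codec_a)
    text_b = HIGH_BYTES.decode(codec_b)
    return {
        value: (a, b)
        for value, a, b in zip(HIGH_BYTES, text_a, text_b)
        if a != b
    }


diffs = differing_positions("iso8859_1", "iso8859_15")
for value, (latin1, latin9) in diffs.items():
    print(f"0x{value:02X}: ISO-8859-1 {latin1!r}  ISO-8859-15 {latin9!r}")
```

Only eight positions differ. One of them is 0xA4, which is the currency sign in ISO-8859-1 and the Euro sign in ISO-8859-15, so a file containing that byte shows whichever character the reader's setup assumes. The thorn positions are the same in both sets.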

Re: 8-bit characters

<smjmoj$6lh$1@dont-email.me>

https://www.novabbs.com/computers/article-flat.php?id=18666&group=comp.os.vms#18666

Newsgroups: comp.os.vms
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: craigbe...@nospam.mac.com (Craig A. Berry)
Newsgroups: comp.os.vms
Subject: Re: 8-bit characters
Date: Thu, 11 Nov 2021 12:17:53 -0600
Organization: A noiseless patient Spider
Lines: 26
Message-ID: <smjmoj$6lh$1@dont-email.me>
References: <smg4a2$boh$1@gioia.aioe.org> <smh1k8$ul5$1@dont-email.me>
<smhnmj$2d8$1@dont-email.me> <618c75e8$0$704$14726298@news.sunsite.dk>
<b5b56879-05bb-4d17-9242-2fcb582cfec8n@googlegroups.com>
<618d4319$0$699$14726298@news.sunsite.dk>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 11 Nov 2021 18:17:55 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="61dfc909fc302fdcf5a6bce5cf41c51b";
logging-data="6833"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19FNt+1BG/gLft7Z+78I6ufxUmNQc72JFY="
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:91.0)
Gecko/20100101 Thunderbird/91.3.0
Cancel-Lock: sha1:X4btUiAClaYNExgBCUC2yePI9Tg=
In-Reply-To: <618d4319$0$699$14726298@news.sunsite.dk>
Content-Language: en-US
 by: Craig A. Berry - Thu, 11 Nov 2021 18:17 UTC

On 11/11/21 10:21 AM, Arne Vajhøj wrote:
> On 11/10/2021 11:48 PM, Lawrence D’Oliveiro wrote:
>> On Thursday, November 11, 2021 at 3:33:33 PM UTC+13, Arne Vajhøj wrote:
>>> The biggest problems with UTF-8 is that the byte length is not
>>> necessarily the character length ...
>>
>> That would be true of any Unicode encoding, even UCS-4.
>
> No.
>
> It is a practical problem in UTF-8 as everything not in ASCII is more
> than 1 byte.
>
> It is a theoretical problem in UTF-16 because there are defined unicode
> code points that become more than 2 bytes (they are just extremely
> rare).
>
> It is not a problem for UTF-32 as everything is 4 bytes.

Back when it was called UCS-4, I think that was true. But as far as I
know, all the ones with UTF in the name are varying width. I think
there are a couple of emojis that take more than 4 bytes and would need
two UTF-32 chunks to represent a single character. But even if the
encoding is not varying width, the number of characters displayed might
not match the number of code points because of things like combining
characters.
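
The combining-character point can be shown with Python's standard `unicodedata` module. This is only a sketch: the sample strings and the helper `utf32_units` are my own, and the emoji sequence is included as one plausible reading of the "emojis that need two UTF-32 chunks" remark, not as something the post states.

```python
import unicodedata

decomposed = "e\u0301"  # 'e' followed by COMBINING ACUTE ACCENT
composed = unicodedata.normalize("NFC", decomposed)  # single precomposed letter

# Man, ZERO WIDTH JOINER, woman, ZERO WIDTH JOINER, girl: typically drawn
# as one "family" glyph by fonts that support the sequence.
family = "\U0001F468\u200D\U0001F469\u200D\U0001F467"


def utf32_units(text: str) -> int:
    """Number of 32-bit code units needed for `text` (one per code point)."""
    return len(text.encode("utf-32-le")) // 4


for label, text in [("decomposed", decomposed),
                    ("composed", composed),
                    ("family", family)]:
    print(label, "code points:", len(text), "UTF-32 units:", utf32_units(text))
```

In every case the number of UTF-32 units equals the number of code points, but what a user perceives as one character can still span two or five of them.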

Re: 8-bit characters

<smjnmi$vng$1@gioia.aioe.org>

https://www.novabbs.com/computers/article-flat.php?id=18667&group=comp.os.vms#18667

Newsgroups: comp.os.vms
Path: i2pn2.org!i2pn.org!aioe.org!cI9PH+Y/yCUymCTCKdsldQ.user.46.165.242.75.POSTED!not-for-mail
From: hel...@asclothestro.multivax.de (Phillip Helbig (undress to reply)
Newsgroups: comp.os.vms
Subject: Re: 8-bit characters
Date: Thu, 11 Nov 2021 18:33:54 -0000 (UTC)
Organization: Multivax C&R
Message-ID: <smjnmi$vng$1@gioia.aioe.org>
References: <smg4a2$boh$1@gioia.aioe.org> <smh1k8$ul5$1@dont-email.me> <smiagb$1u1f$1@gioia.aioe.org> <smicj8$h9r$1@gioia.aioe.org> <smij6j$ndd$1@gioia.aioe.org> <smji9o$4vb$1@gioia.aioe.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-15
Content-Transfer-Encoding: 8bit
Injection-Info: gioia.aioe.org; logging-data="32496"; posting-host="cI9PH+Y/yCUymCTCKdsldQ.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
X-Notice: Filtered by postfilter v. 0.9.2
 by: Phillip Helbig (undr - Thu, 11 Nov 2021 18:33 UTC

In article <smji9o$4vb$1@gioia.aioe.org>, Michael Moroney
<moroney@world.std.spaamtrap.com> writes:

> >> EDT fans may want the patch
> >> for its ability to understand terminals with more than 24 lines.
> >
> > Will the patch become standard? Not that I need a terminal with more
> > than 24 lines. :-)
> >
> It's a regular patch for VSI V8.4-2x and is already part of 9.X.

Nice to see that EDT is still under active maintenance and even
development at VSI. :-D

Re: 8-bit characters

<smjo3a$e2a$1@dont-email.me>

https://www.novabbs.com/computers/article-flat.php?id=18668&group=comp.os.vms#18668

Newsgroups: comp.os.vms
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: dav...@tsoft-inc.com (Dave Froble)
Newsgroups: comp.os.vms
Subject: Re: 8-bit characters
Date: Thu, 11 Nov 2021 13:40:33 -0500
Organization: A noiseless patient Spider
Lines: 25
Message-ID: <smjo3a$e2a$1@dont-email.me>
References: <smg4a2$boh$1@gioia.aioe.org> <smh1k8$ul5$1@dont-email.me>
<smiagb$1u1f$1@gioia.aioe.org> <smicj8$h9r$1@gioia.aioe.org>
<smij6j$ndd$1@gioia.aioe.org> <smji9o$4vb$1@gioia.aioe.org>
<smjnmi$vng$1@gioia.aioe.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Thu, 11 Nov 2021 18:40:42 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="9f2977a4dced20d743c4f61cf9ede8f1";
logging-data="14410"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX192bZOCI+KBhBEtSQg6uE7g7BqwgJze8Fo="
User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:45.0) Gecko/20100101
Thunderbird/45.8.0
Cancel-Lock: sha1:JiquZt6LoofTe22nkAWftgu1C5M=
In-Reply-To: <smjnmi$vng$1@gioia.aioe.org>
 by: Dave Froble - Thu, 11 Nov 2021 18:40 UTC

On 11/11/2021 1:33 PM, Phillip Helbig (undress to reply) wrote:
> In article <smji9o$4vb$1@gioia.aioe.org>, Michael Moroney
> <moroney@world.std.spaamtrap.com> writes:
>
>>>> EDT fans may want the patch
>>>> for its ability to understand terminals with more than 24 lines.
>>>
>>> Will the patch become standard? Not that I need a terminal with more
>>> than 24 lines. :-)
>>>
>> It's a regular patch for VSI V8.4-2x and is already part of 9.X.
>
> Nice to see that EDT is still under active maintenance and even
> development at VSI. :-D
>

Don't be too sure about that. I believe Michael mentioned in the past
that he did the mod on his own, i.e., not as assigned work. I could be
misremembering.

--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: davef@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486

Re: 8-bit characters

<f58aadcf-5c2e-470f-a6e2-6d69a5e65af0n@googlegroups.com>

https://www.novabbs.com/computers/article-flat.php?id=18669&group=comp.os.vms#18669

Newsgroups: comp.os.vms
X-Received: by 2002:ac8:5f0c:: with SMTP id x12mr10162468qta.309.1636656335193;
Thu, 11 Nov 2021 10:45:35 -0800 (PST)
X-Received: by 2002:ad4:5e8c:: with SMTP id jl12mr8662173qvb.58.1636656334931;
Thu, 11 Nov 2021 10:45:34 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.os.vms
Date: Thu, 11 Nov 2021 10:45:34 -0800 (PST)
In-Reply-To: <smicj8$h9r$1@gioia.aioe.org>
Injection-Info: google-groups.googlegroups.com; posting-host=147.0.155.38; posting-account=OzllYAoAAABYVy-SLFd1zjjAl1yc-de-
NNTP-Posting-Host: 147.0.155.38
References: <smg4a2$boh$1@gioia.aioe.org> <smh1k8$ul5$1@dont-email.me>
<smiagb$1u1f$1@gioia.aioe.org> <smicj8$h9r$1@gioia.aioe.org>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <f58aadcf-5c2e-470f-a6e2-6d69a5e65af0n@googlegroups.com>
Subject: Re: 8-bit characters
From: jon.pink...@gmail.com (Jon Pinkley)
Injection-Date: Thu, 11 Nov 2021 18:45:35 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
 by: Jon Pinkley - Thu, 11 Nov 2021 18:45 UTC

On Thursday, November 11, 2021 at 1:18:24 AM UTC-5, Michael Moroney wrote:
> There is an EDT patch which makes it more ISO-8859-1 friendly, actually
> prompted by a customer who used EDT for strictly ASCII except for a
> character at the 'þ' position (but not þ). EDT fans may want the patch
> for its ability to understand terminals with more than 24 lines.

The patch breaks some features when used with KEA!420
(I don't have a real DEC terminal to test with).

If you have KEA!420 and use gold 7 show buf, it will not display the buffers.

Here is what EDT on Alpha OpenVMS V8.3 looks like.

Command: show buf
=T 1 lines
MAIN No lines
PASTE No lines
Press return to continue

Same on IA64 OpenVMS V8.4-2L1

Command: show buf
Press return to continue

The workaround is to use ^Z to drop into command mode
*show buf
=T 1 lines
MAIN No lines
PASTE No lines
*
Then use c to return to screen (change?) mode.

Oddly it does work correctly with putty.

I have always found that KEA!420 was a pretty good emulator, but this is a
case where putty works better, and it does support more lines.

If someone has a real VT terminal, it would be interesting to see if it
works correctly. (correctly meaning the way it always has worked before).

Re: 8-bit characters

<618d669e$0$705$14726298@news.sunsite.dk>

https://www.novabbs.com/computers/article-flat.php?id=18670&group=comp.os.vms#18670

Newsgroups: comp.os.vms
Path: i2pn2.org!i2pn.org!aioe.org!news.uzoreto.com!dotsrc.org!filter.dotsrc.org!news.dotsrc.org!not-for-mail
Date: Thu, 11 Nov 2021 13:53:11 -0500
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.3.0
Subject: Re: 8-bit characters
Content-Language: en-US
Newsgroups: comp.os.vms
References: <smg4a2$boh$1@gioia.aioe.org> <smh1k8$ul5$1@dont-email.me>
<smhnmj$2d8$1@dont-email.me> <618c75e8$0$704$14726298@news.sunsite.dk>
<b5b56879-05bb-4d17-9242-2fcb582cfec8n@googlegroups.com>
<618d4319$0$699$14726298@news.sunsite.dk> <smjmoj$6lh$1@dont-email.me>
From: arn...@vajhoej.dk (Arne Vajhøj)
In-Reply-To: <smjmoj$6lh$1@dont-email.me>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Lines: 49
Message-ID: <618d669e$0$705$14726298@news.sunsite.dk>
Organization: SunSITE.dk - Supporting Open source
NNTP-Posting-Host: f0abec4f.news.sunsite.dk
X-Trace: 1636656798 news.sunsite.dk 705 arne@vajhoej.dk/68.9.63.232:60138
X-Complaints-To: staff@sunsite.dk
 by: Arne Vajhøj - Thu, 11 Nov 2021 18:53 UTC

On 11/11/2021 1:17 PM, Craig A. Berry wrote:
> On 11/11/21 10:21 AM, Arne Vajhøj wrote:
>> On 11/10/2021 11:48 PM, Lawrence D’Oliveiro wrote:
>>> On Thursday, November 11, 2021 at 3:33:33 PM UTC+13, Arne Vajhøj wrote:
>>>> The biggest problems with UTF-8 is that the byte length is not
>>>> necessarily the character length ...
>>>
>>> That would be true of any Unicode encoding, even UCS-4.
>>
>> No.
>>
>> It is a practical problem in UTF-8 as everything not in ASCII is more
>> than 1 byte.
>>
>> It is a theoretical problem in UTF-16 because there are defined unicode
>> code points that become more than 2 bytes (they are just extremely
>> rare).
>>
>> It is not a problem for UTF-32 as everything is 4 bytes.
>
> Back when it was called UCS-4, I think that was true.  But as far as I
> know, all the ones with UTF in the name are varying width.  I think
> there are a couple of emojis that take more than 4 bytes and would need
> two UTF-32 chunks to represent a single character.

A few quotes from the standard:

<quote>
In the Unicode Standard, the codespace consists of the integers from 0
to 10FFFF₁₆, comprising 1,114,112 code points available for assigning
the repertoire of abstract characters.
</quote>

<quote>
Each Unicode code point is represented directly by a single 32-bit
code unit. Because of this, UTF-32 has a one-to-one relationship
between encoded character and code unit; it is a fixed-width character
encoding form.
</quote>

> But even if the
> encoding is not varying width, the number of characters displayed might
> not match the number of code points because of things like combining
> characters.

Display is another issue - a way more complex issue.

Arne
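
The figures in the quoted passages can be checked directly in Python, whose `str` type covers the same codespace. A minimal sketch, with `MAX_CODE_POINT` and `utf32_length` as illustrative names of my own:

```python
import sys

MAX_CODE_POINT = 0x10FFFF  # upper end of the Unicode codespace

# The codespace runs from 0 to 0x10FFFF inclusive.
print(MAX_CODE_POINT + 1)
print(sys.maxunicode == MAX_CODE_POINT)


def utf32_length(code_point: int) -> int:
    """Bytes used by one code point in UTF-32 (little-endian, no BOM)."""
    return len(chr(code_point).encode("utf-32-le"))


# An ASCII letter, the Euro sign, an emoji and the last code point.
for cp in (0x41, 0x20AC, 0x1F600, MAX_CODE_POINT):
    print(f"U+{cp:04X}: {utf32_length(cp)} bytes")
```

The first line printed is 1114112, matching the count in the standard, and every code point tried occupies exactly one 32-bit unit. Asking for `chr(0x110000)` raises `ValueError`, since nothing exists beyond the codespace.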

Re: 8-bit characters

<fdad2b94-64e3-48e1-a415-390273a4a1b7n@googlegroups.com>

https://www.novabbs.com/computers/article-flat.php?id=18671&group=comp.os.vms#18671

Newsgroups: comp.os.vms
X-Received: by 2002:a05:620a:298e:: with SMTP id r14mr7824044qkp.509.1636656896668;
Thu, 11 Nov 2021 10:54:56 -0800 (PST)
X-Received: by 2002:a05:6214:509a:: with SMTP id kk26mr8798637qvb.43.1636656896318;
Thu, 11 Nov 2021 10:54:56 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!peer01.ams1!peer.ams1.xlned.com!news.xlned.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.os.vms
Date: Thu, 11 Nov 2021 10:54:56 -0800 (PST)
In-Reply-To: <f58aadcf-5c2e-470f-a6e2-6d69a5e65af0n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=147.0.155.38; posting-account=OzllYAoAAABYVy-SLFd1zjjAl1yc-de-
NNTP-Posting-Host: 147.0.155.38
References: <smg4a2$boh$1@gioia.aioe.org> <smh1k8$ul5$1@dont-email.me>
<smiagb$1u1f$1@gioia.aioe.org> <smicj8$h9r$1@gioia.aioe.org> <f58aadcf-5c2e-470f-a6e2-6d69a5e65af0n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <fdad2b94-64e3-48e1-a415-390273a4a1b7n@googlegroups.com>
Subject: Re: 8-bit characters
From: jon.pink...@gmail.com (Jon Pinkley)
Injection-Date: Thu, 11 Nov 2021 18:54:56 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 2913
 by: Jon Pinkley - Thu, 11 Nov 2021 18:54 UTC

On Thursday, November 11, 2021 at 1:45:36 PM UTC-5, Jon Pinkley wrote:
> On Thursday, November 11, 2021 at 1:18:24 AM UTC-5, Michael Moroney wrote:
> > There is an EDT patch which makes it more ISO-8859-1 friendly, actually
> > prompted by a customer who used EDT for strictly ASCII except for a
> > character at the 'þ' position (but not þ). EDT fans may want the patch
> > for its ability to understand terminals with more than 24 lines.
> The patch breaks some features when used with KEA!420
> (I don't have a real DEC terminal to test with).
>
> If you have KEA!420 if you use gold 7 show buf it will not display the buffers
>
> Here is what EDT on Alpha OpenVMS V8.3 looks like.
>
>
> Command: show buf
> =T 1 lines
> MAIN No lines
> PASTE No lines
> Press return to continue
>
> Same on IA64 OpenVMS V8.4-2L1
>
>
> Command: show buf
> Press return to continue
>
> The work around is to use ^Z to drop into command mode
> *show buf
> =T 1 lines
> MAIN No lines
> PASTE No lines
> *
> Then use c to return to screen (change?) mode.
>
> Oddly it does work correctly with putty.
>
> I have always found that KEA!420 was a pretty good emulator, but this is a
> case that putty works better, and it does support more lines.
>
> If someone has a real VT terminal, it would be interesting to see if it
> works correctly. (correctly meaning the way it always has worked before).

I should have said

Same on IA64 OpenVMS V8.4-2L1 with EDT patch applied

Re: 8-bit characters

<smjrat$7v4$1@dont-email.me>

https://www.novabbs.com/computers/article-flat.php?id=18674&group=comp.os.vms#18674

Newsgroups: comp.os.vms
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: craigbe...@nospam.mac.com (Craig A. Berry)
Newsgroups: comp.os.vms
Subject: Re: 8-bit characters
Date: Thu, 11 Nov 2021 13:35:55 -0600
Organization: A noiseless patient Spider
Lines: 55
Message-ID: <smjrat$7v4$1@dont-email.me>
References: <smg4a2$boh$1@gioia.aioe.org> <smh1k8$ul5$1@dont-email.me>
<smhnmj$2d8$1@dont-email.me> <618c75e8$0$704$14726298@news.sunsite.dk>
<b5b56879-05bb-4d17-9242-2fcb582cfec8n@googlegroups.com>
<618d4319$0$699$14726298@news.sunsite.dk> <smjmoj$6lh$1@dont-email.me>
<618d669e$0$705$14726298@news.sunsite.dk>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 11 Nov 2021 19:35:57 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="61dfc909fc302fdcf5a6bce5cf41c51b";
logging-data="8164"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1809K7bdjeyM5h9E8iCJNUzoNu5GeppUUc="
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:91.0)
Gecko/20100101 Thunderbird/91.3.0
Cancel-Lock: sha1:eWtdYmqmsE66MkunYPbRnRs3Qws=
In-Reply-To: <618d669e$0$705$14726298@news.sunsite.dk>
Content-Language: en-US
 by: Craig A. Berry - Thu, 11 Nov 2021 19:35 UTC

On 11/11/21 12:53 PM, Arne Vajhøj wrote:
> On 11/11/2021 1:17 PM, Craig A. Berry wrote:
>> On 11/11/21 10:21 AM, Arne Vajhøj wrote:
>>> On 11/10/2021 11:48 PM, Lawrence D’Oliveiro wrote:
>>>> On Thursday, November 11, 2021 at 3:33:33 PM UTC+13, Arne Vajhøj wrote:
>>>>> The biggest problem with UTF-8 is that the byte length is not
>>>>> necessarily the character length ...
>>>>
>>>> That would be true of any Unicode encoding, even UCS-4.
>>>
>>> No.
>>>
>>> It is a practical problem in UTF-8 as everything not in ASCII is more
>>> than 1 byte.
>>>
>>> It is a theoretical problem in UTF-16 because there are defined Unicode
>>> code points that become more than 2 bytes (they are just extremely
>>> rare).
>>>
>>> It is not a problem for UTF-32 as everything is 4 bytes.
>>
>> Back when it was called UCS-4, I think that was true.  But as far as I
>> know, all the ones with UTF in the name are varying width.  I think
>> there are a couple of emojis that take more than 4 bytes and would need
>> two UTF-32 chunks to represent a single character.
>
> A few quotes from the standard:
>
> <quote>
> In the Unicode Standard, the codespace consists of the integers from 0
> to 10FFFF₁₆, comprising 1,114,112 code points available for assigning
> the repertoire of abstract characters.
> </quote>
>
> <quote>
> Each Unicode code point is represented directly by a single 32-bit
> code unit. Because of this, UTF-32 has a one-to-one relationship
> between encoded character and code unit; it is a fixed-width character
> encoding form.
> </quote>

Hmm. You're right. For some reason I had thought they'd blown the
4-byte limit with emojis, but it doesn't seem UTF-32 has any provision
for surrogate pairs.
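
The fixed-width versus variable-width point in the quotes above can be checked directly. What follows is a minimal Python sketch, not anything posted in the thread; the helper name encoded_sizes and the four sample characters are chosen purely for illustration. It counts code points and encoded bytes for one character from each size class.

```python
def encoded_sizes(text):
    """Return (code points, UTF-8 bytes, UTF-16 bytes, UTF-32 bytes) for text."""
    return (
        len(text),                      # Python str length counts code points
        len(text.encode("utf-8")),
        len(text.encode("utf-16-le")),  # the -le variants omit the byte order mark
        len(text.encode("utf-32-le")),
    )

# ASCII letter, Latin-1 letter, BMP symbol, and an emoji outside the BMP.
for sample in ("A", "\u00f8", "\u20ac", "\U0001F600"):
    print(f"U+{ord(sample):04X}", encoded_sizes(sample))
```

Only the UTF-32 column stays at 4 bytes per code point for all four samples. UTF-8 uses 1, 2, 3 and 4 bytes respectively, and UTF-16 uses 2 bytes for the first three and 4 bytes (a surrogate pair) for the emoji.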

>>                                                  But even if the
>> encoding is not varying width, the number of characters displayed might
>> not match the number of code points because of things like combining
>> characters.
>
> Display is another issue - a way more complex issue.
>
> Arne
>
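
The combining-character point can be illustrated the same way. This is a rough Python sketch under stated assumptions: the helper name count_base_characters is hypothetical, and skipping code points with a nonzero combining class only approximates what a reader sees as one character. Proper grapheme cluster segmentation (Unicode Standard Annex #29) covers more cases than this does.

```python
import unicodedata

# The same visible "é" written two ways.
composed = "\u00e9"     # LATIN SMALL LETTER E WITH ACUTE, one code point
decomposed = "e\u0301"  # 'e' followed by COMBINING ACUTE ACCENT, two code points

def count_base_characters(text):
    """Approximate the displayed character count by skipping combining marks."""
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)

for label, text in (("composed", composed), ("decomposed", decomposed)):
    print(
        label,
        "code points:", len(text),
        "UTF-32 bytes:", len(text.encode("utf-32-le")),
        "base characters:", count_base_characters(text),
    )
```

Both strings display as a single character, yet the decomposed form has two code points and takes 8 bytes in UTF-32. So even a fixed-width encoding only guarantees a fixed size per code point, not per displayed character.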

Re: 8-bit characters

<smk04g$19rt$1@gioia.aioe.org>

https://www.novabbs.com/computers/article-flat.php?id=18675&group=comp.os.vms#18675

Path: i2pn2.org!i2pn.org!aioe.org!Uh3cGLv3BUP05xA/L7flqA.user.46.165.242.75.POSTED!not-for-mail
From: moro...@world.std.spaamtrap.com (Michael Moroney)
Newsgroups: comp.os.vms
Subject: Re: 8-bit characters
Date: Thu, 11 Nov 2021 15:57:54 -0500
Organization: Aioe.org NNTP Server
Message-ID: <smk04g$19rt$1@gioia.aioe.org>
References: <smg4a2$boh$1@gioia.aioe.org> <smh1k8$ul5$1@dont-email.me>
<smiagb$1u1f$1@gioia.aioe.org> <smicj8$h9r$1@gioia.aioe.org>
<smij6j$ndd$1@gioia.aioe.org> <smji9o$4vb$1@gioia.aioe.org>
<smjnmi$vng$1@gioia.aioe.org> <smjo3a$e2a$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: gioia.aioe.org; logging-data="42877"; posting-host="Uh3cGLv3BUP05xA/L7flqA.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:78.0) Gecko/20100101
Thunderbird/78.14.0
X-Notice: Filtered by postfilter v. 0.9.2
Content-Language: en-US
 by: Michael Moroney - Thu, 11 Nov 2021 20:57 UTC

On 11/11/2021 1:40 PM, Dave Froble wrote:
> On 11/11/2021 1:33 PM, Phillip Helbig (undress to reply) wrote:
>> In article <smji9o$4vb$1@gioia.aioe.org>, Michael Moroney
>> <moroney@world.std.spaamtrap.com> writes:
>>
>>>>> EDT fans may want the patch
>>>>> for its ability to understand terminals with more than 24 lines.
>>>>
>>>> Will the patch become standard?  Not that I need a terminal with more
>>>> than 24 lines.  :-)
>>>>
>>> It's a regular patch for VSI V8.4-2x and is already part of 9.X.
>>
>> Nice to see that EDT is still under active maintenance and even
>> development at VSI.  :-D

It really isn't.

> Don't be too sure about that.  I believe Michael mentioned in the past
> that he did the mod on his own, i.e., not assigned work.  I could
> mis-remember.
>

You remember correctly. The hardwired '24 line terminal' assumption
pissed me off and a few times I looked at it I said no way I can fix
that spaghetti code. But one day the planets were aligned or something,
and I just did it.

Re: 8-bit characters

<618d84c9$0$692$14726298@news.sunsite.dk>

https://www.novabbs.com/computers/article-flat.php?id=18676&group=comp.os.vms#18676

Path: i2pn2.org!i2pn.org!aioe.org!news.uzoreto.com!dotsrc.org!filter.dotsrc.org!news.dotsrc.org!not-for-mail
Date: Thu, 11 Nov 2021 16:01:58 -0500
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.3.0
Subject: Re: 8-bit characters
Content-Language: en-US
Newsgroups: comp.os.vms
References: <smg4a2$boh$1@gioia.aioe.org> <smh1k8$ul5$1@dont-email.me>
<smiagb$1u1f$1@gioia.aioe.org> <smicj8$h9r$1@gioia.aioe.org>
<smij6j$ndd$1@gioia.aioe.org> <smji9o$4vb$1@gioia.aioe.org>
<smjnmi$vng$1@gioia.aioe.org> <smjo3a$e2a$1@dont-email.me>
<smk04g$19rt$1@gioia.aioe.org>
From: arn...@vajhoej.dk (Arne Vajhøj)
In-Reply-To: <smk04g$19rt$1@gioia.aioe.org>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Lines: 32
Message-ID: <618d84c9$0$692$14726298@news.sunsite.dk>
Organization: SunSITE.dk - Supporting Open source
NNTP-Posting-Host: 3cfdef0f.news.sunsite.dk
X-Trace: 1636664521 news.sunsite.dk 692 arne@vajhoej.dk/68.9.63.232:63958
X-Complaints-To: staff@sunsite.dk
 by: Arne Vajhøj - Thu, 11 Nov 2021 21:01 UTC

On 11/11/2021 3:57 PM, Michael Moroney wrote:
> On 11/11/2021 1:40 PM, Dave Froble wrote:
>> On 11/11/2021 1:33 PM, Phillip Helbig (undress to reply) wrote:
>>> In article <smji9o$4vb$1@gioia.aioe.org>, Michael Moroney
>>> <moroney@world.std.spaamtrap.com> writes:
>>>
>>>>>> EDT fans may want the patch
>>>>>> for its ability to understand terminals with more than 24 lines.
>>>>>
>>>>> Will the patch become standard?  Not that I need a terminal with more
>>>>> than 24 lines.  :-)
>>>>>
>>>> It's a regular patch for VSI V8.4-2x and is already part of 9.X.
>>>
>>> Nice to see that EDT is still under active maintenance and even
>>> development at VSI.  :-D
>
> It really isn't.
>
>> Don't be too sure about that.  I believe Michael mentioned in the past
>> that he did the mod on his own, i.e., not assigned work.  I could
>> mis-remember.
>>
>
> You remember correctly. The hardwired '24 line terminal' assumption
> pissed me off and a few times I looked at it I said no way I can fix
> that spaghetti code. But one day the planets were aligned or something,
> and I just did it.

Macro-32 ?

Arne

Re: 8-bit characters

<smk0u9$i36$1@dont-email.me>

https://www.novabbs.com/computers/article-flat.php?id=18677&group=comp.os.vms#18677

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: FIRST.L...@vmssoftware.com (Robert A. Brooks)
Newsgroups: comp.os.vms
Subject: Re: 8-bit characters
Date: Thu, 11 Nov 2021 16:11:37 -0500
Organization: A noiseless patient Spider
Lines: 13
Message-ID: <smk0u9$i36$1@dont-email.me>
References: <smg4a2$boh$1@gioia.aioe.org> <smh1k8$ul5$1@dont-email.me>
<smiagb$1u1f$1@gioia.aioe.org> <smicj8$h9r$1@gioia.aioe.org>
<smij6j$ndd$1@gioia.aioe.org> <smji9o$4vb$1@gioia.aioe.org>
<smjnmi$vng$1@gioia.aioe.org> <smjo3a$e2a$1@dont-email.me>
<smk04g$19rt$1@gioia.aioe.org> <618d84c9$0$692$14726298@news.sunsite.dk>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 11 Nov 2021 21:11:37 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="dd12df659fbb37273d140c7d87546648";
logging-data="18534"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19v81supMn2m55/LRRLOHGNetkAJtGsC6aVH84e13b6OQ=="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.3.0
Cancel-Lock: sha1:h8XiUH+ujzyyw7FTo2yUsxXILL4=
In-Reply-To: <618d84c9$0$692$14726298@news.sunsite.dk>
X-Antivirus-Status: Clean
Content-Language: en-US
X-Antivirus: Avast (VPS 211111-6, 11/11/2021), Outbound message
 by: Robert A. Brooks - Thu, 11 Nov 2021 21:11 UTC

On 11/11/2021 4:01 PM, Arne Vajhøj wrote:
> On 11/11/2021 3:57 PM, Michael Moroney wrote:

>> You remember correctly. The hardwired '24 line terminal' assumption pissed me
>> off and a few times I looked at it I said no way I can fix that spaghetti
>> code. But one day the planets were aligned or something, and I just did it.
>
> Macro-32 ?

BLISS-32

--
-- Rob
