Character count within PN attributes

Newsgroups: comp.protocols.dicom
Date: Fri, 11 Mar 2022 03:10:04 -0800 (PST)
Message-ID: <ccacc1d1-8e73-46eb-9a75-e19e51e1f564n@googlegroups.com>
Subject: Character count within PN attributes
From: madmorty...@gmail.com (madMorty)
 by: madMorty - Fri, 11 Mar 2022 11:10 UTC

Hi all,
for PN attributes, the maximum value length of one component group (CG) is limited to 64 characters, including delimiters. With user input of only last name and first name, this leaves me with a 63-character limit, since the single "^" delimiter between the two components takes one character. Furthermore, with the Specific Character Set (0008,0005) set to "ISO_IR 100":

- 'ü' = should count as two separate characters

But then again what about:
- '°' = this should still count as one character, right (not as two bytes)?

Or should all characters from 0xA1 on count as 2 characters? This is currently a bit confusing to me, since I thought that the text/string VR limits within DICOM always refer to characters and never to bytes.

Best regards,
Morty

Re: Character count within PN attributes

Newsgroups: comp.protocols.dicom
Date: Fri, 11 Mar 2022 10:33:08 -0800 (PST)
In-Reply-To: <ccacc1d1-8e73-46eb-9a75-e19e51e1f564n@googlegroups.com>
Message-ID: <ac850f1a-8977-4e87-b555-95ba3a92fec0n@googlegroups.com>
Subject: Re: Character count within PN attributes
From: david.go...@gmail.com (David Gobbi)
 by: David Gobbi - Fri, 11 Mar 2022 18:33 UTC

On Friday, 11 March 2022 at 04:10:06 UTC-7, madMorty wrote:
> - 'ü' = should count as two separate characters

You must be referring to the statement about diacritics at the end of PS 3.5 6.2.1.2:

> Each combining character (e.g., diacritics or vowel marks) shall be considered a separate character for this maximum length, regardless of how an application may display such combining characters (i.e., combined into the glyph for the base character, or rendered separately).

So 'ü' should only be counted as two characters if the letter u and the diacritic ¨ are encoded as separate code points (a base character and a combining diacritic). But in ISO_IR 100, ü is encoded as a single code point. The same holds for NFC-normalized UTF-8: ü is encoded as two bytes there, but it is still a single code point.
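
A minimal Python sketch of the distinction, assuming the value is already decoded into a Python string (where `len` counts code points) and using Python's `latin-1` codec as a stand-in for ISO_IR 100. The variable names are illustrative only:

```python
import unicodedata

# Precomposed form: a single code point, U+00FC.
precomposed = unicodedata.normalize("NFC", "u\u0308")
# Decomposed form: 'u' (U+0075) followed by combining diaeresis (U+0308).
decomposed = unicodedata.normalize("NFD", "\u00fc")

# Byte counts depend on the encoding; code point counts do not.
latin1_bytes = len(precomposed.encode("latin-1"))  # ISO 8859-1, i.e. ISO_IR 100
utf8_bytes = len(precomposed.encode("utf-8"))

# len() on a str counts code points, which is what the length limit refers to.
precomposed_chars = len(precomposed)
decomposed_chars = len(decomposed)
degree_chars = len("\u00b0")  # the degree sign from the question

print(latin1_bytes, utf8_bytes, precomposed_chars, decomposed_chars, degree_chars)
```

Only the decomposed form counts as two characters; the byte count of the precomposed form differs between the two encodings while its character count stays the same.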

Further clarification is available in CP 964 "Correct alphabetic name encoding for Unicode", which states:
> The definition: Combining characters (e.g., diacritics or vowel marks) separately encoded from base characters shall be considered separate characters for this maximum length was chosen to be consistent with Unicode and GB18030 definition of character code points.

So there you go. When DICOM says "character", it means "code point", regardless of the number of bytes, and regardless of how the code points might be combined to form glyphs.
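
Applied to the original question, a length check on a PN value could look like the following sketch. It assumes the value has already been decoded to a Python string, that component groups are separated by "=", and that the 64-character limit applies per component group including "^" delimiters. The names `PN_GROUP_MAX`, `pn_group_lengths`, and `pn_is_within_limit` are hypothetical, not taken from any DICOM toolkit:

```python
PN_GROUP_MAX = 64  # maximum characters (code points) per component group


def pn_group_lengths(value):
    """Return the length in code points of each '='-separated component group."""
    return [len(group) for group in value.split("=")]


def pn_is_within_limit(value):
    """True if every component group fits the per-group limit.

    The '^' delimiters inside a group are counted, matching the
    'including delimiters' reading from the question.
    """
    return all(n <= PN_GROUP_MAX for n in pn_group_lengths(value))


# 32 + 1 ('^') + 31 = 64 code points: allowed, even though the
# UTF-8 encoding of this value is longer than 64 bytes.
ok_name = "\u00fc" * 32 + "^" + "\u00fc" * 31
# 32 + 1 + 32 = 65 code points: one too many.
long_name = "\u00fc" * 32 + "^" + "\u00fc" * 32

print(pn_is_within_limit(ok_name), pn_is_within_limit(long_name))
```

Counting on the decoded string rather than on the encoded bytes is the design choice that matters here. If input may arrive in decomposed form, each combining mark adds one to the count.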
