
Subject                                        Author
* Character weirdness in redirection           Stan Brown
`* Re: Character weirdness in redirection      VanguardLH
 `* Re: Character weirdness in redirection     Stan Brown
  `* Re: Character weirdness in redirection    VanguardLH
   `- Re: Character weirdness in redirection   Stan Brown

Character weirdness in redirection

<MPG.3e587dbf227912df990065@news.individual.net>

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail
From: the_stan...@fastmail.fm (Stan Brown)
Newsgroups: alt.comp.os.windows-10
Subject: Character weirdness in redirection
Date: Thu, 16 Feb 2023 17:52:38 -0800
Organization: Oak Road Systems
Lines: 29
Message-ID: <MPG.3e587dbf227912df990065@news.individual.net>
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 8bit
X-Trace: individual.net 4Eq0O1o92Uyn1jySXo6LQQswk88pwxOfBoKLumoM4NXu5A1Bbg
Cancel-Lock: sha1:0txd9f6rTHbvQ54TIIKngg8Oftw=
User-Agent: MicroPlanet-Gravity/3.0.11 (GRC)
 by: Stan Brown - Fri, 17 Feb 2023 01:52 UTC

Windows 10 Pro version 21H2 OS build 19044.2604

I entered this command:
for /d %X in (dv*) do echo %X
and the result was
Dvorák, Antonin (1841-1904)
except with an upside-down caret over the r. That's identical to what
File Explorer shows.

Then I redirected the output to a batch file:
for /d %X in (dv*) do echo cd %X >foo.bat
I executed foo in the same command window and got "The system cannot
find the path specified." I then entered
type foo.bat
and the response was
cd Dvorák, Antonin (1841-1904)
with _no_ accent mark on the r.

chcp tells me that the active code page is 437, for what relevance
that may have. I could understand if the r-with-upside-down-caret
didn't display in the command window, since CP437 doesn't include it
<https://en.wikipedia.org/wiki/Code_page_437>. But it did display
there, only was changed to a regular r in redirection.

Can someone explain what's going on here? Thanks!

--
Stan Brown, Tehachapi, California, USA https://BrownMath.com/
Shikata ga nai...

Re: Character weirdness in redirection

<60irwclle7pd.dlg@v.nguard.lh>

Path: i2pn2.org!i2pn.org!news.neodome.net!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail
From: V...@nguard.LH (VanguardLH)
Newsgroups: alt.comp.os.windows-10
Subject: Re: Character weirdness in redirection
Date: Thu, 16 Feb 2023 20:36:32 -0600
Lines: 104
Message-ID: <60irwclle7pd.dlg@v.nguard.lh>
References: <MPG.3e587dbf227912df990065@news.individual.net>
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 8bit
X-Trace: individual.net SVsDk6BnEAaQhF8I7IKCegLR4egGRBalMWg0D57n+x/vnmP6P4
Cancel-Lock: sha1:5zA7fB9wT+aJ+22dm1VieAN3Lj8=
User-Agent: 40tude_Dialog/2.0.15.41
 by: VanguardLH - Fri, 17 Feb 2023 02:36 UTC

Stan Brown <the_stan_brown@fastmail.fm> wrote:

> Windows 10 Pro version 21H2 OS build 19044.2604
>
> I entered this command:
> for /d %X in (dv*) do echo %X
> and the result was
> Dvorák, Antonin (1841-1904)
> except with an upside-down caret over the r. That's identical to what
> File Explorer shows.
>
> Then I redirected the output to a batch file:
> for /d %X in (dv*) do echo cd %X >foo.bat
> I executed foo in the same command window and got "The system cannot
> find the path specified." I then entered
> type foo.bat
> and the response was
> cd Dvorák, Antonin (1841-1904)
> with _no_ accent mark on the r.
>
> chcp tells me that the active code page is 437, for what relevance
> that may have. I could understand if the r-with-upside-down-caret
> didn't display in the command window, since CP437 doesn't include it
> <https://en.wikipedia.org/wiki/Code_page_437>. But it did display
> there, only was changed to a regular r in redirection.
>
> Can someone explain what's going on here? Thanks!

Did you format the partition using NTFS?

https://learn.microsoft.com/en-us/windows/win32/intl/character-sets-used-in-file-names
"NTFS stores file names in Unicode. In contrast, the older FAT12, FAT16,
and FAT32 file systems use the OEM character set. For more information,
see Code Pages."

Another reason for not finding a path to a file is that the path has
spaces, but you didn't account for them in your variables. Obviously:

Dvorák, Antonin (1841-1904)

has 2 space characters, but the 'cd' command is only going to see the
first argument (Dvorák) as the folder name. You cannot use 'cd' to
concurrently or sequentially open or move to multiple folders. 'cd'
works on just 1 folder at a time.

'echo' is displaying the entire string in the x variable. 'cd' only
sees the first argument as a folder name. You need to double-quote the
variable in the 'echo' command, like:

for /d %X in (dv*) do echo cd "%X" > foo.bat

However, running the 'for' command in a command shell at the prompt is
not the same as when running it inside a batch file. You need to
double-percent inside the batch file for the interpreter to properly
parse the command when passed to the shell. So, inside of a .bat file,
use:

for /d %%X in (dv*) do echo cd "%%X" > foo.bat

The first percent sign is the escape character, so the command shell
will see %x, and then the command interpreter will see %x instead of x.

Also, while the for-loop is creating an array it stores in x,
environment variables are addressed by /enclosing/ them in percent
signs. Try the following in a command shell:

set x=123
echo %x
echo %x%

The first doesn't work, because you haven't used the percent sign to
designate an environment variable. A first percent sign is an escape
character to designate the next percent sign is to signify an
environment variable. %x% is the environment variable. So, inside the
.bat file, change to:

for /d %%x in (dv*) do echo cd "%x%" > foo.bat
or
for /d %%x in (dv*) do echo cd "%%x%%" > foo.bat

This is off the top of my head. I'd have to test to verify %%x is
needed to identify a variable inside a batch file, and %x% is needed to
identify an environment variable. Possibly you have to change the 'cd'
arg to "%%x%" for the first percent to escape the second percent, so
when the command gets parsed and sent to the command interpreter, that
will see it as "%x%" and know %x% is an environment variable.

'for' generates a list. Each item in that list is piped into the
environment variable, but percent is an escape character inside a batch
file, and why you have to double them where you want a single percent
sign. For each item in the list generated by 'for', and copied into the
X environment variable, the do-statement gets executed, but again you
likely need double percent signs, so the first one escapes the second
one, and the second survives parsing to pass the string to the command
interpreter.

Basically, whether in a command line you run at the prompt in a command
shell or in a batch file, enclose strings in double-quotes to account for
possible space characters in the strings (and yours can have 2, or maybe
more). Also double the percent signs since, in a batch file, a percent sign
is the escape character applied to the next character. At the command
line, you can use %, but inside a batch file you need to use %%. After all,
to allow escaping of characters (to represent special characters), some
character had to be chosen as the escape character.

Re: Character weirdness in redirection

<MPG.3e58b519a49c33b3990067@news.individual.net>

Path: i2pn2.org!i2pn.org!news.neodome.net!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail
From: the_stan...@fastmail.fm (Stan Brown)
Newsgroups: alt.comp.os.windows-10
Subject: Re: Character weirdness in redirection
Date: Thu, 16 Feb 2023 21:48:46 -0800
Organization: Oak Road Systems
Lines: 57
Message-ID: <MPG.3e58b519a49c33b3990067@news.individual.net>
References: <MPG.3e587dbf227912df990065@news.individual.net> <60irwclle7pd.dlg@v.nguard.lh>
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 8bit
X-Trace: individual.net /zL3vRjyfsdLfE1wkEcSDwHR8MJgHpMgoPrFu4XO22saJ8DYTq
Cancel-Lock: sha1:SQIlYbouQHwoXBKyhcncUHNwhL0=
User-Agent: MicroPlanet-Gravity/3.0.11 (GRC)
 by: Stan Brown - Fri, 17 Feb 2023 05:48 UTC

Thanks, Vanguard, for taking as much trouble as you did. I'm terribly
sorry -- I do know about quoting filenames or paths that contain
spaces, but I had been messing around with this all afternoon,
getting more and more frustrated, and when I finally decided to post,
in trying to simplify things I got careless. I apologize!

I don't actually use cmd.exe very much -- I've been using TCC or
TCCLE from JPsoft for decades. But I posted all of my examples using
cmd not TCCLE -- to make sure some quirk in TCCLE wasn't responsible.
And I copy/pasted from the cmd.exe window, not retyping
anything.

Yes, it's an NTFS partition (d:). I believe that the name as
displayed in File Explorer is indeed Unicode, since the r-upside-
down-caret is not in characters 0-255 of the Windows character set
1252 or the older dos set 437. (I looked at a character table of each
one, and didn't see it. That doesn't necessarily mean it's not there,
but I sure didn't see it.)

So let me do it right this time:

Typing
for /d %X in (dv*) do echo cd "%X"
on the command line displays
cd "Dvorák, Antonin (1841-1904)"
but the /r/ has an upside-down caret on it.
That's cd "%X" not cd "%X%" -- when I tried the latter on the command
line I got
cd "Dvorák, Antonin (1841-1904)%"
i.e., an extra % after the directory name. I'm guessing that the
variable named in the for command is special in this regard.
for /d %X in (dv*) do echo cd "%X" >foo.bat
displays
echo cd "Dvorák, Antonin (1841-1904)" 1>foo.bat
_with_ the upside-down-careted /r/
and the command "type foo.bat" displays
cd "Dvorák, Antonin (1841-1904)"
with no caret on the /r/. Executing /foo/ displays
cd "Dvorák, Antonin (1841-1904)"
The system cannot find the path specified.
(with no caret on the /r/).

I do understand when I go to put the for command in a batch file, it
will have to use "for %%x" etc., but I'm trying to solve this by
divide and conquer.

However, now that you've set me straight on 8.3 shortnames in my
other thread, I think that's really the answer. Shortnames won't use
special characters or contain spaces, so they shouldn't pose a
problem for robocopy. And my car doesn't care how the folders on the
USB stick are named, so I can just use the shortnames as the _only_
folder names on my USB stick's exFAT volume. (The car doesn't
recognize NTFS formatting, which is no surprise.)

--
Stan Brown, Tehachapi, California, USA https://BrownMath.com/
Shikata ga nai...

Re: Character weirdness in redirection

<3grtr4t7v99u.dlg@v.nguard.lh>

Path: i2pn2.org!i2pn.org!news.neodome.net!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail
From: V...@nguard.LH (VanguardLH)
Newsgroups: alt.comp.os.windows-10
Subject: Re: Character weirdness in redirection
Date: Fri, 17 Feb 2023 00:45:46 -0600
Lines: 105
Message-ID: <3grtr4t7v99u.dlg@v.nguard.lh>
References: <MPG.3e587dbf227912df990065@news.individual.net> <60irwclle7pd.dlg@v.nguard.lh> <MPG.3e58b519a49c33b3990067@news.individual.net>
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-2"
Content-Transfer-Encoding: 8bit
X-Trace: individual.net BJ+uJxH7TVDslMXK8eGmuQy9YivgquRggNjmA8H1HPYLFcmakd
Cancel-Lock: sha1:Vhhzhpu21GL/hAUfQu7cIMF+1qc=
User-Agent: 40tude_Dialog/2.0.15.41
 by: VanguardLH - Fri, 17 Feb 2023 06:45 UTC

Stan Brown <the_stan_brown@fastmail.fm> wrote:

> Thanks, Vanguard, for taking as much trouble as you did. I'm terribly
> sorry -- I do know about quoting filenames or paths that contain
> spaces, but I had been messing around with this all afternoon,
> getting more and more frustrated, and when I finally decided to post,
> in trying to simplify things I got careless. I apologize!
>
> I don't actually use cmd.exe very much -- I've been using TCC or
> TCCLE from JPsoft for decades. But I posted all of my examples using
> cmd not TCCLE -- to make sure some quirk in TCCLE wasn't responsible.
> And I copy/pasted from the cmd.exe window, not retyping
> anything.
>
> Yes, it's an NTFS partition (d:). I believe that the name as
> displayed in File Explorer is indeed Unicode, since the r-upside-
> down-caret is not in characters 0-255 of the Windows character set
> 1252 or the older dos set 437. (I looked at a character table of each
> one, and didn't see it. That doesn't necessarily mean it's not there,
> but I sure didn't see it.)
>
> So let me do it right this time:
>
> Typing
> for /d %X in (dv*) do echo cd "%X"
> on the command line displays
> cd "Dvorák, Antonin (1841-1904)"
> but the /r/ has an upside-down caret on it.
> That's cd "%X" not cd "%X%" -- when I tried the latter on the command
> line I got
> cd "Dvorák, Antonin (1841-1904)%"
> i.e., an extra % after the directory name. I'm guessing that the
> variable named in the for command is special in this regard.
> for /d %X in (dv*) do echo cd "%X" >foo.bat
> displays
> echo cd "Dvorák, Antonin (1841-1904)" 1>foo.bat
> _with_ the upside-down-careted /r/
> and the command "type foo.bat" displays
> cd "Dvorák, Antonin (1841-1904)"
> with no caret on the /r/. Executing /foo/ displays
> cd "Dvorák, Antonin (1841-1904)"
> The system cannot find the path specified.
> (with no caret on the /r/).
>
> I do understand when I go to put the for command in a batch file, it
> will have to use "for %%x" etc., but I'm trying to solve this by
> divide and conquer.
>
> However, now that you've set me straight on 8.3 shortnames in my
> other thread, I think that's really the answer. Shortnames won't use
> special characters or contain spaces, so they shouldn't pose a
> problem for robocopy. And my car doesn't care how the folders on the
> USB stick are named, so I can just use the shortnames as the _only_
> folder names on my USB stick's exFAT volume. (The car doesn't
> recognize NTFS formatting, which is no surprise.)

I simplified all that a lot further since I suspected there was a
problem storing some accented or diacritical characters into an ASCII
text file.

echo "dvořák"
"dvořák"

echo "dvořák" > testme.bat
type testme.bat
"dvorák"

The caron-r (ř) is not getting piped into the text file. Besides using the
'type' command to show what was inside the file, I used HexEdit to
look inside the file. What I see in binary is:

"dvorák"

shows as binary:

22 64 76 6F 72 A0 6B 22 (hex)
"  d  v  o  r  á  k  "

So the caron-r (ř) is not getting piped into the file. While accent-a is
in the ASCII-8 character set, caron-r is not. Text files support ASCII8
(aka ANSI). They are not rich-text files that can support Unicode.
Changing the extension of the target file (the one receiving the output
of the 'echo' command) to .rtf does not alter the fact that what gets
piped is text, not Unicode.
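
The byte pattern in the hex dump above can be imitated with a minimal
Python sketch. It assumes the shell converts each character to the active
8-bit code page (437 here) and, when the code page has no slot for a
character, falls back to its unaccented base letter. The helper name
oem_best_fit is made up for illustration; it only mimics the observed
result and is not the conversion Windows actually performs.

```python
import unicodedata


def oem_best_fit(text, codepage="cp437"):
    """Hypothetical helper: encode text to an 8-bit code page, falling
    back to the unaccented base letter when a character has no slot."""
    out = bytearray()
    for ch in text:
        try:
            out += ch.encode(codepage)
        except UnicodeEncodeError:
            # NFD splits "ř" into "r" + combining caron; keep only the "r".
            base = unicodedata.normalize("NFD", ch)[0]
            out += base.encode(codepage, errors="replace")
    return bytes(out)


data = oem_best_fit('"dvořák"')
print(data.hex(" ").upper())   # the bytes as a hex dump
print(data.decode("cp437"))    # the bytes read back under code page 437
```

Under these assumptions the á survives as the single byte A0 because code
page 437 contains it, while the ř collapses to a plain r (72).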

Looks like the problem stems from stdout piping into a file.

Ah, looks like I found the problem. The command shell defaults to the
437 code page. You need to change to the 65001 code page for UTF-8
support. Run:

chcp 65001
echo "dvořák" > testme.bat
type testme.bat
"dvořák"

With a change in the code page for the command shell, you can get it
to support UTF-8 when piping stdout to a file.

https://ss64.com/nt/chcp.html

Doesn't list all code pages, but probably the most used, including
UTF-8. You can find more code pages listed at:

https://en.wikipedia.org/wiki/Code_page#IBM_code_pages
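
Why code page 65001 makes the difference can be checked outside the shell
with a short Python sketch. It assumes Python's "cp437" and "utf-8" codecs
correspond to Windows code pages 437 and 65001; the variable names are
only for illustration.

```python
name = "dvořák"

# Code page 437 has a slot for "á" (0xA0) but none for "ř",
# so a strict encode of the whole name fails.
try:
    name.encode("cp437")
    fits_cp437 = True
except UnicodeEncodeError:
    fits_cp437 = False

# UTF-8 can represent every Unicode character; "ř" and "á" each
# take two bytes.
utf8_bytes = name.encode("utf-8")

print(fits_cp437)
print(utf8_bytes.hex(" ").upper())
```

In the UTF-8 output the ř becomes the two bytes C5 99 and the á becomes
C3 A1, so nothing has to be replaced with a look-alike letter.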

Re: Character weirdness in redirection

<MPG.3e59621f167b137199006a@news.individual.net>

Path: i2pn2.org!i2pn.org!news.neodome.net!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail
From: the_stan...@fastmail.fm (Stan Brown)
Newsgroups: alt.comp.os.windows-10
Subject: Re: Character weirdness in redirection
Date: Fri, 17 Feb 2023 10:07:00 -0800
Organization: Oak Road Systems
Lines: 41
Message-ID: <MPG.3e59621f167b137199006a@news.individual.net>
References: <MPG.3e587dbf227912df990065@news.individual.net> <60irwclle7pd.dlg@v.nguard.lh> <MPG.3e58b519a49c33b3990067@news.individual.net> <3grtr4t7v99u.dlg@v.nguard.lh>
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 8bit
X-Trace: individual.net bl0CL0urIfTlBKfsw5N53AIfqFa9HlhG7wcqHR4u4ExvA5Ahge
Cancel-Lock: sha1:s20ElAwSDxDqqcaY90rUuinCmN8=
User-Agent: MicroPlanet-Gravity/3.0.11 (GRC)
 by: Stan Brown - Fri, 17 Feb 2023 18:07 UTC

On Fri, 17 Feb 2023 00:45:46 -0600, VanguardLH wrote:
> Ah, looks like I found the problem. The command shell defaults to the
> 437 code page. You need to change to the 65001 code page for UTF-8
> support. Run:
>
> chcp 65001
> echo "dvořák" > testme.bat
> type testme.bat
> "dvořák"

First, thank you again for your help in investigating this. I had
codepage 1242 in effect, but hadn't thought to mention it because I
thought (wrongly) that the codepage would affect only displays, not
pass-throughs.

Both in cmd.exe and with TCCLE, after chcp 65001 I entered the
command
dir dv*;do*;fu*|more
and got Dobrzynski, Dvorák, and Fucik, with the acute accent on the n
and the ?breve on the r and c. Piping those into a batch file, with
quotes around them, and then editing the batch file as UTF-8, they
echoed correctly and a CD command on Dvorák worked.

Now I'm torn. (a) It would be a little effort to set up shortnames
for the composer folders on the d: drive, but that's a one-time thing
and any new composers will be set up automatically by Windows. In the
past I've used shortnames to get around character-set problems, so I
know that will work.
(b) On the other hand, the codepage solution is less work to set up,
but I haven't tested it end-to-end yet. For instance, I'll have to
edit my data file in UTF-8 also, and I have to find a different grep
to search for matches in the data file, since mine handles only 8-bit
characters.
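
On the grep point in (b), one possible stopgap is a few lines of Python.
This is a minimal sketch assuming the data file is saved as UTF-8; the
function name grep_utf8 and the sample lines are invented for
illustration and say nothing about the real data file's layout.

```python
import io
import re


def grep_utf8(pattern, lines):
    """Hypothetical stand-in for grep: yield the lines matching a regex."""
    rx = re.compile(pattern, re.IGNORECASE)
    for line in lines:
        if rx.search(line):
            yield line.rstrip("\n")


# Made-up sample data; a real file would be read with
# open(path, encoding="utf-8") instead of io.StringIO.
sample = io.StringIO("Dvořák, Antonin (1841-1904)\nFučík, Julius\n")
print(list(grep_utf8("dvoř", sample)))
```

Because Python strings are Unicode, the pattern can contain the ř
directly and the match is case-insensitive.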

But now instead of a problem with no solution, I have one with two
solutions, and that's great progress -- thanks to your help. I am
very grateful for your time and trouble.

--
Stan Brown, Tehachapi, California, USA https://BrownMath.com/
Shikata ga nai...
