Path: i2pn2.org!i2pn.org!aioe.org!jHXdSDKPKExtdJMCjskdCQ.user.46.165.242.75.POSTED!not-for-mail
From: Java.J...@f1.n221.z2.fidonet.fi (Java Jive)
Newsgroups: uk.comp.os.linux
Subject: Re: Character Encoding (Was: while loop taking input from file via
ico
Date: Sun, 15 Aug 2021 16:00:15 +0200
Organization: rbb soupgate
Message-ID: <4070696671@f1.n221.z2.fidonet.fi>
References: <2988759105@f0.n0.z0.fidonet.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit
Injection-Info: gioia.aioe.org; logging-data="30231"; posting-host="jHXdSDKPKExtdJMCjskdCQ.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
X-Comment-To: All
X-MailConverter: SoupGate-OS/2 v1.20
X-Notice: Filtered by postfilter v. 0.9.2

Subject: Re: Character Encoding (Was: while loop taking input from file via
iconv )

On 15/08/2021 13:58, Spiros Bousbouras wrote:
>
> On Sun, 15 Aug 2021 12:57:24 +0100
> Java Jive <java@evij.com.invalid> wrote:
>>
>> Now the crunch, when I unzip these on a Linux machine, I see different
>> bastardisations of accented characters. So, for example where the full
>> 7zip archive when extracted shows an e acute correctly in both a console
>> and a file manager listing ...
>> "Chat Botté, Le" [e is correctly acute]
>> ... (if you're wondering, a French children's picture book version of
>> apparently 'Puss In Boots'), while with the WinZip main archive a
>> console listing shows a very odd character sequence instead of the e
>> acute ...
>> "Chat Bott'$'\302\202'', Le"
>> ... and a file manager listing has a graphic character resembling a 2x2
>> matrix, concerning which note that while \302 octal = \xC2 hex, and
>> \202 octal = \x82 hex, only the second of these and not the first
>> appears in the symbol:
>> |00|
>> |82|
>
> You aren't going to get anywhere with using high level tools for this. You
> need to go low level and see the values of the actual bytes in the filenames.
> So for example something like
>
> ls *Chat* | od -A n -t x1
>
> which will show the bytes in hexadecimal.

Thanks again, will look into that.
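
For anyone following along, here is a minimal sketch of that byte-level check. It assumes a POSIX shell with printf, od and tr, and it rebuilds the odd name from the octal escapes that ls displayed, so no archive is needed to try it. The hex_bytes helper is a hypothetical name used only for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: print whatever arrives on stdin as one hex string.
hex_bytes() {
    od -A n -t x1 | tr -d ' \n'
}

# Rebuild the odd name from the octal escapes shown in the ls listing.
bad_name=$(printf 'Chat Bott\302\202, Le')

# The intended name: e acute is C3 A9 in UTF-8 (octal 303 251).
good_name=$(printf 'Chat Bott\303\251, Le')

bad_hex=$(printf '%s' "$bad_name" | hex_bytes)
good_hex=$(printf '%s' "$good_name" | hex_bytes)

echo "bad:  $bad_hex"
echo "good: $good_hex"
```

The pair C2 82 is the UTF-8 encoding of the single code point U+0082, a control character with no glyph, which would explain why the file manager draws a box containing only 00 and 82. Byte 0x82 happens to be e acute in DOS code pages 437 and 850, so a plausible guess (not confirmed anywhere in this thread) is that the archive stored the name in such a code page and the extractor treated the raw byte as a code point.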

>> My problem is that I can't find a search term to trap this strange
>> character to correct it, for example the following, and a few similar
>> that I've tried, don't work because they don't find the directory:
>> mv "Chat Bott'$'\302\202'', Le" "Chat Botté, Le"
>> mv Chat\ Bott\'$\'\\302\\202\'\',\ Le "Chat Botté, Le"
>
> What directory ? Your post says that some files have strange names. Do also
> some directories have strange names ? In any case , the commands above do not
> show a directory separator.

As part of my manual investigation of the problem, I changed to the
parent directory of the problem directory, so that I could experiment
without having to type tediously long pathnames.
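
A sketch of a rename that should match the directory, under the assumption that its name really contains the two bytes \302 \202 that ls reported. It runs in a scratch directory made with mktemp -d (widely available, though not strictly POSIX), and the variable names are only illustrative.

```shell
#!/bin/sh
# Work in a scratch directory so that nothing real is touched.
work=$(mktemp -d)
cd "$work" || exit 1

# Let printf turn the octal escapes into the actual bytes.
bad_name=$(printf 'Chat Bott\302\202, Le')
good_name=$(printf 'Chat Bott\303\251, Le')   # e acute as UTF-8

# Stand-in for the mis-extracted directory.
mkdir -- "$bad_name"

# The rename itself; "--" guards against names that start with a dash.
mv -- "$bad_name" "$good_name"

# The scratch directory is left in place so the result can be inspected.
ls
```

The two mv attempts quoted above fail because, inside double quotes or with every character backslash-escaped, the quotes, the dollar sign and the backslashes are passed to mv literally, so mv looks for a name that actually contains those characters. In bash specifically, the same rename can be written with ANSI-C quoting as mv $'Chat Bott\302\202, Le' 'Chat Botté, Le'.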

--

Fake news kills!

I may be contacted via the contact address given on my website:
www.macfh.co.uk
