
Subject  Author
* Extended pattern syntax extension  Janis Papanagnou
+* Re: Extended pattern syntax extension  Spiros Bousbouras
|`* Re: Extended pattern syntax extension  Janis Papanagnou
| `* Re: Extended pattern syntax extension  Spiros Bousbouras
|  `- Re: Extended pattern syntax extension  Janis Papanagnou
+* Re: Extended pattern syntax extension  Kaz Kylheku
|`- Re: Extended pattern syntax extension  Janis Papanagnou
`* Re: Extended pattern syntax extension  Janis Papanagnou
 `* Re: Extended pattern syntax extension  Spiros Bousbouras
  `* Re: Extended pattern syntax extension  Janis Papanagnou
   +* Re: Extended pattern syntax extension  Spiros Bousbouras
   |+* Re: Extended pattern syntax extension  Janis Papanagnou
   ||+* Re: Extended pattern syntax extension  William Unruh
   |||+* Re: Extended pattern syntax extension  David W. Hodgins
   ||||`- Re: Extended pattern syntax extension  David W. Hodgins
   |||`- Re: Extended pattern syntax extension  joerg
   ||`- Re: Extended pattern syntax extension  Spiros Bousbouras
   |`* Re: Extended pattern syntax extension  William Unruh
   | `* Re: Extended pattern syntax extension  Spiros Bousbouras
   |  `- Re: Extended pattern syntax extension  William Unruh
   `- Re: Extended pattern syntax extension  William Unruh

Extended pattern syntax extension

<s7io9t$mr3$1@news-1.m-online.net>

https://www.novabbs.com/devel/article-flat.php?id=3762&group=comp.unix.shell#3762

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!news.mixmin.net!news2.arglkargh.de!news.karotte.org!news.space.net!news.m-online.net!.POSTED!not-for-mail
From: janis_pa...@hotmail.com (Janis Papanagnou)
Newsgroups: comp.unix.shell
Subject: Extended pattern syntax extension
Date: Thu, 13 May 2021 10:34:04 +0200
Organization: (posted via) M-net Telekommunikations GmbH
Lines: 31
Message-ID: <s7io9t$mr3$1@news-1.m-online.net>
NNTP-Posting-Host: 2001:a61:252a:da01:109d:75e9:a520:8386
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-Trace: news-1.m-online.net 1620894845 23395 2001:a61:252a:da01:109d:75e9:a520:8386 (13 May 2021 08:34:05 GMT)
X-Complaints-To: news@news-1.m-online.net
NNTP-Posting-Date: Thu, 13 May 2021 08:34:05 +0000 (UTC)
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
Thunderbird/45.8.0
X-Mozilla-News-Host: news://news.m-online.net:119
X-Enigmail-Draft-Status: N1110
 by: Janis Papanagnou - Thu, 13 May 2021 08:34 UTC

Kornshell introduced patterns (called extended patterns in bash) that
allowed, for example, pattern expressions like @(A|B|C), matching any
of the sub-patterns A, B, or C. I was under the impression that I've
somewhere - maybe in some intermediate or experimental ksh version -
seen an extension of those patterns allowing expressions like @(A&B&C)
with the semantics of matching all sub-patterns (in arbitrary order).

One can think of various work-arounds for that like the equivalents

[[ $1 == *@(A*B*C|A*C*B|B*A*C|B*C*A|C*A*B|C*B*A)* ]]

[[ $1 == *A* && $1 == *B* && $1 == *C* ]]

((c=0))
case $1 in
(*A*) ((c++)) ;&
(*B*) ((c++)) ;&
(*C*) ((c++)) ;&
(*) ((c==3))
esac

none of them really satisfying (if compared to @(A&B&C) ).
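
A minimal sketch of the second and third workarounds as reusable tests, written for bash rather than ksh (an assumption made so it can be run widely). The function names `has_all_and` and `has_all_case` are made up for illustration, and the literals A, B, C stand in for real sub-patterns. One deliberate change: in bash a plain `;&` runs the next branch body without testing its pattern, so the counting variant below uses bash's `;;&` terminator (bash 4 or later), which keeps testing the remaining patterns. Whether a given ksh behaves the same way with `;&` is worth checking before relying on the counting form.

```shell
#!/usr/bin/env bash
# Sketch only: bash, with the literals A, B, C standing in for real sub-patterns.

# Three independent pattern tests joined with &&; matches may overlap.
has_all_and() {
  [[ $1 == *A* && $1 == *B* && $1 == *C* ]]
}

# Counting variant. Assumption: bash 4+, where ';;&' continues with the
# next pattern test instead of ending the case statement.
has_all_case() {
  local c=0
  case $1 in
    (*A*) ((c++)) ;;&
    (*B*) ((c++)) ;;&
    (*C*) ((c++)) ;;&
  esac
  ((c == 3))   # exit status 0 only if all three branches were taken
}

for s in xAyBzC CBA AB A; do
  if has_all_and "$s" && has_all_case "$s"; then
    echo "$s: all present"
  else
    echo "$s: at least one missing"
  fi
done
```

Both functions return their result as an exit status, so they can be used directly in `if` or `while` conditions.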

Has someone seen such an extension in a shell (or other programming
language's regexp library)? I'm quite sure I've seen it somewhere
but don't recall where it was.

Thinking of an implementation, though, I'm not sure how to implement
that efficiently based on an FSM. (OR-branches are simple, but ANDs?)

Janis

Re: Extended pattern syntax extension

<3Y9TmbsO0BhnLZt12@bongo-ra.co>

https://www.novabbs.com/devel/article-flat.php?id=3766&group=comp.unix.shell#3766

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!3.eu.feeder.erje.net!feeder.erje.net!npeer.as286.net!npeer-ng0.as286.net!peer02.ams1!peer.ams1.xlned.com!news.xlned.com!peer02.ams4!peer.am4.highwinds-media.com!news.highwinds-media.com!fx36.ams4.POSTED!not-for-mail
From: spi...@gmail.com (Spiros Bousbouras)
Newsgroups: comp.unix.shell,comp.programming
Subject: Re: Extended pattern syntax extension
Message-ID: <3Y9TmbsO0BhnLZt12@bongo-ra.co>
References: <s7io9t$mr3$1@news-1.m-online.net>
In-Reply-To: <s7io9t$mr3$1@news-1.m-online.net>
X-Organisation: Weyland-Yutani
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
Lines: 66
X-Complaints-To: http://netreport.virginmedia.com
NNTP-Posting-Date: Thu, 13 May 2021 13:22:16 UTC
Organization: virginmedia.com
Date: Thu, 13 May 2021 13:22:16 GMT
X-Received-Bytes: 3449
 by: Spiros Bousbouras - Thu, 13 May 2021 13:22 UTC

[ Crossposting to comp.programming ]

On Thu, 13 May 2021 10:34:04 +0200
Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
> Kornshell introduced patterns (called extended patterns in bash) that
> allowed, for example, pattern expressions like @(A|B|C), matching any
> of the sub-patterns A, B, or C. I was under the impression that I've
> somewhere - maybe in some intermediate or experimental ksh version -
> seen an extension of those patterns allowing expressions like @(A&B&C)
> with the semantics of matching all sub-patterns (in arbitrary order).

Does @(ab&bc) match the string abc? In other words, are the
matches allowed to overlap?

> One can think of various work-arounds for that like the equivalents
>
> [[ $1 == *@(A*B*C|A*C*B|B*A*C|B*C*A|C*A*B|C*B*A)* ]]

For this the matches cannot overlap.

> [[ $1 == *A* && $1 == *B* && $1 == *C* ]]
>
> ((c=0))
> case $1 in
> (*A*) ((c++)) ;&
> (*B*) ((c++)) ;&
> (*C*) ((c++)) ;&
> (*) ((c==3))
> esac

For both of these alternatives the matches can overlap.

> none of them really satisfying (if compared to @(A&B&C) ).
>
> Has someone seen such an extension in a shell (or other programming
> language's regexp library)? I'm quite sure I've seen it somewhere
> but don't recall where it was.

The closest I can think of is the \& operator in vim regular
expressions for which the vim documentation says :

*/branch* */\&*
2. A branch is one or more concats, separated by "\&". It matches the last
concat, but only if all the preceding concats also match at the same
position. Examples:
"foobeep\&..." matches "foo" in "foobeep".
".*Peter\&.*Bob" matches in a line containing both "Peter" and "Bob"

I'm surprised it's not more common. Personally I often need such an idiom. On
the shell command line I usually do grep A | grep B | grep ... . Obviously
it can be done with AWK too.
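
For illustration, the two command-line idioms just mentioned could be wrapped like this. It is a sketch under assumptions: `filter_grep` and `filter_awk` are hypothetical helper names, both take exactly three regular expressions, and the awk variant passes them with `-v`, so backslashes in a pattern would go through awk's escape processing.

```shell
#!/usr/bin/env bash
# Sketch: keep only the input lines that match all three regexes, in any order.

# Chained filters: each grep narrows what the previous one let through.
filter_grep() {
  grep -e "$1" | grep -e "$2" | grep -e "$3"
}

# Single awk process: a line is printed only if all three matches succeed.
filter_awk() {
  awk -v a="$1" -v b="$2" -v c="$3" '$0 ~ a && $0 ~ b && $0 ~ c'
}

printf '%s\n' 'xAyBzC' 'CBA' 'AB only' | filter_grep A B C
printf '%s\n' 'xAyBzC' 'CBA' 'AB only' | filter_awk A B C
```

Both forms test each sub-pattern against the whole line independently, so overlapping matches are accepted.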

> Thinking of an implementation, though, I'm not sure how to implement
> that efficiently based on an FSM. (OR-branches are simple, but ANDs?)

Create finite automata M1, M2, M3 corresponding to A, B, C. For each
symbol read, update each automaton. If any automaton reaches an accepting
state, stop updating it. If at some point all 3 automata have reached an
accepting state then the expression has matched; if you reach the end of the
input and at least 1 automaton has not reached an accepting state, then the
expression has not matched.

--
I thought we were the better team. In the first half our keeper didn't
have a save to make apart from the two goals.
Dean Saunders , football coach.

Re: Extended pattern syntax extension

<s7jerp$t9t$1@news-1.m-online.net>

https://www.novabbs.com/devel/article-flat.php?id=3770&group=comp.unix.shell#3770

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!3.eu.feeder.erje.net!feeder.erje.net!news2.arglkargh.de!news.karotte.org!news.space.net!news.m-online.net!.POSTED!not-for-mail
From: janis_pa...@hotmail.com (Janis Papanagnou)
Newsgroups: comp.unix.shell,comp.programming
Subject: Re: Extended pattern syntax extension
Date: Thu, 13 May 2021 16:59:05 +0200
Organization: (posted via) M-net Telekommunikations GmbH
Lines: 87
Message-ID: <s7jerp$t9t$1@news-1.m-online.net>
References: <s7io9t$mr3$1@news-1.m-online.net> <3Y9TmbsO0BhnLZt12@bongo-ra.co>
NNTP-Posting-Host: 2001:a61:252a:da01:109d:75e9:a520:8386
Mime-Version: 1.0
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
X-Trace: news-1.m-online.net 1620917945 30013 2001:a61:252a:da01:109d:75e9:a520:8386 (13 May 2021 14:59:05 GMT)
X-Complaints-To: news@news-1.m-online.net
NNTP-Posting-Date: Thu, 13 May 2021 14:59:05 +0000 (UTC)
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
Thunderbird/45.8.0
X-Enigmail-Draft-Status: N1110
In-Reply-To: <3Y9TmbsO0BhnLZt12@bongo-ra.co>
 by: Janis Papanagnou - Thu, 13 May 2021 14:59 UTC

On 13.05.2021 15:22, Spiros Bousbouras wrote:
> [ Crossposting to comp.programming ]
>
> On Thu, 13 May 2021 10:34:04 +0200
> Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
>> Kornshell introduced patterns (called extended patterns in bash) that
>> allowed, for example, pattern expressions like @(A|B|C), matching any
>> of the sub-patterns A, B, or C. I was under the impression that I've
>> somewhere - maybe in some intermediate or experimental ksh version -
>> seen an extension of those patterns allowing expressions like @(A&B&C)
>> with the semantics of matching all sub-patterns (in arbitrary order).
>
> Does @(ab&bc) match the string abc ? In other words , are the
> matches allowed to overlap ?

I would say so (but don't recall the implementation I thought to have
seen).

>
>> One can think of various work-arounds for that like the equivalents
>>
>> [[ $1 == *@(A*B*C|A*C*B|B*A*C|B*C*A|C*A*B|C*B*A)* ]]
>
> For this the matches cannot overlap.

You are right.

>
>> [[ $1 == *A* && $1 == *B* && $1 == *C* ]]
>>
>> ((c=0))
>> case $1 in
>> (*A*) ((c++)) ;&
>> (*B*) ((c++)) ;&
>> (*C*) ((c++)) ;&
>> (*) ((c==3))
>> esac
>
> For both of these alternatives the matches can overlap.
>
>> none of them really satisfying (if compared to @(A&B&C) ).
>>
>> Has someone seen such an extension in a shell (or other programming
>> language's regexp library)? I'm quite sure I've seen it somewhere
>> but don't recall where it was.
>
> The closest I can think of is the \& operator in vim regular
> expressions for which the vim documentation says :
[...]
>
> I'm surprised it's not more common. Personally I often need such an idiom. On
> the shell command line I usually do grep A | grep B | grep ... .

I needed it as well in a couple cases.

(It could as well have been the case that I haven't seen that feature
in some tool, but just thought it would be a useful and, in Kornshell,
a lexically coherent extension.)

> Obviously it can be done with AWK too.

Yes, and quite trivially (in Awk I'm using that pattern quite often)

/A/&&/B/&&/C/

>
>> Thinking of an implementation, though, I'm not sure how to implement
>> that efficiently based on an FSM. (OR-branches are simple, but ANDs?)
>
> Create finite automata M1 , M2 , M3 corresponding to A , B , C .For each
> symbol read , update each automaton. If any automaton reaches an accepting
> state , stop updating it. If at some point all 3 automata have reached an
> accepting state then the expression has matched ; if you reach the end of the
> input and at least 1 automaton has not reached an accepting state , then the
> expression has not matched.

Well, okay. I was more thinking (initially) about a method that will
stay within the Regular Expression (Chomsky 3) class. As I understand
it this method (like a counting automaton, etc.) would not qualify for
that. On the other hand, Kornshell's patterns are also already beyond
Chomsky 3 since they support backreferences in their patterns, so it's
okay, I guess, to think about non-regular implementations.

Would that be a candidate for ksh93u+m ?

Janis

Re: Extended pattern syntax extension

<20210513084821.510@kylheku.com>

https://www.novabbs.com/devel/article-flat.php?id=3771&group=comp.unix.shell#3771

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: 563-365-...@kylheku.com (Kaz Kylheku)
Newsgroups: comp.unix.shell
Subject: Re: Extended pattern syntax extension
Date: Thu, 13 May 2021 16:04:52 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 72
Message-ID: <20210513084821.510@kylheku.com>
References: <s7io9t$mr3$1@news-1.m-online.net>
Injection-Date: Thu, 13 May 2021 16:04:52 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="6ecb6720e4abb476a2bad040dde6dea3";
logging-data="23877"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19HSlOsc2V+Ut1ovmNkcbj2e5ejbe0/+fU="
User-Agent: slrn/1.0.3 (Linux)
Cancel-Lock: sha1:RRQwjYs9CMlIM9iPh75mndqk7HY=
 by: Kaz Kylheku - Thu, 13 May 2021 16:04 UTC

On 2021-05-13, Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
> Kornshell introduced patterns (called extended patterns in bash) that
> allowed, for example, pattern expressions like @(A|B|C), matching any
> of the sub-patterns A, B, or C. I was under the impression that I've
> somewhere - maybe in some intermediate or experimental ksh version -
> seen an extension of those patterns allowing expressions like @(A&B&C)
> with the semantics of matching all sub-patterns (in arbitrary order).
>
> One can think of various work-arounds for that like the equivalents
>
> [[ $1 == *@(A*B*C|A*C*B|B*A*C|B*C*A|C*A*B|C*B*A)* ]]
>
> [[ $1 == *A* && $1 == *B* && $1 == *C* ]]
>
> ((c=0))
> case $1 in
> (*A*) ((c++)) ;&
> (*B*) ((c++)) ;&
> (*C*) ((c++)) ;&
> (*) ((c==3))
> esac
>
> none of them really satisfying (if compared to @(A&B&C) ).
>
> Has someone seen such an extension in a shell (or other programming
> language's regexp library)? I'm quite sure I've seen it somewhere
> but don't recall where it was.
>
> Thinking of an implementation, though, I'm not sure how to implement
> that efficiently based on an FSM. (OR-branches are simple, but ANDs?)

Regex derivatives can be used; the & operator has a simple
interpretation under derivatives. Regex derivatives were discovered by
Janusz Brzozowski, who published about them in 1964.

Derivatives can be interpreted directly, and that's how they were
originally described.

The basic idea behind this is that given a regular expression R,
we can calculate a "derivative" of R with respect to an input symbol s,
or R' = d(R, s). Maybe dR/ds, haha.

The derivative R' is a regular expression which matches the rest of the
input after s has been consumed.

Matching a string is performed by repeatedly calculating a chain of
derivatives, for each symbol in the string. If the end of the string is
reached, and the remaining regular expression is nullable (capable of
matching the empty string), then the string has been matched. If at any
time a derivative is produced which doesn't match any strings, then the
match has failed.

Some more recent work describes ways of using derivatives to compile
to DFA. The gist is to chase the possible derivatives at compile time
and calculate the resulting transition graph statically.
The trick, while doing that, is to recognize the identical states to
condense the graph and recognize loops. Sometimes the derivative of
a regex will yield the same regex, or a regex that was seen before.
Repetition does this, for instance. The derivative of a* with respect
to a is a*. A notion of "sameness" is required, in consideration of
the issue that sometimes regexes have the same meaning, but look
different.

About the and operator: given R1&R2, the derivative has an easy
interpretation. d(R1&R2, s) is just d(R1, s)&d(R2, s). The derivative
operation parallelizes into the branches of the and, and the results are
combined with & again. If either derivative is the empty regex [],
then there is no match since R&[] = [].
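
As a worked illustration of the rule for &, here is a short derivation in the d(R, s) notation used above. The nullability predicate ν, the alphabet symbol Σ, and the choice of example are assumptions added for this sketch; the simplification steps use the fact that Σ* already contains every string.

```latex
\begin{align*}
d(R_1 \mathbin{\&} R_2,\, s) &= d(R_1, s) \mathbin{\&} d(R_2, s) \\
\nu(R_1 \mathbin{\&} R_2) &= \nu(R_1) \wedge \nu(R_2) \\[6pt]
R &= \Sigma^* a\, \Sigma^* \mathbin{\&} \Sigma^* b\, \Sigma^* \\
d(R, a) &= (\Sigma^* a\, \Sigma^* \mid \Sigma^*) \mathbin{\&} \Sigma^* b\, \Sigma^*
         \;\equiv\; \Sigma^* b\, \Sigma^* \\
d(d(R, a), b) &= \Sigma^* b\, \Sigma^* \mid \Sigma^* \;\equiv\; \Sigma^* \\
\nu(\Sigma^*) &= \text{true}
\end{align*}
```

So the string ab is accepted by R: after both symbols are consumed, the remaining expression is nullable. Reading only a leaves Σ* b Σ*, which is not nullable, so a alone is rejected.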

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal

Re: Extended pattern syntax extension

<s7jqgu$106$1@news-1.m-online.net>

https://www.novabbs.com/devel/article-flat.php?id=3773&group=comp.unix.shell#3773

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!3.eu.feeder.erje.net!feeder.erje.net!news2.arglkargh.de!news.karotte.org!news.space.net!news.m-online.net!.POSTED!not-for-mail
From: janis_pa...@hotmail.com (Janis Papanagnou)
Newsgroups: comp.unix.shell
Subject: Re: Extended pattern syntax extension
Date: Thu, 13 May 2021 20:18:06 +0200
Organization: (posted via) M-net Telekommunikations GmbH
Lines: 56
Message-ID: <s7jqgu$106$1@news-1.m-online.net>
References: <s7io9t$mr3$1@news-1.m-online.net>
<20210513084821.510@kylheku.com>
NNTP-Posting-Host: 2001:a61:252a:da01:109d:75e9:a520:8386
Mime-Version: 1.0
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
X-Trace: news-1.m-online.net 1620929886 1030 2001:a61:252a:da01:109d:75e9:a520:8386 (13 May 2021 18:18:06 GMT)
X-Complaints-To: news@news-1.m-online.net
NNTP-Posting-Date: Thu, 13 May 2021 18:18:06 +0000 (UTC)
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
Thunderbird/45.8.0
X-Enigmail-Draft-Status: N1110
In-Reply-To: <20210513084821.510@kylheku.com>
 by: Janis Papanagnou - Thu, 13 May 2021 18:18 UTC

On 13.05.2021 18:04, Kaz Kylheku wrote:
> On 2021-05-13, Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
>> Kornshell introduced patterns (called extended patterns in bash) that
>> allowed, for example, pattern expressions like @(A|B|C), matching any
>> of the sub-patterns A, B, or C. I was under the impression that I've
>> somewhere - maybe in some intermediate or experimental ksh version -
>> seen an extension of those patterns allowing expressions like @(A&B&C)
>> with the semantics of matching all sub-patterns (in arbitrary order).
[...]
>> Thinking of an implementation, though, I'm not sure how to implement
>> that efficiently based on an FSM. (OR-branches are simple, but ANDs?)
>
> Regex derivatives can be used; the & operator has a simple
> interpretation under derivatives. Regex derivatives were discovered by
> Janusz Brzozowski, who published about them in 1964.
>
> Derivatives can be interpreted directly, and that's how they were
> originally described.
>
> The basic idea behind this is that given a regular expression R,
> we can calculate a "derivative" of R with respect to an input symbol s,
> or R' = d(R, s). Maybe dR/ds, haha.
>
> The derivative R' is a regular expression which matches the rest of the
> input after s has been consumed.
>
> Matching a string is performed by repeatedly calculating a chain of
> derivatives, for each symbol in the string. If the end of the string is
> reached, and the remaining regular expression is nullable (capable of
> matching the empty string), then the string has been matched. If at any
> time a derivative is produced which doesn't match any strings, then the
> match has failed.
>
> Some more recent work describes ways of using derivatives to compile
> to DFA. The gist is to chase the possible derivatives at compile time
> and calculate the resulting transition graph statically.
> The trick, while doing that, is to recognize the identical states to
> condense the graph and recognize loops. Sometimes the derivative of
> a regex will yield the same regex, or a regex that was seen before.
> Repetition does this, for instance. The derivative of a* with respect
> to a is a*. A notion of "sameness" is required, in consideration of
> the issue that sometimes regexes have the same meaning, but look
> different.
>
> About the and operator: given R1&R2, the derivative has an easy
> interpretation. d(R1&R2, s) is just d(R1, s)&d(R2, s). The derivative
> operation parallelizes into the branches of the and, and the results are
> combined with & again. If either derivative is the empty regex [],
> then there is no match since R&[] = [].

This is very interesting. What would be the complexity e.g. of the
recognition of "identical states to condense the graph and loops"?
Implemented in shell-interpreters it might be a factor to consider.

Janis

Re: Extended pattern syntax extension

<CXKS8ZtfDBvmQxRCZ@bongo-ra.co>

https://www.novabbs.com/devel/article-flat.php?id=3774&group=comp.unix.shell#3774

Path: i2pn2.org!i2pn.org!aioe.org!news.uzoreto.com!news-out.netnews.com!news.alt.net!fdc3.netnews.com!peer01.ams1!peer.ams1.xlned.com!news.xlned.com!peer03.ams4!peer.am4.highwinds-media.com!news.highwinds-media.com!fx46.ams4.POSTED!not-for-mail
From: spi...@gmail.com (Spiros Bousbouras)
Newsgroups: comp.unix.shell,comp.programming
Subject: Re: Extended pattern syntax extension
Message-ID: <CXKS8ZtfDBvmQxRCZ@bongo-ra.co>
References: <s7io9t$mr3$1@news-1.m-online.net> <3Y9TmbsO0BhnLZt12@bongo-ra.co> <s7jerp$t9t$1@news-1.m-online.net>
In-Reply-To: <s7jerp$t9t$1@news-1.m-online.net>
X-Organisation: Weyland-Yutani
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
Lines: 41
X-Complaints-To: http://netreport.virginmedia.com
NNTP-Posting-Date: Thu, 13 May 2021 18:34:37 UTC
Organization: virginmedia.com
Date: Thu, 13 May 2021 18:34:37 GMT
X-Received-Bytes: 2683
 by: Spiros Bousbouras - Thu, 13 May 2021 18:34 UTC

On Thu, 13 May 2021 16:59:05 +0200
Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
> On 13.05.2021 15:22, Spiros Bousbouras wrote:
> > [ Crossposting to comp.programming ]
> >
> > On Thu, 13 May 2021 10:34:04 +0200
> > Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:

[...]

> >> Thinking of an implementation, though, I'm not sure how to implement
> >> that efficiently based on an FSM. (OR-branches are simple, but ANDs?)
> >
> > Create finite automata M1 , M2 , M3 corresponding to A , B , C .For each
> > symbol read , update each automaton. If any automaton reaches an accepting
> > state , stop updating it. If at some point all 3 automata have reached an
> > accepting state then the expression has matched ; if you reach the end of the
> > input and at least 1 automaton has not reached an accepting state , then the
> > expression has not matched.
>
> Well, okay. I was more thinking (initially) about a method that will
> stay within the Regular Expression (Chomsky 3) class. As I understand
> it this method (like a counting automaton, etc.) would not qualify for
> that.

It can be realised with a regular expression, just as the \{n,m\} operator can.

Anyway, I did some digging and I saw a better way to do conjunction (it's
the same idea really but all the details are presented): you take the
product of 2 deterministic finite automata. Details are on pages 30-32 of
http://www.cse.chalmers.se/~coquand/AUTOMATA/o2.pdf . And right after, the
notes describe a modification of the product which works to get disjunction.
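
The product construction can be sketched in bash for the overlap example raised earlier in the thread (ab and bc against abc). This is an illustration under assumptions, not the construction from the cited notes: `step_ab`, `step_bc` and `product_match` are hypothetical names, and each component DFA is hand-written for "the input contains this substring", with state 2 accepting and absorbing. The product state is the pair (p, q), both components step on every symbol, and the pair is accepting only if both components are.

```shell
#!/usr/bin/env bash
# Sketch: product of two hand-written DFAs, run one symbol at a time.

# DFA for "input contains ab". Arguments: current state, input character.
step_ab() {
  case "$1$2" in
    2*) echo 2 ;;   # accepting state is absorbing
    1b) echo 2 ;;   # saw "a", now "b": accept
    ?a) echo 1 ;;   # any non-accepting state moves to 1 on "a"
    *)  echo 0 ;;   # anything else resets
  esac
}

# DFA for "input contains bc", built the same way.
step_bc() {
  case "$1$2" in
    2*) echo 2 ;;
    1c) echo 2 ;;
    ?b) echo 1 ;;
    *)  echo 0 ;;
  esac
}

# Product automaton: the state is the pair (p, q); accept iff both accept.
product_match() {
  local input=$1 p=0 q=0 i ch
  for ((i = 0; i < ${#input}; i++)); do
    ch=${input:i:1}
    p=$(step_ab "$p" "$ch")
    q=$(step_bc "$q" "$ch")
  done
  [[ $p == 2 && $q == 2 ]]
}

for s in abc bcab ab acb; do
  if product_match "$s"; then echo "$s: match"; else echo "$s: no match"; fi
done
```

Changing the final test to `[[ $p == 2 || $q == 2 ]]` gives disjunction over the same product states, which mirrors the modification mentioned above.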

> On the other hand, Kornshell's patterns are also already beyond
> Chomsky 3 since they support backreferences in their patterns, so it's
> okay, I guess, to think about non-regular implementations.

--
I was once seriously berated for revealing the ending of
"The Passion of the Christ."
Roger Ebert

Re: Extended pattern syntax extension

<s7o5oq$c33$1@news-1.m-online.net>

https://www.novabbs.com/devel/article-flat.php?id=3779&group=comp.unix.shell#3779

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!3.eu.feeder.erje.net!feeder.erje.net!news2.arglkargh.de!news.karotte.org!news.space.net!news.m-online.net!.POSTED!not-for-mail
From: janis_pa...@hotmail.com (Janis Papanagnou)
Newsgroups: comp.unix.shell,comp.programming
Subject: Re: Extended pattern syntax extension
Date: Sat, 15 May 2021 11:54:34 +0200
Organization: (posted via) M-net Telekommunikations GmbH
Lines: 48
Message-ID: <s7o5oq$c33$1@news-1.m-online.net>
References: <s7io9t$mr3$1@news-1.m-online.net> <3Y9TmbsO0BhnLZt12@bongo-ra.co>
<s7jerp$t9t$1@news-1.m-online.net> <CXKS8ZtfDBvmQxRCZ@bongo-ra.co>
NNTP-Posting-Host: 2001:a61:252a:da01:9517:4d9e:bbbd:124e
Mime-Version: 1.0
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
X-Trace: news-1.m-online.net 1621072474 12387 2001:a61:252a:da01:9517:4d9e:bbbd:124e (15 May 2021 09:54:34 GMT)
X-Complaints-To: news@news-1.m-online.net
NNTP-Posting-Date: Sat, 15 May 2021 09:54:34 +0000 (UTC)
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
Thunderbird/45.8.0
X-Enigmail-Draft-Status: N1110
In-Reply-To: <CXKS8ZtfDBvmQxRCZ@bongo-ra.co>
 by: Janis Papanagnou - Sat, 15 May 2021 09:54 UTC

On 13.05.2021 20:34, Spiros Bousbouras wrote:
> On Thu, 13 May 2021 16:59:05 +0200
> Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
>> On 13.05.2021 15:22, Spiros Bousbouras wrote:
>>> [ Crossposting to comp.programming ]
>>>
>>> On Thu, 13 May 2021 10:34:04 +0200
>>> Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
>
> [...]
>
>>>> Thinking of an implementation, though, I'm not sure how to implement
>>>> that efficiently based on an FSM. (OR-branches are simple, but ANDs?)
>>>
>>> Create finite automata M1 , M2 , M3 corresponding to A , B , C .For each
>>> symbol read , update each automaton. If any automaton reaches an accepting
>>> state , stop updating it. If at some point all 3 automata have reached an
>>> accepting state then the expression has matched ; if you reach the end of the
>>> input and at least 1 automaton has not reached an accepting state , then the
>>> expression has not matched.
>>
>> Well, okay. I was more thinking (initially) about a method that will
>> stay within the Regular Expression (Chomsky 3) class. As I understand
>> it this method (like a counting automaton, etc.) would not qualify for
>> that.
>
> It can be realised with a regular expression like the \{n,m\} operator can.

Hmm. I have always regarded the range notation (and a few other syntaxes)
as an abbreviation that stays within FSA/Chomsky-3 once it is expanded.
(For large values of 'm' this is admittedly not a practical
implementation; the complexity would depend on 'm', since the
expanded form of the FSM would be that large.)

(With "counting automaton" I rather meant syntaxes like, for example,
bracket counting.)

>
> Anyway , I did some digging and I saw a better way to do conjunction (it's
> the same idea really but all the details are presented) : you take the
> product of 2 deterministic finite automata. Details are on pages 30-32 of
> http://www.cse.chalmers.se/~coquand/AUTOMATA/o2.pdf .And right after , the
> notes describe a modification of the product which works to get disjunction.

I'll have a look into that. Thanks!

Janis

Re: Extended pattern syntax extension

<sbqpud$igo$1@news-1.m-online.net>

https://www.novabbs.com/devel/article-flat.php?id=4022&group=comp.unix.shell#4022

Path: i2pn2.org!i2pn.org!aioe.org!news.mixmin.net!news2.arglkargh.de!news.karotte.org!news.space.net!news.m-online.net!.POSTED!not-for-mail
From: janis_pa...@hotmail.com (Janis Papanagnou)
Newsgroups: comp.unix.shell
Subject: Re: Extended pattern syntax extension
Date: Sun, 4 Jul 2021 00:56:13 +0200
Organization: (posted via) M-net Telekommunikations GmbH
Lines: 42
Message-ID: <sbqpud$igo$1@news-1.m-online.net>
References: <s7io9t$mr3$1@news-1.m-online.net>
NNTP-Posting-Host: 2001:a61:241e:cc01:5067:7de7:dfb1:4998
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Trace: news-1.m-online.net 1625352973 18968 2001:a61:241e:cc01:5067:7de7:dfb1:4998 (3 Jul 2021 22:56:13 GMT)
X-Complaints-To: news@news-1.m-online.net
NNTP-Posting-Date: Sat, 3 Jul 2021 22:56:13 +0000 (UTC)
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
Thunderbird/45.8.0
X-Enigmail-Draft-Status: N1110
In-Reply-To: <s7io9t$mr3$1@news-1.m-online.net>
 by: Janis Papanagnou - Sat, 3 Jul 2021 22:56 UTC

On 13.05.2021 10:34, Janis Papanagnou wrote:
> Kornshell introduced patterns (called extended patterns in bash) that
> allowed, for example, pattern expressions like @(A|B|C), matching any
> of the sub-patterns A, B, or C. I was under the impression that I've
> somewhere - maybe in some intermediate or experimental ksh version -
> seen an extension of those patterns allowing expressions like @(A&B&C)
> with the semantics of matching all sub-patterns (in arbitrary order).
>
> One can think of various work-arounds for that like the equivalents
>
> [[ $1 == *@(A*B*C|A*C*B|B*A*C|B*C*A|C*A*B|C*B*A)* ]]
>
> [[ $1 == *A* && $1 == *B* && $1 == *C* ]]
>
> ((c=0))
> case $1 in
> (*A*) ((c++)) ;&
> (*B*) ((c++)) ;&
> (*C*) ((c++)) ;&
> (*) ((c==3))
> esac
>
> none of them really satisfying (if compared to @(A&B&C) ).
>
> [...]
> I'm quite sure I've seen it somewhere but don't recall where it was.

Doh! - I had obviously seen it described where it actually belongs, in
the ksh man page...

"A pattern-list is a list of one or more patterns separated from each
other with a & or ⎪. A & signifies that all patterns must be matched
whereas ⎪ requires that only one pattern be matched."

But I used the pattern incorrectly in my tests. This one actually works

[[ $1 == @(*A*&*B*&*C*) ]]

(A bit more clumsy but still better than the workarounds.)

Janis

Re: Extended pattern syntax extension

<MmsCcM5LbSdiv6H71k@bongo-ra.co>

https://www.novabbs.com/devel/article-flat.php?id=4024&group=comp.unix.shell#4024

Path: i2pn2.org!i2pn.org!aioe.org!Lsq9Ulyii8Zln50ye03obQ.user.gioia.aioe.org.POSTED!not-for-mail
From: spi...@gmail.com (Spiros Bousbouras)
Newsgroups: comp.unix.shell
Subject: Re: Extended pattern syntax extension
Date: Sun, 4 Jul 2021 14:32:25 +0000 (UTC)
Organization: Aioe.org NNTP Server
Lines: 44
Message-ID: <MmsCcM5LbSdiv6H71k@bongo-ra.co>
References: <s7io9t$mr3$1@news-1.m-online.net> <sbqpud$igo$1@news-1.m-online.net>
NNTP-Posting-Host: Lsq9Ulyii8Zln50ye03obQ.user.gioia.aioe.org
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Complaints-To: abuse@aioe.org
X-Organisation: Weyland-Yutani
X-Notice: Filtered by postfilter v. 0.9.2
 by: Spiros Bousbouras - Sun, 4 Jul 2021 14:32 UTC

On Sun, 4 Jul 2021 00:56:13 +0200
Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
> On 13.05.2021 10:34, Janis Papanagnou wrote:
> > Kornshell introduced patterns (called extended patterns in bash) that
> > allowed, for example, pattern expressions like @(A|B|C), matching any
> > of the sub-patterns A, B, or C. I was under the impression that I've
> > somewhere - maybe in some intermediate or experimental ksh version -
> > seen an extension of those patterns allowing expressions like @(A&B&C)
> > with the semantics of matching all sub-patterns (in arbitrary order).
> >
> > One can think of various work-arounds for that like the equivalents
> >
> > [[ $1 == *@(A*B*C|A*C*B|B*A*C|B*C*A|C*A*B|C*B*A)* ]]
> >
> > [[ $1 == *A* && $1 == *B* && $1 == *C* ]]
> >
> > ((c=0))
> > case $1 in
> > (*A*) ((c++)) ;&
> > (*B*) ((c++)) ;&
> > (*C*) ((c++)) ;&
> > (*) ((c==3))
> > esac
> >
> > none of them really satisfying (if compared to @(A&B&C) ).
> >
> > [...]
> > I'm quite sure I've seen it somewhere but don't recall where it was.
>
> Doh! - I had obviously seen it described where it actually belongs, in
> the ksh man page...
>
> "A pattern-list is a list of one or more patterns separated from each
> other with a & or ⎪. A & signifies that all patterns must be matched
> whereas ⎪ requires that only one pattern be matched."

I believe you want | rather than ⎪ . Somehow you replaced the former
character with the latter, which looks similar.

> But I used the pattern incorrectly in my tests. This one actually works
>
> [[ $1 == @(*A*&*B*&*C*) ]]
>
> (A bit more clumsy but still better than the workarounds.)

Re: Extended pattern syntax extension

<sbshup$2kp$1@news-1.m-online.net>

https://www.novabbs.com/devel/article-flat.php?id=4025&group=comp.unix.shell#4025

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!4.us.feeder.erje.net!3.eu.feeder.erje.net!feeder.erje.net!news2.arglkargh.de!news.karotte.org!news.space.net!news.m-online.net!.POSTED!not-for-mail
From: janis_pa...@hotmail.com (Janis Papanagnou)
Newsgroups: comp.unix.shell
Subject: Re: Extended pattern syntax extension
Date: Sun, 4 Jul 2021 16:52:09 +0200
Organization: (posted via) M-net Telekommunikations GmbH
Lines: 19
Message-ID: <sbshup$2kp$1@news-1.m-online.net>
References: <s7io9t$mr3$1@news-1.m-online.net>
<sbqpud$igo$1@news-1.m-online.net> <MmsCcM5LbSdiv6H71k@bongo-ra.co>
NNTP-Posting-Host: 2001:a61:241e:cc01:28ff:8d60:b0bf:f2d7
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Trace: news-1.m-online.net 1625410329 2713 2001:a61:241e:cc01:28ff:8d60:b0bf:f2d7 (4 Jul 2021 14:52:09 GMT)
X-Complaints-To: news@news-1.m-online.net
NNTP-Posting-Date: Sun, 4 Jul 2021 14:52:09 +0000 (UTC)
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
Thunderbird/45.8.0
X-Enigmail-Draft-Status: N1110
In-Reply-To: <MmsCcM5LbSdiv6H71k@bongo-ra.co>
 by: Janis Papanagnou - Sun, 4 Jul 2021 14:52 UTC

On 04.07.2021 16:32, Spiros Bousbouras wrote:
> On Sun, 4 Jul 2021 00:56:13 +0200
> Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
>> the ksh man page...
>>
>> "A pattern-list is a list of one or more patterns separated from each
>> other with a & or ⎪. A & signifies that all patterns must be matched
>> whereas ⎪ requires that only one pattern be matched."
>
> I believe you want | rather than ⎪. Somehow you replaced the former
> character with the latter which looks similar.

I haven't inspected the character's code number, I just copy/pasted it
from the ksh man page, I don't think I've replaced anything. If I copy
the text passages from the man page into 'od -c' I certainly don't get
the pipe symbol '|'. Does your ksh man page show the correct character?

Janis

Re: Extended pattern syntax extension

<gtsCcM5Lb@bongo-ra.co>

https://www.novabbs.com/devel/article-flat.php?id=4026&group=comp.unix.shell#4026

Path: i2pn2.org!i2pn.org!aioe.org!Lsq9Ulyii8Zln50ye03obQ.user.gioia.aioe.org.POSTED!not-for-mail
From: spi...@gmail.com (Spiros Bousbouras)
Newsgroups: comp.unix.shell
Subject: Re: Extended pattern syntax extension
Date: Sun, 4 Jul 2021 15:07:45 +0000 (UTC)
Organization: Aioe.org NNTP Server
Lines: 28
Message-ID: <gtsCcM5Lb@bongo-ra.co>
References: <s7io9t$mr3$1@news-1.m-online.net> <sbqpud$igo$1@news-1.m-online.net> <MmsCcM5LbSdiv6H71k@bongo-ra.co>
<sbshup$2kp$1@news-1.m-online.net>
NNTP-Posting-Host: Lsq9Ulyii8Zln50ye03obQ.user.gioia.aioe.org
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Complaints-To: abuse@aioe.org
X-Organisation: Weyland-Yutani
X-Notice: Filtered by postfilter v. 0.9.2
 by: Spiros Bousbouras - Sun, 4 Jul 2021 15:07 UTC

On Sun, 4 Jul 2021 16:52:09 +0200
Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
> On 04.07.2021 16:32, Spiros Bousbouras wrote:
> > On Sun, 4 Jul 2021 00:56:13 +0200
> > Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
> >> the ksh man page...
> >>
> >> "A pattern-list is a list of one or more patterns separated from each
> >> other with a & or ⎪. A & signifies that all patterns must be matched
> >> whereas ⎪ requires that only one pattern be matched."
> >
> > I believe you want | rather than ⎪. Somehow you replaced the former
> > character with the latter which looks similar.
>
> I haven't inspected the character's code number, I just copy/pasted it
> from the ksh man page, I don't think I've replaced anything. If I copy
> the text passages from the man page into 'od -c' I certainly don't get
> the pipe symbol '|'. Does your ksh man page show the correct character?

It shows |, i.e. ASCII 124. I also experimented with using ⎪ (Unicode
9130 decimal, U+23AA) and ksh takes it literally, i.e. it matches itself. I think
it would be good if you examined the source code of the man page. If it
really has ⎪ then it's a bug. If it doesn't then the bug is with one
of the tools of the man pipeline which translate the page.

--
If you want to appear cultured and hip at the same time you can say that Jar Jar
Binks is the Papageno of Star Wars.

Re: Extended pattern syntax extension

<sbskc9$j0c$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=4027&group=comp.unix.shell#4027

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: unr...@invalid.ca (William Unruh)
Newsgroups: comp.unix.shell
Subject: Re: Extended pattern syntax extension
Date: Sun, 4 Jul 2021 15:33:30 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 19
Message-ID: <sbskc9$j0c$1@dont-email.me>
References: <s7io9t$mr3$1@news-1.m-online.net>
<sbqpud$igo$1@news-1.m-online.net> <MmsCcM5LbSdiv6H71k@bongo-ra.co>
<sbshup$2kp$1@news-1.m-online.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf8
Content-Transfer-Encoding: 8bit
Injection-Date: Sun, 4 Jul 2021 15:33:30 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="a249e292e4e340bd12b5bef85467351e";
logging-data="19468"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/q01L0U6X7eQEnBmw05oCm"
User-Agent: slrn/1.0.3 (Linux)
Cancel-Lock: sha1:F++d8DRuuHhRlI99eRPZClzUEsU=
 by: William Unruh - Sun, 4 Jul 2021 15:33 UTC

On 2021-07-04, Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
> On 04.07.2021 16:32, Spiros Bousbouras wrote:
>> On Sun, 4 Jul 2021 00:56:13 +0200
>> Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
>>> the ksh man page...
>>>
>>> "A pattern-list is a list of one or more patterns separated from each
>>> other with a & or ⎪. A & signifies that all patterns must be matched
>>> whereas ⎪ requires that only one pattern be matched."
>>
>> I believe you want | rather than ⎪. Somehow you replaced the former
>> character with the latter which looks similar.
>
> I haven't inspected the character's code number, I just copy/pasted it
> from the ksh man page, I don't think I've replaced anything. If I copy
> the text passages from the man page into 'od -c' I certainly don't get
> the pipe symbol '|'. Does your ksh man page show the correct character?

My ksh man page has the wrong symbol -- ⎪ not |, 'E2 8E AA' not '7C'
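
A quick way to see which bytes a character really consists of, as a sketch:
hexbytes is a made-up helper name, od and tr are the standard tools, and the
second call assumes the script or terminal is UTF-8 encoded.

```shell
# hexbytes STRING: print the bytes of STRING as lowercase hex, no separators
hexbytes() {
    printf '%s' "$1" | od -An -tx1 | tr -d ' \n'
    echo
}

hexbytes '|'    # the ASCII vertical bar
hexbytes '⎪'    # the character as copied from the rendered man page
```

With UTF-8 input the two calls should print 7c and e28eaa, matching the byte
values quoted above.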

Re: Extended pattern syntax extension

<sbskie$3b2$1@news-1.m-online.net>

https://www.novabbs.com/devel/article-flat.php?id=4028&group=comp.unix.shell#4028

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!news.mixmin.net!news2.arglkargh.de!news.karotte.org!news.space.net!news.m-online.net!.POSTED!not-for-mail
From: janis_pa...@hotmail.com (Janis Papanagnou)
Newsgroups: comp.unix.shell
Subject: Re: Extended pattern syntax extension
Date: Sun, 4 Jul 2021 17:36:46 +0200
Organization: (posted via) M-net Telekommunikations GmbH
Lines: 34
Message-ID: <sbskie$3b2$1@news-1.m-online.net>
References: <s7io9t$mr3$1@news-1.m-online.net>
<sbqpud$igo$1@news-1.m-online.net> <MmsCcM5LbSdiv6H71k@bongo-ra.co>
<sbshup$2kp$1@news-1.m-online.net> <gtsCcM5Lb@bongo-ra.co>
NNTP-Posting-Host: 2001:a61:241e:cc01:28ff:8d60:b0bf:f2d7
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Trace: news-1.m-online.net 1625413006 3426 2001:a61:241e:cc01:28ff:8d60:b0bf:f2d7 (4 Jul 2021 15:36:46 GMT)
X-Complaints-To: news@news-1.m-online.net
NNTP-Posting-Date: Sun, 4 Jul 2021 15:36:46 +0000 (UTC)
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
Thunderbird/45.8.0
X-Enigmail-Draft-Status: N1110
In-Reply-To: <gtsCcM5Lb@bongo-ra.co>
 by: Janis Papanagnou - Sun, 4 Jul 2021 15:36 UTC

On 04.07.2021 17:07, Spiros Bousbouras wrote:
> On Sun, 4 Jul 2021 16:52:09 +0200
> Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
>> On 04.07.2021 16:32, Spiros Bousbouras wrote:
>>> On Sun, 4 Jul 2021 00:56:13 +0200
>>> Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
>>>> the ksh man page...
>>>>
>>>> "A pattern-list is a list of one or more patterns separated from each
>>>> other with a & or ⎪. A & signifies that all patterns must be matched
>>>> whereas ⎪ requires that only one pattern be matched."
>>>
>>> I believe you want | rather than ⎪. Somehow you replaced the former
>>> character with the latter which looks similar.
>>
>> I haven't inspected the character's code number, I just copy/pasted it
>> from the ksh man page, I don't think I've replaced anything. If I copy
>> the text passages from the man page into 'od -c' I certainly don't get
>> the pipe symbol '|'. Does your ksh man page show the correct character?
>
> It shows | i.e. ASCII 124. I also experimented with using ⎪ (unicode
> 9130 decimal) and ksh takes it literally i.e. it matches itself. I think
> it would be good if you examined the source code of the man page. If it
> really has ⎪ then it's a bug. If it doesn't then the bug is with one
> of the tools of the man pipeline which translate the page.

The man page is full of that "other" pipe symbol; there are only three
instances of the "real" one.

The files ksh.1.gz and ksh93.1.gz contain only the "real" pipe symbol.
It seems the *roff processor creates that symbol in specific contexts?

Janis

Re: Extended pattern syntax extension

<sbslp6$t6f$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=4029&group=comp.unix.shell#4029

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: unr...@invalid.ca (William Unruh)
Newsgroups: comp.unix.shell
Subject: Re: Extended pattern syntax extension
Date: Sun, 4 Jul 2021 15:57:26 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 60
Message-ID: <sbslp6$t6f$1@dont-email.me>
References: <s7io9t$mr3$1@news-1.m-online.net>
<sbqpud$igo$1@news-1.m-online.net> <MmsCcM5LbSdiv6H71k@bongo-ra.co>
<sbshup$2kp$1@news-1.m-online.net> <gtsCcM5Lb@bongo-ra.co>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf8
Content-Transfer-Encoding: 8bit
Injection-Date: Sun, 4 Jul 2021 15:57:26 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="a249e292e4e340bd12b5bef85467351e";
logging-data="29903"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+BwZSjISrGJ9CoMaz4hVPR"
User-Agent: slrn/1.0.3 (Linux)
Cancel-Lock: sha1:mA1BptayUdqBJ/NQTJy2BlwLzJY=
 by: William Unruh - Sun, 4 Jul 2021 15:57 UTC

On 2021-07-04, Spiros Bousbouras <spibou@gmail.com> wrote:
> On Sun, 4 Jul 2021 16:52:09 +0200
> Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
>> On 04.07.2021 16:32, Spiros Bousbouras wrote:
>> > On Sun, 4 Jul 2021 00:56:13 +0200
>> > Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
>> >> the ksh man page...
>> >>
>> >> "A pattern-list is a list of one or more patterns separated from each
>> >> other with a & or ⎪. A & signifies that all patterns must be matched
>> >> whereas ⎪ requires that only one pattern be matched."
>> >
>> > I believe you want | rather than ⎪. Somehow you replaced the former
>> > character with the latter which looks similar.
>>
>> I haven't inspected the character's code number, I just copy/pasted it
>> from the ksh man page, I don't think I've replaced anything. If I copy
>> the text passages from the man page into 'od -c' I certainly don't get
>> the pipe symbol '|'. Does your ksh man page show the correct character?
>
> It shows | i.e. ASCII 124. I also experimented with using ⎪ (unicode
> 9130 decimal) and ksh takes it literally i.e. it matches itself. I think
> it would be good if you examined the source code of the man page. If it
> really has ⎪ then it's a bug. If it doesn't then the bug is with one
> of the tools of the man pipeline which translate the page.
>

Using hexedit the lines in the unxz man page which are

......................................................s:

; & ( ) ⎪ < > new-line space tab

A

come out as
00000744 73 3A 0A 2E 50 50 0A 2E 52 53 0A 5C 66 33 3B 20 20 20 26 20  s:..PP..RS.\f3;   &
00000758 20 20 28 20 20 20 29 20 20 20 5C 28 62 76 20 20 20 3C 20 20    (   )   \(bv   <
0000076C 20 3E 20 20 20 6E 65 77 2D 6C 69 6E 65 20 20 20 73 70 61 63   >   new-line   spac
00000780 65 20 20 20 74 61 62 5C 66 50 0A 2E 52 45 0A 2E 50 50 0A 41  e   tab\fP..RE..PP.A

Ie, it is the man page that is wrong -- consistently using the Unicode
representation rather than the ASCII one.

And the symbol they use in the man page does not work as a substitute.

Under ksh

echo a | cat

gives
a | cat

Not as it should

a

Ie, that symbol does NOT act as a pipe.

Re: Extended pattern syntax extension

<sbspki$ovb$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=4030&group=comp.unix.shell#4030

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: unr...@invalid.ca (William Unruh)
Newsgroups: comp.unix.shell
Subject: Re: Extended pattern syntax extension
Date: Sun, 4 Jul 2021 17:03:14 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 10
Message-ID: <sbspki$ovb$1@dont-email.me>
References: <s7io9t$mr3$1@news-1.m-online.net>
<sbqpud$igo$1@news-1.m-online.net> <MmsCcM5LbSdiv6H71k@bongo-ra.co>
<sbshup$2kp$1@news-1.m-online.net> <gtsCcM5Lb@bongo-ra.co>
<sbskie$3b2$1@news-1.m-online.net>
Injection-Date: Sun, 4 Jul 2021 17:03:14 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="a249e292e4e340bd12b5bef85467351e";
logging-data="25579"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/6opEFngXj0jiAs4GFBxod"
User-Agent: slrn/1.0.3 (Linux)
Cancel-Lock: sha1:ifFg2dozc2o3ONvr1doezJgsknY=
 by: William Unruh - Sun, 4 Jul 2021 17:03 UTC

On 2021-07-04, Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
>
> The files ksh.1.gz and ksh93.1.gz contain only the "real" pipe symbol.
> It seems the *roff processor creates that symbol in specific contexts?

As I mentioned that is not true of the ksh.1.xz man page (Mageia 8).

ksh-2020.0.0.338.git8d91e8a-0.2.mga8

It contains the wrong (UTF) pipe symbol rather than the right ascii one.

Re: Extended pattern syntax extension

<6JyyIyABvK0yAu71zm@bongo-ra.co>

https://www.novabbs.com/devel/article-flat.php?id=4031&group=comp.unix.shell#4031

Path: i2pn2.org!i2pn.org!aioe.org!Lsq9Ulyii8Zln50ye03obQ.user.gioia.aioe.org.POSTED!not-for-mail
From: spi...@gmail.com (Spiros Bousbouras)
Newsgroups: comp.unix.shell
Subject: Re: Extended pattern syntax extension
Date: Sun, 4 Jul 2021 17:17:46 +0000 (UTC)
Organization: Aioe.org NNTP Server
Lines: 52
Message-ID: <6JyyIyABvK0yAu71zm@bongo-ra.co>
References: <s7io9t$mr3$1@news-1.m-online.net> <sbqpud$igo$1@news-1.m-online.net> <MmsCcM5LbSdiv6H71k@bongo-ra.co>
<sbshup$2kp$1@news-1.m-online.net> <gtsCcM5Lb@bongo-ra.co> <sbslp6$t6f$1@dont-email.me>
NNTP-Posting-Host: Lsq9Ulyii8Zln50ye03obQ.user.gioia.aioe.org
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
X-Complaints-To: abuse@aioe.org
X-Server-Commands: nowebcancel
X-Organisation: Weyland-Yutani
X-Notice: Filtered by postfilter v. 0.9.2
 by: Spiros Bousbouras - Sun, 4 Jul 2021 17:17 UTC

On Sun, 4 Jul 2021 15:57:26 -0000 (UTC)
William Unruh <unruh@invalid.ca> wrote:
> On 2021-07-04, Spiros Bousbouras <spibou@gmail.com> wrote:
> > On Sun, 4 Jul 2021 16:52:09 +0200
> > Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
> >> I haven't inspected the character's code number, I just copy/pasted it
> >> from the ksh man page, I don't think I've replaced anything. If I copy
> >> the text passages from the man page into 'od -c' I certainly don't get
> >> the pipe symbol '|'. Does your ksh man page show the correct character?
> >
> > It shows | i.e. ASCII 124. I also experimented with using ⎪ (unicode
> > 9130 decimal) and ksh takes it literally i.e. it matches itself. I think
> > it would be good if you examined the source code of the man page. If it
> > really has ⎪ then it's a bug. If it doesn't then the bug is with one
> > of the tools of the man pipeline which translate the page.
> >
>
> Using hexedit the lines in the unxz man page which are

This confused me. I'm guessing you mean the ksh page with .xz compression.

> ......................................................s:
>
> ; & ( ) ⎪ < > new-line space tab
>
> A
>
> come out as
> 00000744 73 3A 0A 2E 50 50 0A 2E 52 53 0A 5C 66 33 3B 20 20 20 26 20 s:..PP..RS.\f3; &
> 00000758 20 20 28 20 20 20 29 20 20 20 5C 28 62 76 20 20 20 3C 20 20 ( ) \(bv <
> 0000076C 20 3E 20 20 20 6E 65 77 2D 6C 69 6E 65 20 20 20 73 70 61 63 > new-line spac
> 00000780 65 20 20 20 74 61 62 5C 66 50 0A 2E 52 45 0A 2E 50 50 0A 41 e tab\fP..RE..PP.A
>
> Ie, it is the man page that is wrong -- consistently using the Unicode
> representation rather than the ASCII one.
>
> And the symbol they use in the man page does not work as a substitute.
>
> Under ksh
>
> echo a | cat
>
> gives
> a | cat
>
> Not as it should
>
> a
>
> Ie, that symbol does NOT act as a pipe.

But the symbol you posted is the usual pipe symbol namely ASCII 124.

Re: Extended pattern syntax extension

<DB2Kj8NtDjTSf8VP@bongo-ra.co>

https://www.novabbs.com/devel/article-flat.php?id=4032&group=comp.unix.shell#4032

Path: i2pn2.org!i2pn.org!aioe.org!Lsq9Ulyii8Zln50ye03obQ.user.gioia.aioe.org.POSTED!not-for-mail
From: spi...@gmail.com (Spiros Bousbouras)
Newsgroups: comp.unix.shell
Subject: Re: Extended pattern syntax extension
Date: Sun, 4 Jul 2021 17:36:20 +0000 (UTC)
Organization: Aioe.org NNTP Server
Lines: 46
Message-ID: <DB2Kj8NtDjTSf8VP@bongo-ra.co>
References: <s7io9t$mr3$1@news-1.m-online.net> <sbqpud$igo$1@news-1.m-online.net> <MmsCcM5LbSdiv6H71k@bongo-ra.co>
<sbshup$2kp$1@news-1.m-online.net> <gtsCcM5Lb@bongo-ra.co> <sbskie$3b2$1@news-1.m-online.net>
NNTP-Posting-Host: Lsq9Ulyii8Zln50ye03obQ.user.gioia.aioe.org
Mime-Version: 1.0
Content-Type: text/plain; charset=utf8
Content-Transfer-Encoding: 8bit
X-Complaints-To: abuse@aioe.org
X-Notice: Filtered by postfilter v. 0.9.2
X-Server-Commands: nowebcancel
X-Organisation: Weyland-Yutani
 by: Spiros Bousbouras - Sun, 4 Jul 2021 17:36 UTC

On Sun, 4 Jul 2021 17:36:46 +0200
Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
> On 04.07.2021 17:07, Spiros Bousbouras wrote:
> > On Sun, 4 Jul 2021 16:52:09 +0200
> > Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
> >> I haven't inspected the character's code number, I just copy/pasted it
> >> from the ksh man page, I don't think I've replaced anything. If I copy
> >> the text passages from the man page into 'od -c' I certainly don't get
> >> the pipe symbol '|'. Does your ksh man page show the correct character?
> >
> > It shows | i.e. ASCII 124. I also experimented with using ⎪ (unicode
> > 9130 decimal) and ksh takes it literally i.e. it matches itself. I think
> > it would be good if you examined the source code of the man page. If it
> > really has ⎪ then it's a bug. If it doesn't then the bug is with one
> > of the tools of the man pipeline which translate the page.
>
> The man page is full of that "other" pipe symbol; there are only three
> instances of the "real" one.
>
> The files ksh.1.gz and ksh93.1.gz contain only the "real" pipe symbol.
> It seems the *roff processor creates that symbol in specific contexts?

It would seem so. I have encountered something analogous: doing
groff -m mandoc -T html ./some-groff-source

produces
&rsquo; when there is ' in the original
&lsquo; when there is ` in the original
&minus; when there is - in the original

According to
https://www.w3.org/wiki/Common_HTML_entities_used_for_typography

&rsquo; is unicode 8217
&lsquo; is unicode 8216
&minus; is unicode 8722

which of course was not what I wanted. In particular, when the man page has
code examples, doing these replacements breaks the code. So I always use

groff -m mandoc -T html ./some-groff-source | sed -e "s/&rsquo;/\&#39;/g" -e "s/&lsquo;/\&#96;/g" -e "s/&minus;/\&#45;/g" > man-page.html
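
The same post-processing as a reusable filter, as a sketch: fix_entities is a
made-up name, and only the three entities listed above are handled.

```shell
# fix_entities: read groff HTML output on stdin and turn the three
# typographic entities back into numeric references for the plain
# ASCII characters (' ` -) that were in the original source.
fix_entities() {
    sed -e 's/&rsquo;/\&#39;/g' \
        -e 's/&lsquo;/\&#96;/g' \
        -e 's/&minus;/\&#45;/g'
}

# Intended use, with the placeholder file names from the command above:
#   groff -m mandoc -T html ./some-groff-source | fix_entities > man-page.html
```

Single quotes around the sed scripts avoid relying on how the shell treats a
backslash inside double quotes; the \& in each replacement is a literal &.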

--
About a car each minute gets blasted and 1000 liters of gasoline are burnt...
They sure did not have ecology in mind.
www.imdb.com/review/rw2940189/

Re: Extended pattern syntax extension

<op.050tvoaja3w0dxdave@hodgins.homeip.net>

https://www.novabbs.com/devel/article-flat.php?id=4035&group=comp.unix.shell#4035

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: dwhodg...@nomail.afraid.org (David W. Hodgins)
Newsgroups: comp.unix.shell
Subject: Re: Extended pattern syntax extension
Date: Sun, 04 Jul 2021 15:32:02 -0400
Organization: A noiseless patient Spider
Lines: 22
Message-ID: <op.050tvoaja3w0dxdave@hodgins.homeip.net>
References: <s7io9t$mr3$1@news-1.m-online.net>
<sbqpud$igo$1@news-1.m-online.net> <MmsCcM5LbSdiv6H71k@bongo-ra.co>
<sbshup$2kp$1@news-1.m-online.net> <gtsCcM5Lb@bongo-ra.co>
<sbskie$3b2$1@news-1.m-online.net> <sbspki$ovb$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed; delsp=yes
Content-Transfer-Encoding: 8bit
Injection-Info: reader02.eternal-september.org; posting-host="ea22e21f7ae375d19c4eabda0f9d12b9";
logging-data="31589"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/I8F6CwsginiokQsWMVy5oF9hxIh0NFKI="
User-Agent: Opera Mail/12.16 (Linux)
Cancel-Lock: sha1:wBvBlXE2ZJa9w/H5u99DhoadnNE=
 by: David W. Hodgins - Sun, 4 Jul 2021 19:32 UTC

On Sun, 04 Jul 2021 13:03:14 -0400, William Unruh <unruh@invalid.ca> wrote:

> On 2021-07-04, Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
>>
>> The files ksh.1.gz and ksh93.1.gz contain only the "real" pipe symbol.
>> It seems the *roff processor creates that symbol in specific contexts?
>
> As I mentioned that is not true of the ksh.1.xz man page (Mageia 8).
>
> ksh-2020.0.0.338.git8d91e8a-0.2.mga8
>
> It contains the wrong (UTF) pipe symbol rather than the right ascii one.

The error is in the original source used to create the ksh source rpm for Mageia.

Line 109 of https://github.com/att/ast/blob/master/src/cmd/ksh93/sh.1 has the error.

Regards, Dave Hodgins

--
Change dwhodgins@nomail.afraid.org to davidwhodgins@teksavvy.com for
email replies.

Re: Extended pattern syntax extension

<op.050ualmda3w0dxdave@hodgins.homeip.net>

https://www.novabbs.com/devel/article-flat.php?id=4036&group=comp.unix.shell#4036

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: dwhodg...@nomail.afraid.org (David W. Hodgins)
Newsgroups: comp.unix.shell
Subject: Re: Extended pattern syntax extension
Date: Sun, 04 Jul 2021 15:40:59 -0400
Organization: A noiseless patient Spider
Lines: 21
Message-ID: <op.050ualmda3w0dxdave@hodgins.homeip.net>
References: <s7io9t$mr3$1@news-1.m-online.net>
<sbqpud$igo$1@news-1.m-online.net> <MmsCcM5LbSdiv6H71k@bongo-ra.co>
<sbshup$2kp$1@news-1.m-online.net> <gtsCcM5Lb@bongo-ra.co>
<sbskie$3b2$1@news-1.m-online.net> <sbspki$ovb$1@dont-email.me>
<op.050tvoaja3w0dxdave@hodgins.homeip.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed; delsp=yes
Content-Transfer-Encoding: 8bit
Injection-Info: reader02.eternal-september.org; posting-host="ea22e21f7ae375d19c4eabda0f9d12b9";
logging-data="31589"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+OcKMB8/ef/bSndwqa/K5DMc+SPMMzbG4="
User-Agent: Opera Mail/12.16 (Linux)
Cancel-Lock: sha1:n0qW5wydIujUoXma9QhNddjTsrU=
 by: David W. Hodgins - Sun, 4 Jul 2021 19:40 UTC

On Sun, 04 Jul 2021 15:32:02 -0400, David W. Hodgins <dwhodgins@nomail.afraid.org> wrote:
> On Sun, 04 Jul 2021 13:03:14 -0400, William Unruh <unruh@invalid.ca> wrote:
>> On 2021-07-04, Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
>>> The files ksh.1.gz and ksh93.1.gz contain only the "real" pipe symbol.
>>> It seems the *roff processor creates that symbol in specific contexts?

>> As I mentioned that is not true of the ksh.1.xz man page (Mageia 8).
>> ksh-2020.0.0.338.git8d91e8a-0.2.mga8
>> It contains the wrong (UTF) pipe symbol rather than the right ascii one.

> The error is in the original source used to create the ksh source rpm for Mageia.
> Line 109 of https://github.com/att/ast/blob/master/src/cmd/ksh93/sh.1 has the error.

Now reported as https://github.com/att/ast/issues/1491 and
https://bugs.mageia.org/show_bug.cgi?id=29214

Regards, Dave Hodgins

--
Change dwhodgins@nomail.afraid.org to davidwhodgins@teksavvy.com for
email replies.

Re: Extended pattern syntax extension

<sbt30p$v23$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=4037&group=comp.unix.shell#4037

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: unr...@invalid.ca (William Unruh)
Newsgroups: comp.unix.shell
Subject: Re: Extended pattern syntax extension
Date: Sun, 4 Jul 2021 19:43:21 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 62
Message-ID: <sbt30p$v23$1@dont-email.me>
References: <s7io9t$mr3$1@news-1.m-online.net>
<sbqpud$igo$1@news-1.m-online.net> <MmsCcM5LbSdiv6H71k@bongo-ra.co>
<sbshup$2kp$1@news-1.m-online.net> <gtsCcM5Lb@bongo-ra.co>
<sbslp6$t6f$1@dont-email.me> <6JyyIyABvK0yAu71zm@bongo-ra.co>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf8
Content-Transfer-Encoding: 8bit
Injection-Date: Sun, 4 Jul 2021 19:43:21 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="a249e292e4e340bd12b5bef85467351e";
logging-data="31811"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/lqvoLczurSIfa6Dfafkby"
User-Agent: slrn/1.0.3 (Linux)
Cancel-Lock: sha1:LciIc9pKhKc3Xpj0JI7WraKAsOk=
 by: William Unruh - Sun, 4 Jul 2021 19:43 UTC

On 2021-07-04, Spiros Bousbouras <spibou@gmail.com> wrote:
> On Sun, 4 Jul 2021 15:57:26 -0000 (UTC)
> William Unruh <unruh@invalid.ca> wrote:
>> On 2021-07-04, Spiros Bousbouras <spibou@gmail.com> wrote:
>> > On Sun, 4 Jul 2021 16:52:09 +0200
>> > Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
>> >> I haven't inspected the character's code number, I just copy/pasted it
>> >> from the ksh man page, I don't think I've replaced anything. If I copy
>> >> the text passages from the man page into 'od -c' I certainly don't get
>> >> the pipe symbol '|'. Does your ksh man page show the correct character?
>> >
>> > It shows | i.e. ASCII 124. I also experimented with using ⎪ (unicode
>> > 9130 decimal) and ksh takes it literally i.e. it matches itself. I think
>> > it would be good if you examined the source code of the man page. If it
>> > really has ⎪ then it's a bug. If it doesn't then the bug is with one
>> > of the tools of the man pipeline which translate the page.
>> >
>>
>> Using hexedit the lines in the unxz man page which are
>
> This confused me. I'm guessing you mean the ksh page with .xz compression.

Sorry, yes, I meant the ksh.1.xz file, on which I ran the program unxz to
decompress it.
>
>> ......................................................s:
>>
>> ; & ( ) ⎪ < > new-line space tab
>>
>> A
>>
>> come out as
>> 00000744 73 3A 0A 2E 50 50 0A 2E 52 53 0A 5C 66 33 3B 20 20 20 26 20 s:..PP..RS.\f3; &
>> 00000758 20 20 28 20 20 20 29 20 20 20 5C 28 62 76 20 20 20 3C 20 20 ( ) \(bv <
>> 0000076C 20 3E 20 20 20 6E 65 77 2D 6C 69 6E 65 20 20 20 73 70 61 63 > new-line spac
>> 00000780 65 20 20 20 74 61 62 5C 66 50 0A 2E 52 45 0A 2E 50 50 0A 41 e tab\fP..RE..PP.A
>>
>> Ie, it is the man page that is wrong -- consistently using the Unicode
>> representation rather than the ASCII one.
>>
>> And the symbol they use in the man page does not work as a substitute.
>>
>> Under ksh
>>
>> echo a | cat
>>
>> gives
>> a | cat
>>
>> Not as it should
>>
>> a
>>
>> Ie, that symbol does NOT act as a pipe.
>
> But the symbol you posted is the usual pipe symbol namely ASCII 124.

The symbol in the ksh.1 file derived from ksh.1.xz is
5C 28 62 76 (i.e. the four ASCII characters \(bv),
which does not appear to be a Unicode symbol but some sort of command
to the troff interpreter that comes out as the Unicode ⎪ rather than
the ASCII |.
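
If that reading is right, \(bv is a roff special-character escape (as far as
I know groff documents it as the brace extension glyph, which would explain
U+23AA in UTF-8 output). A minimal sketch of a workaround is to rewrite the
escape to a plain | before the page is formatted; fix_bv is a made-up name
and the file names in the comment are placeholders:

```shell
# fix_bv: read roff source on stdin and replace every \(bv escape
# with a literal ASCII vertical bar.
fix_bv() {
    sed 's/\\(bv/|/g'
}

# Possible use (file names are placeholders):
#   xz -dc ksh.1.xz | fix_bv > ksh.fixed.1
```

This only patches a local copy; whether | is the right replacement for every
occurrence in the page is an assumption that would need checking.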

Re: Extended pattern syntax extension

<sbunnv$3vp$1@news.dns-netz.com>

https://www.novabbs.com/devel/article-flat.php?id=4038&group=comp.unix.shell#4038

Path: i2pn2.org!i2pn.org!aioe.org!news.dns-netz.com!news.freedyn.net!.POSTED!not-for-mail
From: joe...@schily.net
Newsgroups: comp.unix.shell
Subject: Re: Extended pattern syntax extension
Date: Mon, 5 Jul 2021 10:43:11 -0000 (UTC)
Message-ID: <sbunnv$3vp$1@news.dns-netz.com>
References: <s7io9t$mr3$1@news-1.m-online.net> <gtsCcM5Lb@bongo-ra.co> <sbskie$3b2$1@news-1.m-online.net> <sbspki$ovb$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
X-Trace: eJxdjcEKAiEURdf1FQ9XBWk6TVAjAwPtgnb1AeKIGc57QypRX59tO+tz7kX3SmLExNHlj7A0aZgp5YCeG2upYO7Zg9zTD8neQ3yL6jG9XETy/ieNJpuetfJwZBomEyKvG3M0AXPimXpWkqvJgP8/DLZwIdzAHs4lQiMbBUp27a5TCriswOp2Pa2/2uEz9w==
Cancel-Lock: sha1:YsDKbT8ND8Vc5Fj5J0UEV+GdzSc=
X-Abuse-Contact: "abuse@dns-netz.com"
X-Newsreader: trn 4.0-test76 (Apr 2, 2001)
 by: joe...@schily.net - Mon, 5 Jul 2021 10:43 UTC

In article <sbspki$ovb$1@dont-email.me>,
William Unruh <unruh@invalid.ca> wrote:
>On 2021-07-04, Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
>>
>> The files ksh.1.gz and ksh93.1.gz contain only the "real" pipe symbol.
>> It seems the *roff processor creates that symbol in specific contexts?
>
>As I mentioned that is not true of the ksh.1.xz man page (Mageia 8).
>
>ksh-2020.0.0.338.git8d91e8a-0.2.mga8

Be careful, the beast ksh2020 is the ksh destroyed by Red Hat...

--
EMail:joerg@schily.net Jörg Schilling D-13353 Berlin
Blog: http://schily.blogspot.com/
URL: http://cdrecord.org/private/ http://sourceforge.net/projects/schilytools/files/
