Question about regex with negated character class

<22-04-015@comp.compilers>

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: coste...@mitre.org (Roger L Costello)
Newsgroups: comp.compilers
Subject: Question about regex with negated character class
Date: Mon, 25 Apr 2022 12:48:43 +0000
Organization: Compilers Central
Lines: 23
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-04-015@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="57597"; mail-complaints-to="abuse@iecc.com"
Keywords: lex, question, comment
Posted-Date: 25 Apr 2022 12:33:08 EDT
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
Accept-Language: en-US
Content-Language: en-US
 by: Roger L Costello - Mon, 25 Apr 2022 12:48 UTC

Hi Folks,

On page 12 of the Flex specification it says this:

"A negated character class such as [^A-Z] will match a newline
unless \n (or an equivalent escape sequence) is one of the characters
explicitly present in the negated character class (e.g., [^A-Z\n]).
This is unlike how many other regular expression tools treat negated
character classes ..."

Is that last sentence true? Does Flex behave differently from other regex
engines with regard to negated character classes?

I just tested the [^A-Z] regex at https://regex101.com/ and, with every regex
engine offered on that page, it matches a newline in the test string. In other
words, Flex behaves just like all the other regex engines. I conclude that the
last sentence in the Flex manual is not correct. Do you agree?

/Roger
[It may have been true 30 years ago but they all match \n in a pattern
now. On the other hand, grep won't match a newline because it does the
matching one line at a time. -John]
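
The test described above can be reproduced locally. The following is a minimal sketch using Python's re module as a stand-in for one of the "other regex engines"; it is not Flex, and the helper name matches_newline is made up for illustration.

```python
import re

# Python's re module stands in for "other regex engines" here; this is not Flex.
NEGATED = re.compile(r"[^A-Z]")
NEGATED_NO_NEWLINE = re.compile(r"[^A-Z\n]")


def matches_newline(pattern):
    """Return True if the compiled pattern matches a lone newline character."""
    return pattern.fullmatch("\n") is not None


print(matches_newline(NEGATED))             # True: newline is "not A-Z"
print(matches_newline(NEGATED_NO_NEWLINE))  # False: newline is excluded explicitly
```

In this engine, at least, the negated class behaves as the Flex manual describes for Flex itself: the newline matches unless it is listed inside the brackets.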

Re: Question about regex with negated character class

<22-04-021@comp.compilers>

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: 480-992-...@kylheku.com (Kaz Kylheku)
Newsgroups: comp.compilers
Subject: Re: Question about regex with negated character class
Date: Mon, 25 Apr 2022 23:46:44 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 27
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-04-021@comp.compilers>
References: <22-04-015@comp.compilers>
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="97910"; mail-complaints-to="abuse@iecc.com"
Keywords: lex
Posted-Date: 25 Apr 2022 22:40:32 EDT
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
 by: Kaz Kylheku - Mon, 25 Apr 2022 23:46 UTC

On 2022-04-25, Roger L Costello <costello@mitre.org> wrote:
> Hi Folks,
>
> On page 12 of the Flex specification it says this:
>
> "A negated character class such as [^A-Z] will match a newline
> unless \n (or an equivalent escape sequence) is one of the characters
> explicitly present in the negated character class (e.g., [^A-Z\n]).
> This is unlike how many other regular expression tools treat negated
> character classes ..."

I suspect this is a documentation mistake (in terms of the remark it
makes about other regex implementations).

There is something special in Flex with regard to newlines: namely the
any-character regular expression . (dot) does not match any character:
it excludes the newline. The documenter might have momentarily gotten
their wires crossed, misremembering what is the special behavior.
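
The dot-versus-negated-class distinction can be illustrated with a short sketch. It uses Python's re module, whose dot also excludes the newline by default; this is an analogy only and says nothing authoritative about Flex internals. The helper name matches is hypothetical.

```python
import re


def matches(pattern, text, flags=0):
    """Return True if the whole of `text` matches `pattern`."""
    return re.fullmatch(pattern, text, flags) is not None


# The dot skips the newline by default (Python's re, used here as an analogy).
print(matches(r".", "\n"))             # False
# Only with DOTALL does the dot accept a newline.
print(matches(r".", "\n", re.DOTALL))  # True
# A negated class has no such exception: newline is simply "not A-Z".
print(matches(r"[^A-Z]", "\n"))        # True
```

So the newline exception belongs to the dot, not to negated classes, which is consistent with the suspicion that the two behaviors were mixed up.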

Or else, I also agree with John that it may in fact be a remark about
regex implementations in line-oriented text processing utilities, which
(in their standard forms, e.g. POSIX) don't have multi-line matching
features in which \n appears as a character.
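
The line-at-a-time effect can be sketched as follows. This is a rough imitation of how a grep-style tool feeds input to the matcher, written in Python under the assumption that each line is stripped of its terminator before matching; grep_like is a hypothetical helper, not any real utility's API.

```python
import re


def grep_like(pattern, text):
    """Yield (line_number, matched_text) for the first match on each line.

    Lines are split first, so the matcher never sees a newline character.
    """
    regex = re.compile(pattern)
    for number, line in enumerate(text.splitlines(), start=1):
        found = regex.search(line)
        if found:
            yield number, found.group()


text = "ABC\nDEf\n"

# Matching against the whole buffer: the newlines themselves are matched.
whole_buffer = re.findall(r"[^A-Z]", text)
print(whole_buffer)  # ['\n', 'f', '\n']

# Matching one line at a time: only the 'f' on line 2 is found.
per_line = list(grep_like(r"[^A-Z]", text))
print(per_line)      # [(2, 'f')]
```

The regex is identical in both cases; the newline goes unmatched in the second only because it is never part of the subject string.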

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
