Re: Safepoints

<c097ddb1-3209-478e-8dae-32f64b87a492n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=19548&group=comp.arch#19548

Newsgroups: comp.arch
 by: robf...@gmail.com - Wed, 4 Aug 2021 12:02 UTC

On Wednesday, August 4, 2021 at 4:29:49 AM UTC-4, David Brown wrote:
> On 31/07/2021 22:58, MitchAlsup wrote:
> > On Saturday, July 31, 2021 at 3:03:28 PM UTC-5, Stephen Fuld wrote:
> >> On 7/31/2021 2:51 AM, David Brown wrote:
> >>
> >> snip
> >>> I've always thought a "disable interrupts for the next N instructions",
> >>> with "N" being a small immediate constant, would be an extremely useful
> >>> instruction on the kind of devices I use (single cpu microcontrollers).
> > <
> > I can see utility here, but you need not only N as the number of instructions,
> > but also P the priority level below which you suppress interrupts and above
> > which you still allow interrupts.
> > <
> No - you block /all/ interrupts, of /all/ priorities. That's key to
> such an instruction.
> > In the realm of real CPUs (not microcontrollers) P is variable and the task/
> > thread might not know what privilege level is appropriate. In a hypervisor/
> > supervisor system, you might not be allowed to even know what level is
> > appropriate.

Hey, I added this to ANY1; it seemed like a good idea and was fairly straightforward to do. When the interrupt disable instruction (DI) reaches the writeback stage, it sets a lockout flag in the following N queue entries, which may or may not be populated with instructions yet. As each of those instructions reaches writeback, its lockout flag is cleared. In this case the maximum lockout is seven instructions, the size of the queue minus one. The DI instruction itself allows an interrupt to occur, so it should not be possible to lock out interrupts indefinitely with a string of DI instructions. It works in all modes of operation. The instruction was needed for several different operations that need to be atomic.
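
A rough software model of that lockout scheme, for illustration only. It is Python rather than HDL, and it is not ANY1's actual logic: the function name `interruptible_slots`, the `"DI n"` text syntax, and the 8-entry queue (inferred from "the size of the queue minus one" being seven) are all assumptions. It also assumes a DI is always an interrupt window, even when an earlier DI flagged it.

```python
def interruptible_slots(program, queue_size=8):
    """Return one bool per instruction: True if an interrupt may be taken there.

    `program` is a list of mnemonics; "DI n" requests a lockout of the next n
    instructions. Everything else is treated as an ordinary instruction.
    """
    max_lockout = queue_size - 1          # "the size of the queue minus one"
    locked = [False] * len(program)

    for i, insn in enumerate(program):
        fields = insn.split()
        if fields[0] == "DI":
            n = min(int(fields[1]), max_lockout)
            # Flag the next n entries, whether or not they are fetched yet.
            for j in range(i + 1, min(i + 1 + n, len(program))):
                locked[j] = True

    # Assumption: a DI is itself always an interrupt window, even when an
    # earlier DI flagged it, so chained DIs cannot starve interrupts.
    return [insn.split()[0] == "DI" or not locked[i]
            for i, insn in enumerate(program)]


print(interruptible_slots(["ADD", "DI 3", "LD", "ADD", "ST", "SUB"]))
# -> [True, True, False, False, False, True]
```

In this model a run of back-to-back DI instructions leaves every DI slot open, which is the property claimed above.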

Re: Safepoints

<seeb8j$f0t$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=19551&group=comp.arch#19551

 by: Stephen Fuld - Wed, 4 Aug 2021 15:20 UTC

On 8/4/2021 1:27 AM, David Brown wrote:
> On 31/07/2021 22:03, Stephen Fuld wrote:
>> On 7/31/2021 2:51 AM, David Brown wrote:
>>
>> snip
>>
>>> I've always thought a "disable interrupts for the next N instructions",
>>> with "N" being a small immediate constant, would be an extremely useful
>>> instruction on the kind of devices I use (single cpu microcontrollers).
>>>   This instruction should work regardless of the current interrupt enable
>>> status, making it significantly more efficient than the usual store old
>>> status, disable interrupts, restore old status dance.  It could also be
>>> allowable from user mode safely, unlike normal interrupt disable, since
>>> it is only temporary.  And then you would have an easy and safe way to
>>> make short multi-instruction atomic sections, in a way that could be
>>> used by any language.
>>>
>>> That would not completely solve your complex invariant problem here, but
>>> it could be used to have atomic logs/trackers of the state of the
>>> object, allowing the program to identify and recover from inconsistent
>>> states.
>>
>> I agree on its usefulness for your kind of application - those where you
>> control all the software.  However on a general purpose system, I can
>> see lots of potential problems, mostly involving denial of service attacks.
>>
>
> That is the point of having it limited - it would block interrupts for
> up to N instructions (where N is a small constant), and then even if
> followed by another "disable interrupts for the next N instructions",
> interrupts could be handled between them.

I understand that. But you can reduce the "responsiveness" to
interrupts, and reduce the number of interrupts per unit time that you
can handle. I am not sure that is a problem, but it nags at the back of
my mind.
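
To put a number on that concern, here is a back-of-the-envelope bound, sketched in Python. The function name and all the figures (7 locked instructions, 12 cycles worst case per instruction, a 100 MHz clock) are hypothetical and chosen only to show the arithmetic: the extra latency from one lockout window is at most N times the slowest instruction that may appear inside it.

```python
def added_latency(n_locked, worst_cycles_per_insn, clock_hz):
    """Upper bound on extra interrupt latency from one lockout window.

    Returns (cycles, microseconds). Assumes every locked instruction takes
    the worst-case cycle count and that nothing else delays the interrupt.
    """
    cycles = n_locked * worst_cycles_per_insn
    return cycles, cycles / clock_hz * 1e6


cycles, micros = added_latency(7, 12, 100_000_000)
print(cycles, "cycles")   # 7 * 12 = 84 cycles, i.e. about 0.84 us at 100 MHz
```

The bound scales with the slowest instruction allowed in the window, so a single long-running instruction dominates it.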

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Re: Safepoints

<1JzOI.8$xM2.7@fx22.iad>

https://www.novabbs.com/devel/article-flat.php?id=19556&group=comp.arch#19556

 by: EricP - Wed, 4 Aug 2021 16:59 UTC

Stephen Fuld wrote:
> On 8/4/2021 1:27 AM, David Brown wrote:
>> On 31/07/2021 22:03, Stephen Fuld wrote:
>>> On 7/31/2021 2:51 AM, David Brown wrote:
>>>
>>> snip
>>>
>>>> I've always thought a "disable interrupts for the next N instructions",
>>>> with "N" being a small immediate constant, would be an extremely useful
>>>> instruction on the kind of devices I use (single cpu microcontrollers).
>>>> This instruction should work regardless of the current interrupt enable
>>>> status, making it significantly more efficient than the usual store old
>>>> status, disable interrupts, restore old status dance. It could also be
>>>> allowable from user mode safely, unlike normal interrupt disable, since
>>>> it is only temporary. And then you would have an easy and safe way to
>>>> make short multi-instruction atomic sections, in a way that could be
>>>> used by any language.
>>>>
>>>> That would not completely solve your complex invariant problem here, but
>>>> it could be used to have atomic logs/trackers of the state of the
>>>> object, allowing the program to identify and recover from inconsistent
>>>> states.
>>>
>>> I agree on its usefulness for your kind of application - those where you
>>> control all the software. However on a general purpose system, I can
>>> see lots of potential problems, mostly involving denial of service attacks.
>>>
>>
>> That is the point of having it limited - it would block interrupts for
>> up to N instructions (where N is a small constant), and then even if
>> followed by another "disable interrupts for the next N instructions",
>> interrupts could be handled between them.
>
> I understand that. But you can reduce the "responsiveness" to
> interrupts, and reduce the number of interrupts per unit time that you
> can handle. I am not sure that is a problem, but it nags at the back of
> my mind.

Operating systems go to great lengths to layer their code
exactly so that interrupts do NOT need to be completely disabled.

Interrupts of the same or lower priority are blocked by HW while
an Interrupt Service Routine runs, but the ISR should execute
only relatively few instructions, maybe dozens,
before it queues further work to a lower priority and executes
a Return From Interrupt to re-enable that interrupt priority.
The deferred work is processed when all nested ISRs return
and interrupts are again fully enabled.

There is of course more to it than that, a lot due to x86/x64 clunkiness.

User mode code should use Exchange or CompareExchange, FetchAdd, etc.
to perform non-interruptible operations. They don't need to be atomic if
the variables are not shared between threads (i.e. are in thread-local storage).
Load-Lock/Store-Conditional clears the lock if an interrupt occurs,
so it can be used to perform non-interruptible sequences without
memory barriers if the variables are thread local.
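
A toy model of the Load-Lock/Store-Conditional point, sketched in Python. This simulates the behaviour described, not any real instruction set: the `Core` class, its one-entry reservation, and the `isr_bump` handler are invented for illustration. Taking an interrupt clears the reservation, so the store-conditional fails and the sequence retries with fresh data.

```python
class Core:
    """Toy single-core model with a one-entry LL/SC reservation."""

    def __init__(self):
        self.mem = {}
        self.reservation = None

    def load_locked(self, addr):
        self.reservation = addr
        return self.mem.get(addr, 0)

    def store_conditional(self, addr, value):
        ok = self.reservation == addr
        self.reservation = None
        if ok:
            self.mem[addr] = value
        return ok

    def take_interrupt(self, handler):
        self.reservation = None      # taking an interrupt clears the lock
        handler(self)


def isr_bump(core, addr=0x10):
    """Hypothetical handler that updates the same variable with a plain store."""
    core.mem[addr] = core.mem.get(addr, 0) + 100


def add_with_llsc(core, addr, delta, interrupt_on_attempts=(), handler=None):
    """Add delta to mem[addr]; return how many attempts were needed."""
    attempts = 0
    while True:
        attempts += 1
        old = core.load_locked(addr)
        if attempts in interrupt_on_attempts:
            core.take_interrupt(handler)     # interrupt lands between LL and SC
        if core.store_conditional(addr, old + delta):
            return attempts


core = Core()
core.mem[0x10] = 5
tries = add_with_llsc(core, 0x10, 1, interrupt_on_attempts={1}, handler=isr_bump)
print(tries, core.mem[0x10])   # -> 2 106
```

Neither update is lost: the interrupted first attempt is discarded and the retry starts from the value the handler wrote.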

Re: everything old is new again, Safepoints

<seeie0$22n5$1@gal.iecc.com>

https://www.novabbs.com/devel/article-flat.php?id=19558&group=comp.arch#19558

 by: John Levine - Wed, 4 Aug 2021 17:23 UTC

According to David Brown <david.brown@hesbynett.no>:
>I've always thought a "disable interrupts for the next N instructions",
>with "N" being a small immediate constant, would be an extremely useful
>instruction on the kind of devices I use (single cpu microcontrollers).

Back in the 1960s the GE 635 had an interrupt blocking bit in each instruction
that you could use to do that. A timer forced an interrupt if too
many consecutive instructions set the bit. It worked in both system and user modes
(unfortunately known at the time as master and slave).

It was fine for what it was used for, atomic data structure updates on
a single threaded in-order machine. The 635 was sort of a better 7094
so I wouldn't be surprised if other scientific machines of the era had
something similar. Compare the IBM 360's test-and-set, followed by
compare-and-swap which avoided spin loops and worked better with
multiple CPUs.
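
A small Python model of the per-instruction inhibit bit, to make the mechanism concrete. It is a sketch under stated assumptions, not a description of the GE 635 hardware: the real machine used a timer, whereas this model counts consecutive inhibited instructions, and the `limit` value is invented.

```python
def interrupt_delivery_index(inhibit_bits, pending_from, limit):
    """Index of the instruction at which a pending interrupt is taken.

    inhibit_bits[i] is truthy if instruction i has its inhibit bit set.
    The interrupt becomes pending at instruction `pending_from`. It is taken
    at the first later instruction that is not inhibited, or is forced once
    more than `limit` consecutive instructions have been inhibited.
    Returns None if the interrupt never becomes pending in this program.
    """
    run = 0
    for i, inhibited in enumerate(inhibit_bits):
        run = run + 1 if inhibited else 0
        if i >= pending_from and (not inhibited or run > limit):
            return i
    return None


# A short atomic update: the interrupt waits for the three inhibited instructions.
print(interrupt_delivery_index([0, 1, 1, 1, 0, 0], pending_from=1, limit=8))  # -> 4
# A runaway sequence: the watchdog forces delivery.
print(interrupt_delivery_index([1] * 20, pending_from=0, limit=8))            # -> 8
```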
--
Regards,
John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly

Re: Safepoints

<seejpg$ce6$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=19559&group=comp.arch#19559

 by: Stephen Fuld - Wed, 4 Aug 2021 17:46 UTC

On 8/4/2021 9:59 AM, EricP wrote:
> Stephen Fuld wrote:
>> On 8/4/2021 1:27 AM, David Brown wrote:
>>> On 31/07/2021 22:03, Stephen Fuld wrote:
>>>> On 7/31/2021 2:51 AM, David Brown wrote:
>>>>
>>>> snip
>>>>
>>>>> I've always thought a "disable interrupts for the next N instructions",
>>>>> with "N" being a small immediate constant, would be an extremely useful
>>>>> instruction on the kind of devices I use (single cpu microcontrollers).
>>>>>    This instruction should work regardless of the current interrupt enable
>>>>> status, making it significantly more efficient than the usual store old
>>>>> status, disable interrupts, restore old status dance.  It could also be
>>>>> allowable from user mode safely, unlike normal interrupt disable, since
>>>>> it is only temporary.  And then you would have an easy and safe way to
>>>>> make short multi-instruction atomic sections, in a way that could be
>>>>> used by any language.
>>>>>
>>>>> That would not completely solve your complex invariant problem here, but
>>>>> it could be used to have atomic logs/trackers of the state of the
>>>>> object, allowing the program to identify and recover from inconsistent
>>>>> states.
>>>>
>>>> I agree on its usefulness for your kind of application - those where you
>>>> control all the software.  However on a general purpose system, I can
>>>> see lots of potential problems, mostly involving denial of service attacks.
>>>>
>>>
>>> That is the point of having it limited - it would block interrupts for
>>> up to N instructions (where N is a small constant), and then even if
>>> followed by another "disable interrupts for the next N instructions",
>>> interrupts could be handled between them.
>>
>> I understand that.  But you can reduce the "responsiveness" to
>> interrupts, and reduce the number of interrupts per unit time that you
>> can handle.  I am not sure that is a problem, but it nags at the back
>> of my mind.
>
> Operating systems go to great lengths to layer their code
> exactly so that interrupts do NOT need to be completely disabled.
>
> Interrupts of the same or lower priority are blocked by HW while
> an Interrupt Service Routine runs, but the ISR actions should be
> only a relatively few instructions, maybe dozens,
> before it queues further work to a lower priority and executes
> a Return From Interrupt to re enable that interrupt priority.
> The deferred work is processed when all nested ISR's return
> and interrupts are again fully enabled.
>
> There is of course more to it than that, a lot due to x86/x64 clunkiness.
>
> User mode code should use Exchange or CompareExchange, FetchAdd, etc.
> to perform non-interruptible operations. They don't need to be atomic if
> the variables are not shared between threads (are thread local store).
> Load-Lock/Store-Conditional clear the lock if an interrupt occurs
> so can be used to perform non-interruptible sequences without
> memory barriers if the variables are thread local.

You have made a bunch of assumptions that may not be true. If you look
back at the kind of environment David was talking about (single CPU
microcontrollers), he may not have a full OS, and probably doesn't have
instructions like fetch-and-add or compare-and-swap.

If David would give a few more specifics about his world, it might show
you a whole different world of programming.

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Re: Safepoints

<DsAOI.3063$Fx8.71@fx45.iad>

https://www.novabbs.com/devel/article-flat.php?id=19560&group=comp.arch#19560

 by: Branimir Maksimovic - Wed, 4 Aug 2021 17:51 UTC

On 2021-08-04, David Brown <david.brown@hesbynett.no> wrote:
> On 31/07/2021 22:58, MitchAlsup wrote:
>> On Saturday, July 31, 2021 at 3:03:28 PM UTC-5, Stephen Fuld wrote:
>>> On 7/31/2021 2:51 AM, David Brown wrote:
>>>
>>> snip
>>>> I've always thought a "disable interrupts for the next N instructions",
>>>> with "N" being a small immediate constant, would be an extremely useful
>>>> instruction on the kind of devices I use (single cpu microcontrollers).
>> <
>> I can see utility here, but you need not only N as the number of instructions,
>> but also P the priority level below which you suppress interrupts and above
>> which you still allow interrupts.
>> <
>
> No - you block /all/ interrupts, of /all/ priorities. That's key to
> such an instruction.

Except non-maskable interrupts.

>

--
bmaxa now listens rock.mp3

Re: Safepoints

<seeoqr$i6d$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=19561&group=comp.arch#19561

 by: David Brown - Wed, 4 Aug 2021 19:12 UTC

On 04/08/2021 19:51, Branimir Maksimovic wrote:
> On 2021-08-04, David Brown <david.brown@hesbynett.no> wrote:
>> On 31/07/2021 22:58, MitchAlsup wrote:
>>> On Saturday, July 31, 2021 at 3:03:28 PM UTC-5, Stephen Fuld wrote:
>>>> On 7/31/2021 2:51 AM, David Brown wrote:
>>>>
>>>> snip
>>>>> I've always thought a "disable interrupts for the next N instructions",
>>>>> with "N" being a small immediate constant, would be an extremely useful
>>>>> instruction on the kind of devices I use (single cpu microcontrollers).
>>> <
>>> I can see utility here, but you need not only N as the number of instructions,
>>> but also P the priority level below which you suppress interrupts and above
>>> which you still allow interrupts.
>>> <
>>
>> No - you block /all/ interrupts, of /all/ priorities. That's key to
>> such an instruction.
>
> Except non maskable interrupts.
>

Yes, but those only occur in a disaster (such as a critical hardware fault).

Re: Safepoints

<862c41ba-d720-4095-84df-9c95965c4186n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=19564&group=comp.arch#19564

 by: MitchAlsup - Wed, 4 Aug 2021 19:40 UTC

On Wednesday, August 4, 2021 at 3:27:59 AM UTC-5, David Brown wrote:
> On 31/07/2021 22:03, Stephen Fuld wrote:
> > On 7/31/2021 2:51 AM, David Brown wrote:
> >
> > snip
> >
> >> I've always thought a "disable interrupts for the next N instructions",
> >> with "N" being a small immediate constant, would be an extremely useful
> >> instruction on the kind of devices I use (single cpu microcontrollers).
> >> This instruction should work regardless of the current interrupt enable
> >> status, making it significantly more efficient than the usual store old
> >> status, disable interrupts, restore old status dance. It could also be
> >> allowable from user mode safely, unlike normal interrupt disable, since
> >> it is only temporary. And then you would have an easy and safe way to
> >> make short multi-instruction atomic sections, in a way that could be
> >> used by any language.
> >>
> >> That would not completely solve your complex invariant problem here, but
> >> it could be used to have atomic logs/trackers of the state of the
> >> object, allowing the program to identify and recover from inconsistent
> >> states.
> >
> > I agree on its usefulness for your kind of application - those where you
> > control all the software. However on a general purpose system, I can
> > see lots of potential problems, mostly involving denial of service attacks.
> >
> That is the point of having it limited - it would block interrupts for
> up to N instructions (where N is a small constant), and then even if
> followed by another "disable interrupts for the next N instructions",
> interrupts could be handled between them.
<
Are any of these instructions where the interrupts are blocked capable
of taking hundreds of clock cycles to be performed? gamma()
Or of taking several dozen clocks to be performed? atan2()
Are they capable of throwing exceptions? PAGEFAULT
And what happens after control returns from PAGEFAULT?
Are the interrupts still blocked for n-k instructions?
And to what end?

Re: Safepoints

<seeqsh$1ak$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=19565&group=comp.arch#19565

 by: David Brown - Wed, 4 Aug 2021 19:47 UTC

On 04/08/2021 19:46, Stephen Fuld wrote:
> On 8/4/2021 9:59 AM, EricP wrote:
>> Stephen Fuld wrote:
>>> On 8/4/2021 1:27 AM, David Brown wrote:
>>>> On 31/07/2021 22:03, Stephen Fuld wrote:
>>>>> On 7/31/2021 2:51 AM, David Brown wrote:
>>>>>
>>>>> snip
>>>>>
>>>>>> I've always thought a "disable interrupts for the next N instructions",
>>>>>> with "N" being a small immediate constant, would be an extremely useful
>>>>>> instruction on the kind of devices I use (single cpu microcontrollers).
>>>>>>    This instruction should work regardless of the current interrupt enable
>>>>>> status, making it significantly more efficient than the usual store old
>>>>>> status, disable interrupts, restore old status dance.  It could also be
>>>>>> allowable from user mode safely, unlike normal interrupt disable, since
>>>>>> it is only temporary.  And then you would have an easy and safe way to
>>>>>> make short multi-instruction atomic sections, in a way that could be
>>>>>> used by any language.
>>>>>>
>>>>>> That would not completely solve your complex invariant problem here, but
>>>>>> it could be used to have atomic logs/trackers of the state of the
>>>>>> object, allowing the program to identify and recover from inconsistent
>>>>>> states.
>>>>>
>>>>> I agree on its usefulness for your kind of application - those where you
>>>>> control all the software.  However on a general purpose system, I can
>>>>> see lots of potential problems, mostly involving denial of service attacks.
>>>>>
>>>>
>>>> That is the point of having it limited - it would block interrupts for
>>>> up to N instructions (where N is a small constant), and then even if
>>>> followed by another "disable interrupts for the next N instructions",
>>>> interrupts could be handled between them.
>>>
>>> I understand that.  But you can reduce the "responsiveness" to
>>> interrupts, and reduce the number of interrupts per unit time that
>>> you can handle.  I am not sure that is a problem, but it nags at the
>>> back of my mind.
>>
>> Operating systems go to great lengths to layer their code
>> exactly so that interrupts do NOT need to be completely disabled.
>>
>> Interrupts of the same or lower priority are blocked by HW while
>> an Interrupt Service Routine runs, but the ISR actions should be
>> only a relatively few instructions, maybe dozens,
>> before it queues further work to a lower priority and executes
>> a Return From Interrupt to re enable that interrupt priority.
>> The deferred work is processed when all nested ISR's return
>> and interrupts are again fully enabled.
>>
>> There is of course more to it than that, a lot due to x86/x64 clunkiness.
>>
>> User mode code should use Exchange or CompareExchange, FetchAdd, etc.
>> to perform non-interruptible operations. They don't need to be atomic if
>> the variables are not shared between threads (are thread local store).
>> Load-Lock/Store-Conditional clear the lock if an interrupt occurs
>> so can be used to perform non-interruptible sequences without
>> memory barriers if the variables are thread local.
>
> You have made a bunch of assumptions that may not be true.  If you look
> back to see the kind of environment David was talking about, (single CPU
> microcontrollers), he may not have a full OS, probably doesn't have
> instructions like fetch and add or compare and swap, etc.
>
> If David would give a few more specifics about his world, it might show
> you a whole different world of programming.
>

A typical device for me these days would be based on an ARM Cortex M
single-core microcontroller. The OS's I use are real-time OS's (such as
FreeRTOS) that are linked directly to a single executable along with the
program - the systems typically have only a single binary. (There might
be an independent bootloader, but that stops once the main program starts.)

Processors in this class generally do not have atomic swap, CAS,
FetchAdd, etc., instructions. Most microcontrollers have RISC cpus
(except some of the small, old-fashioned 8-bit devices), and typically
the only tool you have other than disabling interrupts is a
load-link/store-conditional pair (ldrex/strex on ARM).

Use of ldrex/strex can avoid disabling interrupts. But they are big and
slow. Disabling interrupts around a critical section (such as an atomic
increment, or accessing a 64-bit variable) is simpler, more efficient
and more predictable.

And of critical importance in this world, it works in conjunction with
interrupts.

Any locking mechanism that does not disable interrupts works by using a
loop to get a lock (a memory location), doing the critical work, then
releasing the lock. This will not work if thread-level code and an
interrupt both want access to the same object, as the thread-level code
can never pre-empt the interrupt code. Thus if the thread has taken
the lock and the interrupt hits, the interrupt routine cannot get the
lock, and the thread cannot run to release it.

If you have a larger section of work that needs to be in the critical
section, you use an RTOS mutex and keep everything in threads -
interrupt routines should not be doing "large sections of work" anyway,
and the RTOS will handle the priority inheritance needed to keep
priority inversion from stalling the system.
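
As a rough illustration of the "disable interrupts around a critical section" pattern described above (not code from the post), here is a minimal host-runnable sketch in C. The names `sim_primask`, `irq_save_disable`, `irq_restore`, `tick_count`, `sim_timer_isr` and `read_ticks` are all hypothetical, and the interrupt mask is simulated with a plain variable. On a real Cortex-M part the save/disable/restore steps would more likely map to the CMSIS intrinsics `__get_PRIMASK()`, `__disable_irq()` and `__set_PRIMASK()`.

```c
#include <assert.h>
#include <stdint.h>

/* Host-side stand-in for the CPU interrupt mask (PRIMASK on Cortex-M).
   1 = interrupts masked, 0 = interrupts enabled. Simulation only. */
static uint32_t sim_primask = 0;

/* Save the current mask state, then mask interrupts. */
static uint32_t irq_save_disable(void)
{
    uint32_t old = sim_primask;
    sim_primask = 1;
    return old;
}

/* Restore whatever state was in force before the matching save,
   so the pattern nests correctly. */
static void irq_restore(uint32_t old)
{
    assert(old == 0 || old == 1);
    sim_primask = old;
}

/* 64-bit tick count shared between thread code and a simulated timer ISR. */
static volatile uint64_t tick_count = 0;

/* Simulated ISR: it can only run while interrupts are not masked.
   Returns 1 if it ran, 0 if it was held off. */
static int sim_timer_isr(void)
{
    if (sim_primask)
        return 0;
    tick_count++;
    return 1;
}

/* Thread-level read. On a 32-bit core a 64-bit load is two instructions,
   so the ISR must be held off while the snapshot is taken. */
static uint64_t read_ticks(void)
{
    uint32_t old = irq_save_disable();
    uint64_t snapshot = tick_count;
    irq_restore(old);
    return snapshot;
}
```

Because `irq_restore` puts back the saved state rather than unconditionally enabling interrupts, `read_ticks` is safe to call from code that already has interrupts masked.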

Re: Safepoints

<lhCOI.3073$Fx8.581@fx45.iad>

https://www.novabbs.com/devel/article-flat.php?id=19566&group=comp.arch#19566
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!4.us.feeder.erje.net!feeder.erje.net!feeder1.feed.usenet.farm!feed.usenet.farm!peer01.ams4!peer.am4.highwinds-media.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx45.iad.POSTED!not-for-mail
From: ThatWoul...@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: Safepoints
References: <se1s18$sf$1@z-news.wcss.wroc.pl> <se36el$en5$1@dont-email.me> <se4aae$qma$1@dont-email.me> <sedj2d$b0l$1@dont-email.me> <seeb8j$f0t$1@dont-email.me> <1JzOI.8$xM2.7@fx22.iad> <seejpg$ce6$1@dont-email.me>
In-Reply-To: <seejpg$ce6$1@dont-email.me>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 96
Message-ID: <lhCOI.3073$Fx8.581@fx45.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Wed, 04 Aug 2021 19:56:01 UTC
Date: Wed, 04 Aug 2021 15:55:13 -0400
X-Received-Bytes: 5267
 by: EricP - Wed, 4 Aug 2021 19:55 UTC

Stephen Fuld wrote:
> On 8/4/2021 9:59 AM, EricP wrote:
>> Stephen Fuld wrote:
>>> On 8/4/2021 1:27 AM, David Brown wrote:
>>>> On 31/07/2021 22:03, Stephen Fuld wrote:
>>>>> On 7/31/2021 2:51 AM, David Brown wrote:
>>>>>
>>>>> snip
>>>>>
>>>>>> I've always thought a "disable interrupts for the next N
>>>>>> instructions",
>>>>>> with "N" being a small immediate constant, would be an extremely
>>>>>> useful
>>>>>> instruction on the kind of devices I use (single cpu
>>>>>> microcontrollers).
>>>>>> This instruction should work regardless of the current
>>>>>> interrupt enable
>>>>>> status, making it significantly more efficient than the usual
>>>>>> store old
>>>>>> status, disable interrupts, restore old status dance. It could
>>>>>> also be
>>>>>> allowable from user mode safely, unlike normal interrupt disable,
>>>>>> since
>>>>>> it is only temporary. And then you would have an easy and safe
>>>>>> way to
>>>>>> make short multi-instruction atomic sections, in a way that could be
>>>>>> used by any language.
>>>>>>
>>>>>> That would not completely solve your complex invariant problem
>>>>>> here, but
>>>>>> it could be used to have atomic logs/trackers of the state of the
>>>>>> object, allowing the program to identify and recover from
>>>>>> inconsistent
>>>>>> states.
>>>>>
>>>>> I agree on its usefulness for your kind of application - those
>>>>> where you
>>>>> control all the software. However on a general purpose system, I can
>>>>> see lots of potential problems, mostly involving denial of service
>>>>> attacks.
>>>>>
>>>>
>>>> That is the point of having it limited - it would block interrupts for
>>>> up to N instructions (where N is a small constant), and then even if
>>>> followed by another "disable interrupts for the next N instructions",
>>>> interrupts could be handled between them.
>>>
>>> I understand that. But you can reduce the "responsivness" to
>>> interrupts, and reduce the number of interrupts per unit time that
>>> you can handle. I am not sure that is a problem, but it nags at the
>>> back of my mind.
>>
>> Operating systems go to great lengths to layer their code
>> exactly so that interrupts do NOT need to be completely disabled.
>>
>> Interrupts of the same or lower priority are blocked by HW while
>> an Interrupt Service Routine runs, but the ISR actions should be
>> only a relatively few instructions, maybe dozens,
>> before it queues further work to a lower priority and executes
>> a Return From Interrupt to re enable that interrupt priority.
>> The deferred work is processed when all nested ISR's return
>> and interrupts are again fully enabled.
>>
>> There is of course more to it than that, a lot due to x86/x64 clunkiness.
>>
>> User mode code should use Exchange or CompareExchange, FetchAdd, etc.
>> to perform non-interruptible operations. They don't need to be atomic if
>> the variables are not shared between threads (are thread local store).
>> Load-Lock/Store-Conditional clear the lock if an interrupt occurs
>> so can be used to perform non-interruptible sequences without
>> memory barriers if the variables are thread local.
>
> You have made a bunch of assumptions that may not be true. If you look
> back to see the kind of environment David was talking about, (single CPU
> microcontrollers), he may not have a full OS, probably doesn't have
> instructions like fetch and add or compare and swap, etc.
>
> If David would give a few more specifics about his world, it might show
> you a whole different world of programming.

Not really. Any microcontroller with interrupts has a disable capability.
That means a non-interruptible sequence is always possible
and the discussion is about the efficiency on a particular uC.

One designs the software using HalNonInterruptibleExchange and
HalNonInterruptibleCompareExchange subroutines or macros.
If those map to single instructions on a uC, great.
If not, then they do the equivalent PUSHF, INTD, LD, ST, POPF sequence.
If an INTD_N instruction is available then it saves a PUSHF and POPF.
But LL/SC saves any disable and is more generally useful even on a uC.

Many uC I have seen had an XCHG [mem],reg instruction but not
compare-exchange as most are accumulator architectures and many
instructions are a non-interruptible R-M-W sequence anyway.
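
A minimal sketch of the fallback described above, assuming a target with no single-instruction exchange, might look like the following C. Only the names `HalNonInterruptibleExchange` and `HalNonInterruptibleCompareExchange` come from the post. The signatures, the `sim_irq_masked` variable and the `sim_pushf_intd` / `sim_popf` helpers are assumptions that model the PUSHF, INTD and POPF steps on a host machine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the CPU interrupt-disable flag (1 = masked). */
static uint32_t sim_irq_masked = 0;

/* Models "PUSHF, INTD": remember the old flag, then mask interrupts. */
static uint32_t sim_pushf_intd(void)
{
    uint32_t old = sim_irq_masked;
    sim_irq_masked = 1;
    return old;
}

/* Models "POPF": restore the flag saved by sim_pushf_intd(). */
static void sim_popf(uint32_t old)
{
    assert(old == 0 || old == 1);
    sim_irq_masked = old;
}

/* Store new_value and return the previous contents, with interrupts held off. */
static uint32_t HalNonInterruptibleExchange(volatile uint32_t *target,
                                            uint32_t new_value)
{
    uint32_t flags = sim_pushf_intd();  /* PUSHF, INTD */
    uint32_t old = *target;             /* LD */
    *target = new_value;                /* ST */
    sim_popf(flags);                    /* POPF */
    return old;
}

/* Store new_value only if *target still equals expected.
   Returns true when the store happened. */
static bool HalNonInterruptibleCompareExchange(volatile uint32_t *target,
                                               uint32_t expected,
                                               uint32_t new_value)
{
    uint32_t flags = sim_pushf_intd();
    bool swapped = (*target == expected);
    if (swapped)
        *target = new_value;
    sim_popf(flags);
    return swapped;
}
```

On a part that does have a single-instruction exchange, the same function names could wrap that instruction instead, which is the portability argument being made here.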

Re: Safepoints

<a0DOI.93$fI7.10@fx33.iad>

https://www.novabbs.com/devel/article-flat.php?id=19568&group=comp.arch#19568
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!feeder1.feed.usenet.farm!feed.usenet.farm!peer01.ams4!peer.am4.highwinds-media.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx33.iad.POSTED!not-for-mail
From: ThatWoul...@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: Safepoints
References: <se1s18$sf$1@z-news.wcss.wroc.pl> <se36el$en5$1@dont-email.me> <se4aae$qma$1@dont-email.me> <sedj2d$b0l$1@dont-email.me> <seeb8j$f0t$1@dont-email.me> <1JzOI.8$xM2.7@fx22.iad> <seejpg$ce6$1@dont-email.me> <seeqsh$1ak$1@dont-email.me>
In-Reply-To: <seeqsh$1ak$1@dont-email.me>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 128
Message-ID: <a0DOI.93$fI7.10@fx33.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Wed, 04 Aug 2021 20:45:58 UTC
Date: Wed, 04 Aug 2021 16:45:16 -0400
X-Received-Bytes: 7265
 by: EricP - Wed, 4 Aug 2021 20:45 UTC

David Brown wrote:
> On 04/08/2021 19:46, Stephen Fuld wrote:
>> On 8/4/2021 9:59 AM, EricP wrote:
>>> Stephen Fuld wrote:
>>>> On 8/4/2021 1:27 AM, David Brown wrote:
>>>>> On 31/07/2021 22:03, Stephen Fuld wrote:
>>>>>> On 7/31/2021 2:51 AM, David Brown wrote:
>>>>>>
>>>>>> snip
>>>>>>
>>>>>>> I've always thought a "disable interrupts for the next N
>>>>>>> instructions",
>>>>>>> with "N" being a small immediate constant, would be an extremely
>>>>>>> useful
>>>>>>> instruction on the kind of devices I use (single cpu
>>>>>>> microcontrollers).
>>>>>>> This instruction should work regardless of the current
>>>>>>> interrupt enable
>>>>>>> status, making it significantly more efficient than the usual
>>>>>>> store old
>>>>>>> status, disable interrupts, restore old status dance. It could
>>>>>>> also be
>>>>>>> allowable from user mode safely, unlike normal interrupt disable,
>>>>>>> since
>>>>>>> it is only temporary. And then you would have an easy and safe
>>>>>>> way to
>>>>>>> make short multi-instruction atomic sections, in a way that could be
>>>>>>> used by any language.
>>>>>>>
>>>>>>> That would not completely solve your complex invariant problem
>>>>>>> here, but
>>>>>>> it could be used to have atomic logs/trackers of the state of the
>>>>>>> object, allowing the program to identify and recover from
>>>>>>> inconsistent
>>>>>>> states.
>>>>>> I agree on its usefulness for your kind of application - those
>>>>>> where you
>>>>>> control all the software. However on a general purpose system, I can
>>>>>> see lots of potential problems, mostly involving denial of service
>>>>>> attacks.
>>>>>>
>>>>> That is the point of having it limited - it would block interrupts for
>>>>> up to N instructions (where N is a small constant), and then even if
>>>>> followed by another "disable interrupts for the next N instructions",
>>>>> interrupts could be handled between them.
>>>> I understand that. But you can reduce the "responsivness" to
>>>> interrupts, and reduce the number of interrupts per unit time that
>>>> you can handle. I am not sure that is a problem, but it nags at the
>>>> back of my mind.
>>> Operating systems go to great lengths to layer their code
>>> exactly so that interrupts do NOT need to be completely disabled.
>>>
>>> Interrupts of the same or lower priority are blocked by HW while
>>> an Interrupt Service Routine runs, but the ISR actions should be
>>> only a relatively few instructions, maybe dozens,
>>> before it queues further work to a lower priority and executes
>>> a Return From Interrupt to re enable that interrupt priority.
>>> The deferred work is processed when all nested ISR's return
>>> and interrupts are again fully enabled.
>>>
>>> There is of course more to it than that, a lot due to x86/x64 clunkiness.
>>>
>>> User mode code should use Exchange or CompareExchange, FetchAdd, etc.
>>> to perform non-interruptible operations. They don't need to be atomic if
>>> the variables are not shared between threads (are thread local store).
>>> Load-Lock/Store-Conditional clear the lock if an interrupt occurs
>>> so can be used to perform non-interruptible sequences without
>>> memory barriers if the variables are thread local.
>> You have made a bunch of assumptions that may not be true. If you look
>> back to see the kind of environment David was talking about, (single CPU
>> microcontrollers), he may not have a full OS, probably doesn't have
>> instructions like fetch and add or compare and swap, etc.
>>
>> If David would give a few more specifics about his world, it might show
>> you a whole different world of programming.
>>
>
> A typical device for me these days would be based on an ARM Cortex M
> single-core microcontroller. The OS's I use are real-time OS's (such as
> FreeRTOS) that are linked directly to a single executable along with the
> program - the systems typically have only a single binary. (There might
> be an independent bootloader, but that stops once the main program starts.)
>
> Processors in this class generally do not have atomic swap, CAS,
> FetchAdd, etc., instructions. Most microcontrollers have RISC cpus
> (except some of the small, old-fashioned 8-bit devices), and typically
> the only tool you have other than disabling interrupts is a
> load-link/store-conditional pair (ldrex/strex on ARM).

ARMv5 had SWP Swap Word and SWPB Swap Byte exchange instructions.
I suppose before v5 supervisor mode had to disable interrupts.
User mode would have to make a system call to do an exchange.

> Use of ldrex/strex can avoid disabling interrupts. But they are big and
> slow. Disabling interrupts around a critical section (such as an atomic
> increment, or accessing a 64-bit variable) is simpler, more efficient
> and more predictable.

ldrex and strex came later with ARMv6.
I am surprised you say that on ARM they are big and slow.
The whole point of a LL/SC mechanism is that it is extremely lean.

LL loads a word as usual, sets a FF (flip-flop) in the cache controller to
indicate the line is "linked", and sets a clock counter to some value.
If any other processor reads the linked cache line, the FF is reset.
If an interrupt occurs the FF is reset.
SC stores the word as usual but checks if the FF is still set
and the clock counter is > 0. If both are true the store occurs.

The memory coherence detector should be present even on a uniprocessor as
a cache line may be invalidated by DMA. But that is just an address XOR.
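
The mechanism described above can be sketched as a toy software model in C. This is an illustration only: `link_monitor`, `load_linked`, `store_conditional`, `on_interrupt` and `atomic_increment` are hypothetical names, the model covers a single core, and it leaves out the clock counter and the coherence check against other bus masters. The `inject_interrupts` parameter is a test hook that makes an "interrupt" land between the LL and the SC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of one core's link monitor. */
typedef struct {
    bool linked;                    /* the "FF": set by LL, cleared by events */
    const volatile uint32_t *addr;  /* address the link refers to */
} link_monitor;

static link_monitor mon;

/* LL: an ordinary load that also sets the link flag. */
static uint32_t load_linked(const volatile uint32_t *p)
{
    mon.linked = true;
    mon.addr = p;
    return *p;
}

/* Any interrupt clears the link, so an interrupted sequence must retry. */
static void on_interrupt(void)
{
    mon.linked = false;
}

/* SC: the store happens only if the link survived since the LL. */
static bool store_conditional(volatile uint32_t *p, uint32_t value)
{
    bool ok = mon.linked && mon.addr == p;
    mon.linked = false;             /* the link is consumed either way */
    if (ok)
        *p = value;
    return ok;
}

/* Increment *p with an LL/SC retry loop and return the number of attempts. */
static unsigned atomic_increment(volatile uint32_t *p,
                                 unsigned inject_interrupts)
{
    unsigned attempts = 0;
    for (;;) {
        uint32_t v = load_linked(p);
        attempts++;
        if (inject_interrupts > 0) {
            inject_interrupts--;
            on_interrupt();         /* simulated interrupt between LL and SC */
        }
        if (store_conditional(p, v + 1))
            return attempts;
    }
}
```

With no interrupts injected the loop body runs once. Each injected interrupt costs one extra pass, which is the worst-case timing question raised elsewhere in the thread.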

> And of critical importance in this world, it works in conjunction with
> interrupts.
>
> Any locking mechanism that does not disable interrupts works by using a
> loop to get a lock (a memory location), doing the critical work, then
> releasing the lock. This will not work if thread-level code and an
> interrupt both want access to the same object, as the thread-level code
> can never pre-empty the interrupt code. Thus if the thread has taken
> the lock, and the interrupt hits, it cannot get the lock.
>
> If you have a larger section of work that needs to be in the critical
> section, you use an RTOS mutex and keep everything in threads -
> interrupt routines should not be doing "large sections of work" anyway,
> and the RTOS will handle the priority inversions needed to avoid deadlock.

Re: Safepoints

<sef0rg$9ps$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=19569&group=comp.arch#19569
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.arch
Subject: Re: Safepoints
Date: Wed, 4 Aug 2021 23:29:20 +0200
Organization: A noiseless patient Spider
Lines: 162
Message-ID: <sef0rg$9ps$1@dont-email.me>
References: <se1s18$sf$1@z-news.wcss.wroc.pl> <se36el$en5$1@dont-email.me>
<se4aae$qma$1@dont-email.me> <sedj2d$b0l$1@dont-email.me>
<seeb8j$f0t$1@dont-email.me> <1JzOI.8$xM2.7@fx22.iad>
<seejpg$ce6$1@dont-email.me> <seeqsh$1ak$1@dont-email.me>
<a0DOI.93$fI7.10@fx33.iad>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Date: Wed, 4 Aug 2021 21:29:20 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="3a095d4c6431fbf4c46a2b9aa58dd627";
logging-data="10044"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19kfgXEzhJt5ADmrLD3O29lbdHkZpK02Hw="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
Thunderbird/78.11.0
Cancel-Lock: sha1:zo+ElvayQzrp4+SKiX3TLzroCWU=
In-Reply-To: <a0DOI.93$fI7.10@fx33.iad>
Content-Language: en-GB
 by: David Brown - Wed, 4 Aug 2021 21:29 UTC

On 04/08/2021 22:45, EricP wrote:
> David Brown wrote:
>> On 04/08/2021 19:46, Stephen Fuld wrote:
>>> On 8/4/2021 9:59 AM, EricP wrote:
>>>> Stephen Fuld wrote:
>>>>> On 8/4/2021 1:27 AM, David Brown wrote:
>>>>>> On 31/07/2021 22:03, Stephen Fuld wrote:
>>>>>>> On 7/31/2021 2:51 AM, David Brown wrote:
>>>>>>>
>>>>>>> snip
>>>>>>>
>>>>>>>> I've always thought a "disable interrupts for the next N
>>>>>>>> instructions",
>>>>>>>> with "N" being a small immediate constant, would be an extremely
>>>>>>>> useful
>>>>>>>> instruction on the kind of devices I use (single cpu
>>>>>>>> microcontrollers).
>>>>>>>>    This instruction should work regardless of the current
>>>>>>>> interrupt enable
>>>>>>>> status, making it significantly more efficient than the usual
>>>>>>>> store old
>>>>>>>> status, disable interrupts, restore old status dance.  It could
>>>>>>>> also be
>>>>>>>> allowable from user mode safely, unlike normal interrupt disable,
>>>>>>>> since
>>>>>>>> it is only temporary.  And then you would have an easy and safe
>>>>>>>> way to
>>>>>>>> make short multi-instruction atomic sections, in a way that
>>>>>>>> could be
>>>>>>>> used by any language.
>>>>>>>>
>>>>>>>> That would not completely solve your complex invariant problem
>>>>>>>> here, but
>>>>>>>> it could be used to have atomic logs/trackers of the state of the
>>>>>>>> object, allowing the program to identify and recover from
>>>>>>>> inconsistent
>>>>>>>> states.
>>>>>>> I agree on its usefulness for your kind of application - those
>>>>>>> where you
>>>>>>> control all the software.  However on a general purpose system, I
>>>>>>> can
>>>>>>> see lots of potential problems, mostly involving denial of service
>>>>>>> attacks.
>>>>>>>
>>>>>> That is the point of having it limited - it would block interrupts
>>>>>> for
>>>>>> up to N instructions (where N is a small constant), and then even if
>>>>>> followed by another "disable interrupts for the next N instructions",
>>>>>> interrupts could be handled between them.
>>>>> I understand that.  But you can reduce the "responsivness" to
>>>>> interrupts, and reduce the number of interrupts per unit time that
>>>>> you can handle.  I am not sure that is a problem, but it nags at the
>>>>> back of my mind.
>>>> Operating systems go to great lengths to layer their code
>>>> exactly so that interrupts do NOT need to be completely disabled.
>>>>
>>>> Interrupts of the same or lower priority are blocked by HW while
>>>> an Interrupt Service Routine runs, but the ISR actions should be
>>>> only a relatively few instructions, maybe dozens,
>>>> before it queues further work to a lower priority and executes
>>>> a Return From Interrupt to re enable that interrupt priority.
>>>> The deferred work is processed when all nested ISR's return
>>>> and interrupts are again fully enabled.
>>>>
>>>> There is of course more to it than that, a lot due to x86/x64
>>>> clunkiness.
>>>>
>>>> User mode code should use Exchange or CompareExchange, FetchAdd, etc.
>>>> to perform non-interruptible operations. They don't need to be
>>>> atomic if
>>>> the variables are not shared between threads (are thread local store).
>>>> Load-Lock/Store-Conditional clear the lock if an interrupt occurs
>>>> so can be used to perform non-interruptible sequences without
>>>> memory barriers if the variables are thread local.
>>> You have made a bunch of assumptions that may not be true.  If you look
>>> back to see the kind of environment David was talking about, (single CPU
>>> microcontrollers), he may not have a full OS, probably doesn't have
>>> instructions like fetch and add or compare and swap, etc.
>>>
>>> If David would give a few more specifics about his world, it might show
>>> you a whole different world of programming.
>>>
>>
>> A typical device for me these days would be based on an ARM Cortex M
>> single-core microcontroller.  The OS's I use are real-time OS's (such as
>> FreeRTOS) that are linked directly to a single executable along with the
>> program - the systems typically have only a single binary.  (There might
>> be an independent bootloader, but that stops once the main program
>> starts.)
>>
>> Processors in this class generally do not have atomic swap, CAS,
>> FetchAdd, etc., instructions.  Most microcontrollers have RISC cpus
>> (except some of the small, old-fashioned 8-bit devices), and typically
>> the only tool you have other than disabling interrupts is a
>> load-link/store-conditional pair (ldrex/strex on ARM).
>
> ARMv5 had SWP Swap Word and SWPB Swap Byte exchange instructions.
> I suppose before v5 super mode had to disable interrupts.
> User mode would have to make a system call to do an an exchange.

SWP and SWPB were deprecated for ARMv6, and dropped in the Thumb-2
instruction set used by the Cortex-M devices.

>
>> Use of ldrex/strex can avoid disabling interrupts.  But they are big and
>> slow.  Disabling interrupts around a critical section (such as an atomic
>> increment, or accessing a 64-bit variable) is simpler, more efficient
>> and more predictable.
>
> ldrex and strex came later with ARMv6.
> I am surprised you say that on ARM they are big and slow.
> The whole point of a LL/SC mechanism is that it is extremely lean.

Compared to disabling interrupts, LL/SC sequences are big (and therefore
slow). They take something like 7 or 8 instructions, with a loop (and
in RTOS systems, worst-case timings are important even if the loop is
usually only run once). Disabling interrupts takes two or three instructions.

There are two important aspects of LL/SC, compared to older "lock
everything" mechanisms such as disabling interrupts (for single core
devices) or x86 locked instructions. LL/SC splits the locking in two so
that you don't have the delays seen in the x86 world, and it is vastly
easier to implement in the hardware of a load/store processor than fully
locked instructions.

But on a single core system, it is bigger and slower than disabling
interrupts. And it does not work inside interrupt functions (which is a
big issue for microcontroller usage).
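
To make the size comparison concrete, here is a hedged sketch of the retry-loop shape in portable C11. It uses `atomic_compare_exchange_weak_explicit` from `<stdatomic.h>`, which compilers for ARMv7-M typically lower to an ldrex/strex loop (an assumption about the toolchain, not something stated in the post). The function name `llsc_style_increment` and its attempt counter are hypothetical and exist only to expose the loop.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Increment *counter with a load / modify / conditional-store retry loop.
   Returns how many attempts were needed (at least 1). */
static unsigned llsc_style_increment(_Atomic uint32_t *counter)
{
    unsigned attempts = 0;
    uint32_t seen = atomic_load_explicit(counter, memory_order_relaxed);

    for (;;) {
        attempts++;
        /* The weak form may fail spuriously, just as a store-conditional
           can, so it always has to sit inside a loop. On failure, 'seen'
           is refreshed with the current value of *counter. */
        if (atomic_compare_exchange_weak_explicit(counter, &seen, seen + 1,
                                                  memory_order_relaxed,
                                                  memory_order_relaxed))
            return attempts;
    }
}
```

The load, add, conditional store, test and branch, plus any address setup, are what add up to the "7 or 8 instructions" mentioned above, and the loop is why the worst-case time is harder to bound than a fixed disable/restore pair.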

>
> LL loads a word as usual sets a FF in the cache controller to
> indicate the line is "linked" and sets a clock counter to some value.
> If any other processor reads the linked cache line, the FF is reset.
> If an interrupt occurs the FF is reset.
> SC stores the word as usual but checks if the FF is still set
> and the clock counter is > 0. If both are true the store occurs.
>
> The memory coherence detector should be present even on a uniprocessor as
> a cache line may be invalidated by DMA. But that is just an address XOR.
>

Yes. And all of that is slower than a couple of instructions to disable
interrupts.

>> And of critical importance in this world, it works in conjunction with
>> interrupts.
>>
>> Any locking mechanism that does not disable interrupts works by using a
>> loop to get a lock (a memory location), doing the critical work, then
>> releasing the lock.  This will not work if thread-level code and an
>> interrupt both want access to the same object, as the thread-level code
>> can never pre-empty the interrupt code.  Thus if the thread has taken
>> the lock, and the interrupt hits, it cannot get the lock.
>>
>> If you have a larger section of work that needs to be in the critical
>> section, you use an RTOS mutex and keep everything in threads -
>> interrupt routines should not be doing "large sections of work" anyway,
>> and the RTOS will handle the priority inversions needed to avoid
>> deadlock.
>
>


Re: Safepoints

<sef1gf$dvr$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=19570&group=comp.arch#19570
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.arch
Subject: Re: Safepoints
Date: Wed, 4 Aug 2021 23:40:30 +0200
Organization: A noiseless patient Spider
Lines: 128
Message-ID: <sef1gf$dvr$1@dont-email.me>
References: <se1s18$sf$1@z-news.wcss.wroc.pl> <se36el$en5$1@dont-email.me>
<se4aae$qma$1@dont-email.me> <sedj2d$b0l$1@dont-email.me>
<seeb8j$f0t$1@dont-email.me> <1JzOI.8$xM2.7@fx22.iad>
<seejpg$ce6$1@dont-email.me> <lhCOI.3073$Fx8.581@fx45.iad>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Date: Wed, 4 Aug 2021 21:40:31 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="3a095d4c6431fbf4c46a2b9aa58dd627";
logging-data="14331"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+dp42ZGIXmjF9X1Xq/SpcT8kGOuIHs0oE="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
Thunderbird/78.11.0
Cancel-Lock: sha1:ZEGhaw0+GCB9jm+q3lMAxLwSbB0=
In-Reply-To: <lhCOI.3073$Fx8.581@fx45.iad>
Content-Language: en-GB
 by: David Brown - Wed, 4 Aug 2021 21:40 UTC

On 04/08/2021 21:55, EricP wrote:
> Stephen Fuld wrote:
>> On 8/4/2021 9:59 AM, EricP wrote:
>>> Stephen Fuld wrote:
>>>> On 8/4/2021 1:27 AM, David Brown wrote:
>>>>> On 31/07/2021 22:03, Stephen Fuld wrote:
>>>>>> On 7/31/2021 2:51 AM, David Brown wrote:
>>>>>>
>>>>>> snip
>>>>>>
>>>>>>> I've always thought a "disable interrupts for the next N
>>>>>>> instructions",
>>>>>>> with "N" being a small immediate constant, would be an extremely
>>>>>>> useful
>>>>>>> instruction on the kind of devices I use (single cpu
>>>>>>> microcontrollers).
>>>>>>>    This instruction should work regardless of the current
>>>>>>> interrupt enable
>>>>>>> status, making it significantly more efficient than the usual
>>>>>>> store old
>>>>>>> status, disable interrupts, restore old status dance.  It could
>>>>>>> also be
>>>>>>> allowable from user mode safely, unlike normal interrupt disable,
>>>>>>> since
>>>>>>> it is only temporary.  And then you would have an easy and safe
>>>>>>> way to
>>>>>>> make short multi-instruction atomic sections, in a way that could be
>>>>>>> used by any language.
>>>>>>>
>>>>>>> That would not completely solve your complex invariant problem
>>>>>>> here, but
>>>>>>> it could be used to have atomic logs/trackers of the state of the
>>>>>>> object, allowing the program to identify and recover from
>>>>>>> inconsistent
>>>>>>> states.
>>>>>>
>>>>>> I agree on its usefulness for your kind of application - those
>>>>>> where you
>>>>>> control all the software.  However on a general purpose system, I can
>>>>>> see lots of potential problems, mostly involving denial of service
>>>>>> attacks.
>>>>>>
>>>>>
>>>>> That is the point of having it limited - it would block interrupts for
>>>>> up to N instructions (where N is a small constant), and then even if
>>>>> followed by another "disable interrupts for the next N instructions",
>>>>> interrupts could be handled between them.
>>>>
>>>> I understand that.  But you can reduce the "responsivness" to
>>>> interrupts, and reduce the number of interrupts per unit time that
>>>> you can handle.  I am not sure that is a problem, but it nags at the
>>>> back of my mind.
>>>
>>> Operating systems go to great lengths to layer their code
>>> exactly so that interrupts do NOT need to be completely disabled.
>>>
>>> Interrupts of the same or lower priority are blocked by HW while
>>> an Interrupt Service Routine runs, but the ISR actions should be
>>> only a relatively few instructions, maybe dozens,
>>> before it queues further work to a lower priority and executes
>>> a Return From Interrupt to re enable that interrupt priority.
>>> The deferred work is processed when all nested ISR's return
>>> and interrupts are again fully enabled.
>>>
>>> There is of course more to it than that, a lot due to x86/x64
>>> clunkiness.
>>>
>>> User mode code should use Exchange or CompareExchange, FetchAdd, etc.
>>> to perform non-interruptible operations. They don't need to be atomic if
>>> the variables are not shared between threads (are thread local store).
>>> Load-Lock/Store-Conditional clear the lock if an interrupt occurs
>>> so can be used to perform non-interruptible sequences without
>>> memory barriers if the variables are thread local.
>>
>> You have made a bunch of assumptions that may not be true.  If you
>> look back to see the kind of environment David was talking about,
>> (single CPU microcontrollers), he may not have a full OS, probably
>> doesn't have instructions like fetch and add or compare and swap, etc.
>>
>> If David would give a few more specifics about his world, it might
>> show you a whole different world of programming.
>
> Not really. Any microcontroller with interrupts has a disable capability.
> That means a non-interruptible sequence is always possible
> and the discussion is about the efficiency on a particular uC.
>
> One designs the software using HalNonInterruptibleExchange and
> HalNonInterruptibleCompareExchange subroutines or macros.

No, one does not.

You view the interrupt disabling / restoring code (inline functions,
macros, or - my preference - RAII in C++ or gcc-extended C) as brackets
around your critical code section. You don't use it to duplicate other
mechanisms such as compare-exchange or CAS.
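
The "brackets" idea in gcc-extended C can be sketched with the `cleanup` variable attribute (supported by GCC and Clang), which runs a function when the variable goes out of scope. Everything named here is hypothetical: the `CRITICAL_SECTION_GUARD` macro, the `irq_guard_acquire` / `irq_guard_release` helpers, the simulated `sim_irq_masked` flag, and the `items_in` / `items_out` example data. This is one possible shape, not necessarily the author's own code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the CPU interrupt mask (1 = masked). */
static uint32_t sim_irq_masked = 0;

/* Opening bracket: save the old mask state and mask interrupts. */
static inline uint32_t irq_guard_acquire(void)
{
    uint32_t old = sim_irq_masked;
    sim_irq_masked = 1;
    return old;
}

/* Closing bracket: called automatically when the guard leaves scope. */
static inline void irq_guard_release(uint32_t *saved)
{
    assert(*saved == 0 || *saved == 1);
    sim_irq_masked = *saved;
}

/* Declares a scope-bound guard; interrupts stay masked until the
   enclosing block exits by any path. */
#define CRITICAL_SECTION_GUARD() \
    uint32_t irq_guard_ __attribute__((cleanup(irq_guard_release), unused)) = \
        irq_guard_acquire()

/* Two related variables that must stay consistent: items_out <= items_in. */
static uint32_t items_in, items_out;

static void add_item(void)
{
    CRITICAL_SECTION_GUARD();
    items_in++;
}

static int take_item(void)
{
    CRITICAL_SECTION_GUARD();
    if (items_out == items_in)
        return 0;       /* early return: the guard still restores the mask */
    items_out++;
    return 1;
}
```

The critical code sits between the guard declaration and the end of the block, so there is no separate "restore" call to forget on an early return.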

> If those map to single instructions on a uC, great.
> If not, then they do the equivalent PUSHF, INTD, LD, ST, POPF sequence.
> If an INTD_N instruction is available then it saves a PUSHF and POPF.
> But LL/SC saves any disable and is more generally useful even on a uC.
>

No, it is not more useful - because it does not work in conjunction with
interrupts.

Typical uses for short critical sections might be to make an atomic
variable, or to keep some related variables consistent. These will
often be used alongside interrupts - for more "normal" variables, you'd
likely be using RTOS synchronisation systems like messaging or mutexes,
and know when you can access shared data without any more locking. And
if you try to use LL/SC mechanisms to lock the data your thread shares
with an interrupt function, it will work perfectly during all your lab
testing and fail at the customer site.
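
The failure mode described above can be shown with a small host-side simulation in C. The names `lock_taken`, `try_lock`, `unlock` and `isr_spin_for_lock` are hypothetical, and the spin is bounded only so that the demonstration terminates. On real hardware the interrupt routine would spin forever, because the interrupted thread cannot run to release the lock.

```c
#include <assert.h>
#include <stdbool.h>

/* A lock that is just a memory location, taken without masking interrupts. */
static bool lock_taken = false;

static bool try_lock(void)
{
    if (lock_taken)
        return false;
    lock_taken = true;
    return true;
}

static void unlock(void)
{
    assert(lock_taken);
    lock_taken = false;
}

/* Simulated interrupt routine that wants the same lock as the thread.
   Returns true if it got (and released) the lock within max_spins tries. */
static bool isr_spin_for_lock(unsigned max_spins)
{
    for (unsigned i = 0; i < max_spins; i++) {
        if (try_lock()) {
            unlock();
            return true;
        }
        /* The thread that holds the lock is suspended underneath this
           routine, so nothing can change lock_taken while we spin. */
    }
    return false;
}
```

If the interrupt happens to arrive while the thread is outside its locked region, the routine succeeds, which is why the problem can stay hidden through lab testing.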

(There are also other occasions when you might want such
interrupt-disabled critical sections, such as during a watchdog update,
that cannot be handled by any kind of software locking mechanism.)

> Many uC I have seen had an XCHG [mem],reg instruction but not
> compare-exchange as most are accumulator architectures and many
> instructions are a non-interruptible R-M-W sequence anyway.
>

That is the case for a number of old 8-bit CISC designs - not for more
modern microcontrollers which are almost always RISC based. And on
these old 8-bit designs, you regularly need to have interrupt disable
sequences in order to make atomic access to variables of more than 8 bits.

Re: Safepoints

<5hHOI.1976$lK.1523@fx41.iad>

https://www.novabbs.com/devel/article-flat.php?id=19573&group=comp.arch#19573

Path: i2pn2.org!i2pn.org!aioe.org!news.uzoreto.com!feeder.usenetexpress.com!tr2.eu1.usenetexpress.com!feeder1.feed.usenet.farm!feed.usenet.farm!peer02.ams4!peer.am4.highwinds-media.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx41.iad.POSTED!not-for-mail
From: ThatWoul...@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: Safepoints
References: <se1s18$sf$1@z-news.wcss.wroc.pl> <se36el$en5$1@dont-email.me> <se4aae$qma$1@dont-email.me> <sedj2d$b0l$1@dont-email.me> <seeb8j$f0t$1@dont-email.me> <1JzOI.8$xM2.7@fx22.iad> <seejpg$ce6$1@dont-email.me> <seeqsh$1ak$1@dont-email.me> <a0DOI.93$fI7.10@fx33.iad> <sef0rg$9ps$1@dont-email.me>
In-Reply-To: <sef0rg$9ps$1@dont-email.me>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 83
Message-ID: <5hHOI.1976$lK.1523@fx41.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Thu, 05 Aug 2021 01:37:05 UTC
Date: Wed, 04 Aug 2021 21:36:17 -0400
X-Received-Bytes: 4927
 by: EricP - Thu, 5 Aug 2021 01:36 UTC

David Brown wrote:
> On 04/08/2021 22:45, EricP wrote:
>> David Brown wrote:
>>> A typical device for me these days would be based on an ARM Cortex M
>>> single-core microcontroller. The OS's I use are real-time OS's (such as
>>> FreeRTOS) that are linked directly to a single executable along with the
>>> program - the systems typically have only a single binary. (There might
>>> be an independent bootloader, but that stops once the main program
>>> starts.)
>>>
>>> Processors in this class generally do not have atomic swap, CAS,
>>> FetchAdd, etc., instructions. Most microcontrollers have RISC cpus
>>> (except some of the small, old-fashioned 8-bit devices), and typically
>>> the only tool you have other than disabling interrupts is a
>>> load-link/store-conditional pair (ldrex/strex on ARM).
>> ARMv5 had SWP Swap Word and SWPB Swap Byte exchange instructions.
>> I suppose before v5 super mode had to disable interrupts.
>> User mode would have to make a system call to do an exchange.
>
> SWP and SWPB were deprecated for ARMv6, and dropped in the Thumb-2
> instruction set used by the Cortex-M devices.

I checked the ARMv7 manual before posting that and it is still there.
It was removed for ARMv7-M which is a subset of ARMv7-R.

>>> Use of ldrex/strex can avoid disabling interrupts. But they are big and
>>> slow. Disabling interrupts around a critical section (such as an atomic
>>> increment, or accessing a 64-bit variable) is simpler, more efficient
>>> and more predictable.
>> ldrex and strex came later with ARMv6.
>> I am surprised you say that on ARM they are big and slow.
>> The whole point of a LL/SC mechanism is that it is extremely lean.
>
> Compared to disabling interrupts, LL/SC sequences are big (and therefore
> slow). They take something like 7 or 8 instructions, with a loop (and
> in RTOS systems, worst-case timings are important even if the loop is
> usually only run once). Disabling interrupts are two or three instructions.

A non-interruptible exchange on Alpha would be

  loop:
    LDL_L  rx,[mem]
    STL_C  rx,[mem]
    BEQ    rx,loop

though strictly speaking that should be a forward branch to
a backward branch so it is predicted as not taken.
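
For readers more at home in C than in Alpha assembly, the same load / conditional-store / retry pattern can be sketched with C11 atomics. This is an illustration under assumptions, not the code from the post: exchange_u32 is a made-up name, and whether the compiler emits an LL/SC pair for the weak compare-exchange depends on the target. The sketch keeps the loaded (old) value and the value to be stored in separate variables, which a real exchange needs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Exchange built from a load plus a conditional store, retried on failure.
   Returns the value that was in *mem before the store took effect. */
uint32_t exchange_u32(_Atomic uint32_t *mem, uint32_t new_value)
{
    assert(mem != NULL);
    uint32_t old = atomic_load_explicit(mem, memory_order_relaxed);
    /* The weak form is allowed to fail spuriously, which is how a lost
       reservation shows up; on failure `old` is refreshed from memory. */
    while (!atomic_compare_exchange_weak_explicit(
               mem, &old, new_value,
               memory_order_relaxed, memory_order_relaxed)) {
        /* retry with the refreshed value */
    }
    return old;
}
```

In ordinary code atomic_exchange does this in one call; the explicit loop is only there to make the retry visible.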

> There are two important aspects of LL/SC, compared to older "lock
> everything" mechanisms such as disabling interrupts (for single core
> devices) or x86 locked instructions. It splits the locking in two so
> that you don't have the delays seen in the x86 world, and it is vastly
> easier to implement in the hardware of a load/store processor than fully
> locked instructions.
>
> But on a single core system, it is bigger and slower than disabling
> interrupts. And it does not work inside interrupt functions (which is a
> big issue for microcontroller usage).

Then Arm must have done something seriously wrong with its implementation.
There is nothing in any Arm manual that I have seen to indicate a
problem with ldrex/strex and interrupts, or that they are inherently
slower than normal LD and ST.

>> LL loads a word as usual, sets a FF in the cache controller to
>> indicate the line is "linked" and sets a clock counter to some value.
>> If any other processor reads the linked cache line, the FF is reset.
>> If an interrupt occurs the FF is reset.
>> SC stores the word as usual but checks if the FF is still set
>> and the clock counter is > 0. If both are true the store occurs.
>>
>> The memory coherence detector should be present even on a uniprocessor as
>> a cache line may be invalidated by DMA. But that is just an address XOR.
>>
>
> Yes. And all of that is slower than a couple of instructions to disable
> interrupts.

I keep wondering if you left the DMB memory barrier instructions in?
If a cpu is just talking between its own interrupts and its own
non-interrupt levels then it doesn't need memory barriers.
A cpu's own loads and stores are always consistent with each other.

Re: Safepoints

<qdIOI.1006$yW1.813@fx08.iad>

https://www.novabbs.com/devel/article-flat.php?id=19574&group=comp.arch#19574

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!feeder5.feed.usenet.farm!feeder1.feed.usenet.farm!feed.usenet.farm!news.uzoreto.com!news-out.netnews.com!news.alt.net!fdc2.netnews.com!peer02.ams1!peer.ams1.xlned.com!news.xlned.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx08.iad.POSTED!not-for-mail
From: ThatWoul...@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: Safepoints
References: <se1s18$sf$1@z-news.wcss.wroc.pl> <se36el$en5$1@dont-email.me> <se4aae$qma$1@dont-email.me> <sedj2d$b0l$1@dont-email.me> <seeb8j$f0t$1@dont-email.me> <1JzOI.8$xM2.7@fx22.iad> <seejpg$ce6$1@dont-email.me> <lhCOI.3073$Fx8.581@fx45.iad> <sef1gf$dvr$1@dont-email.me>
In-Reply-To: <sef1gf$dvr$1@dont-email.me>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 86
Message-ID: <qdIOI.1006$yW1.813@fx08.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Thu, 05 Aug 2021 02:41:26 UTC
Date: Wed, 04 Aug 2021 22:40:24 -0400
X-Received-Bytes: 4906
 by: EricP - Thu, 5 Aug 2021 02:40 UTC

David Brown wrote:
> On 04/08/2021 21:55, EricP wrote:
>> Stephen Fuld wrote:
>>> You have made a bunch of assumptions that may not be true. If you
>>> look back to see the kind of environment David was talking about,
>>> (single CPU microcontrollers), he may not have a full OS, probably
>>> doesn't have instructions like fetch and add or compare and swap, etc.
>>>
>>> If David would give a few more specifics about his world, it might
>>> show you a whole different world of programming.
>> Not really. Any microcontroller with interrupts has a disable capability.
>> That means a non-interruptible sequence is always possible
>> and the discussion is about the efficiency on a particular uC.
>>
>> One designs the software using HalNonInterruptibleExchange and
>> HalNonInterruptibleCompareExchange subroutines or macros.
>
> No, one does not.
>
> You view the interrupt disabling / restoring code (inline functions,
> macros, or - my preference - RAII in C++ or gcc-extended C) as brackets
> around your critical code section. You don't use it to duplicate other
> mechanisms such as compare-exchange or CAS.

If, as Stephen Fuld's scenario said, the microcontroller doesn't have
a swap instruction then your philosophy is moot. You have no choice.

>> If those map to single instructions on a uC, great.
>> If not, then they do the equivalent PUSHF, INTD, LD, ST, POPF sequence.
>> If an INTD_N instruction is available then it saves a PUSHF and POPF.
>> But LL/SC saves any disable and is more generally useful even on a uC.
>>
>
> No, it is not more useful - because it does not work in conjunction with
> interrupts.

It's not clear what you think does not work with interrupts, but if it is
LL/SC, the interrupt should clear the lock flag and prevent the SC.

The ARMv7-M manual says exceptions do clear the flag:
section A3.4.4 Context switch support
"the local monitor is changed to Open Access automatically
as part of an exception entry or exit"

In ARM parlance Open Access means not exclusive - the load-lock is canceled.
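
The behaviour described in that manual excerpt can be sketched as a toy model in C. This is not how the hardware is built and none of these names come from ARM: load_linked, store_conditional, exception_entry and increment_with_interrupts are hypothetical, and the monitor is reduced to a single flag. The model also returns true on success, whereas strex writes 0 to its status register on success.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy single-core local monitor: false means "Open Access". */
static bool monitor_exclusive;

uint32_t load_linked(const uint32_t *addr)
{
    assert(addr != NULL);
    monitor_exclusive = true;
    return *addr;
}

/* Stores only if the monitor is still exclusive; the monitor is left
   open either way. Returns true if the store happened. */
bool store_conditional(uint32_t *addr, uint32_t value)
{
    assert(addr != NULL);
    bool ok = monitor_exclusive;
    if (ok)
        *addr = value;
    monitor_exclusive = false;
    return ok;
}

/* Exception entry or exit puts the monitor back to Open Access. */
void exception_entry(void)
{
    monitor_exclusive = false;
}

/* Increment with retry. One exception is injected between the load and
   the store on each of the first `interrupts` attempts. Returns the
   number of attempts that were needed. */
unsigned increment_with_interrupts(uint32_t *addr, unsigned interrupts)
{
    unsigned attempts = 0;
    for (;;) {
        uint32_t v = load_linked(addr);
        attempts++;
        if (interrupts > 0) {
            interrupts--;
            exception_entry();
        }
        if (store_conditional(addr, v + 1))
            return attempts;
    }
}
```

With no interrupt the increment completes in one attempt; each injected exception costs exactly one extra trip round the loop, and the stored result is the same.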

> Typical uses for short critical sections might be to make an atomic
> variable, or to keep some related variables consistent. These will
> often be used alongside interrupts - for more "normal" variables, you'd
> likely be using RTOS synchronisation systems like messaging or mutexes,
> and know when you can access shared data without any more locking. And
> if you try to use LL/SC mechanisms to lock the data your thread shares
> with an interrupt function, it will work perfectly during all your lab
> testing and fail at the customer site.

I'm not referring to atomically shared variables,
just interrupts within a single cpu.

LL/SC is used to build a non-interruptible exchange,
which is used to queue deferred work packets from interrupt level
to non-interrupt kernel level, and for kernel to pop deferred work
packets from the queue, all without disabling interrupts.

It's how RSX, VMS, and WNT deferred work from interrupts
without having to constantly disable and enable interrupts,
and maybe BSD and Linux too.
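
A minimal sketch of such a deferred-work queue, written with C11 atomics, might look like the following. It is an assumption-laden illustration and not the RSX, VMS or WNT code: struct work_packet, defer_work and take_deferred_work are invented names, and the consumer detaches the whole list in one exchange instead of popping single packets, which is a design choice made here to sidestep ABA problems.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct work_packet {
    struct work_packet *next;
    int id;
};

static _Atomic(struct work_packet *) deferred_head = NULL;

/* Producer side (interrupt level in the scenario above): link the packet
   in front of the current head, retrying if the head moved meanwhile. */
void defer_work(struct work_packet *p)
{
    assert(p != NULL);
    struct work_packet *old = atomic_load(&deferred_head);
    do {
        p->next = old;
    } while (!atomic_compare_exchange_weak(&deferred_head, &old, p));
}

/* Consumer side: detach the whole list with one exchange, then reverse
   it so packets come out in the order they were queued. */
struct work_packet *take_deferred_work(void)
{
    struct work_packet *list =
        atomic_exchange(&deferred_head, (struct work_packet *)NULL);
    struct work_packet *fifo = NULL;
    while (list != NULL) {
        struct work_packet *next = list->next;
        list->next = fifo;
        fifo = list;
        list = next;
    }
    return fifo;
}
```

Neither side disables interrupts: a push that is interrupted by another push simply fails its compare-exchange once and retries with the new head.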

>
> (There are also other occasions when you might want such
> interrupt-disabled critical sections, such as during a watchdog update,
> that cannot be handled by any kind of software locking mechanism.)
>
>> Many uC I have seen had an XCHG [mem],reg instruction but not
>> compare-exchange as most are accumulator architectures and many
>> instructions are a non-interruptible R-M-W sequence anyway.
>>
>
> That is the case for a number of old 8-bit CISC designs - not for more
> modern microcontrollers which are almost always RISC based. And on
> these old 8-bit designs, you regularly need to have interrupt disable
> sequences in order to make atomic access to variables of more than 8 bits.

I don't count 32 bits as a microcontroller - those are microprocessors.
Last I read the 8-bit 8051 is still the most popular microcontroller.

Re: Safepoints

<gsIOI.3508$Fx8.1428@fx45.iad>

https://www.novabbs.com/devel/article-flat.php?id=19575&group=comp.arch#19575

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!feeder1.feed.usenet.farm!feed.usenet.farm!newsfeed.xs4all.nl!newsfeed8.news.xs4all.nl!news-out.netnews.com!news.alt.net!fdc3.netnews.com!peer03.ams1!peer.ams1.xlned.com!news.xlned.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx45.iad.POSTED!not-for-mail
From: ThatWoul...@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: Safepoints
References: <se1s18$sf$1@z-news.wcss.wroc.pl> <se36el$en5$1@dont-email.me> <se4aae$qma$1@dont-email.me> <f1240d75-82d6-468e-bf51-c4e275e38a4an@googlegroups.com> <sedj5r$b0l$2@dont-email.me> <DsAOI.3063$Fx8.71@fx45.iad> <seeoqr$i6d$1@dont-email.me>
In-Reply-To: <seeoqr$i6d$1@dont-email.me>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 41
Message-ID: <gsIOI.3508$Fx8.1428@fx45.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Thu, 05 Aug 2021 02:57:16 UTC
Date: Wed, 04 Aug 2021 22:56:26 -0400
X-Received-Bytes: 2692
 by: EricP - Thu, 5 Aug 2021 02:56 UTC

David Brown wrote:
> On 04/08/2021 19:51, Branimir Maksimovic wrote:
>> On 2021-08-04, David Brown <david.brown@hesbynett.no> wrote:
>>> On 31/07/2021 22:58, MitchAlsup wrote:
>>>> On Saturday, July 31, 2021 at 3:03:28 PM UTC-5, Stephen Fuld wrote:
>>>>> On 7/31/2021 2:51 AM, David Brown wrote:
>>>>>
>>>>> snip
>>>>>> I've always thought a "disable interrupts for the next N instructions",
>>>>>> with "N" being a small immediate constant, would be an extremely useful
>>>>>> instruction on the kind of devices I use (single cpu microcontrollers).
>>>> <
>>>> I can see utility here, but you need not only N as the number of instructions,
>>>> but also P the priority level below which you suppress interrupts and above
>>>> which you still allow interrupts.
>>>> <
>>> No - you block /all/ interrupts, of /all/ priorities. That's key to
>>> such an instruction.
>> Except non maskable interrupts.
>>
>
> Yes, but those only occur in a disaster (such as critical hardware fault).

Yeah, well, only if people read the fucking HW manual first.

However, it seems none of those who designed the PC did, as they
wired the 8087 FPU coprocessor operation completion signal into
the NMI and afterwards everything had to be backwards compatible.
And of course all PC OS's have to deal with the possibility that
disabling interrupts won't block FPU NMI's.

Later, from what I've read, someone wired an NMI enable/disable register.
But they connected it to the IO port bus, which for compatibility
was stuck at 5 MHz no matter what the system bus speed was.
So NMI disables and enables were really slow.

The PC Convertible had the NMI attached to the diskette,
real-time clock, keyboard, and the system suspend interrupts.

The PCjr had NMI attached to the keyboard interrupt.

Re: Safepoints

<seg9bo$djl$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=19577&group=comp.arch#19577

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.arch
Subject: Re: Safepoints
Date: Thu, 5 Aug 2021 11:00:29 +0200
Organization: A noiseless patient Spider
Lines: 52
Message-ID: <seg9bo$djl$1@dont-email.me>
References: <se1s18$sf$1@z-news.wcss.wroc.pl> <se36el$en5$1@dont-email.me>
<se4aae$qma$1@dont-email.me>
<f1240d75-82d6-468e-bf51-c4e275e38a4an@googlegroups.com>
<sedj5r$b0l$2@dont-email.me> <DsAOI.3063$Fx8.71@fx45.iad>
<seeoqr$i6d$1@dont-email.me> <gsIOI.3508$Fx8.1428@fx45.iad>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 5 Aug 2021 09:00:40 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="ec91c64638e968ca0c439847a3b21dbc";
logging-data="13941"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/0TIoIa+sZoFX/ObyS4LOeHFGfXmVGnWM="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
Thunderbird/78.11.0
Cancel-Lock: sha1:yJDP2VwoQb7KWXpUCCS8ceJnSAw=
In-Reply-To: <gsIOI.3508$Fx8.1428@fx45.iad>
Content-Language: en-GB
 by: David Brown - Thu, 5 Aug 2021 09:00 UTC

On 05/08/2021 04:56, EricP wrote:
> David Brown wrote:
>> On 04/08/2021 19:51, Branimir Maksimovic wrote:
>>> On 2021-08-04, David Brown <david.brown@hesbynett.no> wrote:
>>>> On 31/07/2021 22:58, MitchAlsup wrote:
>>>>> On Saturday, July 31, 2021 at 3:03:28 PM UTC-5, Stephen Fuld wrote:
>>>>>> On 7/31/2021 2:51 AM, David Brown wrote:
>>>>>> snip
>>>>>>> I've always thought a "disable interrupts for the next N
>>>>>>> instructions", with "N" being a small immediate constant, would
>>>>>>> be an extremely useful instruction on the kind of devices I use
>>>>>>> (single cpu microcontrollers).
>>>>> <
>>>>> I can see utility here, but you need not only N as the number of
>>>>> instructions,
>>>>> but also P the priority level below which you suppress interrupts
>>>>> and above
>>>>> which you still allow interrupts.
>>>>> <
>>>> No - you block /all/ interrupts, of /all/ priorities.  That's key to
>>>> such an instruction.
>>> Except non maskable interrupts.
>>>
>>
>> Yes, but those only occur in a disaster (such as critical hardware
>> fault).
>
> Yeah, well, only if people read the fucking HW manual first.
>
> However, it seems none of those who designed the PC did, as they
> wired the 8087 FPU coprocessor operation completion signal into
> the NMI and afterwards everything had to be backwards compatible.
> And of course all PC OS's have to deal with the possibility that
> disabling interrupts won't block FPU NMI's.
>
> Later, from what I've read, someone wired an NMI enable/disable register.
> But they connected it to the IO port bus, which for compatibility
> was stuck at 5 MHz no matter what the system bus speed was.
> So NMI disables and enables were really slow.
>
> The PC Convertible had the NMI attached to the diskette,
> real-time clock, keyboard, and the system suspend interrupts.
>
> The PCjr had NMI attached to the keyboard interrupt.
>

I am in the lucky position of being heavily involved in the design of
most of the systems I program - and I am very careful about how an NMI
can be used. (I don't design the microcontrollers themselves, of course
- and sometimes these have a few astoundingly stupid features.) My work
on PC's is all at a higher level - usually in a higher level language
too, such as Python. So PC's NMI problems are an SEP from my viewpoint!

Re: Safepoints

<segqla$4k9$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=19578&group=comp.arch#19578

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.arch
Subject: Re: Safepoints
Date: Thu, 5 Aug 2021 15:55:54 +0200
Organization: A noiseless patient Spider
Lines: 229
Message-ID: <segqla$4k9$1@dont-email.me>
References: <se1s18$sf$1@z-news.wcss.wroc.pl> <se36el$en5$1@dont-email.me>
<se4aae$qma$1@dont-email.me> <sedj2d$b0l$1@dont-email.me>
<seeb8j$f0t$1@dont-email.me> <1JzOI.8$xM2.7@fx22.iad>
<seejpg$ce6$1@dont-email.me> <seeqsh$1ak$1@dont-email.me>
<a0DOI.93$fI7.10@fx33.iad> <sef0rg$9ps$1@dont-email.me>
<5hHOI.1976$lK.1523@fx41.iad>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 5 Aug 2021 13:55:54 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="ec91c64638e968ca0c439847a3b21dbc";
logging-data="4745"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/VfFWzH75Re2dCwDlEuGPm0twfzijKA0Q="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
Thunderbird/78.11.0
Cancel-Lock: sha1:qKp3YwW3htTTj8i3Kuma+OnbX1s=
In-Reply-To: <5hHOI.1976$lK.1523@fx41.iad>
Content-Language: en-GB
 by: David Brown - Thu, 5 Aug 2021 13:55 UTC

On 05/08/2021 03:36, EricP wrote:
> David Brown wrote:
>> On 04/08/2021 22:45, EricP wrote:
>>> David Brown wrote:
>>>> A typical device for me these days would be based on an ARM Cortex M
>>>> single-core microcontroller.  The OS's I use are real-time OS's
>>>> (such as
>>>> FreeRTOS) that are linked directly to a single executable along with
>>>> the
>>>> program - the systems typically have only a single binary.  (There
>>>> might
>>>> be an independent bootloader, but that stops once the main program
>>>> starts.)
>>>>
>>>> Processors in this class generally do not have atomic swap, CAS,
>>>> FetchAdd, etc., instructions.  Most microcontrollers have RISC cpus
>>>> (except some of the small, old-fashioned 8-bit devices), and typically
>>>> the only tool you have other than disabling interrupts is a
>>>> load-link/store-conditional pair (ldrex/strex on ARM).
>>> ARMv5 had SWP Swap Word and SWPB Swap Byte exchange instructions.
>>> I suppose before v5 super mode had to disable interrupts.
>>> User mode would have to make a system call to do an exchange.
>>
>> SWP and SWPB were deprecated for ARMv6, and dropped in the Thumb-2
>> instruction set used by the Cortex-M devices.
>
> I checked the ARMv7 manual before posting that and it is still there.
> It was removed for ARMv7-M which is a subset of ARMv7-R.

Yes, it is still in the ARMv7 - but deprecated. And it does not exist
at all in the Thumb-2 instruction set, which is the only available set
in the Cortex-M family. So you are right that it is in the architecture
(though I believe it is completely gone in ARMv8), but it is missing
from the devices I have been talking about.

>
>>>> Use of ldrex/strex can avoid disabling interrupts.  But they are big
>>>> and
>>>> slow.  Disabling interrupts around a critical section (such as an
>>>> atomic
>>>> increment, or accessing a 64-bit variable) is simpler, more efficient
>>>> and more predictable.
>>> ldrex and strex came later with ARMv6.
>>> I am surprised you say that on ARM they are big and slow.
>>> The whole point of a LL/SC mechanism is that it is extremely lean.
>>
>> Compared to disabling interrupts, LL/SC sequences are big (and therefore
>> slow).  They take something like 7 or 8 instructions, with a loop (and
>> in RTOS systems, worst-case timings are important even if the loop is
>> usually only run once).  Disabling interrupts are two or three
>> instructions.
>
> A non-interruptible exchange on Alpha would be
>
>   loop:
>     LDL_L  rx,[mem]
>     STL_C  rx,[mem]
>     BEQ    rx,loop
>
> though strictly speaking that should be a forward branch to
> a backward branch so it is predicted as not taken.
>
>> There are two important aspects of LL/SC, compared to older "lock
>> everything" mechanisms such as disabling interrupts (for single core
>> devices) or x86 locked instructions.  It splits the locking in two so
>> that you don't have the delays seen in the x86 world, and it is vastly
>> easier to implement in the hardware of a load/store processor than fully
>> locked instructions.
>>
>> But on a single core system, it is bigger and slower than disabling
>> interrupts.  And it does not work inside interrupt functions (which is a
>> big issue for microcontroller usage).
>
> Then Arm must have done something seriously wrong with its implementation.
> There is nothing in any Arm manual that I have seen to indicate a
> problem with ldrex/strex and interrupts, or that they are inherently
> slower than normal LD and ST.
>

I am not sure I am explaining things well here.

The assembly for a 32-bit increment (of a variable in memory) in Thumb-2
will be something along the lines of:

    ldr r3, [r2]
    add r3, r3, #1
    str r3, [r2]

Three instructions, with one load and one store. With tightly-coupled
memory (or on an M0 to M4 microcontroller, on-board ram), loads and
stores are two cycles. So that is 3 instructions, 5 cycles.

Making this atomic on an M0 microcontroller is:

    cpsid i

    ldr r3, [r2]
    add r3, r3, #1
    str r3, [r2]

    cpsie i

Two extra instructions, costing two extra cycles. (These will generally not be
needed inside an interrupt function, unless the same data will be
accessed by different interrupt routines with different priorities.)

If you want a more flexible sequence that saves and restores interrupt
status, and also need it safe for the dual-issue M7, you can use:

    mrs r1, primask
    cpsid i
    dsb

    ldr r3, [r2]
    add r3, r3, #1
    str r3, [r2]

    msr primask, r1

Four instructions, four cycles overhead for making the sequence atomic.

The equivalent for this using ldrex/strex is indeed short and fast:

  loop:
    ldrex r3, [r2]
    add r3, r3, #1
    strex r1, r3, [r2]
    cmp r1, #0
    bne loop
    dsb

The ldrex and strex instructions don't take any longer than their
non-locking equivalents. And this is safe to use in interrupts. In
real-time work, it is vital to track worst-case execution time, not
best-case or common-case. So though the best case might be 3 cycles
overhead, an extra round of the loop might be another 9 cycles. (Single
extra rounds will happen occasionally - multiple extra rounds should not
be realistically possible.)

So it seems to be a viable alternative to disabling interrupts, with
approximately the same overhead. However, it has two major
disadvantages to go with its obvious advantage of not blocking interrupts.

It will only work for sequences ending in a single write of 32 bits or
less, and it will only work for restartable sequences.

Suppose, instead, that the atomic operation you want is not a simple
increment of a 32-bit value, but is storing a 64-bit value. With the
interrupt disable strategy, you still have exactly the same 4
instruction, 4 cycle overhead (or two instructions in the simplest
version), for both read and write routines. How do you do this with
ldrex/strex?

You can't use the same setup as earlier. Suppose task A takes the
exclusive monitor lock, writes the first half of the 64-bit item, then
there is an interrupt. Task B wants to read the value - it takes the
lock, reads the 64-bit value, and releases it, thinking all is well.
When task A resumes, its write to the second half using strex fails, and
it restarts the write. In the meantime, task B is left with a corrupted
read that it thinks is valid. This is typically a very low probability
event - you never see it in testing, but it /will/ happen when you have
deployed thousands of systems at customers.
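
The torn read at the heart of that scenario can be reproduced deterministically on a host. The sketch below is a simplification under stated assumptions: it leaves out the exclusive monitor entirely and just places task B's read between task A's two half-writes, which is where the unlucky interrupt would land. struct split64, split64_read and write_with_preempting_reader are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A 64-bit value held as two 32-bit words, as a 32-bit core stores it. */
struct split64 {
    uint32_t lo;
    uint32_t hi;
};

uint64_t split64_read(const struct split64 *v)
{
    assert(v != NULL);
    return ((uint64_t)v->hi << 32) | v->lo;
}

/* Task A writes the value one half at a time with no protection.
   Task B's read is modelled as running between the two stores.
   Returns the value B saw. */
uint64_t write_with_preempting_reader(struct split64 *v, uint64_t new_value)
{
    assert(v != NULL);
    v->lo = (uint32_t)new_value;            /* first half written */
    uint64_t seen_by_b = split64_read(v);   /* B runs here */
    v->hi = (uint32_t)(new_value >> 32);    /* second half written */
    return seen_by_b;
}
```

Starting from 0x00000000FFFFFFFF and writing 0x0000000100000000 (the old value plus one), B sees 0, which is neither the old value nor the new one. A's own retry logic cannot undo that, because the damage is on B's side.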

So you now use ldrex/strex to control access to an independent lock flag
(a simple semaphore). That works for tasks, but the overhead is now
much bigger, and there is the possibility of failure - if another task
has the semaphore, the current one must cede control to it. And that
means blocking the task, changing dynamic priority levels so that the
other task can run, etc. - a full-blown RTOS mutex solution. These are
very powerful and useful locking mechanisms, but a /vast/ overhead
compared to the four instructions for interrupt disabling, and the
single instruction and 3 cycles needed for the 64-bit load or store.

And what happens in interrupts? An interrupt function must not block
(though it can trigger a scheduler run when it exits). If a task has
the lock when the interrupt is triggered, it will not be able to do its job.
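
The interrupt-side problem can be sketched with a one-bit lock flag and a try-lock. Everything here is illustrative: struct guarded64, guarded_try_lock, guarded_unlock and isr_try_update are invented names, the flag uses C11 atomic_flag rather than hand-written ldrex/strex, and a real RTOS mutex does far more (blocking, priority handling) than this.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A 64-bit value guarded by a one-bit lock flag (hypothetical layout). */
struct guarded64 {
    atomic_flag lock;
    uint64_t value;
};

/* One attempt, no waiting: true if the caller now owns the flag. */
bool guarded_try_lock(struct guarded64 *g)
{
    assert(g != NULL);
    return !atomic_flag_test_and_set_explicit(&g->lock, memory_order_acquire);
}

void guarded_unlock(struct guarded64 *g)
{
    assert(g != NULL);
    atomic_flag_clear_explicit(&g->lock, memory_order_release);
}

/* Interrupt-level update. The handler must not block, so if a task
   holds the flag the update is refused and the caller has to cope
   (drop the sample, defer it, set a "pending" marker, and so on). */
bool isr_try_update(struct guarded64 *g, uint64_t v)
{
    if (!guarded_try_lock(g))
        return false;
    g->value = v;
    guarded_unlock(g);
    return true;
}
```

The refused update is the case that is easy to miss in testing: it only occurs when the interrupt arrives inside the short window in which a task holds the flag.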

In summary, ldrex/strex /can/ be used to implement high-level,
high-power, high-overhead locking mechanisms such as an RTOS mutex or
queue (though interrupt disabling works there too). It can be used as
an alternative to interrupt locking for a limited (albeit common) subset
of atomic operations, with little more cost in run-time but
significantly greater source-code complexity (and therefore scope for
programmer error).

In general, you want to avoid disabling interrupts for any significant
length of time. But a system that can't cope with them being disabled
for the time taken to do a small atomic access, is broken anyway.

This is a different world from multi-core processors, or systems where
reading from memory might take 200 clock cycles due to cache misses, or
where it might trigger page faults. Then ldrex/strex becomes essential.

>>> LL loads a word as usual, sets a FF in the cache controller to
>>> indicate the line is "linked" and sets a clock counter to some value.
>>> If any other processor reads the linked cache line, the FF is reset.
>>> If an interrupt occurs the FF is reset.
>>> SC stores the word as usual but checks if the FF is still set
>>> and the clock counter is > 0. If both are true the store occurs.
>>>
>>> The memory coherence detector should be present even on a
>>> uniprocessor as
>>> a cache line may be invalidated by DMA. But that is just an address XOR.
>>>
>>
>> Yes.  And all of that is slower than a couple of instructions to disable
>> interrupts.
>
> I keep wondering if you left the DMB memory barrier instructions in?
> If a cpu is just talking between its own interrupts and its own
> non-interrupt levels then it doesn't need memory barriers.
> A cpu's own loads and stores are always consistent with each other.
>


Re: Safepoints

<segs22$n6j$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=19579&group=comp.arch#19579

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.arch
Subject: Re: Safepoints
Date: Thu, 5 Aug 2021 16:19:41 +0200
Organization: A noiseless patient Spider
Lines: 129
Message-ID: <segs22$n6j$1@dont-email.me>
References: <se1s18$sf$1@z-news.wcss.wroc.pl> <se36el$en5$1@dont-email.me>
<se4aae$qma$1@dont-email.me> <sedj2d$b0l$1@dont-email.me>
<seeb8j$f0t$1@dont-email.me> <1JzOI.8$xM2.7@fx22.iad>
<seejpg$ce6$1@dont-email.me> <lhCOI.3073$Fx8.581@fx45.iad>
<sef1gf$dvr$1@dont-email.me> <qdIOI.1006$yW1.813@fx08.iad>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 5 Aug 2021 14:19:46 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="ec91c64638e968ca0c439847a3b21dbc";
logging-data="23763"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+xAnEzpW8bRaj7t9UecBvI88oK5u0kXPQ="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
Thunderbird/78.11.0
Cancel-Lock: sha1:YlBD3h0SxLn9aEhUen/dBn+yC+8=
In-Reply-To: <qdIOI.1006$yW1.813@fx08.iad>
Content-Language: en-GB
 by: David Brown - Thu, 5 Aug 2021 14:19 UTC

On 05/08/2021 04:40, EricP wrote:
> David Brown wrote:
>> On 04/08/2021 21:55, EricP wrote:
>>> Stephen Fuld wrote:
>>>> You have made a bunch of assumptions that may not be true.  If you
>>>> look back to see the kind of environment David was talking about,
>>>> (single CPU microcontrollers), he may not have a full OS, probably
>>>> doesn't have instructions like fetch and add or compare and swap, etc.
>>>>
>>>> If David would give a few more specifics about his world, it might
>>>> show you a whole different world of programming.
>>> Not really. Any microcontroller with interrupts has a disable
>>> capability.
>>> That means a non-interruptible sequence is always possible
>>> and the discussion is about the efficiency on a particular uC.
>>>
>>> One designs the software using HalNonInterruptibleExchange and
>>> HalNonInterruptibleCompareExchange subroutines or macros.
>>
>> No, one does not.
>>
>> You view the interrupt disabling / restoring code (inline functions,
>> macros, or - my preference - RAII in C++ or gcc-extended C) as brackets
>> around your critical code section.  You don't use it to duplicate other
>> mechanisms such as compare-exchange or CAS.
>
> If, as Stephen Fuld's scenario said, the microcontroller doesn't have
> a swap instruction then your philosophy is moot. You have no choice.
>
>>> If those map to single instructions on a uC, great.
>>> If not, then they do the equivalent PUSHF, INTD, LD, ST, POPF sequence.
>>> If an INTD_N instruction is available then it saves a PUSHF and POPF.
>>> But LL/SC saves any disable and is more generally useful even on a uC.
>>>
>>
>> No, it is not more useful - because it does not work in conjunction with
>> interrupts.
>
> It's not clear what you think does not work with interrupts, but if it is
> LL/SC, the interrupt should clear the lock flag and prevent the SC.
>
> The ARMv7-M manual says exceptions do clear the flag:
> section A3.4.4 Context switch support
> "the local monitor is changed to Open Access automatically
> as part of an exception entry or exit"
>
> In ARM parlance Open Access means not exclusive - the load-lock is
> canceled.

Yes, I realise that. This is not primarily designed to make ldrex/strex
work in an interrupt function, but to ensure that you don't have
problems when a task gets interrupted in the middle of a ldrex/strex
sequence, then another task starts a new sequence (claiming the lock for
itself), then the first task resumes again and thinks it has the lock.

>
>> Typical uses for short critical sections might be to make an atomic
>> variable, or to keep some related variables consistent.  These will
>> often be used alongside interrupts - for more "normal" variables, you'd
>> likely be using RTOS synchronisation systems like messaging or mutexes,
>> and know when you can access shared data without any more locking.  And
>> if you try to use LL/SC mechanisms to lock the data your thread shares
>> with an interrupt function, it will work perfectly during all your lab
>> testing and fail at the customer site.
>
> I'm not referring to atomically shared variables,
> just interrupts within a single cpu.
>
> LL/SC is used to build a non interruptible exchange,
> which is used to queue deferred work packets from interrupt level
> to non interrupt kernel level, and for kernel to pop deferred work
> packets from the queue, all without disabling interrupts.
>
> It's how RSX, VMS, and WNT deferred work from interrupts
> without having to constantly disable and enable interrupts,
> and maybe BSD and Linux too.
>

Yes. However, I am talking about locking for much smaller atomic
sequences - avoiding the need for OS-level locking with its much higher
overheads (but more features, of course).
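
The bracket style described above can be sketched in gcc-extended C with the cleanup attribute. This is a host-side model under stated assumptions, not production code: sim_primask stands in for the Cortex-M PRIMASK register, and irq_save_disable, irq_restore, CRITICAL_SECTION and store64_atomic are names made up for this example. On real hardware the two helpers would wrap the mrs/cpsid and msr instructions.

```c
#include <stdint.h>

/* Host-side stand-in for PRIMASK (assumption for illustration only). */
static uint32_t sim_primask;

/* Save the current mask state, then mask interrupts. */
static inline uint32_t irq_save_disable(void)
{
    uint32_t old = sim_primask;
    sim_primask = 1;
    return old;
}

/* Cleanup handler: gcc passes a pointer to the guarded variable. */
static inline void irq_restore(uint32_t *saved)
{
    sim_primask = *saved;
}

/* Opens a critical section that ends when the enclosing scope exits. */
#define CRITICAL_SECTION() \
    uint32_t irq_guard_ __attribute__((cleanup(irq_restore))) = irq_save_disable()

static uint64_t shared64;

/* 64-bit store that cannot be torn by an interrupt on the same core. */
void store64_atomic(uint64_t value)
{
    CRITICAL_SECTION();
    shared64 = value;
}   /* previous mask state is restored here */
```

Because the saved state is restored rather than interrupts being unconditionally re-enabled, the bracket nests safely inside code that already runs with interrupts masked.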

>>
>> (There are also other occasions when you might want such
>> interrupt-disabled critical sections, such as during a watchdog update,
>> that cannot be handled by any kind of software locking mechanism.)
>>
>>> Many uC I have seen had an XCHG [mem],reg instruction but not
>>> compare-exchange as most are accumulator architectures and many
>>> instructions are a non-interruptible R-M-W sequence anyway.
>>>
>>
>> That is the case for a number of old 8-bit CISC designs - not for more
>> modern microcontrollers which are almost always RISC based.  And on
>> these old 8-bit designs, you regularly need to have interrupt disable
>> sequences in order to make atomic access to variables of more than 8
>> bits.
>
> I don't count 32 bits as a microcontroller - those are microprocessors.
> Last I read the 8-bit 8051 is still the most popular microcontroller.
>

The 8051 has been greatly used - to call it "popular" suggests that lots
of people like it, which is a different matter. Like Windows, the main
reason people used the 8051 is because a lot of people used the 8051.
(Okay, I suppose /some/ people like it - I don't.)

ARM Cortex-M cores now dominate in most new microcontroller designs.
And yes, they /are/ microcontrollers - as are dual-core 64-bit PPC
microcontrollers, or 600 MHz Cortex-M7 devices. "Microcontroller" means
you have an integrated device with a cpu core, a range of peripherals,
and you are working primarily with a single binary image that handles
the entire system - OS (if you have one at all) and application code
combined.

There are no hard boundaries or definitions, however. There are devices
that some people use as microcontrollers, others as microprocessor SoC's
combined with general-purpose OS's (Linux being the most common). Some
microcontrollers have more than one program running at the same time
(this is common for Bluetooth and Wifi stacks).

I have been using 32-bit microcontrollers for 25 years - though it is
perhaps only in the last 10 years that 32-bit has been more common than
8-bit or 16-bit in the boards we make. And we still make some boards
with 8-bit and 16-bit microcontrollers for specific requirements.
However, given that 32-bit microcontrollers can be had for $0.50 and are
smaller and lower power than most 8-bit devices, you'd need a really
good reason to actually /choose/ to use an 8051.

Re: Safepoints

<jLTOI.1846$EF2.1067@fx47.iad>

https://www.novabbs.com/devel/article-flat.php?id=19580&group=comp.arch#19580

Path: i2pn2.org!i2pn.org!news.swapon.de!news.uzoreto.com!news-out.netnews.com!news.alt.net!fdc3.netnews.com!peer01.ams1!peer.ams1.xlned.com!news.xlned.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx47.iad.POSTED!not-for-mail
From: ThatWoul...@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: Safepoints
References: <se1s18$sf$1@z-news.wcss.wroc.pl> <se36el$en5$1@dont-email.me> <se4aae$qma$1@dont-email.me> <f1240d75-82d6-468e-bf51-c4e275e38a4an@googlegroups.com> <sedj5r$b0l$2@dont-email.me> <DsAOI.3063$Fx8.71@fx45.iad> <seeoqr$i6d$1@dont-email.me> <gsIOI.3508$Fx8.1428@fx45.iad>
In-Reply-To: <gsIOI.3508$Fx8.1428@fx45.iad>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 33
Message-ID: <jLTOI.1846$EF2.1067@fx47.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Thu, 05 Aug 2021 15:48:31 UTC
Date: Thu, 05 Aug 2021 11:48:16 -0400
X-Received-Bytes: 2124
 by: EricP - Thu, 5 Aug 2021 15:48 UTC

EricP wrote:
> David Brown wrote:
>> On 04/08/2021 19:51, Branimir Maksimovic wrote:
>>> On 2021-08-04, David Brown <david.brown@hesbynett.no> wrote:
>>>> No - you block /all/ interrupts, of /all/ priorities. That's key to
>>>> such an instruction.
>>> Except non maskable interrupts.
>>
>> Yes, but those only occur in a disaster (such as critical hardware
>> fault).
>
> Yeah, well, only if people read the fucking HW manual first.

For the curious and any masochists out there,
after 40 years of NMI abuse and misdesigns
here is what Linux has to do to deal with NMI on x64.

https://0xax.gitbooks.io/linux-insides/content/Interrupts/linux-interrupts-6.html

Intel and AMD are proposing to clean up at least some of the mess.
Linus's thoughts on the proposals below.
https://www.realworldtech.com/forum/?threadid=200812&curpostid=200822

Intel Flexible Return and Event Delivery (FRED) Jun-2021
https://software.intel.com/content/dam/develop/external/us/en/documents-tps/346446-flexible-return-and-event-delivery.pdf

*Warning*: My reader warns that this PDF contains JavaScript *
AMD Supervisor Entry Extensions Feb-2021
https://www.amd.com/system/files/TechDocs/57115.pdf

Re: Safepoints

<seh18s$rvl$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=19581&group=comp.arch#19581

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: sfu...@alumni.cmu.edu.invalid (Stephen Fuld)
Newsgroups: comp.arch
Subject: Re: Safepoints
Date: Thu, 5 Aug 2021 08:48:42 -0700
Organization: A noiseless patient Spider
Lines: 18
Message-ID: <seh18s$rvl$1@dont-email.me>
References: <se1s18$sf$1@z-news.wcss.wroc.pl> <se36el$en5$1@dont-email.me>
<se4aae$qma$1@dont-email.me> <sedj2d$b0l$1@dont-email.me>
<seeb8j$f0t$1@dont-email.me> <1JzOI.8$xM2.7@fx22.iad>
<seejpg$ce6$1@dont-email.me> <seeqsh$1ak$1@dont-email.me>
<a0DOI.93$fI7.10@fx33.iad> <sef0rg$9ps$1@dont-email.me>
<5hHOI.1976$lK.1523@fx41.iad> <segqla$4k9$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Thu, 5 Aug 2021 15:48:44 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="d261e2215741824384b5113e8333cf8f";
logging-data="28661"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/ZhLKjdnoyhonwkKspVXRORQ+/zBA9TG4="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101
Thunderbird/78.12.0
Cancel-Lock: sha1:Q+xmQkPwC8wheC9I5fGYny0fM0g=
In-Reply-To: <segqla$4k9$1@dont-email.me>
Content-Language: en-US
 by: Stephen Fuld - Thu, 5 Aug 2021 15:48 UTC

On 8/5/2021 6:55 AM, David Brown wrote:

snip

> I am not sure I am explaining things well here.

I snipped all the details because I just wanted to thank you for
responding to my request to post more details about your work
environment. I think it has led to, and continues to provide, an
interesting sub-thread.

And BTW, I think you are explaining the differences between yours and a
"typical" big, non-embedded environment quite well.

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Re: Safepoints

<seh2lq$c55$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=19582&group=comp.arch#19582

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.arch
Subject: Re: Safepoints
Date: Thu, 5 Aug 2021 18:12:42 +0200
Organization: A noiseless patient Spider
Lines: 23
Message-ID: <seh2lq$c55$1@dont-email.me>
References: <se1s18$sf$1@z-news.wcss.wroc.pl> <se36el$en5$1@dont-email.me>
<se4aae$qma$1@dont-email.me> <sedj2d$b0l$1@dont-email.me>
<seeb8j$f0t$1@dont-email.me> <1JzOI.8$xM2.7@fx22.iad>
<seejpg$ce6$1@dont-email.me> <seeqsh$1ak$1@dont-email.me>
<a0DOI.93$fI7.10@fx33.iad> <sef0rg$9ps$1@dont-email.me>
<5hHOI.1976$lK.1523@fx41.iad> <segqla$4k9$1@dont-email.me>
<seh18s$rvl$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 5 Aug 2021 16:12:42 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="ec91c64638e968ca0c439847a3b21dbc";
logging-data="12453"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18oTAxhOjs858HkU7Vjben7GLRi69KiPZc="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
Thunderbird/78.11.0
Cancel-Lock: sha1:oA1WczrMFjDKJQJKhrHfinMKZV4=
In-Reply-To: <seh18s$rvl$1@dont-email.me>
Content-Language: en-GB
 by: David Brown - Thu, 5 Aug 2021 16:12 UTC

On 05/08/2021 17:48, Stephen Fuld wrote:
> On 8/5/2021 6:55 AM, David Brown wrote:
>
> snip
>
>> I am not sure I am explaining things well here.
>
> I snipped all the details because I just wanted to thank you for
> responding to my request to post more details about your work
> environment.  I think it has led to, and continues to provide, an
> interesting sub-thread.
>
> And BTW, I think you are explaining the differences between yours and a
> "typical" big, non-embedded environment quite well.
>

I'm on holiday, so I have plenty of time :-)

This thread has inspired me to look into more details of ldrex/strex
myself, and learn a number of new things. It's also shown me some of
the other architectures with ways of disabling interrupts. All in all,
a good thread.

Re: Safepoints

<r_cPI.1442$yW1.495@fx08.iad>

https://www.novabbs.com/devel/article-flat.php?id=19597&group=comp.arch#19597

Path: i2pn2.org!i2pn.org!paganini.bofh.team!news.dns-netz.com!news.freedyn.net!newsfeed.xs4all.nl!newsfeed9.news.xs4all.nl!feeder1.feed.usenet.farm!feed.usenet.farm!peer02.ams4!peer.am4.highwinds-media.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx08.iad.POSTED!not-for-mail
From: ThatWoul...@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: Safepoints
References: <se1s18$sf$1@z-news.wcss.wroc.pl> <se36el$en5$1@dont-email.me> <se4aae$qma$1@dont-email.me> <sedj2d$b0l$1@dont-email.me> <seeb8j$f0t$1@dont-email.me> <1JzOI.8$xM2.7@fx22.iad> <seejpg$ce6$1@dont-email.me> <seeqsh$1ak$1@dont-email.me> <a0DOI.93$fI7.10@fx33.iad> <sef0rg$9ps$1@dont-email.me> <5hHOI.1976$lK.1523@fx41.iad> <segqla$4k9$1@dont-email.me>
In-Reply-To: <segqla$4k9$1@dont-email.me>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 226
Message-ID: <r_cPI.1442$yW1.495@fx08.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Fri, 06 Aug 2021 15:58:15 UTC
Date: Fri, 06 Aug 2021 11:56:27 -0400
X-Received-Bytes: 10993
 by: EricP - Fri, 6 Aug 2021 15:56 UTC

David Brown wrote:
>
> I am not sure I am explaining things well here.

Your explanations are fine but we have been talking about different things.

You have been mostly referring to atomic memory operations between
threads or smp cpus, and I explicitly excluded atomic operations.

I was expanding on Mitch's earlier point about turning asynchronous
events delivered using interrupt semantics into synchronous ones,
which is what the original poster was asking about.

By atomic I mean memory operations between concurrent threads or cpus.
Non interruptible sequence is orthogonal, within a single thread or cpu.
E.g. on x86 ADD [mem],reg is a non interruptible R-M-W sequence but
is not atomic. LOCK ADD [mem],reg is both atomic and non interruptible.

> The assembly for a 32-bit increment (of a variable in memory) in Thumb-2
> will be something along the lines of :
>
> ldr r3, [r2]
> add r3, r3, #1
> str r3, [r2]
>
> Three instructions, with one load and one store. With tightly-coupled
> memory (or on an M0 to M4 microcontroller, on-board ram), loads and
> stores are two cycles. So that is 3 instructions, 5 cycles.
>
> Making this atomic on an M0 microcontroller is :
>
> cpsid i
>
> ldr r3, [r2]
> add r3, r3, #1
> str r3, [r2]
>
> cpsie i
>
> Two extra instructions, at two cycles. (These will generally not be
> needed inside an interrupt function, unless the same data will be
> accessed by different interrupt routines with different priorities.)
>
> If you want a more flexible sequence that saves and restores interrupt
> status, and also need it safe for the dual-issue M7, you can use:
>
> mrs r1, primask
> cpsid i
> dsb
>
> ldr r3, [r2]
> add r3, r3, #1
> str r3, [r2]
>
> msr primask, r1
>
> Four instructions, four cycles overhead for making the sequence atomic.

Yes, most OS programming standards require interrupt state
be saved and restored so the code is reentrant.

> The equivalent for this using ldrex/strex is indeed short and fast:
>
> loop:
> ldrex r3, [r2]
> add r3, r3, #1
> strex r1, r3, [r2]
> cmp r1, #0
> bne loop
> dsb

I looked at the ARMv7 manual again and I still think,
_for code that is within a single thread or cpu as per
the original question about asynchronous signals_
that the DSB is unnecessary in this particular case.
A cpu will always see its own loads and stores in program order.
As DSB stalls the cpu until all prior memory accesses and branches have
completed it would impose an extra overhead on the sequence.
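
For comparison, the same increment can be written as a C11 atomic with relaxed ordering, which asks the compiler for atomicity only and no ordering fence. This is a sketch, and the names counter and counter_increment are made up for the example. My understanding is that GCC and Clang targeting ARMv7-M typically expand this to an ldrex/add/strex retry loop with no barrier, but check the generated assembly for your own toolchain rather than taking that on trust.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Counter shared between thread-level code and an interrupt handler
   running on the same core. */
static _Atomic uint32_t counter;

/* Relaxed increment: the read-modify-write is indivisible, but no
   ordering against other memory accesses is requested.
   Returns the value held before the increment. */
static inline uint32_t counter_increment(void)
{
    return atomic_fetch_add_explicit(&counter, 1u, memory_order_relaxed);
}
```

If the increment also has to publish other data to another observer, a stronger memory order would be needed - that is the case this sketch deliberately leaves out.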

> The ldrex and strex instructions don't take any longer than their
> non-locking equivalents. And this is safe to use in interrupts. In
> real-time work, it is vital to track worst-case execution time, not
> best-case or common-case. So though the best case might be 3 cycles
> overhead, an extra round of the loop might be another 9 cycles. (Single
> extra rounds will happen occasionally - multiple extra rounds should not
> be realistically possible.)
>
> So it seems to be a viable alternative to disabling interrupts, with
> approximately the same overhead. However, it has two major
> disadvantages to go with its obvious advantage of not blocking interrupts.
>
> It will only work for sequences ending in a single write of 32 bits or
> less, and it will only work for restartable sequences.

Yes, this is a limitation of LL/SC - it cannot do double wide operations.

A few years ago I proposed an enhancement here which
allows LL/SC to multiple locations within a single cache line.

LL  - load locked; retains the lock if the load is to the same cache line
SCH - store conditional and hold; stores and retains the lock
SCR - store conditional and release; stores and releases the lock
CLL - clear lock

The extra cost in the cache controller is holding the first line updates
separate until the release occurs in case it needs to roll back.
For the modified line it can either use a separate cache line buffer,
or if L2 is inclusive it can update L1 and either keep or toss that copy.

> Suppose, instead, that the atomic operation you want is not a simple
> increment of a 32-bit value, but is storing a 64-bit value. With the
> interrupt disable strategy, you still have exactly the same 4
> instruction, 4 cycle overhead (or two instructions in the simplest
> version), for both read and write routines. How do you do this with
> ldrex/strex ?

Yes unfortunately in general you don't.

However there are specific situations where it can be done.
E.g. reading a 64-bit clock on a 32-bit cpu: since the clock is always
increasing, one can read high1, read low, read high2, then compare high1
and high2 and retry if they differ.

Another example is updating a 64-bit PTE on a 32-bit system without using
spinlocks and never leave the intermediate PTE in an illegal state.
One does the reads and writes setting and clearing bits
in a particular order.
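
The 64-bit clock read mentioned above can be sketched in C as follows. The function name read_clock64 and the idea of passing pointers to the two 32-bit halves are assumptions for illustration; on a real part these would be the addresses of the timer's high and low registers.

```c
#include <stdint.h>

/* Read a monotonically increasing 64-bit counter that is exposed as two
   32-bit halves, without locking or disabling interrupts. */
uint64_t read_clock64(const volatile uint32_t *high,
                      const volatile uint32_t *low)
{
    uint32_t high1, lo, high2;

    do {
        high1 = *high;
        lo    = *low;
        high2 = *high;
        /* If the high half changed, the low half wrapped somewhere
           between the reads, so the pair may be inconsistent: retry. */
    } while (high1 != high2);

    return ((uint64_t)high1 << 32) | lo;
}
```

The loop relies on the counter only ever increasing and on the low half taking far longer to wrap than the three reads take to execute.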

> You can't use the same setup as earlier. Suppose task A takes the
> exclusive monitor lock, writes the first half of the 64-bit item, then
> there is an interrupt. Task B wants to read the value - it takes the
> lock, reads the 64-bit value, and releases it, thinking all is well.
> When task A resumes, its write to the second half using strex fails, and
> it restarts the write. In the meantime, task B is left with a corrupted
> read that it thinks is valid. This is typically a very low probability
> event - you never see it in testing, but it /will/ happen when you have
> deployed thousands of systems at customers.
>
> So you now use ldrex/strex to control access to an independent lock flag
> (a simple semaphore). That works for tasks, but the overhead is now
> much bigger, and there is the possibility of failure - if another task
> has the semaphore, the current one must cede control to it. And that
> means blocking the task, changing dynamic priority levels so that the
> other task can run, etc. - a full-blown RTOS mutex solution. These are
> very powerful and useful locking mechanisms, but a /vast/ overhead
> compared to the four instructions for interrupt disabling, and the
> single instruction and 3 cycles needed for the 64-bit load or store.

Right, this is all the atomic stuff which I am avoiding by narrowly
constraining the problem to e.g. signals within a single thread.

> And what happens in interrupts? An interrupt function must not block
> (though it can trigger a scheduler run when it exits). If a task has
> the lock when the interrupt is triggered, it will not be able to do its job.

By design this is illegal in any OS I know of
and will panic (crash/halt/bugcheck) the system.

Mutexes coordinate threads, spinlocks coordinate cpu kernels.
Spinlocks coordinate interrupts at the same interrupt priority and must
never be shared across interrupt priority levels as they _WILL_ deadlock.
A spinlock initialized for, say, IPL 3 should panic the OS on
attempts to acquire or release it at any other IPL.
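
A minimal sketch of that IPL check, assuming C11 atomics. The type ipl_spinlock, the functions and demo_lock are hypothetical names, and the functions return false where a real kernel would bugcheck. A real implementation would also raise the cpu to the lock's IPL before spinning, which is left out here.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    atomic_flag busy;       /* set while some cpu holds the lock */
    int         owner_ipl;  /* the only IPL at which the lock may be used */
} ipl_spinlock;

/* Example lock initialized for IPL 3. */
static ipl_spinlock demo_lock = { .busy = ATOMIC_FLAG_INIT, .owner_ipl = 3 };

/* Acquire at current_ipl; returns false (a panic in a real kernel)
   if the caller is at the wrong IPL. */
static bool ipl_spin_acquire(ipl_spinlock *lock, int current_ipl)
{
    if (current_ipl != lock->owner_ipl)
        return false;
    while (atomic_flag_test_and_set_explicit(&lock->busy,
                                             memory_order_acquire))
        ;   /* spin: the holder is running on another cpu */
    return true;
}

/* Release at current_ipl, with the same IPL check. */
static bool ipl_spin_release(ipl_spinlock *lock, int current_ipl)
{
    if (current_ipl != lock->owner_ipl)
        return false;
    atomic_flag_clear_explicit(&lock->busy, memory_order_release);
    return true;
}
```

The check turns a latent cross-IPL deadlock into an immediate, reproducible failure at the offending call site.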

> In summary, ldrex/strex /can/ be used to implement high-level,
> high-power, high-overhead locking mechanisms such as an RTOS mutex or
> queue (though interrupt disabling works there too). It can be used as
> an alternative to interrupt locking for a limited (albeit common) subset
> of atomic operations, with little more cost in run-time but
> significantly greater source-code complexity (and therefore scope for
> programmer error).

Yes.

> In general, you want to avoid disabling interrupts for any significant
> length of time. But a system that can't cope with them being disabled
> for the time taken to do a small atomic access, is broken anyway.
>
>
> This is a different world from multi-core processors, or systems where
> reading from memory might take 200 clock cycles due to cache misses, or
> where it might trigger page faults. Then ldrex/strex becomes essential.

I'm thinking that exchange, compare-exchange may be essential too.
Too many algorithms need a non interruptible pointer swap.

Plus 32-bit cpus with 64-bit PTE's need 64-bit LD, ST, Xchg, CmpXchg
even if restricted to 8 byte alignment.

I'm warming to your interrupt-disable-for-N instruction.
One can avoid any denial of service problems by simply ignoring
further disable_N requests while current N is still counting down
and limiting N to, say, less than 8.
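
As a behavioural model only (no such instruction exists in this form that I know of), the rule could look like the C sketch below. The names intd_request, intd_tick and INTD_MAX_N are made up, and clamping an oversized N to the maximum rather than rejecting it is my assumption.

```c
#include <stdbool.h>

#define INTD_MAX_N 7u   /* "less than 8" */

/* Instructions left in the current interrupt-disabled window. */
static unsigned intd_remaining;

/* Model of executing disable_N. A request made while a window is still
   counting down is ignored, so code cannot keep extending the window. */
bool intd_request(unsigned n)
{
    if (intd_remaining > 0)
        return false;
    intd_remaining = (n > INTD_MAX_N) ? INTD_MAX_N : n;
    return true;
}

/* Called once for each instruction executed after the request. */
void intd_tick(void)
{
    if (intd_remaining > 0)
        intd_remaining--;
}

/* Interrupts are deliverable again once the window has expired. */
bool intd_interrupts_enabled(void)
{
    return intd_remaining == 0;
}
```

With these two rules the longest possible interrupt latency added by the mechanism is bounded by INTD_MAX_N instructions, whatever the software does.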

>>>> LL loads a word as usual, sets a FF in the cache controller to
>>>> indicate the line is "linked" and sets a clock counter to some value.
>>>> If any other processor reads the linked cache line, the FF is reset.
>>>> If an interrupt occurs the FF is reset.
>>>> SC stores the word as usual but checks if the FF is still set
>>>> and the clock counter is > 0. If both are true the store occurs.
>>>>
>>>> The memory coherence detector should be present even on a
>>>> uniprocessor as
>>>> a cache line may be invalidated by DMA. But that is just an address XOR.
>>>>
>>> Yes. And all of that is slower than a couple of instructions to disable
>>> interrupts.
>> I keep wondering if you left the DMB memory barrier instructions in?
>> If a cpu is just talking between its own interrupts and its own
>> non-interrupt levels then it doesn't need memory barriers.
>> A cpu's loads and stores are always self-consistent.
>>
>
> The Cortex-M7 has a dual-issue core, and there are pipelines involved.
> That means instructions following the interrupt disable could be started
> or slightly re-organised, and you need to avoid that. A DSB is cheap in
> these devices, and recommended by ARM in connection with interrupt
> disables or ldrex/strex sequences. (The M0 core is simpler, and does
> not need it.)


Re: Safepoints

<ktdPI.2401$NyB2.2194@fx17.iad>

https://www.novabbs.com/devel/article-flat.php?id=19600&group=comp.arch#19600

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!4.us.feeder.erje.net!2.eu.feeder.erje.net!feeder.erje.net!feeder1.feed.usenet.farm!feed.usenet.farm!newsfeed.xs4all.nl!newsfeed8.news.xs4all.nl!fdc2.netnews.com!news-out.netnews.com!news.alt.net!fdc3.netnews.com!peer03.ams1!peer.ams1.xlned.com!news.xlned.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx17.iad.POSTED!not-for-mail
From: ThatWoul...@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: Safepoints
References: <se1s18$sf$1@z-news.wcss.wroc.pl> <se36el$en5$1@dont-email.me> <se4aae$qma$1@dont-email.me> <sedj2d$b0l$1@dont-email.me> <seeb8j$f0t$1@dont-email.me> <1JzOI.8$xM2.7@fx22.iad> <seejpg$ce6$1@dont-email.me> <lhCOI.3073$Fx8.581@fx45.iad> <sef1gf$dvr$1@dont-email.me> <qdIOI.1006$yW1.813@fx08.iad> <segs22$n6j$1@dont-email.me>
In-Reply-To: <segs22$n6j$1@dont-email.me>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 42
Message-ID: <ktdPI.2401$NyB2.2194@fx17.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Fri, 06 Aug 2021 16:31:12 UTC
Date: Fri, 06 Aug 2021 12:30:30 -0400
X-Received-Bytes: 2911
 by: EricP - Fri, 6 Aug 2021 16:30 UTC

David Brown wrote:
> On 05/08/2021 04:40, EricP wrote:
>> David Brown wrote:
>>> On 04/08/2021 21:55, EricP wrote:
>>
>>>> If those map to single instructions on a uC, great.
>>>> If not, then they do the equivalent PUSHF, INTD, LD, ST, POPF sequence.
>>>> If an INTD_N instruction is available then it saves a PUSHF and POPF.
>>>> But LL/SC saves any disable and is more generally useful even on a uC.
>>>>
>>> No, it is not more useful - because it does not work in conjunction with
>>> interrupts.
>> It's not clear what you think does not work with interrupts but if it is
>> LL/SC the interrupt should clear the lock flag and prevent the SC.
>>
>> The ARMv7-M manual says exceptions do clear the flag:
>> section A3.4.4 Context switch support
>> "the local monitor is changed to Open Access automatically
>> as part of an exception entry or exit"
>>
>> In ARM parlance Open Access means not exclusive - the load-lock is
>> canceled.
>
> Yes, I realise that. This is not primarily designed to make ldrex/strex
> work in an interrupt function, but to ensure that you don't have
> problems when a task gets interrupted in the middle of a ldrex/strex
> sequence, then another task starts a new sequence (claiming the lock for
> itself), then the first task resumes again and thinks it has the lock.

I don't think LL/SC can function correctly if it does not cancel the
lock on interrupt. Alpha documents this with the LL and SC instructions.

It is unfortunate that ARM documents the lock-cancel-on-interrupt separately
(and I might add, making it almost impossible to find unless you
pretty much know what to look for - that section took 1/2 hour to find)
as this might give the impression that it is a secondary side effect
rather than critical to the correct functioning.

So lock-cancel-on-interrupt should not be viewed as a side effect
but as a basic feature that you may use to your advantage.
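
To make the point concrete, here is a toy single-cpu model in C of the link flag behaviour. It is a sketch of the semantics only, not of any real hardware interface: model_ll, model_sc, model_interrupt_entry and shared_word are invented names, and unlike ARM's strex (which writes 0 to its status register on success) model_sc returns true on success.

```c
#include <stdbool.h>
#include <stdint.h>

static bool     link_flag;    /* models the local monitor / lock flag */
static uint32_t shared_word;  /* data shared with an interrupt handler */

/* Load-locked: read the word and set the link. */
uint32_t model_ll(const uint32_t *addr)
{
    link_flag = true;
    return *addr;
}

/* Store-conditional: store only if the link survived, then clear it. */
bool model_sc(uint32_t *addr, uint32_t value)
{
    bool ok = link_flag;
    if (ok)
        *addr = value;
    link_flag = false;
    return ok;
}

/* What exception entry does to the link. */
void model_interrupt_entry(void)
{
    link_flag = false;
}
```

An interrupt between model_ll and model_sc makes the store fail and leaves the word untouched, so the interrupted code simply retries. That is what lets the same sequence be used for data shared with the handler.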

Re: Safepoints

<58616467-a1f9-4be8-b4af-3dbc1b657d16n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=19603&group=comp.arch#19603

X-Received: by 2002:ac8:5e46:: with SMTP id i6mr9945921qtx.326.1628269348015;
Fri, 06 Aug 2021 10:02:28 -0700 (PDT)
X-Received: by 2002:aca:59c6:: with SMTP id n189mr3008323oib.44.1628269347606;
Fri, 06 Aug 2021 10:02:27 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.goja.nl.eu.org!3.eu.feeder.erje.net!feeder.erje.net!fdn.fr!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Fri, 6 Aug 2021 10:02:27 -0700 (PDT)
In-Reply-To: <r_cPI.1442$yW1.495@fx08.iad>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:ece2:6007:88e4:f63b;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:ece2:6007:88e4:f63b
References: <se1s18$sf$1@z-news.wcss.wroc.pl> <se36el$en5$1@dont-email.me>
<se4aae$qma$1@dont-email.me> <sedj2d$b0l$1@dont-email.me> <seeb8j$f0t$1@dont-email.me>
<1JzOI.8$xM2.7@fx22.iad> <seejpg$ce6$1@dont-email.me> <seeqsh$1ak$1@dont-email.me>
<a0DOI.93$fI7.10@fx33.iad> <sef0rg$9ps$1@dont-email.me> <5hHOI.1976$lK.1523@fx41.iad>
<segqla$4k9$1@dont-email.me> <r_cPI.1442$yW1.495@fx08.iad>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <58616467-a1f9-4be8-b4af-3dbc1b657d16n@googlegroups.com>
Subject: Re: Safepoints
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Fri, 06 Aug 2021 17:02:28 +0000
Content-Type: text/plain; charset="UTF-8"
 by: MitchAlsup - Fri, 6 Aug 2021 17:02 UTC

On Friday, August 6, 2021 at 10:58:19 AM UTC-5, EricP wrote:
> David Brown wrote:
> >
> > I am not sure I am explaining things well here.
> Your explanations are fine but we have been talking about different things.
>
> You have been mostly referring to atomic memory operations between
> threads or smp cpus, and I explicitly excluded atomic operations.
>
> I was expanding on Mitch's earlier point about turning asynchronous
> events delivered using interrupt semantics into synchronous ones,
> which is what the original poster was asking about.
>
> By atomic I mean memory operations between concurrent threads or cpus.
> Non interruptible sequence is orthogonal, within a single thread or cpu.
> E.g. on x86 ADD [mem],reg is a non interruptible R-M-W sequence but
> is not atomic. LOCK ADD [mem],reg is both atomic and non interruptible.
> > The assembly for a 32-bit increment (of a variable in memory) in Thumb-2
> > will be something along the lines of :
> >
> > ldr r3, [r2]
> > add r3, r3, #1
> > str r3, [r2]
> >
> > Three instructions, with one load and one store. With tightly-coupled
> > memory (or on an M0 to M4 microcontroller, on-board ram), loads and
> > stores are two cycles. So that is 3 instructions, 5 cycles.
> >
> > Making this atomic on an M0 microcontroller is :
> >
> > cpsid i
> >
> > ldr r3, [r2]
> > add r3, r3, #1
> > str r3, [r2]
> >
> > cpsie i
> >
> > Two extra instructions, at two cycles. (These will generally not be
> > needed inside an interrupt function, unless the same data will be
> > accessed by different interrupt routines with different priorities.)
> >
> > If you want a more flexible sequence that saves and restores interrupt
> > status, and also need it safe for the dual-issue M7, you can use:
> >
> > mrs r1, primask
> > cpsid i
> > dsb
> >
> > ldr r3, [r2]
> > add r3, r3, #1
> > str r3, [r2]
> >
> > msr primask, r1
> >
> > Four instructions, four cycles overhead for making the sequence atomic.
> Yes, most OS programming standards require interrupt state
> be saved and restored so the code is reentrant.
> > The equivalent for this using ldrex/strex is indeed short and fast:
> >
> > loop:
> > ldrex r3, [r2]
> > add r3, r3, #1
> > strex r1, r3, [r2]
> > cmp r1, #0
> > bne loop
> > dsb
> I looked at the ARMv7 manual again and I still think,
> _for code that is within a single thread or cpu as per
> the original question about asynchronous signals_
> that the DSB is unnecessary in this particular case.
> A cpu will always see its own loads and stores in program order.
> As DSB stalls the cpu until all prior memory accesses and branches have
> completed it would impose an extra overhead on the sequence.
> > The ldrex and strex instructions don't take any longer than their
> > non-locking equivalents. And this is safe to use in interrupts. In
> > real-time work, it is vital to track worst-case execution time, not
> > best-case or common-case. So though the best case might be 3 cycles
> > overhead, an extra round of the loop might be another 9 cycles. (Single
> > extra rounds will happen occasionally - multiple extra rounds should not
> > be realistically possible.)
> >
> > So it seems to be a viable alternative to disabling interrupts, with
> > approximately the same overhead. However, it has two major
> > disadvantages to go with its obvious advantage of not blocking interrupts.
> >
> > It will only work for sequences ending in a single write of 32 bits or
> > less, and it will only work for restartable sequences.
> Yes, this is a limitation of LL/SC - it cannot do double wide operations.
>
> A few years ago I proposed an enhancement here which
> allows LL/SC to multiple locations within a single cache line.
>
> LL  - load locked; retains the lock if the load is to the same cache line
> SCH - store conditional and hold; stores and retains the lock
> SCR - store conditional and release; stores and releases the lock
> CLL - clear lock
>
> The extra cost in the cache controller is holding the first line updates
> separate until the release occurs in case it needs to roll back.
> For the modified line it can either use a separate cache line buffer,
> or if L2 is inclusive it can update L1 and either keep or toss that copy.
<
And then there is ESM (Exotic Synchronization Method) where one
can LL up to 8 cache lines and perform a number of LDs and STs
between these cache lines (and other places) to effect what smells
like an ATOMIC event. Here one can move an Item from one place in
a concurrent data structure to another in a single ATOMIC event
(It requires 6 cache lines to participate in the event.)
<
CAS, CDAS, CDASD, are all writeable subsets of ESM
<
One can apply timestamps and thread numbers to the pointers to
make figuring out when and where things went wrong straightforward.
<
> > Suppose, instead, that the atomic operation you want is not a simple
> > increment of a 32-bit value, but is storing a 64-bit value. With the
> > interrupt disable strategy, you still have exactly the same 4
> > instruction, 4 cycle overhead (or two instructions in the simplest
> > version), for both read and write routines. How do you do this with
> > ldrex/strex ?
<
Suppose you want to alter 3 forward pointing pointers and 3 backward
pointing pointers in a single ATOMIC event ?
<
> Yes unfortunately in general you don't.
>
> However there are specific situations where it can be done.
> E.g. reading a 64-bit clock on a 32-bit cpu: since the clock is always
> increasing, one can read high1, read low, read high2, then compare high1
> and high2 and retry if they differ.
>
> Another example is updating a 64-bit PTE on a 32-bit system without using
> spinlocks and never leave the intermediate PTE in an illegal state.
> One does the reads and writes setting and clearing bits
> in a particular order.
> > You can't use the same setup as earlier. Suppose task A takes the
> > exclusive monitor lock, writes the first half of the 64-bit item, then
> > there is an interrupt. Task B wants to read the value - it takes the
> > lock, reads the 64-bit value, and releases it, thinking all is well.
> > When task A resumes, its write to the second half using strex fails, and
> > it restarts the write. In the meantime, task B is left with a corrupted
> > read that it thinks is valid. This is typically a very low probability
> > event - you never see it in testing, but it /will/ happen when you have
> > deployed thousands of systems at customers.
> >
> > So you now use ldrex/strex to control access to an independent lock flag
> > (a simple semaphore). That works for tasks, but the overhead is now
> > much bigger, and there is the possibility of failure - if another task
> > has the semaphore, the current one must cede control to it. And that
> > means blocking the task, changing dynamic priority levels so that the
> > other task can run, etc. - a full-blown RTOS mutex solution. These are
> > very powerful and useful locking mechanisms, but a /vast/ overhead
> > compared to the four instructions for interrupt disabling, and the
> > single instruction and 3 cycles needed for the 64-bit load or store.
> Right, this is all the atomic stuff which I am avoiding by narrowly
> constraining the problem to e.g. signals within a single thread.
> > And what happens in interrupts? An interrupt function must not block
> > (though it can trigger a scheduler run when it exits). If a task has
> > the lock when the interrupt is triggered, it will not be able to do its job.
> By design this is illegal in any OS I know of
> and will panic (crash/halt/bugcheck) the system.
>
> Mutexes coordinate threads, spinlocks coordinate cpu kernels.
> Spinlocks coordinate interrupts at the same interrupt priority and must
> never be shared across interrupt priority levels as they _WILL_ deadlock.
> A spinlock initialized for, say, IPL 3 should panic the OS on
> attempts to acquire or release it at any other IPL.
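The IPL check described above can be sketched like this. All names are hypothetical; current_ipl stands in for however the kernel reads the CPU's priority level, and returning false stands in for the panic.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical: the interrupt priority level this CPU is running at. */
int current_ipl = 0;

struct spinlock {
    atomic_flag held;
    int ipl;                /* the only IPL allowed to touch this lock */
};

void spin_init(struct spinlock *l, int ipl)
{
    atomic_flag_clear(&l->held);
    l->ipl = ipl;
}

/* Returns false where a real kernel would bugcheck. */
bool spin_acquire(struct spinlock *l)
{
    if (current_ipl != l->ipl)
        return false;       /* wrong IPL: this is the deadlock-in-waiting */
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ;                   /* spin until the holder releases */
    return true;
}

bool spin_release(struct spinlock *l)
{
    if (current_ipl != l->ipl)
        return false;
    atomic_flag_clear_explicit(&l->held, memory_order_release);
    return true;
}
```

In a real kernel the acquire path would normally raise the CPU to the lock's IPL itself; the check here only shows the "one lock, one IPL" rule.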
> > In summary, ldrex/strex /can/ be used to implement high-level,
> > high-power, high-overhead locking mechanisms such as an RTOS mutex or
> > queue (though interrupt disabling works there too). It can be used as
> > an alternative to interrupt locking for a limited (albeit common) subset
> > of atomic operations, with little more cost in run-time but
> > significantly greater source-code complexity (and therefore scope for
> > programmer error).
> Yes.
> > In general, you want to avoid disabling interrupts for any significant
> > length of time. But a system that can't cope with them being disabled
> > for the time taken to do a small atomic access, is broken anyway.
> >
> >
> > This is a different world from multi-core processors, or systems where
> > reading from memory might take 200 clock cycles due to cache misses, or
> > where it might trigger page faults. Then ldrex/strex becomes essential.
> I'm thinking that exchange and compare-exchange may be essential too.
> Too many algorithms need a non-interruptible pointer swap.
>
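The kind of pointer swap meant here, sketched with C11 atomics. The list and its names are made up for illustration: push() uses compare-exchange, take_all() uses plain exchange.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct node {
    struct node *next;
    int value;
};

/* Hypothetical shared list head, e.g. filled by interrupt handlers. */
_Atomic(struct node *) list_head = NULL;

/* Link n in front of the current head. If something else changed the
   head in between, the compare-exchange fails, reloads 'old', and the
   loop relinks and tries again. */
void push(struct node *n)
{
    struct node *old = atomic_load_explicit(&list_head, memory_order_relaxed);
    do {
        n->next = old;
    } while (!atomic_compare_exchange_weak_explicit(
                 &list_head, &old, n,
                 memory_order_release, memory_order_relaxed));
}

/* Detach the whole list in one exchange and hand it to the caller. */
struct node *take_all(void)
{
    return atomic_exchange_explicit(&list_head, NULL, memory_order_acquire);
}
```

Push plus take-all sidesteps the ABA problem that a one-at-a-time pop would have with a bare compare-exchange.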
> Plus 32-bit cpus with 64-bit PTEs need 64-bit LD, ST, Xchg, CmpXchg
> even if restricted to 8 byte alignment.
>
> I'm warming to your interrupt-disable-for-N instruction.
> One can avoid any denial of service problems by simply ignoring
> further disable_N requests while current N is still counting down
> and limiting N to, say, less than 8.
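A toy model of that disable-for-N rule, to make the two restrictions concrete. Everything here is an assumption made for the sketch: the names, clamping an oversized N to 7 rather than rejecting it, and not counting the request instruction itself.

```c
#include <assert.h>
#include <stdbool.h>

#define DISABLE_MAX 7u      /* "limiting N to, say, less than 8" */

struct cpu_model {
    unsigned countdown;     /* instructions left with interrupts masked */
};

/* Request masking for the next n instructions. The request is ignored
   while an earlier one is still counting down, so back-to-back
   requests cannot stretch the masked window. */
bool disable_n(struct cpu_model *cpu, unsigned n)
{
    if (cpu->countdown > 0 || n == 0)
        return false;
    cpu->countdown = n > DISABLE_MAX ? DISABLE_MAX : n;
    return true;
}

/* Retire one instruction. Returns true if an interrupt could have been
   taken at this instruction boundary. */
bool step(struct cpu_model *cpu)
{
    if (cpu->countdown > 0) {
        cpu->countdown--;
        return false;
    }
    return true;
}
```

With these rules the worst-case added interrupt latency is bounded by DISABLE_MAX instructions, whatever the code does.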
> >>>> LL loads a word as usual, sets a FF in the cache controller to
> >>>> indicate the line is "linked" and sets a clock counter to some value.
> >>>> If any other processor reads the linked cache line, the FF is reset.
> >>>> If an interrupt occurs the FF is reset.
> >>>> SC stores the word as usual but checks if the FF is still set
> >>>> and the clock counter is > 0. If both are true the store occurs.
> >>>>
> >>>> The memory coherence detector should be present even on a
> >>>> uniprocessor, as a cache line may be invalidated by DMA.
> >>>> But that is just an address XOR.
> >>>>
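The LL/SC description quoted above, written out as a small software model. It is not any particular core's implementation; the names and the window length are invented, and monitor_break() stands for either event that resets the FF (another agent touching the line, or an interrupt).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LINK_WINDOW 16u     /* assumed initial value of the clock counter */

struct link_monitor {
    bool linked;            /* the "FF" in the cache controller */
    unsigned timer;         /* clock counter, counts down to 0 */
};

/* LL: ordinary load, plus set the FF and start the counter. */
uint32_t load_linked(struct link_monitor *m, const uint32_t *addr)
{
    m->linked = true;
    m->timer = LINK_WINDOW;
    return *addr;
}

/* One clock tick. */
void monitor_tick(struct link_monitor *m)
{
    if (m->timer > 0)
        m->timer--;
}

/* Another agent accessed the linked line, or an interrupt was taken. */
void monitor_break(struct link_monitor *m)
{
    m->linked = false;
}

/* SC: the store happens only if the FF is still set and the counter
   has not run out. The link is consumed either way. */
bool store_conditional(struct link_monitor *m, uint32_t *addr, uint32_t val)
{
    bool ok = m->linked && m->timer > 0;
    m->linked = false;
    if (ok)
        *addr = val;
    return ok;
}
```

An atomic increment is then the usual loop: LL, add one, SC, repeat until the SC reports success.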
> >>> Yes. And all of that is slower than a couple of instructions to disable
> >>> interrupts.
> >> I keep wondering if you left the DMB memory barrier instructions in?
> >> If a cpu is just talking between its own interrupts and its own
> >> non-interrupt levels then it doesn't need memory barriers.
> >> A CPU's loads and stores are always consistent with itself.
> >>
> >
> > The Cortex-M7 has a dual-issue core, and there are pipelines involved.
> > That means instructions following the interrupt disable could be started
> > or slightly re-organised, and you need to avoid that. A DSB is cheap in
> > these devices, and recommended by ARM in connection with interrupt
> > disables or ldrex/strex sequences. (The M0 core is simpler, and does
> > not need it.)
> For non-atomic, i.e. within a single thread between its normal and
> its own signal level, or between a single cpu and its interrupts,
> the data dependencies themselves look to me to be sufficient
> ordering which is why I thought the DSB superfluous.

