Message-ID:

There are bugs and then there are bugs. And then there are bugs. -- Karl Lehenbauer

Re: Safepoints

<54f5dd93-7082-4735-b222-8e9404eda380n@googlegroups.com>

https://www.novabbs.com/computers/article-flat.php?id=19758&group=comp.arch#19758

X-Received: by 2002:ac8:5744:: with SMTP id 4mr5886348qtx.326.1628801877371;
Thu, 12 Aug 2021 13:57:57 -0700 (PDT)
X-Received: by 2002:a9d:7517:: with SMTP id r23mr4815240otk.182.1628801877092;
Thu, 12 Aug 2021 13:57:57 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.snarked.org!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Thu, 12 Aug 2021 13:57:56 -0700 (PDT)
In-Reply-To: <a3f971e5-4db6-4427-b231-03fdc78ebc95n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=104.59.204.55; posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 104.59.204.55
References: <se1s18$sf$1@z-news.wcss.wroc.pl> <se36el$en5$1@dont-email.me>
<se4aae$qma$1@dont-email.me> <sedj2d$b0l$1@dont-email.me> <seeb8j$f0t$1@dont-email.me>
<1JzOI.8$xM2.7@fx22.iad> <seejpg$ce6$1@dont-email.me> <seeqsh$1ak$1@dont-email.me>
<a0DOI.93$fI7.10@fx33.iad> <sef0rg$9ps$1@dont-email.me> <5hHOI.1976$lK.1523@fx41.iad>
<segqla$4k9$1@dont-email.me> <r_cPI.1442$yW1.495@fx08.iad>
<9b51ff18-801c-482e-b883-a921aa5f014dn@googlegroups.com> <30b55b24-ba70-4da4-8bc9-a37d9503df82n@googlegroups.com>
<35a020f8-6228-4b68-a39c-d89e8475bb90n@googlegroups.com> <7f7207de-30cf-4f82-845c-44cdfb2853f3n@googlegroups.com>
<a3f971e5-4db6-4427-b231-03fdc78ebc95n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <54f5dd93-7082-4735-b222-8e9404eda380n@googlegroups.com>
Subject: Re: Safepoints
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Thu, 12 Aug 2021 20:57:57 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Lines: 214

by: MitchAlsup - Thu, 12 Aug 2021 20:57 UTC

On Thursday, August 12, 2021 at 1:26:26 PM UTC-5, Paul A. Clayton wrote:
> On Tuesday, August 10, 2021 at 2:53:56 PM UTC-4, MitchAlsup wrote:
> >On Tuesday, August 10, 2021 at 12:58:10 PM UTC-5, Paul A. Clayton wrote:
> >> On Saturday, August 7, 2021 at 5:39:18 PM UTC-4, MitchAlsup wrote:
> >>> On Saturday, August 7, 2021 at 4:10:33 PM UTC-5, Paul A. Clayton wrote:
> [snip]
> >> I am almost surprised, Mitch, that you did not use a
> >> PRED-like instruction modifier to extend the semantics of
> >> memory access instructions in its shadow.☺ (In addition to
> >> allowing any load/store instruction to have the extra
> >> annotation applied independent of ordinary instruction
> >> encoding constraints, the encoding freedom might facilitate
> >> more extensive annotations, e.g., by adding an optional
> >> 32-bit or 64-bit "immediate" to the LOCK instruction
> >> modifier.)
> >
> > To your credit, you are the first person to have noted this.
> >
> > But, ESM grew out of ASF and ASF had no predicates, indeed, when
> > I did the ESM stuff (circa 2007) I did not have a PRED in my ISA.
> >
> > You have a sharp thinking cap !!
> Thank you for the compliment. (I plan to add it to my list.)
>
> As is often the case, the association was caught by
> coincidence of mental bias (reusing mechanisms and coherent
> design) and circumstance (ASF's use of x86's LOCK prefix was
> in mind, so a LOCK instruction modifier is not a large
> inventive step when applied to an ISA that uses such modifiers
> as an extension mechanism for special cases). The role of
> mental bias is one reason why mental diversity is useful in
> teams. While one cannot program serendipity, environmental
> diversity (i.e., do not just focus on one problem) seems to help.
>
> (I do wish that more of my gifts were used. While noticing
> inconsistencies or opportunities for greater coherence is
> useful in working as a library page [now in two counties
> #&☠%~⛈⛤🗡🔧 part-time minimum-wage jobs!], I suspect I
> would be more useful in a research role. Sadly, for me and
> perhaps for humanity [in that less effective use of resources
> hurts humanity], I am not fit for most conventional roles. I
> have been able to provide a little edutainment on the
> Internet with my gifts.)
> > In retrospect, 8 instructions is not enough to perform some of the
> > things ESM is capable of doing.
>
> >> I was thinking of a similar "dummy" SC to replace CLL (clear
> >> lock), inspired by the dummy stores used to implement
> >> memory-based conditional move (where the register CMOV/
> >> SELECT instruction sets up/retains the store address or
> >> retains/sets up a dummy address). The main problem would be
> >> forcing a failure so that the previous stores do not become
> >> visible. For a single cache block reservation this would be
> >> trivial, but extending the reservation to an arbitrary set
> >
> > Arbitrary: yes, but a fixed (smallish) number of them.
<
> I was thinking arbitrarily *large*. Small read and write sets
> are great for providing guarantees, but I think it would be
> nice to have support for relatively large transactions. With a
> strict limit to only eight cache blocks participating, software
> might reasonably know an address not in those eight blocks. I
> would also like the terminating SC to be able to be a new block.
<
Hardware in general does not do "large" well. In particular, the
number I chose is exactly the number of miss buffers in Opteron
which I reused with very minor additions, so it was essentially
FREE.
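
A minimal sketch of what a fixed, small participation limit means for software, assuming a 64-byte cache line and a table of 8 entries. The 8 comes from this thread; the line size and the names AtomicEvent and touch are made up for illustration. It models only the bookkeeping, not the hardware:

```python
# Hypothetical model: a fixed-size table of participating cache lines.
LINE_BYTES = 64   # assumed cache line size (not stated in the post)
MAX_LINES = 8     # participation limit mentioned in the thread


class AtomicEvent:
    """Tracks which cache lines an atomic event has touched so far."""

    def __init__(self):
        self.lines = set()
        self.failed = False

    def touch(self, addr):
        """Add the line holding addr; fail the event if the table is full."""
        line = addr // LINE_BYTES
        if line not in self.lines and len(self.lines) == MAX_LINES:
            self.failed = True      # no free entry left: the event cannot succeed
            return False
        self.lines.add(line)
        return True


event = AtomicEvent()
# Eight distinct lines fit.
ok = [event.touch(i * LINE_BYTES) for i in range(8)]
# A second address in an already-tracked line needs no new entry.
same_line = event.touch(8)
# A ninth distinct line overflows the table.
ninth = event.touch(8 * LINE_BYTES)
```

In this model the limit is counted in lines, not addresses, so several fields packed into one line cost a single entry.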
>
> (One "solution" would be to use page fault/permission violation
> as a terminating condition for SC/the transaction. Software
> might reasonably be able to define an address that is guaranteed
> to be unwritable.)
<
It is the turning of these things on and then off again that creates
all the difficulty.
<
> >> of addresses makes it difficult to be sure that the SC will
> >> fail. On a general register machine, one might reserve one
> >> register name to indicate a SC that always fails. If the ISA
> >> has a zero register, using this as a base register for a SC might
> >> be defined as always failing. Using an architecturally defined
> >> stack pointer or thread-local-storage pointer as a base address
> >> might work, but CLL does seem less crufty and potentially
> >> constraining.
> >
> > Whatever mechanics fits with the rest of your architecture can be
> > made to work adequately. I chose to do this all in the Miss Buffer.
<
> "Work adequately" may be above average (Sturgeon's Law), but
> having a coherent/elegant design is desirable.
<
We are trying to go from 1 (T&S, CAS) and 2 (DCAS) to a handful
or more in this step. Until we have history on using these mechanisms
we have no data on how many would be "nice".
<
In any event, having 5 cache lines participate means we can
convert 3 ATOMIC events into 1 when moving data around in
a concurrent data structure. This LOWERS the system wide
interference, maybe enough that we don't need "large" as defined
above. In any event, until we have data, 8 is more than enough.
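
To see where a figure like 5 cache lines could come from, here is a rough count for moving one element between two doubly linked lists. It assumes each node sits in its own cache line and that the three separate events would be unlink, relink, and fixing the element's own pointers. That split is one plausible reading, not something stated in the post, and Node, make_list, and move_after are hypothetical names:

```python
# Hypothetical sketch: count the nodes written when one element moves
# between two doubly linked lists. Assumes one node per cache line.

class Node:
    def __init__(self, name):
        self.name = name
        self.prev = None
        self.next = None


def make_list(names):
    """Build a doubly linked list and return its nodes in order."""
    nodes = [Node(n) for n in names]
    for a, b in zip(nodes, nodes[1:]):
        a.next, b.prev = b, a
    return nodes


def move_after(elem, dest, written):
    """Unlink elem and reinsert it after dest, recording each node written.

    Assumes elem and dest are interior nodes (both neighbours exist).
    With ESM-like support all three steps would form one atomic event.
    """
    # Step 1: unlink from the source list (writes both old neighbours).
    old_prev, old_next = elem.prev, elem.next
    old_prev.next = old_next
    old_next.prev = old_prev
    written.update({old_prev.name, old_next.name})
    # Step 2: link into the destination list (writes both new neighbours).
    after = dest.next
    dest.next = elem
    after.prev = elem
    written.update({dest.name, after.name})
    # Step 3: fix the element's own pointers.
    elem.prev, elem.next = dest, after
    written.add(elem.name)


src = make_list(["a", "x", "b"])
dst = make_list(["c", "d"])
written = set()
move_after(src[1], dst[0], written)
```

Here written ends up holding five node names: two old neighbours, two new neighbours, and the element itself, which fits within a limit of 8.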
>
> [snip]
> >> I consider LL/SC transactional memory (with the limit of
> >> single load read set and single store write set with the
> >> read set identical to the write set), but this is a
> >> classification/nomenclature issue.
> >
> > {{eyes wider open than usual and staring with a gaze}} Interesting
<
> If one defines a transaction as "a failable operation
> collecting multiple operations that together would normally
> not be atomic into an atomic unit, with retry as a typical
> fallback", LL/SC ticks all the boxes: two operations
> ("multiple"), a store would not normally be atomic with a
> load but is made atomic, and retry is the typical fallback.
>
> This is an example of my clumper classification tendency.
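
The definition above can be sketched with a toy single-reservation model. This is a simulation under assumed semantics (any store to the reserved address breaks the reservation), not any real ISA's LL/SC, and Memory, atomic_add, and the interfere hook are hypothetical names:

```python
# Hypothetical single-reservation LL/SC model; not any real ISA's semantics.

class Memory:
    def __init__(self):
        self.cells = {}
        self.reserved = None      # address covered by the current reservation

    def load_linked(self, addr):
        self.reserved = addr
        return self.cells.get(addr, 0)

    def store(self, addr, value):
        """Ordinary store, e.g. from another thread; breaks a matching reservation."""
        self.cells[addr] = value
        if self.reserved == addr:
            self.reserved = None

    def store_conditional(self, addr, value):
        """Succeeds only if the reservation on addr is still intact."""
        if self.reserved != addr:
            return False
        self.cells[addr] = value
        self.reserved = None
        return True


def atomic_add(mem, addr, delta, interfere=None):
    """Read-modify-write as a tiny transaction: retry until the SC succeeds."""
    attempts = 0
    while True:
        attempts += 1
        old = mem.load_linked(addr)
        if interfere is not None and attempts == 1:
            interfere(mem)        # simulated conflicting store between LL and SC
        if mem.store_conditional(addr, old + delta):
            return attempts


mem = Memory()
quiet = atomic_add(mem, 0x40, 1)
contended = atomic_add(mem, 0x40, 1, interfere=lambda m: m.store(0x40, 10))
```

Without interference the first attempt succeeds; with one conflicting store the first SC fails and the retry picks up the new value.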
<
Repeat after me: ESM is not TM, ESM is not TM, and never will be.
<
Transactions that are small enough can be done inside
ESM; transactions of 2×-4× the size ESM supports can be
accelerated using ESM. But large general HW TM is not going
to happen.
>
> [snip]
> > ASF and ESM are, in essence, a way to get out of the game of inventing
> > more and more exotic synchronization instructions over time. SW can
> > create whatever ATOMIC primitives it desires and wrap them up in
> > subroutines, macros, or code to be inlined.
<
> Yes, and it is very nice for that purpose. I just feel that
> a broader transactional system could use much of the hardware
> support for an ASF/ESM-like system and unify the interface
> somewhat among "large" atomic primitives and something more
> associated with transactional memory.
<
See history comment above:: it is not yet time for TM.
>
> [snip]
> > Yes, at first blush, one can run ESM under a SW lock........but why ?
> > ESM is present to allow the ILLUSION of ATOMICity without ever
> > locking anything!
> [snip]
> >>> Also note: single stepping through an ATOMIC event and achieving
> >>> success cannot be allowed.
> >>
> >> While that does seem to be generally "asking for trouble", I
> >> am not certain that single stepping is impossible — not
> >> having given it that much thought and, especially, not
> >> having the best grasp of synchronization issues.
> >
> > Consider single stepping through an ATOMIC event under actual
> > contention::
> >
> > By the time the human gets the caret > at the control terminal,
> > millions upon millions of instructions have been performed, and
> > by the time the human makes his first response, millions and
> > millions more instructions have been performed.
> >
> > Under contention, it is doubtful that the single stepping event
> > will ever succeed, so all one can single step through is the failure
> > cases. So why not fail when control is transferred out of the event ?
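
The rule suggested above, fail the event whenever control leaves it, can be modeled in a few lines. This is a hypothetical sketch: AtomicEvent, control_leaves, and run are invented names, and buffering stores until commit is an assumption about how a failed event keeps its stores invisible. It shows why a single-stepped event never succeeds:

```python
# Hypothetical model: any control transfer out of an atomic event fails it.

class AtomicEvent:
    def __init__(self, memory):
        self.memory = memory
        self.pending = {}         # stores held back until the event commits
        self.failed = False

    def store(self, addr, value):
        self.pending[addr] = value

    def control_leaves(self):
        """Interrupt, exception, or debugger trap taken inside the event."""
        self.failed = True
        self.pending.clear()      # nothing from the event becomes visible

    def commit(self):
        if self.failed:
            return False
        self.memory.update(self.pending)
        return True


def run(memory, single_step):
    event = AtomicEvent(memory)
    for addr, value in [(0, 1), (8, 2)]:
        event.store(addr, value)
        if single_step:
            event.control_leaves()   # debugger regains control after each instruction
    return event.commit()


free_running = {}
stepped = {}
ok_free = run(free_running, single_step=False)
ok_stepped = run(stepped, single_step=True)
```

The free-running event commits both stores; the stepped one fails and leaves memory untouched, so only the failure path can be observed under a debugger.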
> >
> >> At minimum, it seems one could provide a trace of processor
> >> state for a successful atomic event, allowing something
> >> similar to single stepping.
> >
> > The problem is NOT single stepping !!
> > The problem is that you cannot make it SMELL ATOMIC !!
<
> I think I did not communicate my proposal clearly. The
> operation would be "in the past". One would be single-
> stepping through a sequence that has already happened. Yes,
> this means that the full debugging power of single-stepping
> would not be available — one would not be able to modify
> values and continue — and some confusion would be possible
> (checking the value of a variable outside the transaction
> could give a later version of that variable's value).
<
Once again: the overhead is all in the turning stuff on and then back off.
>
> Such limited single-stepping might still be useful for
> understanding what the software is doing.
<
While I agree, I still don't see HOW to retain the Illusion of ATOMICity
over ALL kinds of memory references (including I/O) which may
interact with the ATOMICness of 3rd party views.
<
> >> For ESM, I think one could even
> >> provide a "lock" that prevents other threads from
> >> introducing conflicts while post-atomic single-stepping;
> >
> > You are setting yourself up for deadlock.
<
> With software locks, single-stepping through a critical
> section effectively produces "coma-lock". Single-thread
> single-stepping through parallel code will not provide the
> same "experience" as single-stepping through serial code.
<
But if you have only one thread, the ATOMIC stuff will NEVER fail.
Yet it is the failure cases that are most interesting to get right.
So, single stepping one thread buys so little it does not seem
worth it.
<
> Even with running all communicating threads in lock step,
> the actual behavior would not be realistic as timing
> influences behavior in a parallel system.
