devel / comp.arch / Re: Sequencer vs microcode

Re: Sequencer vs microcode

<jwvfsx2hquz.fsf-monnier+comp.arch@gnu.org>

https://www.novabbs.com/devel/article-flat.php?id=18160&group=comp.arch#18160

From: monn...@iro.umontreal.ca (Stefan Monnier)
Newsgroups: comp.arch
Subject: Re: Sequencer vs microcode
Date: Mon, 28 Jun 2021 12:58:53 -0400
Organization: A noiseless patient Spider
Lines: 19
Message-ID: <jwvfsx2hquz.fsf-monnier+comp.arch@gnu.org>
 by: Stefan Monnier - Mon, 28 Jun 2021 16:58 UTC

Ivan Godard [2021-06-28 09:44:47] wrote:
> On 6/28/2021 7:21 AM, Stefan Monnier wrote:
>>>> One pretty simple solution to this kind of problem that has been used in
>>>> the past is to start a timer/counter each time an instruction starts.
>>>> If the count exceeds some large number, generate an exception.
>>> That is a good idea to include, but I am a little stumped at the moment
>>> about implementation.
>> As a compiler/proglang guy, the one thing that stumps me instead here is
>> "why?".
>> Why would you want, or when/where would you need something like an
>> "exec" opcode? It seems to me like a "solution" in search of a problem.
>> Stefan
> An exec is a very cheap special-case function call for very short functions.

What makes it so "very cheap"er than a function call using normal
instructions and why/when would I need it?

Stefan

Re: Sequencer vs microcode

<2021Jun28.182331@mips.complang.tuwien.ac.at>

https://www.novabbs.com/devel/article-flat.php?id=18161&group=comp.arch#18161

From: ant...@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Sequencer vs microcode
Date: Mon, 28 Jun 2021 16:23:31 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 62
Message-ID: <2021Jun28.182331@mips.complang.tuwien.ac.at>
 by: Anton Ertl - Mon, 28 Jun 2021 16:23 UTC

"robf...@gmail.com" <robfi680@gmail.com> writes:
>On Monday, June 28, 2021 at 11:14:07 AM UTC-4, Thomas Koenig wrote:
>> Stefan Monnier <mon...@iro.umontreal.ca> wrote:
>> >>>One pretty simple solution to this kind of problem that has been used in
>> >>>the past is to start a timer/counter each time an instruction starts.
>> >>>If the count exceeds some large number, generate an exception.
>> >> That is a good idea to include, but I am a little stumped at the moment
>> >> about implementation.
>> >
>> > As a compiler/proglang guy, the one thing that stumps me instead here is "why?".
>> > Why would you want, or when/where would you need something like an
>> > "exec" opcode? It seems to me like a "solution" in search of a problem.
>> I have even missed what this exec opcode should do, to be frank.
>>
>> Would it be like the UNIX exec() system call, or something else?
>
>EXEC executes an instruction contained in a register. It would allow passing
>the instruction around in a program. The great benefit of EXEC is that it
>allows functionality of a routine to be altered without resorting to
>self-modified code. I missed this kind of thing when programming a graphics
>library. I ended up using self modifying code for performance reasons. EXEC
>could have been used to implement the raster-ops for instance. It could also
>be used in text processing software. It is far faster to use an EXEC
>instruction than the switch statement it typically replaces. It does have
>some benefit, but it is also costly to implement.
>I have had EXEC like functionality available on one machine in the form of
>specialized code buffers. Code could be executed out of a buffer in a single
>clock cycle.

Forth has a word EXECUTE that executes an arbitrary word (not just
colon definitions (functions/procedures in other languages)) whose
token is passed to EXECUTE on the stack. In the dynamically executed
instructions in <http://www.complang.tuwien.ac.at/forth/peep/>, about
0.5% of the executed primitives are EXECUTE; 96% of the EXECUTEd words
are colon definitions, followed by variables (1.4%).

In a threaded-code implementation, EXECUTE works by leaving the
threaded-code instruction pointer alone, but jumping to the native
code that executes the word; as soon as it is done, the next word is
executed, which is the one after the EXECUTE. This is very much like
the exec opcode (compare the threaded code to the macrocode and the
native code to the microcode of a microcoded CPU).
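
As a rough illustration of the threaded-code case just described, here is a minimal sketch in C. The token set, the Machine struct and the helper names are all made up for the example and do not come from any real Forth system. It also only lets EXECUTE run primitives, whereas real systems mostly EXECUTE colon definitions. The point it shows is that EXECUTE runs the word's code without touching the threaded-code instruction pointer, so interpretation simply continues with the token after EXECUTE.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token set for a toy threaded-code interpreter. */
enum { T_LIT, T_ADD, T_DUP, T_EXECUTE, T_END };

typedef struct {
    long   stack[16];
    size_t sp;          /* number of cells currently in use */
} Machine;

void m_push(Machine *m, long x) { assert(m->sp < 16); m->stack[m->sp++] = x; }
long m_pop(Machine *m)          { assert(m->sp > 0);  return m->stack[--m->sp]; }

/* The "native code" of each primitive; it never sees the instruction pointer. */
void primitive(Machine *m, long token) {
    switch (token) {
    case T_ADD: { long b = m_pop(m); long a = m_pop(m); m_push(m, a + b); break; }
    case T_DUP: { long x = m_pop(m); m_push(m, x); m_push(m, x); break; }
    default:    assert(!"not a primitive");
    }
}

/* Interpret a token sequence and return the top of stack at T_END. */
long run(const long *code) {
    Machine m = { .sp = 0 };
    size_t ip = 0;                        /* threaded-code instruction pointer */
    for (;;) {
        long token = code[ip++];
        switch (token) {
        case T_LIT:     m_push(&m, code[ip++]);   break;
        case T_EXECUTE: primitive(&m, m_pop(&m)); break;  /* ip is left alone */
        case T_END:     return m_pop(&m);
        default:        primitive(&m, token);     break;
        }
    }
}
```

Running the sequence LIT 20, LIT 1, LIT T_ADD, EXECUTE, DUP, ADD, END through run() adds 20 and 1 via EXECUTE and then carries on with DUP and ADD as if the addition had been written inline.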

In a native-code implementation, this does not work, because there is
only one program counter in play. So native-code implementations
implement all words such that they can be CALLed, and EXECUTE performs
an indirect call. The call saves the program counter, and at the end
the word returns and execution continues with the next word after the
EXECUTE.
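
The native-code approach can be sketched the same way. The following is an assumption-laden miniature, not taken from any actual Forth implementation: every word is an ordinary C function, an execution token is just an index into a hypothetical dictionary table, and EXECUTE is an indirect call through that table. The C call and return stand in for the saved program counter mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature VM; all names here are illustrative. */
typedef struct VM VM;
typedef void (*Word)(VM *vm);          /* every word can be CALLed */

struct VM {
    long   data[16];                   /* data stack */
    size_t depth;                      /* number of cells in use */
};

void push(VM *vm, long x) { assert(vm->depth < 16); vm->data[vm->depth++] = x; }
long pop(VM *vm)          { assert(vm->depth > 0);  return vm->data[--vm->depth]; }

/* Two ordinary words. */
void word_dup(VM *vm)  { long x = pop(vm); push(vm, x); push(vm, x); }
void word_plus(VM *vm) { long b = pop(vm); long a = pop(vm); push(vm, a + b); }

/* Execution tokens are indices into this table. */
static const Word dictionary[] = { word_dup, word_plus };
enum { XT_DUP = 0, XT_PLUS = 1 };

/* EXECUTE: pop a token and perform an indirect call. The call saves the
   return address, so execution resumes after EXECUTE when the word returns. */
void word_execute(VM *vm) {
    long xt = pop(vm);
    assert(xt >= 0 && (size_t)xt < sizeof dictionary / sizeof dictionary[0]);
    dictionary[xt](vm);
}
```

On a real machine the table lookup would typically disappear, since the token can be the code address itself, leaving a single indirect call instruction.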

What Forth native-code implementations do may also be a good approach
for doing whatever one wants to do with an exec instruction. It is
probably faster than a half-hearted implementation of exec.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Sequencer vs microcode

<sbd1tr$jfl$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=18163&group=comp.arch#18163

From: sfu...@alumni.cmu.edu.invalid (Stephen Fuld)
Newsgroups: comp.arch
Subject: Re: Sequencer vs microcode
Date: Mon, 28 Jun 2021 10:46:33 -0700
Organization: A noiseless patient Spider
Lines: 31
Message-ID: <sbd1tr$jfl$1@dont-email.me>
 by: Stephen Fuld - Mon, 28 Jun 2021 17:46 UTC

On 6/28/2021 9:58 AM, Stefan Monnier wrote:
> Ivan Godard [2021-06-28 09:44:47] wrote:
>> On 6/28/2021 7:21 AM, Stefan Monnier wrote:
>>>>> One pretty simple solution to this kind of problem that has been used in
>>>>> the past is to start a timer/counter each time an instruction starts.
>>>>> If the count exceeds some large number, generate an exception.
>>>> That is a good idea to include, but I am a little stumped at the moment
>>>> about implementation.
>>> As a compiler/proglang guy, the one thing that stumps me instead here is
>>> "why?".
>>> Why would you want, or when/where would you need something like an
>>> "exec" opcode? It seems to me like a "solution" in search of a problem.
>>> Stefan
>> An exec is a very cheap special-case function call for very short functions.
>
> What makes it so "very cheap"er than a function call using normal
> instructions

It reduces the "overhead" of the function call by half. One instruction
(the EXC) versus two (the Call and the Return).

> and why/when would I need it?

You never "need" it, as there are other, albeit more costly, ways to
accomplish the same thing, but it is sometimes useful.

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Re: Sequencer vs microcode

<jwvy2ath6c3.fsf-monnier+comp.arch@gnu.org>

https://www.novabbs.com/devel/article-flat.php?id=18171&group=comp.arch#18171

From: monn...@iro.umontreal.ca (Stefan Monnier)
Newsgroups: comp.arch
Subject: Re: Sequencer vs microcode
Date: Mon, 28 Jun 2021 20:31:10 -0400
Organization: A noiseless patient Spider
Lines: 25
Message-ID: <jwvy2ath6c3.fsf-monnier+comp.arch@gnu.org>
 by: Stefan Monnier - Tue, 29 Jun 2021 00:31 UTC

>> What makes it so "very cheap"er than a function call using normal
>> instructions
> It reduces the "overhead" of the function call by half. One instruction
> (the EXC) versus two (the Call and the Return).

I don't think you can count it quite this way except maybe for
naive implementations, since "jump&link" and "ret" are typically both
"performed" by the branch prediction (and then later actually executed,
admittedly, but this should usually use an otherwise unused execution
unit so it's quite cheap).

Also those gains can easily be dwarfed by the potential difficulty to
keep the fetch unit fed when encountering an EXEC (i.e adjusting the
branch prediction to do something useful).

>> and why/when would I need it?
> You never "need" it, as there are other, albeit more costly, ways to
> accomplish the same thing, but it is sometimes useful.

The question is what are real cases where the presumably lower cost
can/could be used to get a measurably higher performance than using the
usual implementation&optimization strategies?

Stefan

Re: Sequencer vs microcode

<1ca49244-3cb6-4cfe-aee8-332145c67674n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=18172&group=comp.arch#18172

Newsgroups: comp.arch
Date: Mon, 28 Jun 2021 17:40:38 -0700 (PDT)
Message-ID: <1ca49244-3cb6-4cfe-aee8-332145c67674n@googlegroups.com>
Subject: Re: Sequencer vs microcode
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Tue, 29 Jun 2021 00:40:38 +0000
Content-Type: text/plain; charset="UTF-8"
 by: MitchAlsup - Tue, 29 Jun 2021 00:40 UTC

On Monday, June 28, 2021 at 7:31:13 PM UTC-5, Stefan Monnier wrote:
> >> What makes it so "very cheap"er than a function call using normal
> >> instructions
> > It reduces the "overhead" of the function call by half. One instruction
> > (the EXC) versus two (the Call and the Return).
> I don't think you can count it quite this way except maybe for
> naive implementations, since "jump&link" and "ret" are typically both
> "performed" by the branch prediction (and then later actually executed,
> admittedly, but this should usually use an otherwise unused execution
> unit so it's quite cheap).
>
> Also those gains can easily be dwarfed by the potential difficulty to
> keep the fetch unit fed when encountering an EXEC (i.e adjusting the
> branch prediction to do something useful).
> >> and why/when would I need it?
> > You never "need" it, as there are other, albeit more costly, ways to
> > accomplish the same thing, but it is sometimes useful.
<
> The question is what are real cases where the presumably lower cost
> can/could be used to get a measurably higher performance than using the
> usual implementation&optimization strategies?
<
What High Level Languages allow one to pass code to a subroutine that
is not wrapped up in a subroutine ?
>
>
> Stefan

Re: Sequencer vs microcode

<35b24e3f-0b5d-42a1-a08a-cc4d8d77231fn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=18173&group=comp.arch#18173

Newsgroups: comp.arch
Date: Mon, 28 Jun 2021 20:18:06 -0700 (PDT)
Message-ID: <35b24e3f-0b5d-42a1-a08a-cc4d8d77231fn@googlegroups.com>
Subject: Re: Sequencer vs microcode
From: robfi...@gmail.com (robf...@gmail.com)
Injection-Date: Tue, 29 Jun 2021 03:18:06 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
 by: robf...@gmail.com - Tue, 29 Jun 2021 03:18 UTC

On Monday, June 28, 2021 at 8:40:39 PM UTC-4, MitchAlsup wrote:
> On Monday, June 28, 2021 at 7:31:13 PM UTC-5, Stefan Monnier wrote:
> > >> What makes it so "very cheap"er than a function call using normal
> > >> instructions
> > > It reduces the "overhead" of the function call by half. One instruction
> > > (the EXC) versus two (the Call and the Return).
> > I don't think you can count it quite this way except maybe for
> > naive implementations, since "jump&link" and "ret" are typically both
> > "performed" by the branch prediction (and then later actually executed,
> > admittedly, but this should usually use an otherwise unused execution
> > unit so it's quite cheap).
> >
> > Also those gains can easily be dwarfed by the potential difficulty to
> > keep the fetch unit fed when encountering an EXEC (i.e adjusting the
> > branch prediction to do something useful).
> > >> and why/when would I need it?
> > > You never "need" it, as there are other, albeit more costly, ways to
> > > accomplish the same thing, but it is sometimes useful.
> <
> > The question is what are real cases where the presumably lower cost
> > can/could be used to get a measurably higher performance than using the
> > usual implementation&optimization strategies?
> <
> What High Level Languages allow one to pass code to a subroutine that
> is not wrapped up in a subroutine ?
> >
> >
> > Stefan

Suppose there is a text search driven by user input. The first time in, the user wants to find all letters < ‘a’, so a set-less-than-‘a’ instruction is passed in a register to the text search function. The next time in, the user wants to search for letters greater than ‘g’, so a set-greater-than-‘g’ instruction is passed in a register to the text search function. All the text search function does is ‘exec Rn’, where Rn is the register containing the instruction. Without an exec instruction, either self-modifying code (JIT) must be used to get the same performance, or a switch statement driven by an option the user passes in. The switch statement, because it consists of branches, is several times slower than exec. It may not make a difference in many programs, but if the code is called millions of times, for instance when drawing graphics on-screen, it can make a difference.
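
C has no way to express an exec instruction, so the following is only a sketch of the software alternatives for the text search above, with made-up names throughout. count_switch is the switch-statement version, where the selector is re-tested for every character. count_indirect passes the comparison as a function pointer, which is the indirect-call approach discussed elsewhere in this thread. Neither is a measurement of anything; how they compare with a hardware exec depends entirely on the implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical comparison kinds a user might select. */
typedef enum { CMP_LESS, CMP_GREATER } CmpKind;

/* Variant 1: switch inside the hot loop; the selector is tested per character. */
size_t count_switch(const char *s, CmpKind kind, char key) {
    assert(s != NULL);
    size_t n = 0;
    for (; *s; ++s) {
        switch (kind) {
        case CMP_LESS:    n += (size_t)(*s < key); break;
        case CMP_GREATER: n += (size_t)(*s > key); break;
        }
    }
    return n;
}

/* Variant 2: the comparison is passed in as code, via a function pointer. */
typedef int (*Pred)(char c, char key);

int pred_less(char c, char key)    { return c < key; }
int pred_greater(char c, char key) { return c > key; }

size_t count_indirect(const char *s, Pred pred, char key) {
    assert(s != NULL && pred != NULL);
    size_t n = 0;
    for (; *s; ++s)
        n += (size_t)(pred(*s, key) != 0);   /* one indirect call per character */
    return n;
}
```

A caller would pick the predicate once, for example count_indirect(text, pred_less, 'a') for the first search and count_indirect(text, pred_greater, 'g') for the second.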

EXEC is a challenge to implement in a classic pipelined processor, but in a processor using an out-of-order instruction queue it is a bit easier to do. EXEC does not really affect the branch logic. It is not actually doing a fast subroutine call / return. It is just populating the instruction register field with a value from a register, so there is a double decode. To get the double decode in a classic pipelined processor, the easiest approach may be to perform a hidden branch back to the same instruction after feeding a pipeline IR register with the value from a register.
After the first decode it is treated as an ordinary instruction, which may or may not be a branch instruction. There are ways to avoid the double decode, but they increase the size of the execute logic.
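
To make the double decode concrete, here is a small behavioural model in C. It is a sketch under invented assumptions, not a description of any real design: the 32-bit encoding, the opcode numbers, the eight registers and the EXEC_LIMIT value are all hypothetical. The loop reloads the instruction register from a general register and decodes again while the program counter stays put. The depth limit stands in for the counter idea quoted earlier in the thread, so that an EXEC whose target is another EXEC cannot spin forever.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical encoding: opcode[31:24] rd[23:16] rs[15:8] imm[7:0]. */
enum { OP_HALT = 0, OP_ADDI = 1, OP_SLTI = 2, OP_SGTI = 3, OP_EXEC = 4 };
enum { STEP_OK, STEP_HALT, STEP_FAULT };
enum { EXEC_LIMIT = 4 };               /* arbitrary cap on chained EXECs */

typedef struct {
    uint32_t        reg[8];
    uint32_t        pc;                /* word index into mem */
    const uint32_t *mem;
    uint32_t        mem_words;
} Cpu;

uint32_t encode(uint32_t op, uint32_t rd, uint32_t rs, uint32_t imm) {
    return (op << 24) | ((rd & 7u) << 16) | ((rs & 7u) << 8) | (imm & 0xFFu);
}

/* Execute one instruction; returns STEP_OK, STEP_HALT or STEP_FAULT. */
int step(Cpu *cpu) {
    assert(cpu != 0 && cpu->mem != 0);
    if (cpu->pc >= cpu->mem_words) return STEP_FAULT;

    uint32_t ir = cpu->mem[cpu->pc];           /* first fetch and decode */
    int depth = 0;
    while ((ir >> 24) == OP_EXEC) {
        if (++depth > EXEC_LIMIT) return STEP_FAULT;   /* runaway EXEC chain */
        ir = cpu->reg[(ir >> 8) & 7u];         /* reload IR from a register, decode again */
    }

    uint32_t op = ir >> 24, rd = (ir >> 16) & 7u, rs = (ir >> 8) & 7u, imm = ir & 0xFFu;
    switch (op) {
    case OP_HALT: return STEP_HALT;
    case OP_ADDI: cpu->reg[rd] = cpu->reg[rs] + imm; break;
    case OP_SLTI: cpu->reg[rd] = cpu->reg[rs] < imm; break;
    case OP_SGTI: cpu->reg[rd] = cpu->reg[rs] > imm; break;
    default:      return STEP_FAULT;
    }
    cpu->pc += 1;                              /* pc advances past the EXEC, not the target */
    return STEP_OK;
}
```

In this model, loading a register with encode(OP_SLTI, 2, 3, 'a') and then stepping over an EXEC of that register performs the less-than-‘a’ test from the text search example, and the next step continues with the instruction after the EXEC.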

Re: Sequencer vs microcode

<fb6971f9-46ba-4f6c-9994-26b03f5cdb11n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=18174&group=comp.arch#18174

Newsgroups: comp.arch
Date: Mon, 28 Jun 2021 20:20:57 -0700 (PDT)
Message-ID: <fb6971f9-46ba-4f6c-9994-26b03f5cdb11n@googlegroups.com>
Subject: Re: Sequencer vs microcode
From: robfi...@gmail.com (robf...@gmail.com)
Injection-Date: Tue, 29 Jun 2021 03:20:57 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Lines: 68
 by: robf...@gmail.com - Tue, 29 Jun 2021 03:20 UTC

On Monday, June 28, 2021 at 11:18:07 PM UTC-4, robf...@gmail.com wrote:
> On Monday, June 28, 2021 at 8:40:39 PM UTC-4, MitchAlsup wrote:
> > On Monday, June 28, 2021 at 7:31:13 PM UTC-5, Stefan Monnier wrote:
> > > >> What makes it so "very cheap"er than a function call using normal
> > > >> instructions
> > > > It reduces the "overhead" of the function call by half. One instruction
> > > > (the EXC) versus two (the Call and the Return).
> > > I don't think you can count it quite this way except maybe for
> > > naive implementations, since "jump&link" and "ret" are typically both
> > > "performed" by the branch prediction (and then later actually executed,
> > > admittedly, but this should usually use an otherwise unused execution
> > > unit so it's quite cheap).
> > >
> > > Also those gains can easily be dwarfed by the potential difficulty to
> > > keep the fetch unit fed when encountering an EXEC (i.e adjusting the
> > > branch prediction to do something useful).
> > > >> and why/when would I need it?
> > > > You never "need" it, as there are other, albeit more costly, ways to
> > > > accomplish the same thing, but it is sometimes useful.
> > <
> > > The question is what are real cases where the presumably lower cost
> > > can/could be used to get a measurably higher performance than using the
> > > usual implementation&optimization strategies?
> > <
> > What High Level Languages allow one to pass code to a subroutine that
> > is not wrapped up in a subroutine ?
> > >
> > >
> > > Stefan
> Suppose there is a text search driven by user input. The first time in the user wants to find all letters < ‘a’ so a set less than ‘a’ instruction is passed in a register to the text search function. The next time in the user want to search for letters greater than ‘g’. So a greater than ‘g’ instruction is passed in a register to the text search function. In the text search function all it does is ‘exec Rn’ where Rn is the register containing the instruction. Without an exec instruction either self modifying code (JIT) must be used to get the same performance or a switch statement with an option set by the user passed in. The switch statement because it is branches is several times slower than exec. It may not make a difference in many programs, but sometimes if the code is called millions of times for instance drawing graphics on-screen it can make a difference.
>
> EXEC is a challenge to implement in a classic pipelined processor, but in a processor using an out-of-order instruction queue it is a bit easier to do. EXEC does not really affect the branch logic. It is not actually doing a fast subroutine call / return. It is just populating the instruction register field with a value from a register, so there is a double decode. To get the double decode in a classic pipelined processor the easiest approach may be to perform a hidden branch back to the same instruction after feeding a pipeline IR register with the value from a register.
> After the first decode it is treated as an ordinary instruction which may or may not be a branch instruction. There are ways to avoid the double decode, but it increases the size of the execute logic.

One issue with EXEC is code readability. It makes code potentially difficult to understand. If a series of exec statements is written, what is the code doing?
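To make the text-search example concrete, here is a rough Python sketch of the two software shapes being compared. Python cannot express EXEC itself, so a callable stands in for the instruction held in Rn; the names `search_switch`, `search_exec`, `less_than_a` and `greater_than_g` are made up for illustration, and nothing here says anything about relative speed on real hardware.

```python
def search_switch(text, option, bound):
    """Count matches; the comparison is chosen by a branch on every character."""
    hits = 0
    for ch in text:
        if option == "lt":
            hits += ch < bound
        elif option == "gt":
            hits += ch > bound
        else:
            raise ValueError(f"unknown option: {option}")
    return hits


def search_exec(text, instr):
    """Count matches; `instr` stands in for the instruction held in Rn."""
    return sum(1 for ch in text if instr(ch))


# Hypothetical stand-ins for "set less than 'a'" and "set greater than 'g'".
less_than_a = lambda ch: ch < "a"
greater_than_g = lambda ch: ch > "g"
```

In the second form the search loop contains no knowledge of which comparison it runs, which is also where the readability concern comes from.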

Re: Sequencer vs microcode

<jwvsg11gx8p.fsf-monnier+comp.arch@gnu.org>

https://www.novabbs.com/devel/article-flat.php?id=18176&group=comp.arch#18176
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: monn...@iro.umontreal.ca (Stefan Monnier)
Newsgroups: comp.arch
Subject: Re: Sequencer vs microcode
Date: Mon, 28 Jun 2021 23:52:56 -0400
Organization: A noiseless patient Spider
Lines: 54
Message-ID: <jwvsg11gx8p.fsf-monnier+comp.arch@gnu.org>
References: <sb1abl$ps8$1@dont-email.me>
<cc7766bb-1433-45f4-9ba5-93c40bdab0f7n@googlegroups.com>
<sb553t$kta$1@dont-email.me> <sb5a40$o97$1@dont-email.me>
<42b654db-a022-42f3-a6aa-94cd5a88de98n@googlegroups.com>
<53e78bb8-5c9c-481e-8cec-50a3febcbe95n@googlegroups.com>
<sb75hs$7nq$1@dont-email.me>
<3050904a-eec2-4d47-ba5d-bfed1d774518n@googlegroups.com>
<3d37be3c-e4fd-41b3-93f7-7037a3c19aefn@googlegroups.com>
<e5IBI.683979$ST2.397427@fx47.iad>
<8b1baea3-a802-4341-8afd-d827e4eee948n@googlegroups.com>
<3b2b37a7-6168-42c0-a657-dcff23fa3e03n@googlegroups.com>
<7ce8fca5-3a58-4e9a-92b4-595d0a743c57n@googlegroups.com>
<e2d01fcb-2cb1-4fb4-bf0e-b08d23ead140n@googlegroups.com>
<jwvv95yhy1o.fsf-monnier+comp.arch@gnu.org>
<sbcu9u$587$1@dont-email.me>
<jwvfsx2hquz.fsf-monnier+comp.arch@gnu.org>
<sbd1tr$jfl$1@dont-email.me>
<jwvy2ath6c3.fsf-monnier+comp.arch@gnu.org>
<1ca49244-3cb6-4cfe-aee8-332145c67674n@googlegroups.com>
<35b24e3f-0b5d-42a1-a08a-cc4d8d77231fn@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Info: reader02.eternal-september.org; posting-host="d5b28953779022f435cfca735a21e3c7";
logging-data="20638"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX185+2I60bsjUpYULjBUB9Vn"
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)
Cancel-Lock: sha1:1OIYygWWketc9sVVesMJ8QaHQ1U=
sha1:bYMAt6kSIXmybmXzZuiQwLgYhCs=
 by: Stefan Monnier - Tue, 29 Jun 2021 03:52 UTC

> Suppose there is a text search driven by user input. The first time in the
> user wants to find all letters < ‘a’ so a set less than ‘a’ instruction is
> passed in a register to the text search function.

And in which application is such a search performance critical enough
that the application doesn't generate a DFA or something similar?

> the instruction. Without an exec instruction either self modifying code
> (JIT) must be used to get the same performance or a switch statement with an
> option set by the user passed in.

AFAIK for text search, it's pretty easy to set up a table-driven function
that works efficiently enough, with very predictable branches.
What applications do you have in mind where that approach is not fast enough?
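As a rough illustration of the table-driven approach (a sketch only; `build_table` and `find_all` are made-up names, and the 256-entry byte table is an assumption about how such a search would be set up):

```python
def build_table(predicate):
    """Precompute a 256-entry table: table[b] is True if byte value b matches."""
    return [predicate(b) for b in range(256)]


def find_all(data, table):
    """Return the offsets of all bytes whose table entry is set."""
    return [i for i, b in enumerate(data) if table[b]]


# The user's choice is paid for once, when the table is built.
less_than_a = build_table(lambda b: b < ord("a"))
greater_than_g = build_table(lambda b: b > ord("g"))
```

The inner loop is the same for every query, so its only branch is the loop branch itself; the user's selection changes data, not control flow.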

> The switch statement because it is branches is several times slower
> than exec.

I think this presumes quite pessimistic performance of the branch
prediction of the switch and unrealistically good performance of the
branch prediction of the EXEC.

> It may not make a difference in many programs, but sometimes if the
> code is called millions of times for instance drawing graphics
> on-screen it can make a difference.

If it's called that many times, even a slow JIT-generation
(e.g. calling a plain old compiler) will pay for itself and can then
outperform your EXEC, e.g. by inlining code into the loop.

> EXEC is a challenge to implement in a classic pipelined processor, but in
> a processor using an out-of-order instruction queue it is a bit easier to
> do. EXEC does not really affect the branch logic.

How can it not? At the time we fetch the EXEC instruction we don't know
yet what will be the content of the register when that instruction will
be executed (there are probably a few hundred instructions in flight
already several of which will likely affect that register's content),
yet we already need to decide what will be the next instruction to push
into the pipeline.

Just as for branches, you need to *predict* the code that will be
executed next, the only difference is that it's not at an address.
I guess if you use a BTB that holds the actual code at the target rather
than the address of the target, then EXEC is no worse than an indirect
branch but it's probably not noticeably faster either.

> It is just populating the instruction register field with a value from
> a register, so there is a double decode.

The double decode is the least of your problems, I think.

Stefan

Re: Sequencer vs microcode

<mZwCI.366532$jf1.289886@fx37.iad>

https://www.novabbs.com/devel/article-flat.php?id=18177&group=comp.arch#18177
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!news.uzoreto.com!newsfeed.xs4all.nl!newsfeed7.news.xs4all.nl!feeder1.feed.usenet.farm!feed.usenet.farm!peer02.ams4!peer.am4.highwinds-media.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx37.iad.POSTED!not-for-mail
From: ThatWoul...@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: Sequencer vs microcode
References: <sb1abl$ps8$1@dont-email.me> <cc7766bb-1433-45f4-9ba5-93c40bdab0f7n@googlegroups.com> <sb553t$kta$1@dont-email.me> <sb5a40$o97$1@dont-email.me> <42b654db-a022-42f3-a6aa-94cd5a88de98n@googlegroups.com> <53e78bb8-5c9c-481e-8cec-50a3febcbe95n@googlegroups.com> <sb75hs$7nq$1@dont-email.me> <3050904a-eec2-4d47-ba5d-bfed1d774518n@googlegroups.com> <3d37be3c-e4fd-41b3-93f7-7037a3c19aefn@googlegroups.com> <e5IBI.683979$ST2.397427@fx47.iad> <8b1baea3-a802-4341-8afd-d827e4eee948n@googlegroups.com> <3b2b37a7-6168-42c0-a657-dcff23fa3e03n@googlegroups.com> <7ce8fca5-3a58-4e9a-92b4-595d0a743c57n@googlegroups.com> <e2d01fcb-2cb1-4fb4-bf0e-b08d23ead140n@googlegroups.com> <jwvv95yhy1o.fsf-monnier+comp.arch@gnu.org> <sbcu9u$587$1@dont-email.me> <jwvfsx2hquz.fsf-monnier+comp.arch@gnu.org> <sbd1tr$jfl$1@dont-email.me> <jwvy2ath6c3.fsf-monnier+comp.arch@gnu.org> <1ca49244-3cb6-4cfe-aee8-332145c67674n@googlegroups.com> <35b24e3f-0b5d-42a1-a08a-cc4d8d77231fn@googlegroups.com>
In-Reply-To: <35b24e3f-0b5d-42a1-a08a-cc4d8d77231fn@googlegroups.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 37
Message-ID: <mZwCI.366532$jf1.289886@fx37.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Tue, 29 Jun 2021 04:04:34 UTC
Date: Tue, 29 Jun 2021 00:04:27 -0400
X-Received-Bytes: 3995
 by: EricP - Tue, 29 Jun 2021 04:04 UTC

robf...@gmail.com wrote:
>
> EXEC is a challenge to implement in a classic pipelined processor, but in a processor using an out-of-order instruction queue it is a bit easier to do. EXEC does not really affect the branch logic. It is not actually doing a fast subroutine call / return. It is just populating the instruction register field with a value from a register, so there is a double decode. To get the double decode in a classic pipelined processor the easiest approach may be to perform a hidden branch back to the same instruction after feeding a pipeline IR register with the value from a register.
> After the first decode it is treated as an ordinary instruction which may or may not be a branch instruction. There are ways to avoid the double decode, but it increases the size of the execute logic.

Decode has to have a way to stall when it encounters things like "JMP reg"
and it does not know what reg value is. Decode passes the jmp uOp along,
sets a DecodeStall flag, which signals back to fetch, and stops looking
at the instruction buffer. However the Fetch unit may
continue to use the fetch_PC to fill the prefetch buffers.
Note that the fetch_PC is pointing to the next instruction after JMP.

The same decode-stalls-fetch mechanism can be used for single-step,
where Decode emits a single uOp that is marked as single-step and stalls itself.
When the uOp reaches Retire, it sees it is marked as single step
and resets the DecodeStall flag.

That same stall mechanism can be used for EXEC. Decode sees EXEC and
passes its uOp along and stalls itself. Eventually EXEC reaches
register read, gets the instruction value, and the execute unit
writes back the value into the instruction buffer.
The instruction buffer write also sets a buffer flag "IsExec".

The instruction buffer and IsExec flag feed into Decode.
If the instruction is EXEC and the IsExec flag is set,
then it is an EXEC to an EXEC instruction so it
triggers an illegal instruction trap.
Otherwise it decodes the instruction as normal.
In any case it resets the IsExec and DecodeStall flags.

Resetting the DecodeStall flag enables fetch to pass the next
sequential instruction in the instruction buffer.

If there is a mispredicted branch then the pipeline gets flushed
and the DecodeStall and IsExec flags are reset.
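A toy Python model of the flag handling described above may help. It is a sketch under simplifying assumptions: instructions are tuples, there is no timing, and the names `FrontEnd`, `writeback_exec` and `run` are invented here rather than taken from any real design.

```python
EXEC = "EXEC"


class IllegalInstruction(Exception):
    pass


class FrontEnd:
    def __init__(self):
        self.decode_stall = False
        self.is_exec = False

    def decode(self, instr):
        """Decode one instruction-buffer entry and return the uOp passed along."""
        came_from_exec = self.is_exec
        # In any case, consuming the entry resets both flags.
        self.is_exec = False
        self.decode_stall = False
        if instr[0] == EXEC:
            if came_from_exec:
                raise IllegalInstruction("EXEC of an EXEC")
            self.decode_stall = True  # stall until the register value returns
        return instr

    def writeback_exec(self, reg_value):
        """Execute unit writes the register value into the instruction buffer."""
        self.is_exec = True
        return reg_value


def run(program, regs):
    """Return the uOps that actually execute, with each EXEC replaced by its target."""
    fe = FrontEnd()
    executed = []
    for instr in program:
        uop = fe.decode(instr)
        if uop[0] == EXEC:
            injected = fe.writeback_exec(regs[uop[1]])
            uop = fe.decode(injected)  # second decode, this time with IsExec set
        executed.append(uop)
    return executed
```

For example, `run([("ADD",), ("EXEC", "r5"), ("SUB",)], {"r5": ("SLT",)})` yields the ADD, the injected SLT, then the SUB, while a register holding another EXEC raises the illegal-instruction trap.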

Re: Sequencer vs microcode

<sbef0e$hb1$2@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=18181&group=comp.arch#18181
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2001-4dd7-51bc-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: Sequencer vs microcode
Date: Tue, 29 Jun 2021 06:35:58 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <sbef0e$hb1$2@newsreader4.netcologne.de>
References: <sb1abl$ps8$1@dont-email.me>
<cc7766bb-1433-45f4-9ba5-93c40bdab0f7n@googlegroups.com>
<sb553t$kta$1@dont-email.me> <sb5a40$o97$1@dont-email.me>
<42b654db-a022-42f3-a6aa-94cd5a88de98n@googlegroups.com>
<53e78bb8-5c9c-481e-8cec-50a3febcbe95n@googlegroups.com>
<sb75hs$7nq$1@dont-email.me>
<3050904a-eec2-4d47-ba5d-bfed1d774518n@googlegroups.com>
<3d37be3c-e4fd-41b3-93f7-7037a3c19aefn@googlegroups.com>
<e5IBI.683979$ST2.397427@fx47.iad>
<8b1baea3-a802-4341-8afd-d827e4eee948n@googlegroups.com>
<3b2b37a7-6168-42c0-a657-dcff23fa3e03n@googlegroups.com>
<7ce8fca5-3a58-4e9a-92b4-595d0a743c57n@googlegroups.com>
<e2d01fcb-2cb1-4fb4-bf0e-b08d23ead140n@googlegroups.com>
<jwvv95yhy1o.fsf-monnier+comp.arch@gnu.org> <sbcu9u$587$1@dont-email.me>
<jwvfsx2hquz.fsf-monnier+comp.arch@gnu.org> <sbd1tr$jfl$1@dont-email.me>
<jwvy2ath6c3.fsf-monnier+comp.arch@gnu.org>
Injection-Date: Tue, 29 Jun 2021 06:35:58 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2001-4dd7-51bc-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2001:4dd7:51bc:0:7285:c2ff:fe6c:992d";
logging-data="17761"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)
 by: Thomas Koenig - Tue, 29 Jun 2021 06:35 UTC

Stefan Monnier <monnier@iro.umontreal.ca> schrieb:

> The question is what are real cases where the presumably lower cost
> can/could be used to get a measurably higher performance than using the
> usual implementation&optimization strategies?

If you did not know the integral of cos(), you could pass cos()
to an integration routine if you had it as a machine instruction
as on My 66000.

However, the need to numerically integrate cos() should be
pretty small.
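For reference, the ordinary software version of this passes the function itself. A minimal Python sketch using the composite trapezoidal rule (`integrate` is a made-up helper, and the choice of rule and step count is an assumption, not something from the thread):

```python
import math


def integrate(f, a, b, n=1000):
    """Composite trapezoidal rule for f on [a, b] with n equal subintervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


# The integral of cos from 0 to pi/2 is sin(pi/2) - sin(0) = 1.
approx = integrate(math.cos, 0.0, math.pi / 2)
```

With an EXEC-style mechanism the argument would be the single cos instruction rather than the address of a wrapper routine.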

Re: Sequencer vs microcode

<sbeig1$o01$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=18182&group=comp.arch#18182
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: m.del...@this.bitsnbites.eu (Marcus)
Newsgroups: comp.arch
Subject: Re: Sequencer vs microcode
Date: Tue, 29 Jun 2021 09:35:28 +0200
Organization: A noiseless patient Spider
Lines: 18
Message-ID: <sbeig1$o01$1@dont-email.me>
References: <sb1abl$ps8$1@dont-email.me>
<cc7766bb-1433-45f4-9ba5-93c40bdab0f7n@googlegroups.com>
<sb553t$kta$1@dont-email.me> <sb5a40$o97$1@dont-email.me>
<42b654db-a022-42f3-a6aa-94cd5a88de98n@googlegroups.com>
<53e78bb8-5c9c-481e-8cec-50a3febcbe95n@googlegroups.com>
<sb75hs$7nq$1@dont-email.me>
<3050904a-eec2-4d47-ba5d-bfed1d774518n@googlegroups.com>
<3d37be3c-e4fd-41b3-93f7-7037a3c19aefn@googlegroups.com>
<e5IBI.683979$ST2.397427@fx47.iad>
<8b1baea3-a802-4341-8afd-d827e4eee948n@googlegroups.com>
<3b2b37a7-6168-42c0-a657-dcff23fa3e03n@googlegroups.com>
<7ce8fca5-3a58-4e9a-92b4-595d0a743c57n@googlegroups.com>
<e2d01fcb-2cb1-4fb4-bf0e-b08d23ead140n@googlegroups.com>
<jwvv95yhy1o.fsf-monnier+comp.arch@gnu.org> <sbcu9u$587$1@dont-email.me>
<jwvfsx2hquz.fsf-monnier+comp.arch@gnu.org> <sbd1tr$jfl$1@dont-email.me>
<jwvy2ath6c3.fsf-monnier+comp.arch@gnu.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Tue, 29 Jun 2021 07:35:29 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="b574f9e171197ede9f2566eda9bf670e";
logging-data="24577"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/utss4o2/ugrnbacJGpPMUB39eykuHXFY="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
Thunderbird/78.11.0
Cancel-Lock: sha1:ZAboRCLJyQbFpmUcYQudYfxjB+w=
In-Reply-To: <jwvy2ath6c3.fsf-monnier+comp.arch@gnu.org>
Content-Language: en-US
 by: Marcus - Tue, 29 Jun 2021 07:35 UTC

On 2021-06-29, Stefan Monnier wrote:
>>> and why/when would I need it?
>> You never "need" it, as there are other, albeit more costly, ways to
>> accomplish the same thing, but it is sometimes useful.
>
> The question is what are real cases where the presumably lower cost
> can/could be used to get a measurably higher performance than using the
> usual implementation&optimization strategies?

In my experience the better option for some of the use cases that I see
mentioned (e.g. making adaptations of a highly optimized core routine)
is to use JIT. For instance, see LLVMpipe [1] in Mesa that generates
optimized software rasterization code on-the-fly.

/Marcus

[1] https://docs.mesa3d.org/drivers/llvmpipe.html

Re: Sequencer vs microcode

<18641964-95f3-426a-a318-470ef6cfcd47n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=18183&group=comp.arch#18183
X-Received: by 2002:a05:622a:11c3:: with SMTP id n3mr26386068qtk.211.1624970908192;
Tue, 29 Jun 2021 05:48:28 -0700 (PDT)
X-Received: by 2002:a4a:88c2:: with SMTP id q2mr3874277ooh.73.1624970907925;
Tue, 29 Jun 2021 05:48:27 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Tue, 29 Jun 2021 05:48:27 -0700 (PDT)
In-Reply-To: <sbeig1$o01$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2607:fea8:1de1:d400:30e5:6d26:4cbb:f0ba;
posting-account=QId4bgoAAABV4s50talpu-qMcPp519Eb
NNTP-Posting-Host: 2607:fea8:1de1:d400:30e5:6d26:4cbb:f0ba
References: <sb1abl$ps8$1@dont-email.me> <cc7766bb-1433-45f4-9ba5-93c40bdab0f7n@googlegroups.com>
<sb553t$kta$1@dont-email.me> <sb5a40$o97$1@dont-email.me> <42b654db-a022-42f3-a6aa-94cd5a88de98n@googlegroups.com>
<53e78bb8-5c9c-481e-8cec-50a3febcbe95n@googlegroups.com> <sb75hs$7nq$1@dont-email.me>
<3050904a-eec2-4d47-ba5d-bfed1d774518n@googlegroups.com> <3d37be3c-e4fd-41b3-93f7-7037a3c19aefn@googlegroups.com>
<e5IBI.683979$ST2.397427@fx47.iad> <8b1baea3-a802-4341-8afd-d827e4eee948n@googlegroups.com>
<3b2b37a7-6168-42c0-a657-dcff23fa3e03n@googlegroups.com> <7ce8fca5-3a58-4e9a-92b4-595d0a743c57n@googlegroups.com>
<e2d01fcb-2cb1-4fb4-bf0e-b08d23ead140n@googlegroups.com> <jwvv95yhy1o.fsf-monnier+comp.arch@gnu.org>
<sbcu9u$587$1@dont-email.me> <jwvfsx2hquz.fsf-monnier+comp.arch@gnu.org>
<sbd1tr$jfl$1@dont-email.me> <jwvy2ath6c3.fsf-monnier+comp.arch@gnu.org> <sbeig1$o01$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <18641964-95f3-426a-a318-470ef6cfcd47n@googlegroups.com>
Subject: Re: Sequencer vs microcode
From: robfi...@gmail.com (robf...@gmail.com)
Injection-Date: Tue, 29 Jun 2021 12:48:28 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
 by: robf...@gmail.com - Tue, 29 Jun 2021 12:48 UTC

> In my experience the better option for some of the use cases that I see
> mentioned (e.g. making adaptations of a highly optimized core routine)
> is to use JIT. For instance, see LLVMpipe [1] in Mesa that generates
> optimized software rasterization code on-the-fly.
>
> /Marcus
There is compute-time overhead with JIT. With EXEC it is just loading a register with a value.

Torturing the topic a little bit.

The fast way of supporting EXEC increases the size of the core by about 15%, just for the one instruction, because it requires the IR to act like another register and be part of the bypass network. The slow way of supporting EXEC does not increase the size of the core significantly, but EXEC may take a couple more cycles to execute, which pretty much negates its usefulness. The slow way is to have EXEC execute like any other instruction, placing the operand on the result bus, then copy the result to the IR, reset several flags, and wait for it to decode and execute again. Instead of special logic to detect an EXEC of an EXEC, the work-around is the watchdog timer on the queue.
Found another feature of EXEC that is interesting. Potentially wider instructions can be used, because the value comes from a register which is 64 bits wide. So, 64-bit instructions could be used with EXEC. It may be one way to execute an instruction requiring four read source ports, for instance, which is too wide for the regular format. Having an instruction in a register means one can also perform calculations on it, for instance incrementing the register fields. If there are a lot of registers to save / restore, an EXEC instruction in a loop could do the trick. I will not mention my other favorite instruction, ATNI (add to next instruction).
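A small Python sketch of the "calculate on the instruction" idea, assuming an entirely hypothetical encoding in which a 5-bit source-register field sits at bit 7 of the instruction word (`RS_SHIFT`, `RS_MASK` and the helper names are invented for illustration):

```python
RS_SHIFT = 7    # hypothetical position of the source-register field
RS_MASK = 0x1F  # hypothetical 5-bit register number


def register_field(instr):
    """Extract the source-register number from an instruction word."""
    return (instr >> RS_SHIFT) & RS_MASK


def with_next_register(instr):
    """Return instr with its source-register field incremented (mod 32)."""
    rs = (register_field(instr) + 1) & RS_MASK
    return (instr & ~(RS_MASK << RS_SHIFT)) | (rs << RS_SHIFT)


def save_sequence(first_instr, count):
    """Instruction words a loop of EXECs would issue for `count` consecutive registers."""
    words = []
    instr = first_instr
    for _ in range(count):
        words.append(instr)
        instr = with_next_register(instr)
    return words
```

A real save / restore loop would also have to step the memory offset; that part is left out here.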

I had the core treating EXEC as if it were an ordinary ALU op, not a branch. It is predicted as if sequential instructions follow, with the ensuing consequences if it turns out to be a flow control op. That would make it horrendously slow if it turns out to be a branch, but why would one EXEC a branch anyway?

Robert

Re: Sequencer vs microcode

<fOECI.1945$SMMb.1808@fx17.iad>

https://www.novabbs.com/devel/article-flat.php?id=18184&group=comp.arch#18184
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!news.uzoreto.com!news-out.netnews.com!news.alt.net!fdc3.netnews.com!peer03.ams1!peer.ams1.xlned.com!news.xlned.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx17.iad.POSTED!not-for-mail
From: ThatWoul...@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: Sequencer vs microcode
References: <sb1abl$ps8$1@dont-email.me> <sb553t$kta$1@dont-email.me> <sb5a40$o97$1@dont-email.me> <42b654db-a022-42f3-a6aa-94cd5a88de98n@googlegroups.com> <53e78bb8-5c9c-481e-8cec-50a3febcbe95n@googlegroups.com> <sb75hs$7nq$1@dont-email.me> <3050904a-eec2-4d47-ba5d-bfed1d774518n@googlegroups.com> <3d37be3c-e4fd-41b3-93f7-7037a3c19aefn@googlegroups.com> <e5IBI.683979$ST2.397427@fx47.iad> <8b1baea3-a802-4341-8afd-d827e4eee948n@googlegroups.com> <3b2b37a7-6168-42c0-a657-dcff23fa3e03n@googlegroups.com> <7ce8fca5-3a58-4e9a-92b4-595d0a743c57n@googlegroups.com> <e2d01fcb-2cb1-4fb4-bf0e-b08d23ead140n@googlegroups.com> <jwvv95yhy1o.fsf-monnier+comp.arch@gnu.org> <sbcu9u$587$1@dont-email.me> <jwvfsx2hquz.fsf-monnier+comp.arch@gnu.org> <sbd1tr$jfl$1@dont-email.me> <jwvy2ath6c3.fsf-monnier+comp.arch@gnu.org> <1ca49244-3cb6-4cfe-aee8-332145c67674n@googlegroups.com> <35b24e3f-0b5d-42a1-a08a-cc4d8d77231fn@googlegroups.com> <jwvsg11gx8p.fsf-monnier+comp.arch@gnu.org>
In-Reply-To: <jwvsg11gx8p.fsf-monnier+comp.arch@gnu.org>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 28
Message-ID: <fOECI.1945$SMMb.1808@fx17.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Tue, 29 Jun 2021 12:58:51 UTC
Date: Tue, 29 Jun 2021 08:58:41 -0400
X-Received-Bytes: 2723
 by: EricP - Tue, 29 Jun 2021 12:58 UTC

Stefan Monnier wrote:
>
>> It is just populating the instruction register field with a value from
>> a register, so there is a double decode.
>
> The double decode is the least of your problems, I think.

Especially since that's the instruction's raison d'etre.

The performance cost is the pipeline stall between
"EXEC reg" decode and when the register value is available
to forward back to the Decode input instruction buffer.

If a branch predictor can predict that this 32-bit PC address
is next after that instruction PC, then a similar mechanism
could predict the contents of the "EXEC reg" register value
and provide it early to the Decoder.
However as with "JMP reg" the predicted register value
must later be checked against the actual register value,
with a mis-match triggering a mispredict rollback.

Since EXEC is intended to execute instructions that could
not have been predicted in advance, it is unlikely that an
EXEC predictor would be much help.
If something was intended for high performance,
you probably wouldn't be doing EXEC's in the first place.
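To illustrate the predict-then-check scheme, here is a Python sketch of a last-value predictor keyed by the PC of the EXEC. The last-value policy and the names `ExecPredictor` and `simulate` are assumptions made for this example; the post does not specify a predictor design.

```python
class ExecPredictor:
    """Last-value predictor: guesses the instruction an EXEC at a given PC will run."""

    def __init__(self):
        self.table = {}
        self.mispredicts = 0

    def predict(self, pc):
        # None means "no prediction available", i.e. fall back to stalling.
        return self.table.get(pc)

    def resolve(self, pc, predicted, actual):
        """Check the guess against the real register value; True means rollback."""
        self.table[pc] = actual
        rollback = predicted is not None and predicted != actual
        if rollback:
            self.mispredicts += 1
        return rollback


def simulate(pc, values):
    """Run one EXEC site over a sequence of register values; list the rollbacks."""
    predictor = ExecPredictor()
    rollbacks = []
    for actual in values:
        guess = predictor.predict(pc)
        rollbacks.append(predictor.resolve(pc, guess, actual))
    return rollbacks
```

Such a predictor only helps while the register value repeats; every change of instruction costs a rollback, which matches the point that EXEC targets are often exactly the unpredictable case.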

Re: Sequencer vs microcode

<2021Jun29.151032@mips.complang.tuwien.ac.at>

https://www.novabbs.com/devel/article-flat.php?id=18185&group=comp.arch#18185
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: ant...@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Sequencer vs microcode
Date: Tue, 29 Jun 2021 13:10:32 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 15
Message-ID: <2021Jun29.151032@mips.complang.tuwien.ac.at>
References: <sb1abl$ps8$1@dont-email.me> <3b2b37a7-6168-42c0-a657-dcff23fa3e03n@googlegroups.com> <7ce8fca5-3a58-4e9a-92b4-595d0a743c57n@googlegroups.com> <e2d01fcb-2cb1-4fb4-bf0e-b08d23ead140n@googlegroups.com> <jwvv95yhy1o.fsf-monnier+comp.arch@gnu.org> <sbcu9u$587$1@dont-email.me> <jwvfsx2hquz.fsf-monnier+comp.arch@gnu.org> <sbd1tr$jfl$1@dont-email.me> <jwvy2ath6c3.fsf-monnier+comp.arch@gnu.org> <1ca49244-3cb6-4cfe-aee8-332145c67674n@googlegroups.com>
Injection-Info: reader02.eternal-september.org; posting-host="c65cab040ca5bcb1b4afd0898903036f";
logging-data="13068"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/EQWzoXZBYCe7MI2/jbY+f"
Cancel-Lock: sha1:CbepzWbeyNk0QXOVBS94RMl8Y3A=
X-newsreader: xrn 10.00-beta-3
 by: Anton Ertl - Tue, 29 Jun 2021 13:10 UTC

MitchAlsup <MitchAlsup@aol.com> writes:
>What High Level Languages allow one to pass code to a subroutine that
>is not wrapped up in a subroutine ?

Forth.

But that's the wrong question. Architectures have a number of
features that are not directly reflected in high-level languages,
e.g., the carry flag (and Intel has added the ADX extension because
the existing carry flag was not enough).

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Sequencer vs microcode

<12fa6643-1457-48a2-afbf-973bbb8b78b6n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=18186&group=comp.arch#18186
X-Received: by 2002:ad4:5444:: with SMTP id h4mr31243985qvt.14.1624976410383;
Tue, 29 Jun 2021 07:20:10 -0700 (PDT)
X-Received: by 2002:a9d:d09:: with SMTP id 9mr4654959oti.16.1624976410137;
Tue, 29 Jun 2021 07:20:10 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Tue, 29 Jun 2021 07:20:09 -0700 (PDT)
In-Reply-To: <2021Jun29.151032@mips.complang.tuwien.ac.at>
Injection-Info: google-groups.googlegroups.com; posting-host=2607:fea8:1de1:d400:30e5:6d26:4cbb:f0ba;
posting-account=QId4bgoAAABV4s50talpu-qMcPp519Eb
NNTP-Posting-Host: 2607:fea8:1de1:d400:30e5:6d26:4cbb:f0ba
References: <sb1abl$ps8$1@dont-email.me> <3b2b37a7-6168-42c0-a657-dcff23fa3e03n@googlegroups.com>
<7ce8fca5-3a58-4e9a-92b4-595d0a743c57n@googlegroups.com> <e2d01fcb-2cb1-4fb4-bf0e-b08d23ead140n@googlegroups.com>
<jwvv95yhy1o.fsf-monnier+comp.arch@gnu.org> <sbcu9u$587$1@dont-email.me>
<jwvfsx2hquz.fsf-monnier+comp.arch@gnu.org> <sbd1tr$jfl$1@dont-email.me>
<jwvy2ath6c3.fsf-monnier+comp.arch@gnu.org> <1ca49244-3cb6-4cfe-aee8-332145c67674n@googlegroups.com>
<2021Jun29.151032@mips.complang.tuwien.ac.at>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <12fa6643-1457-48a2-afbf-973bbb8b78b6n@googlegroups.com>
Subject: Re: Sequencer vs microcode
From: robfi...@gmail.com (robf...@gmail.com)
Injection-Date: Tue, 29 Jun 2021 14:20:10 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
 by: robf...@gmail.com - Tue, 29 Jun 2021 14:20 UTC

On Tuesday, June 29, 2021 at 9:58:54 AM UTC-4, Anton Ertl wrote:
> MitchAlsup <Mitch...@aol.com> writes:
> >What High Level Languages allow one to pass code to a subroutine that
> >is not wrapped up in a subroutine ?
> Forth.
>
> But that's the wrong question. Architectures have a number of
> features that are not directly reflected in high-level languages,
> e.g., the carry flag (and Intel has added the ADX extension because
> the existing carry flag was not enough).
> - anton
> --
> 'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
> Mitch Alsup, <c17fcd89-f024-40e7...@googlegroups.com>

I believe (IIRC) the Clipper DBASE3/4 compiler also allows code to be passed to a subroutine.

Appreciating that EXEC is not a high-performance instruction, I am still tempted to include it, as the way it is implemented is not high cost and I have transistors to burn. XOR is not a very useful instruction either, but many machines include it. EXEC may allow 64-bit instructions to execute in a kludgy manner.
Since EXEC uses a 2R instruction format, it could conditionally execute based on the value in a second register. That could allow predicated execution of any instruction. Getting a vector exec turns out not to have additional cost. It may be an idea to have a vector exec abort instruction. Vector execute is interesting because it could be made dependent on a mask register setting. Different combinations of instructions could be executed based on the mask value.
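One possible reading of the predicated and masked forms, as a Python sketch. The semantics here are assumptions (a nonzero second register enables execution, and bit i of the mask enables lane i), and `exec_if` and `vector_exec` are invented names; a callable again stands in for the instruction word.

```python
def exec_if(pred, instr, state):
    """Predicated EXEC: apply `instr` to `state` only when `pred` is nonzero."""
    return instr(state) if pred else state


def vector_exec(mask, instr, lanes):
    """Masked vector EXEC: lane i is updated only if bit i of `mask` is set."""
    return [instr(x) if (mask >> i) & 1 else x for i, x in enumerate(lanes)]
```

For example, `vector_exec(0b0101, lambda x: x + 10, [1, 2, 3, 4])` updates lanes 0 and 2 and leaves the others unchanged.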

Exec (was: Sequencer vs microcode)

<2021Jun29.155912@mips.complang.tuwien.ac.at>

https://www.novabbs.com/devel/article-flat.php?id=18187&group=comp.arch#18187
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: ant...@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Exec (was: Sequencer vs microcode)
Date: Tue, 29 Jun 2021 13:59:12 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 57
Message-ID: <2021Jun29.155912@mips.complang.tuwien.ac.at>
References: <sb1abl$ps8$1@dont-email.me> <7ce8fca5-3a58-4e9a-92b4-595d0a743c57n@googlegroups.com> <e2d01fcb-2cb1-4fb4-bf0e-b08d23ead140n@googlegroups.com> <jwvv95yhy1o.fsf-monnier+comp.arch@gnu.org> <sbcu9u$587$1@dont-email.me> <jwvfsx2hquz.fsf-monnier+comp.arch@gnu.org> <sbd1tr$jfl$1@dont-email.me> <jwvy2ath6c3.fsf-monnier+comp.arch@gnu.org> <1ca49244-3cb6-4cfe-aee8-332145c67674n@googlegroups.com> <35b24e3f-0b5d-42a1-a08a-cc4d8d77231fn@googlegroups.com>
Injection-Info: reader02.eternal-september.org; posting-host="c65cab040ca5bcb1b4afd0898903036f";
logging-data="23968"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18IyqOQwyd6nBi137tcH6O8"
Cancel-Lock: sha1:8Eqi7gQnV7bAckPUwLEGBup5uNE=
X-newsreader: xrn 10.00-beta-3
 by: Anton Ertl - Tue, 29 Jun 2021 13:59 UTC

"robf...@gmail.com" <robfi680@gmail.com> writes:
>Suppose there is a text search driven by user input. The first time in the
>user wants to find all letters < ‘a’ so a set less than ‘a’ instruction is
>passed in a register to the text search function. The next time in the user
>want to search for letters greater than ‘g’. So a greater than ‘g’
>instruction is passed in a register to the text search function. In the text
>search function all it does is ‘exec Rn’ where Rn is the register containing
>the instruction. Without an exec instruction either self modifying code
>(JIT) must be used to get the same performance or a switch statement with an
>option set by the user passed in.

The classic approach is to pass the address of a subroutine; that
could contain

slt r1,r1,r2
ret

or

slt r1,r2,r1
ret
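
As a rough C-level illustration of the subroutine-address approach (a sketch only; the names `char_pred`, `less_than`, `greater_than` and `count_matches` are made up for this example and do not come from the thread), the search routine takes a pointer to a tiny comparison function instead of an instruction word:

```c
#include <stddef.h>

/* Hypothetical predicate type: nonzero when character c matches against key. */
typedef int (*char_pred)(unsigned char c, unsigned char key);

/* Each predicate is the C analogue of a one-instruction "slt ...; ret" stub. */
int less_than(unsigned char c, unsigned char key)    { return c < key; }
int greater_than(unsigned char c, unsigned char key) { return c > key; }

/* Count the characters in text[0..len) for which pred(c, key) holds. */
size_t count_matches(const char *text, size_t len,
                     char_pred pred, unsigned char key)
{
    size_t hits = 0;
    for (size_t i = 0; i < len; i++) {
        if (pred((unsigned char)text[i], key))   /* indirect call per character */
            hits++;
    }
    return hits;
}
```

A caller would pass `less_than` with key `'a'` on the first search and `greater_than` with key `'g'` on the second; the search loop itself never changes.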

>The switch statement because it is branches is several times slower than exec.

The branches are predicted and may cost very little. But the indirect
call tends to cost even less.
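
For comparison, the switch-statement variant from the quoted text might look like the following C sketch. The enum `cmp_op` and the function name are assumptions made for illustration. Because the option does not change during one search, the switch takes the same arm on every iteration, which is the situation where a branch predictor is expected to do well:

```c
#include <stddef.h>

/* Hypothetical operation codes selected by the user. */
enum cmp_op { CMP_LT, CMP_GT };

/* Count the characters in text[0..len) that satisfy "c op key". */
size_t count_matches_switch(const char *text, size_t len,
                            enum cmp_op op, unsigned char key)
{
    size_t hits = 0;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)text[i];
        int match = 0;
        switch (op) {            /* same arm every iteration for a given search */
        case CMP_LT: match = c < key; break;
        case CMP_GT: match = c > key; break;
        }
        hits += (size_t)match;
    }
    return hits;
}
```

Whether this or an indirect call is faster on a given machine is something to measure rather than assume; the sketch only shows the code shape under discussion.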

>EXEC is a challenge to implement in a classic pipelined processor, but in a
>processor using an out-of-order instruction queue it is a bit easier to do.

On an OoO CPU it is a lot harder. You need to feed data from the
execution unit back into the front end. It introduces arbitrary
source and destination registers, so on front ends with a register
renamer the front end cannot proceed past the exec until the data is
here. I expect that the value-passing style of OoO engine also has
its difficulties with inserting arbitrary register reads and writes,
and possibly memory reads and writes somewhere in the middle of the
OoO part of the engine.

I think the fastest way to implement exec on both kinds of OoO CPUs is
like what we do for branches (the other place where we need to feed
back from the OoO part to the front end): have a predictor, and check
at commit whether the prediction was correct. If it was not, throw
away everything after the exec, change the predicted instruction, and
start again from the exec.
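
The predict-then-check-at-commit idea can be modelled in a few lines of C. This is a toy software model under stated assumptions, not a description of any real front end: the table size, the PC hashing, and the last-value prediction policy are all invented for the sketch.

```c
#include <stdint.h>

#define PRED_ENTRIES 64u              /* hypothetical table size */

struct exec_predictor {
    uint64_t predicted[PRED_ENTRIES]; /* last instruction word seen per slot */
    unsigned flushes;                 /* mispredictions detected at commit */
};

/* Map the address of the exec instruction to a table slot (assumed hash). */
unsigned exec_slot(uint64_t pc)
{
    return (unsigned)((pc >> 2) % PRED_ENTRIES);
}

/* Front end: guess which instruction word the exec at pc will supply. */
uint64_t exec_predict(const struct exec_predictor *p, uint64_t pc)
{
    return p->predicted[exec_slot(pc)];
}

/* Commit: compare the guess with the actual register value.
 * Returns 1 if the guess was right. On a mismatch it counts a flush
 * (everything after the exec would be thrown away) and retrains the slot. */
int exec_commit(struct exec_predictor *p, uint64_t pc,
                uint64_t guess, uint64_t actual)
{
    if (guess == actual)
        return 1;
    p->flushes++;
    p->predicted[exec_slot(pc)] = actual;
    return 0;
}
```

With a zero-initialized predictor, the first exec at a given pc mispredicts once and later execs of the same instruction word at that pc are predicted correctly, which mirrors how a simple last-target indirect branch predictor behaves.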

But does it pay to have exec and yet another predictor, or is it
better to invest the effort in faster indirect call and return (e.g.,
a larger indirect branch predictor)?

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Exec (was: Sequencer vs microcode)

<2021Jun29.162615@mips.complang.tuwien.ac.at>

https://www.novabbs.com/devel/article-flat.php?id=18188&group=comp.arch#18188
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: ant...@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Exec (was: Sequencer vs microcode)
Date: Tue, 29 Jun 2021 14:26:15 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 12
Message-ID: <2021Jun29.162615@mips.complang.tuwien.ac.at>
References: <sb1abl$ps8$1@dont-email.me> <3b2b37a7-6168-42c0-a657-dcff23fa3e03n@googlegroups.com> <7ce8fca5-3a58-4e9a-92b4-595d0a743c57n@googlegroups.com> <e2d01fcb-2cb1-4fb4-bf0e-b08d23ead140n@googlegroups.com> <jwvv95yhy1o.fsf-monnier+comp.arch@gnu.org> <sbcu9u$587$1@dont-email.me> <jwvfsx2hquz.fsf-monnier+comp.arch@gnu.org> <sbd1tr$jfl$1@dont-email.me> <jwvy2ath6c3.fsf-monnier+comp.arch@gnu.org> <sbeig1$o01$1@dont-email.me> <18641964-95f3-426a-a318-470ef6cfcd47n@googlegroups.com>
Injection-Info: reader02.eternal-september.org; posting-host="c65cab040ca5bcb1b4afd0898903036f";
logging-data="23968"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18vPg3EovUnXYYFWNSYHXwP"
Cancel-Lock: sha1:5nC7JylDk+OK1OuVyfJKLwUJ+NU=
X-newsreader: xrn 10.00-beta-3
 by: Anton Ertl - Tue, 29 Jun 2021 14:26 UTC

"robf...@gmail.com" <robfi680@gmail.com> writes:
>That would make it horrendously slow if it turns out to be a branch, but
>why would one EXEC a branch anyway?

You want to pass code that consists of more instructions than you can
pass directly to EXEC, so you pass a call to the instruction sequence
(96% of the uses of EXECUTE in Forth).

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Sequencer vs microcode

<jwvczs4heyw.fsf-monnier+comp.arch@gnu.org>

https://www.novabbs.com/devel/article-flat.php?id=18189&group=comp.arch#18189
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: monn...@iro.umontreal.ca (Stefan Monnier)
Newsgroups: comp.arch
Subject: Re: Sequencer vs microcode
Date: Tue, 29 Jun 2021 11:25:09 -0400
Organization: A noiseless patient Spider
Lines: 9
Message-ID: <jwvczs4heyw.fsf-monnier+comp.arch@gnu.org>
References: <sb1abl$ps8$1@dont-email.me> <sb553t$kta$1@dont-email.me>
<sb5a40$o97$1@dont-email.me>
<42b654db-a022-42f3-a6aa-94cd5a88de98n@googlegroups.com>
<53e78bb8-5c9c-481e-8cec-50a3febcbe95n@googlegroups.com>
<sb75hs$7nq$1@dont-email.me>
<3050904a-eec2-4d47-ba5d-bfed1d774518n@googlegroups.com>
<3d37be3c-e4fd-41b3-93f7-7037a3c19aefn@googlegroups.com>
<e5IBI.683979$ST2.397427@fx47.iad>
<8b1baea3-a802-4341-8afd-d827e4eee948n@googlegroups.com>
<3b2b37a7-6168-42c0-a657-dcff23fa3e03n@googlegroups.com>
<7ce8fca5-3a58-4e9a-92b4-595d0a743c57n@googlegroups.com>
<e2d01fcb-2cb1-4fb4-bf0e-b08d23ead140n@googlegroups.com>
<jwvv95yhy1o.fsf-monnier+comp.arch@gnu.org>
<sbcu9u$587$1@dont-email.me>
<jwvfsx2hquz.fsf-monnier+comp.arch@gnu.org>
<sbd1tr$jfl$1@dont-email.me>
<jwvy2ath6c3.fsf-monnier+comp.arch@gnu.org>
<1ca49244-3cb6-4cfe-aee8-332145c67674n@googlegroups.com>
<35b24e3f-0b5d-42a1-a08a-cc4d8d77231fn@googlegroups.com>
<mZwCI.366532$jf1.289886@fx37.iad>
Mime-Version: 1.0
Content-Type: text/plain
Injection-Info: reader02.eternal-september.org; posting-host="d5b28953779022f435cfca735a21e3c7";
logging-data="13178"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+twr/2c4lgaSuQVr+D9z8B"
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)
Cancel-Lock: sha1:MNsYlfSx4JqGWTgVg3KbuKFUzfc=
sha1:ww9Mpg93NtMQCJX9lF57E6ZGYsM=
 by: Stefan Monnier - Tue, 29 Jun 2021 15:25 UTC

> Decode has to have a way to stall when it encounters things like "JMP reg"
> and it does not know what reg value is.

No, AFAIK it doesn't.
Instead it uses a BTB to predict the destination of the jump, otherwise
such indirect jumps would be much too slow.

Stefan

Re: Sequencer vs microcode

<jwv7dichevp.fsf-monnier+comp.arch@gnu.org>

https://www.novabbs.com/devel/article-flat.php?id=18190&group=comp.arch#18190
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: monn...@iro.umontreal.ca (Stefan Monnier)
Newsgroups: comp.arch
Subject: Re: Sequencer vs microcode
Date: Tue, 29 Jun 2021 11:29:02 -0400
Organization: A noiseless patient Spider
Lines: 12
Message-ID: <jwv7dichevp.fsf-monnier+comp.arch@gnu.org>
References: <sb1abl$ps8$1@dont-email.me>
<cc7766bb-1433-45f4-9ba5-93c40bdab0f7n@googlegroups.com>
<sb553t$kta$1@dont-email.me> <sb5a40$o97$1@dont-email.me>
<42b654db-a022-42f3-a6aa-94cd5a88de98n@googlegroups.com>
<53e78bb8-5c9c-481e-8cec-50a3febcbe95n@googlegroups.com>
<sb75hs$7nq$1@dont-email.me>
<3050904a-eec2-4d47-ba5d-bfed1d774518n@googlegroups.com>
<3d37be3c-e4fd-41b3-93f7-7037a3c19aefn@googlegroups.com>
<e5IBI.683979$ST2.397427@fx47.iad>
<8b1baea3-a802-4341-8afd-d827e4eee948n@googlegroups.com>
<3b2b37a7-6168-42c0-a657-dcff23fa3e03n@googlegroups.com>
<7ce8fca5-3a58-4e9a-92b4-595d0a743c57n@googlegroups.com>
<e2d01fcb-2cb1-4fb4-bf0e-b08d23ead140n@googlegroups.com>
<jwvv95yhy1o.fsf-monnier+comp.arch@gnu.org>
<sbcu9u$587$1@dont-email.me>
<jwvfsx2hquz.fsf-monnier+comp.arch@gnu.org>
<sbd1tr$jfl$1@dont-email.me>
<jwvy2ath6c3.fsf-monnier+comp.arch@gnu.org>
<sbeig1$o01$1@dont-email.me>
<18641964-95f3-426a-a318-470ef6cfcd47n@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain
Injection-Info: reader02.eternal-september.org; posting-host="d5b28953779022f435cfca735a21e3c7";
logging-data="13178"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX185EIYNc24u1RI3WK1u+EYs"
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)
Cancel-Lock: sha1:b+PL2KeNR2wfsDb2UnMXsnpysWk=
sha1:DdYR1Cu7pfkRGWe1eGyWkA3hKv4=
 by: Stefan Monnier - Tue, 29 Jun 2021 15:29 UTC

> There is compute time overhead with JIT.

But if it's executed enough times, this overhead is lost in the noise.
And if it's not executed enough times, then it's usually not
performance-critical.

Also most of the "overhead" of JIT is usually in the time to generate
the code (and that time would be the same for EXEC), rather than in the
synchronization with the I$/fetch/decode.

Stefan

Re: Exec

<jwv1r8khepo.fsf-monnier+comp.arch@gnu.org>

https://www.novabbs.com/devel/article-flat.php?id=18191&group=comp.arch#18191
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: monn...@iro.umontreal.ca (Stefan Monnier)
Newsgroups: comp.arch
Subject: Re: Exec
Date: Tue, 29 Jun 2021 11:30:21 -0400
Organization: A noiseless patient Spider
Lines: 11
Message-ID: <jwv1r8khepo.fsf-monnier+comp.arch@gnu.org>
References: <sb1abl$ps8$1@dont-email.me>
<3b2b37a7-6168-42c0-a657-dcff23fa3e03n@googlegroups.com>
<7ce8fca5-3a58-4e9a-92b4-595d0a743c57n@googlegroups.com>
<e2d01fcb-2cb1-4fb4-bf0e-b08d23ead140n@googlegroups.com>
<jwvv95yhy1o.fsf-monnier+comp.arch@gnu.org>
<sbcu9u$587$1@dont-email.me>
<jwvfsx2hquz.fsf-monnier+comp.arch@gnu.org>
<sbd1tr$jfl$1@dont-email.me>
<jwvy2ath6c3.fsf-monnier+comp.arch@gnu.org>
<sbeig1$o01$1@dont-email.me>
<18641964-95f3-426a-a318-470ef6cfcd47n@googlegroups.com>
<2021Jun29.162615@mips.complang.tuwien.ac.at>
Mime-Version: 1.0
Content-Type: text/plain
Injection-Info: reader02.eternal-september.org; posting-host="d5b28953779022f435cfca735a21e3c7";
logging-data="13178"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19asYMwt6+zHnKjlfilclWp"
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)
Cancel-Lock: sha1:yJxyERfWKKSu4NxHGQZy7YbpCG4=
sha1:UG6xtgXLrdRGrwNWwb3aMiEnL5k=
 by: Stefan Monnier - Tue, 29 Jun 2021 15:30 UTC

>>That would make it horrendously slow if it turns out to be a branch, but
>>why would one EXEC a branch anyway?
> You want to pass code that consists of more instructions than you can
> pass directly to EXEC, so you pass a call to the instruction sequence
> (96% of the uses of EXECUTE in Forth).

That's just an indirect function call (i.e. "JAL reg"), no need for an
EXEC instruction in the CPU for that.

Stefan

Re: Sequencer vs microcode

<weHCI.1947$SMMb.866@fx17.iad>

https://www.novabbs.com/devel/article-flat.php?id=18192&group=comp.arch#18192
Path: i2pn2.org!i2pn.org!paganini.bofh.team!news.dns-netz.com!news.freedyn.net!newsfeed.xs4all.nl!newsfeed9.news.xs4all.nl!news-out.netnews.com!news.alt.net!fdc3.netnews.com!peer02.ams1!peer.ams1.xlned.com!news.xlned.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx17.iad.POSTED!not-for-mail
From: ThatWoul...@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: Sequencer vs microcode
References: <sb1abl$ps8$1@dont-email.me> <sb5a40$o97$1@dont-email.me> <42b654db-a022-42f3-a6aa-94cd5a88de98n@googlegroups.com> <53e78bb8-5c9c-481e-8cec-50a3febcbe95n@googlegroups.com> <sb75hs$7nq$1@dont-email.me> <3050904a-eec2-4d47-ba5d-bfed1d774518n@googlegroups.com> <3d37be3c-e4fd-41b3-93f7-7037a3c19aefn@googlegroups.com> <e5IBI.683979$ST2.397427@fx47.iad> <8b1baea3-a802-4341-8afd-d827e4eee948n@googlegroups.com> <3b2b37a7-6168-42c0-a657-dcff23fa3e03n@googlegroups.com> <7ce8fca5-3a58-4e9a-92b4-595d0a743c57n@googlegroups.com> <e2d01fcb-2cb1-4fb4-bf0e-b08d23ead140n@googlegroups.com> <jwvv95yhy1o.fsf-monnier+comp.arch@gnu.org> <sbcu9u$587$1@dont-email.me> <jwvfsx2hquz.fsf-monnier+comp.arch@gnu.org> <sbd1tr$jfl$1@dont-email.me> <jwvy2ath6c3.fsf-monnier+comp.arch@gnu.org> <1ca49244-3cb6-4cfe-aee8-332145c67674n@googlegroups.com> <35b24e3f-0b5d-42a1-a08a-cc4d8d77231fn@googlegroups.com> <mZwCI.366532$jf1.289886@fx37.iad> <jwvczs4heyw.fsf-monnier+comp.arch@gnu.org>
In-Reply-To: <jwvczs4heyw.fsf-monnier+comp.arch@gnu.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 17
Message-ID: <weHCI.1947$SMMb.866@fx17.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Tue, 29 Jun 2021 15:45:32 UTC
Date: Tue, 29 Jun 2021 11:45:07 -0400
X-Received-Bytes: 2242
 by: EricP - Tue, 29 Jun 2021 15:45 UTC

Stefan Monnier wrote:
>> Decode has to have a way to stall when it encounters things like "JMP reg"
>> and it does not know what reg value is.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

> No, AFAIK it doesn't.
> Instead it uses a BTB to predict the destination of the jump, otherwise
> such indirect jumps would be much too slow.
>
>
> Stefan

The predictor MAY supply such a value, which MAY be correct.
However the design must deal with the possibility that the predictor
responds with "no hit", or "hit" but provides the wrong value.
Hit with the correct value is just gravy.
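
The three outcomes listed above can be made concrete with a small C model of a direct-mapped BTB. This is a simplified sketch: the entry layout, the table size and the always-retrain policy are assumptions for illustration, not a claim about any particular core.

```c
#include <stdint.h>

#define BTB_ENTRIES 16u   /* hypothetical size */

enum btb_outcome { BTB_NO_HIT, BTB_HIT_WRONG, BTB_HIT_CORRECT };

struct btb_entry { int valid; uint64_t tag; uint64_t target; };
struct btb       { struct btb_entry e[BTB_ENTRIES]; };

/* Resolve an indirect jump at pc whose real destination turned out to be
 * `actual`: classify what the predictor would have said, then retrain it. */
enum btb_outcome btb_resolve(struct btb *b, uint64_t pc, uint64_t actual)
{
    struct btb_entry *e = &b->e[(pc >> 2) % BTB_ENTRIES];
    enum btb_outcome out;

    if (!e->valid || e->tag != pc)
        out = BTB_NO_HIT;          /* front end has nothing to fetch from */
    else if (e->target != actual)
        out = BTB_HIT_WRONG;       /* fetched down the wrong path, must squash */
    else
        out = BTB_HIT_CORRECT;     /* the "gravy" case */

    e->valid  = 1;                 /* train with the resolved target */
    e->tag    = pc;
    e->target = actual;
    return out;
}
```

The design point is that the pipeline needs a defined behaviour for all three return values; only the last one lets fetch continue without a stall or a squash.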

Re: Sequencer vs microcode

<6885d093-3e35-4c5b-8485-182ad8494462n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=18196&group=comp.arch#18196
X-Received: by 2002:a0c:eb0c:: with SMTP id j12mr32242112qvp.3.1624987336100;
Tue, 29 Jun 2021 10:22:16 -0700 (PDT)
X-Received: by 2002:a9d:d09:: with SMTP id 9mr5338472oti.16.1624987335875;
Tue, 29 Jun 2021 10:22:15 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Tue, 29 Jun 2021 10:22:15 -0700 (PDT)
In-Reply-To: <35b24e3f-0b5d-42a1-a08a-cc4d8d77231fn@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:1a6:66b6:1520:df39;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:1a6:66b6:1520:df39
References: <sb1abl$ps8$1@dont-email.me> <cc7766bb-1433-45f4-9ba5-93c40bdab0f7n@googlegroups.com>
<sb553t$kta$1@dont-email.me> <sb5a40$o97$1@dont-email.me> <42b654db-a022-42f3-a6aa-94cd5a88de98n@googlegroups.com>
<53e78bb8-5c9c-481e-8cec-50a3febcbe95n@googlegroups.com> <sb75hs$7nq$1@dont-email.me>
<3050904a-eec2-4d47-ba5d-bfed1d774518n@googlegroups.com> <3d37be3c-e4fd-41b3-93f7-7037a3c19aefn@googlegroups.com>
<e5IBI.683979$ST2.397427@fx47.iad> <8b1baea3-a802-4341-8afd-d827e4eee948n@googlegroups.com>
<3b2b37a7-6168-42c0-a657-dcff23fa3e03n@googlegroups.com> <7ce8fca5-3a58-4e9a-92b4-595d0a743c57n@googlegroups.com>
<e2d01fcb-2cb1-4fb4-bf0e-b08d23ead140n@googlegroups.com> <jwvv95yhy1o.fsf-monnier+comp.arch@gnu.org>
<sbcu9u$587$1@dont-email.me> <jwvfsx2hquz.fsf-monnier+comp.arch@gnu.org>
<sbd1tr$jfl$1@dont-email.me> <jwvy2ath6c3.fsf-monnier+comp.arch@gnu.org>
<1ca49244-3cb6-4cfe-aee8-332145c67674n@googlegroups.com> <35b24e3f-0b5d-42a1-a08a-cc4d8d77231fn@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <6885d093-3e35-4c5b-8485-182ad8494462n@googlegroups.com>
Subject: Re: Sequencer vs microcode
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Tue, 29 Jun 2021 17:22:16 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
 by: MitchAlsup - Tue, 29 Jun 2021 17:22 UTC

On Monday, June 28, 2021 at 10:18:07 PM UTC-5, robf...@gmail.com wrote:
> On Monday, June 28, 2021 at 8:40:39 PM UTC-4, MitchAlsup wrote:
> > On Monday, June 28, 2021 at 7:31:13 PM UTC-5, Stefan Monnier wrote:
> > > >> What makes it so "very cheap"er than a function call using normal
> > > >> instructions
> > > > It reduces the "overhead" of the function call by half. One instruction
> > > > (the EXC) versus two (the Call and the Return).
> > > I don't think you can count it quite this way except maybe for
> > > naive implementations, since "jump&link" and "ret" are typically both
> > > "performed" by the branch prediction (and then later actually executed,
> > > admittedly, but this should usually use an otherwise unused execution
> > > unit so it's quite cheap).
> > >
> > > Also those gains can easily be dwarfed by the potential difficulty to
> > > keep the fetch unit fed when encountering an EXEC (i.e adjusting the
> > > branch prediction to do something useful).
> > > >> and why/when would I need it?
> > > > You never "need" it, as there are other, albeit more costly, ways to
> > > > accomplish the same thing, but it is sometimes useful.
> > <
> > > The question is what are real cases where the presumably lower cost
> > > can/could be used to get a measurably higher performance than using the
> > > usual implementation&optimization strategies?
> > <
> > What High Level Languages allow one to pass code to a subroutine that
> > is not wrapped up in a subroutine ?
> > >
> > >
> > > Stefan
> Suppose there is a text search driven by user input. The first time in the user wants to find all letters < ‘a’ so a set less than ‘a’ instruction is passed in a register to the text search function. The next time in the user want to search for letters greater than ‘g’. So a greater than ‘g’ instruction is passed in a register to the text search function.
<
Now imagine the case where the caller passes the "clear all registers" instruction..............
<
>In the text search function all it does is ‘exec Rn’ where Rn is the register containing the instruction. Without an exec instruction either self modifying code (JIT) must be used to get the same performance or a switch statement with an option set by the user passed in. The switch statement because it is branches is several times slower than exec. It may not make a difference in many programs, but sometimes if the code is called millions of times for instance drawing graphics on-screen it can make a difference.
>
> EXEC is a challenge to implement in a classic pipelined processor, but in a processor using an out-of-order instruction queue it is a bit easier to do.
<
In OoO it is even HARDER to do. Imagine that the instruction to be fetched takes a data cache miss
whose address is dependent on an integer DIVIDE instruction. You don't have the instruction for 50-70 cycles, so you basically sit there issuing nothing until the instruction to EXC shows up.
<
>EXEC does not really affect the branch logic. It is not actually doing a fast subroutine call / return. It is just populating the instruction register field with a value from a register, so there is a double decode. To get the double decode in a classic pipelined processor the easiest approach may be to perform a hidden branch back to the same instruction after feeding a pipeline IR register with the value from a register.
> After the first decode it is treated as an ordinary instruction which may or may not be a branch instruction. There are ways to avoid the double decode, but it increases the size of the execute logic.

Re: Sequencer vs microcode

<9566e657-f481-423d-bde2-56f78b347c89n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=18198&group=comp.arch#18198
X-Received: by 2002:a05:620a:205d:: with SMTP id d29mr8129870qka.296.1624987736149; Tue, 29 Jun 2021 10:28:56 -0700 (PDT)
X-Received: by 2002:a05:6808:14c8:: with SMTP id f8mr22514613oiw.7.1624987735926; Tue, 29 Jun 2021 10:28:55 -0700 (PDT)
Path: i2pn2.org!i2pn.org!paganini.bofh.team!news.dns-netz.com!news.freedyn.net!newsfeed.xs4all.nl!newsfeed7.news.xs4all.nl!tr1.eu1.usenetexpress.com!feeder.usenetexpress.com!tr3.iad1.usenetexpress.com!border1.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Tue, 29 Jun 2021 10:28:55 -0700 (PDT)
In-Reply-To: <12fa6643-1457-48a2-afbf-973bbb8b78b6n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:1a6:66b6:1520:df39; posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:1a6:66b6:1520:df39
References: <sb1abl$ps8$1@dont-email.me> <3b2b37a7-6168-42c0-a657-dcff23fa3e03n@googlegroups.com> <7ce8fca5-3a58-4e9a-92b4-595d0a743c57n@googlegroups.com> <e2d01fcb-2cb1-4fb4-bf0e-b08d23ead140n@googlegroups.com> <jwvv95yhy1o.fsf-monnier+comp.arch@gnu.org> <sbcu9u$587$1@dont-email.me> <jwvfsx2hquz.fsf-monnier+comp.arch@gnu.org> <sbd1tr$jfl$1@dont-email.me> <jwvy2ath6c3.fsf-monnier+comp.arch@gnu.org> <1ca49244-3cb6-4cfe-aee8-332145c67674n@googlegroups.com> <2021Jun29.151032@mips.complang.tuwien.ac.at> <12fa6643-1457-48a2-afbf-973bbb8b78b6n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <9566e657-f481-423d-bde2-56f78b347c89n@googlegroups.com>
Subject: Re: Sequencer vs microcode
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Tue, 29 Jun 2021 17:28:56 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Lines: 48
 by: MitchAlsup - Tue, 29 Jun 2021 17:28 UTC

On Tuesday, June 29, 2021 at 9:20:11 AM UTC-5, robf...@gmail.com wrote:
> On Tuesday, June 29, 2021 at 9:58:54 AM UTC-4, Anton Ertl wrote:
> > MitchAlsup <Mitch...@aol.com> writes:
> > >What High Level Languages allow one to pass code to a subroutine that
> > >is not wrapped up in a subroutine ?
> > Forth.
> >
> > But that's the wrong question. Architectures have a number of
> > features that are not directly reflected in high-level languages,
> > e.g., the carry flag (and Intel has added the ADX extension because
> > the existing carry flag was not enough).
> > - anton
> > --
> > 'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
> > Mitch Alsup, <c17fcd89-f024-40e7...@googlegroups.com>
> I believe also IIRC the Clipper DBASE3/4 compiler also allows code to pass to a subroutine.
>
> Appreciating that EXEC is not high-performance instruction I am still tempted to include it, as the way it is implemented is not high cost and I have transistors to burn. XOR is not a very useful instruction either,
<
On the contrary, it is very useful, and exceedingly inexpensive to implement.
<
if( state & SOME_STATE )
    state ^= SOME_STATE | SOME_NEXT_STATE;
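
A self-contained version of that idiom, with made-up bit assignments (the values 0x1 and 0x2 and the function name `advance` are assumptions for the example), shows what the single XOR does: it clears the current-state bit and sets the next-state bit in one operation, leaving all other bits alone.

```c
#include <stdint.h>

#define SOME_STATE      0x1u   /* hypothetical bit assignments */
#define SOME_NEXT_STATE 0x2u

/* If SOME_STATE is set, flip both bits: SOME_STATE goes 1 -> 0 and
 * SOME_NEXT_STATE is toggled. Unrelated bits are untouched. */
uint32_t advance(uint32_t state)
{
    if (state & SOME_STATE)
        state ^= SOME_STATE | SOME_NEXT_STATE;
    return state;
}
```

Note that the XOR toggles rather than sets, so the idiom relies on SOME_NEXT_STATE being clear whenever SOME_STATE is set; with both bits set on entry, both come out clear.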
<
>but many machines include it. EXEC may allow 64-bit instructions to execute in a kludgy manner.
<
What is the semantic if you wanted to EXEC a register and the instruction needed 96-bits,
128-bits, or 160-bits ??
<
> Since EXEC uses a 2R instruction format it could conditionally execute based on the value in a second register. Could allow predicated execution of any instruction.
<
There are much better and safer means to predicate an instruction--which your ISA should
already have.
<
>Getting a vector exec turns out not to have additional cost. It may be an idea to have a vector exec abort instruction. Vector, execute is interesting because it could be made dependent on a mask register setting. Different combinations of instructions could be executed based on the mask value.

Re: Exec

<2021Jun29.192608@mips.complang.tuwien.ac.at>

https://www.novabbs.com/devel/article-flat.php?id=18199&group=comp.arch#18199
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: ant...@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Exec
Date: Tue, 29 Jun 2021 17:26:08 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 19
Message-ID: <2021Jun29.192608@mips.complang.tuwien.ac.at>
References: <sb1abl$ps8$1@dont-email.me> <e2d01fcb-2cb1-4fb4-bf0e-b08d23ead140n@googlegroups.com> <jwvv95yhy1o.fsf-monnier+comp.arch@gnu.org> <sbcu9u$587$1@dont-email.me> <jwvfsx2hquz.fsf-monnier+comp.arch@gnu.org> <sbd1tr$jfl$1@dont-email.me> <jwvy2ath6c3.fsf-monnier+comp.arch@gnu.org> <sbeig1$o01$1@dont-email.me> <18641964-95f3-426a-a318-470ef6cfcd47n@googlegroups.com> <2021Jun29.162615@mips.complang.tuwien.ac.at> <jwv1r8khepo.fsf-monnier+comp.arch@gnu.org>
Injection-Info: reader02.eternal-september.org; posting-host="c65cab040ca5bcb1b4afd0898903036f";
logging-data="25206"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/nswplV97hyYnGEZx/FJzM"
Cancel-Lock: sha1:hl+K3BLStEGOK9o1uymVdRRndQQ=
X-newsreader: xrn 10.00-beta-3
 by: Anton Ertl - Tue, 29 Jun 2021 17:26 UTC

Stefan Monnier <monnier@iro.umontreal.ca> writes:
>>>That would make it horrendously slow if it turns out to be a branch, but
>>>why would one EXEC a branch anyway?
>> You want to pass code that consists of more instructions than you can
>> pass directly to EXEC, so you pass a call to the instruction sequence
>> (96% of the uses of EXECUTE in Forth).
>
>That's just an indirect function call (i.e. "JAL reg"), no need for an
>EXEC instruction in the CPU for that.

If you only want to do indirect calls, yes. If you also want to be
able to execute other instructions, you need EXEC. Or you need to
store these other instructions in memory followed by a return (and the
usual cache synchronization dance).

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Exec

<42430dc6-1410-440b-8009-b47e15b5541cn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=18204&group=comp.arch#18204
X-Received: by 2002:a05:622a:11cf:: with SMTP id n15mr2872637qtk.256.1624990235239;
Tue, 29 Jun 2021 11:10:35 -0700 (PDT)
X-Received: by 2002:a05:6830:1643:: with SMTP id h3mr5772541otr.76.1624990235035;
Tue, 29 Jun 2021 11:10:35 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Tue, 29 Jun 2021 11:10:34 -0700 (PDT)
In-Reply-To: <2021Jun29.192608@mips.complang.tuwien.ac.at>
Injection-Info: google-groups.googlegroups.com; posting-host=2607:fea8:1de1:d400:30e5:6d26:4cbb:f0ba;
posting-account=QId4bgoAAABV4s50talpu-qMcPp519Eb
NNTP-Posting-Host: 2607:fea8:1de1:d400:30e5:6d26:4cbb:f0ba
References: <sb1abl$ps8$1@dont-email.me> <e2d01fcb-2cb1-4fb4-bf0e-b08d23ead140n@googlegroups.com>
<jwvv95yhy1o.fsf-monnier+comp.arch@gnu.org> <sbcu9u$587$1@dont-email.me>
<jwvfsx2hquz.fsf-monnier+comp.arch@gnu.org> <sbd1tr$jfl$1@dont-email.me>
<jwvy2ath6c3.fsf-monnier+comp.arch@gnu.org> <sbeig1$o01$1@dont-email.me>
<18641964-95f3-426a-a318-470ef6cfcd47n@googlegroups.com> <2021Jun29.162615@mips.complang.tuwien.ac.at>
<jwv1r8khepo.fsf-monnier+comp.arch@gnu.org> <2021Jun29.192608@mips.complang.tuwien.ac.at>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <42430dc6-1410-440b-8009-b47e15b5541cn@googlegroups.com>
Subject: Re: Exec
From: robfi...@gmail.com (robf...@gmail.com)
Injection-Date: Tue, 29 Jun 2021 18:10:35 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
 by: robf...@gmail.com - Tue, 29 Jun 2021 18:10 UTC

I do not seem to be able to follow the example. Why cannot other instructions decode and execute during that time? It is just a load operation taking place to load the instruction. Things would proceed as per a normal load. It may be many more cycles / instructions before the instruction EXEC is executed. Once EXEC is decoded / issued it is just a matter of waiting for register values.

>What is the semantic if you wanted to EXEC a register and the instruction needed 96-bits,
>128-bits, or 160-bits ??
Instructions are limited to 36-bits in this case. I think using more registers to contain instruction modifiers will work.

I think it is execute of the one particular instruction that must stall, not decode. On my machine other instructions can be decoded or execute. This is relied on to hide the decode time in many cases.
If the registers are valid before the EXEC is processed, then it will be fast. Hmm, it just occurred to me that I put a four-cycle lockout in the execute scheduler to prevent it selecting the same queue slot until there was time to mark the instruction as out. So, EXEC would take five or more cycles to execute. Of course, it is mostly hidden time.

It is very few LOC to implement EXEC. The biggest piece is probably the mux on the IR. I am waiting to see how much it impacts design size, but synth keeps crashing on me.
I think all that needs to be done is:
if (robo.exec) begin
    rob[robo.rid].ir   <= robo.res.val[35:0]; // comes from register arg A
    rob[robo.rid].dec  <= FALSE;
    rob[robo.rid].out  <= FALSE;
    rob[robo.rid].cmt  <= FALSE;
    rob[robo.rid].cmt2 <= FALSE;
    rob[robo.rid].vcmt <= FALSE;
end
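
Read as a software model, the fragment above amounts to roughly the following C sketch. The struct layout and the function name `exec_writeback` are invented here, and the reading of the flags (decoded, issued, committed) is a guess from the names. The 36-bit mask follows from the `[35:0]` slice and the stated 36-bit instruction limit.

```c
#include <stdint.h>

/* Low 36 bits, matching the val[35:0] slice in the Verilog. */
#define IR_MASK ((UINT64_C(1) << 36) - 1)

/* Hypothetical model of one reorder-buffer slot. */
struct rob_entry {
    uint64_t ir;                       /* instruction register for this slot */
    int dec, out, cmt, cmt2, vcmt;     /* status flags, meanings assumed */
};

/* When an EXEC produces its result, overwrite the slot's instruction with
 * the register value and clear the status flags so the slot is processed
 * again as if it held a freshly fetched instruction. */
void exec_writeback(struct rob_entry *e, uint64_t reg_a)
{
    e->ir   = reg_a & IR_MASK;
    e->dec  = 0;
    e->out  = 0;
    e->cmt  = 0;
    e->cmt2 = 0;
    e->vcmt = 0;
}
```

This only models the state change; the scheduling consequences discussed above (the second decode and the lockout cycles) are outside what such a sketch can show.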
