devel / comp.arch / Re: PDP-11-like ISA

Re: PDP-11-like ISA

<sa5op9$1l48$1@gioia.aioe.org>

https://www.novabbs.com/devel/article-flat.php?id=17698&group=comp.arch#17698

From: terje.ma...@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: PDP-11-like ISA
Date: Sun, 13 Jun 2021 22:11:23 +0200
Organization: Aioe.org NNTP Server

MitchAlsup wrote:
> On Saturday, June 12, 2021 at 12:12:07 PM UTC-5, Terje Mathisen wrote:
>> EricP wrote:
>>> Terje Mathisen wrote:
>>>> EricP wrote:
>>>>>
>>>>> So toss out auto inc/dec address modes and put in two instructions
>>>>>
>>>>> // pdp-11 asm has dest reg on the right
>>>>> ADDTY #tiny,reg Add Tiny
>>>>> SUBTY #tiny,reg Subtract Tiny
>>>>>
>>>>> where tiny is 1..7 and 0 means 8. Much more generally useful.
>>>>
>>>> I would rather have 0..7 meaning 1..8, since this is trivially
>>>> achieved by always enabling an incoming carry to the address adder.
>>>> Special-casing 0 to mean 8 seems like it would need more gates?
>>>>
>>>> I.e. cnt3 = !(cnt0 | cnt1 | cnt2) vs always setting carry-in?
>>>>
>>>> Terje
>>>
>>> 1..7, 0 => 8 allows the 3 lower bits to be used directly
>>> and a 3-input NOR-3 gate to drive the 4th bit.
>>>
>>> 0..7 => 1..8 needs an incrementer, in this case a NOT, 2 XOR's,
>>> an AND-2 and an AND-3 carry lookaheads.
>>>
>>> It's in the decoder and there are other more expensive things there,
>>> so it wouldn't be on the critical path either way.
> <
>> The logic is simple either way, but since we're already going to use the
>> (address) adder, just setting carry_in is effectively free.
> <
> What makes you think the AGEN adder has a carry in ??

I realize that it doesn't need one: The bottom bit can be handled with a
half adder which I'm guessing is at least one gate delay faster, and
that can negate any gain from not having to generate cnt3?

OTOH, if this is a dedicated circuit, then it would know to always add the
carry-in, so the bottom half adder would just be slightly different:

s0 = (a == b);
carry_out = a | b;

Right?
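
The two equations for the bottom slice can be checked exhaustively, since there are only four input combinations. The following is a minimal sketch in C, not anyone's actual design: the names `slice_result`, `half_adder_carry_in_one` and `slice_matches_addition` are made up for illustration, and the "hardware" is simply modelled with integer operators.

```c
/* One-bit slice computing bit 0 of a + b + 1 (carry-in tied high).
 * Hypothetical names; a and b are expected to be 0 or 1. */
typedef struct {
    unsigned sum;        /* s0 */
    unsigned carry_out;  /* carry into bit 1 */
} slice_result;

slice_result half_adder_carry_in_one(unsigned a, unsigned b)
{
    slice_result r;
    r.sum = (a == b);      /* XNOR, i.e. a ^ b ^ 1 */
    r.carry_out = a | b;   /* majority(a, b, 1) */
    return r;
}

/* Compare the slice against ordinary integer addition.
 * Returns 1 if all four input combinations agree, 0 otherwise. */
int slice_matches_addition(void)
{
    for (unsigned a = 0; a < 2; a++) {
        for (unsigned b = 0; b < 2; b++) {
            unsigned total = a + b + 1u;
            slice_result r = half_adder_carry_in_one(a, b);
            if (r.sum != (total & 1u) || r.carry_out != (total >> 1))
                return 0;
        }
    }
    return 1;
}
```

Calling `slice_matches_addition()` is enough to confirm that XNOR and OR are the sum and carry of a half adder whose carry-in is permanently 1.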

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: PDP-11-like ISA

<sa5q37$246$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=17699&group=comp.arch#17699

From: iva...@millcomputing.com (Ivan Godard)
Newsgroups: comp.arch
Subject: Re: PDP-11-like ISA
Date: Sun, 13 Jun 2021 13:33:43 -0700
Organization: A noiseless patient Spider
In-Reply-To: <sa5oag$1bl2$1@gioia.aioe.org>

On 6/13/2021 1:03 PM, Terje Mathisen wrote:
> Ivan Godard wrote:
>> On 6/11/2021 11:42 AM, MitchAlsup wrote:
>>> On Friday, June 11, 2021 at 1:28:30 PM UTC-5, Quadibloc wrote:
>>>> On Friday, June 11, 2021 at 5:59:09 AM UTC-6, chris wrote:
>>>>
>>>>> Opinions may vary, but if you were an assembler programmer back in the
>>>>> days when memory was tight and machines were slow, the fact that
>>>>> condition codes are set with many instructions, was of great
>>>>> benefit.
>>>> Oh, I agree that condition codes are a good thing. I'm not too
>>>> sympathetic
>>>> to architectures like the Alpha and RISC-V which completely dispense
>>>> with
>>>> them.
>>>>
>>>> However, if _every_ operate instruction changes the condition codes,
>>>> you
>>>> have to put the conditional branch right after the operate
>>>> instruction that
>>>> affects it. This creates a dependency, and for branches especially
>>>> that's a
>>>> bad thing.
>>> <
>>> In the days condition codes made sense, the machines were using 4-10
>>> cycles
>>> per instruction, so putting the BC immediately after the Operate
>>> instruction
>>> was de rigueur.
>>> <
>>> One of the things we learned in the early RISC days was this one
>>> either wanted
>>> no condition codes, or to use a bit to say when the condition codes
>>> were going
>>> to be used--in effect CC write elision. Using that bit is bad for
>>> entropy.
>>>>
>>>> So, the way many RISC instruction sets have done it (the PowerPC taking
>>>> it to a more elaborate level) in a high-performance instruction set,
>>>> I include
>>>> a bit to enable or disable setting the condition codes in
>>>> instructions that can
>>>> set them.
>>>>
>>>> That loses a bit of opcode space, but it doesn't lose the advantages
>>>> you
>>>> mention.
>>> <
>>> I still fail to see where cc gain anything (with proper branch
>>> support), plus
>>> I see many places where a vector of comparand bits is better than a CC
>>> from a CMP instruction (especially with the requirements of IEEE 754).
>>
>> How often is more than one bit subsequently selected from that bit
>> vector? -
>
> Enough?
>
> I.e. branch if a >= b, but only if neither is a nan?
>
> I.e. any kind of compound check.
>
> Terje

And that's the question. True, there could be bits for </=/> and you do
a compound for <= and >=, but that seems pointless to me; you should
have a negate flag (branch direction) in the instruction instead of
compounding. Otherwise, how often are compound checks of real operands
in real code? Yes, FP folding a NaN check is a reasonable use - but how
often is NaN checked for in real code regardless of how you do it?

So if compounds are ignorably rare then the bit vector is just an
encoding idea and should be measured on code density.

Re: Condition codes (was: PDP-11-like ISA)

<2021Jun13.231900@mips.complang.tuwien.ac.at>

https://www.novabbs.com/devel/article-flat.php?id=17701&group=comp.arch#17701

From: ant...@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Condition codes (was: PDP-11-like ISA)
Date: Sun, 13 Jun 2021 21:19:00 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien

Quadibloc <jsavard@ecn.ab.ca> writes:
>In a general register architecture, though, if you lengthen
>all the general registers by one bit, now they're not conformant
>with memory. So load and store multiple don't work well for
>saving state unless they're made wasteful of memory, and
>quite a bit else gets made... inaesthetic at least.

Yes, for saving context on a context switch, you have to preserve the
extra bits somehow, and I explored this in
<2021Mar14.152525@mips.complang.tuwien.ac.at>.

For function calls, I don't think it's worth preserving the carry bits
across them, so just store 64 (not 65) bits, and then load 64 bits
with a load instruction that, e.g., zeros the carry.
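
As a rough model of that convention, the sketch below treats a register as a 64-bit value plus one carry bit, spills only the 64 value bits, and clears the carry on reload. The type `reg65` and the functions `make_reg65`, `spill` and `reload` are hypothetical names for illustration; zeroing the carry is just the example policy mentioned above, not a claim about any real ISA.

```c
#include <stdint.h>

/* Hypothetical 65-bit register: 64 data bits plus an attached carry bit. */
typedef struct {
    uint64_t value;
    unsigned carry;  /* 0 or 1 */
} reg65;

reg65 make_reg65(uint64_t value, unsigned carry)
{
    reg65 r;
    r.value = value;
    r.carry = carry & 1u;
    return r;
}

/* Ordinary 64-bit store used around function calls: the carry bit is dropped. */
uint64_t spill(reg65 r)
{
    return r.value;
}

/* Ordinary 64-bit load: the carry bit comes back as zero. */
reg65 reload(uint64_t word)
{
    return make_reg65(word, 0u);
}
```

Memory stays conformant with 64-bit words; only a context switch would need a separate mechanism to preserve the extra bits.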

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: PDP-11-like ISA

<17ed37e2-f5f6-436f-b5b9-349dd3be649bn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=17704&group=comp.arch#17704

From: MitchAl...@aol.com (MitchAlsup)
Newsgroups: comp.arch
Subject: Re: PDP-11-like ISA
Date: Sun, 13 Jun 2021 15:09:50 -0700 (PDT)
In-Reply-To: <sa5k5p$1k92$1@gioia.aioe.org>

On Sunday, June 13, 2021 at 1:52:45 PM UTC-5, Terje Mathisen wrote:
> MitchAlsup wrote:
> > On Friday, June 11, 2021 at 10:08:38 AM UTC-5, EricP wrote:
> >> More expensive ALU serially dependent on decode means longer Decode stage.
> >> Now since no one would do this to Decode, you wind up doing this
> >> in the real ALU which is at least 2 pipeline stages further on.
> > <
> > {Quote :: Spock from "Wrath of Kahan", 'Captain, he is thinking 2-dimensionally'}
>
> I'm pretty sure you were thinking of the "Wrath of Khan", but I'm sure
> Kahan could also be quite cross if you mess up your FP unit badly?
<
Finally, someone understood the terminology !! Well done sir !
<
> :-)
> Terje
>
> --
> - <Terje.Mathisen at tmsw.no>
> "almost all programming can be viewed as an exercise in caching"

Re: PDP-11-like ISA

<c292c061-2583-4bc6-8de8-74ad03211bc6n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=17705&group=comp.arch#17705

From: MitchAl...@aol.com (MitchAlsup)
Newsgroups: comp.arch
Subject: Re: PDP-11-like ISA
Date: Sun, 13 Jun 2021 15:16:23 -0700 (PDT)
In-Reply-To: <sa5q37$246$1@dont-email.me>

On Sunday, June 13, 2021 at 3:33:45 PM UTC-5, Ivan Godard wrote:
> On 6/13/2021 1:03 PM, Terje Mathisen wrote:
> > Ivan Godard wrote:

> > I.e. any kind of compound check.
> >
> > Terje
<
> And that's the question. True, there could be bits for </=/> and you do
> a compound for <= and >=, but that seems pointless to me; you should
> have a negate flag (branch direction) in the instruction instead of
> compounding. Otherwise, how often are compound checks of real operands
> in real code? Yes, FP folding a NaN check is a reasonable use - but how
> often is NaN checked for in real code regardless of how you do it?
<
Well-written FP codes do it all the time:
<
double ATAN2( double y, double x )
{   // IEEE 754-2019 quality ATAN2
    // deal with NaNs
    if( ISNAN( x ) ) return x;
    if( ISNAN( y ) ) return y;
    // deal with infinities
    if( x == +∞ && |y|== +∞ ) return copysign( π/4, y );
    if( x == +∞ ) return copysign( 0.0, y );
    if( x == -∞ && |y|== +∞ ) return copysign( 3π/4, y );
    if( x == -∞ ) return copysign( π, y );
    if( |y|== +∞ ) return copysign( π/2, y );
    // deal with signed zeros
    if( x == 0.0 && y != 0.0 ) return copysign( π/2, y );
    if( x >=+0.0 && y == 0.0 ) return copysign( 0.0, y );
    if( x <=-0.0 && y == 0.0 ) return copysign( π, y );
    // calculate ATAN2 high performance style:
    // always feed ATAN the ratio whose magnitude is at most 1
    if( x > 0.0 )
    {
        if( y < 0.0 && |y| >  |x| ) return - π/2 - ATAN( x / y );
        if( y < 0.0 && |y| <= |x| ) return       + ATAN( y / x );
        if( y > 0.0 && |y| <= |x| ) return       + ATAN( y / x );
        if( y > 0.0 && |y| >  |x| ) return + π/2 - ATAN( x / y );
    }
    if( x < 0.0 )
    {
        if( y < 0.0 && |y| >  |x| ) return - π/2 - ATAN( x / y );
        if( y < 0.0 && |y| <= |x| ) return - π   + ATAN( y / x );
        if( y > 0.0 && |y| <= |x| ) return + π   + ATAN( y / x );
        if( y > 0.0 && |y| >  |x| ) return + π/2 - ATAN( x / y );
    }
}
<
Not so well written FP codes are (well) not so well written.
<
Also note: only 1 compare is required to perform all of the above checks
in My 66000 ISA.
>
> So if compounds are ignorably rare then the bit vector is just an
> encoding idea and should be measured on code density.
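
The quadrant handling in the ATAN2 listing above boils down to writing the result as `offset + sign * ATAN(ratio)` with `|ratio| <= 1`. The sketch below, in C, computes only that decomposition for finite, nonzero inputs so it can be inspected without a math library. The struct `atan2_reduction`, the function `reduce_atan2`, the helper `mag` and the constant `PI_SKETCH` are hypothetical names; NaNs, infinities and zeros are assumed to have been handled beforehand, as in the listing.

```c
#define PI_SKETCH 3.14159265358979323846

/* atan2(y, x) == offset + sign * ATAN(ratio), with |ratio| <= 1. */
typedef struct {
    double offset;  /* 0, +/- pi/2 or +/- pi */
    double sign;    /* +1.0 or -1.0 */
    double ratio;   /* argument handed to the single-argument ATAN */
} atan2_reduction;

static double mag(double v) { return v < 0.0 ? -v : v; }

/* Precondition (assumed, not checked): x and y are finite and nonzero. */
atan2_reduction reduce_atan2(double y, double x)
{
    atan2_reduction r;
    if (mag(y) <= mag(x)) {
        /* Near the x axis: use y/x directly, shifted by +/- pi when x < 0. */
        r.ratio = y / x;
        r.sign = +1.0;
        if (x > 0.0)
            r.offset = 0.0;
        else
            r.offset = (y > 0.0) ? PI_SKETCH : -PI_SKETCH;
    } else {
        /* Near the y axis: use x/y via atan(z) = +/- pi/2 - atan(1/z). */
        r.ratio = x / y;
        r.sign = -1.0;
        r.offset = (y > 0.0) ? PI_SKETCH / 2 : -PI_SKETCH / 2;
    }
    return r;
}
```

A caller would finish with `r.offset + r.sign * atan(r.ratio)`. Each of the eight cases needs the signs of x and y and the ordering of their magnitudes, which is the kind of compound condition a single compare producing a bit vector is meant to serve.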

Re: PDP-11-like ISA

<p_vxI.101400$od.74449@fx15.iad>

https://www.novabbs.com/devel/article-flat.php?id=17707&group=comp.arch#17707

From: ThatWoul...@thevillage.com (EricP)
Newsgroups: comp.arch
Subject: Re: PDP-11-like ISA
Date: Sun, 13 Jun 2021 18:51:37 -0400
In-Reply-To: <sa5op9$1l48$1@gioia.aioe.org>

Terje Mathisen wrote:
> MitchAlsup wrote:
>> On Saturday, June 12, 2021 at 12:12:07 PM UTC-5, Terje Mathisen wrote:
>>> EricP wrote:
>>>> Terje Mathisen wrote:
>>>>> EricP wrote:
>>>>>>
>>>>>> So toss out auto inc/dec address modes and put in two instructions
>>>>>>
>>>>>> // pdp-11 asm has dest reg on the right
>>>>>> ADDTY #tiny,reg Add Tiny
>>>>>> SUBTY #tiny,reg Subtract Tiny
>>>>>>
>>>>>> where tiny is 1..7 and 0 means 8. Much more generally useful.
>>>>>
>>>>> I would rather have 0..7 meaning 1..8, since this is trivially
>>>>> achieved by always enabling an incoming carry to the address adder.
>>>>> Special-casing 0 to mean 8 seems like it would need more gates?
>>>>>
>>>>> I.e. cnt3 = !(cnt0 | cnt1 | cnt2) vs always setting carry-in?
>>>>>
>>>>> Terje
>>>>
>>>> 1..7, 0 => 8 allows the 3 lower bits to be used directly
>>>> and a 3-input NOR-3 gate to drive the 4th bit.
>>>>
>>>> 0..7 => 1..8 needs an incrementer, in this case a NOT, 2 XOR's,
>>>> an AND-2 and an AND-3 carry lookaheads.
>>>>
>>>> It's in the decoder and there are other more expensive things there,
>>>> so it wouldn't be on the critical path either way.
>> <
>>> The logic is simple either way, but since we're already going to use the
>>> (address) adder, just setting carry_in is effectively free.
>> <
>> What makes you think the AGEN adder has a carry in ??
>
> I realize that it doesn't need one: The bottom bit can be handled with a
> half adder which I'm guessing is at least one gate delay faster, and
> that can negate any gain from not having to generate cnt3?
>
> OTOH, if this is a dedicated circuit, then it would know to always add the
> carry-in, so the bottom half adder would just be slightly different:
>
> s0 = a == b
> carry_out = a | b;
>
> Right?
>
> Terje
>

What I was talking about was the encoding of a field in the instruction,
which would always be set by the assembler/compiler, so there is no reason
for it not to be encoded as simply as possible for the hardware.

For an N-bit field, an N-input NOR gate, which 1..7, 0 => 8 needs,
will always be cheaper than an N-bit incrementer.
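
The two encodings of the 3-bit tiny field can be written out gate by gate. This is a sketch in C under the assumption that each C operator stands for one gate; the function names `tiny_zero_means_eight` and `tiny_plus_one` are made up for illustration.

```c
/* Encoding A: 1..7 used directly, 0 means 8.
 * The three field bits pass straight through; one 3-input NOR makes bit 3. */
unsigned tiny_zero_means_eight(unsigned field)
{
    unsigned b0 = field & 1u, b1 = (field >> 1) & 1u, b2 = (field >> 2) & 1u;
    unsigned b3 = !(b0 | b1 | b2);            /* NOR-3 */
    return (b3 << 3) | (b2 << 2) | (b1 << 1) | b0;
}

/* Encoding B: 0..7 means 1..8, which needs a 3-bit incrementer:
 * a NOT, two XORs, and AND-2 / AND-3 carry lookahead terms. */
unsigned tiny_plus_one(unsigned field)
{
    unsigned b0 = field & 1u, b1 = (field >> 1) & 1u, b2 = (field >> 2) & 1u;
    unsigned s0 = b0 ^ 1u;                    /* NOT */
    unsigned s1 = b1 ^ b0;                    /* XOR */
    unsigned s2 = b2 ^ (b0 & b1);             /* XOR fed by the AND-2 term */
    unsigned s3 = b0 & b1 & b2;               /* AND-3 term becomes bit 3 */
    return (s3 << 3) | (s2 << 2) | (s1 << 1) | s0;
}
```

Both functions cover the range 1..8; the first spends one gate, the second five, which is the cost difference being described.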

That being said, there is lots of opportunity to optimize an incrementer
up the wazoo. It should use parallel carry lookahead (a 4-input NAND tree)
so it approximates log_4(N) delay, give or take a few gates.

In this case we are talking about 3 bits so it's no big deal.
But in the fetch-parse units there could be some 52-bit incrementers.
I don't know, maybe an FP rounder might want to increment long bit strings.

An optimized incrementer could be organized into groups of 4 bits.
A 4-input NAND would generate the group's (inverted) carry out.
At the next level a 4-input NOR would generate carry for 16 bits.
Next a 4-input NAND generates (inverted) group carry for 64 bits.
The carry outs could drive MUXes to select raw or incremented values.
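
A behavioural sketch of that organization in C follows. It is only a functional model under stated assumptions: the name `increment_grouped` is hypothetical, the loop computes the group carries one after another, whereas the hardware described above would form the same AND terms in parallel with a NAND/NOR tree, and the inverted polarities are not modelled.

```c
#include <stdint.h>

/* 64-bit increment built from sixteen 4-bit groups.
 * Each group prepares its own incremented value; the carry into a group is
 * "all lower groups are 1111", and a MUX picks raw or incremented bits. */
uint64_t increment_grouped(uint64_t v)
{
    uint64_t result = 0;
    int carry_in = 1;  /* the +1 enters at group 0 */

    for (int g = 0; g < 16; g++) {
        unsigned nib = (unsigned)((v >> (4 * g)) & 0xFu);
        unsigned inc = (nib + 1u) & 0xFu;          /* locally incremented group */
        unsigned out = carry_in ? inc : nib;       /* MUX: incremented or raw */
        result |= (uint64_t)out << (4 * g);
        carry_in = carry_in && (nib == 0xFu);      /* 4-input AND: group propagates */
    }
    return result;
}
```

For example, `increment_grouped(0x12EF)` increments the low group to 0, propagates into the next group to give F, and leaves the upper groups untouched, yielding `0x12F0`.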

Re: PDP-11-like ISA

<3c1ae43d-5af5-4d03-87d6-df49fa0c4383n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=17709&group=comp.arch#17709

From: MitchAl...@aol.com (MitchAlsup)
Newsgroups: comp.arch
Subject: Re: PDP-11-like ISA
Date: Sun, 13 Jun 2021 16:17:52 -0700 (PDT)
In-Reply-To: <p_vxI.101400$od.74449@fx15.iad>

On Sunday, June 13, 2021 at 5:52:08 PM UTC-5, EricP wrote:
> Terje Mathisen wrote:
> > MitchAlsup wrote:
> >> On Saturday, June 12, 2021 at 12:12:07 PM UTC-5, Terje Mathisen wrote:
> >>> EricP wrote:
> >>>> Terje Mathisen wrote:
> >>>>> EricP wrote:
> >>>>>>
> >>>>>> So toss out auto inc/dec address modes and put in two instructions
> >>>>>>
> >>>>>> // pdp-11 asm has dest reg on the right
> >>>>>> ADDTY #tiny,reg Add Tiny
> >>>>>> SUBTY #tiny,reg Subtract Tiny
> >>>>>>
> >>>>>> where tiny is 1..7 and 0 means 8. Much more generally useful.
> >>>>>
> >>>>> I would rather have 0..7 meaning 1..8, since this is trivially
> >>>>> achieved by always enabling an incoming carry to the address adder.
> >>>>> Special-casing 0 to mean 8 seems like it would need more gates?
> >>>>>
> >>>>> I.e. cnt3 = !(cnt0 | cnt1 | cnt2) vs always setting carry-in?
> >>>>>
> >>>>> Terje
> >>>>
> >>>> 1..7, 0 => 8 allows the 3 lower bits to be used directly
> >>>> and a 3-input NOR-3 gate to drive the 4th bit.
> >>>>
> >>>> 0..7 => 1..8 needs an incrementer, in this case a NOT, 2 XOR's,
> >>>> an AND-2 and an AND-3 carry lookaheads.
> >>>>
> >>>> It's in the decoder and there are other more expensive things there,
> >>>> so it wouldn't be on the critical path either way.
> >> <
> >>> The logic is simple either way, but since we're already going to use the
> >>> (address) adder, just setting carry_in is effectively free.
> >> <
> >> What makes you think the AGEN adder has a carry in ??
> >
> > I realize that it doesn't need one: The bottom bit can be handled with a
> > half adder which I'm guessing is at least one gate delay faster, and
> > that can negate any gain from not having to generate cnt3?
> >
> > OTOH, if this is a dedicated circuit, then it would know to always add the
> > carry-in, so the bottom half adder would just be slightly different:
> >
> > s0 = a == b
> > carry_out = a | b;
> >
> > Right?
> >
> > Terje
> >
> What I was talking about was the encoding of a field in the instruction,
> which would always be set by the assembler/compiler, so there is no reason
> for it not to be encoded as simply as possible for the hardware.
<
For what follows I am using the term "decoder" as a block of logic that asserts
one of 2^n output signals in response to an n-bit input signal. {There are lots of
other kinds of decoders (greater than, less than) but this one is also known
as the equality decoder.}
<
One can simply build the decoder of range(0..7) to assert on domain(1..8)
at zero cost. {We have not had this one in the discussion yet, but it needs
to be.}
<
The lesser cost is the AND gate which converts 000 into 1000. Then the
lower 3 bits go into a decoder to assert one of 1..7, the AND gate asserting
8. {The zero output of the decoder is not used and if the synthesizer is any
good, the great gate muncher will remove that logic.}
<
The high cost is to increment the 3-bit field and then run it into a decoder.
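
The first two options can be modelled as one-hot decoders. This is a sketch in C, assuming a 9-bit result in which bit n set means "amount n is selected"; the function names `decode_relabelled` and `decode_zero_is_eight` are hypothetical.

```c
/* Zero-cost option: an ordinary 3-to-8 equality decoder whose outputs are
 * simply labelled 1..8 instead of 0..7. Field k asserts line k+1. */
unsigned decode_relabelled(unsigned field)
{
    return 1u << ((field & 7u) + 1u);
}

/* Lesser-cost option: field 000 is detected by one gate and asserts line 8;
 * any other field value k asserts line k. Line 0 is never asserted. */
unsigned decode_zero_is_eight(unsigned field)
{
    unsigned f = field & 7u;
    unsigned is_zero = (f == 0u);   /* AND of the three inverted field bits */
    return is_zero ? (1u << 8) : (1u << f);
}
```

In both models exactly one of the lines 1..8 is asserted for every field value, and no incrementer appears anywhere; the difference is only in which field value maps to which line.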
>
> For an N-bit field, an N-input NOR gate, which 1..7, 0 => 8 needs,
> will always be cheaper than an N-bit incrementer.
>
> That being said, there is lots of opportunity to optimize an incrementer
> up the wazoo. It should use parallel carry lookahead (a 4-input NAND tree)
> so it approximates log_4(N) delay, give or take a few gates.
>
> In this case we are talking about 3 bits so it's no big deal.
> But in the fetch-parse units there could be some 52-bit incrementers.
> I don't know, maybe an FP rounder might want to increment long bit strings.
>
> An optimized incrementer could be organized into groups of 4 bits.
> A 4-input NAND would generate the group's (inverted) carry out.
> At the next level a 4-input NOR would generate carry for 16 bits.
> Next a 4-input NAND generates (inverted) group carry for 64 bits.
> The carry outs could drive MUXes to select raw or incremented values.

Re: PDP-11-like ISA

<sa6bdg$18f$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=17710&group=comp.arch#17710

From: sfu...@alumni.cmu.edu.invalid (Stephen Fuld)
Newsgroups: comp.arch
Subject: Re: PDP-11-like ISA
Date: Sun, 13 Jun 2021 18:29:18 -0700
Organization: A noiseless patient Spider
In-Reply-To: <sa5q37$246$1@dont-email.me>

On 6/13/2021 1:33 PM, Ivan Godard wrote:
> On 6/13/2021 1:03 PM, Terje Mathisen wrote:
>> Ivan Godard wrote:
>>> On 6/11/2021 11:42 AM, MitchAlsup wrote:
>>>> On Friday, June 11, 2021 at 1:28:30 PM UTC-5, Quadibloc wrote:
>>>>> On Friday, June 11, 2021 at 5:59:09 AM UTC-6, chris wrote:
>>>>>
>>>>>> Opinions may vary, but if you were an assembler programmer back in
>>>>>> the
>>>>>> days when memory was tight and machines were slow, the fact that
>>>>>> condition codes are set with many instructions, was of great
>>>>>> benefit.
>>>>> Oh, I agree that condition codes are a good thing. I'm not too
>>>>> sympathetic
>>>>> to architectures like the Alpha and RISC-V which completely
>>>>> dispense with
>>>>> them.
>>>>>
>>>>> However, if _every_ operate instruction changes the condition
>>>>> codes, you
>>>>> have to put the conditional branch right after the operate
>>>>> instruction that
>>>>> affects it. This creates a dependency, and for branches especially
>>>>> that's a
>>>>> bad thing.
>>>> <
>>>> In the days condition codes made sense, the machines were using 4-10
>>>> cycles
>>>> per instruction, so putting the BC immediately after the Operate
>>>> instruction
>>>> was de rigueur.
>>>> <
>>>> One of the things we learned in the early RISC days was this one
>>>> either wanted
>>>> no condition codes, or to use a bit to say when the condition codes
>>>> were going
>>>> to be used--in effect CC write elision. Using that bit is bad for
>>>> entropy.
>>>>>
>>>>> So, the way many RISC instruction sets have done it (the PowerPC
>>>>> taking
>>>>> it to a more elaborate level) in a high-performance instruction
>>>>> set, I include
>>>>> a bit to enable or disable setting the condition codes in
>>>>> instructions that can
>>>>> set them.
>>>>>
>>>>> That loses a bit of opcode space, but it doesn't lose the
>>>>> advantages you
>>>>> mention.
>>>> <
>>>> I still fail to see where cc gain anything (with proper branch
>>>> support), plus
>>>> I see many places where a vector of comparand bits is better than a CC
>>>> from a CMP instruction (especially with the requirements of IEEE 754).
>>>
>>> How often is more than one bit subsequently selected from that bit
>>> vector? -
>>
>> Enough?
>>
>> I.e. branch if a >= b, but only if neither is a nan?
>>
>> I.e. any kind of compound check.
>>
>> Terje
>
> And that's the question. True, there could be bits for </=/> and you do
> a compound for <= and >=, but that seems pointless to me; you should
> have a negate flag (branch direction) in the instruction instead of
> compounding. Otherwise, how often are compound checks of real operands
> in real code? Yes, FP folding a NaN check is a reasonable use - but how
> often is NaN checked for in real code regardless of how you do it?
>
> So if compounds are ignorably rare then the bit vector is just an
> encoding idea and should be measured on code density.

One other point. Mitch's scheme does all those checks with a single op
code for the compare and one (or two if you include the predication) for
the conditional branch/predicate. So depending upon how the
alternative handles things, you might require fewer op codes. How much
that is worth is, of course, dependent on lots of other factors.
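
To make the opcode-count point concrete, here is a minimal C sketch of a compare that produces a vector of relation bits, followed by a single branch form that tests one selected bit. The bit names and positions (CMP_EQ, CMP_UN and so on) and the function names are invented for illustration; they are not the actual My 66000 encoding.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* Hypothetical bit assignments, chosen only for this illustration. */
enum {
    CMP_EQ = 1u << 0,   /* a == b */
    CMP_NE = 1u << 1,   /* a != b, which also holds when either is NaN */
    CMP_LT = 1u << 2,   /* a <  b */
    CMP_LE = 1u << 3,   /* a <= b */
    CMP_GT = 1u << 4,   /* a >  b */
    CMP_GE = 1u << 5,   /* a >= b */
    CMP_UN = 1u << 6,   /* unordered: at least one operand is NaN */
    CMP_OR = 1u << 7    /* ordered: neither operand is NaN */
};

/* One compare computes every relation at once. */
uint32_t fcmp_vector(double a, double b)
{
    uint32_t v = 0;
    if (a == b) v |= CMP_EQ;
    if (a != b) v |= CMP_NE;
    if (a <  b) v |= CMP_LT;
    if (a <= b) v |= CMP_LE;
    if (a >  b) v |= CMP_GT;
    if (a >= b) v |= CMP_GE;
    if (isnan(a) || isnan(b)) v |= CMP_UN;
    else                      v |= CMP_OR;
    return v;
}

/* One branch form: test a single selected bit, optionally negated. */
int branch_taken(uint32_t v, uint32_t bit, int negate)
{
    assert(bit != 0);
    return ((v & bit) != 0) != (negate != 0);
}
```

With this shape, "branch if a >= b and neither is a NaN" is one compare plus one test of CMP_GE, since an IEEE 754 >= is already false for unordered operands.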

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Re: PDP-11-like ISA

<sa6r0r$euv$1@gioia.aioe.org>

https://www.novabbs.com/devel/article-flat.php?id=17714&group=comp.arch#17714

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!aioe.org!Z/OnjRNZ74xzNAVdC5cKTg.user.gioia.aioe.org.POSTED!not-for-mail
From: terje.ma...@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: PDP-11-like ISA
Date: Mon, 14 Jun 2021 07:55:39 +0200
Organization: Aioe.org NNTP Server
Lines: 23
Message-ID: <sa6r0r$euv$1@gioia.aioe.org>
References: <548dc91d-831f-4dd7-a947-8fd7974695e3n@googlegroups.com>
<cyowI.31628$8f1.4834@fx23.iad>
<dfbead57-fff5-4ddb-bb18-38105da53f64n@googlegroups.com>
<j0ywI.36041$431.5756@fx39.iad>
<ac0da74d-b52b-4b52-b25e-cf1070ce5e04n@googlegroups.com>
<uQ7xI.28973$J21.10021@fx40.iad>
<8b9ed2b1-9aca-43d4-bc95-b199285c9932n@googlegroups.com>
NNTP-Posting-Host: Z/OnjRNZ74xzNAVdC5cKTg.user.gioia.aioe.org
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Complaints-To: abuse@aioe.org
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:60.0) Gecko/20100101
Firefox/60.0 SeaMonkey/2.53.7.1
X-Notice: Filtered by postfilter v. 0.9.2
 by: Terje Mathisen - Mon, 14 Jun 2021 05:55 UTC

MitchAlsup wrote:
> On Saturday, June 12, 2021 at 2:23:09 PM UTC-5, EricP wrote:
>> That removes the IP from the register set and moves all the above mode
>> specifier bits into the opcode, where they can be optimally assigned.
> <
> I am currently playing with the notion where there is a small register file
> (8 entries) which is really fast, and a larger register file (64-256 entries)
> which is 1 cycle access, and then memory which is 3-4 cycles of access
> and supports sizes other than 64-bits. {The first 8 entries of the large
> file is the small 8 register file.}

So, this is a NURA (Non-Uniform Register Access) machine?

I am sure the compiler writers would love it. :-)

(Personally I would be perfectly happy: 8 effective registers are enough
for almost all the inner loop code I have ever written.)

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: PDP-11-like ISA

<sa6r8s$hh1$1@gioia.aioe.org>

https://www.novabbs.com/devel/article-flat.php?id=17715&group=comp.arch#17715

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!aioe.org!Z/OnjRNZ74xzNAVdC5cKTg.user.gioia.aioe.org.POSTED!not-for-mail
From: terje.ma...@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: PDP-11-like ISA
Date: Mon, 14 Jun 2021 07:59:56 +0200
Organization: Aioe.org NNTP Server
Lines: 33
Message-ID: <sa6r8s$hh1$1@gioia.aioe.org>
References: <548dc91d-831f-4dd7-a947-8fd7974695e3n@googlegroups.com>
<cyowI.31628$8f1.4834@fx23.iad>
<dfbead57-fff5-4ddb-bb18-38105da53f64n@googlegroups.com>
<j0ywI.36041$431.5756@fx39.iad> <s9voen$do7$1@gioia.aioe.org>
<121bdc96-e76a-4b2f-8fec-7465378f51den@googlegroups.com>
<sa2rvu$17ul$2@gioia.aioe.org>
<d4d27ae4-96e1-440c-952e-c84cf45ab4d4n@googlegroups.com>
<66eec5d7-2e94-422a-84c7-d944a4df33e3n@googlegroups.com>
<10975ce6-d20b-4760-8eda-c0676557d36fn@googlegroups.com>
NNTP-Posting-Host: Z/OnjRNZ74xzNAVdC5cKTg.user.gioia.aioe.org
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Complaints-To: abuse@aioe.org
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:60.0) Gecko/20100101
Firefox/60.0 SeaMonkey/2.53.7.1
X-Notice: Filtered by postfilter v. 0.9.2
 by: Terje Mathisen - Mon, 14 Jun 2021 05:59 UTC

MitchAlsup wrote:
> On Saturday, June 12, 2021 at 7:38:05 PM UTC-5, Quadibloc wrote:
>> On Saturday, June 12, 2021 at 2:50:38 PM UTC-6, MitchAlsup wrote:
>>> On Saturday, June 12, 2021 at 12:47:46 PM UTC-5, Terje Mathisen wrote:
>>>> MitchAlsup wrote:
>>
>>>>> cnt3 is 1-gate delay.
>>>>> carry-in is 4-gate delays.
>>
>>>> Wow!
>>
>>>> So you are saying that simply forcing carry-in takes 4 extra gate
>>>> delays, on top of routing the three address bits, and also merging them
>>>> with the generated cnt3 bit?
>>
>>> Yes, it is sad that SW people see AND and ADD as the same latency.
>> Just in case there's a misunderstanding here:
>>
>> Setting the "carry-in" signal line to HIGH does not take more than one
>> gate delay.
> <
> With a 3 (or 4)-bit adder the carry in does not change the 4-gates of delay.

Since the 3-bit immediate ("tiny") value was to be added to a 64-bit
register anyway, I guessed that the cost of decrementing the immediate
in the encoding and instead using carry-in to adjust for it at runtime,
could be a win.
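
A minimal C sketch of that idea, assuming the tiny immediate is meant to cover 1..8 and is stored as value-1 in the 3-bit field (the range and the function names are my assumptions, not something stated in the thread):

```c
#include <assert.h>
#include <stdint.h>

/* Encoder side: store value-1 so that 1..8 fits in three bits. */
unsigned encode_tiny(unsigned value)
{
    assert(value >= 1u && value <= 8u);
    return (value - 1u) & 7u;
}

/* Hardware side, modelled in C: the zero-extended field goes straight
 * into the 64-bit adder and the carry-in is forced to 1, which restores
 * the +1 that the encoder removed. */
uint64_t add_tiny(uint64_t reg, unsigned field)
{
    const uint64_t carry_in = 1;
    return reg + (uint64_t)(field & 7u) + carry_in;
}
```

The adjustment costs no extra adder, because the carry-in input exists anyway; whether it costs gate delay is exactly what the thread is arguing about.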

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: PDP-11-like ISA

<sa6tv1$qr7$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=17719&group=comp.arch#17719

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: iva...@millcomputing.com (Ivan Godard)
Newsgroups: comp.arch
Subject: Re: PDP-11-like ISA
Date: Sun, 13 Jun 2021 23:45:52 -0700
Organization: A noiseless patient Spider
Lines: 26
Message-ID: <sa6tv1$qr7$1@dont-email.me>
References: <548dc91d-831f-4dd7-a947-8fd7974695e3n@googlegroups.com>
<312431da-80d8-4e85-8e8c-ebc814ab099en@googlegroups.com>
<s9vj68$1h6$1@gioia.aioe.org>
<b030e686-69fe-4563-8949-0ccce68a9766n@googlegroups.com>
<0107a6e5-ccb6-4d58-a18d-ccc68cba06f0n@googlegroups.com>
<sa0ud7$861$1@dont-email.me> <sa5oag$1bl2$1@gioia.aioe.org>
<sa5q37$246$1@dont-email.me> <sa6bdg$18f$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Mon, 14 Jun 2021 06:45:53 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="8502820e329019d05f984e551f3a33dd";
logging-data="27495"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/jcBK1WNdUlPbIjEiMPnVU"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101
Thunderbird/78.11.0
Cancel-Lock: sha1:Ig2/ITGFe7fewL00ONkMuzNJhR8=
In-Reply-To: <sa6bdg$18f$1@dont-email.me>
Content-Language: en-US
 by: Ivan Godard - Mon, 14 Jun 2021 06:45 UTC

On 6/13/2021 6:29 PM, Stephen Fuld wrote:
> On 6/13/2021 1:33 PM, Ivan Godard wrote:
>> On 6/13/2021 1:03 PM, Terje Mathisen wrote:

>> And that's the question. True, there could be bits for </=/> and you
>> do a compound for <= and >=, but that seems pointless to me; you
>> should have a negate flag (branch direction) in the instruction
>> instead of compounding. Otherwise, how often are compound checks of
>> real operands in real code? Yes, FP folding a NaN check is a
>> reasonable use - but how often is NaN checked for in real code
>> regardless of how you do it?
>>
>> So if compounds are ignorably rare then the bit vector is just an
>> encoding idea and should be measured on code density.
>
> One other point.  Mitch's scheme does all those checks with a single op
> code for the compare and one (or two if you include the predication) for
> the conditional branch/predicate.  So depending upon how the
> alternative handles things, you might require fewer op codes.  How much
> that is worth is, of course, dependent on lots of other factors.
>
>

Yes; it shifts entropy from opcode to bit selector. But it doesn't
reduce the total entropy.

Re: PDP-11-like ISA

<30464d49-7244-42c7-a62c-888bcc875902n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=17721&group=comp.arch#17721

Newsgroups: comp.arch
X-Received: by 2002:a05:6214:1843:: with SMTP id d3mr17220329qvy.60.1623654078424;
Mon, 14 Jun 2021 00:01:18 -0700 (PDT)
X-Received: by 2002:a9d:7d05:: with SMTP id v5mr11169330otn.240.1623654078181;
Mon, 14 Jun 2021 00:01:18 -0700 (PDT)
Path: i2pn2.org!rocksolid2!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Mon, 14 Jun 2021 00:01:17 -0700 (PDT)
In-Reply-To: <sa6r0r$euv$1@gioia.aioe.org>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:f8e3:d700:49ad:8998:dba7:ae90;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:f8e3:d700:49ad:8998:dba7:ae90
References: <548dc91d-831f-4dd7-a947-8fd7974695e3n@googlegroups.com>
<cyowI.31628$8f1.4834@fx23.iad> <dfbead57-fff5-4ddb-bb18-38105da53f64n@googlegroups.com>
<j0ywI.36041$431.5756@fx39.iad> <ac0da74d-b52b-4b52-b25e-cf1070ce5e04n@googlegroups.com>
<uQ7xI.28973$J21.10021@fx40.iad> <8b9ed2b1-9aca-43d4-bc95-b199285c9932n@googlegroups.com>
<sa6r0r$euv$1@gioia.aioe.org>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <30464d49-7244-42c7-a62c-888bcc875902n@googlegroups.com>
Subject: Re: PDP-11-like ISA
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Mon, 14 Jun 2021 07:01:18 +0000
Content-Type: text/plain; charset="UTF-8"
 by: Quadibloc - Mon, 14 Jun 2021 07:01 UTC

On Sunday, June 13, 2021 at 11:55:42 PM UTC-6, Terje Mathisen wrote:

> So, this is a NURA (Non-Uniform Register Access) machine?

> I am sure the compiler writers would love it. :-)

Uh, oh. This is also where I'm going with Concertina II.

The original Concertina had banks of 8 registers, with a separate
bank of 8 base registers. But there was also a 64-register scratchpad.

Concertina II has banks of 32 registers, to be 'as good as' RISC. But
since Concertina II aims to be 'efficient', some VLIW goodness has been
thrown in, so that theoretically one _might_ be able to get half-decent
performance even out of an implementation that didn't have full OoO.

Not that I _really_ recommend that, but I'm still trying to make it at least
halfway feasible.

The 32 registers already have some strangeness in their organization.

The 16-bit operate instructions, in order to fit in 16 bits, divide those 32
registers into four groups of 8, and the source and destination registers
in such a 16-bit instruction must be from the same group.
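
As a rough illustration of why the same-group rule saves bits: two full 5-bit register numbers need 10 bits, while a 2-bit group number plus two 3-bit offsets need only 8. The C sketch below uses a made-up field layout (8-bit opcode, 2-bit group, 3-bit destination, 3-bit source); the real Concertina II encoding is not given here and may well differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: [15:8] opcode, [7:6] group, [5:3] dst, [2:0] src.
 * Returns 1 and writes *out on success; returns 0 when the two registers
 * are out of range or do not lie in the same group of eight. */
int encode_short(uint8_t opcode, unsigned dst, unsigned src, uint16_t *out)
{
    assert(out != 0);
    if (dst > 31u || src > 31u)
        return 0;
    if ((dst >> 3) != (src >> 3))      /* different groups: not encodable */
        return 0;
    *out = (uint16_t)(((unsigned)opcode << 8) |
                      ((dst >> 3) << 6) |
                      ((dst & 7u) << 3) |
                      (src & 7u));
    return 1;
}
```

A compiler targeting such a format has to keep both operands of a short instruction inside one group, which is where the register-allocation difficulty comes from.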

And to some extent, the groups have different functions, at least for the
integer registers.

Most instructions which specify an index register can only use registers
1-7 as index registers. A different set of eight registers is used as base
registers.

You can choose between addressing modes with different sizes of
displacement. For each displacement size, a different bank of eight
registers is used for base registers. One reason for this is that if you
reserve, say, 65,536 bytes of memory, and have a pointer to that area,
an addressing mode that only accesses the first 4,096 bytes of it is
really not all that useful.

But it also needs to be noted that it is intended that most of the
integer registers are to be used for integer arithmetic calculations;
so in general, it's expected a program will pick one of those
addressing modes, and stick with it. Possibly with occasional forays
into some other mode, only using one base register for it, because some
needed instruction isn't available in the desired mode.

But I mentioned VLIW.

If the 32 registers in a bank are divided into four groups of eight, this fits
in with programs that are doing several separate calculations at once.
That way, one instruction from each of those calculations can
be put in a group of instructions that, being independent, can
all be executed in parallel.

For that style of programming, though, 32 registers might not be
enough, so I've recently added auxiliary banks of 128 registers.

John Savard

Re: PDP-11-like ISA

<sa731b$1klb$1@gioia.aioe.org>

https://www.novabbs.com/devel/article-flat.php?id=17724&group=comp.arch#17724

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!aioe.org!Z/OnjRNZ74xzNAVdC5cKTg.user.gioia.aioe.org.POSTED!not-for-mail
From: terje.ma...@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: PDP-11-like ISA
Date: Mon, 14 Jun 2021 10:12:28 +0200
Organization: Aioe.org NNTP Server
Lines: 40
Message-ID: <sa731b$1klb$1@gioia.aioe.org>
References: <548dc91d-831f-4dd7-a947-8fd7974695e3n@googlegroups.com>
<cyowI.31628$8f1.4834@fx23.iad>
<dfbead57-fff5-4ddb-bb18-38105da53f64n@googlegroups.com>
<j0ywI.36041$431.5756@fx39.iad> <s9voen$do7$1@gioia.aioe.org>
<121bdc96-e76a-4b2f-8fec-7465378f51den@googlegroups.com>
<sa2rvu$17ul$2@gioia.aioe.org>
<d4d27ae4-96e1-440c-952e-c84cf45ab4d4n@googlegroups.com>
<sa4p97$3pj$1@dont-email.me> <sa53eg$nsa$1@newsreader4.netcologne.de>
NNTP-Posting-Host: Z/OnjRNZ74xzNAVdC5cKTg.user.gioia.aioe.org
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Complaints-To: abuse@aioe.org
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:60.0) Gecko/20100101
Firefox/60.0 SeaMonkey/2.53.7.1
X-Notice: Filtered by postfilter v. 0.9.2
 by: Terje Mathisen - Mon, 14 Jun 2021 08:12 UTC

Thomas Koenig wrote:
> James Harris <james.harris.1@gmail.com> schrieb:
>> On 12/06/2021 21:50, MitchAlsup wrote:
>>> On Saturday, June 12, 2021 at 12:47:46 PM UTC-5, Terje Mathisen wrote:
>>
>> ...
>>
>>>> So you are saying that simply forcing carry-in takes 4 extra gate
>>>> delays, on top of routing the three address bits, and also merging them
>>>> with the generated cnt3 bit?
>>> <
>>> Yes, it is sad that SW people see AND and ADD as the same latency.
>>
>> As what you might call a software person I don't see AND and ADD as
>> needing the same latency and I find it a bit irksome when CPUs require
>> the same time (1 cycle) for each -
>
> You would need an elastic pipeline for this; in this case, you could
> also make your additions somewhat faster in the case where you
> do not need to propagate carries very far (or your multipliers if
> you are, for example, multiplying two 8-bit numbers with a 64-bit
> multiplier).
>
> There are papers on this, but no major architecture has done this to
> date (unless I have missed something, which may well be the case).

This sort of elastic pipeline is pretty much incompatible with
crypto: You need a way to make sure your algorithms will run in constant
time, with zero dependency on the actual keys being used.

I.e. even the classic early-out multipliers would have made it
impossible for me to write a version of DFC (an AES candidate) which was
constant time and less than 10% slower than the maximum-speed
implementation.
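
To show what "constant time" asks of the code itself, here is a minimal C sketch of a comparison that does the same work whatever the data is. It is a generic illustration with a function name of my own choosing, not anything taken from DFC. It only stays constant-time if the hardware executes XOR, OR and subtract in data-independent time, and if the compiler does not reintroduce a branch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the buffers are equal, 0 otherwise. The loop always runs
 * len iterations and never exits early on the first mismatch. */
int ct_equal(const uint8_t *a, const uint8_t *b, size_t len)
{
    assert(a != NULL && b != NULL);
    uint8_t diff = 0;
    for (size_t i = 0; i < len; i++)
        diff |= (uint8_t)(a[i] ^ b[i]);    /* accumulate any difference */
    /* diff == 0 wraps to 0xFFFFFFFF, so bit 8 is set; any nonzero diff
     * (1..255) gives 0..254, where bit 8 is clear. No branch on data. */
    return (int)((((uint32_t)diff - 1u) >> 8) & 1u);
}
```

An early-out multiplier or an elastic adder breaks this kind of code from underneath: the instruction sequence is fixed, but its latency would again depend on the operands.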

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: PDP-11-like ISA

<sa74e2$8qq$1@gioia.aioe.org>

https://www.novabbs.com/devel/article-flat.php?id=17726&group=comp.arch#17726

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!aioe.org!Z/OnjRNZ74xzNAVdC5cKTg.user.gioia.aioe.org.POSTED!not-for-mail
From: terje.ma...@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: PDP-11-like ISA
Date: Mon, 14 Jun 2021 10:36:18 +0200
Organization: Aioe.org NNTP Server
Lines: 78
Message-ID: <sa74e2$8qq$1@gioia.aioe.org>
References: <548dc91d-831f-4dd7-a947-8fd7974695e3n@googlegroups.com>
<312431da-80d8-4e85-8e8c-ebc814ab099en@googlegroups.com>
<s9vj68$1h6$1@gioia.aioe.org>
<b030e686-69fe-4563-8949-0ccce68a9766n@googlegroups.com>
<0107a6e5-ccb6-4d58-a18d-ccc68cba06f0n@googlegroups.com>
<sa0ud7$861$1@dont-email.me> <sa5oag$1bl2$1@gioia.aioe.org>
<sa5q37$246$1@dont-email.me>
<c292c061-2583-4bc6-8de8-74ad03211bc6n@googlegroups.com>
NNTP-Posting-Host: Z/OnjRNZ74xzNAVdC5cKTg.user.gioia.aioe.org
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
X-Complaints-To: abuse@aioe.org
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:60.0) Gecko/20100101
Firefox/60.0 SeaMonkey/2.53.7.1
X-Notice: Filtered by postfilter v. 0.9.2
 by: Terje Mathisen - Mon, 14 Jun 2021 08:36 UTC

MitchAlsup wrote:
> On Sunday, June 13, 2021 at 3:33:45 PM UTC-5, Ivan Godard wrote:
>> On 6/13/2021 1:03 PM, Terje Mathisen wrote:
>>> Ivan Godard wrote:
>
>>> I.e. any kind of compound check.
>>>
>>> Terje
> <
>> And that's the question. True, there could be bits for </=/> and you do
>> a compound for <= and >=, but that seems pointless to me; you should
>> have a negate flag (branch direction) in the instruction instead of
>> compounding. Otherwise, how often are compound checks of real operands
>> in real code? Yes, FP folding a NaN check is a reasonable use - but how
>> often is NaN checked for in real code regardless of how you do it?
> <
> Well Written FP codes do it all the time::
> <
> double ATAN2( double y, double x )
> { // IEEE 754-2019 quality ATAN2
> // deal with NANs
> if( ISNAN( x ) ) return x;
> if( ISNAN( y ) ) return y;
> // deal with infinities
> if( x == +∞ && |y|== +∞ ) return copysign( π/4, y );
> if( x == +∞ ) return copysign( 0.0, y );
> if( x == -∞ && |y|== +∞ ) return copysign( 3π/4, y );
> if( x == -∞ ) return copysign( π, y );
> if( |y|== +∞ ) return copysign( π/2, y );
> // deal with signed zeros
> if( x == 0.0 && y != 0.0 ) return copysign( π/2, y );
> if( x >=+0.0 && y == 0.0 ) return copysign( 0.0, y );
> if( x <=-0.0 && y == 0.0 ) return copysign( π, y );
> // calculate ATAN2 high performance style
> if( x > 0.0 )
> {
> if( y < 0.0 && |y| < |x| ) return - π/2 - ATAN( x / y );
> if( y < 0.0 && |y| > |x| ) return + ATAN( y / x );
> if( y > 0.0 && |y| < |x| ) return + ATAN( y / x );
> if( y > 0.0 && |y| > |x| ) return + π/2 - ATAN( x / y );
> }
> if( x < 0.0 )
> {
> if( y < 0.0 && |y| < |x| ) return + π/2 + ATAN( x / y );
> if( y < 0.0 && |y| > |x| ) return + π - ATAN( y / x );
> if( y > 0.0 && |y| < |x| ) return + π - ATAN( y / x );
> if( y > 0.0 && |y| > |x| ) return +3π/2 + ATAN( x / y );
> }
> <
> Not so well written FP codes are (well) not so well written.
> <
> Also note: only 1 compare is required to perform all of the above checks
> in My 66000 ISA.

In the Mill code I've written for FP emulation, I try to short-circuit
as many of these tests as possible, so I typically try to isolate
zero(/sub-normal)/inf/nan like this:

if (exponent-1 >= 2046) // double input, unsigned exp field

or I can take the entire 64-bit double as uint64_t:

if ((ix+ix-1) >= (0xffe0000000000000-1)) // Must be Zero/Inf/NaN!

I.e. shift up to get rid of sign, decrement (so zero wraps around) and
compare against the Inf+Inf-1 bit pattern (+Inf is 0x7ff0000000000000).

On a machine with LEA this becomes

lea rax,[rax+rax-1]
mov rdx,0xffdfffffffffffff ; x86-64 cmp has no 64-bit immediate form
cmp rax,rdx
jae special_input
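
For anyone who wants to try the two tests, a self-contained C sketch follows. The function names are mine, and the whole-word threshold is twice the bit pattern of +Inf (0x7ff0000000000000), minus one. Note that the exponent test also flags subnormals, while the whole-word test flags only zero, Inf and NaN.

```c
#include <assert.h>
#include <float.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a double as its 64-bit pattern without aliasing problems. */
static uint64_t bits_of(double d)
{
    uint64_t u;
    assert(sizeof u == sizeof d);
    memcpy(&u, &d, sizeof u);
    return u;
}

/* Exponent-field test: true for zero, subnormal, Inf and NaN.
 * An exponent of 0 wraps to UINT64_MAX; 2047 becomes exactly 2046. */
int special_by_exponent(double d)
{
    uint64_t exponent = (bits_of(d) >> 52) & 0x7ffu;
    return (exponent - 1u) >= 2046u;
}

/* Whole-word test: true for zero, Inf and NaN only.
 * ix+ix shifts the sign bit out; the decrement makes +/-0 wrap around. */
int special_by_word(double d)
{
    uint64_t ix = bits_of(d);
    return (ix + ix - 1u) >= (0xffe0000000000000ull - 1u);
}
```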

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: PDP-11-like ISA

<sa7e32$658$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=17730&group=comp.arch#17730

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: james.ha...@gmail.com (James Harris)
Newsgroups: comp.arch
Subject: Re: PDP-11-like ISA
Date: Mon, 14 Jun 2021 12:21:06 +0100
Organization: A noiseless patient Spider
Lines: 45
Message-ID: <sa7e32$658$1@dont-email.me>
References: <548dc91d-831f-4dd7-a947-8fd7974695e3n@googlegroups.com>
<312431da-80d8-4e85-8e8c-ebc814ab099en@googlegroups.com>
<s9vj68$1h6$1@gioia.aioe.org>
<b030e686-69fe-4563-8949-0ccce68a9766n@googlegroups.com>
<0107a6e5-ccb6-4d58-a18d-ccc68cba06f0n@googlegroups.com>
<02aee909-8b53-4a0b-9f10-83a9b4dfb6dfn@googlegroups.com>
<af646ced-b439-4ea5-9c2d-010abd963bddn@googlegroups.com>
<sa2qem$crs$1@dont-email.me>
<12457f77-068d-4252-9a87-c407f8ccc7f1n@googlegroups.com>
<sa4oj5$v4j$1@dont-email.me>
<0d47945c-add3-4438-8245-47271a40e5d9n@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Mon, 14 Jun 2021 11:21:07 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="50b3367d838bd5b30a87477d98aa9bb9";
logging-data="6312"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/VlfhC4CjZVUNpTb6OsB2QEGqfqQKIQGA="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
Thunderbird/78.8.1
Cancel-Lock: sha1:k41fLASMGbePl8WI9AMZmtXMc3g=
In-Reply-To: <0d47945c-add3-4438-8245-47271a40e5d9n@googlegroups.com>
Content-Language: en-GB
 by: James Harris - Mon, 14 Jun 2021 11:21 UTC

On 13/06/2021 19:02, MitchAlsup wrote:
> On Sunday, June 13, 2021 at 6:02:00 AM UTC-5, James Harris wrote:
>> On 13/06/2021 01:26, Quadibloc wrote:
>>> On Saturday, June 12, 2021 at 11:21:29 AM UTC-6, Ivan Godard wrote:
>>>> On 6/12/2021 8:25 AM, Quadibloc wrote:
>>>
>>>>> It has occurred to me that to achieve - without the baggage that the PowerPC took
>>>>> on - complete immunity to condition codes restricting what the compiler can do,
>>>>> there is a missing class of instructions that I need to add to my architecture.
>>>
>>>>> In addition to condition-code based conditional jump instructions... I need jump
>>>>> instructions that use those flag bits originally envisaged as only for instruction
>>>>> predication. (Of course, I _could_ predicate a jump instruction, but if I can avoid
>>>>> the cost and awkwardness of choosing a block header just for that...)

....

> There are 10 ways to interpret an integer-integer compare-

A bit puzzled, here. By "10 ways" I take it you mean things like signed
and unsigned less-than, less-or-equal, etc but checking my favourite list,

http://www.scs.stanford.edu/05au-cs240c/lab/i386/appd.htm

it looks as though there are 14 meaningful comparisons, altogether.

>
> -even if you
> cast out ½ of them you can't encode 5-states in 2-bits--so you have to
> have signed and unsigned compare instructions.

Yes, I was thinking that having signed and unsigned opcodes might allow
there to be just 1 status bit, using it for overflow (regarding Carry as
unsigned overflow) but I've not checked through the details.

Speaking of 2 bits, though, there's IBM's very clever 2-bit condition
code which is not two flags but an encoding of the numbers 0 to 3 with
each such number having a meaning. I last used that system decades ago
and cannot remember the details but I recall it working very well. If
nothing else it shows that the 4-bit NZCV is not the only way.
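
For reference, the System/360 scheme as I understand it: a compare sets the condition code to 0 (equal), 1 (first operand low) or 2 (first operand high), and Branch on Condition carries a 4-bit mask with one bit per condition-code value (mask value 8 selects CC 0, 4 selects CC 1, 2 selects CC 2, 1 selects CC 3). A minimal C sketch, with function names of my own invention:

```c
#include <assert.h>

/* Model of a signed compare setting the 2-bit condition code. */
unsigned compare_cc(long a, long b)
{
    if (a == b) return 0u;          /* CC 0: operands equal     */
    return (a < b) ? 1u : 2u;       /* CC 1: low, CC 2: high    */
}

/* Model of Branch on Condition: taken when the mask bit that
 * corresponds to the current condition code is 1. */
int branch_on_condition(unsigned mask, unsigned cc)
{
    assert(cc <= 3u && mask <= 15u);
    return (mask & (8u >> cc)) != 0;
}
```

So "branch if not high" (a <= b) is mask 8+4 = 12, "branch if not equal" is mask 4+2 = 6 for a compare, and mask 15 is an unconditional branch; every subset of outcomes is reachable from one opcode.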

--
James Harris

Re: PDP-11-like ISA

<sa7gc7$t1a$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=17733&group=comp.arch#17733

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: iva...@millcomputing.com (Ivan Godard)
Newsgroups: comp.arch
Subject: Re: PDP-11-like ISA
Date: Mon, 14 Jun 2021 05:00:06 -0700
Organization: A noiseless patient Spider
Lines: 45
Message-ID: <sa7gc7$t1a$1@dont-email.me>
References: <548dc91d-831f-4dd7-a947-8fd7974695e3n@googlegroups.com>
<cyowI.31628$8f1.4834@fx23.iad>
<dfbead57-fff5-4ddb-bb18-38105da53f64n@googlegroups.com>
<j0ywI.36041$431.5756@fx39.iad> <s9voen$do7$1@gioia.aioe.org>
<121bdc96-e76a-4b2f-8fec-7465378f51den@googlegroups.com>
<sa2rvu$17ul$2@gioia.aioe.org>
<d4d27ae4-96e1-440c-952e-c84cf45ab4d4n@googlegroups.com>
<sa4p97$3pj$1@dont-email.me> <sa53eg$nsa$1@newsreader4.netcologne.de>
<sa731b$1klb$1@gioia.aioe.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 14 Jun 2021 12:00:07 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="8502820e329019d05f984e551f3a33dd";
logging-data="29738"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18OL8Rse6zvUX4TNAWIiBuc"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101
Thunderbird/78.11.0
Cancel-Lock: sha1:ZhQJXPNKgdyhxaMoIJd7aec+2TE=
In-Reply-To: <sa731b$1klb$1@gioia.aioe.org>
Content-Language: en-US
 by: Ivan Godard - Mon, 14 Jun 2021 12:00 UTC

On 6/14/2021 1:12 AM, Terje Mathisen wrote:
> Thomas Koenig wrote:
>> James Harris <james.harris.1@gmail.com> schrieb:
>>> On 12/06/2021 21:50, MitchAlsup wrote:
>>>> On Saturday, June 12, 2021 at 12:47:46 PM UTC-5, Terje Mathisen wrote:
>>>
>>> ...
>>>
>>>>> So you are saying that simply forcing carry-in takes 4 extra gate
>>>>> delays, on top of routing the three address bits, and also merging
>>>>> them
>>>>> with the generated cnt3 bit?
>>>> <
>>>> Yes, it is sad that SW people see AND and ADD as the same latency.
>>>
>>> As what you might call a software person I don't see AND and ADD as
>>> needing the same latency and I find it a bit irksome when CPUs require
>>> the same time (1 cycle) for each -
>>
>> You would need an elastic pipeline for this; in this case, you could
>> also make your additions somewhat faster in the case where you
>> do not need to propagate carries very far (or your multipliers if
>> you are, for example, multiplying two 8-bit numbers with a 64-bit
>> multiplier).
>>
>> There are papers on this, but no major architecture has done this to
>> date (unless I have missed something, which may well be the case).
>
> This sort of elastic pipeline is pretty much incompatible with
> crypto: You need a way to make sure your algorithms will run in constant
> time, with zero dependency on the actual keys being used.
>
> I.e. even the classic early-out multipliers would have made it
> impossible for me to write a version of DFC (an AES candidate) which was
> constant time and less than 10% slower than the maximum-speed
> implementation.
>
> Terje
>

Yes. One of the early complaints about static scheduling is that it
precludes any gain from early-out. Slowly, very slowly, the field is
realizing that you don't want early-out anyway.

Re: PDP-11-like ISA

<F_HxI.623379$ST2.86588@fx47.iad>

https://www.novabbs.com/devel/article-flat.php?id=17734&group=comp.arch#17734

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!aioe.org!news.uzoreto.com!feeder1.feed.usenet.farm!feed.usenet.farm!peer01.ams4!peer.am4.highwinds-media.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx47.iad.POSTED!not-for-mail
From: ThatWoul...@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: PDP-11-like ISA
References: <548dc91d-831f-4dd7-a947-8fd7974695e3n@googlegroups.com> <312431da-80d8-4e85-8e8c-ebc814ab099en@googlegroups.com> <s9vj68$1h6$1@gioia.aioe.org> <b030e686-69fe-4563-8949-0ccce68a9766n@googlegroups.com> <0107a6e5-ccb6-4d58-a18d-ccc68cba06f0n@googlegroups.com> <sa0ud7$861$1@dont-email.me> <sa5oag$1bl2$1@gioia.aioe.org> <sa5q37$246$1@dont-email.me> <c292c061-2583-4bc6-8de8-74ad03211bc6n@googlegroups.com>
In-Reply-To: <c292c061-2583-4bc6-8de8-74ad03211bc6n@googlegroups.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Lines: 64
Message-ID: <F_HxI.623379$ST2.86588@fx47.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Mon, 14 Jun 2021 12:31:33 UTC
Date: Mon, 14 Jun 2021 08:31:08 -0400
X-Received-Bytes: 3901
 by: EricP - Mon, 14 Jun 2021 12:31 UTC

MitchAlsup wrote:
> On Sunday, June 13, 2021 at 3:33:45 PM UTC-5, Ivan Godard wrote:
>> On 6/13/2021 1:03 PM, Terje Mathisen wrote:
>>> Ivan Godard wrote:
>
>>> I.e. any kind of compound check.
>>>
>>> Terje
> <
>> And that's the question. True, there could be bits for </=/> and you do
>> a compound for <= and >=, but that seems pointless to me; you should
>> have a negate flag (branch direction) in the instruction instead of
>> compounding. Otherwise, how often are compound checks of real operands
>> in real code? Yes, FP folding a NaN check is a reasonable use - but how
>> often is NaN checked for in real code regardless of how you do it?
> <
> Well Written FP codes do it all the time::
> <
> double ATAN2( double y, double x )
> { // IEEE 754-2019 quality ATAN2
> // deal with NANs
> if( ISNAN( x ) ) return x;
> if( ISNAN( y ) ) return y;
> // deal with infinities
> if( x == +∞ && |y|== +∞ ) return copysign( π/4, y );
> if( x == +∞ ) return copysign( 0.0, y );
> if( x == -∞ && |y|== +∞ ) return copysign( 3π/4, y );
> if( x == -∞ ) return copysign( π, y );
> if( |y|== +∞ ) return copysign( π/2, y );
> // deal with signed zeros
> if( x == 0.0 && y != 0.0 ) return copysign( π/2, y );
> if( x >=+0.0 && y == 0.0 ) return copysign( 0.0, y );
> if( x <=-0.0 && y == 0.0 ) return copysign( π, y );
> // calculate ATAN2 high performance style
> if( x > 0.0 )
> {
> if( y < 0.0 && |y| < |x| ) return - π/2 - ATAN( x / y );
> if( y < 0.0 && |y| > |x| ) return + ATAN( y / x );
> if( y > 0.0 && |y| < |x| ) return + ATAN( y / x );
> if( y > 0.0 && |y| > |x| ) return + π/2 - ATAN( x / y );
> }
> if( x < 0.0 )
> {
> if( y < 0.0 && |y| < |x| ) return + π/2 + ATAN( x / y );
> if( y < 0.0 && |y| > |x| ) return + π - ATAN( y / x );
> if( y > 0.0 && |y| < |x| ) return + π - ATAN( y / x );
> if( y > 0.0 && |y| > |x| ) return +3π/2 + ATAN( x / y );
> }
> <
> Not so well written FP codes are (well) not so well written.
> <
> Also note: only 1 compare is required to perform all of the above checks
> in My 66000 ISA.
>> So if compounds are ignorably rare then the bit vector is just an
>> encoding idea and should be measured on code density.

How many bits is the cmp bit vector?
I would imagine the above IF construct shows up in a lot of FP code.
Can you use your cmp bit vector as an index to a switch instruction,
or would it bloat too much?

Or maybe two switches: filter out all NANs and INFs first,
then deal with all the number variations.
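
A rough C sketch of the "filter first" variant, to show how little table it needs: each operand is reduced to a 2-bit class, the two classes form a 4-bit index, and a 16-entry table (standing in for the jump table a switch would compile to) picks the path. The class and path names are mine, and fpclassify from <math.h> stands in for whatever a hardware compare would deliver.

```c
#include <assert.h>
#include <math.h>

enum fp_class { CLS_NAN, CLS_INF, CLS_ZERO, CLS_FINITE };   /* 2 bits each */
enum path     { PATH_NAN, PATH_INF, PATH_ZERO, PATH_GENERAL };

/* Reduce one operand to its 2-bit class. */
static enum fp_class classify(double v)
{
    switch (fpclassify(v)) {
    case FP_NAN:      return CLS_NAN;
    case FP_INFINITE: return CLS_INF;
    case FP_ZERO:     return CLS_ZERO;
    default:          return CLS_FINITE;   /* normal or subnormal */
    }
}

/* First-level dispatch for an atan2-style routine: NaN beats Inf,
 * Inf beats zero, and only two ordinary numbers reach PATH_GENERAL. */
enum path atan2_dispatch(double y, double x)
{
    static const unsigned char table[16] = {
        /* x:        NaN       Inf       zero       finite      */
        /* y NaN  */ PATH_NAN, PATH_NAN, PATH_NAN,  PATH_NAN,
        /* y Inf  */ PATH_NAN, PATH_INF, PATH_INF,  PATH_INF,
        /* y zero */ PATH_NAN, PATH_INF, PATH_ZERO, PATH_ZERO,
        /* y fin  */ PATH_NAN, PATH_INF, PATH_ZERO, PATH_GENERAL
    };
    unsigned index = (unsigned)classify(y) * 4u + (unsigned)classify(x);
    assert(index < 16u);
    return (enum path)table[index];
}
```

A caller would then write switch (atan2_dispatch(y, x)) and handle the sign and magnitude cases only inside PATH_GENERAL, which is the second switch.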

Re: PDP-11-like ISA

<d461aa67-280e-428e-a88d-92df48de8168n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=17744&group=comp.arch#17744

Newsgroups: comp.arch
X-Received: by 2002:ac8:424a:: with SMTP id r10mr16902894qtm.147.1623683655409;
Mon, 14 Jun 2021 08:14:15 -0700 (PDT)
X-Received: by 2002:a9d:704b:: with SMTP id x11mr13144676otj.110.1623683655278;
Mon, 14 Jun 2021 08:14:15 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!news.mixmin.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Mon, 14 Jun 2021 08:14:15 -0700 (PDT)
In-Reply-To: <sa6r0r$euv$1@gioia.aioe.org>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:c61:3f99:736a:2902;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:c61:3f99:736a:2902
References: <548dc91d-831f-4dd7-a947-8fd7974695e3n@googlegroups.com>
<cyowI.31628$8f1.4834@fx23.iad> <dfbead57-fff5-4ddb-bb18-38105da53f64n@googlegroups.com>
<j0ywI.36041$431.5756@fx39.iad> <ac0da74d-b52b-4b52-b25e-cf1070ce5e04n@googlegroups.com>
<uQ7xI.28973$J21.10021@fx40.iad> <8b9ed2b1-9aca-43d4-bc95-b199285c9932n@googlegroups.com>
<sa6r0r$euv$1@gioia.aioe.org>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <d461aa67-280e-428e-a88d-92df48de8168n@googlegroups.com>
Subject: Re: PDP-11-like ISA
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Mon, 14 Jun 2021 15:14:15 +0000
Content-Type: text/plain; charset="UTF-8"
 by: MitchAlsup - Mon, 14 Jun 2021 15:14 UTC

On Monday, June 14, 2021 at 12:55:42 AM UTC-5, Terje Mathisen wrote:
> MitchAlsup wrote:
> > On Saturday, June 12, 2021 at 2:23:09 PM UTC-5, EricP wrote:
> >> That removes the IP from the register set and moves all the above mode
> >> specifier bits into the opcode, where they can be optimally assigned.
> > <
> > I am currently playing with the notion where there is a small register file
> > (8 entries) which is really fast, and a larger register file (64-256 entries)
> > which is 1 cycle access, and then memory which is 3-4 cycles of access
> > and supports sizes other than 64 bits. {The first 8 entries of the large
> > file are the small 8-register file.}
>
> So, this is a NURA (Non-Uniform Register Access) machine?
>
> I am sure the compiler writers would love it. :-)
>
> (Personally I would be perfectly happy, 8 effective registers are enough
> for almost all the inner loop code I have ever written.)
<
2 of the Livermore loops require 6 base registers.
And then there is the fact that IP and SP are part of the register file.
<
> Terje
>
> --
> - <Terje.Mathisen at tmsw.no>
> "almost all programming can be viewed as an exercise in caching"

Re: PDP-11-like ISA

<046eb65c-f8e9-4d97-8988-54387ab64227n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=17745&group=comp.arch#17745

X-Received: by 2002:a05:622a:454:: with SMTP id o20mr366302qtx.14.1623684034167;
Mon, 14 Jun 2021 08:20:34 -0700 (PDT)
X-Received: by 2002:a4a:c101:: with SMTP id s1mr13610928oop.54.1623684033959;
Mon, 14 Jun 2021 08:20:33 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Mon, 14 Jun 2021 08:20:33 -0700 (PDT)
In-Reply-To: <sa7e32$658$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:c61:3f99:736a:2902;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:c61:3f99:736a:2902
References: <548dc91d-831f-4dd7-a947-8fd7974695e3n@googlegroups.com>
<312431da-80d8-4e85-8e8c-ebc814ab099en@googlegroups.com> <s9vj68$1h6$1@gioia.aioe.org>
<b030e686-69fe-4563-8949-0ccce68a9766n@googlegroups.com> <0107a6e5-ccb6-4d58-a18d-ccc68cba06f0n@googlegroups.com>
<02aee909-8b53-4a0b-9f10-83a9b4dfb6dfn@googlegroups.com> <af646ced-b439-4ea5-9c2d-010abd963bddn@googlegroups.com>
<sa2qem$crs$1@dont-email.me> <12457f77-068d-4252-9a87-c407f8ccc7f1n@googlegroups.com>
<sa4oj5$v4j$1@dont-email.me> <0d47945c-add3-4438-8245-47271a40e5d9n@googlegroups.com>
<sa7e32$658$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <046eb65c-f8e9-4d97-8988-54387ab64227n@googlegroups.com>
Subject: Re: PDP-11-like ISA
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Mon, 14 Jun 2021 15:20:34 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
 by: MitchAlsup - Mon, 14 Jun 2021 15:20 UTC

On Monday, June 14, 2021 at 6:21:10 AM UTC-5, James Harris wrote:
> On 13/06/2021 19:02, MitchAlsup wrote:
> > On Sunday, June 13, 2021 at 6:02:00 AM UTC-5, James Harris wrote:
> >> On 13/06/2021 01:26, Quadibloc wrote:
> >>> On Saturday, June 12, 2021 at 11:21:29 AM UTC-6, Ivan Godard wrote:
> >>>> On 6/12/2021 8:25 AM, Quadibloc wrote:
> >>>
> >>>>> It has occurred to me that to achieve - without the baggage that the PowerPC took
> >>>>> on - complete immunity to condition codes restricting what the compiler can do,
> >>>>> there is a missing class of instructions that I need to add to my architecture.
> >>>
> >>>>> In addition to condition-code based conditional jump instructions.... I need jump
> >>>>> instructions that use those flag bits originally envisaged as only for instruction
> >>>>> predication. (Of course, I _could_ predicate a jump instruction, but if I can avoid
> >>>>> the cost and awkwardness of choosing a block header just for that....)
> ...
> > There are 10 ways to interpret an integer-integer compare-
> A bit puzzled, here. By "10 ways" I take it you mean things like signed
> and unsigned less-than, less-or-equal, etc but checking my favourite list,
>
> http://www.scs.stanford.edu/05au-cs240c/lab/i386/appd.htm
>
> it looks as though there are 14 meaningful comparisons, altogether.
<
Only if you keep track of overflow and carry; my CMP instruction does not
create those.
<
> >
> > -even if you
> > cast out ½ of them you can't encode 5-states in 2-bits--so you have to
> > have signed and unsigned compare instructions.
<
> Yes, I was thinking that having signed and unsigned opcodes might allow
> there to be just 1 status bit, using it for overflow (regarding Carry as
> unsigned overflow) but I've not checked through the details.
<
You should consider carry to be "significance" overflow.
>
> Speaking of 2 bits, though, there's IBM's very clever 2-bit condition
> code which is not two flags but an encoding of the numbers 0 to 3 with
> each such number having a meaning. I last used that system decades ago
> and cannot remember the details but I recall it working very well. If
> nothing else it shows that the 4-bit NZCV is not the only way.
>
>
> --
> James Harris
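
For readers who, like James, do not remember the details of the 2-bit condition code: the sketch below follows the System/360 convention as I understand it, where a compare sets CC to 0 (equal), 1 (first operand low) or 2 (first operand high), and the branch instruction carries a 4-bit mask with one bit per CC value. The function and constant names are made up for illustration.

```c
#include <stdint.h>

/* Condition code after a signed compare, System/360 style:
   0 = operands equal, 1 = first operand low, 2 = first operand high.
   (CC 3 is not produced by a compare.) */
static unsigned cc_compare(int32_t a, int32_t b)
{
    if (a == b) return 0;
    return (a < b) ? 1 : 2;
}

/* Branch on condition: the 4-bit mask has one bit per CC value.
   The leftmost bit (value 8) selects CC 0, the rightmost (value 1) CC 3.
   The branch is taken if the bit selected by the current CC is set. */
static int cc_branch_taken(unsigned mask4, unsigned cc)
{
    return (int)((mask4 >> (3u - cc)) & 1u);
}

/* Some mask values as they would be used after a compare. */
enum {
    BC_EQUAL     = 8,   /* CC 0 */
    BC_LOW       = 4,   /* CC 1 */
    BC_HIGH      = 2,   /* CC 2 */
    BC_NOT_EQUAL = 7,   /* CC 1, 2 or 3 */
    BC_NOT_HIGH  = 13   /* CC 0, 1 or 3 */
};
```

So "branch if not high" is a single branch with mask 13, with no separate N, Z, C and V flags involved.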

Re: PDP-11-like ISA

<6756e755-b91f-4377-980b-d72d5ddd0a45n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=17746&group=comp.arch#17746

X-Received: by 2002:ac8:6951:: with SMTP id n17mr10246079qtr.340.1623684292615; Mon, 14 Jun 2021 08:24:52 -0700 (PDT)
X-Received: by 2002:a54:4882:: with SMTP id r2mr22471711oic.110.1623684292411; Mon, 14 Jun 2021 08:24:52 -0700 (PDT)
Path: i2pn2.org!i2pn.org!paganini.bofh.team!news.dns-netz.com!news.freedyn.net!newsfeed.xs4all.nl!newsfeed8.news.xs4all.nl!tr1.eu1.usenetexpress.com!feeder.usenetexpress.com!tr3.iad1.usenetexpress.com!border1.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Mon, 14 Jun 2021 08:24:52 -0700 (PDT)
In-Reply-To: <F_HxI.623379$ST2.86588@fx47.iad>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:c61:3f99:736a:2902; posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:c61:3f99:736a:2902
References: <548dc91d-831f-4dd7-a947-8fd7974695e3n@googlegroups.com> <312431da-80d8-4e85-8e8c-ebc814ab099en@googlegroups.com> <s9vj68$1h6$1@gioia.aioe.org> <b030e686-69fe-4563-8949-0ccce68a9766n@googlegroups.com> <0107a6e5-ccb6-4d58-a18d-ccc68cba06f0n@googlegroups.com> <sa0ud7$861$1@dont-email.me> <sa5oag$1bl2$1@gioia.aioe.org> <sa5q37$246$1@dont-email.me> <c292c061-2583-4bc6-8de8-74ad03211bc6n@googlegroups.com> <F_HxI.623379$ST2.86588@fx47.iad>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <6756e755-b91f-4377-980b-d72d5ddd0a45n@googlegroups.com>
Subject: Re: PDP-11-like ISA
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Mon, 14 Jun 2021 15:24:52 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Lines: 76
 by: MitchAlsup - Mon, 14 Jun 2021 15:24 UTC

On Monday, June 14, 2021 at 7:31:37 AM UTC-5, EricP wrote:
> MitchAlsup wrote:
> > On Sunday, June 13, 2021 at 3:33:45 PM UTC-5, Ivan Godard wrote:
> >> On 6/13/2021 1:03 PM, Terje Mathisen wrote:
> >>> Ivan Godard wrote:
> >
> >>> I.e. any kind of compound check.
> >>>
> >>> Terje
> > <
> >> And that's the question. True, there could be bits for </=/> and you do
> >> a compound for <= and >=, but that seems pointless to me; you should
> >> have a negate flag (branch direction) in the instruction instead of
> >> compounding. Otherwise, how often are compound checks of real operands
> >> in real code? Yes, FP folding a NaN check is a reasonable use - but how
> >> often is NaN checked for in real code regardless of how you do it?
> > <
> > Well Written FP codes do it all the time::
> > <
> > double ATAN2( double y, double x )
> > { // IEEE 754-2019 quality ATAN2
> > // deal with NANs
> > if( ISNAN( x ) ) return x;
> > if( ISNAN( y ) ) return y;
> > // deal with infinities
> > if( x == +∞ && |y|== +∞ ) return copysign( π/4, y );
> > if( x == +∞ ) return copysign( 0.0, y );
> > if( x == -∞ && |y|== +∞ ) return copysign( 3π/4, y );
> > if( x == -∞ ) return copysign( π, y );
> > if( |y|== +∞ ) return copysign( π/2, y );
> > // deal with signed zeros
> > if( x == 0.0 && y != 0.0 ) return copysign( π/2, y );
> > if( x >=+0.0 && y == 0.0 ) return copysign( 0.0, y );
> > if( x <=-0.0 && y == 0.0 ) return copysign( π, y );
> > // calculate ATAN2 high performance style
> > if( x > 0.0 )
> > {
> > if( y < 0.0 && |y| < |x| ) return - π/2 - ATAN( x / y );
> > if( y < 0.0 && |y| > |x| ) return + ATAN( y / x );
> > if( y > 0.0 && |y| < |x| ) return + ATAN( y / x );
> > if( y > 0.0 && |y| > |x| ) return + π/2 - ATAN( x / y );
> > }
> > if( x < 0.0 )
> > {
> > if( y < 0.0 && |y| > |x| ) return + π/2 + ATAN( x / y );
> > if( y < 0.0 && |y| > |x| ) return + π - ATAN( y / x );
> > if( y > 0.0 && |y| < |x| ) return + π - ATAN( y / x );
> > if( y > 0.0 && |y| > |x| ) return +3π/2 + ATAN( x / y );
> > }
> > <
> > Not so well written FP codes are (well) not so well written.
> > <
> > Also note: only 1 compare is required to perform all of the above checks
> > in My 66000 ISA.
> >> So if compounds are ignorably rare then the bit vector is just an
> >> encoding idea and should be measured on code density.
> How many bits is the cmp bit vector?
<
64; of which 28 have assigned values (the rest are set to 0).
<
> I would imagine the above IF construct shows up in a lot of FP code.
> Can you use your cmp bit vector as an index to a switch instruction,
> or would it bloat too much?
<
I have it set up like the classify function of OpenGL: you apply
a bit mask, and if any of the selected bits are set you take the branch.
Each individual check is just a branch on bit in the compare result vector.
>
> Or maybe two switches: filter out all NaNs and INFs first,
> then deal with all the number variations.
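
The mask-and-branch scheme described above can be modelled in software roughly as follows. This is only a sketch: the bit names and positions are invented for illustration (the thread says 28 bits are assigned but does not list them), and a real implementation would produce the vector in one compare instruction rather than with a chain of tests.

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical bit assignments for the compare result vector.
   These are NOT the real My 66000 positions, which the thread does not give. */
enum {
    CMP_EQ     = 1 << 0,   /* x == y (ordered) */
    CMP_LT     = 1 << 1,   /* x <  y (ordered) */
    CMP_GT     = 1 << 2,   /* x >  y (ordered) */
    CMP_UNORD  = 1 << 3,   /* at least one operand is NaN */
    CMP_X_NAN  = 1 << 4,
    CMP_Y_NAN  = 1 << 5,
    CMP_X_INF  = 1 << 6,
    CMP_Y_INF  = 1 << 7,
    CMP_X_ZERO = 1 << 8,
    CMP_Y_ZERO = 1 << 9,
    CMP_X_NEG  = 1 << 10,  /* sign bit of x is set */
    CMP_Y_NEG  = 1 << 11   /* sign bit of y is set */
};

/* One "compare": record every relation and operand class at once. */
static uint64_t fcmp_vector(double x, double y)
{
    uint64_t v = 0;

    if (isnan(x)) v |= CMP_X_NAN;
    if (isnan(y)) v |= CMP_Y_NAN;

    if (v != 0)       v |= CMP_UNORD;  /* only NaN bits can be set here */
    else if (x == y)  v |= CMP_EQ;
    else if (x < y)   v |= CMP_LT;
    else              v |= CMP_GT;

    if (isinf(x))   v |= CMP_X_INF;
    if (isinf(y))   v |= CMP_Y_INF;
    if (x == 0.0)   v |= CMP_X_ZERO;   /* true for +0.0 and -0.0 */
    if (y == 0.0)   v |= CMP_Y_ZERO;
    if (signbit(x)) v |= CMP_X_NEG;
    if (signbit(y)) v |= CMP_Y_NEG;
    return v;
}

/* The branch: taken if any bit selected by the mask is set. */
static int branch_if_any(uint64_t vec, uint64_t mask)
{
    return (vec & mask) != 0;
}
```

With this model, the "deal with NaNs" prologue of ATAN2 becomes one call to fcmp_vector followed by branch_if_any(v, CMP_X_NAN | CMP_Y_NAN), and the infinity and signed-zero cases reuse the same vector with different masks.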

Re: PDP-11-like ISA

<jwvbl88sbln.fsf-monnier+comp.arch@gnu.org>

https://www.novabbs.com/devel/article-flat.php?id=17750&group=comp.arch#17750

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: monn...@iro.umontreal.ca (Stefan Monnier)
Newsgroups: comp.arch
Subject: Re: PDP-11-like ISA
Date: Mon, 14 Jun 2021 11:38:20 -0400
Organization: A noiseless patient Spider
Lines: 22
Message-ID: <jwvbl88sbln.fsf-monnier+comp.arch@gnu.org>
References: <548dc91d-831f-4dd7-a947-8fd7974695e3n@googlegroups.com>
<cyowI.31628$8f1.4834@fx23.iad>
<dfbead57-fff5-4ddb-bb18-38105da53f64n@googlegroups.com>
<j0ywI.36041$431.5756@fx39.iad>
<ac0da74d-b52b-4b52-b25e-cf1070ce5e04n@googlegroups.com>
<uQ7xI.28973$J21.10021@fx40.iad>
<8b9ed2b1-9aca-43d4-bc95-b199285c9932n@googlegroups.com>
<sa6r0r$euv$1@gioia.aioe.org>
Mime-Version: 1.0
Content-Type: text/plain
Injection-Info: reader02.eternal-september.org; posting-host="d5e7ea0a847fec74d4cd7b8139cedc52";
logging-data="14202"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/80brUJmj4SllpFhoLj4rA"
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)
Cancel-Lock: sha1:MyrZesqtlFTm/lDLwrPBD26Vegw=
sha1:dYhS/8nlJkOloA5twQ9Q52Xuw2Y=
 by: Stefan Monnier - Mon, 14 Jun 2021 15:38 UTC

Terje Mathisen [2021-06-14 07:55:39] wrote:
> MitchAlsup wrote:
>> On Saturday, June 12, 2021 at 2:23:09 PM UTC-5, EricP wrote:
>>> That removes the IP from the register set and moves all the above mode
>>> specifier bits into the opcode, where they can be optimally assigned.
>> <
>> I am currently playing with the notion where there is a small register file
>> (8 entries) which is really fast, and a larger register file (64-256 entries)
>> which is 1 cycle access, and then memory which is 3-4 cycles of access
>> and supports sizes other than 64 bits. {The first 8 entries of the large
>> file are the small 8-register file.}
> So, this is a NURA (Non-Uniform Register Access) machine?
> I am sure the compiler writers would love it. :-)
> (Personally I would be perfectly happy, 8 effective registers are enough for
> almost all the inner loop code I have ever written.)

I believe you're familiar with a machine called "Mill" which has fast
registers (called "belt positions") and slow registers (placed in
a thingy called "scratchpad") ;-)

Stefan

Re: PDP-11-like ISA

<Y6LxI.50121$EW.43175@fx04.iad>

https://www.novabbs.com/devel/article-flat.php?id=17752&group=comp.arch#17752

Path: i2pn2.org!i2pn.org!aioe.org!feeder1.feed.usenet.farm!feed.usenet.farm!peer03.ams4!peer.am4.highwinds-media.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx04.iad.POSTED!not-for-mail
From: ThatWoul...@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: PDP-11-like ISA
References: <548dc91d-831f-4dd7-a947-8fd7974695e3n@googlegroups.com> <312431da-80d8-4e85-8e8c-ebc814ab099en@googlegroups.com> <s9vj68$1h6$1@gioia.aioe.org> <b030e686-69fe-4563-8949-0ccce68a9766n@googlegroups.com> <0107a6e5-ccb6-4d58-a18d-ccc68cba06f0n@googlegroups.com> <sa0ud7$861$1@dont-email.me> <sa5oag$1bl2$1@gioia.aioe.org> <sa5q37$246$1@dont-email.me> <c292c061-2583-4bc6-8de8-74ad03211bc6n@googlegroups.com> <F_HxI.623379$ST2.86588@fx47.iad> <6756e755-b91f-4377-980b-d72d5ddd0a45n@googlegroups.com>
In-Reply-To: <6756e755-b91f-4377-980b-d72d5ddd0a45n@googlegroups.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Lines: 82
Message-ID: <Y6LxI.50121$EW.43175@fx04.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Mon, 14 Jun 2021 16:05:12 UTC
Date: Mon, 14 Jun 2021 12:04:27 -0400
X-Received-Bytes: 4707
 by: EricP - Mon, 14 Jun 2021 16:04 UTC

MitchAlsup wrote:
> On Monday, June 14, 2021 at 7:31:37 AM UTC-5, EricP wrote:
>> MitchAlsup wrote:
>>> On Sunday, June 13, 2021 at 3:33:45 PM UTC-5, Ivan Godard wrote:
>>>> On 6/13/2021 1:03 PM, Terje Mathisen wrote:
>>>>> Ivan Godard wrote:
>>>>> I.e. any kind of compound check.
>>>>>
>>>>> Terje
>>> <
>>>> And that's the question. True, there could be bits for </=/> and you do
>>>> a compound for <= and >=, but that seems pointless to me; you should
>>>> have a negate flag (branch direction) in the instruction instead of
>>>> compounding. Otherwise, how often are compound checks of real operands
>>>> in real code? Yes, FP folding a NaN check is a reasonable use - but how
>>>> often is NaN checked for in real code regardless of how you do it?
>>> <
>>> Well Written FP codes do it all the time::
>>> <
>>> double ATAN2( double y, double x )
>>> { // IEEE 754-2019 quality ATAN2
>>> // deal with NANs
>>> if( ISNAN( x ) ) return x;
>>> if( ISNAN( y ) ) return y;
>>> // deal with infinities
>>> if( x == +∞ && |y|== +∞ ) return copysign( π/4, y );
>>> if( x == +∞ ) return copysign( 0.0, y );
>>> if( x == -∞ && |y|== +∞ ) return copysign( 3π/4, y );
>>> if( x == -∞ ) return copysign( π, y );
>>> if( |y|== +∞ ) return copysign( π/2, y );
>>> // deal with signed zeros
>>> if( x == 0.0 && y != 0.0 ) return copysign( π/2, y );
>>> if( x >=+0.0 && y == 0.0 ) return copysign( 0.0, y );
>>> if( x <=-0.0 && y == 0.0 ) return copysign( π, y );
>>> // calculate ATAN2 high performance style
>>> if( x > 0.0 )
>>> {
>>> if( y < 0.0 && |y| < |x| ) return - π/2 - ATAN( x / y );
>>> if( y < 0.0 && |y| > |x| ) return + ATAN( y / x );
>>> if( y > 0.0 && |y| < |x| ) return + ATAN( y / x );
>>> if( y > 0.0 && |y| > |x| ) return + π/2 - ATAN( x / y );
>>> }
>>> if( x < 0.0 )
>>> {
>>> if( y < 0.0 && |y| > |x| ) return + π/2 + ATAN( x / y );
>>> if( y < 0.0 && |y| > |x| ) return + π - ATAN( y / x );
>>> if( y > 0.0 && |y| < |x| ) return + π - ATAN( y / x );
>>> if( y > 0.0 && |y| > |x| ) return +3π/2 + ATAN( x / y );
>>> }
>>> <
>>> Not so well written FP codes are (well) not so well written.
>>> <
>>> Also note: only 1 compare is required to perform all of the above checks
>>> in My 66000 ISA.
>>>> So if compounds are ignorably rare then the bit vector is just an
>>>> encoding idea and should be measured on code density.
>> How many bits is the cmp bit vector?
> <
> 64; of which 28 have assigned values (the rest are set to 0).
> <
>> I would imagine the above IF construct shows up in a lot of FP code.
>> Can you use your cmp bit vector as an index to a switch instruction,
>> or would it bloat too much?
> <
> I have it set up like the classify function of OpenGL. Where you apply
> a bit mask and if any are set you take the branch. Individually this
> is a branch on bit in the compare result vector.
>> Or maybe two switches: filter out all NaNs and INFs first,
>> then deal with all the number variations.

Just spitballing...

If one wanted to use the cmp vector in switch instructions,
it could have a "bit pack" instruction which takes a 64-bit source
and a 64-bit mask immediate, and for every '1' bit in the mask it
packs that source bit to the next available position on the right.
So mask b0010_0100_0101 extracts b9, b6, b2, b0 to bits b[3:0] in dest.
Then use that as the switch index.

So for the above, maybe 4 switch instructions.
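
A software model of the "bit pack" operation described above might look like this. The name bit_pack is mine, not an existing instruction mnemonic, and the loop is only meant to pin down the semantics; hardware would do this in parallel.

```c
#include <stdint.h>

/* For every 1 bit in mask, from least to most significant, copy the
   corresponding bit of src into the next free low-order position of
   the result. Bits of src not selected by mask are dropped. */
static uint64_t bit_pack(uint64_t src, uint64_t mask)
{
    uint64_t out = 0;
    unsigned pos = 0;                   /* next free position in out */

    while (mask != 0) {
        uint64_t low = mask & (0 - mask);   /* isolate lowest set mask bit */
        if (src & low)
            out |= (uint64_t)1 << pos;
        pos++;
        mask &= mask - 1;                   /* clear that mask bit */
    }
    return out;
}
```

With the mask from the post, 0x245 (b0010_0100_0101), source bits 9, 6, 2 and 0 land in result bits 3, 2, 1 and 0, giving a switch index in the range 0..15.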

Re: PDP-11-like ISA

<czLxI.57725$iY.24595@fx41.iad>

https://www.novabbs.com/devel/article-flat.php?id=17755&group=comp.arch#17755

Path: i2pn2.org!i2pn.org!paganini.bofh.team!news.dns-netz.com!news.freedyn.net!newsfeed.xs4all.nl!newsfeed7.news.xs4all.nl!feeder1.feed.usenet.farm!feed.usenet.farm!peer02.ams4!peer.am4.highwinds-media.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx41.iad.POSTED!not-for-mail
From: ThatWoul...@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: PDP-11-like ISA
References: <548dc91d-831f-4dd7-a947-8fd7974695e3n@googlegroups.com> <cyowI.31628$8f1.4834@fx23.iad> <dfbead57-fff5-4ddb-bb18-38105da53f64n@googlegroups.com> <j0ywI.36041$431.5756@fx39.iad> <s9voen$do7$1@gioia.aioe.org> <8iKwI.317313$N_4.28704@fx36.iad> <sa2pt3$ctq$1@gioia.aioe.org> <b95bb897-d9a0-4550-8292-0a2248c9cb6an@googlegroups.com> <sa5op9$1l48$1@gioia.aioe.org> <p_vxI.101400$od.74449@fx15.iad> <3c1ae43d-5af5-4d03-87d6-df49fa0c4383n@googlegroups.com>
In-Reply-To: <3c1ae43d-5af5-4d03-87d6-df49fa0c4383n@googlegroups.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 86
Message-ID: <czLxI.57725$iY.24595@fx41.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Mon, 14 Jun 2021 16:35:20 UTC
Date: Mon, 14 Jun 2021 12:35:12 -0400
X-Received-Bytes: 4685
 by: EricP - Mon, 14 Jun 2021 16:35 UTC

MitchAlsup wrote:
> On Sunday, June 13, 2021 at 5:52:08 PM UTC-5, EricP wrote:
>> Terje Mathisen wrote:
>>> MitchAlsup wrote:
>>>> On Saturday, June 12, 2021 at 12:12:07 PM UTC-5, Terje Mathisen wrote:
>>>>> EricP wrote:
>>>>>> Terje Mathisen wrote:
>>>>>>> EricP wrote:
>>>>>>>> So toss out auto inc/dec address modes and put in two instructions
>>>>>>>>
>>>>>>>> // pdp-11 asm has dest reg on the right
>>>>>>>> ADDTY #tiny,reg Add Tiny
>>>>>>>> SUBTY #tiny,reg Subtract Tiny
>>>>>>>>
>>>>>>>> where tiny is 1..7 and 0 means 8. Much more generally useful.
>>>>>>> I would rather have 0..7 meaning 1..8, since this is trivially
>>>>>>> achieved by always enabling an incoming carry to the address adder.
>>>>>>> Special-casing 0 to mean 8 seems like it would need more gates?
>>>>>>>
>>>>>>> I.e. cnt3 = !(cnt0 & cnt1 & cnt2) vs always setting carry-in?
>>>>>>>
>>>>>>> Terje
>>>>>> 1..7, 0 => 8 allows the 3 lower bits to be used directly
>>>>>> and a 3-input NOR-3 gate to drive the 4th bit.
>>>>>>
>>>>>> 0..7 => 1..8 needs an incrementer, in this case a NOT, 2 XOR's,
>>>>>> an AND-2 and an AND-3 carry lookaheads.
>>>>>>
>>>>>> It's in the decoder and there are other more expensive things there,
>>>>>> so it wouldn't be on the critical path either way.
>>>> <
>>>>> The logic is simple either way, but since we're already going to use the
>>>>> (address) adder, just setting carry_in is effectively free.
>>>> <
>>>> What makes you think the AGEN adder has a carry in ??
>>> I realize that it doesn't need one: The bottom bit can be handled with a
>>> half adder which I'm guessing is at least one gate delay faster, and
>>> that can negate any gain from not having to generate cnt3?
>>>
>>> OTOH, if this is dedicated circuit, then it would know to always add the
>>> carry-in, so the bottom half adder would just be slightly different:
>>>
>>> s0 = a == b
>>> carry_out = a | b;
>>>
>>> Right?
>>>
>>> Terje
>>>
>> what I was talking about was the encoding of a field in the instruction
>> which would always be set by the assembler/compiler so there is no reason
>> for it not to be encoded as simply as possible for the hardware.
> <
> For what follows I am using the term "decoder" as a block of logic that asserts
> one of 2^n output signals in response to an n-bit input signal. {There are lots of
> other kinds of decoders (greater than, less than) but this one is also known
> as the equality decoder.}

A unary (one-hot) decoder, check.

> <
> One can simply build the decoder of range(0..7) to assert on domain(1..8)
> at zero cost. {we have not had this one in the discussion yet, but it needs
> to be}

I don't follow you here. Are you talking about feeding the unary decoder
output into a binary encoder, just offset by 1 position?

> <
> The lesser cost is the AND gate which converts 000 into 1000. Then the
> lower 3 bits go into a decoder to assert one of 1..7, with the AND gate asserting
> 8. {The zero output of the decoder is not used and, if the synthesizer is any
> good, the great gate muncher will remove that logic.}
> <

You have lost me.

I saw #tiny as just another kind of immediate binary that happens to be
stashed in an opcode register field. When the instruction is decoded
as ADDTY or SUBTY the converted tiny value gets transferred (MUX'd)
into the uOp immediate field, and the uOp instruction field
is set to ADD or SUB.

After that uOp is treated as any other instruction with an immediate operand.
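
To make the two #tiny encodings from this sub-thread concrete, here is a small model of the decode step as I described it. The struct layout and all names are invented for illustration; only the two mappings (0 means 8, versus field+1) come from the discussion.

```c
#include <stdint.h>

/* Encoding A: field values 1..7 are used directly, 0 means 8.
   Bit 3 of the result is the NOR of the three field bits. */
static unsigned tiny_zero_means_8(unsigned field3)
{
    unsigned f  = field3 & 7u;
    unsigned b0 = f & 1u, b1 = (f >> 1) & 1u, b2 = (f >> 2) & 1u;
    unsigned bit3 = ~(b0 | b1 | b2) & 1u;      /* NOR-3 */
    return (bit3 << 3) | f;
}

/* Encoding B: field values 0..7 mean 1..8, which needs an incrementer
   (or a forced carry-in somewhere downstream). */
static unsigned tiny_plus_one(unsigned field3)
{
    return (field3 & 7u) + 1u;
}

/* Hypothetical micro-op: after decode, ADDTY/SUBTY look like an
   ordinary ADD/SUB with an immediate operand. */
enum uop_kind { UOP_ADD, UOP_SUB };

struct uop {
    enum uop_kind kind;
    unsigned      reg;   /* register that is both source and destination */
    uint64_t      imm;   /* converted tiny value, 1..8 */
};

static struct uop decode_tiny_op(int is_sub, unsigned field3, unsigned reg)
{
    struct uop u;
    u.kind = is_sub ? UOP_SUB : UOP_ADD;
    u.reg  = reg;
    u.imm  = tiny_zero_means_8(field3);   /* encoding A assumed here */
    return u;
}
```

Either mapping yields the same 1..8 range; the difference is only in which small piece of logic sits in the decoder.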

Re: PDP-11-like ISA

<kiNxI.755688$nn2.319656@fx48.iad>

https://www.novabbs.com/devel/article-flat.php?id=17762&group=comp.arch#17762

Path: i2pn2.org!i2pn.org!aioe.org!news.uzoreto.com!news-out.netnews.com!news.alt.net!fdc3.netnews.com!peer01.ams1!peer.ams1.xlned.com!news.xlned.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx48.iad.POSTED!not-for-mail
From: ThatWoul...@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: PDP-11-like ISA
References: <548dc91d-831f-4dd7-a947-8fd7974695e3n@googlegroups.com> <cyowI.31628$8f1.4834@fx23.iad> <dfbead57-fff5-4ddb-bb18-38105da53f64n@googlegroups.com> <j0ywI.36041$431.5756@fx39.iad> <s9voen$do7$1@gioia.aioe.org> <121bdc96-e76a-4b2f-8fec-7465378f51den@googlegroups.com> <sa2rvu$17ul$2@gioia.aioe.org> <d4d27ae4-96e1-440c-952e-c84cf45ab4d4n@googlegroups.com> <sa4p97$3pj$1@dont-email.me> <sa53eg$nsa$1@newsreader4.netcologne.de>
In-Reply-To: <sa53eg$nsa$1@newsreader4.netcologne.de>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 45
Message-ID: <kiNxI.755688$nn2.319656@fx48.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Mon, 14 Jun 2021 18:33:52 UTC
Date: Mon, 14 Jun 2021 14:32:12 -0400
X-Received-Bytes: 3106
 by: EricP - Mon, 14 Jun 2021 18:32 UTC

Thomas Koenig wrote:
> James Harris <james.harris.1@gmail.com> schrieb:
>> On 12/06/2021 21:50, MitchAlsup wrote:
>>> On Saturday, June 12, 2021 at 12:47:46 PM UTC-5, Terje Mathisen wrote:
>> ...
>>
>>>> So you are saying that simply forcing carry-in takes 4 extra gate
>>>> delays, on top of routing the three address bits, and also merging them
>>>> with the generated cnt3 bit?
>>> <
>>> Yes, it is sad that SW people see AND and ADD as the same latency.
>> As what you might call a software person I don't see AND and ADD as
>> needing the same latency and I find it a bit irksome when CPUs require
>> the same time (1 cycle) for each -
>
> You would need an elastic pipeline for this; in this case, you could
> also make your additions somewhat faster in the case where you
> do not need to propagate carries very far (or your multipliers if
> you are, for example, multiplying two 8-bit numbers with a 64-bit
> multiplier).
>
> There are papers on this, but no major architecture has done this to
> date (unless I have missed something, which may well be the case).

I was reading something on elastic pipelines and stumbled
on the term for this: they called them "Telescopic units", and
Google Scholar shows a few hits. They seem to be mostly from 20 years ago.

I haven't read the papers, but the abstracts sound on-topic.

"Results, obtained on a large set of benchmark circuits, show an
average throughput improvement exceeding 27%, at the price of a
modest area increase (less than 8% on average)."

Telescopic units: a new paradigm for performance optimization
of VLSI designs, 1998
https://ieeexplore.ieee.org/abstract/document/700720/
https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.4.3722&rep=rep1&type=pdf

Telescopic units: Increasing the average throughput of pipelined
designs by adaptive latency control, 1997
https://dl.acm.org/doi/abs/10.1145/266021.266029
https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.32.3516&rep=rep1&type=pdf

Re: PDP-11-like ISA

<sa8hg4$a2p$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=17769&group=comp.arch#17769

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: sfu...@alumni.cmu.edu.invalid (Stephen Fuld)
Newsgroups: comp.arch
Subject: Re: PDP-11-like ISA
Date: Mon, 14 Jun 2021 14:25:22 -0700
Organization: A noiseless patient Spider
Lines: 42
Message-ID: <sa8hg4$a2p$1@dont-email.me>
References: <548dc91d-831f-4dd7-a947-8fd7974695e3n@googlegroups.com>
<312431da-80d8-4e85-8e8c-ebc814ab099en@googlegroups.com>
<s9vj68$1h6$1@gioia.aioe.org>
<b030e686-69fe-4563-8949-0ccce68a9766n@googlegroups.com>
<0107a6e5-ccb6-4d58-a18d-ccc68cba06f0n@googlegroups.com>
<sa0ud7$861$1@dont-email.me> <sa5oag$1bl2$1@gioia.aioe.org>
<sa5q37$246$1@dont-email.me> <sa6bdg$18f$1@dont-email.me>
<sa6tv1$qr7$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Mon, 14 Jun 2021 21:25:25 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="46652488b945540d2cc17183c5ebfcd5";
logging-data="10329"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+MO+DVNC/nraW6ANTdLgoptvVN0/33L2A="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101
Thunderbird/78.11.0
Cancel-Lock: sha1:pxKxPJCQtdfTxd549btnBx1wPXk=
In-Reply-To: <sa6tv1$qr7$1@dont-email.me>
Content-Language: en-US
 by: Stephen Fuld - Mon, 14 Jun 2021 21:25 UTC

On 6/13/2021 11:45 PM, Ivan Godard wrote:
> On 6/13/2021 6:29 PM, Stephen Fuld wrote:
>> On 6/13/2021 1:33 PM, Ivan Godard wrote:
>>> On 6/13/2021 1:03 PM, Terje Mathisen wrote:
>
>>> And that's the question. True, there could be bits for </=/> and you
>>> do a compound for <= and >=, but that seems pointless to me; you
>>> should have a negate flag (branch direction) in the instruction
>>> instead of compounding. Otherwise, how often are compound checks of
>>> real operands in real code? Yes, FP folding a NaN check is a
>>> reasonable use - but how often is NaN checked for in real code
>>> regardless of how you do it?
>>>
>>> So if compounds are ignorably rare then the bit vector is just an
>>> encoding idea and should be measured on code density.
>>
>> One other point.  Mitch's scheme does all those checks with a single
>> op code for the compare and one (or two if you include the
>> predication) for the conditional branch/predicate.  So depending
>> upon how the alternative handles things, you might require fewer op
>> codes.  How much that is worth is, of course, dependent on lots of
>> other factors.
>>
>>
>
> Yes; it shifts entropy from opcode to bit selector. But it doesn't
> reduce the total entropy.

Of course, you are right, but if (unlike the Mill) you have fixed-length
instructions, especially with fixed-length fields within them, you might
have more, otherwise unused, bits available in other places. For example,
in a 32-bit fixed-length instruction, a conditional branch typically leaves
one of the register specifier fields unused, so that field is available to
hold those bits.
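
As a rough illustration of that point, here is a made-up 32-bit branch-on-bit format in which the slot that would normally name a second register holds the bit selector instead. The field layout is purely hypothetical and is not taken from My 66000 or any other real ISA; a 5-bit selector is assumed to suffice only because the thread mentions 28 assigned bits, and it assumes those sit in the low 32 positions.

```c
#include <stdint.h>

/* Hypothetical layout:
     [31:26] opcode
     [25:21] bit selector (reuses an otherwise idle register field)
     [20:16] source register holding the compare result vector
     [15:0]  signed branch displacement */
static uint32_t encode_bb(unsigned opcode, unsigned bit, unsigned src,
                          int16_t disp)
{
    return ((uint32_t)(opcode & 0x3Fu) << 26)
         | ((uint32_t)(bit    & 0x1Fu) << 21)
         | ((uint32_t)(src    & 0x1Fu) << 16)
         | (uint32_t)(uint16_t)disp;
}

/* Taken if the selected bit of the source register is set. */
static int bb_taken(uint32_t insn, const uint64_t regs[32])
{
    unsigned bit = (insn >> 21) & 0x1Fu;
    unsigned src = (insn >> 16) & 0x1Fu;
    return (int)((regs[src] >> bit) & 1u);
}
```

The selector costs no extra opcode space here: one branch opcode covers every condition the compare can report, which is the shift of entropy from opcode to bit selector that Ivan describes.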

As I said before, YMMV.

--
- Stephen Fuld
(e-mail address disguised to prevent spam)
