Re: Java bytecode processors

<2d262fce-9363-4360-bfd9-ba4263e8d703n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=20398&group=comp.arch#20398

X-Received: by 2002:a37:2754:: with SMTP id n81mr258185qkn.297.1631322228773;
Fri, 10 Sep 2021 18:03:48 -0700 (PDT)
X-Received: by 2002:a9d:7019:: with SMTP id k25mr445851otj.350.1631322228513;
Fri, 10 Sep 2021 18:03:48 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Fri, 10 Sep 2021 18:03:48 -0700 (PDT)
In-Reply-To: <shguv8$2416$1@gal.iecc.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:5027:7e9b:eaf7:6c1;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:5027:7e9b:eaf7:6c1
References: <sh3evb$n1v$1@newsreader4.netcologne.de> <sh7ttc$r9h$1@z-news.wcss.wroc.pl>
<3fea318c-62be-4d30-aafb-976eeee14908n@googlegroups.com> <shguv8$2416$1@gal.iecc.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <2d262fce-9363-4360-bfd9-ba4263e8d703n@googlegroups.com>
Subject: Re: Java bytecode processors
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Sat, 11 Sep 2021 01:03:48 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 17

by: MitchAlsup - Sat, 11 Sep 2021 01:03 UTC

On Friday, September 10, 2021 at 7:58:19 PM UTC-5, John Levine wrote:
> According to JimBrakefield <jim.bra...@ieee.org>:
> >Explore extending a byte code ISA to a two byte ISA.
> >gives both a stack operator and a stack reference
> >and with the few remaining bits one can concoct all manner
> >of extras: replace result into stack reference, pop/push the stack,
> >indirect addressing through the stack reference, apply indexing
> >to the stack reference as an address
> > I.e. use the stack frame of mind/reference to add features which
> >map into RISC instructions easily.
<
> Sounds a lot like a PDP-11.
<
Which died because its successor was too hard to pipeline.
> --
> Regards,
> John Levine, jo...@taugh.com, Primary Perpetrator of "The Internet for Dummies",
> Please consider the environment before reading this e-mail. https://jl.ly

Re: not the PDP-11, was Java bytecode processors

<shh02c$267q$1@gal.iecc.com>

https://www.novabbs.com/devel/article-flat.php?id=20399&group=comp.arch#20399

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!not-for-mail
From: joh...@taugh.com (John Levine)
Newsgroups: comp.arch
Subject: Re: not the PDP-11, was Java bytecode processors
Date: Sat, 11 Sep 2021 01:17:00 -0000 (UTC)
Organization: Taughannock Networks
Message-ID: <shh02c$267q$1@gal.iecc.com>
References: <sh3evb$n1v$1@newsreader4.netcologne.de> <3fea318c-62be-4d30-aafb-976eeee14908n@googlegroups.com> <shguv8$2416$1@gal.iecc.com> <2d262fce-9363-4360-bfd9-ba4263e8d703n@googlegroups.com>
Injection-Date: Sat, 11 Sep 2021 01:17:00 -0000 (UTC)
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="71930"; mail-complaints-to="abuse@iecc.com"
In-Reply-To: <sh3evb$n1v$1@newsreader4.netcologne.de> <3fea318c-62be-4d30-aafb-976eeee14908n@googlegroups.com> <shguv8$2416$1@gal.iecc.com> <2d262fce-9363-4360-bfd9-ba4263e8d703n@googlegroups.com>
Cleverness: some
X-Newsreader: trn 4.0-test77 (Sep 1, 2010)
Originator: johnl@iecc.com (John Levine)

by: John Levine - Sat, 11 Sep 2021 01:17 UTC

According to MitchAlsup <MitchAlsup@aol.com>:
>> Sounds a lot like a PDP-11.
><
>Which died because its successor was too hard to pipeline.

The PDP-11 was a really good design for the late 1960s, when memory
was starting to be affordable and microcode ROM was still a lot faster
than core.

The VAX was also a good design for the late 1960s but unfortunately it
was introduced in the late 1970s. It wasn't just that it was too hard
to pipeline. The instruction set was full of overcomplicated
microcoded instructions that were often slower than a sequence of
simple instructions, and they somehow didn't notice that memory was
getting a lot cheaper, with a super-dense super-general instruction
set intended for assembler programmers, and tiny 512 byte pages that
even at the time were obviously too small.

--
Regards,
John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly

Re: not the PDP-11, was Java bytecode processors

<9d0738b6-a57d-4eff-99cf-937f1be0fa6en@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=20400&group=comp.arch#20400

X-Received: by 2002:a37:a4c5:: with SMTP id n188mr400862qke.273.1631325954604;
Fri, 10 Sep 2021 19:05:54 -0700 (PDT)
X-Received: by 2002:a05:6808:118:: with SMTP id b24mr512625oie.0.1631325954280;
Fri, 10 Sep 2021 19:05:54 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Fri, 10 Sep 2021 19:05:54 -0700 (PDT)
In-Reply-To: <shh02c$267q$1@gal.iecc.com>
Injection-Info: google-groups.googlegroups.com; posting-host=136.50.182.0; posting-account=AoizIQoAAADa7kQDpB0DAj2jwddxXUgl
NNTP-Posting-Host: 136.50.182.0
References: <sh3evb$n1v$1@newsreader4.netcologne.de> <3fea318c-62be-4d30-aafb-976eeee14908n@googlegroups.com>
<shguv8$2416$1@gal.iecc.com> <2d262fce-9363-4360-bfd9-ba4263e8d703n@googlegroups.com>
<shh02c$267q$1@gal.iecc.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <9d0738b6-a57d-4eff-99cf-937f1be0fa6en@googlegroups.com>
Subject: Re: not the PDP-11, was Java bytecode processors
From: jim.brak...@ieee.org (JimBrakefield)
Injection-Date: Sat, 11 Sep 2021 02:05:54 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 30

by: JimBrakefield - Sat, 11 Sep 2021 02:05 UTC

On Friday, September 10, 2021 at 8:17:02 PM UTC-5, John Levine wrote:
> According to MitchAlsup <Mitch...@aol.com>:
> >> Sounds a lot like a PDP-11.
> ><
> >Which died because its successor was too hard to pipeline.
> The PDP-11 was a really good design for the late 1960s, when memory
> was starting to be affordable and microcode ROM was still a lot faster
> than core.
>
> The VAX was also a good design for the late 1960s but unfortunately it
> was introduced in the late 1970s. It wasn't just that it was too hard
> to pipeline. The instruction set was full of overcomplicated
> microcoded instructions that were often slower than a sequence of
> simple instructions, and they somehow didn't notice that memory was
> getting a lot cheaper, with a super-dense super-general instruction
> set intended for assembler programmers, and tiny 512 byte pages that
> even at the time were obviously too small.
> --
> Regards,
> John Levine, jo...@taugh.com, Primary Perpetrator of "The Internet for Dummies",
> Please consider the environment before reading this e-mail. https://jl.ly

So, reinvent the VAX:
keep most of the op-codes, encoded in 8-bits
limit memory references to one operand or result except for mem-mem move
eliminate the double indirect addressing modes
all fits nicely into 24-bits (except for immediate values): op, R1, R2, D, adr-mode

A thing of beauty made practical.
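
A minimal sketch of how such a 24-bit instruction word might be packed. The 4-bit widths for R1, R2, D and adr-mode are an assumption (the list above only fixes the 8-bit op-code and the 24-bit total), and `pack24` / `unpack24` are hypothetical names:

```python
# Assumed layout (not from the post): op:8 | r1:4 | r2:4 | d:4 | mode:4 = 24 bits.
FIELDS = (("op", 8), ("r1", 4), ("r2", 4), ("d", 4), ("mode", 4))


def pack24(**values):
    """Pack the named fields into one 24-bit integer, most significant field first."""
    word = 0
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack24(word):
    """Split a 24-bit integer back into its named fields."""
    out = {}
    shift = 24
    for name, width in FIELDS:
        shift -= width
        out[name] = (word >> shift) & ((1 << width) - 1)
    return out
```

With only 4 bits for D, any real displacement would presumably have to follow the 24-bit word the way immediates do.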

Design a better 16 or 32 bit processor

<shhl98$3vo$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=20402&group=comp.arch#20402

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: ggt...@yahoo.com (Brett)
Newsgroups: comp.arch
Subject: Design a better 16 or 32 bit processor
Date: Sat, 11 Sep 2021 07:19:04 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 62
Message-ID: <shhl98$3vo$1@dont-email.me>
References: <sh3evb$n1v$1@newsreader4.netcologne.de>
<3fea318c-62be-4d30-aafb-976eeee14908n@googlegroups.com>
<shguv8$2416$1@gal.iecc.com>
<2d262fce-9363-4360-bfd9-ba4263e8d703n@googlegroups.com>
<shh02c$267q$1@gal.iecc.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Sat, 11 Sep 2021 07:19:04 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="5d5f2bad989dbbf8a8d56ad9bacbde7f";
logging-data="4088"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/C5IDCnjouFsHB7RPANr+U"
User-Agent: NewsTap/5.5 (iPad)
Cancel-Lock: sha1:IX4DqK7VmdSYOxzXH+VcrLclXbU=
sha1:0IEyQ+sPzpJ2w6iDgFOSxe/ZYp0=

by: Brett - Sat, 11 Sep 2021 07:19 UTC

John Levine <johnl@taugh.com> wrote:
> According to MitchAlsup <MitchAlsup@aol.com>:
>>> Sounds a lot like a PDP-11.
>> <
>> Which died because its successor was too hard to pipeline.
>
> The PDP-11 was a really good design for the late 1960s, when memory
> was starting to be affordable and microcode ROM was still a lot faster
> than core.
>
> The VAX was also a good design for the late 1960s but unfortunately it
> was introduced in the late 1970s. It wasn't just that it was too hard
> to pipeline.

> The instruction set was full of overcomplicated
> microcoded instructions that were often slower than a sequence of
> simple instructions,

A myth that was promoted by RISC proponents, which has been debunked.

RISC only made sense for a decade back in the ancient history of the
1980s.

Today if I wanted to build a better 16 or 32 bit processor the first step
would be to find what microcoded instructions I could add to improve
code density, and thus win the lowest-cost war.

The 8086 with hard coded registers was quite good for the era, but we can
do better today, by micro coding much more complicated sequences.

The first instruction I would add is a one instruction memcpy loop, which
would use three hard coded registers to make the instruction short. The
data register would not be visible so that I could use vector registers if
I wanted. And there would be several variants for copy size and alignment.
And a bit to decide if the count is part of the instruction.
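
A rough behavioral model of such a copy instruction, as a sketch only. The register names `src`, `dst` and `cnt`, the `unit` parameter standing in for the size variants, and the function name `micro_memcpy` are all assumptions; the temporary that carries the data is a Python slice and never shows up in `regs`, which mirrors the idea of an invisible data register:

```python
def micro_memcpy(mem, regs, unit=1):
    """Model one execution of the copy instruction on a bytearray `mem`.

    `regs` holds the three fixed registers (assumed names): source address,
    destination address, and a count of `unit`-byte elements still to copy.
    """
    while regs["cnt"] > 0:
        s, d = regs["src"], regs["dst"]
        mem[d:d + unit] = mem[s:s + unit]  # hidden data register, not in regs
        regs["src"] += unit
        regs["dst"] += unit
        regs["cnt"] -= 1
    return regs
```

Because the three registers are updated after every element, the loop could be interrupted and resumed at any iteration, which is one reason to keep the state in architectural registers.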

Another instruction I would add is add plus store, etc.

The case for load plus add is harder, and might not make the cut due to
transistor implementation cost outweighing memory transistor savings.
Load compare and branch is in the same boat and has to be added to the
total system cost of load compute.

I seriously think a new 16 or 32 bit processor with microcoded
instructions could win market share, by the simple expedient of a smaller
total size with code included.

My template is a pipelined 386 with 32 registers and far more complex and
longer micro code instructions.

There is a major company that went for an updated 386, but did not improve
anything besides instruction encoding and minor fixes. Failed to go big and
so failed, go big or go home.

> and they somehow didn't notice that memory was
> getting a lot cheaper, with a super-dense super-general instruction
> set intended for assembler programmers, and tiny 512 byte pages that
> even at the time were obviously too small.
>

Re: Design a better 16 or 32 bit processor

<shhlm7$9p5$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=20403&group=comp.arch#20403

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2001-4dd7-c002-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: Design a better 16 or 32 bit processor
Date: Sat, 11 Sep 2021 07:25:59 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <shhlm7$9p5$1@newsreader4.netcologne.de>
References: <sh3evb$n1v$1@newsreader4.netcologne.de>
<3fea318c-62be-4d30-aafb-976eeee14908n@googlegroups.com>
<shguv8$2416$1@gal.iecc.com>
<2d262fce-9363-4360-bfd9-ba4263e8d703n@googlegroups.com>
<shh02c$267q$1@gal.iecc.com> <shhl98$3vo$1@dont-email.me>
Injection-Date: Sat, 11 Sep 2021 07:25:59 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2001-4dd7-c002-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2001:4dd7:c002:0:7285:c2ff:fe6c:992d";
logging-data="10021"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)

by: Thomas Koenig - Sat, 11 Sep 2021 07:25 UTC

Brett <ggtgp@yahoo.com> schrieb:
> John Levine <johnl@taugh.com> wrote:
>> According to MitchAlsup <MitchAlsup@aol.com>:
>>>> Sounds a lot like a PDP-11.
>>> <
>>> Which died because its successor was too hard to pipeline.
>>
>> The PDP-11 was a really good design for the late 1960s, when memory
>> was starting to be affordable and microcode ROM was still a lot faster
>> than core.
>>
>> The VAX was also a good design for the late 1960s but unfortunately it
>> was introduced in the late 1970s. It wasn't just that it was too hard
>> to pipeline.
>
>> The instruction set was full of overcomplicated
>> microcoded instructions that were often slower than a sequence of
>> simple instructions,
>
> A Myth what was promoted by RISC proponents, which has been debunked.

^1 ^2

^1 = Dubious, discuss
^2 = Citation needed

In article <shhl98$3vo$1@dont-email.me>, ggtgp@yahoo.com (Brett) wrote:

> There is a major company that went for an updated 386, but did not
> improve anything besides instruction encoding and minor fixes.
> Failed to go big and so failed, go big or go home.

Citation?

John

Re: Design a better 16 or 32 bit processor

<shiqou$6kc$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=20411&group=comp.arch#20411

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Design a better 16 or 32 bit processor
Date: Sat, 11 Sep 2021 12:57:34 -0500
Organization: A noiseless patient Spider
Lines: 130
Message-ID: <shiqou$6kc$1@dont-email.me>
References: <shhl98$3vo$1@dont-email.me>
<memo.20210911133846.9608G@jgd.cix.co.uk>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sat, 11 Sep 2021 17:58:54 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="c5184feed35a074691f636cdcfe7cce4";
logging-data="6796"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/pqhyrLNlW66Ljgbw60dm9"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101
Thunderbird/78.14.0
Cancel-Lock: sha1:k393T7epZWF3dZtwVRdqT3mcIfs=
In-Reply-To: <memo.20210911133846.9608G@jgd.cix.co.uk>
Content-Language: en-US

by: BGB - Sat, 11 Sep 2021 17:57 UTC

On 9/11/2021 7:37 AM, John Dallman wrote:
> In article <shhl98$3vo$1@dont-email.me>, ggtgp@yahoo.com (Brett) wrote:
>
>> There is a major company that went for an updated 386, but did not
>> improve anything besides instruction encoding and minor fixes.
>> Failed to go big and so failed, go big or go home.
>
> Citation?
>
> John
>

I am starting to think about the possibility of a core which only
natively does a restricted subset of x86 and then takes a special trap
for the remaining cases. The trap likely executes code with a modified
ISA and expanded register set (potentially, REX prefixes are supported
during a trap, but used mostly for the trap handler to have scratch
registers).

The trap would be to a special ROM address.

Say, Mod/RM, Mod:
00: [Reg] //MOV Only (Native, *1)
01: [Reg+Disp8] //MOV Only (Native, *1)
10: [Reg+Disp32] //MOV Only (Native, *1)
11: Reg //General Ops

SIB byte:
Mod=00, all index values allowed.
Mod=01/10, only None allowed.
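
The Mod and SIB restrictions above can be sketched as a small classifier. The field positions are the standard x86 ones for 32-bit addressing (Mod in bits 7:6, Reg in 5:3, R/M in 2:0, with R/M=100 selecting a SIB byte and SIB index=100 meaning "no index"). Treating every disallowed form as "trap" is an assumption, and `handled_natively` is a hypothetical name:

```python
def split_modrm(byte):
    """Return (mod, reg, rm) from an x86 ModRM byte: bits 7:6, 5:3, 2:0."""
    return (byte >> 6) & 3, (byte >> 3) & 7, byte & 7


def split_sib(byte):
    """Return (scale, index, base) from an x86 SIB byte: bits 7:6, 5:3, 2:0."""
    return (byte >> 6) & 3, (byte >> 3) & 7, byte & 7


def handled_natively(modrm, sib=None):
    """True if the addressing form fits the restricted subset, else False (trap)."""
    mod, _reg, rm = split_modrm(modrm)
    if mod == 0b11 or rm != 0b100:
        return True  # register operand, or a memory form with no SIB byte
    if sib is None:
        raise ValueError("R/M=100 with Mod!=11 needs a SIB byte")
    _scale, index, _base = split_sib(sib)
    if mod == 0b00:
        return True  # Mod=00: all index values allowed
    return index == 0b100  # Mod=01/10: only "no index" allowed
```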

*1:
OP Reg, Mem
Splits into a 2-op sequence on decode:
MOV Ri, Mem
OP Reg, Ri
Where Ri is an internal register.

Ops with memory as a destination would likely split into 3 ops in the
decode stage:
MOV Ri, Mem
OP Ri, Reg
MOV Mem, Ri
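
The two decode-time expansions above can be sketched as a function that rewrites one instruction into its native sequence. The tuple representation, the bracket convention for memory operands and the name `crack` are illustrative assumptions, not part of the proposed design:

```python
def crack(op, dst, src, scratch="Ri"):
    """Expand one two-operand instruction into MOVs plus a register-only op.

    Operands are strings; a memory operand is written in brackets,
    e.g. "[EBX+8]". `scratch` names the internal register.
    """
    def is_mem(operand):
        return operand.startswith("[")

    if is_mem(dst) and is_mem(src):
        raise ValueError("no mem, mem form for general ops")
    if op == "MOV" or not (is_mem(dst) or is_mem(src)):
        return [(op, dst, src)]  # already native, nothing to split
    if is_mem(src):
        # OP Reg, Mem  ->  load, then operate
        return [("MOV", scratch, src), (op, dst, scratch)]
    # OP Mem, Reg  ->  load, operate, store back
    return [("MOV", scratch, dst), (op, scratch, src), ("MOV", dst, scratch)]
```

For example, `crack("ADD", "[EBX+8]", "EAX")` yields the three-op load/operate/store sequence, which is where ID1 would have to interlock while the extra ops are issued.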

Goal would be an x86 variant that could be "usefully" implemented on an
FPGA without being too slow to be usable.

There were apparently a few x86 on FPGA attempts, but many of the
"usable" ones were non-pipelined and generally too slow even to run most
MS-DOS era games. Some others generally required big/expensive FPGAs
(apparently the developers were renting them via a cloud server).

The idea in this case would be to shoe-horn x86 into a RISC-style pipeline.

The Mem-Cache would be able to Stall, ID1 and ID2 would both be able to
trigger interlocks (ID2 for registers, ID1 for decomposing instructions).

Opcode:
Most: 1-Byte Base
0F, F0, F2, F3, 64, 65, 66, 67, 26, 36, 2E, 3E, D8..DF: 2-byte Base

Bytes 2+3 or 3+4, Lookup table for Mod/RM/SIB extension.
Added to Base for encodings which use Mod/RM.
Another table can be used for Byte Word/DWord immediates.

Eg:
04, 0C, 14, 1C, ...: Byte Immed
05, 0D, 15, 1D, ...: Word Immed
80, 83: Byte Immed
81: Word Immed
...

Likely, the IF stage would need to figure out opcode length and would
produce results "unpacked" into a fixed-length format, say:
Opcode: 12 Bits (Prefix Merged)
ModRM / SIB: 20 bits
Imm: 32-bits (Sx/Zx)
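
The three fields add up to 12 + 20 + 32 = 64 bits, so the unpacked form fits one 64-bit word. A minimal sketch, assuming the opcode sits in the top bits and the immediate in the bottom (the field order is not specified above, and the function names are hypothetical):

```python
def extend_imm(raw, size_bytes, signed):
    """Sign- or zero-extend a 1/2/4-byte immediate to 32 bits (the Sx/Zx step)."""
    bits = 8 * size_bytes
    raw &= (1 << bits) - 1
    if signed and raw >> (bits - 1):
        raw -= 1 << bits  # negative value; the final mask wraps it to 32 bits
    return raw & 0xFFFFFFFF


def pack_unpacked(opcode12, modrm_sib20, imm32):
    """Pack into 64 bits: opcode in 63:52, ModRM/SIB in 51:32, imm in 31:0."""
    if not 0 <= opcode12 < (1 << 12):
        raise ValueError("opcode field is 12 bits")
    if not 0 <= modrm_sib20 < (1 << 20):
        raise ValueError("ModRM/SIB field is 20 bits")
    return (opcode12 << 52) | (modrm_sib20 << 32) | (imm32 & 0xFFFFFFFF)
```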

This would be followed by another Decode and Register Fetch stage.
EIP would be semi-looped, where IF figures out the next EIP (EIP_1), but
other logic has the ability to override it. This EIP_1 would be captured
for use by EIP-relative instructions.

FPU: Partial Emulation
The Actual FPU would be MM0..MM7, FADD/FSUB/FMUL, ...
Most other FPU instructions are emulated via a trap.
....

MMX/SSE: Probably not supported.

Hardware memory Map:
00000000 .. 0010FFFF: Mimic DOS memory map.
00110000 .. 00FFFFFF: Mix of RAM and ISA hardware addresses.
01000000 .. 7FFFFFFF: More RAM, wraps around.
C0000000 .. FFFFFFFF: PCI emulation, ROM, ...

This would be along with the A20 line and all that other fun.
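
As a sketch, the memory map above can be written as a range table. The region labels are shorthand for the descriptions in the list, and the 80000000..BFFFFFFF range, which the list does not mention, is reported as "unassigned" here by assumption:

```python
# (first address, last address, label), taken from the map above.
REGIONS = [
    (0x00000000, 0x0010FFFF, "DOS memory map"),
    (0x00110000, 0x00FFFFFF, "RAM / ISA hardware"),
    (0x01000000, 0x7FFFFFFF, "RAM (wraps around)"),
    (0xC0000000, 0xFFFFFFFF, "PCI emulation / ROM"),
]


def classify(addr):
    """Return the label of the region containing a 32-bit physical address."""
    if not 0 <= addr <= 0xFFFFFFFF:
        raise ValueError("address must fit in 32 bits")
    for first, last, label in REGIONS:
        if first <= addr <= last:
            return label
    return "unassigned"  # assumption: the gap is not described in the post
```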

Goal would be to mostly support enough of the ISA to hopefully run Doom
and Quake and similar, and hopefully run MS-DOS, ...

I guess the debate is whether such a thing could be worthwhile.

I had previously also imagined potentially JIT compiling x86 to BJX2,
but this project hasn't gone very far as of yet (and I have doubts as to
whether it could give satisfactory results). Software emulation could
likely support a larger part of the ISA than would seem realistic via a
hardware implementation.

A JIT would have a slightly easier time as one doesn't necessarily need
to fake all the hardware, but could instead limit things to "userland
only" and then mimic OS level APIs.

I once started on such a project to try to emulate a limited subset of
the Win32 API, but this effort kinda fizzled out without much in the way
of results.

Another possibility, granted, would be trying to port DOSBox or similar.
I suspect performance would be pretty terrible though, given that ARM
ports are seemingly unable to do a passable job running Doom or similar
(*2), so DOSBox on BJX2 would probably be pretty much hopeless...

*2: RasPi 4 does kinda OK, but older RasPi's and my phones tend to give
what is basically a slide-show.

Re: not a vax, was Design a better 16 or 32 bit processor

<shjeir$1rco$1@gal.iecc.com>

https://www.novabbs.com/devel/article-flat.php?id=20414&group=comp.arch#20414

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!not-for-mail
From: joh...@taugh.com (John Levine)
Newsgroups: comp.arch
Subject: Re: not a vax, was Design a better 16 or 32 bit processor
Date: Sat, 11 Sep 2021 23:36:59 -0000 (UTC)
Organization: Taughannock Networks
Message-ID: <shjeir$1rco$1@gal.iecc.com>
References: <sh3evb$n1v$1@newsreader4.netcologne.de> <2d262fce-9363-4360-bfd9-ba4263e8d703n@googlegroups.com> <shh02c$267q$1@gal.iecc.com> <shhl98$3vo$1@dont-email.me>
Injection-Date: Sat, 11 Sep 2021 23:36:59 -0000 (UTC)
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="60824"; mail-complaints-to="abuse@iecc.com"
In-Reply-To: <sh3evb$n1v$1@newsreader4.netcologne.de> <2d262fce-9363-4360-bfd9-ba4263e8d703n@googlegroups.com> <shh02c$267q$1@gal.iecc.com> <shhl98$3vo$1@dont-email.me>
Cleverness: some
X-Newsreader: trn 4.0-test77 (Sep 1, 2010)
Originator: johnl@iecc.com (John Levine)

by: John Levine - Sat, 11 Sep 2021 23:36 UTC

According to Brett <ggtgp@yahoo.com>:
>> The instruction set was full of overcomplicated
>> microcoded instructions that were often slower than a sequence of
>> simple instructions,
>
>A Myth what was promoted by RISC proponents, which has been debunked.

Dunno how old you are but I wrote programs for Vaxen in the 1970s.

Instructions were an opcode followed by some number of operands, each
of which was a one byte code followed by zero to four bytes of data,
the data length depending on both the operand specifier and the
opcode. It had to decode instructions a byte at a time since you
couldn't tell where the code for operand N+1 was until you knew how
many data bytes followed operand N. It was also hard to overlap address
calculations since a register autoincrement or decrement could affect a
register used in a later operand. Today we could throw millions of
transistors at it with an umpteen stage pipeline that turns the whole
thing into micro-ops, but good luck doing that in 1980.
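
The serial dependency can be illustrated with a simplified model of the operand specifiers. The mode table below is abridged (indexed mode is left out) and should be checked against the VAX architecture manual before being relied on; the function names are hypothetical. The point is only that the position of each specifier depends on the decoded length of every one before it, and that the length of an immediate depends on the operand size implied by the opcode:

```python
def specifier_extra_bytes(spec, operand_size):
    """Bytes following one operand specifier (simplified, assumed mode table)."""
    mode, reg = spec >> 4, spec & 0xF
    if mode in (0xA, 0xB):
        return 1  # byte displacement
    if mode in (0xC, 0xD):
        return 2  # word displacement
    if mode in (0xE, 0xF):
        return 4  # longword displacement
    if mode == 0x8 and reg == 0xF:
        return operand_size  # immediate: length comes from the opcode
    if mode == 0x9 and reg == 0xF:
        return 4  # absolute address
    if mode == 0x4:
        raise NotImplementedError("indexed mode is omitted from this sketch")
    return 0  # literal, register, deferred, autoincrement/decrement


def operand_offsets(code, operand_sizes):
    """Locate each specifier in `code` (the bytes after the opcode).

    `operand_sizes` gives the data size per operand, as implied by the opcode.
    Returns (list of specifier offsets, total operand bytes).
    """
    offsets, pos = [], 0
    for size in operand_sizes:
        offsets.append(pos)  # only known once all earlier operands are sized
        pos += 1 + specifier_extra_bytes(code[pos], size)
    return offsets, pos
```

Decoding the same operand bytes with a different operand size moves the second and third specifiers, so a decoder cannot start on operand N+1 in parallel without first sizing operand N.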

DEC had super all purpose procedure call and return instructions
which were super slow and overimplemented, so Unix systems never
used them. There were lots of complex instructions that as far as
I can tell nobody used.

Compare that to the IBM 360 where you can tell from the first two
bits of the opcode how long the instruction is, and for the most
part where the memory addresses and register operands are. The 360
wasn't perfect but it certainly aged a lot better.
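
For contrast, the 360 length rule fits in one table lookup. A sketch, using the standard System/360 format classes (00 = 2-byte RR, 01 and 10 = 4-byte RX/RS/SI, 11 = 6-byte SS); the function name is hypothetical:

```python
def s360_length(opcode):
    """Instruction length in bytes from the top two bits of the opcode byte."""
    if not 0 <= opcode <= 0xFF:
        raise ValueError("opcode must be one byte")
    return (2, 4, 4, 6)[opcode >> 6]
```

For example, `s360_length(0x1A)` (AR) gives 2, `s360_length(0x5A)` (A) gives 4, and `s360_length(0xD2)` (MVC) gives 6, all without looking at any operand bytes.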
--
Regards,
John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly

Re: Design a better 16 or 32 bit processor

<shjj38$4r7$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=20415&group=comp.arch#20415

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: ggt...@yahoo.com (Brett)
Newsgroups: comp.arch
Subject: Re: Design a better 16 or 32 bit processor
Date: Sun, 12 Sep 2021 00:54:00 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 18
Message-ID: <shjj38$4r7$1@dont-email.me>
References: <shhl98$3vo$1@dont-email.me>
<memo.20210911133846.9608G@jgd.cix.co.uk>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Sun, 12 Sep 2021 00:54:00 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="d4b5fbc9ab1bf17f5587bdac63515e47";
logging-data="4967"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/RX4SW6k4Bx2Jndb5G4+YI"
User-Agent: NewsTap/5.5 (iPad)
Cancel-Lock: sha1:f0YqZhHcBVHlOpKrN3lvj2yWznI=
sha1:qf8Db2bFDV7TkzZk4eEglCMrgIw=

by: Brett - Sun, 12 Sep 2021 00:54 UTC

John Dallman <jgd@cix.co.uk> wrote:
> In article <shhl98$3vo$1@dont-email.me>, ggtgp@yahoo.com (Brett) wrote:
>
>> There is a major company that went for an updated 386, but did not
>> improve anything besides instruction encoding and minor fixes.
>> Failed to go big and so failed, go big or go home.
>
> Citation?

Renesas RX CPU
Has byte opcodes with 16 registers and is little endian.
Basically an improved incompatible 386.

> John
>

Re: not a vax, was Design a better 16 or 32 bit processor

<shjj39$4r7$2@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=20416&group=comp.arch#20416

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: ggt...@yahoo.com (Brett)
Newsgroups: comp.arch
Subject: Re: not a vax, was Design a better 16 or 32 bit processor
Date: Sun, 12 Sep 2021 00:54:01 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 45
Message-ID: <shjj39$4r7$2@dont-email.me>
References: <sh3evb$n1v$1@newsreader4.netcologne.de>
<2d262fce-9363-4360-bfd9-ba4263e8d703n@googlegroups.com>
<shh02c$267q$1@gal.iecc.com>
<shhl98$3vo$1@dont-email.me>
<shjeir$1rco$1@gal.iecc.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Sun, 12 Sep 2021 00:54:01 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="d4b5fbc9ab1bf17f5587bdac63515e47";
logging-data="4967"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX183i1ZuVOgMa4qGl1AmHE0R"
User-Agent: NewsTap/5.5 (iPad)
Cancel-Lock: sha1:pLI8JI1pz0d1r3/OC94idUDDsNo=
sha1:nndL6MGB7uOQSw/FxFHT38Vk6qo=

by: Brett - Sun, 12 Sep 2021 00:54 UTC

John Levine <johnl@taugh.com> wrote:
> According to Brett <ggtgp@yahoo.com>:
>>> The instruction set was full of overcomplicated
>>> microcoded instructions that were often slower than a sequence of
>>> simple instructions,
>>
>> A Myth what was promoted by RISC proponents, which has been debunked.
>
> Dunno how old you are but I wrote programs for Vaxen in the 1970s.
>
> Instructions were an opcode followed by some number of operands, each
> of which was a one byte code followed by zero to four bytes of data,
> the data length depending on both the operand specifier and the
> opcode. It had to decode instructions a byte at a time since you
> couldn't tell where the code for operand N+1 was until you knew how
> many data bytes followed opeand N. It was also hard to overlap address
> calculations since a register autoincement or decrement could affect a
> register used in a later operand. Today we could throw millions of
> transistors at it with an umpteen stage pipeline that turns the whole
> thing into micro-ops, but good luck doing that in 1980.

Having a horrible instruction encoding has nothing to do with modern
microcode.

I am not proposing add instructions where all three operands use memory
indirect addressing. VAX stupidity need not apply.

> DEC had super all purpose procedure call and return instructions
> which were super slow and overimplemented, so Unix systems never
> used them. There were lots of complex instructions that as far as
> I can tell nobody used.

So microcode the shorter call sequence, one wonders why no one with a brain
at DEC did this.

VAX had bad instruction density, same as RISC which tells you how horrible
it was.

> Compare that to the IBM 360 where you can tell from the first two
> bits of the opcode how long the instruction is, and for the most
> part where the memory addresses and register operands are. The 360
> wasn't perfect but it certainly aged a lot better.

Re: Design a better 16 or 32 bit processor

<shjjkf$83t$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=20417&group=comp.arch#20417

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: ggt...@yahoo.com (Brett)
Newsgroups: comp.arch
Subject: Re: Design a better 16 or 32 bit processor
Date: Sun, 12 Sep 2021 01:03:11 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 38
Message-ID: <shjjkf$83t$1@dont-email.me>
References: <sh3evb$n1v$1@newsreader4.netcologne.de>
<3fea318c-62be-4d30-aafb-976eeee14908n@googlegroups.com>
<shguv8$2416$1@gal.iecc.com>
<2d262fce-9363-4360-bfd9-ba4263e8d703n@googlegroups.com>
<shh02c$267q$1@gal.iecc.com>
<shhl98$3vo$1@dont-email.me>
<shhlm7$9p5$1@newsreader4.netcologne.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Sun, 12 Sep 2021 01:03:11 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="d4b5fbc9ab1bf17f5587bdac63515e47";
logging-data="8317"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+pxTEmbWqEZqozMiD43Ji1"
User-Agent: NewsTap/5.5 (iPad)
Cancel-Lock: sha1:/NiTPueX0iixm6+x8dTkVEuO7W0=
sha1:yO/jNaXpIvSnVKKTUtiiE6MblKU=

by: Brett - Sun, 12 Sep 2021 01:03 UTC

Thomas Koenig <tkoenig@netcologne.de> wrote:
> Brett <ggtgp@yahoo.com> schrieb:
>> John Levine <johnl@taugh.com> wrote:
>>> According to MitchAlsup <MitchAlsup@aol.com>:
>>>>> Sounds a lot like a PDP-11.
>>>> <
>>>> Which died because its successor was too hard to pipeline.
>>>
>>> The PDP-11 was a really good design for the late 1960s, when memory
>>> was starting to be affordable and microcode ROM was still a lot faster
>>> than core.
>>>
>>> The VAX was also a good design for the late 1960s but unfortunately it
>>> was introduced in the late 1970s. It wasn't just that it was too hard
>>> to pipeline.
>>
>>> The instruction set was full of overcomplicated
>>> microcoded instructions that were often slower than a sequence of
>>> simple instructions,
>>
>> A Myth what was promoted by RISC proponents, which has been debunked.
>
> ^1 ^2
>
> ^1 = Dubious, discuss
> ^2 = Citation needed

One example I know of is the VAX divide microcode which was 2 cycles slower
than the assembly macro, but this was because the assembly macro cheated on
the last carry and so was half a bit less accurate.

In order to pull off the "microcode is slower" scam you have to compare
pipelined compiler output on a pipelined processor to a non-pipelined
microcoded processor, which is cheating. x86 has done just fine with enough
money piled into pipelining the micro-ops.

Re: not a vax, was Design a better 16 or 32 bit processor

<2021Sep12.141107@mips.complang.tuwien.ac.at>

https://www.novabbs.com/devel/article-flat.php?id=20418&group=comp.arch#20418

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: ant...@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: not a vax, was Design a better 16 or 32 bit processor
Date: Sun, 12 Sep 2021 12:11:07 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 14
Message-ID: <2021Sep12.141107@mips.complang.tuwien.ac.at>
References: <sh3evb$n1v$1@newsreader4.netcologne.de> <2d262fce-9363-4360-bfd9-ba4263e8d703n@googlegroups.com> <shh02c$267q$1@gal.iecc.com> <shhl98$3vo$1@dont-email.me> <shjeir$1rco$1@gal.iecc.com>
Injection-Info: reader02.eternal-september.org; posting-host="c44f096c1a6c40e22bb5d141b7a34564";
logging-data="22941"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/b+G1eZbtxmOnyOlc9phg2"
Cancel-Lock: sha1:j7l8MkEq3Olxnr+joH+7PU+0LNk=
X-newsreader: xrn 10.00-beta-3

by: Anton Ertl - Sun, 12 Sep 2021 12:11 UTC

John Levine <johnl@taugh.com> writes:
[VAX]
>Compare that to the IBM 360 where you can tell from the first two
>bits of the opcode how long the instruction is, and for the most
>part where the memory addresses and register operands are. The 360
>wasn't perfect but it certainly aged a lot better.

And yet the 801 project found that they could outperform the 360
descendants by moving to a load/store architecture.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: not a 360 either, was Design a better 16 or 32 bit processor

<shlc98$q3q$1@gal.iecc.com>

https://www.novabbs.com/devel/article-flat.php?id=20419&group=comp.arch#20419

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!not-for-mail
From: joh...@taugh.com (John Levine)
Newsgroups: comp.arch
Subject: Re: not a 360 either, was Design a better 16 or 32 bit processor
Date: Sun, 12 Sep 2021 17:10:00 -0000 (UTC)
Organization: Taughannock Networks
Message-ID: <shlc98$q3q$1@gal.iecc.com>
References: <sh3evb$n1v$1@newsreader4.netcologne.de> <shhl98$3vo$1@dont-email.me> <shjeir$1rco$1@gal.iecc.com> <2021Sep12.141107@mips.complang.tuwien.ac.at>
Injection-Date: Sun, 12 Sep 2021 17:10:00 -0000 (UTC)
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="26746"; mail-complaints-to="abuse@iecc.com"
In-Reply-To: <sh3evb$n1v$1@newsreader4.netcologne.de> <shhl98$3vo$1@dont-email.me> <shjeir$1rco$1@gal.iecc.com> <2021Sep12.141107@mips.complang.tuwien.ac.at>
Cleverness: some
X-Newsreader: trn 4.0-test77 (Sep 1, 2010)
Originator: johnl@iecc.com (John Levine)

by: John Levine - Sun, 12 Sep 2021 17:10 UTC

According to Anton Ertl <anton@mips.complang.tuwien.ac.at>:
>John Levine <johnl@taugh.com> writes:
>[VAX]
>>Compare that to the IBM 360 where you can tell from the first two
>>bits of the opcode how long the instruction is, and for the most
>>part where the memory addresses and register operands are. The 360
>>wasn't perfect but it certainly aged a lot better.
>
>And yet the 801 project found that they could outperform the 360
>descendants by moving to a load/store architecture.

Sure, but that was 40 years ago. The 801 also only had 24 bit
registers because that's how big the memory addresses were at the time.
The point of the 801 wasn't that it was load/store but that the
instructions were simple enough to implement without microcode with a
1980 transistor budget, and that they found that compilers rarely used
the more complex instructions anyway. In the 1960s ROMs were faster
than core memory so even with multiple microinstructions it could keep
the main memory running at full speed, but by 1980 we had
semiconductor RAM and caches so a microcode cycle was no faster than a
main memory cycle.

Descendants of the 801 added back a certain amount of stuff that
turned out to be useful like 32 bit registers and some support for
packed decimal.

The 360 wasn't perfect, e.g., the 12 bit displacements were too small,
branches should have been PC-relative, addresses should have been 32
rather than 24 bits, and the floating point gave awful results, all
things they fixed in later iterations of the architecture. But it
seems to me that the 360 lends itself to a pipelined implementation way
better than a Vax-like design. It can tell the length and format of an
instruction from the first byte (and mostly from the first two bits),
and it can compute the addresses in parallel since address
calculations don't change anything else.
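
As a rough illustration of the point above, here is a minimal Python sketch of how little work it takes to find the length of a 360 instruction and to form an RX operand address. The function names and the register-list representation are made up for this example; the length rule (top two opcode bits) and the RX layout (opcode, R1, X2, B2, 12-bit D2) follow the System/360 formats. It ignores everything else a real decoder does.

```python
# Sketch only: names are illustrative, not taken from any IBM document.

def s360_length(opcode):
    """Instruction length in bytes, taken from the top two opcode bits."""
    top2 = (opcode >> 6) & 0b11
    # 00 -> 2 bytes (RR), 01 -> 4 bytes (RX), 10 -> 4 bytes (RS/SI), 11 -> 6 bytes (SS)
    return (2, 4, 4, 6)[top2]


def rx_effective_address(insn, regs):
    """Operand address of a 4-byte RX instruction: base + index + displacement.

    Only reads registers; nothing is modified, so several of these
    calculations could in principle proceed in parallel.
    """
    x2 = insn[1] & 0x0F                       # index register number
    b2 = insn[2] >> 4                         # base register number
    d2 = ((insn[2] & 0x0F) << 8) | insn[3]    # 12-bit displacement
    index = regs[x2] if x2 else 0             # register 0 means "no index"
    base = regs[b2] if b2 else 0              # register 0 means "no base"
    return (base + index + d2) & 0xFFFFFF     # 24-bit addresses on the original 360
```

For example, with R5 = 0x10 and R12 = 0x1000, the bytes 58 35 C0 08 (L 3,8(5,12)) decode as a 4-byte instruction whose operand address is 0x1018.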

--
Regards,
John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly

Re: not a 360 either, was Design a better 16 or 32 bit processor

<340124ee-a081-4bf4-b9e0-a3d6ab3dc91en@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=20420&group=comp.arch#20420

X-Received: by 2002:a05:6214:20eb:: with SMTP id 11mr7213562qvk.52.1631468227344;
Sun, 12 Sep 2021 10:37:07 -0700 (PDT)
X-Received: by 2002:a54:4883:: with SMTP id r3mr5198242oic.7.1631468227021;
Sun, 12 Sep 2021 10:37:07 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sun, 12 Sep 2021 10:37:06 -0700 (PDT)
In-Reply-To: <shlc98$q3q$1@gal.iecc.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:d44:65dd:4acf:8c77;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:d44:65dd:4acf:8c77
References: <sh3evb$n1v$1@newsreader4.netcologne.de> <shhl98$3vo$1@dont-email.me>
<shjeir$1rco$1@gal.iecc.com> <2021Sep12.141107@mips.complang.tuwien.ac.at> <shlc98$q3q$1@gal.iecc.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <340124ee-a081-4bf4-b9e0-a3d6ab3dc91en@googlegroups.com>
Subject: Re: not a 360 either, was Design a better 16 or 32 bit processor
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Sun, 12 Sep 2021 17:37:07 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 46

by: MitchAlsup - Sun, 12 Sep 2021 17:37 UTC

On Sunday, September 12, 2021 at 12:10:02 PM UTC-5, John Levine wrote:
> According to Anton Ertl <an...@mips.complang.tuwien.ac.at>:
> >John Levine <jo...@taugh.com> writes:
> >[VAX]
> >>Compare that to the IBM 360 where you can tell from the first two
> >>bits of the opcode how long the instruction is, and for the most
> >>part where the memory addresses and register operands are. The 360
> >>wasn't perfect but it certainly aged a lot better.
> >
> >And yet the 801 project found that they could outperform the 360
> >descendants by moving to a load/store architecture.
> Sure, but that was 40 years ago. The 801 also only had 24 bit
> registers because that's how big the memory addresses were at the time.
> The point of the 801 wasn't that it was load/store but that the
> instructions were simple enough to implement without microcode with a
> 1980 transistor budget, and that they found that compilers rarely used
> the more complex instructions anyway. In the 1960s ROMs were faster
> than core memory so even with multiple microinstructions it could keep
> the main memory running at full speed, but by 1980 we had
> semiconductor RAM and caches so a microcode cycle was no faster than a
> main memory cycle.
>
> Descendants of the 801 added back a certain amount of stuff that
> turned out to be useful like 32 bit registers and some support for
> packed decimal.
>
> The 360 wasn't perfect, e.g., the 12 bit displacements were too small,
> branches should have been PC-relative, addresses should have been 32
> rather than 24 bits, and the floating point gave awful results, all
> things they fixed in later iterations of the architecture. But it
> seems to me that the 360 lends itself to a pipelined implementation way
> better than a Vax-like design.
<
Yes, it did.
<
> It can tell the length and format of an
> instruction from the first byte (and mostly from the first two bits),
<
This, BTW, has NOTHING to do with 360 being pipelineable. The RR and
RX formats have everything to do with it being pipelineable.
<
> and it can compute the addresses in parallel since address
> calculations don't change anything else.
> --
> Regards,
> John Levine, jo...@taugh.com, Primary Perpetrator of "The Internet for Dummies",
> Please consider the environment before reading this e-mail. https://jl.ly

Re: not a vax, was Design a better 16 or 32 bit processor

<_hr%I.105199$lC6.75305@fx41.iad>

https://www.novabbs.com/devel/article-flat.php?id=20421&group=comp.arch#20421

Path: i2pn2.org!i2pn.org!aioe.org!news.uzoreto.com!newsfeed.xs4all.nl!newsfeed7.news.xs4all.nl!feeder5.feed.usenet.farm!feeder1.feed.usenet.farm!feed.usenet.farm!peer02.ams4!peer.am4.highwinds-media.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx41.iad.POSTED!not-for-mail
From: ThatWoul...@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: not a vax, was Design a better 16 or 32 bit processor
References: <sh3evb$n1v$1@newsreader4.netcologne.de> <2d262fce-9363-4360-bfd9-ba4263e8d703n@googlegroups.com> <shh02c$267q$1@gal.iecc.com> <shhl98$3vo$1@dont-email.me> <shjeir$1rco$1@gal.iecc.com>
In-Reply-To: <shjeir$1rco$1@gal.iecc.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 53
Message-ID: <_hr%I.105199$lC6.75305@fx41.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Sun, 12 Sep 2021 18:03:38 UTC
Date: Sun, 12 Sep 2021 14:02:43 -0400
X-Received-Bytes: 3368

by: EricP - Sun, 12 Sep 2021 18:02 UTC

John Levine wrote:
> According to Brett <ggtgp@yahoo.com>:
>>> The instruction set was full of overcomplicated
>>> microcoded instructions that were often slower than a sequence of
>>> simple instructions,
>> A myth that was promoted by RISC proponents, which has been debunked.
>
> Dunno how old you are but I wrote programs for Vaxen in the 1970s.
>
> Instructions were an opcode followed by some number of operands, each
> of which was a one byte code followed by zero to four bytes of data,
> the data length depending on both the operand specifier and the
> opcode. It had to decode instructions a byte at a time since you
> couldn't tell where the code for operand N+1 was until you knew how
> many data bytes followed operand N. It was also hard to overlap address
> calculations since a register autoincrement or decrement could affect a
> register used in a later operand. Today we could throw millions of
> transistors at it with an umpteen stage pipeline that turns the whole
> thing into micro-ops, but good luck doing that in 1980.
>
> DEC had super all purpose procedure call and return instructions
> which were super slow and overimplemented, so Unix systems never
> used them. There were lots of complex instructions that as far as
> I can tell nobody used.
>
> Compare that to the IBM 360 where you can tell from the first two
> bits of the opcode how long the instruction is, and for the most
> part where the memory addresses and register operands are. The 360
> wasn't perfect but it certainly aged a lot better.
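
To make the serial-decode problem in the quoted text concrete, here is a simplified Python sketch of finding where each VAX operand specifier starts. The names are invented for this example, and the table covers only specifier lengths (mode in the high nibble, register in the low nibble), not operand semantics. The point it tries to show is that the position of specifier N+1 is unknown until specifier N has been examined, and that for immediates the length also depends on the operand size implied by the opcode.

```python
# Sketch only: a simplified length calculation, not a full VAX decoder.
PC = 15
# Displacement bytes following the specifier byte, by addressing-mode nibble.
DISP_BYTES = {0xA: 1, 0xB: 1, 0xC: 2, 0xD: 2, 0xE: 4, 0xF: 4}


def specifier_length(code, pos, operand_size):
    """Total bytes occupied by the operand specifier starting at code[pos]."""
    mode = code[pos] >> 4
    reg = code[pos] & 0x0F
    if mode <= 3:                   # short literal: value is in the byte itself
        return 1
    if mode == 4:                   # index prefix: a base specifier follows
        return 1 + specifier_length(code, pos + 1, operand_size)
    if mode == 8 and reg == PC:     # immediate: data length set by the opcode
        return 1 + operand_size
    if mode == 9 and reg == PC:     # absolute: 4-byte address follows
        return 1 + 4
    return 1 + DISP_BYTES.get(mode, 0)   # register modes have no extra bytes


def operand_offsets(code, first, operand_sizes):
    """Walk the specifiers one after another; each start depends on the last."""
    offsets = []
    pos = first
    for size in operand_sizes:
        offsets.append(pos)
        pos += specifier_length(code, pos, size)
    return offsets, pos
```

For instance, the byte string C1 8F A0 86 01 00 51 A2 04 (an ADDL3 with an immediate longword, a register, and a byte-displacement operand) yields specifier offsets 1, 6 and 7, with the next instruction starting at offset 9. The second and third offsets cannot be known before the first specifier has been decoded.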

One thing the VAX did do was spur the investigation in 1985-87
by Yale Patt, Wen-mei Hwu, Michael Shebanow, and others,
called HPS (High Performance Substrate) as a way to work
around VAX's pipeline problems.

Based on Tomasulo and others, HPS looks to be the basis for modern
OoO when picked up by Pentium and others.

HPS, A New Microarchitecture Rationale And Introduction, 1985
http://impact.crhc.illinois.edu/Shared/Papers/Micro-85-HPS_a_new_microarchitecture.pdf

Critical Issues Regarding HPS, A High Performance Microarchitecture, 1985
http://impact.crhc.illinois.edu/shared/papers/Micro-85-HPS_critical_issues.pdf

(and other papers)

Comp.arch tie-in: Mitch co-authors a paper with Patt, Shebanow, et al.
Single Instruction Stream Parallelism Is Greater than Two, 1991
and co-invents some patents with Shebanow at Motorola,
possibly drawing on the HPS approach for the 88100 design.

Re: not a 360 either, was Design a better 16 or 32 bit processor

<87a6kh3arw.fsf@localhost>

https://www.novabbs.com/devel/article-flat.php?id=20422&group=comp.arch#20422

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: lyn...@garlic.com (Anne & Lynn Wheeler)
Newsgroups: comp.arch
Subject: Re: not a 360 either, was Design a better 16 or 32 bit processor
Date: Sun, 12 Sep 2021 08:32:19 -1000
Organization: Wheeler&Wheeler
Lines: 72
Message-ID: <87a6kh3arw.fsf@localhost>
References: <sh3evb$n1v$1@newsreader4.netcologne.de>
<shhl98$3vo$1@dont-email.me> <shjeir$1rco$1@gal.iecc.com>
<2021Sep12.141107@mips.complang.tuwien.ac.at>
<shlc98$q3q$1@gal.iecc.com>
Mime-Version: 1.0
Content-Type: text/plain
Injection-Info: reader02.eternal-september.org; posting-host="da1b76402ec871e6b894203d5d4e3121";
logging-data="9625"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+QCjBLcphOs/Y0pE7IPHaYAVrTOEyZzOc="
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)
Cancel-Lock: sha1:p/OTX9egsKYJKoigwaDg3+pJsko=
sha1:j9Y5z2XON38o6HjUJeRObYaokRY=

by: Anne & Lynn Wheeler - Sun, 12 Sep 2021 18:32 UTC

John Levine <johnl@taugh.com> writes:
> Sure, but that was 40 years ago. The 801 also only had 24 bit
> registers because that's how big the memory addresses were at the time.
> The point of the 801 wasn't that it was load/store but that the
> instructions were simple enough to implement without microcode with a
> 1980 transistor budget, and that they found that compilers rarely used
> the more complex instructions anyway. In the 1960s ROMs were faster
> than core memory so even with multiple microinstructions it could keep
> the main memory running at full speed, but by 1980 we had
> semiconductor RAM and caches so a microcode cycle was no faster than a
> main memory cycle.

801/ROMP ... originally going to be displaywriter followon running CP.r
and programmed in PL.8 ... didn't have any hardware protection domain
... claim was that PL.8 would only generate "correct" programs and CP.r
would only load/execute correct PL.8 programs. Everything was trusted
and so things that nominally required kernel call to change modes, could
be done inline code. It nominally had 32bit addressing ... top four bits
indexed content of 12bit "segment" register ... with 28bit displacement.
Since the segment registers were 12bits ... they claimed it was 40bit
virtual address ... 28bit displacement appended to the 12 bit contents
of the segment register value (claiming that inline application code
could change the contents of any segment register as easily as pointer
in general register could be changed).

In effect, treating machine as a single 40bit virtual address space,
with 16 segment registers that could each "window" 28bits of that
virtual address space at a time.

when displaywriter followon was killed, they decided to retarget to the
unix workstation market ... having to adapt the hardware to unix
programming paradigm ... including privilege / non-privilege (PC/RT) ...
and getting the company that did AT&T Unix port to IBM/PC for PC/IX to
do one for ROMP (AIX) ... but documentation still would periodically
reference "40-bit" addressing ... sort of being able to map a 32bit
virtual address space into a specific portion of 40-bit machine address
space (theoretically 40-32=8 or 256 32-bit virtual address spaces)
possibly reserving specific 12bit segment register values for "shared
memory" segments.

For RIOS (used in RS/6000) they extended segment registers to 24bits and
documentation would periodically reference 52bit address (i.e. 24+28
instead of the ROMP 12+28 40bit).
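
The segment-register addressing described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic (top four bits select one of 16 segment registers, whose contents are prepended to the 28-bit displacement); the function name, the list representation of the segment registers, and the seg_id_bits parameter are assumptions made for this example, not ROMP or RIOS documentation.

```python
# Sketch only: illustrates the 12+28 (ROMP) and 24+28 (RIOS) address arithmetic.

def virtual_address(ea32, seg_regs, seg_id_bits=12):
    """Map a 32-bit effective address to the wider virtual address."""
    seg_index = (ea32 >> 28) & 0xF            # top 4 bits pick one of 16 registers
    displacement = ea32 & 0x0FFFFFFF          # low 28 bits pass through unchanged
    seg_id = seg_regs[seg_index] & ((1 << seg_id_bits) - 1)
    return (seg_id << 28) | displacement      # segment id prepended to displacement
```

With segment register 3 holding 0xABC, the effective address 0x30001234 maps to 0xABC0001234, a 40-bit value. Calling the same function with seg_id_bits=24 gives the 52-bit form mentioned for RIOS. Changing which part of the large space a program sees is just a store into one of the 16 entries of seg_regs.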

there was internal advanced technology conference where we presented a
design for 16processor tightly coupled 370 multiprocessor and the 801
group presented 801/risc, CP.r & PL.8. One of their people claimed that
they had looked at existing operating system code and claimed that it
didn't support 16-way (implying that we couldn't write new code) so I
criticized them for how could they (efficiently) support shared memory
segments ... since their segment size was so large and there were only
16.

I had done a page-mapped filesystem for CP67/CMS and made extensive use
of shared segments. When the development group morphed CP67->VM370, they
greatly simplified and/or dropped (like multiprocessor support) a lot of
stuff. By 1975, I had migrated lots of the dropped CP67/CMS stuff to
VM370 (including my page-mapped filesystem stuff with extensive shared
segment sharing as well as multiprocessor support). One of my hobbies
after joining IBM was enhanced production operating systems for internal
datacenters ... and I continued to work on 370 stuff all during the
Future System period ... even periodically ridiculing how they were doing
stuff (FS was completely different than 370 and going to completely
replace it ... lack of new 370 during the FS period is credited with
giving the clone 370 makers their market foothold). some FS info
http://www.jfsowa.com/computer/memo125.htm

I've periodically claimed that John Cocke had taken 801 to the opposite
extreme from FS complexity.

--
virtualization experience starting Jan1968, online at home since Mar1970

Re: not a vax, was Design a better 16 or 32 bit processor

<c8da8614-0767-43d7-a552-047519f81bdan@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=20423&group=comp.arch#20423

X-Received: by 2002:a37:f902:: with SMTP id l2mr6785415qkj.511.1631475766382;
Sun, 12 Sep 2021 12:42:46 -0700 (PDT)
X-Received: by 2002:a05:6830:b96:: with SMTP id a22mr6963974otv.282.1631475766095;
Sun, 12 Sep 2021 12:42:46 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sun, 12 Sep 2021 12:42:45 -0700 (PDT)
In-Reply-To: <_hr%I.105199$lC6.75305@fx41.iad>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:d44:65dd:4acf:8c77;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:d44:65dd:4acf:8c77
References: <sh3evb$n1v$1@newsreader4.netcologne.de> <2d262fce-9363-4360-bfd9-ba4263e8d703n@googlegroups.com>
<shh02c$267q$1@gal.iecc.com> <shhl98$3vo$1@dont-email.me> <shjeir$1rco$1@gal.iecc.com>
<_hr%I.105199$lC6.75305@fx41.iad>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <c8da8614-0767-43d7-a552-047519f81bdan@googlegroups.com>
Subject: Re: not a vax, was Design a better 16 or 32 bit processor
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Sun, 12 Sep 2021 19:42:46 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 59

by: MitchAlsup - Sun, 12 Sep 2021 19:42 UTC

On Sunday, September 12, 2021 at 1:03:43 PM UTC-5, EricP wrote:
> John Levine wrote:
> > According to Brett <gg...@yahoo.com>:
> >>> The instruction set was full of overcomplicated
> >>> microcoded instructions that were often slower than a sequence of
> >>> simple instructions,
> >> A myth that was promoted by RISC proponents, which has been debunked.
> >
> > Dunno how old you are but I wrote programs for Vaxen in the 1970s.
> >
> > Instructions were an opcode followed by some number of operands, each
> > of which was a one byte code followed by zero to four bytes of data,
> > the data length depending on both the operand specifier and the
> > opcode. It had to decode instructions a byte at a time since you
> > couldn't tell where the code for operand N+1 was until you knew how
> > many data bytes followed operand N. It was also hard to overlap address
> > calculations since a register autoincrement or decrement could affect a
> > register used in a later operand. Today we could throw millions of
> > transistors at it with an umpteen stage pipeline that turns the whole
> > thing into micro-ops, but good luck doing that in 1980.
> >
> > DEC had super all purpose procedure call and return instructions
> > which were super slow and overimplemented, so Unix systems never
> > used them. There were lots of complex instructions that as far as
> > I can tell nobody used.
> >
> > Compare that to the IBM 360 where you can tell from the first two
> > bits of the opcode how long the instruction is, and for the most
> > part where the memory addresses and register operands are. The 360
> > wasn't perfect but it certainly aged a lot better.
> One thing the VAX did do was spur the investigation in 1985-87
> by Yale Patt, Wen-mei Hwu, Michael Shebanow, and others,
> called HPS (High Performance Substrate) as a way to work
> around VAX's pipeline problems.
>
> Based on Tomasulo and others, HPS looks to be the basis for modern
> OoO when picked up by Pentium and others.
>
> HPS, A New Microarchitecture Rationale And Introduction, 1985
> http://impact.crhc.illinois.edu/Shared/Papers/Micro-85-HPS_a_new_microarchitecture.pdf
>
> Critical Issues Regarding HPS, A High Performance Microarchitecture, 1985
> http://impact.crhc.illinois.edu/shared/papers/Micro-85-HPS_critical_issues.pdf
>
> (and other papers)
>
> Comp.arch tie-in: Mitch co-authors a paper with Patt, Shebanow, et al.
> Single Instruction Stream Parallelism Is Greater than Two, 1991
> and co-invents some patents with Shebanow at Motorola,
> possibly drawing on the HPS approach for the 88100 design.
<
Yes, indeed, HPS was, in reality, not a whole lot more than Tomasulo with
multiple common data busses and branch prediction.
<
And we were using many of the ideas, but with entirely different underpinnings
than HPS. For example, we used a physical register file rather than a RAT, and
created a way to backup a branch misprediction and issue on the alternate path
in the same cycle.
<
This was for the 88100 architecture and for the 88120 design point.

Re: not a 360 either, was Design a better 16 or 32 bit processor

<shltt3$24e4$1@gal.iecc.com>

https://www.novabbs.com/devel/article-flat.php?id=20424&group=comp.arch#20424

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!not-for-mail
From: joh...@taugh.com (John Levine)
Newsgroups: comp.arch
Subject: Re: not a 360 either, was Design a better 16 or 32 bit processor
Date: Sun, 12 Sep 2021 22:10:43 -0000 (UTC)
Organization: Taughannock Networks
Message-ID: <shltt3$24e4$1@gal.iecc.com>
References: <sh3evb$n1v$1@newsreader4.netcologne.de> <2021Sep12.141107@mips.complang.tuwien.ac.at> <shlc98$q3q$1@gal.iecc.com> <87a6kh3arw.fsf@localhost>
Injection-Date: Sun, 12 Sep 2021 22:10:43 -0000 (UTC)
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="70084"; mail-complaints-to="abuse@iecc.com"
In-Reply-To: <sh3evb$n1v$1@newsreader4.netcologne.de> <2021Sep12.141107@mips.complang.tuwien.ac.at> <shlc98$q3q$1@gal.iecc.com> <87a6kh3arw.fsf@localhost>
Cleverness: some
X-Newsreader: trn 4.0-test77 (Sep 1, 2010)
Originator: johnl@iecc.com (John Levine)

by: John Levine - Sun, 12 Sep 2021 22:10 UTC

According to Anne & Lynn Wheeler <lynn@garlic.com>:
>when displaywriter followon was killed, they decided to retarget to the
>unix workstation market ... having to adapt the hardware to unix
>programming paradigm ... including privilege / non-privilege (PC/RT) ...
>and getting the company that did AT&T Unix port to IBM/PC for PC/IX to
>do one for ROMP (AIX) ...

Yeah, that was me. I worked for Interactive and wrote the ROMP assembler
and linker for AIX. It sat on top of an IBM monitor called the VRM which
provided us 28 bit segments we could map in and out of a process address
space, which was fine except the VRM was dog slow.

Someone else did a native BSD port which worked a lot better.

--
Regards,
John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly

Re: not a 360 either, was Design a better 16 or 32 bit processor

<87y281ksfo.fsf@localhost>

https://www.novabbs.com/devel/article-flat.php?id=20425&group=comp.arch#20425

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: lyn...@garlic.com (Anne & Lynn Wheeler)
Newsgroups: comp.arch
Subject: Re: not a 360 either, was Design a better 16 or 32 bit processor
Date: Sun, 12 Sep 2021 18:31:07 -1000
Organization: Wheeler&Wheeler
Lines: 41
Message-ID: <87y281ksfo.fsf@localhost>
References: <sh3evb$n1v$1@newsreader4.netcologne.de>
<2021Sep12.141107@mips.complang.tuwien.ac.at>
<shlc98$q3q$1@gal.iecc.com> <87a6kh3arw.fsf@localhost>
<shltt3$24e4$1@gal.iecc.com>
Mime-Version: 1.0
Content-Type: text/plain
Injection-Info: reader02.eternal-september.org; posting-host="34a1643f2b2f52062465b1af3eabb966";
logging-data="24154"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+IdScIB8s+5RHSuP7Lrio51I84MPXRBnw="
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)
Cancel-Lock: sha1:KFqP9pHCLQaIFM/kpHFsAMXay3Q=
sha1:N4fHPPMW1lhI+PK4pjuUqeeBX8A=

by: Anne & Lynn Whee - Mon, 13 Sep 2021 04:31 UTC

John Levine <johnl@taugh.com> writes:
> Yeah, that was me. I worked for Interactive and wrote the ROMP assembler
> and linker for AIX. It sat on top of an IBM monitor called the VRM which
> provided us 28 bit segments we could map in and out of a process address
> space, which was fine except the VRM was dog slow.
>
> Someone else did a native BSD port which worked a lot better.

I was working with the people doing the BSD port ... the Austin people
claimed that it was quicker, less resources and cheaper for them to
build a VRM with abstract virtual machine and then have you do the AT&T
port to the abstract virtual machine ... than having you do the AT&T
port directly to the bare hardware. Actually I think they had 200 PL.8
programmers and they needed something for them to do (aka the VRM).

The Palo Alto group was doing BSD port to VM370 virtual machine
mainframe when they got retargeted to port it directly to the
bare machine. I think the BSD port directly to bare PC/RT hardware was
1/10th the effort (or less) to do (just) the VRM.

Just one of the less obvious downsides that Austin ran into was device
drivers for new hardware 1st had to be done in C for unix ... and then
repeated in PL.8 for VRM.

trivia: I had been working with one of the people in Los Gatos VLSI
tools group using (metaware) TWS and did Pascal for IBM mainframe
(original for internal VLSI tools, later released as vs/pascal)... to
work on C language front-end. I left for a summer lecture tour in Europe
... when I got back he had left IBM and was working for Metaware. I
talked the Palo Alto people into hiring Metaware to do the C-compiler
for the BSD mainframe port ... which they then had Metaware do ROMP
backend when they were redirected to PC/RT port ("AOS").

Note the Palo Alto group was also working with UCLA and doing LOCUS
ports ... which was eventually released as AIX/370 and AIX/386.
https://en.wikipedia.org/wiki/LOCUS

aka an "AIX" having nothing to do with PC/RT AIX or RS/6000 AIX.

--
virtualization experience starting Jan1968, online at home since Mar1970

vs/pascal (Was: Re: not a 360 either, was Design a better 16 or 32 bit processor)

<shmpfd$1egb$1@gioia.aioe.org>

https://www.novabbs.com/devel/article-flat.php?id=20426&group=comp.arch#20426

Path: i2pn2.org!i2pn.org!aioe.org!dbQAFyOzpflkRItwGlQf9g.user.46.165.242.91.POSTED!not-for-mail
From: terje.ma...@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: vs/pascal (Was: Re: not a 360 either, was Design a better 16 or 32
bit processor)
Date: Mon, 13 Sep 2021 08:01:16 +0200
Organization: Aioe.org NNTP Server
Message-ID: <shmpfd$1egb$1@gioia.aioe.org>
References: <sh3evb$n1v$1@newsreader4.netcologne.de>
<2021Sep12.141107@mips.complang.tuwien.ac.at> <shlc98$q3q$1@gal.iecc.com>
<87a6kh3arw.fsf@localhost> <shltt3$24e4$1@gal.iecc.com>
<87y281ksfo.fsf@localhost>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Info: gioia.aioe.org; logging-data="47627"; posting-host="dbQAFyOzpflkRItwGlQf9g.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:60.0) Gecko/20100101
Firefox/60.0 SeaMonkey/2.53.9
X-Notice: Filtered by postfilter v. 0.9.2

by: Terje Mathisen - Mon, 13 Sep 2021 06:01 UTC

Anne & Lynn Wheeler wrote:
>
> trivia: I had been working with one of the people in Los Gatos VLSI
> tools group using (metaware) TWS and did Pascal for IBM mainframe
> (original for internal VLSI tools, later released as vs/pascal)... to

Wow!

I once used that vs/pascal to implement large packet Kermit, using up to
a full page (25x80=2000 bytes) of a 3270 emulator as the packet size,
instead of just a single 80-byte line.

The result was file transfers running at the same speed as the 3270/PC
(or the /AT 286 version) while using 3270 protocol emulators to allow
ascii serial port connections.

I remember that Pascal version as being quite nice. :-)

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: not a 360 either, was Design a better 16 or 32 bit processor

<2021Sep13.092822@mips.complang.tuwien.ac.at>

https://www.novabbs.com/devel/article-flat.php?id=20427&group=comp.arch#20427

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: ant...@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: not a 360 either, was Design a better 16 or 32 bit processor
Date: Mon, 13 Sep 2021 07:28:22 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 46
Message-ID: <2021Sep13.092822@mips.complang.tuwien.ac.at>
References: <sh3evb$n1v$1@newsreader4.netcologne.de> <shhl98$3vo$1@dont-email.me> <shjeir$1rco$1@gal.iecc.com> <2021Sep12.141107@mips.complang.tuwien.ac.at> <shlc98$q3q$1@gal.iecc.com>
Injection-Info: reader02.eternal-september.org; posting-host="adc1cc9999d4f6f902b7838f9a4a2636";
logging-data="26072"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18D1Vs7qay7xKpk4ehXpP7t"
Cancel-Lock: sha1:X08xkRgx71+WvUe8F8lOMDK4c9w=
X-newsreader: xrn 10.00-beta-3

by: Anton Ertl - Mon, 13 Sep 2021 07:28 UTC

John Levine <johnl@taugh.com> writes:
>According to Anton Ertl <anton@mips.complang.tuwien.ac.at>:
>>John Levine <johnl@taugh.com> writes:
>>[VAX]
>>>Compare that to the IBM 360 where you can tell from the first two
>>>bits of the opcode how long the instruction is, and for the most
>>>part where the memory addresses and register operands are. The 360
>>>wasn't perfect but it certainly aged a lot better.
>>
>>And yet the 801 project found that they could outperform the 360
>>descendants by moving to a load/store architecture.
>
>Sure, but that was 40 years ago. The 801 also only had 24 bit
>registers because that's how big the memory addresses were at the time.
>The point of the 801 wasn't that it was load/store but that the
>instructions were simple enough to implement without microcode with a
>1980 transistor budget, and that they found that compilers rarely used
>the more complex instructions anyway.

The first implementation was "Motorola MECL-10K discrete component
technology on large wire-wrapped custom boards." So the transistor
budget was not limited by what fit on a single chip. So what limited
the transistor budget? Or was it that they just wanted to leave out
everything that they could do without, making it easier to make the
rest fast? They certainly made it really fast.

>In the 1960s ROMs were faster
>than core memory so even with multiple microinstructions it could keep
>the main memory running at full speed, but by 1980 we had
>semiconductor RAM and caches so a microcode cycle was no faster than a
>main memory cycle.

The IBM 801 ran at 15MHz, the Vax 11/780 at 5MHz and had a 2KB cache,
so apparently the main memory was too slow for 5MHz (but cache was
probably at the same speed as the microcode store). The IBM 801 (as
described by Radin
<https://course.ece.cmu.edu/~ece447/s12/lib/exe/fetch.php?media=wiki:801-radin-1982.pdf>,
which is already the 32-bit version) had a split instruction and data
cache with 32-byte cache lines. I have not found cache sizes;
associativity seems to be at least 2-way (they mention LRU
replacement).

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: not a 360 either, was Design a better 16 or 32 bit processor

<1bI%I.120304$o45.44935@fx46.iad>

https://www.novabbs.com/devel/article-flat.php?id=20428&group=comp.arch#20428

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx46.iad.POSTED!not-for-mail
From: ThatWoul...@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: not a 360 either, was Design a better 16 or 32 bit processor
References: <sh3evb$n1v$1@newsreader4.netcologne.de> <shhl98$3vo$1@dont-email.me> <shjeir$1rco$1@gal.iecc.com> <2021Sep12.141107@mips.complang.tuwien.ac.at> <shlc98$q3q$1@gal.iecc.com> <2021Sep13.092822@mips.complang.tuwien.ac.at>
In-Reply-To: <2021Sep13.092822@mips.complang.tuwien.ac.at>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 89
Message-ID: <1bI%I.120304$o45.44935@fx46.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Mon, 13 Sep 2021 13:16:45 UTC
Date: Mon, 13 Sep 2021 09:16:07 -0400
X-Received-Bytes: 5187
X-Original-Bytes: 5136

by: EricP - Mon, 13 Sep 2021 13:16 UTC

Anton Ertl wrote:
> John Levine <johnl@taugh.com> writes:
>> According to Anton Ertl <anton@mips.complang.tuwien.ac.at>:
>>> John Levine <johnl@taugh.com> writes:
>>> [VAX]
>>>> Compare that to the IBM 360 where you can tell from the first two
>>>> bits of the opcode how long the instruction is, and for the most
>>>> part where the memory addresses and register operands are. The 360
>>>> wasn't perfect but it certainly aged a lot better.
>>> And yet the 801 project found that they could outperform the 360
>>> descendants by moving to a load/store architecture.
>> Sure, but that was 40 years ago. The 801 also only had 24 bit
>> registers because that how big the memory addresses were at the time.
>> The point of the 801 wasn't that it was load/store but that the
>> instructions were simple enough to implement without microcode with a
>> 1980 transistor budget, and that they found that compilers rarely used
>> the more complex instructions anyway.
>
> The first implementation was "Motorola MECL-10K discrete component
> technology on large wire-wrapped custom boards." So the transistor
> budget was not limited by what fit on a single chip. So what limited
> the transistor budget? Or was it that they just wanted to leave out
> everything that they could do without, making it easier to make the
> rest fast? They certainly made it really fast.

ECL was invented by IBM in 1956. Originally called current-steering logic,
it was used in the Stretch, IBM 7090, and IBM 7094 computers (circa 1959).
MECL is Motorola's integrated circuit ECL logic developed starting in 1962.
By 1971 Motorola had their MECL 10,000 series.

The IBM 801 was circa 1976.

https://en.wikipedia.org/wiki/Emitter-coupled_logic#history

"MECL III in 1968 with 1-nanosecond gate propagation time and 300 MHz
flip-flop toggle rates, and the 10,000 series (with lower power
consumption and controlled edge speeds) in 1971."

ECL has a fan-out of 20 (up to 80) and allows wired-OR logic.
MECL 10K series is similar to TTL with DIP packages for things like
quad 2-input NOR, dual flip-flop, 8-1 mux, 64-bit RAM, 4-bit ALU, etc.

The design rules for ECL are different from TTL.
I've never worked with ECL, but from what I have read, wire reflections are
a major consideration. There are different rules for PCB layout and grounding.
Wire-wrap could have issues with wire length (~2-3ns per foot),
long wires need damping resistors, and each wrap connection has
to be perfect or you get resistance reflections
(so you might have to scope each connection).
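
As a rough illustration of why wire length matters, the sketch below
compares one-way wire delay against the clock period. It assumes the
15 MHz figure quoted for the 801 elsewhere in this thread and takes the
~2-3 ns per foot estimate at face value; the function name and the sample
wire lengths are made up for the example.

```python
def wire_delay_fraction(length_ft, ns_per_ft, clock_mhz):
    """Return the one-way wire delay as a fraction of one clock period."""
    period_ns = 1000.0 / clock_mhz  # 1000 ns per microsecond / MHz
    return (length_ft * ns_per_ft) / period_ns


# Illustrative wire lengths (assumed, not taken from any 801 document).
for length_ft in (0.5, 1.0, 3.0):
    for ns_per_ft in (2.0, 3.0):
        frac = wire_delay_fraction(length_ft, ns_per_ft, clock_mhz=15.0)
        print(f"{length_ft:4.1f} ft at {ns_per_ft} ns/ft: "
              f"{frac:.1%} of a 15 MHz cycle")
```

Under these assumptions a 3-foot wire costs roughly 9% to 14% of a cycle
one way, before counting any gate delay or settling time.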

Other than wire length, I can't think of anything that would
otherwise limit the physical size and therefore the complexity.

>> In the 1960s ROMs were faster
>> than core memory so even with multiple microinstructions it could keep
>> the main memory running at full speed, but by 1980 we had
>> semiconductor RAM and caches so a microcode cycle was no faster than a
>> main memory cycle.
>
> The IBM 801 ran at 15MHZ, the Vax 11/780 at 5MHZ and had a 2KB cache,
> so apparently the main memory was too slow for 5MHz (but cache was
> probably at the same speed as the microcode store). The IBM 801 (as
> describe by Radin
> <https://course.ece.cmu.edu/~ece447/s12/lib/exe/fetch.php?media=wiki:801-radin-1982.pdf>,
> which is already the 32-bit version) had a split instruction and data
> cache with 32-byte cache lines. I have not found cache sizes;
> associativity seems to be at least 2-way (they mention LRU
> replacement).
>
> - anton

Bitsavers has two IBM Company Confidential docs from 1976 which
might provide more insight as to why they made the decisions they did.
The overview says they had a project team of 20.

The 801 Minicomputer - An Overview, 1976
http://bitsavers.org/pdf/ibm/system801/The_801_Minicomputer_an_Overview_Sep76.pdf

IBM System 801 Principles of Operation, 1976
http://bitsavers.org/pdf/ibm/system801/System_801_Principles_of_Operation_Jan76.pdf

They definitely refer to it as a minicomputer and a prototype.
In a big company like IBM with many entrenched products,
not appearing to threaten other products and thereby step
on toes is a career consideration, part of the unwritten rules.
I can conjecture that they might have had to get "permission"
to move from 24 to 32 bits.

Re: not a 360 either, was Design a better 16 or 32 bit processor

<d97b9700-0e3e-45e7-8092-130213ea553an@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=20429&group=comp.arch#20429

X-Received: by 2002:ac8:43d6:: with SMTP id w22mr483983qtn.92.1631551330502;
Mon, 13 Sep 2021 09:42:10 -0700 (PDT)
X-Received: by 2002:a05:6830:1105:: with SMTP id w5mr10153533otq.85.1631551330180;
Mon, 13 Sep 2021 09:42:10 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Mon, 13 Sep 2021 09:42:09 -0700 (PDT)
In-Reply-To: <1bI%I.120304$o45.44935@fx46.iad>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:49b:5002:45eb:aaea;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:49b:5002:45eb:aaea
References: <sh3evb$n1v$1@newsreader4.netcologne.de> <shhl98$3vo$1@dont-email.me>
<shjeir$1rco$1@gal.iecc.com> <2021Sep12.141107@mips.complang.tuwien.ac.at>
<shlc98$q3q$1@gal.iecc.com> <2021Sep13.092822@mips.complang.tuwien.ac.at> <1bI%I.120304$o45.44935@fx46.iad>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <d97b9700-0e3e-45e7-8092-130213ea553an@googlegroups.com>
Subject: Re: not a 360 either, was Design a better 16 or 32 bit processor
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Mon, 13 Sep 2021 16:42:10 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 103

by: MitchAlsup - Mon, 13 Sep 2021 16:42 UTC

On Monday, September 13, 2021 at 8:16:51 AM UTC-5, EricP wrote:
> Anton Ertl wrote:
> > John Levine <jo...@taugh.com> writes:
> >> According to Anton Ertl <an...@mips.complang.tuwien.ac.at>:
> >>> John Levine <jo...@taugh.com> writes:
> >>> [VAX]
> >>>> Compare that to the IBM 360 where you can tell from the first two
> >>>> bits of the opcode how long the instruction is, and for the most
> >>>> part where the memory addresses and register operands are. The 360
> >>>> wasn't perfect but it certainly aged a lot better.
> >>> And yet the 801 project found that they could outperform the 360
> >>> descendants by moving to a load/store architecture.
> >> Sure, but that was 40 years ago. The 801 also only had 24 bit
> >> registers because that how big the memory addresses were at the time.
> >> The point of the 801 wasn't that it was load/store but that the
> >> instructions were simple enough to implement without microcode with a
> >> 1980 transistor budget, and that they found that compilers rarely used
> >> the more complex instructions anyway.
> >
> > The first implementation was "Motorola MECL-10K discrete component
> > technology on large wire-wrapped custom boards." So the transistor
> > budget was not limited by what fit on a single chip. So what limited
> > the transistor budget? Or was it that they just wanted to leave out
> > everything that they could do without, making it easier to make the
> > rest fast? They certainly made it really fast.
> ECL was invented by IBM in 1956. Originally called current-steering logic,
> it was used in the Stretch, IBM 7090, and IBM 7094 computers (circa 1959).
> MECL is Motorola's integrated circuit ECL logic developed starting in 1962.
> By 1971 Motorola had their MECL 10,000 series.
>
> The IBM 801 was circa 1976.
>
> https://en.wikipedia.org/wiki/Emitter-coupled_logic#history
>
> "MECL III in 1968 with 1-nanosecond gate propagation time and 300 MHz
> flip-flop toggle rates, and the 10,000 series (with lower power
> consumption and controlled edge speeds) in 1971."
>
> ECL has a fan-out of 20 (up to 80) and allows wired-OR logic.
> MECL 10K series is similar to TTL with DIP packages for things like
> quad 2-input NOR, dual flip-flop, 8-1 mux, 64-bit RAM, 4-bit ALU, etc.
>
> The design rules for ECL are different from TTL.
> I've never worked with ECL but from what I read wire reflections are a
> major consideration. There are different rules for PCB layout, grounding.
<
ECL has its major voltage of -5.2 (that is MINUS 5.2V) and Gnd is 0V.
All of the load comes from the Gnd <plane>.
<
> Wire-wrap could have issues with wire length (~2-3ns per foot),
> long wires need damping resistors, and each wrap connection has
> to be perfect or you get resistance reflections
> (so you might have to scope each connection).
<
You had 3 choices: 1) wires shorter than 5", 2) parallel damping,
3) series damping. In series damping one puts a resistor equal
to the characteristic impedance near the driving gate and a high
value (1K Ohms) resistor at the terminating end of the wire. In
parallel damping one puts a resistor at the terminating end, with
resistance equal to the characteristic impedance, tied to -2.0V.
<
Both arrangements result in no overshoot/undershoot on the logic
wire.
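
A minimal sketch of why both terminations avoid ringing, using the
standard transmission-line reflection coefficient (R - Z0) / (R + Z0).
The 100 Ohm characteristic impedance is an assumed value picked only for
illustration, the driver's own output impedance is ignored, and all names
are hypothetical.

```python
def reflection_coefficient(r_ohms, z0_ohms):
    """Fraction of an incident wave reflected by a resistance r_ohms
    terminating a line of characteristic impedance z0_ohms."""
    return (r_ohms - z0_ohms) / (r_ohms + z0_ohms)


Z0 = 100.0        # assumed characteristic impedance of the wire, in Ohms
R_FAR_END = 1000.0  # the high-value far-end resistor (1K Ohms) for series damping

# Parallel damping: the far end is terminated in Z0, so nothing reflects.
gamma_parallel_load = reflection_coefficient(Z0, Z0)

# Series damping: the series resistor and the line form a divider, so only
# part of the swing is launched; the far end reflects almost all of it, and
# the matched source absorbs the returning wave.
launch_fraction = Z0 / (Z0 + Z0)
gamma_series_load = reflection_coefficient(R_FAR_END, Z0)
gamma_series_source = reflection_coefficient(Z0, Z0)
far_end_fraction = launch_fraction * (1.0 + gamma_series_load)

print("parallel: reflection at load =", gamma_parallel_load)
print("series:   reflection at load =", round(gamma_series_load, 3))
print("series:   reflection back at source =", gamma_series_source)
print("series:   far-end level on first arrival =", round(far_end_fraction, 3))
```

With the source matched, the wave reflected at the high-impedance far end
is absorbed when it returns to the driver, so the far end reaches its
final level on first arrival.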
<
>
> Other than wire length, I can't think of anything that would
> otherwise limit the physical size and therefore the complexity.
<
Indeed, Cray built refrigerator sized wire wrap boards that ran at
80 MHz with sub nanosecond clock delay+skew+jitter.
<
> >> In the 1960s ROMs were faster
> >> than core memory so even with multiple microinstructions it could keep
> >> the main memory running at full speed, but by 1980 we had
> >> semiconductor RAM and caches so a microcode cycle was no faster than a
> >> main memory cycle.
> >
> > The IBM 801 ran at 15MHZ, the Vax 11/780 at 5MHZ and had a 2KB cache,
> > so apparently the main memory was too slow for 5MHz (but cache was
> > probably at the same speed as the microcode store). The IBM 801 (as
> > describe by Radin
> > <https://course.ece.cmu.edu/~ece447/s12/lib/exe/fetch.php?media=wiki:801-radin-1982.pdf>,
> > which is already the 32-bit version) had a split instruction and data
> > cache with 32-byte cache lines. I have not found cache sizes;
> > associativity seems to be at least 2-way (they mention LRU
> > replacement).
> >
> > - anton
> Bitsavers has two IBM Company Confidential docs from 1976 which
> might provide more insight as why they made the decisions they did.
> The overview says they had a project team of 20.
>
> The 801 Minicomputer - An Overview, 1976
> http://bitsavers.org/pdf/ibm/system801/The_801_Minicomputer_an_Overview_Sep76.pdf
>
> IBM System 801 Principles of Operation, 1976
> http://bitsavers.org/pdf/ibm/system801/System_801_Principles_of_Operation_Jan76.pdf
>
> They definitely refer to it as a minicomputer and a prototype.
> In a big company like IBM with many entrenched products,
> not appearing to threaten other products and thereby step
> on toes is a career consideration, part of the unwritten rules.
> It can conjecture that they might have had to get "permission"
> to move from 24 to 32 bits.

Re: Design a better 16 or 32 bit processor

<shocru$pl0$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=20430&group=comp.arch#20430

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: ggt...@yahoo.com (Brett)
Newsgroups: comp.arch
Subject: Re: Design a better 16 or 32 bit processor
Date: Mon, 13 Sep 2021 20:38:22 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 78
Message-ID: <shocru$pl0$1@dont-email.me>
References: <sh3evb$n1v$1@newsreader4.netcologne.de>
<3fea318c-62be-4d30-aafb-976eeee14908n@googlegroups.com>
<shguv8$2416$1@gal.iecc.com>
<2d262fce-9363-4360-bfd9-ba4263e8d703n@googlegroups.com>
<shh02c$267q$1@gal.iecc.com>
<shhl98$3vo$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Mon, 13 Sep 2021 20:38:22 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="fbea62ffed0aed5b85f57517fd953b3f";
logging-data="26272"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+2yMWt7yP/TIfB+AtvJ0Nk"
User-Agent: NewsTap/5.5 (iPad)
Cancel-Lock: sha1:BvtJeti0xIg/BG5q3p4LaG4MmmI=
sha1:I0BMsPePebFqLYKgArQcmjMhvas=

by: Brett - Mon, 13 Sep 2021 20:38 UTC

Brett <ggtgp@yahoo.com> wrote:
> John Levine <johnl@taugh.com> wrote:
>> According to MitchAlsup <MitchAlsup@aol.com>:
>>>> Sounds a lot like a PDP-11.
>>> <
>>> Which died because its successor was too hard to pipeline.
>>
>> The PDP-11 was a really good design for the late 1960s, when memory
>> was starting to be affordable and microcode ROM was still a lot faster
>> than core.
>>
>> The VAX was also a good design for the late 1960s but unfortunately it
>> was introduced in the late 1970s. It wasn't just that it was too hard
>> to pipeline.
>
>> The instruction set was full of overcomplicated
>> microcoded instructions that were often slower than a sequence of
>> simple instructions,
>
> A Myth what was promoted by RISC proponents, which has been debunked.

An alternative or addition is wide packed instructions with chaining.
The example of wide packed is Itanic, but of course they did it wrong.

You would go 256 bits wide and support a variable number of instructions in
the packet; by supporting chaining you save 10 bits of register
specifiers for each chain segment, minus a bit to indicate the chain link.
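
A back-of-the-envelope sketch of that saving. It assumes 32 registers
(5-bit specifiers), three-operand instructions, and that each chain link
drops the producer's destination and one of the consumer's sources. The
10-bit opcode width is a made-up figure, and packet header or
heads-and-tails overhead is ignored.

```python
REG_BITS = 5        # 32 registers -> 5-bit specifier (assumption)
OPCODE_BITS = 10    # hypothetical opcode width, not from the post
PACKET_BITS = 256


def chain_bits(n_ops):
    """Return (unchained, chained) bit counts for n_ops instructions
    chained end to end."""
    unchained = n_ops * (OPCODE_BITS + 3 * REG_BITS)
    links = max(n_ops - 1, 0)
    # Each link saves two specifiers (10 bits) and costs one flag bit.
    saved = links * (2 * REG_BITS - 1)
    return unchained, unchained - saved


def max_ops(chained):
    """Largest number of instructions that fits in one packet."""
    n = 0
    while True:
        unchained_bits, chained_bits = chain_bits(n + 1)
        if (chained_bits if chained else unchained_bits) > PACKET_BITS:
            return n
        n += 1


print("4 ops (unchained, chained) bits:", chain_bits(4))
print("max ops per packet, unchained:", max_ops(False))
print("max ops per packet, fully chained:", max_ops(True))
```

Real code will not chain every instruction, so the fully chained count is
an upper bound under these assumptions, not an expected density.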

You would use heads and tails encoding and support jumps into and out of
the packet. Chaining alone should give you the best instruction density; add
some microcoded instructions and you should get dominating instruction
density that makes manufacturers take a serious look at your offerings.

A variable width instruction set can support chaining today by just adding
the instructions; I am perplexed as to why no one has.

A new architecture needs a hook to get noticed and dominating instruction
density is one way to get that notice.

> RISC only made sense for a decade back in the ancient history of the
> 1980’s.
>
> Today if I wanted to build a better 16 or 32 bit processor the first step
> would be to find what micro coded instructions I could add to reduce
> instruction density, and thus win the lowest cost war.
>
> The 8086 with hard coded registers was quite good for the era, but we can
> do better today, by micro coding much more complicated sequences.
>
> The first instruction I would add is a one instruction memcpy loop, which
> would use three hard coded registers to make the instruction short. The
> data register would not be visible so that I could use vector registers if
> I wanted. And there would be several variants for copy size and alignment.
> And a bit to decide if the count is part of the instruction.
>
> Another instruction I would add is add plus store, etc.
>
> The case for load plus add is harder, and might not make the cut due to
> transistor implementation cost outweighing memory transistor savings.
> Load compare and branch is in the same boat and has to be added to the
> total system cost of load compute.
>
> I seriously think a new 16 or 32 bit processor with micro coded
> instructions could win market share, by simple expedient of smaller total
> size with code included.
>
> My template is a pipelined 386 with 32 registers and far more complex and
> longer micro code instructions.
>
> There is a major company that went for an updated 386, but did not improve
> anything besides instruction encoding and minor fixes. Failed to go big and
> so failed, go big or go home.
>
>> and they somehow didn't notice that memory was
>> getting a lot cheaper, with a super-dense super-general instruction
>> set intended for assembler programmers, and tiny 512 byte pages that
>> even at the time were obviously too small.

Re: Design a better 16 or 32 bit processor

<shog6o$nvj$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=20431&group=comp.arch#20431

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: iva...@millcomputing.com (Ivan Godard)
Newsgroups: comp.arch
Subject: Re: Design a better 16 or 32 bit processor
Date: Mon, 13 Sep 2021 14:35:21 -0700
Organization: A noiseless patient Spider
Lines: 42
Message-ID: <shog6o$nvj$1@dont-email.me>
References: <sh3evb$n1v$1@newsreader4.netcologne.de>
<3fea318c-62be-4d30-aafb-976eeee14908n@googlegroups.com>
<shguv8$2416$1@gal.iecc.com>
<2d262fce-9363-4360-bfd9-ba4263e8d703n@googlegroups.com>
<shh02c$267q$1@gal.iecc.com> <shhl98$3vo$1@dont-email.me>
<shocru$pl0$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 13 Sep 2021 21:35:20 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="26c6f9ee6629ab1ced93af91a7454605";
logging-data="24563"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1864aCc/brOxxsiA29VvYqS"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101
Thunderbird/78.13.0
Cancel-Lock: sha1:9zj4iX8CQ3BjELS6RgluUgkv2cQ=
In-Reply-To: <shocru$pl0$1@dont-email.me>
Content-Language: en-US

by: Ivan Godard - Mon, 13 Sep 2021 21:35 UTC

On 9/13/2021 1:38 PM, Brett wrote:
> Brett <ggtgp@yahoo.com> wrote:
>> John Levine <johnl@taugh.com> wrote:
>>> According to MitchAlsup <MitchAlsup@aol.com>:
>>>>> Sounds a lot like a PDP-11.
>>>> <
>>>> Which died because its successor was too hard to pipeline.
>>>
>>> The PDP-11 was a really good design for the late 1960s, when memory
>>> was starting to be affordable and microcode ROM was still a lot faster
>>> than core.
>>>
>>> The VAX was also a good design for the late 1960s but unfortunately it
>>> was introduced in the late 1970s. It wasn't just that it was too hard
>>> to pipeline.
>>
>>> The instruction set was full of overcomplicated
>>> microcoded instructions that were often slower than a sequence of
>>> simple instructions,
>>
>> A Myth what was promoted by RISC proponents, which has been debunked.
>
> An alternative or addition is wide packed instructions with chaining.
> The example of wide packed is Itanic, but of course they did it wrong.
>
> You would go 256 bits wide and support a variable number of instructions in
> the packet and by supporting chaining you save 10 bits of register
> specifiers for each chain segment, minus a bit to indicate the chain link.
>
> You would use heads and tails encoding and support jumps into and out of
> the packet. Chaining alone should give you the best instruction density and
> add some micro coded instructions and you should get dominating instruction
> density that makes manufactures take a serious look at your offerings.
>
> A variable width instruction set can support chaining today by just adding
> the instructions, I am perplexed as to why no one has.

And how is this different from Mill belt?

> A new architecture needs a hook to get noticed and dominating instruction
> density is one way to get that notice.
>
