devel / comp.arch / Re: RISC-V vs. Aarch64


Re: The type of Mill's belt's slots

<ss9e8s$lv8$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23014&group=comp.arch#23014
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: iva...@millcomputing.com (Ivan Godard)
Newsgroups: comp.arch
Subject: Re: The type of Mill's belt's slots
Date: Wed, 19 Jan 2022 08:29:15 -0800
Organization: A noiseless patient Spider
Lines: 46
Message-ID: <ss9e8s$lv8$1@dont-email.me>
References: <2021Dec24.163843@mips.complang.tuwien.ac.at>
<sqssff$a9j$1@gioia.aioe.org>
<077afaee-009e-4860-be45-61106126934bn@googlegroups.com>
<squhht$79u$1@dont-email.me>
<bb6d49bb-a676-44bd-9a6d-29386d429454n@googlegroups.com>
<sr0vhm$c4u$1@dont-email.me> <sr114i$1qc$1@newsreader4.netcologne.de>
<sr1dca$70e$1@dont-email.me> <kM%AJ.186634$np6.183460@fx46.iad>
<sr2gf6$64u$1@dont-email.me> <7DpBJ.254731$3q9.63673@fx47.iad>
<sr62tb$u2o$1@dont-email.me> <ss4g91$hvs$1@dont-email.me>
<ss4ktr$dvv$1@dont-email.me> <UZCFJ.4610$uP.4480@fx16.iad>
<bSDFJ.268310$1d1.64158@fx99.iad> <ss78bk$hm6$1@dont-email.me>
<jwvo848n4ud.fsf-monnier+comp.arch@gnu.org> <ss7ila$obl$1@dont-email.me>
<jwv7dawmmb7.fsf-monnier+comp.arch@gnu.org> <ss858d$m0a$1@dont-email.me>
<jwv7davagpd.fsf-monnier+comp.arch@gnu.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Wed, 19 Jan 2022 16:29:16 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="e035f31cff3a51acdec223f4e7c82b8d";
logging-data="22504"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+xSBz+V4IcJW4/jCYIZxpS"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.5.0
Cancel-Lock: sha1:fTAr/9/vObx00cK6Bqqfb+vrF2g=
In-Reply-To: <jwv7davagpd.fsf-monnier+comp.arch@gnu.org>
Content-Language: en-US
 by: Ivan Godard - Wed, 19 Jan 2022 16:29 UTC

On 1/19/2022 7:36 AM, Stefan Monnier wrote:
> Ivan Godard [2022-01-18 20:49:16] wrote:
>> Width mismatch can arise from software (specializer bug)
>
> That would be in a case like when we add a 32bit int and a 64 bit int?
> These can be detected in the decoder/mapper part, right?

Sometimes, if the projected input widths are inconsistent with any
instruction overload. In other cases the projected input widths do match
a valid overload, but the actual widths are not as projected.

>
>> or hardware (gamma ray) error,
>
> Sanity checks, like ECC, are good, yes.
>
>> Most instructions that have results have data-dependent output widths.
>> Usually the result width is the same as the argument widths, but all
>> instructions that can overflow (including nearly all arithmetic
>> instructions) have a variant that produces a double-width result.
>
> But the width is not data-dependent, in the sense that it only depends
> on the input widths, not the input values. So the decoder/mapper can
> always correctly predict the output width (and it can also correctly
> predict the mismatches like adding a 32bit to a 64bit arg), it never
> needs to guess, right?
>
> IOW, when you said:
>
>> the actual (from the data) is matched against the expected and faults
>> on a mismatch.
>
> This is only needed to detect hardware errors.

It is needed whenever data is kept that is not being tracked by the
mapper, for example in the scratchpad. We could omit the actual width
tag while a datum is on the belt, but would need to restore the mapping
when filling a spilled datum. The specializer knows what the filled
width is, and could just assert the width in the fill instruction the
way we do with loads. However, we are spilling the rest of the metadata
anyway and so currently just track (and spill) the actual width as well.
A similar tradeoff occurs with function return, where the caller's belt
(and hence the mapping) must be restored.
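
To make the projected-versus-actual distinction concrete, here is a minimal C sketch. It is an illustration only, not Mill hardware or specializer code: the names `width_tag`, `datum`, `map_entry`, `check_operand` and `fill_from_spill` are hypothetical, and the tag encoding is an assumption. The mapper holds the width it expects for a slot, the datum carries its own width as metadata, and a mismatch models the fault.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical width tags; the real Mill encoding is not given in the thread. */
typedef enum { W8 = 0, W16 = 1, W32 = 2, W64 = 3 } width_tag;

/* A datum carries its actual width as metadata next to the payload. */
typedef struct {
    uint64_t  payload;
    width_tag actual;
} datum;

/* The decoder/mapper's view of a slot: the width it projects. */
typedef struct {
    width_tag projected;
} map_entry;

/* Returns true when the operand may be used; false models a width fault. */
bool check_operand(const map_entry *m, const datum *d) {
    assert(m && d);
    return m->projected == d->actual;
}

/* Filling a spilled datum: the mapping is rebuilt from the spilled tag,
   so the fill itself does not have to assert a width. */
map_entry fill_from_spill(const datum *spilled) {
    assert(spilled);
    map_entry m = { spilled->actual };
    return m;
}
```

The alternative mentioned above, asserting the width in the fill instruction the way loads do, would replace `fill_from_spill` with a width supplied by the instruction and would let the tag be dropped while the datum is on the belt.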

Re: The type of Mill's belt's slots

<jwvv8yf8yx1.fsf-monnier+comp.arch@gnu.org>

https://www.novabbs.com/devel/article-flat.php?id=23015&group=comp.arch#23015
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: monn...@iro.umontreal.ca (Stefan Monnier)
Newsgroups: comp.arch
Subject: Re: The type of Mill's belt's slots
Date: Wed, 19 Jan 2022 11:40:20 -0500
Organization: A noiseless patient Spider
Lines: 18
Message-ID: <jwvv8yf8yx1.fsf-monnier+comp.arch@gnu.org>
References: <2021Dec24.163843@mips.complang.tuwien.ac.at>
<squhht$79u$1@dont-email.me>
<bb6d49bb-a676-44bd-9a6d-29386d429454n@googlegroups.com>
<sr0vhm$c4u$1@dont-email.me> <sr114i$1qc$1@newsreader4.netcologne.de>
<sr1dca$70e$1@dont-email.me> <kM%AJ.186634$np6.183460@fx46.iad>
<sr2gf6$64u$1@dont-email.me> <7DpBJ.254731$3q9.63673@fx47.iad>
<sr62tb$u2o$1@dont-email.me> <ss4g91$hvs$1@dont-email.me>
<ss4ktr$dvv$1@dont-email.me> <UZCFJ.4610$uP.4480@fx16.iad>
<bSDFJ.268310$1d1.64158@fx99.iad> <ss78bk$hm6$1@dont-email.me>
<jwvo848n4ud.fsf-monnier+comp.arch@gnu.org>
<ss7ila$obl$1@dont-email.me>
<jwv7dawmmb7.fsf-monnier+comp.arch@gnu.org>
<ss858d$m0a$1@dont-email.me>
<jwv7davagpd.fsf-monnier+comp.arch@gnu.org>
<ss9e8s$lv8$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain
Injection-Info: reader02.eternal-september.org; posting-host="6b077cbe65e9ab4d1d4961c7fa08cc9e";
logging-data="6641"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19kzhtoPVdXF3BPLVyj+Mw5"
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/29.0.50 (gnu/linux)
Cancel-Lock: sha1:KBnPdN5lLTAWiiHBnd9GRB4iaL8=
sha1:tCLOT0yvO3KuN4j/oLIkjS5DOug=
 by: Stefan Monnier - Wed, 19 Jan 2022 16:40 UTC

> It is needed whenever data is kept that is not being tracked by the mapper,
> for example in the scratchpad. We could omit the actual width tag while
> a datum is on the belt, but would need to restore the mapping when filling
> a spilled datum. The specializer knows what the filled width is, and could
> just assert the width in the fill instruction the way we do with
> loads. However, we are spilling the rest of the metadata anyway and so
> currently just track (and spill) the actual width as well. A similar
> tradeoff occurs with function return, where the caller's belt (and hence the
> mapping) must be restored.

Hmm... so upon return from a function, the decoder/mapper doesn't know
the belt slots' widths (because they'll only be available a few cycles
later when the spiller returns the corresponding data) and instead tries
to guess them, which thus comes with occasional
misprediction-triggered reexecutions?

Stefan

Re: The type of Mill's belt's slots

<74f730f1-e138-4124-9fbb-21f388eafeb3n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=23016&group=comp.arch#23016
X-Received: by 2002:ac8:7f96:: with SMTP id z22mr10972731qtj.171.1642614415011;
Wed, 19 Jan 2022 09:46:55 -0800 (PST)
X-Received: by 2002:a9d:2a82:: with SMTP id e2mr11037717otb.331.1642614414743;
Wed, 19 Jan 2022 09:46:54 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Wed, 19 Jan 2022 09:46:54 -0800 (PST)
In-Reply-To: <ss858d$m0a$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:5089:ab19:f6cc:8a98;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:5089:ab19:f6cc:8a98
References: <2021Dec24.163843@mips.complang.tuwien.ac.at> <sqpqbm$7qo$1@newsreader4.netcologne.de>
<sqq3ce$c4n$2@dont-email.me> <sqssff$a9j$1@gioia.aioe.org>
<077afaee-009e-4860-be45-61106126934bn@googlegroups.com> <squhht$79u$1@dont-email.me>
<bb6d49bb-a676-44bd-9a6d-29386d429454n@googlegroups.com> <sr0vhm$c4u$1@dont-email.me>
<sr114i$1qc$1@newsreader4.netcologne.de> <sr1dca$70e$1@dont-email.me>
<kM%AJ.186634$np6.183460@fx46.iad> <sr2gf6$64u$1@dont-email.me>
<7DpBJ.254731$3q9.63673@fx47.iad> <sr62tb$u2o$1@dont-email.me>
<ss4g91$hvs$1@dont-email.me> <ss4ktr$dvv$1@dont-email.me> <UZCFJ.4610$uP.4480@fx16.iad>
<bSDFJ.268310$1d1.64158@fx99.iad> <ss78bk$hm6$1@dont-email.me>
<jwvo848n4ud.fsf-monnier+comp.arch@gnu.org> <ss7ila$obl$1@dont-email.me>
<jwv7dawmmb7.fsf-monnier+comp.arch@gnu.org> <ss858d$m0a$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <74f730f1-e138-4124-9fbb-21f388eafeb3n@googlegroups.com>
Subject: Re: The type of Mill's belt's slots
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Wed, 19 Jan 2022 17:46:54 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 35
 by: MitchAlsup - Wed, 19 Jan 2022 17:46 UTC

On Tuesday, January 18, 2022 at 10:49:20 PM UTC-6, Ivan Godard wrote:
> On 1/18/2022 7:35 PM, Stefan Monnier wrote:
> >> It is both kept with the data and projected by the decoder/mapper. Execution
> >> follows the projected width; the actual (from the data) is matched against
> >> the expected and faults on a mismatch.
> >
> > Hmm... how/when would such a mismatch occur?
> > Do you have instructions whose output width is data-dependent?
> > Where is it useful/used? Do you then have instructions to dynamically
> > test the width of some belt slot?
> >
> > Dynamically typed machine language?
> >
> >
> > Stefan
> Width mismatch can arise from software (specializer bug) or hardware
> (gamma ray) error, or from an attack that generates or modifies
> binaries, before or after execution begins. The Mill design cares *very*
> much about RAS issues and sanity-checks everything it can.
>
> Most instructions that have results have data-dependent output widths.
> Usually the result width is the same as the argument widths, but all
> instructions that can overflow (including nearly all arithmetic
> instructions) have a variant that produces a double-width result.
> Obviously the WIDEN and NARROW instructions change the width, and there
> are a few odd-balls.
<
Why do you consider this (previous paragraph) better than the RISC model of
"everything becomes register size when it is loaded" ?
<
>
> The belt evaporates too quickly to do much useful with it dynamically. A
> debugger or exception handler that needs to dynamically interpret a belt
> can call a few nested functions to cause it to be pushed into the
> spiller and thence to DRAM, where software can see the widths in the
> spill-format data.

Re: The type of Mill's belt's slots

<ss9p4l$a2i$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23017&group=comp.arch#23017
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: iva...@millcomputing.com (Ivan Godard)
Newsgroups: comp.arch
Subject: Re: The type of Mill's belt's slots
Date: Wed, 19 Jan 2022 11:34:46 -0800
Organization: A noiseless patient Spider
Lines: 36
Message-ID: <ss9p4l$a2i$1@dont-email.me>
References: <2021Dec24.163843@mips.complang.tuwien.ac.at>
<sqpqbm$7qo$1@newsreader4.netcologne.de> <sqq3ce$c4n$2@dont-email.me>
<sqssff$a9j$1@gioia.aioe.org>
<077afaee-009e-4860-be45-61106126934bn@googlegroups.com>
<squhht$79u$1@dont-email.me>
<bb6d49bb-a676-44bd-9a6d-29386d429454n@googlegroups.com>
<sr0vhm$c4u$1@dont-email.me> <sr114i$1qc$1@newsreader4.netcologne.de>
<sr1dca$70e$1@dont-email.me> <kM%AJ.186634$np6.183460@fx46.iad>
<sr2gf6$64u$1@dont-email.me> <7DpBJ.254731$3q9.63673@fx47.iad>
<sr62tb$u2o$1@dont-email.me> <ss4g91$hvs$1@dont-email.me>
<ss4ktr$dvv$1@dont-email.me> <UZCFJ.4610$uP.4480@fx16.iad>
<bSDFJ.268310$1d1.64158@fx99.iad> <ss78bk$hm6$1@dont-email.me>
<jwvo848n4ud.fsf-monnier+comp.arch@gnu.org> <ss7ila$obl$1@dont-email.me>
<jwv7dawmmb7.fsf-monnier+comp.arch@gnu.org> <ss858d$m0a$1@dont-email.me>
<74f730f1-e138-4124-9fbb-21f388eafeb3n@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Wed, 19 Jan 2022 19:34:46 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="e035f31cff3a51acdec223f4e7c82b8d";
logging-data="10322"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+9QSIUgKB4lbmJyDobWZyA"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.5.0
Cancel-Lock: sha1:yQYk8PDB7G5QEVRLIdi20ropEy4=
In-Reply-To: <74f730f1-e138-4124-9fbb-21f388eafeb3n@googlegroups.com>
Content-Language: en-US
 by: Ivan Godard - Wed, 19 Jan 2022 19:34 UTC

On 1/19/2022 9:46 AM, MitchAlsup wrote:
> On Tuesday, January 18, 2022 at 10:49:20 PM UTC-6, Ivan Godard wrote:
>> On 1/18/2022 7:35 PM, Stefan Monnier wrote:
>>>> It is both kept with the data and projected by the decoder/mapper. Execution
>>>> follows the projected width; the actual (from the data) is matched against
>>>> the expected and faults on a mismatch.
>>>
>>> Hmm... how/when would such a mismatch occur?
>>> Do you have instructions whose output width is data-dependent?
>>> Where is it useful/used? Do you then have instructions to dynamically
>>> test the width of some belt slot?
>>>
>>> Dynamically typed machine language?
>>>
>>>
>>> Stefan
>> Width mismatch can arise from software (specializer bug) or hardware
>> (gamma ray) error, or from an attack that generates or modifies
>> binaries, before or after execution begins. The Mill design cares *very*
>> much about RAS issues and sanity-checks everything it can.
>>
>> Most instructions that have results have data-dependent output widths.
>> Usually the result width is the same as the argument widths, but all
>> instructions that can overflow (including nearly all arithmetic
>> instructions) have a variant that produces a double-width result.
>> Obviously the WIDEN and NARROW instructions change the width, and there
>> are a few odd-balls.
> <
> Why do you consider this (previous paragraph) better than the RISC model of
> "everything becomes register size when it is loaded" ?

We wanted to do auto-SIMD, so scalar and lane semantics should be the
same. Once we decided to expose integer overflow (as opposed to the
legacy/RISC letting the user handle the boojums), RAS reasons led to
doing calculations in the declared types.
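
As a rough illustration of computing in the declared type, here is a C sketch of a 32-bit add in two flavours: one that keeps the result at the operand width and reports overflow, and one that produces a double-width result. The function names `add32_checked` and `add32_widening` are hypothetical, and the thread does not spell out how the Mill's overflow variants behave, so treat the exact semantics as an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same-width add: the result stays 32 bits wide. Overflow is reported
   to the caller instead of being lost in a wider register. */
bool add32_checked(int32_t a, int32_t b, int32_t *out) {
    assert(out);
    int64_t wide = (int64_t)a + (int64_t)b;   /* cannot overflow in 64 bits */
    if (wide < INT32_MIN || wide > INT32_MAX) {
        return false;                          /* models the overflow case */
    }
    *out = (int32_t)wide;
    return true;
}

/* Widening add: the result is double width, so it always fits. */
int64_t add32_widening(int32_t a, int32_t b) {
    return (int64_t)a + (int64_t)b;
}
```

Under the "everything becomes register size when it is loaded" model, the same add would be done at 64 bits and the 32-bit overflow would only show up if software checked for it explicitly.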

Re: The type of Mill's belt's slots

<ss9p8g$a2i$2@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23018&group=comp.arch#23018
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: iva...@millcomputing.com (Ivan Godard)
Newsgroups: comp.arch
Subject: Re: The type of Mill's belt's slots
Date: Wed, 19 Jan 2022 11:36:49 -0800
Organization: A noiseless patient Spider
Lines: 22
Message-ID: <ss9p8g$a2i$2@dont-email.me>
References: <2021Dec24.163843@mips.complang.tuwien.ac.at>
<squhht$79u$1@dont-email.me>
<bb6d49bb-a676-44bd-9a6d-29386d429454n@googlegroups.com>
<sr0vhm$c4u$1@dont-email.me> <sr114i$1qc$1@newsreader4.netcologne.de>
<sr1dca$70e$1@dont-email.me> <kM%AJ.186634$np6.183460@fx46.iad>
<sr2gf6$64u$1@dont-email.me> <7DpBJ.254731$3q9.63673@fx47.iad>
<sr62tb$u2o$1@dont-email.me> <ss4g91$hvs$1@dont-email.me>
<ss4ktr$dvv$1@dont-email.me> <UZCFJ.4610$uP.4480@fx16.iad>
<bSDFJ.268310$1d1.64158@fx99.iad> <ss78bk$hm6$1@dont-email.me>
<jwvo848n4ud.fsf-monnier+comp.arch@gnu.org> <ss7ila$obl$1@dont-email.me>
<jwv7dawmmb7.fsf-monnier+comp.arch@gnu.org> <ss858d$m0a$1@dont-email.me>
<jwv7davagpd.fsf-monnier+comp.arch@gnu.org> <ss9e8s$lv8$1@dont-email.me>
<jwvv8yf8yx1.fsf-monnier+comp.arch@gnu.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Wed, 19 Jan 2022 19:36:49 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="e035f31cff3a51acdec223f4e7c82b8d";
logging-data="10322"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18uoPIDFbZsfwxLbMfGfuko"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.5.0
Cancel-Lock: sha1:DCUMnr8doMrFnmrZA92VVqYInKs=
In-Reply-To: <jwvv8yf8yx1.fsf-monnier+comp.arch@gnu.org>
Content-Language: en-US
 by: Ivan Godard - Wed, 19 Jan 2022 19:36 UTC

On 1/19/2022 8:40 AM, Stefan Monnier wrote:
>> It is needed whenever data is kept that is not being tracked by the mapper,
>> for example in the scratchpad. We could omit the actual width tag while
>> a datum is on the belt, but would need to restore the mapping when filling
>> a spilled datum. The specializer knows what the filled width is, and could
>> just assert the width in the fill instruction the way we do with
>> loads. However, we are spilling the rest of the metadata anyway and so
>> currently just track (and spill) the actual width as well. A similar
>> tradeoff occurs with function return, where the caller's belt (and hence the
>> mapping) must be restored.
>
> Hmm... so upon return from a function, the decoder/mapper doesn't know
> the belt slots' widths (because they'll only be available a few cycles
> later when the spiller returns the corresponding data) and instead tries
> to guess them, which thus comes with occasional
> misprediction-triggered reexecutions?
>
>
> Stefan

There are several ways to do this in HW: spill a copy of the map table,
wait and load the widths from the data, or others, but none of them
involves guessing. It is a HW implementation choice.
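
A minimal sketch of the first option, spilling a copy of the map table, might look like the following. Everything here is an assumption made for illustration: the belt length, the spiller depth, the names `belt_map`, `spiller`, `on_call` and `on_return`, and the simplification that the callee starts with an empty map (a real callee's belt would hold its arguments).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BELT_SLOTS 8    /* hypothetical belt length */
#define MAX_DEPTH  16   /* hypothetical spiller capacity */

/* The mapper's record of each belt slot's width, stored as log2(bytes). */
typedef struct {
    uint8_t width_log2[BELT_SLOTS];
} belt_map;

typedef struct {
    belt_map saved[MAX_DEPTH];
    int      depth;
} spiller;

/* On call: keep a copy of the caller's map; the callee gets a cleared one. */
void on_call(spiller *s, belt_map *current) {
    assert(s->depth < MAX_DEPTH);
    s->saved[s->depth++] = *current;
    memset(current, 0, sizeof *current);
}

/* On return: the caller's map is restored exactly, with no prediction. */
void on_return(spiller *s, belt_map *current) {
    assert(s->depth > 0);
    *current = s->saved[--s->depth];
}
```

The second option, waiting for the spilled data and reading the widths from its metadata, trades this extra saved state for latency on return.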

Re: The type of Mill's belt's slots

<34d3c262-c385-4a42-a6c1-a15c34536b28n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=23019&group=comp.arch#23019
X-Received: by 2002:a05:622a:548:: with SMTP id m8mr1673248qtx.300.1642728405153;
Thu, 20 Jan 2022 17:26:45 -0800 (PST)
X-Received: by 2002:a4a:c44b:: with SMTP id h11mr1160810ooq.54.1642728404942;
Thu, 20 Jan 2022 17:26:44 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Thu, 20 Jan 2022 17:26:44 -0800 (PST)
In-Reply-To: <ss9p4l$a2i$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fb70:6300:1031:8d1b:71c7:f0d2;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fb70:6300:1031:8d1b:71c7:f0d2
References: <2021Dec24.163843@mips.complang.tuwien.ac.at> <sqpqbm$7qo$1@newsreader4.netcologne.de>
<sqq3ce$c4n$2@dont-email.me> <sqssff$a9j$1@gioia.aioe.org>
<077afaee-009e-4860-be45-61106126934bn@googlegroups.com> <squhht$79u$1@dont-email.me>
<bb6d49bb-a676-44bd-9a6d-29386d429454n@googlegroups.com> <sr0vhm$c4u$1@dont-email.me>
<sr114i$1qc$1@newsreader4.netcologne.de> <sr1dca$70e$1@dont-email.me>
<kM%AJ.186634$np6.183460@fx46.iad> <sr2gf6$64u$1@dont-email.me>
<7DpBJ.254731$3q9.63673@fx47.iad> <sr62tb$u2o$1@dont-email.me>
<ss4g91$hvs$1@dont-email.me> <ss4ktr$dvv$1@dont-email.me> <UZCFJ.4610$uP.4480@fx16.iad>
<bSDFJ.268310$1d1.64158@fx99.iad> <ss78bk$hm6$1@dont-email.me>
<jwvo848n4ud.fsf-monnier+comp.arch@gnu.org> <ss7ila$obl$1@dont-email.me>
<jwv7dawmmb7.fsf-monnier+comp.arch@gnu.org> <ss858d$m0a$1@dont-email.me>
<74f730f1-e138-4124-9fbb-21f388eafeb3n@googlegroups.com> <ss9p4l$a2i$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <34d3c262-c385-4a42-a6c1-a15c34536b28n@googlegroups.com>
Subject: Re: The type of Mill's belt's slots
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Fri, 21 Jan 2022 01:26:45 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 67
 by: Quadibloc - Fri, 21 Jan 2022 01:26 UTC

On Wednesday, January 19, 2022 at 12:34:49 PM UTC-7, Ivan Godard wrote:
> On 1/19/2022 9:46 AM, MitchAlsup wrote:
> > On Tuesday, January 18, 2022 at 10:49:20 PM UTC-6, Ivan Godard wrote:
> >> On 1/18/2022 7:35 PM, Stefan Monnier wrote:
> >>>> It is both kept with the data and projected by the decoder/mapper. Execution
> >>>> follows the projected width; the actual (from the data) is matched against
> >>>> the expected and faults on a mismatch.
> >>>
> >>> Hmm... how/when would such a mismatch occur?
> >>> Do you have instructions whose output width is data-dependent?
> >>> Where is it useful/used? Do you then have instructions to dynamically
> >>> test the width of some belt slot?
> >>>
> >>> Dynamically typed machine language?

> >> Width mismatch can arise from software (specializer bug) or hardware
> >> (gamma ray) error, or from an attack that generates or modifies
> >> binaries, before or after execution begins. The Mill design cares *very*
> >> much about RAS issues and sanity-checks everything it can.
> >>
> >> Most instructions that have results have data-dependent output widths.
> >> Usually the result width is the same as the argument widths, but all
> >> instructions that can overflow (including nearly all arithmetic
> >> instructions) have a variant that produces a double-width result.
> >> Obviously the WIDEN and NARROW instructions change the width, and there
> >> are a few odd-balls.
> > <
> > Why do you consider this (previous paragraph) better than the RISC model of
> > "everything becomes register size when it is loaded" ?

> We wanted to do auto-SIMD, so scalar and lane semantics should be the
> same. Once we decided to expose integer overflow (as opposed to the
> legacy/RISC letting the user handle the boojums), RAS reasons led to
> doing calculations in the declared types.

I'll have to admit that reading this discussion was confusing for me.

I'm used to thinking in terms of conventional computer architectures,
like that of the IBM System/360. In such an architecture, the width
of data depends on the instruction being executed. There can't be a
'mismatch', although there certainly can be code that doesn't make sense
because it operates on the same bits in memory as though they belong
to data of multiple different types.

So, for it to be even _possible_ to detect a 'mismatch' should one occur,
it sounds to me like the data has to have a label associated with it. Sort
of like on the old Burroughs computers.

When one saved and restored registers, instead of just saving their
contents as plain data, the save would have to be a protected binary blob
with type attributes included. Of course, the Mill doesn't have registers,
and data spill from the Belt would for other reasons have to be protected
from ordinary memory operations.

While I think security is a good thing, I tend to approve of the traditional
practice of avoiding any measures that would appear to have a significant
overhead. Computers are for setting new LINPACK records, and security
is about keeping the wrong people's grubby fingers away - on the assumption
that you can build a secure wall around a computer that on the inside, where
it's actually cranking out work, has almost no security features present...
and not get totally owned for such foolishness.

This may be misguided for some applications and threat models, but I tend
to presume it's a specialized area of research, outside the 'mainstream' of
computer design.

John Savard

Re: RISC-V vs. Aarch64

<ssep4k$vke$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23020&group=comp.arch#23020
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: sfu...@alumni.cmu.edu.invalid (Stephen Fuld)
Newsgroups: comp.arch
Subject: Re: RISC-V vs. Aarch64
Date: Fri, 21 Jan 2022 09:05:22 -0800
Organization: A noiseless patient Spider
Lines: 224
Message-ID: <ssep4k$vke$1@dont-email.me>
References: <2021Dec24.163843@mips.complang.tuwien.ac.at>
<86h7agvxun.fsf@linuxsc.com> <M4_BJ.140002$lz3.547@fx34.iad>
<f91f3db8-640e-4c10-b0f7-61c7085b70c8n@googlegroups.com>
<srag0i$2ed$2@dont-email.me>
<00add816-93d7-4763-a68b-33a67db6d770n@googlegroups.com>
<2022Jan8.101413@mips.complang.tuwien.ac.at>
<7557bf3a-61ce-4500-8cf8-ced2dbed7087n@googlegroups.com>
<ad2ee700-b604-4565-9e24-3386580b90c8n@googlegroups.com>
<4d2fbc82-af69-4388-bfa5-e3b2be652744n@googlegroups.com>
<2e706405-006a-49bb-8e8a-f634d749205en@googlegroups.com>
<570acc73-a5da-497f-8ec4-810150e0a9f1n@googlegroups.com>
<850b7681-204a-4df6-9095-cd6ee816a7d5n@googlegroups.com>
<5ea00397-5572-4fbd-bfb3-85c3554f1eb9n@googlegroups.com>
<srnkf0$4cb$1@dont-email.me>
<b4e98991-4fb9-4ef7-a831-430c3fc10145n@googlegroups.com>
<srp3n0$e0a$1@dont-email.me> <srp4lu$hhg$1@gioia.aioe.org>
<81c0ddc6-4b46-4b3d-b64b-e65963889214n@googlegroups.com>
<srpjgj$541$1@dont-email.me>
<1e0b0ba3-e11c-4a14-a0c7-7c074f2f9ba7n@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Fri, 21 Jan 2022 17:05:25 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="df1c2bb0508560b1482abd0cd87e42b2";
logging-data="32398"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/JLIW32PkFMWd4fNSV8C3Nu6hIQyKPG3k="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.5.0
Cancel-Lock: sha1:o36RZrABbCtrv//XldhlS8bJllQ=
In-Reply-To: <1e0b0ba3-e11c-4a14-a0c7-7c074f2f9ba7n@googlegroups.com>
Content-Language: en-US
 by: Stephen Fuld - Fri, 21 Jan 2022 17:05 UTC

On 1/13/2022 9:53 AM, MitchAlsup wrote:
> On Thursday, January 13, 2022 at 10:20:39 AM UTC-6, Ivan Godard wrote:
>> On 1/13/2022 7:54 AM, MitchAlsup wrote:
>>> On Thursday, January 13, 2022 at 6:07:29 AM UTC-6, Terje Mathisen wrote:
>>>> Ivan Godard wrote:
>>>>> On 1/12/2022 2:46 PM, MitchAlsup wrote:
>>>>>> On Wednesday, January 12, 2022 at 4:24:36 PM UTC-6, Ivan Godard wrote:
>>>>>>> On 1/12/2022 1:24 PM, MitchAlsup wrote:
>>>>>>>> On Monday, January 10, 2022 at 11:36:01 AM UTC-6, Quadibloc wrote:
>>>>>>>>> On Saturday, January 8, 2022 at 6:16:14 PM UTC-7, MitchAlsup wrote:
>>>>>>>>>
>>>>>>>>>> So, what does (13 << -7) mean ?
>>>>>>>>> From the replies, I see that this is a completely different
>>>>>>>>> can of worms. I would tend to favor the VAX interpretation,
>>>>>>>>> but it's unclear to me that it would be worth the extra
>>>>>>>>> run-time overhead that it might imply.
>>>>>>>> <
>>>>>>>> The VAX interpretation does not allow for shifts to be used as
>>>>>>>> bit-manipulation instructions (extract signed and unsigned).
>>>>>> <
>>>>>>> That turns into a rotate and a shft right. But isn't that what a
>>>>>>> hardware EXTR is going to do anyway?
>>>>>> <
>>>>>> Technically an extract is:
>>>>>> <
>>>>>> r = ((c << (containerWidth - fieldWidth - offset)) >>
>>>>>> (containerWidth-fieldWidth));
>>>>>> <
>>>>>> However, what we do in HW is:
>>>>>> <
>>>>>> m = tableM[fieldWidth]; // mask m:: often done
>>>>>> with table
>>>>>> s = ~0u << (fieldWidth+offset); // sign extension bits ::
>>>>>> often done with table
>>>>>> t = (c >> offset) & m | ( signed || (c & ( 1 <<
>>>>>> (fieldWidth+offset)) ? s : 0);
>>>>>> <
>>>>>> as this uses 1 shifter and either tables or greater-than decoders.
>>>>>> <
>>>>>> But the part I was hinting at is that we have a large container for
>>>>>> this shift
>>>>>> count and we need only a few bits on one end. So, in My 66000, the
>>>>>> upper ½
>>>>>> of the container is allowed to contain a value 0..64 (where both 0 and 64
>>>>>> mean 64-bit field width).
>>>>>> <
>>>>>> Done this way, shifts are degenerate subsets of EXT.
>>>>>> <
>>>>>> If you use the sign bit<63> to control shift direction, you probably
>>>>>> should not
>>>>>> be using bits<37:32> as the field width because pasting the fieldWidth
>>>>>> into
>>>>>> such a container is simply harder.
>>>>>> <
>>>>>> I learned the hard way about placing both containers too close together
>>>>>> in the Mc 88100.
>>>>>
>>>>> Hmm. So the container is a field descriptor. But it's not actually a
>>>>> bitRow, because it can't cross word boundaries - more like a PDP-10 PLT
>>>>> IIRC. Gives you some encoding entropy, but not actually any new
>>>>> semantic, and you have to build the descriptor, which you avoid when
>>>>> offset and width are separate arguments.
>>>>>
>>>>> Where ISAs really fall down is parsing a bit stream: grab a dynamic
>>>>> number of bits off the front of a bit stream, advancing the stream; word
>>>>> boundaries are not significant. The problem is that HW provides word
>>>>> streams (loop/load) and mapping that to a bit stream is nasty. The logic
>>>>> is the same as mapping a line stream into a (VL) instruction stream in
>>>>> the decoder's instruction buffer, but how to represent that in an ISA?
>>>> We did use to have something like that back in the PDP days with the
>>>> variable-sized load byte opcode that would automatically step forward to
>>>> the next work if needed.
>>>>
>>>> Today we would need a separate hw mechanism for that bitstream buffer,
>>>> and probably a filling mechanism which either would need to be
>>>> hardware-only or possibly a FILL_IF_ROOM target,offs,[src] opcode that
>>>> would load the next (8/16/32 bits into the target register at offs bit
>>>> offset, updating the OFFS reg and SRC pointer, but only if there was
>>>> room, otherwise it is a NOP. The main problem is the need to update all
>>>> three register operands. :-(
>>>>
>>>> Having it would however make it far easier to handle arbitrary bit
>>>> streams, avoiding the current need for branchy code.
>>> <
>>> What is wrong with having a container 2× as large as the largest bit-field
>>> then use a shift-double to position the bits at an appropriate
>>> place for extraction and then decoding/encoding.
>> What's wrong is the need to conditionally refill and then merge the
>> loaded value into the shift pair. The condition is totally
>> unpredictable, so you get a miss every <word size>/<average request
>> size> cycles. We have predicated loads and isomorphic shifts, but have
>> only enough slots to do a reasonable job on a Gold.
>>
>> Try it on my66: input is an array of request bit-sizes, an input array
>> of bits, an output array of words, and a count. Unpack the consecutive
>> bit-size fields from the bits into the words.
>>
>> Have fun.
> <
> I came up with this in 5 minutes::
> This assumes the input bit-length selector is an vector of characters and that the
> chars contain values from {1..64}
> <
> void unpack( uchar_t size[], uint64_t packed[], uint64_t unpacked[], uint64_t count )
> {
> uint64_t len,
> bit=0,
> word=0,
> extract,
> container1 = packed[0],
> container2 = packed[1];
>
> for( unsigned int i = 0; i < count; i++ )
> {
> len = size[i];
> bit += len;
> extract = ( len << 32 ) | ( bit & 0x3F );
> if( word != bit >> 6 )
> {
> container1 = container2;
> container2 = packed[++word];
> }
> unpacked[i] = {container2, container1} >> extract;
> }
> }
> <
> This translates into pretty nice My 66000 ISA:
> <
> ENTRY unpack
> unpack:
> MOV R5,#0
> MOV R6,#0
> LDD R7,[R2]
> LDD R8,[R2+8]
> MOV R9,#0
> loop:
> LDUB R10,[R1+R9]
> ADD R5,R5,R10
> AND R11,R5,#63
> SL R12,R10,#32
> OR R11,R11,R12
> SR R12,R6,#6
> CMP R11,R6,R12
> PEQ R11,{111}
> ADD R6,R6,#1
> MOV R7,R8
> LDD R8,[R2+R6<<3]
> CARRY R8,{{I}}
> SL R12,R7,R11
> STD R12,[R3+R9<<3]
> ADD R9,R9,#1
> CMP R11,R9,R4
> BLT R11,loop
> RET
> <
> Well at least straightforwardly.
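
For reference, the quoted pseudocode can be rendered as portable C roughly as follows. This is only a sketch under assumptions that are not in the quoted post: fields are packed LSB-first in little-endian 64-bit words, every size is in 1..64, and the name unpack_ref is made up here. It replaces the {container2, container1} double-width shift with an explicit two-word merge, so it still has the data-dependent branch that Ivan objected to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reference rendering of the quoted unpack(). The name and the bit
   ordering (LSB-first, little-endian words) are assumptions. */
static void unpack_ref(const uint8_t size[], const uint64_t packed[],
                       uint64_t unpacked[], size_t count)
{
    uint64_t bit = 0;                         /* bit position of the next field */

    for (size_t i = 0; i < count; i++) {
        unsigned len = size[i];
        assert(len >= 1 && len <= 64);

        uint64_t word = bit >> 6;             /* which 64-bit word */
        unsigned off  = (unsigned)(bit & 63); /* bit offset within that word */

        uint64_t v = packed[word] >> off;
        if (off + len > 64)                   /* field straddles two words */
            v |= packed[word + 1] << (64 - off);
        if (len < 64)
            v &= (UINT64_C(1) << len) - 1;    /* zero the bits above the field */

        unpacked[i] = v;
        bit += len;
    }
}
```

The second word is read only when a field actually crosses a word boundary, so the sketch never touches memory past the last field.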

If Terje is right, and he almost always is, it is worth trying to come
up with a better solution for this type of problem. So, as a start, I
came up with what follows. This certainly isn’t the final solution. It
is intended to start a discussion on better ways to do this. And the
usual disclaimer, IANAHG, so this is from a software perspective. But I
did try to fit it “in the spirit” of the My 66000, and it takes
advantage of that design’s unique capabilities.

The idea is to add one new instruction, which typically would be in the
shadow of a preceding Carry meta instruction. I called the new
instruction Load Bit Field (LBF).

It is a two source, one result instruction, but uses the carry register
for an additional source and destination. The syntax is

LBF Result register, field length (in bits), buffer starting address
(in bytes)

The carry register contains the offset, in bits, from the start of the
buffer where the desired field starts.

The instruction computes the start of the desired field as follows: the
carry register with its low order three bits dropped (i.e. the bit offset
divided by eight) is added to the buffer starting address to get the
starting byte, and the low order three bits give the starting bit within
that byte. The instruction extracts the field, starting at the computed
bit address with length as given in the register specified in the
instruction, and right justifies that field in the result register. The
higher order bits in the result register are set to zero. If the output
bit of the Carry instruction is set, the length value is added to the
Carry register.
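
To make the intended semantics concrete, here is a rough software model of LBF in C. It is a sketch, not a hardware description: the function name lbf_model, the carry_out flag standing in for the output bit of the Carry instruction, and the LSB-first bit numbering within each byte are all assumptions of mine.

```c
#include <assert.h>
#include <stdint.h>

/* Software model of the proposed LBF. buf is the buffer start (bytes),
   *bit_offset plays the role of the carry register, len is the field
   length in bits. LSB-first bit numbering is an assumption. */
static uint64_t lbf_model(const uint8_t *buf, uint64_t *bit_offset,
                          unsigned len, int carry_out)
{
    assert(len >= 1 && len <= 64);

    uint64_t result = 0;                      /* upper bits stay zero */
    for (unsigned i = 0; i < len; i++) {
        uint64_t pos  = *bit_offset + i;
        uint64_t byte = pos >> 3;             /* all but the low three bits */
        unsigned bit  = (unsigned)(pos & 7);  /* the low three bits */
        result |= (uint64_t)((buf[byte] >> bit) & 1u) << i;
    }
    if (carry_out)                            /* output bit of CARRY set */
        *bit_offset += len;
    return result;
}
```

A 64-bit field that starts at bit 7 of a byte spans nine bytes, which is one way to see why a single aligned 64-bit load is not always enough.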


Re: The type of Mill's belt's slots

<ydCGJ.2495$2V.2468@fx07.iad>

https://www.novabbs.com/devel/article-flat.php?id=23021&group=comp.arch#23021

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!peer02.ams1!peer.ams1.xlned.com!news.xlned.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx07.iad.POSTED!not-for-mail
From: ThatWoul...@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: The type of Mill's belt's slots
References: <2021Dec24.163843@mips.complang.tuwien.ac.at> <sqpqbm$7qo$1@newsreader4.netcologne.de> <sqq3ce$c4n$2@dont-email.me> <sqssff$a9j$1@gioia.aioe.org> <077afaee-009e-4860-be45-61106126934bn@googlegroups.com> <squhht$79u$1@dont-email.me> <bb6d49bb-a676-44bd-9a6d-29386d429454n@googlegroups.com> <sr0vhm$c4u$1@dont-email.me> <sr114i$1qc$1@newsreader4.netcologne.de> <sr1dca$70e$1@dont-email.me> <kM%AJ.186634$np6.183460@fx46.iad> <sr2gf6$64u$1@dont-email.me> <7DpBJ.254731$3q9.63673@fx47.iad> <sr62tb$u2o$1@dont-email.me> <ss4g91$hvs$1@dont-email.me> <ss4ktr$dvv$1@dont-email.me> <UZCFJ.4610$uP.4480@fx16.iad> <bSDFJ.268310$1d1.64158@fx99.iad> <ss78bk$hm6$1@dont-email.me> <jwvo848n4ud.fsf-monnier+comp.arch@gnu.org> <ss7ila$obl$1@dont-email.me> <jwv7dawmmb7.fsf-monnier+comp.arch@gnu.org> <ss858d$m0a$1@dont-email.me> <74f730f1-e138-4124-9fbb-21f388eafeb3n@googlegroups.com> <ss9p4l$a2i$1@dont-email.me> <34d3c262-c385-4a42-a6c1-a15c34536b28n@googlegroups.com>
In-Reply-To: <34d3c262-c385-4a42-a6c1-a15c34536b28n@googlegroups.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 24
Message-ID: <ydCGJ.2495$2V.2468@fx07.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Fri, 21 Jan 2022 17:39:42 UTC
Date: Fri, 21 Jan 2022 12:38:17 -0500
X-Received-Bytes: 2782
 by: EricP - Fri, 21 Jan 2022 17:38 UTC

Quadibloc wrote:
>
> So, for it to be even _possible_ to detect a 'mismatch' should one occur,
> it sounds to me like the data has to have a label associated with it. Sort
> of like on the old Burroughs computers.
>
> When one saved and restored registers, instead of just saving their
> contents as plain data, the save would have to be a protected binary blob
> with type attributes included. Of course, the Mill doesn't have registers,
> and data spill from the Belt would for other reasons have to be protected
> from ordinary memory operations.

A full capabilities system is not necessarily required.
The internal type tag can be established by a LD_t instruction and
carried with the item until spilled. ST knows how big item is so
compiler has to ensure allocated memory matches internal type.
While in memory the type is tracked by the compiler, which it does anyway.

HW internal meta tag saves operating instructions from having to
continuously re-state what kind of interpretation to put on bits.
Only overhead is requirement for explicit instructions for type changes,
which mostly happen anyway.

Re: The type of Mill's belt's slots

<ssero6$kae$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23022&group=comp.arch#23022

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: iva...@millcomputing.com (Ivan Godard)
Newsgroups: comp.arch
Subject: Re: The type of Mill's belt's slots
Date: Fri, 21 Jan 2022 09:49:57 -0800
Organization: A noiseless patient Spider
Lines: 105
Message-ID: <ssero6$kae$1@dont-email.me>
References: <2021Dec24.163843@mips.complang.tuwien.ac.at>
<sqpqbm$7qo$1@newsreader4.netcologne.de> <sqq3ce$c4n$2@dont-email.me>
<sqssff$a9j$1@gioia.aioe.org>
<077afaee-009e-4860-be45-61106126934bn@googlegroups.com>
<squhht$79u$1@dont-email.me>
<bb6d49bb-a676-44bd-9a6d-29386d429454n@googlegroups.com>
<sr0vhm$c4u$1@dont-email.me> <sr114i$1qc$1@newsreader4.netcologne.de>
<sr1dca$70e$1@dont-email.me> <kM%AJ.186634$np6.183460@fx46.iad>
<sr2gf6$64u$1@dont-email.me> <7DpBJ.254731$3q9.63673@fx47.iad>
<sr62tb$u2o$1@dont-email.me> <ss4g91$hvs$1@dont-email.me>
<ss4ktr$dvv$1@dont-email.me> <UZCFJ.4610$uP.4480@fx16.iad>
<bSDFJ.268310$1d1.64158@fx99.iad> <ss78bk$hm6$1@dont-email.me>
<jwvo848n4ud.fsf-monnier+comp.arch@gnu.org> <ss7ila$obl$1@dont-email.me>
<jwv7dawmmb7.fsf-monnier+comp.arch@gnu.org> <ss858d$m0a$1@dont-email.me>
<74f730f1-e138-4124-9fbb-21f388eafeb3n@googlegroups.com>
<ss9p4l$a2i$1@dont-email.me>
<34d3c262-c385-4a42-a6c1-a15c34536b28n@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Fri, 21 Jan 2022 17:49:58 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="bd152fa2a5fde99057cfa2e5aed93b67";
logging-data="20814"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+g94RnJs5uW/Urxp7eOCtz"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.5.0
Cancel-Lock: sha1:rsqO/NhMbqmAryNJV3hFxyTyoPo=
In-Reply-To: <34d3c262-c385-4a42-a6c1-a15c34536b28n@googlegroups.com>
Content-Language: en-US
 by: Ivan Godard - Fri, 21 Jan 2022 17:49 UTC

On 1/20/2022 5:26 PM, Quadibloc wrote:
> On Wednesday, January 19, 2022 at 12:34:49 PM UTC-7, Ivan Godard wrote:
>> On 1/19/2022 9:46 AM, MitchAlsup wrote:
>>> On Tuesday, January 18, 2022 at 10:49:20 PM UTC-6, Ivan Godard wrote:
>>>> On 1/18/2022 7:35 PM, Stefan Monnier wrote:
>
>>>>>> It is both kept with the data and projected by the decoder/mapper. Execution
>>>>>> follows the projected width; the actual (from the data) is matched against
>>>>>> the expected and faults on a mismatch.
>>>>>
>>>>> Hmm... how/when would such a mismatch occur?
>>>>> Do you have instructions whose output width is data-dependent?
>>>>> Where is it useful/used? Do you then have instructions to dynamically
>>>>> test the width of some belt slot?
>>>>>
>>>>> Dynamically typed machine language?
>
>>>> Width mismatch can arise from software (specializer bug) or hardware
>>>> (gamma ray) error, or from an attack that generates or modifies
>>>> binaries, before or after execution begins. The Mill design cares *very*
>>>> much about RAS issues and sanity-checks everything it can.
>>>>
>>>> Most instructions that have results have data-dependent output widths.
>>>> Usually the result width is the same as the argument widths, but all
>>>> instructions that can overflow (including nearly all arithmetic
>>>> instructions) have a variant that produces a double-width result.
>>>> Obviously the WIDEN and NARROW instructions change the width, and there
>>>> are a few odd-balls.
>>> <
>>> Why do you consider this (previous paragraph) better than the RISC model of
>>> "everything becomes register size when it is loaded" ?
>
>> We wanted to do auto-SIMD, so scalar and lane semantics should be the
>> same. Once we decided to expose integer overflow (as opposed to the
>> legacy/RISC letting the user handle the boojums), RAS reasons led to
>> doing calculations in the declared types.
>
> I'll have to admit that reading this discussion was confusing for me.
>
> I'm used to thinking in terms of conventional computer architectures,
> like that of the IBM System/360. In such an architecture, the width
> of data depends on the instruction being executed. There can't be a
> 'mismatch', although there certainly can be code that doesn't make sense
> because it operates on the same bits in memory as though they belong
> to data of multiple different types.
>
> So, for it to be even _possible_ to detect a 'mismatch' should one occur,
> it sounds to me like the data has to have a label associated with it. Sort
> of like on the old Burroughs computers.
>
> When one saved and restored registers, instead of just saving their
> contents as plain data, the save would have to be a protected binary blob
> with type attributes included. Of course, the Mill doesn't have registers,
> and data spill from the Belt would for other reasons have to be protected
> from ordinary memory operations.
>
> While I think security is a good thing, I tend to approve of the traditional
> practice of avoiding any measures that would appear to have a significant
> overhead. Computers are for setting new LINPACK records, and security
> is about keeping the wrong people's grubby fingers away - on the assumption
> that you can build a secure wall around a computer that on the inside, where
> it's actually cranking out work, has almost no security features present...
> and not get totally owned for such foolishness.
>
> This may be misguided for some applications and threat models, but I tend
> to presume it's a specialized area of research, outside the 'mainstream' of
> computer design.
>
> John Savard

Any ISA that supports multiple data widths must have some way to tell
the FUs what width they should use. Historically there have been several
different ways chosen to do this. Widths have been encoded in the
instruction opcode; in the register names used by the instruction; as
additional arguments in the instruction encoding; in the data; and in
combinations of these.

Width is a static attribute in all ISAs. That is, if you are given the
arguments of the instruction and the static producer-consumer
relationships of the data inputs then you statically know the widths of
the results. There are no instructions that produce a value of unknown
or random width. An ISA can take advantage of this to reduce its
encoding entropy: track the producer-consumer history at runtime, and
only use explicit width indications (of any form) for instructions with
no history, i.e. pure sources.

That is what the Mill encoding does: pure sources (loads, constants,
streamers) have explicit width arguments in the asm and the encoding,
while all other instructions take the width as tracked through the
producer-consumer relationships of the program as executed. If the
program is correct then all FUs will get the correct width indications
with each instruction. This saves 3 bits per instruction kind that has
results, or 18% in a typical Mill member encoding.

Unfortunately programs are never correct. As a sanity check, Mill also
encodes width with each datum, which is checked against the projected
width by the FU. The cost is three bits per datum, or a little less than
5%. I agree that the sanity check is not worth building a whole metadata
system for, but Mill already has metadata attached to every datum for
other reasons and it's trivial to extend it for this check.
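
As a toy illustration of the check (not the actual Mill metadata format), the following C sketch models a datum that carries a small width tag, and an operation that compares that tag against the projected width before executing. The type datum_t, the tag encoding (log2 of the width in bytes, limited here to 0..3), the function add_checked, and the wrap-around arithmetic are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical datum: payload plus a width tag (log2 of bytes, 0..3). */
typedef struct {
    uint64_t bits;
    uint8_t  width_tag;
} datum_t;

/* Add at the projected width. Returns false (a "fault") when the width
   carried by either operand disagrees with the projected width. */
static bool add_checked(datum_t a, datum_t b, uint8_t projected_tag,
                        datum_t *out)
{
    assert(projected_tag <= 3);
    if (a.width_tag != projected_tag || b.width_tag != projected_tag)
        return false;

    unsigned nbits = 8u << projected_tag;     /* 8, 16, 32 or 64 */
    uint64_t mask  = nbits == 64 ? ~UINT64_C(0)
                                 : (UINT64_C(1) << nbits) - 1;
    out->bits      = (a.bits + b.bits) & mask; /* wraps at the width */
    out->width_tag = projected_tag;            /* result keeps the width */
    return true;
}
```

The comparison is the whole sanity check; everything else is the operation that would have run anyway.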

One thing in your post I must disagree strongly with: the purpose of
computers is *not* to set LINPACK records. The view that it is their
purpose is in large part the source of the mess that computers have made
of our everyday lives.

Re: RISC-V vs. Aarch64

<9d652029-a997-4bec-9182-769002a6bd1dn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=23023&group=comp.arch#23023

X-Received: by 2002:a05:620a:1988:: with SMTP id bm8mr3587045qkb.494.1642787597193;
Fri, 21 Jan 2022 09:53:17 -0800 (PST)
X-Received: by 2002:a9d:24:: with SMTP id 33mr3612454ota.85.1642787596883;
Fri, 21 Jan 2022 09:53:16 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Fri, 21 Jan 2022 09:53:16 -0800 (PST)
In-Reply-To: <ssep4k$vke$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:155a:6773:152c:fbf3;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:155a:6773:152c:fbf3
References: <2021Dec24.163843@mips.complang.tuwien.ac.at> <86h7agvxun.fsf@linuxsc.com>
<M4_BJ.140002$lz3.547@fx34.iad> <f91f3db8-640e-4c10-b0f7-61c7085b70c8n@googlegroups.com>
<srag0i$2ed$2@dont-email.me> <00add816-93d7-4763-a68b-33a67db6d770n@googlegroups.com>
<2022Jan8.101413@mips.complang.tuwien.ac.at> <7557bf3a-61ce-4500-8cf8-ced2dbed7087n@googlegroups.com>
<ad2ee700-b604-4565-9e24-3386580b90c8n@googlegroups.com> <4d2fbc82-af69-4388-bfa5-e3b2be652744n@googlegroups.com>
<2e706405-006a-49bb-8e8a-f634d749205en@googlegroups.com> <570acc73-a5da-497f-8ec4-810150e0a9f1n@googlegroups.com>
<850b7681-204a-4df6-9095-cd6ee816a7d5n@googlegroups.com> <5ea00397-5572-4fbd-bfb3-85c3554f1eb9n@googlegroups.com>
<srnkf0$4cb$1@dont-email.me> <b4e98991-4fb9-4ef7-a831-430c3fc10145n@googlegroups.com>
<srp3n0$e0a$1@dont-email.me> <srp4lu$hhg$1@gioia.aioe.org>
<81c0ddc6-4b46-4b3d-b64b-e65963889214n@googlegroups.com> <srpjgj$541$1@dont-email.me>
<1e0b0ba3-e11c-4a14-a0c7-7c074f2f9ba7n@googlegroups.com> <ssep4k$vke$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <9d652029-a997-4bec-9182-769002a6bd1dn@googlegroups.com>
Subject: Re: RISC-V vs. Aarch64
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Fri, 21 Jan 2022 17:53:17 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Lines: 143
 by: MitchAlsup - Fri, 21 Jan 2022 17:53 UTC

On Friday, January 21, 2022 at 11:05:28 AM UTC-6, Stephen Fuld wrote:
> On 1/13/2022 9:53 AM, MitchAlsup wrote:

> > I came up with this in 5 minutes::
> > This assumes the input bit-length selector is an vector of characters and that the
> > chars contain values from {1..64}
> > <
> > void unpack( uchar_t size[], uint64_t packed[], uint64_t unpacked[], uint64_t count )
> > {
> > uint64_t len,
> > bit=0,
> > word=0,
> > extract,
> > container1 = packed[0],
> > container2 = packed[1];
> >
> > for( unsigned int i = 0; i < count; i++ )
> > {
> > len = size[i];
> > bit += len;
> > extract = ( len << 32 ) | ( bit & 0x3F );
> > if( word != bit >> 6 )
> > {
> > container1 = container2;
> > container2 = packed[++word];
> > }
> > unpacked[i] = {container2, container1} >> extract;
> > }
> > }
> > <
> > This translates into pretty nice My 66000 ISA:
> > <
> > ENTRY unpack
> > unpack:
> > MOV R5,#0
> > MOV R6,#0
> > LDD R7,[R2]
> > LDD R8,[R2+8]
> > MOV R9,#0
> > loop:
> > LDUB R10,[R1+R9]
> > ADD R5,R5,R10
> > AND R11,R5,#63
> > SL R12,R10,#32
> > OR R11,R11,R12
> > SR R12,R6,#6
> > CMP R11,R6,R12
> > PEQ R11,{111}
> > ADD R6,R6,#1
> > MOV R7,R8
> > LDD R8,[R2+R6<<3]
> > CARRY R8,{{I}}
> > SL R12,R7,R11
> > STD R12,[R3+R9<<3]
> > ADD R9,R9,#1
> > CMP R11,R9,R4
> > BLT R11,loop
> > RET
> > <
> > Well at least straightforwardly.
>
>
> If Terje is right, and he almost always is, it is worth trying to come
> up with a better solution for this type of problem. So, as a start, I
> came up with what follows. This certainly isn’t the final solution. It
> is intended to start a discussion on better ways to do this. And the
> usual disclaimer, IANAHG, so this is from a software perspective. But I
> did try to fit it “in the spirit” of the MY 66000, and it takes
> advantages of that design’s unique capabilities.
>
> The idea is to add one new instruction, which typically would be in the
> shadow of a preceding Carry meta instruction. I called the new
> instruction Load Bit Field (LBF).
>
> It is a two source, one result instruction, but uses the carry register
> for an additional source and destination. The syntax is
>
> LBF Result register, field length (in bits), buffer starting address
> (in bytes)
>
> The carry register contains the offset, in bits, from the start of the
> buffer where the desired field starts.
>
> The instruction computes the start of the desired field by adding the
> high order all but three bits of the carry register to get the starting
> byte number, then uses the low order three bits to get the starting bit
> number. The instruction extracts the field, starting at the computed
> bit address with length as given in the register specified in the
> register, and right justifies that field in the result register. The
> higher order bits in the result register are set to zero. If the output
> bit of the Carry instruction is set, the length value is added to the
> Carry register.
<
A bit more on the CISC side than desired (most of the time)--3
exceptions possible, 2 memory accesses. Also note, my original
solution can produce signed or unsigned output stream. This is
going to take 2 cycles in AGEN, and 2 result register writes.
>
> In order to speed up this instruction, and given that it will frequently
> occur in a fairly tight loop, I think (hope) that the hardware can take
> advantage of the “streaming” buffers otherwise used for VVM operations.
> Anyway, if one had this instruction, the main loop in the code above
> could be something like
>
>
> loop:
> LDUB R10,[R1+R9]
> CARRY R6,IO
> LBF R12,R10,R2 ;I am not sure about R2, It should be the start of
> the packed buffer.
> STD R12,[R3+R9<<3]
> ADD R9,R9,#1
> CMP R11,R9,R4
> BLT R11,loop
>
> For a savings of about 10 instructions in the I cache, but fewer in
> execution (but still significant) depending upon how often the
> instructions under the predicate are executed.
>
I have to admit, this looks fairly juicy--just have to plow my way
through and see what comes out.
>
> Anyway, Of course, I invite comments, criticisms, etc. One obvious
> drawback is that this only addresses the "decompression" side. While I
> briefly considered a "Store Bit Field", I discarded it as it seemed too
> complex, and presumably would used less frequently, as
> compression/coding happens less frequently than decompression/decoding.
> --
> - Stephen Fuld
> (e-mail address disguised to prevent spam)

Re: RISC-V vs. Aarch64

<sset5a$v8n$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23025&group=comp.arch#23025

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: sfu...@alumni.cmu.edu.invalid (Stephen Fuld)
Newsgroups: comp.arch
Subject: Re: RISC-V vs. Aarch64
Date: Fri, 21 Jan 2022 10:14:00 -0800
Organization: A noiseless patient Spider
Lines: 160
Message-ID: <sset5a$v8n$1@dont-email.me>
References: <2021Dec24.163843@mips.complang.tuwien.ac.at>
<srag0i$2ed$2@dont-email.me>
<00add816-93d7-4763-a68b-33a67db6d770n@googlegroups.com>
<2022Jan8.101413@mips.complang.tuwien.ac.at>
<7557bf3a-61ce-4500-8cf8-ced2dbed7087n@googlegroups.com>
<ad2ee700-b604-4565-9e24-3386580b90c8n@googlegroups.com>
<4d2fbc82-af69-4388-bfa5-e3b2be652744n@googlegroups.com>
<2e706405-006a-49bb-8e8a-f634d749205en@googlegroups.com>
<570acc73-a5da-497f-8ec4-810150e0a9f1n@googlegroups.com>
<850b7681-204a-4df6-9095-cd6ee816a7d5n@googlegroups.com>
<5ea00397-5572-4fbd-bfb3-85c3554f1eb9n@googlegroups.com>
<srnkf0$4cb$1@dont-email.me>
<b4e98991-4fb9-4ef7-a831-430c3fc10145n@googlegroups.com>
<srp3n0$e0a$1@dont-email.me> <srp4lu$hhg$1@gioia.aioe.org>
<81c0ddc6-4b46-4b3d-b64b-e65963889214n@googlegroups.com>
<srpjgj$541$1@dont-email.me>
<1e0b0ba3-e11c-4a14-a0c7-7c074f2f9ba7n@googlegroups.com>
<ssep4k$vke$1@dont-email.me>
<9d652029-a997-4bec-9182-769002a6bd1dn@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Fri, 21 Jan 2022 18:14:03 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="df1c2bb0508560b1482abd0cd87e42b2";
logging-data="32023"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18P06orh24OY6fVKMNFbiZDz1S3oDrLwxg="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.5.0
Cancel-Lock: sha1:eKCSZAlVhYjWiYJGc/rGSsz7WZs=
In-Reply-To: <9d652029-a997-4bec-9182-769002a6bd1dn@googlegroups.com>
Content-Language: en-US
 by: Stephen Fuld - Fri, 21 Jan 2022 18:14 UTC

On 1/21/2022 9:53 AM, MitchAlsup wrote:
> On Friday, January 21, 2022 at 11:05:28 AM UTC-6, Stephen Fuld wrote:
>> On 1/13/2022 9:53 AM, MitchAlsup wrote:
>
>>> I came up with this in 5 minutes::
>>> This assumes the input bit-length selector is an vector of characters and that the
>>> chars contain values from {1..64}
>>> <
>>> void unpack( uchar_t size[], uint64_t packed[], uint64_t unpacked[], uint64_t count )
>>> {
>>> uint64_t len,
>>> bit=0,
>>> word=0,
>>> extract,
>>> container1 = packed[0],
>>> container2 = packed[1];
>>>
>>> for( unsigned int i = 0; i < count; i++ )
>>> {
>>> len = size[i];
>>> bit += len;
>>> extract = ( len << 32 ) | ( bit & 0x3F );
>>> if( word != bit >> 6 )
>>> {
>>> container1 = container2;
>>> container2 = packed[++word];
>>> }
>>> unpacked[i] = {container2, container1} >> extract;
>>> }
>>> }
>>> <
>>> This translates into pretty nice My 66000 ISA:
>>> <
>>> ENTRY unpack
>>> unpack:
>>> MOV R5,#0
>>> MOV R6,#0
>>> LDD R7,[R2]
>>> LDD R8,[R2+8]
>>> MOV R9,#0
>>> loop:
>>> LDUB R10,[R1+R9]
>>> ADD R5,R5,R10
>>> AND R11,R5,#63
>>> SL R12,R10,#32
>>> OR R11,R11,R12
>>> SR R12,R6,#6
>>> CMP R11,R6,R12
>>> PEQ R11,{111}
>>> ADD R6,R6,#1
>>> MOV R7,R8
>>> LDD R8,[R2+R6<<3]
>>> CARRY R8,{{I}}
>>> SL R12,R7,R11
>>> STD R12,[R3+R9<<3]
>>> ADD R9,R9,#1
>>> CMP R11,R9,R4
>>> BLT R11,loop
>>> RET
>>> <
>>> Well at least straightforwardly.
>>
>>
>> If Terje is right, and he almost always is, it is worth trying to come
>> up with a better solution for this type of problem. So, as a start, I
>> came up with what follows. This certainly isn’t the final solution. It
>> is intended to start a discussion on better ways to do this. And the
>> usual disclaimer, IANAHG, so this is from a software perspective. But I
>> did try to fit it “in the spirit” of the MY 66000, and it takes
>> advantages of that design’s unique capabilities.
>>
>> The idea is to add one new instruction, which typically would be in the
>> shadow of a preceding Carry meta instruction. I called the new
>> instruction Load Bit Field (LBF).
>>
>> It is a two source, one result instruction, but uses the carry register
>> for an additional source and destination. The syntax is
>>
>> LBF Result register, field length (in bits), buffer starting address
>> (in bytes)
>>
>> The carry register contains the offset, in bits, from the start of the
>> buffer where the desired field starts.
>>
>> The instruction computes the start of the desired field by adding the
>> high order all but three bits of the carry register to get the starting
>> byte number, then uses the low order three bits to get the starting bit
>> number. The instruction extracts the field, starting at the computed
>> bit address with length as given in the register specified in the
>> register, and right justifies that field in the result register. The
>> higher order bits in the result register are set to zero. If the output
>> bit of the Carry instruction is set, the length value is added to the
>> Carry register.
> <
> A bit more on the CISC side than desired

I understand.

> (most of the time)--3
> exceptions possible, 2 memory accesses.

Yes, but I justified the two memory accesses in my mind by thinking it
was no worse than a load multiple instruction.

> Also note, my original
> solution can produce signed or unsigned output stream.

True. Would the ability to produce signed output be used? Genuine
question - I don't know the answer. If it was important, I suppose you
could use an additional op code or if available a spare bit in the LBF
instruction.
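
For what it's worth, in software the signed case costs only a couple of extra operations on top of the zero-extended result. A minimal C sketch, assuming the field arrives right justified with zeros above it (the name sign_extend_field is made up):

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend a right-justified, zero-padded field of len bits.
   Returns the two's complement bit pattern in a uint64_t. */
static uint64_t sign_extend_field(uint64_t field, unsigned len)
{
    assert(len >= 1 && len <= 64);
    uint64_t sign = UINT64_C(1) << (len - 1); /* position of the sign bit */
    return (field ^ sign) - sign;             /* copies it into the upper bits */
}
```

The arithmetic is done on unsigned values, so it avoids the implementation-defined right shift of negative numbers in C.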

> This is
> going to take 2 cycles in AGEN,

I believe you, but why? Unless you are saying crossing a 64 bit memory
boundary would require a second AGEN?

> and 2 result register writes.

Yes. The same as any instruction operating under the shadow of a CARRY
meta instruction with the O bit set in the mask.

>> In order to speed up this instruction, and given that it will frequently
>> occur in a fairly tight loop, I think (hope) that the hardware can take
>> advantage of the “streaming” buffers otherwise used for VVM operations.
>> Anyway, if one had this instruction, the main loop in the code above
>> could be something like
>>
>>
>> loop:
>> LDUB R10,[R1+R9]
>> CARRY R6,IO
>> LBF R12,R10,R2 ;I am not sure about R2, It should be the start of
>> the packed buffer.
>> STD R12,[R3+R9<<3]
>> ADD R9,R9,#1
>> CMP R11,R9,R4
>> BLT R11,loop
>>
>> For a savings of about 10 instructions in the I cache, but fewer in
>> execution (but still significant) depending upon how often the
>> instructions under the predicate are executed.
>>
> I have to admit, this looks fairly juicy--just have to plow my way
> through and see what comes out.

:-)

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Re: RISC-V vs. Aarch64

<dc78242b-1bda-45d7-ae78-893121125b92n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=23026&group=comp.arch#23026

X-Received: by 2002:a05:6214:4109:: with SMTP id kc9mr5231261qvb.59.1642795769208;
Fri, 21 Jan 2022 12:09:29 -0800 (PST)
X-Received: by 2002:a05:6808:1248:: with SMTP id o8mr1881483oiv.157.1642795768838;
Fri, 21 Jan 2022 12:09:28 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Fri, 21 Jan 2022 12:09:28 -0800 (PST)
In-Reply-To: <sset5a$v8n$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:155a:6773:152c:fbf3;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:155a:6773:152c:fbf3
References: <2021Dec24.163843@mips.complang.tuwien.ac.at> <srag0i$2ed$2@dont-email.me>
<00add816-93d7-4763-a68b-33a67db6d770n@googlegroups.com> <2022Jan8.101413@mips.complang.tuwien.ac.at>
<7557bf3a-61ce-4500-8cf8-ced2dbed7087n@googlegroups.com> <ad2ee700-b604-4565-9e24-3386580b90c8n@googlegroups.com>
<4d2fbc82-af69-4388-bfa5-e3b2be652744n@googlegroups.com> <2e706405-006a-49bb-8e8a-f634d749205en@googlegroups.com>
<570acc73-a5da-497f-8ec4-810150e0a9f1n@googlegroups.com> <850b7681-204a-4df6-9095-cd6ee816a7d5n@googlegroups.com>
<5ea00397-5572-4fbd-bfb3-85c3554f1eb9n@googlegroups.com> <srnkf0$4cb$1@dont-email.me>
<b4e98991-4fb9-4ef7-a831-430c3fc10145n@googlegroups.com> <srp3n0$e0a$1@dont-email.me>
<srp4lu$hhg$1@gioia.aioe.org> <81c0ddc6-4b46-4b3d-b64b-e65963889214n@googlegroups.com>
<srpjgj$541$1@dont-email.me> <1e0b0ba3-e11c-4a14-a0c7-7c074f2f9ba7n@googlegroups.com>
<ssep4k$vke$1@dont-email.me> <9d652029-a997-4bec-9182-769002a6bd1dn@googlegroups.com>
<sset5a$v8n$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <dc78242b-1bda-45d7-ae78-893121125b92n@googlegroups.com>
Subject: Re: RISC-V vs. Aarch64
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Fri, 21 Jan 2022 20:09:29 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Lines: 183
 by: MitchAlsup - Fri, 21 Jan 2022 20:09 UTC

On Friday, January 21, 2022 at 12:14:06 PM UTC-6, Stephen Fuld wrote:
> On 1/21/2022 9:53 AM, MitchAlsup wrote:
> > On Friday, January 21, 2022 at 11:05:28 AM UTC-6, Stephen Fuld wrote:
> >> On 1/13/2022 9:53 AM, MitchAlsup wrote:
> >
> >>> I came up with this in 5 minutes::
> >>> This assumes the input bit-length selector is a vector of characters and that the
> >>> chars contain values from {1..64}
> >>> <
> >>> void unpack( uchar_t size[], uint64_t packed[], uint64_t unpacked[], uint64_t count )
> >>> {
> >>> uint64_t len,
> >>> bit=0,
> >>> word=0,
> >>> extract,
> >>> container1 = packed[0],
> >>> container2 = packed[1];
> >>>
> >>> for( unsigned int i = 0; i < count; i++ )
> >>> {
> >>> len = size[i];
> >>> bit += len;
> >>> extract = ( len << 32 ) | ( bit & 0x3F );
> >>> if( word != bit >> 6 )
> >>> {
> >>> container1 = container2;
> >>> container2 = packed[++word];
> >>> }
> >>> unpacked[i] = {container2, container1} >> extract;
> >>> }
> >>> }
> >>> <
> >>> This translates into pretty nice My 66000 ISA:
> >>> <
> >>> ENTRY unpack
> >>> unpack:
> >>> MOV R5,#0
> >>> MOV R6,#0
> >>> LDD R7,[R2]
> >>> LDD R8,[R2+8]
> >>> MOV R9,#0
> >>> loop:
> >>> LDUB R10,[R1+R9]
> >>> ADD R5,R5,R10
> >>> AND R11,R5,#63
> >>> SL R12,R10,#32
> >>> OR R11,R11,R12
> >>> SR R12,R6,#6
> >>> CMP R11,R6,R12
> >>> PEQ R11,{111}
> >>> ADD R6,R6,#1
> >>> MOV R7,R8
> >>> LDD R8,[R2+R6<<3]
> >>> CARRY R8,{{I}}
> >>> SL R12,R7,R11
> >>> STD R12,[R3+R9<<3]
> >>> ADD R9,R9,#1
> >>> CMP R11,R9,R4
> >>> BLT R11,loop
> >>> RET
> >>> <
> >>> Well at least straightforwardly.
> >>
> >>
> >> If Terje is right, and he almost always is, it is worth trying to come
> >> up with a better solution for this type of problem. So, as a start, I
> >> came up with what follows. This certainly isn’t the final solution. It
> >> is intended to start a discussion on better ways to do this. And the
> >> usual disclaimer, IANAHG, so this is from a software perspective. But I
> >> did try to fit it “in the spirit” of the MY 66000, and it takes
> >> advantage of that design’s unique capabilities.
> >>
> >> The idea is to add one new instruction, which typically would be in the
> >> shadow of a preceding Carry meta instruction. I called the new
> >> instruction Load Bit Field (LBF).
> >>
> >> It is a two source, one result instruction, but uses the carry register
> >> for an additional source and destination. The syntax is
> >>
> >> LBF Result register, field length (in bits), buffer starting address
> >> (in bytes)
> >>
> >> The carry register contains the offset, in bits, from the start of the
> >> buffer where the desired field starts.
> >>
> >> The instruction computes the start of the desired field by adding the
> >> high order all but three bits of the carry register to get the starting
> >> byte number, then uses the low order three bits to get the starting bit
> >> number. The instruction extracts the field, starting at the computed
> >> bit address with length as given in the register specified in the
> >> instruction, and right justifies that field in the result register. The
> >> higher order bits in the result register are set to zero. If the output
> >> bit of the Carry instruction is set, the length value is added to the
> >> Carry register.
> > <
> > A bit more on the CISC side than desired
> I understand.
> > (most of the time)--3
> > exceptions possible, 2 memory accesses.
> Yes, but I justified the two memory accesses in my mind by thinking it
> was no worse than a load multiple instruction.
<
The 2 mem refs are no worse than the inherently misaligned LDs and STs.
<
> > Also note, my original
> > solution can produce signed or unsigned output stream.
> True. Would the ability to produce signed output be used? Genuine
> question - I don't know the answer. If it was important, I suppose you
> could use an additional op code or if available a spare bit in the LBF
> instruction.
<
My EXT instructions (SL) can deliver signed or unsigned results. My estimation
is that there will be at least 10% of applications that would want the BFs signed.
{Whether that should make the cut (or not) is undecided.}
<
> > This is
> > going to take 2 cycles in AGEN,
> I believe you, but why? Unless you are saying crossing a 64 bit memory
> boundary would require a second AGEN?
<
1 cycle to compute the DW address (typical AGEN)
1 cycle to add the length to the bit address--if you routed this add to a different
function unit you run out of busses (or forwarding,...) If AGEN had 2 adders
you can do both in parallel and save the cycle at fairly high cost.
{1 optional cycle if the bit field crosses a cache line boundary.}
<
> > and 2 result register writes.
> Yes. The same as any instruction operating under the shadow of a CARRY
> meta instruction with the O bit set in the mask.
> >> In order to speed up this instruction, and given that it will frequently
> >> occur in a fairly tight loop, I think (hope) that the hardware can take
> >> advantage of the “streaming” buffers otherwise used for VVM operations.
> >> Anyway, if one had this instruction, the main loop in the code above
> >> could be something like
> >>
> >>
> >> loop:
> >> LDUB R10,[R1+R9]
> >> CARRY R6,IO
> >> LBF R12,R10,R2 ;I am not sure about R2, It should be the start of
> >> the packed buffer.
> >> STD R12,[R3+R9<<3]
> >> ADD R9,R9,#1
> >> CMP R11,R9,R4
> >> BLT R11,loop
> >>
> >> For a savings of about 10 instructions in the I cache, but fewer in
> >> execution (but still significant) depending upon how often the
> >> instructions under the predicate are executed.
> >>
> > I have to admit, this looks fairly juicy--just have to plow my way
> > through and see what comes out.
> :-)
> --
> - Stephen Fuld
> (e-mail address disguised to prevent spam)
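
The quoted unpack() is close to C but relies on a {hi, lo} double-word shift that C does not have, and on a uchar_t type that is not standard. Below is a minimal runnable sketch of the same idea, under assumptions that are mine rather than the thread's: fields are packed LSB-first across little-endian 64-bit words, each length is in 1..64, and packed[] has one readable word past the word in which the last field starts. Unlike the quoted version, it extracts each field at the current bit offset and only then advances the offset, which is my reading of the intent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sketch of the quoted unpack(), with the double-word shift spelled
   out. Assumptions: LSB-first packing, lengths in 1..64, and one
   readable word in packed[] beyond the word where the last field starts. */
void unpack(const uint8_t size[], const uint64_t packed[],
            uint64_t unpacked[], size_t count)
{
    uint64_t bit  = 0;                 /* bit offset of the next field   */
    size_t   word = 0;                 /* index of container1 in packed[] */
    uint64_t container1 = packed[0];
    uint64_t container2 = packed[1];

    for (size_t i = 0; i < count; i++) {
        unsigned len = size[i];
        assert(len >= 1 && len <= 64);

        if (word != (size_t)(bit >> 6)) {   /* field starts in the next word */
            container1 = container2;
            word++;
            container2 = packed[word + 1];
        }

        unsigned off = (unsigned)(bit & 63);
        uint64_t v = container1 >> off;      /* low part of the field */
        if (off != 0)
            v |= container2 << (64 - off);   /* high part from the next word */
        if (len < 64)
            v &= (UINT64_C(1) << len) - 1;   /* keep only len bits */

        unpacked[i] = v;
        bit += len;
    }
}
```

Because each length is at most 64, the starting word can advance by at most one per field, so a single container shift per iteration is enough.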

Re: RISC-V vs. Aarch64

<ssfco0$hgh$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23027&group=comp.arch#23027
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: iva...@millcomputing.com (Ivan Godard)
Newsgroups: comp.arch
Subject: Re: RISC-V vs. Aarch64
Date: Fri, 21 Jan 2022 14:39:59 -0800
Organization: A noiseless patient Spider
Lines: 125
Message-ID: <ssfco0$hgh$1@dont-email.me>
References: <2021Dec24.163843@mips.complang.tuwien.ac.at>
<srag0i$2ed$2@dont-email.me>
<00add816-93d7-4763-a68b-33a67db6d770n@googlegroups.com>
<2022Jan8.101413@mips.complang.tuwien.ac.at>
<7557bf3a-61ce-4500-8cf8-ced2dbed7087n@googlegroups.com>
<ad2ee700-b604-4565-9e24-3386580b90c8n@googlegroups.com>
<4d2fbc82-af69-4388-bfa5-e3b2be652744n@googlegroups.com>
<2e706405-006a-49bb-8e8a-f634d749205en@googlegroups.com>
<570acc73-a5da-497f-8ec4-810150e0a9f1n@googlegroups.com>
<850b7681-204a-4df6-9095-cd6ee816a7d5n@googlegroups.com>
<5ea00397-5572-4fbd-bfb3-85c3554f1eb9n@googlegroups.com>
<srnkf0$4cb$1@dont-email.me>
<b4e98991-4fb9-4ef7-a831-430c3fc10145n@googlegroups.com>
<srp3n0$e0a$1@dont-email.me> <srp4lu$hhg$1@gioia.aioe.org>
<81c0ddc6-4b46-4b3d-b64b-e65963889214n@googlegroups.com>
<srpjgj$541$1@dont-email.me>
<1e0b0ba3-e11c-4a14-a0c7-7c074f2f9ba7n@googlegroups.com>
<ssep4k$vke$1@dont-email.me>
<9d652029-a997-4bec-9182-769002a6bd1dn@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Fri, 21 Jan 2022 22:40:00 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="bd152fa2a5fde99057cfa2e5aed93b67";
logging-data="17937"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX183fnFDLeo9VIKdqNyy4ZEG"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.5.0
Cancel-Lock: sha1:9O2mQZfqPpt7gwGWK9P9Owpsf/Y=
In-Reply-To: <9d652029-a997-4bec-9182-769002a6bd1dn@googlegroups.com>
Content-Language: en-US
 by: Ivan Godard - Fri, 21 Jan 2022 22:39 UTC

On 1/21/2022 9:53 AM, MitchAlsup wrote:
> On Friday, January 21, 2022 at 11:05:28 AM UTC-6, Stephen Fuld wrote:
>> On 1/13/2022 9:53 AM, MitchAlsup wrote:
>
>>> I came up with this in 5 minutes::
>>> This assumes the input bit-length selector is a vector of characters and that the
>>> chars contain values from {1..64}
>>> <
>>> void unpack( uchar_t size[], uint64_t packed[], uint64_t unpacked[], uint64_t count )
>>> {
>>> uint64_t len,
>>> bit=0,
>>> word=0,
>>> extract,
>>> container1 = packed[0],
>>> container2 = packed[1];
>>>
>>> for( unsigned int i = 0; i < count; i++ )
>>> {
>>> len = size[i];
>>> bit += len;
>>> extract = ( len << 32 ) | ( bit & 0x3F );
>>> if( word != bit >> 6 )
>>> {
>>> container1 = container2;
>>> container2 = packed[++word];
>>> }
>>> unpacked[i] = {container2, container1} >> extract;
>>> }
>>> }
>>> <
>>> This translates into pretty nice My 66000 ISA:
>>> <
>>> ENTRY unpack
>>> unpack:
>>> MOV R5,#0
>>> MOV R6,#0
>>> LDD R7,[R2]
>>> LDD R8,[R2+8]
>>> MOV R9,#0
>>> loop:
>>> LDUB R10,[R1+R9]
>>> ADD R5,R5,R10
>>> AND R11,R5,#63
>>> SL R12,R10,#32
>>> OR R11,R11,R12
>>> SR R12,R6,#6
>>> CMP R11,R6,R12
>>> PEQ R11,{111}
>>> ADD R6,R6,#1
>>> MOV R7,R8
>>> LDD R8,[R2+R6<<3]
>>> CARRY R8,{{I}}
>>> SL R12,R7,R11
>>> STD R12,[R3+R9<<3]
>>> ADD R9,R9,#1
>>> CMP R11,R9,R4
>>> BLT R11,loop
>>> RET
>>> <
>>> Well at least straightforwardly.
>>
>>
>> If Terje is right, and he almost always is, it is worth trying to come
>> up with a better solution for this type of problem. So, as a start, I
>> came up with what follows. This certainly isn’t the final solution. It
>> is intended to start a discussion on better ways to do this. And the
>> usual disclaimer, IANAHG, so this is from a software perspective. But I
>> did try to fit it “in the spirit” of the MY 66000, and it takes
>> advantage of that design’s unique capabilities.
>>
>> The idea is to add one new instruction, which typically would be in the
>> shadow of a preceding Carry meta instruction. I called the new
>> instruction Load Bit Field (LBF).
>>
>> It is a two source, one result instruction, but uses the carry register
>> for an additional source and destination. The syntax is
>>
>> LBF Result register, field length (in bits), buffer starting address
>> (in bytes)
>>
>> The carry register contains the offset, in bits, from the start of the
>> buffer where the desired field starts.
>>
>> The instruction computes the start of the desired field by adding the
>> high order all but three bits of the carry register to get the starting
>> byte number, then uses the low order three bits to get the starting bit
>> number. The instruction extracts the field, starting at the computed
>> bit address with length as given in the register specified in the
>> instruction, and right justifies that field in the result register. The
>> higher order bits in the result register are set to zero. If the output
>> bit of the Carry instruction is set, the length value is added to the
>> Carry register.
> <
> A bit more on the CISC side than desired (most of the time)--3
> exceptions possible, 2 memory accesses. Also note, my original
> solution can produce signed or unsigned output stream. This is
> going to take 2 cycles in AGEN, and 2 result register writes.
>>
>> In order to speed up this instruction, and given that it will frequently
>> occur in a fairly tight loop, I think (hope) that the hardware can take
>> advantage of the “streaming” buffers otherwise used for VVM operations.
>> Anyway, if one had this instruction, the main loop in the code above
>> could be something like
>>
>>
>> loop:
>> LDUB R10,[R1+R9]
>> CARRY R6,IO
>> LBF R12,R10,R2 ;I am not sure about R2, It should be the start of
>> the packed buffer.
>> STD R12,[R3+R9<<3]
>> ADD R9,R9,#1
>> CMP R11,R9,R4
>> BLT R11,loop
>>
>> For a savings of about 10 instructions in the I cache, but fewer in
>> execution (but still significant) depending upon how often the
>> instructions under the predicate are executed.
>>
> I have to admit, this looks fairly juicy--just have to plow my way
> through and see what comes out.

I must be missing something - what's the advantage of this over a simple
unaligned LOAD and an EXTRACT?
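
To make the comparison concrete, the "unaligned LOAD plus EXTRACT" alternative can be sketched in C as follows. The names load_le64 and load_extract are hypothetical, and the sketch assumes LSB-first packing with 8 readable bytes at the computed address. With a single 8-byte load the start bit can be as high as 7, so this form only covers fields of up to 57 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Portable little-endian 8-byte load from an arbitrary byte address.
   Compilers commonly reduce this loop to one unaligned load on
   little-endian targets, but that is not guaranteed. */
static uint64_t load_le64(const uint8_t *p)
{
    uint64_t w = 0;
    for (int i = 7; i >= 0; i--)
        w = (w << 8) | p[i];
    return w;
}

/* Extract len bits starting at bit offset bitpos from base.
   Assumptions: LSB-first packing, 1 <= len <= 57, and 8 readable
   bytes starting at base + (bitpos >> 3). */
uint64_t load_extract(const uint8_t *base, uint64_t bitpos, unsigned len)
{
    assert(len >= 1 && len <= 57);
    uint64_t w = load_le64(base + (bitpos >> 3));        /* the LOAD    */
    return (w >> (bitpos & 7)) & ((UINT64_C(1) << len) - 1); /* the EXTRACT */
}
```

The caller still has to keep the running bit offset itself and add the length after each field.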

Re: RISC-V vs. Aarch64

<ssfcsc$hgh$2@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23028&group=comp.arch#23028
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: iva...@millcomputing.com (Ivan Godard)
Newsgroups: comp.arch
Subject: Re: RISC-V vs. Aarch64
Date: Fri, 21 Jan 2022 14:42:20 -0800
Organization: A noiseless patient Spider
Lines: 118
Message-ID: <ssfcsc$hgh$2@dont-email.me>
References: <2021Dec24.163843@mips.complang.tuwien.ac.at>
<2022Jan8.101413@mips.complang.tuwien.ac.at>
<7557bf3a-61ce-4500-8cf8-ced2dbed7087n@googlegroups.com>
<ad2ee700-b604-4565-9e24-3386580b90c8n@googlegroups.com>
<4d2fbc82-af69-4388-bfa5-e3b2be652744n@googlegroups.com>
<2e706405-006a-49bb-8e8a-f634d749205en@googlegroups.com>
<570acc73-a5da-497f-8ec4-810150e0a9f1n@googlegroups.com>
<850b7681-204a-4df6-9095-cd6ee816a7d5n@googlegroups.com>
<5ea00397-5572-4fbd-bfb3-85c3554f1eb9n@googlegroups.com>
<srnkf0$4cb$1@dont-email.me>
<b4e98991-4fb9-4ef7-a831-430c3fc10145n@googlegroups.com>
<srp3n0$e0a$1@dont-email.me> <srp4lu$hhg$1@gioia.aioe.org>
<81c0ddc6-4b46-4b3d-b64b-e65963889214n@googlegroups.com>
<srpjgj$541$1@dont-email.me>
<1e0b0ba3-e11c-4a14-a0c7-7c074f2f9ba7n@googlegroups.com>
<ssep4k$vke$1@dont-email.me>
<9d652029-a997-4bec-9182-769002a6bd1dn@googlegroups.com>
<sset5a$v8n$1@dont-email.me>
<dc78242b-1bda-45d7-ae78-893121125b92n@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Fri, 21 Jan 2022 22:42:20 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="bd152fa2a5fde99057cfa2e5aed93b67";
logging-data="17937"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18ayojGCj6QdFDsmuQYC02m"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.5.0
Cancel-Lock: sha1:BARmySfE1PVsTnsI756V0114oEk=
In-Reply-To: <dc78242b-1bda-45d7-ae78-893121125b92n@googlegroups.com>
Content-Language: en-US
 by: Ivan Godard - Fri, 21 Jan 2022 22:42 UTC

On 1/21/2022 12:09 PM, MitchAlsup wrote:
> On Friday, January 21, 2022 at 12:14:06 PM UTC-6, Stephen Fuld wrote:
>> On 1/21/2022 9:53 AM, MitchAlsup wrote:
>>> On Friday, January 21, 2022 at 11:05:28 AM UTC-6, Stephen Fuld wrote:
>>>> On 1/13/2022 9:53 AM, MitchAlsup wrote:
>>>
>>>>> I came up with this in 5 minutes::
>>>>> This assumes the input bit-length selector is a vector of characters and that the
>>>>> chars contain values from {1..64}
>>>>> <
>>>>> void unpack( uchar_t size[], uint64_t packed[], uint64_t unpacked[], uint64_t count )
>>>>> {
>>>>> uint64_t len,
>>>>> bit=0,
>>>>> word=0,
>>>>> extract,
>>>>> container1 = packed[0],
>>>>> container2 = packed[1];
>>>>>
>>>>> for( unsigned int i = 0; i < count; i++ )
>>>>> {
>>>>> len = size[i];
>>>>> bit += len;
>>>>> extract = ( len << 32 ) | ( bit & 0x3F );
>>>>> if( word != bit >> 6 )
>>>>> {
>>>>> container1 = container2;
>>>>> container2 = packed[++word];
>>>>> }
>>>>> unpacked[i] = {container2, container1} >> extract;
>>>>> }
>>>>> }
>>>>> <
>>>>> This translates into pretty nice My 66000 ISA:
>>>>> <
>>>>> ENTRY unpack
>>>>> unpack:
>>>>> MOV R5,#0
>>>>> MOV R6,#0
>>>>> LDD R7,[R2]
>>>>> LDD R8,[R2+8]
>>>>> MOV R9,#0
>>>>> loop:
>>>>> LDUB R10,[R1+R9]
>>>>> ADD R5,R5,R10
>>>>> AND R11,R5,#63
>>>>> SL R12,R10,#32
>>>>> OR R11,R11,R12
>>>>> SR R12,R6,#6
>>>>> CMP R11,R6,R12
>>>>> PEQ R11,{111}
>>>>> ADD R6,R6,#1
>>>>> MOV R7,R8
>>>>> LDD R8,[R2+R6<<3]
>>>>> CARRY R8,{{I}}
>>>>> SL R12,R7,R11
>>>>> STD R12,[R3+R9<<3]
>>>>> ADD R9,R9,#1
>>>>> CMP R11,R9,R4
>>>>> BLT R11,loop
>>>>> RET
>>>>> <
>>>>> Well at least straightforwardly.
>>>>
>>>>
>>>> If Terje is right, and he almost always is, it is worth trying to come
>>>> up with a better solution for this type of problem. So, as a start, I
>>>> came up with what follows. This certainly isn’t the final solution. It
>>>> is intended to start a discussion on better ways to do this. And the
>>>> usual disclaimer, IANAHG, so this is from a software perspective. But I
>>>> did try to fit it “in the spirit” of the MY 66000, and it takes
>>>> advantage of that design’s unique capabilities.
>>>>
>>>> The idea is to add one new instruction, which typically would be in the
>>>> shadow of a preceding Carry meta instruction. I called the new
>>>> instruction Load Bit Field (LBF).
>>>>
>>>> It is a two source, one result instruction, but uses the carry register
>>>> for an additional source and destination. The syntax is
>>>>
>>>> LBF Result register, field length (in bits), buffer starting address
>>>> (in bytes)
>>>>
>>>> The carry register contains the offset, in bits, from the start of the
>>>> buffer where the desired field starts.
>>>>
>>>> The instruction computes the start of the desired field by adding the
>>>> high order all but three bits of the carry register to get the starting
>>>> byte number, then uses the low order three bits to get the starting bit
>>>> number. The instruction extracts the field, starting at the computed
>>>> bit address with length as given in the register specified in the
>>>> instruction, and right justifies that field in the result register. The
>>>> higher order bits in the result register are set to zero. If the output
>>>> bit of the Carry instruction is set, the length value is added to the
>>>> Carry register.
>>> <
>>> A bit more on the CISC side than desired
>> I understand.
>>> (most of the time)--3
>>> exceptions possible, 2 memory accesses.
>> Yes, but I justified the two memory accesses in my mind by thinking it
>> was no worse than a load multiple instruction.
> <
> The 2 mem refs are no worse than the inherently misaligned LDs and STs.
> <
>>> Also note, my original
>>> solution can produce signed or unsigned output stream.
>> True. Would the ability to produce signed output be used? Genuine
>> question - I don't know the answer. If it was important, I suppose you
>> could use an additional op code or if available a spare bit in the LBF
>> instruction.
> <
> My EXT instructions (SL) can deliver signed or unsigned results. My estimation
> is that there will be at least 10% of applications that would want the BFs signed.
> {Whether that should make the cut (or not) is undecided.}

Did for us.
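
For readers unsure what a signed bit field means in practice: the extracted field is sign-extended from its top bit instead of zero-filled. A minimal C sketch follows, with the hypothetical name sign_extend; it assumes 1 <= len <= 64 and a two's-complement int64_t (the final conversion is implementation-defined in older C standards, though it behaves as expected on mainstream compilers).

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low len bits of field to 64 bits.
   Assumptions: 1 <= len <= 64, two's-complement int64_t. */
int64_t sign_extend(uint64_t field, unsigned len)
{
    assert(len >= 1 && len <= 64);
    uint64_t sign = UINT64_C(1) << (len - 1);     /* top bit of the field */
    if (len < 64)
        field &= (UINT64_C(1) << len) - 1;        /* drop any higher bits */
    /* Flipping and then subtracting the sign bit maps values with the
       top bit set onto the negative range, modulo 2^64. */
    return (int64_t)((field ^ sign) - sign);
}
```

An unsigned extract is the same operation without the last step, so supporting both costs one selector bit in the instruction.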

Re: The type of Mill's belt's slots

<6cc6b294-7b58-42a8-8751-4fd808ea034fn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=23029&group=comp.arch#23029
X-Received: by 2002:a05:620a:1994:: with SMTP id bm20mr4618021qkb.459.1642813942037;
Fri, 21 Jan 2022 17:12:22 -0800 (PST)
X-Received: by 2002:a9d:2a82:: with SMTP id e2mr4729523otb.331.1642813941733;
Fri, 21 Jan 2022 17:12:21 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Fri, 21 Jan 2022 17:12:21 -0800 (PST)
In-Reply-To: <ssero6$kae$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fb70:6300:7407:f993:819b:4bb6;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fb70:6300:7407:f993:819b:4bb6
References: <2021Dec24.163843@mips.complang.tuwien.ac.at> <sqpqbm$7qo$1@newsreader4.netcologne.de>
<sqq3ce$c4n$2@dont-email.me> <sqssff$a9j$1@gioia.aioe.org>
<077afaee-009e-4860-be45-61106126934bn@googlegroups.com> <squhht$79u$1@dont-email.me>
<bb6d49bb-a676-44bd-9a6d-29386d429454n@googlegroups.com> <sr0vhm$c4u$1@dont-email.me>
<sr114i$1qc$1@newsreader4.netcologne.de> <sr1dca$70e$1@dont-email.me>
<kM%AJ.186634$np6.183460@fx46.iad> <sr2gf6$64u$1@dont-email.me>
<7DpBJ.254731$3q9.63673@fx47.iad> <sr62tb$u2o$1@dont-email.me>
<ss4g91$hvs$1@dont-email.me> <ss4ktr$dvv$1@dont-email.me> <UZCFJ.4610$uP.4480@fx16.iad>
<bSDFJ.268310$1d1.64158@fx99.iad> <ss78bk$hm6$1@dont-email.me>
<jwvo848n4ud.fsf-monnier+comp.arch@gnu.org> <ss7ila$obl$1@dont-email.me>
<jwv7dawmmb7.fsf-monnier+comp.arch@gnu.org> <ss858d$m0a$1@dont-email.me>
<74f730f1-e138-4124-9fbb-21f388eafeb3n@googlegroups.com> <ss9p4l$a2i$1@dont-email.me>
<34d3c262-c385-4a42-a6c1-a15c34536b28n@googlegroups.com> <ssero6$kae$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <6cc6b294-7b58-42a8-8751-4fd808ea034fn@googlegroups.com>
Subject: Re: The type of Mill's belt's slots
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Sat, 22 Jan 2022 01:12:22 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 65
 by: Quadibloc - Sat, 22 Jan 2022 01:12 UTC

On Friday, January 21, 2022 at 10:50:01 AM UTC-7, Ivan Godard wrote:

> One thing in your post I must disagree strongly with: the purpose of
> computers is *not* to set LINPACK records. The view that it is their
> purpose is in large part the source of the mess that computers have made
> of our everyday lives.

I must admit that this is very true.

Given that this mess _has_ been made of our everyday lives, though,
the conclusion can be drawn that historically, computers have been
designed _as though_ their purpose was "to set LINPACK records",
or at least something similar. Why?

Well, that's easily enough answered. Back in the old days, when
computers were made out of vacuum tubes, discrete transistors,
or even small-scale integrated circuits, they were... terribly
expensive. And who had lots of money to spend on a computer?

And yet that doesn't lead to the answer I'm thinking of in a
direct fashion. Yes, Control Data and DEC made computers, but
so did Burroughs and IBM.

So computers for doing payroll and accounts receivable for large
companies were a big part of the market for computers, and for those,
security and reliability were of major importance.

Of course, microprocessors are made by *microprocessor*
companies, and while their current offerings have mainframe power,
these companies got their start making 8-bit (or even smaller)
machines, which of necessity had few frills, and thus were of a
minicomputer-like architecture, which tended to be in the 'scientific'
rather than 'commercial' category. And at the beginning, maximum
performance was desperately sought, so a culture was established.

Today, Unisys offers computers to former users of Burroughs
(and Univac) mainframes that work by simulating the architectures
of those machines on x86 hardware. Nothing else is really cost-effective,
because not having a long production run leads to a disadvantage
in price/performance of orders of magnitude.

The clock is ticking for System z.

And this leads to the thinking behind my impractical fantasy computer
architecture designs. If the slogan from _Highlander_ is what applies here
("there can be only one") then the thing an alternative computer architecture
intended to compete needs more than anything else, to fill the niche that
is left unfilled by an x86 monoculture (and still left unfilled even with ARM
and RISC-V, sadly)...

seems to me to be suitability for _emulating_ a variety of computer architectures,
so that people wanting to...

make systems compatible with legacy mainframes (like Unisys)
build classic computer clones (like the 680x0-based Amiga)

and so on could all buy your/my processor designed to allow emulating
the architecture of your choice with a minimal loss of efficiency compared
to custom silicon.

The thing is, though, that a hardware solution can't look like microcode, as that
is _not_ efficient enough... and software emulation has, thanks to just-in-time
compilation, become so efficient that emulation-oriented hardware doesn't
seem to have an advantage.

John Savard

Re: The type of Mill's belt's slots

<ed3f5b9c-bb06-45c3-8804-df0774fa7a4dn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=23030&group=comp.arch#23030
X-Received: by 2002:a37:a7d4:: with SMTP id q203mr4745409qke.274.1642815692337;
Fri, 21 Jan 2022 17:41:32 -0800 (PST)
X-Received: by 2002:a05:6808:aba:: with SMTP id r26mr2720933oij.155.1642815692097;
Fri, 21 Jan 2022 17:41:32 -0800 (PST)
Path: i2pn2.org!i2pn.org!aioe.org!news.uzoreto.com!newsfeed.xs4all.nl!newsfeed7.news.xs4all.nl!news-out.netnews.com!news.alt.net!fdc2.netnews.com!peer02.ams1!peer.ams1.xlned.com!news.xlned.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Fri, 21 Jan 2022 17:41:31 -0800 (PST)
In-Reply-To: <6cc6b294-7b58-42a8-8751-4fd808ea034fn@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:155a:6773:152c:fbf3;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:155a:6773:152c:fbf3
References: <2021Dec24.163843@mips.complang.tuwien.ac.at> <sqpqbm$7qo$1@newsreader4.netcologne.de>
<sqq3ce$c4n$2@dont-email.me> <sqssff$a9j$1@gioia.aioe.org>
<077afaee-009e-4860-be45-61106126934bn@googlegroups.com> <squhht$79u$1@dont-email.me>
<bb6d49bb-a676-44bd-9a6d-29386d429454n@googlegroups.com> <sr0vhm$c4u$1@dont-email.me>
<sr114i$1qc$1@newsreader4.netcologne.de> <sr1dca$70e$1@dont-email.me>
<kM%AJ.186634$np6.183460@fx46.iad> <sr2gf6$64u$1@dont-email.me>
<7DpBJ.254731$3q9.63673@fx47.iad> <sr62tb$u2o$1@dont-email.me>
<ss4g91$hvs$1@dont-email.me> <ss4ktr$dvv$1@dont-email.me> <UZCFJ.4610$uP.4480@fx16.iad>
<bSDFJ.268310$1d1.64158@fx99.iad> <ss78bk$hm6$1@dont-email.me>
<jwvo848n4ud.fsf-monnier+comp.arch@gnu.org> <ss7ila$obl$1@dont-email.me>
<jwv7dawmmb7.fsf-monnier+comp.arch@gnu.org> <ss858d$m0a$1@dont-email.me>
<74f730f1-e138-4124-9fbb-21f388eafeb3n@googlegroups.com> <ss9p4l$a2i$1@dont-email.me>
<34d3c262-c385-4a42-a6c1-a15c34536b28n@googlegroups.com> <ssero6$kae$1@dont-email.me>
<6cc6b294-7b58-42a8-8751-4fd808ea034fn@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <ed3f5b9c-bb06-45c3-8804-df0774fa7a4dn@googlegroups.com>
Subject: Re: The type of Mill's belt's slots
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Sat, 22 Jan 2022 01:41:32 +0000
Content-Type: text/plain; charset="UTF-8"
X-Received-Bytes: 6081
 by: MitchAlsup - Sat, 22 Jan 2022 01:41 UTC

On Friday, January 21, 2022 at 7:12:23 PM UTC-6, Quadibloc wrote:
> On Friday, January 21, 2022 at 10:50:01 AM UTC-7, Ivan Godard wrote:
>
> > One thing in your post I must disagree strongly with: the purpose of
> > computers is *not* to set LINPACK records. The view that it is their
> > purpose is in large part the source of the mess that computers have made
> > of our everyday lives.
<
> I must admit that this is very true.
>
> Given that this mess _has_ been made of our everyday lives, though,
> the conclusion can be drawn that historically, computers have been
> designed _as though_ their purpose was "to set LINPACK records",
> or at least something similar. Why?
<
Because that is what people (back then) would be willing to PAY for.
>
> Well, that's easily enough answered. Back in the old days, when
> computers were made out of vacuum tubes, discrete transistors,
> or even small-scale integrated circuits, they were... terribly
> expensive. And who had lots of money to spend on a computer?
>
*.gov
>
> And yet that doesn't lead to the answer I'm thinking of in a
> direct fashion. Yes, Control Data and DEC made computers, but
> so did Burroughs and IBM.
>
> So computers for doing payroll and accounts receivable for large
> companies were a big part of the market for computers, and for those,
> security and reliability were of major importance.
<
The appearance of security and reliability was important--the actuality
not so much.
>
> Of course, microprocessors are made by *microprocessor*
> companies, and while their current offerings have mainframe power,
> these companies got their start making 8-bit (or even smaller)
<
4004
<
> machines, which of necessity had few frills, and thus were of a
> minicomputer-like architecture, which tended to be in the 'scientific'
<
Originally; more like toaster controllers than 'scientific'.
<
> rather than 'commercial' category. And at the beginning, maximum
> performance was desperately sought, so a culture was established.
<
Not dissimilar to racing {cars, boats, airplanes,...}
>
> Today, Unisys offers computers to former users of Burroughs
> (and Univac) mainframes that work by simulating the architectures
> of those machines on x86 hardware. Nothing else is really cost-effective,
> because not having a long production run leads to a disadvantage
> in price/performance of orders of magnitude.
>
> The clock is ticking for System z.
>
> And this leads to the thinking behind my impractical fantasy computer
> architecture designs. If the slogan from _Highlander_ is what applies here
> ("there can be only one") then the thing an alternative computer architecture
> intended to compete needs more than anything else, to fill the niche that
> is left unfilled by an x86 monoculture (and still left unfilled even with ARM
> and RISC-V, sadly)...
<
You need to add Windows and Linux to the problem creation/preservation space.
>
> seems to me to be suitability for _emulating_ a variety of computer architectures,
> so that people wanting to...
<
{ >
> make systems compatible with legacy mainframes (like Unisys)
> build classic computer clones (like the 680x0-based Amiga)
>
> and so on could all buy your/my processor designed to allow emulating
> the architecture of your choice with a minimal loss of efficiency compared
> to custom silicon.
<
} Remember, it only takes $1,000,000,000 to get an architecture off the ground,
into production, and with all the software people want to run.
>
> The thing is, though, that a hardware solution can't look like microcode, as that
> is _not_ efficient enough... and software emulation has, thanks to just-in-time
> compilation, become so efficient that emulation-oriented hardware doesn't
> seem to have an advantage.
>
> John Savard

Re: RISC-V vs. Aarch64

<6ab380e5-d47a-4647-8e75-f19f644c6ee9n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=23031&group=comp.arch#23031
X-Received: by 2002:a05:620a:22c4:: with SMTP id o4mr4898077qki.534.1642815768289;
Fri, 21 Jan 2022 17:42:48 -0800 (PST)
X-Received: by 2002:a05:6808:11c5:: with SMTP id p5mr2703078oiv.51.1642815768160;
Fri, 21 Jan 2022 17:42:48 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!peer03.ams1!peer.ams1.xlned.com!news.xlned.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Fri, 21 Jan 2022 17:42:47 -0800 (PST)
In-Reply-To: <ssfco0$hgh$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:155a:6773:152c:fbf3;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:155a:6773:152c:fbf3
References: <2021Dec24.163843@mips.complang.tuwien.ac.at> <srag0i$2ed$2@dont-email.me>
<00add816-93d7-4763-a68b-33a67db6d770n@googlegroups.com> <2022Jan8.101413@mips.complang.tuwien.ac.at>
<7557bf3a-61ce-4500-8cf8-ced2dbed7087n@googlegroups.com> <ad2ee700-b604-4565-9e24-3386580b90c8n@googlegroups.com>
<4d2fbc82-af69-4388-bfa5-e3b2be652744n@googlegroups.com> <2e706405-006a-49bb-8e8a-f634d749205en@googlegroups.com>
<570acc73-a5da-497f-8ec4-810150e0a9f1n@googlegroups.com> <850b7681-204a-4df6-9095-cd6ee816a7d5n@googlegroups.com>
<5ea00397-5572-4fbd-bfb3-85c3554f1eb9n@googlegroups.com> <srnkf0$4cb$1@dont-email.me>
<b4e98991-4fb9-4ef7-a831-430c3fc10145n@googlegroups.com> <srp3n0$e0a$1@dont-email.me>
<srp4lu$hhg$1@gioia.aioe.org> <81c0ddc6-4b46-4b3d-b64b-e65963889214n@googlegroups.com>
<srpjgj$541$1@dont-email.me> <1e0b0ba3-e11c-4a14-a0c7-7c074f2f9ba7n@googlegroups.com>
<ssep4k$vke$1@dont-email.me> <9d652029-a997-4bec-9182-769002a6bd1dn@googlegroups.com>
<ssfco0$hgh$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <6ab380e5-d47a-4647-8e75-f19f644c6ee9n@googlegroups.com>
Subject: Re: RISC-V vs. Aarch64
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Sat, 22 Jan 2022 01:42:48 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 3167
 by: MitchAlsup - Sat, 22 Jan 2022 01:42 UTC

On Friday, January 21, 2022 at 4:40:03 PM UTC-6, Ivan Godard wrote:
> On 1/21/2022 9:53 AM, MitchAlsup wrote:
> > On Friday, January 21, 2022 at 11:05:28 AM UTC-6, Stephen Fuld wrote:
> >> On 1/13/2022 9:53 AM, MitchAlsup wrote:
> >

> >> loop:
> >> LDUB R10,[R1+R9]
> >> CARRY R6,IO
> >> LBF R12,R10,R2 ; I am not sure about R2; it should be the start of the packed buffer.
> >> STD R12,[R3+R9<<3]
> >> ADD R9,R9,#1
> >> CMP R11,R9,R4
> >> BLT R11,loop
> >>
> >> For a savings of about 10 instructions in the I cache, but fewer in
> >> execution (but still significant) depending upon how often the
> >> instructions under the predicate are executed.
> >>
> > I have to admit, this looks fairly juicy--just have to plow my way
> > through and see what comes out.
<
> I must be missing something - what's the advantage of this over a simple
> unaligned LOAD and an EXTRACT?
<
The LDBF does all the "other" math involved and cuts the instruction
count by ½.
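
As a rough illustration of the "unaligned LOAD and an EXTRACT" sequence Ivan asks about, here is a minimal Python sketch. The function name `load_and_extract`, the little-endian byte order, and the LSB-first bit numbering are assumptions made for the example; none of them is specified in the thread.

```python
def load_and_extract(buf: bytes, bit_offset: int, width: int) -> int:
    """Model one unaligned 64-bit load followed by a zero-extending extract.

    Assumptions (not from the thread): little-endian bytes, bit 0 is the
    least significant bit of byte 0, and 1 <= width <= 64.
    """
    byte_off, bit_in_byte = divmod(bit_offset, 8)
    # "Unaligned LOAD": 8 bytes starting at an arbitrary byte address.
    word = int.from_bytes(buf[byte_off:byte_off + 8], "little")
    # "EXTRACT": shift the field down and mask off everything above it.
    return (word >> bit_in_byte) & ((1 << width) - 1)


buf = bytes([0xF0, 0x0F])
print(load_and_extract(buf, 4, 8))  # bits 4..11 of the buffer
```

In this model the two-step form only covers fields that fit inside the single 64-bit load, that is, when the bit position within the starting byte plus the width is at most 64. A field that straddles a ninth byte comes back with its top bits missing.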

Re: The type of Mill's belt's slots

<ssfreu$ck$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23032&group=comp.arch#23032

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: iva...@millcomputing.com (Ivan Godard)
Newsgroups: comp.arch
Subject: Re: The type of Mill's belt's slots
Date: Fri, 21 Jan 2022 18:51:10 -0800
Organization: A noiseless patient Spider
Lines: 86
Message-ID: <ssfreu$ck$1@dont-email.me>
References: <2021Dec24.163843@mips.complang.tuwien.ac.at>
<sqssff$a9j$1@gioia.aioe.org>
<077afaee-009e-4860-be45-61106126934bn@googlegroups.com>
<squhht$79u$1@dont-email.me>
<bb6d49bb-a676-44bd-9a6d-29386d429454n@googlegroups.com>
<sr0vhm$c4u$1@dont-email.me> <sr114i$1qc$1@newsreader4.netcologne.de>
<sr1dca$70e$1@dont-email.me> <kM%AJ.186634$np6.183460@fx46.iad>
<sr2gf6$64u$1@dont-email.me> <7DpBJ.254731$3q9.63673@fx47.iad>
<sr62tb$u2o$1@dont-email.me> <ss4g91$hvs$1@dont-email.me>
<ss4ktr$dvv$1@dont-email.me> <UZCFJ.4610$uP.4480@fx16.iad>
<bSDFJ.268310$1d1.64158@fx99.iad> <ss78bk$hm6$1@dont-email.me>
<jwvo848n4ud.fsf-monnier+comp.arch@gnu.org> <ss7ila$obl$1@dont-email.me>
<jwv7dawmmb7.fsf-monnier+comp.arch@gnu.org> <ss858d$m0a$1@dont-email.me>
<74f730f1-e138-4124-9fbb-21f388eafeb3n@googlegroups.com>
<ss9p4l$a2i$1@dont-email.me>
<34d3c262-c385-4a42-a6c1-a15c34536b28n@googlegroups.com>
<ssero6$kae$1@dont-email.me>
<6cc6b294-7b58-42a8-8751-4fd808ea034fn@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sat, 22 Jan 2022 02:51:11 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="91824058145af369f06251457d1d9841";
logging-data="404"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/SVJTocRA5kh1aAyD0e+3j"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.5.0
Cancel-Lock: sha1:l+vFJnUDDLyxG4b3+gTVk5XZhuE=
In-Reply-To: <6cc6b294-7b58-42a8-8751-4fd808ea034fn@googlegroups.com>
Content-Language: en-US
 by: Ivan Godard - Sat, 22 Jan 2022 02:51 UTC

On 1/21/2022 5:12 PM, Quadibloc wrote:
> On Friday, January 21, 2022 at 10:50:01 AM UTC-7, Ivan Godard wrote:
>
>> One thing in your post I must disagree strongly with: the purpose of
>> computers is *not* to set LINPACK records. The view that it is their
>> purpose is in large part the source of the mess that computers have made
>> of our everyday lives.
>
> I must admit that this is very true.
>
> Given that this mess _has_ been made of our everyday lives, though,
> the conclusion can be drawn that historically, computers have been
> designed _as though_ their purpose was "to set LINPACK records",
> or at least something similar. Why?
>
> Well, that's easily enough answered. Back in the old days, when
> computers were made out of vacuum tubes, discrete transistors,
> or even small-scale integrated circuits, they were... terribly
> expensive. And who had lots of money to spend on a computer?
>
> And yet that doesn't lead to the answer I'm thinking of in a
> direct fashion. Yes, Control Data and DEC made computers, but
> so did Burroughs and IBM.
>
> So computers for doing payroll and accounts receivable for large
> companies were a big part of the market for computers, and for those,
> security and reliability were of major importance.
>
> Of course, microprocessors are made by *microprocessor*
> companies, and while their current offerings have mainframe power,
> these companies got their start making 8-bit (or even smaller)
> machines, which of necessity had few frills, and thus were of a
> minicomputer-like architecture, which tended to be in the 'scientific'
> rather than 'commercial' category. And at the beginning, maximum
> performance was desperately sought, so a culture was established.
>
> Today, Unisys offers computers to former users of Burroughs
> (and Univac) mainframes that work by simulating the architectures
> of those machines on x86 hardware. Nothing else is really cost-effective,
> because not having a long production run leads to a disadvantage
> in price/performance of orders of magnitude.
>
> The clock is ticking for System z.
>
> And this leads to the thinking behind my impractical fantasy computer
> architecture designs. If the slogan from _Highlander_ is what applies here
> ("there can be only one") then the thing an alternative computer architecture
> intended to compete needs more than anything else, to fill the niche that
> is left unfilled by an x86 monoculture (and still left unfilled even with ARM
> and RISC-V, sadly)...
>
> seems to me to be suitability for _emulating_ a variety of computer architectures,
> so that people wanting to...
>
> make systems compatible with legacy mainframes (like Unisys)
> build classic computer clones (like the 680x0-based Amiga)
>
> and so on could all buy your/my processor designed to allow emulating
> the architecture of your choice with a minimal loss of efficiency compared
> to custom silicon.
>
> The thing is, though, that a hardware solution can't look like microcode, as that
> is _not_ efficient enough... and software emulation has, thanks to just-in-time
> compilation, become so efficient that emulation-oriented hardware doesn't
> seem to have an advantage.
>
> John Savard

Something I learned a long time ago: there is a difference between the
first and the second sale of any product. To get the first sale you must
give them what they want. To get the second sale you must have given
them what they needed.

In our business, the first sale generally wants bang for the buck - or
LINPACK for power/area (area equals price). The Mill ISA was designed to
scale beyond any competing chip in total bang, and do it cheaper than
any competing chip. To do that the design had to wildly diverge from
conventional practice.

The second sale adds RAS as a factor, overlooked in the first sale by
most (but not all) customers.

It doesn't matter if economics causes resort to emulation: the customer
doesn't see the x86 underneath, and his use is not exposed to x86
attacks. And non-ISA RAS concerns, like internal ECC, are independent of
ISA anyway.

Re: The type of Mill's belt's slots

<ssfrhc$ck$2@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23033&group=comp.arch#23033

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: iva...@millcomputing.com (Ivan Godard)
Newsgroups: comp.arch
Subject: Re: The type of Mill's belt's slots
Date: Fri, 21 Jan 2022 18:52:29 -0800
Organization: A noiseless patient Spider
Lines: 15
Message-ID: <ssfrhc$ck$2@dont-email.me>
References: <2021Dec24.163843@mips.complang.tuwien.ac.at>
<squhht$79u$1@dont-email.me>
<bb6d49bb-a676-44bd-9a6d-29386d429454n@googlegroups.com>
<sr0vhm$c4u$1@dont-email.me> <sr114i$1qc$1@newsreader4.netcologne.de>
<sr1dca$70e$1@dont-email.me> <kM%AJ.186634$np6.183460@fx46.iad>
<sr2gf6$64u$1@dont-email.me> <7DpBJ.254731$3q9.63673@fx47.iad>
<sr62tb$u2o$1@dont-email.me> <ss4g91$hvs$1@dont-email.me>
<ss4ktr$dvv$1@dont-email.me> <UZCFJ.4610$uP.4480@fx16.iad>
<bSDFJ.268310$1d1.64158@fx99.iad> <ss78bk$hm6$1@dont-email.me>
<jwvo848n4ud.fsf-monnier+comp.arch@gnu.org> <ss7ila$obl$1@dont-email.me>
<jwv7dawmmb7.fsf-monnier+comp.arch@gnu.org> <ss858d$m0a$1@dont-email.me>
<74f730f1-e138-4124-9fbb-21f388eafeb3n@googlegroups.com>
<ss9p4l$a2i$1@dont-email.me>
<34d3c262-c385-4a42-a6c1-a15c34536b28n@googlegroups.com>
<ssero6$kae$1@dont-email.me>
<6cc6b294-7b58-42a8-8751-4fd808ea034fn@googlegroups.com>
<ed3f5b9c-bb06-45c3-8804-df0774fa7a4dn@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sat, 22 Jan 2022 02:52:28 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="91824058145af369f06251457d1d9841";
logging-data="404"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+r0FCncHdzH5wuYhUPLhoS"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.5.0
Cancel-Lock: sha1:5ip2pSv0Tcj1KsJOXlRRAk2OlU4=
In-Reply-To: <ed3f5b9c-bb06-45c3-8804-df0774fa7a4dn@googlegroups.com>
Content-Language: en-US
 by: Ivan Godard - Sat, 22 Jan 2022 02:52 UTC

On 1/21/2022 5:41 PM, MitchAlsup wrote:
> On Friday, January 21, 2022 at 7:12:23 PM UTC-6, Quadibloc wrote:

<snip>

>> and so on could all buy your/my processor designed to allow emulating
>> the architecture of your choice with a minimal loss of efficiency compared
>> to custom silicon.
> <
> }
> Remember, it only takes $1,000,000,000 to get an architecture off the ground,
> into production, and with all the software people want to run.

A billion here, a billion there - pretty soon you're talking about real
money.

Re: The type of Mill's belt's slots

<666a6ae5-0bac-421e-81c6-8e3d752192e0n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=23034&group=comp.arch#23034

X-Received: by 2002:a05:622a:1394:: with SMTP id o20mr5784638qtk.530.1642821926382;
Fri, 21 Jan 2022 19:25:26 -0800 (PST)
X-Received: by 2002:a9d:6382:: with SMTP id w2mr4688419otk.227.1642821926034;
Fri, 21 Jan 2022 19:25:26 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!news.uzoreto.com!feeder1.cambriumusenet.nl!feed.tweak.nl!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Fri, 21 Jan 2022 19:25:25 -0800 (PST)
In-Reply-To: <ssfreu$ck$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:155a:6773:152c:fbf3;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:155a:6773:152c:fbf3
References: <2021Dec24.163843@mips.complang.tuwien.ac.at> <sqssff$a9j$1@gioia.aioe.org>
<077afaee-009e-4860-be45-61106126934bn@googlegroups.com> <squhht$79u$1@dont-email.me>
<bb6d49bb-a676-44bd-9a6d-29386d429454n@googlegroups.com> <sr0vhm$c4u$1@dont-email.me>
<sr114i$1qc$1@newsreader4.netcologne.de> <sr1dca$70e$1@dont-email.me>
<kM%AJ.186634$np6.183460@fx46.iad> <sr2gf6$64u$1@dont-email.me>
<7DpBJ.254731$3q9.63673@fx47.iad> <sr62tb$u2o$1@dont-email.me>
<ss4g91$hvs$1@dont-email.me> <ss4ktr$dvv$1@dont-email.me> <UZCFJ.4610$uP.4480@fx16.iad>
<bSDFJ.268310$1d1.64158@fx99.iad> <ss78bk$hm6$1@dont-email.me>
<jwvo848n4ud.fsf-monnier+comp.arch@gnu.org> <ss7ila$obl$1@dont-email.me>
<jwv7dawmmb7.fsf-monnier+comp.arch@gnu.org> <ss858d$m0a$1@dont-email.me>
<74f730f1-e138-4124-9fbb-21f388eafeb3n@googlegroups.com> <ss9p4l$a2i$1@dont-email.me>
<34d3c262-c385-4a42-a6c1-a15c34536b28n@googlegroups.com> <ssero6$kae$1@dont-email.me>
<6cc6b294-7b58-42a8-8751-4fd808ea034fn@googlegroups.com> <ssfreu$ck$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <666a6ae5-0bac-421e-81c6-8e3d752192e0n@googlegroups.com>
Subject: Re: The type of Mill's belt's slots
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Sat, 22 Jan 2022 03:25:26 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
 by: MitchAlsup - Sat, 22 Jan 2022 03:25 UTC

I seem to be wearing my pessimistic hat today::
<
On Friday, January 21, 2022 at 8:51:14 PM UTC-6, Ivan Godard wrote:
<
> Something I learned a long time ago: there is a difference between the
> first and the second sale of any product. To get the first sale you must
> give them what they want. To get the second sale you must have given
> them what they needed.
>
> In our business, the first sale generally wants bang for the buck - or
> LINPACK for power/area (area equals price). The Mill ISA was designed to
> scale beyond any competing chip in total bang, and do it cheaper than
> any competing chip. To do that the design had to wildly diverge from
> conventional practice.
<
Mill is competing against 3 giants that are willing and able to sell at ½ their
production costs for years at a time,
AND
Until you are selling 1,000,000 per month, you will not be able to get within
½ of their production costs.
<
So, starting out you are facing a ~4× cost problem.
When you get to 1,000,000 per month you are only facing a ~2× cost problem.
<
Mill may (MAY) end up with a 2× cost advantage due to all the cool things Mill
architecture is doing. But the gravitational vector is still greatly problematic.
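
The ~4× and ~2× figures above can be reproduced with a small sketch. It assumes one reading of the post: "not within ½ of their production costs" is taken to mean the entrant's unit cost is twice the incumbent's, and the incumbent sells at half of its own cost. The function name `cost_gap` and the normalisation to an incumbent cost of 1.0 are illustrative choices, not anything from the thread.

```python
def cost_gap(entrant_cost: float, incumbent_cost: float,
             incumbent_price_fraction: float) -> float:
    """Ratio of the entrant's unit cost to the incumbent's selling price."""
    incumbent_price = incumbent_cost * incumbent_price_fraction
    return entrant_cost / incumbent_price


# Normalised so the incumbent's unit cost is 1.0 and it sells at half of cost.
print(cost_gap(2.0, 1.0, 0.5))  # low volume: entrant cost assumed 2x theirs
print(cost_gap(1.0, 1.0, 0.5))  # at ~1,000,000 per month: assumed cost parity
```

Under these assumptions the first call gives the starting gap and the second the gap at volume.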
<
>
> The second sale adds RAS as a factor, overlooked in the first sale by
> most (but not all) customers.
>
> It doesn't matter if economics causes resort to emulation: the customer
> doesn't see the x86 underneath, and his use is not exposed to x86
> attacks. And non-ISA RAS concerns, like internal ECC, are independent of
> ISA anyway.
<
I wonder if all of the viruses, Spectre, Meltdown, and ROP attacks have reduced
PC sales by more than a total of 37 chips? I know no cell phone sales have
been reduced by ARM's vulnerabilities.
<
Thus, it seems like 99.996% of chip sales are independent of your argument
that RAS sells. It may, but not in the volume needed to compete to the point
where your production costs are comparable with their sales prices.

Re: RISC-V vs. Aarch64

<ssfv3e$tq1$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23035&group=comp.arch#23035

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: sfu...@alumni.cmu.edu.invalid (Stephen Fuld)
Newsgroups: comp.arch
Subject: Re: RISC-V vs. Aarch64
Date: Fri, 21 Jan 2022 19:53:16 -0800
Organization: A noiseless patient Spider
Lines: 29
Message-ID: <ssfv3e$tq1$1@dont-email.me>
References: <2021Dec24.163843@mips.complang.tuwien.ac.at>
<2022Jan8.101413@mips.complang.tuwien.ac.at>
<7557bf3a-61ce-4500-8cf8-ced2dbed7087n@googlegroups.com>
<ad2ee700-b604-4565-9e24-3386580b90c8n@googlegroups.com>
<4d2fbc82-af69-4388-bfa5-e3b2be652744n@googlegroups.com>
<2e706405-006a-49bb-8e8a-f634d749205en@googlegroups.com>
<570acc73-a5da-497f-8ec4-810150e0a9f1n@googlegroups.com>
<850b7681-204a-4df6-9095-cd6ee816a7d5n@googlegroups.com>
<5ea00397-5572-4fbd-bfb3-85c3554f1eb9n@googlegroups.com>
<srnkf0$4cb$1@dont-email.me>
<b4e98991-4fb9-4ef7-a831-430c3fc10145n@googlegroups.com>
<srp3n0$e0a$1@dont-email.me> <srp4lu$hhg$1@gioia.aioe.org>
<81c0ddc6-4b46-4b3d-b64b-e65963889214n@googlegroups.com>
<srpjgj$541$1@dont-email.me>
<1e0b0ba3-e11c-4a14-a0c7-7c074f2f9ba7n@googlegroups.com>
<ssep4k$vke$1@dont-email.me>
<9d652029-a997-4bec-9182-769002a6bd1dn@googlegroups.com>
<sset5a$v8n$1@dont-email.me>
<dc78242b-1bda-45d7-ae78-893121125b92n@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sat, 22 Jan 2022 03:53:18 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="4632c0763ef12c0c08445e2e9303af14";
logging-data="30529"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+0Q0v6woUf4xjC5J+pwQmM/V8ugSJkOeI="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.5.0
Cancel-Lock: sha1:lamWii0IF5MkD4OM2LtTq7P3l9k=
In-Reply-To: <dc78242b-1bda-45d7-ae78-893121125b92n@googlegroups.com>
Content-Language: en-US
 by: Stephen Fuld - Sat, 22 Jan 2022 03:53 UTC

On 1/21/2022 12:09 PM, MitchAlsup wrote:
> On Friday, January 21, 2022 at 12:14:06 PM UTC-6, Stephen Fuld wrote:
>> On 1/21/2022 9:53 AM, MitchAlsup wrote:

snip

>>> Also note, my original
>>> solution can produce signed or unsigned output stream.
>> True. Would the ability to produce signed output be used? Genuine
>> question - I don't know the answer. If it was important, I suppose you
>> could use an additional op code or if available a spare bit in the LBF
>> instruction.
> <
> My EXT instructions (SL) can deliver signed or unsigned results. My estimation
> is that there will be at least 10% of applications that would want the BFs signed.
> {Whether that should make the cut (or not) is undecided.}

Then another alternative presents itself. If you don't want to spend an
additional op code or instruction bit, you could follow the LDBF with an
EXT instruction for the signed case, extracting the relevant bits into
the same register. It costs a few more instructions, but it only
happens 10% of the time. I am not saying this is a good alternative,
but it is an alternative.

--
- Stephen Fuld
(e-mail address disguised to prevent spam)
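
The signed-case fallback described above (an unsigned field load followed by a separate extract that sign-extends) can be sketched as follows. This is only an illustration: `sign_extend` is a hypothetical helper, and it assumes the loaded value is already zero-extended and right-justified.

```python
def sign_extend(value: int, width: int) -> int:
    """Reinterpret a zero-extended width-bit field as two's complement.

    Assumes 0 <= value < 2**width and width >= 1.
    """
    sign_bit = 1 << (width - 1)
    # Flipping the sign bit and then subtracting it maps the upper half
    # of the unsigned range onto the negative numbers.
    return (value ^ sign_bit) - sign_bit


print(sign_extend(0b111, 3))  # all-ones 3-bit field
print(sign_extend(0b011, 3))  # sign bit clear, value unchanged
```

The extra step is only needed on the signed path, which matches the point that it is paid for in a minority of cases.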

Re: The type of Mill's belt's slots

<ssg700$tgs$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23036&group=comp.arch#23036

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: iva...@millcomputing.com (Ivan Godard)
Newsgroups: comp.arch
Subject: Re: The type of Mill's belt's slots
Date: Fri, 21 Jan 2022 22:08:00 -0800
Organization: A noiseless patient Spider
Lines: 67
Message-ID: <ssg700$tgs$1@dont-email.me>
References: <2021Dec24.163843@mips.complang.tuwien.ac.at>
<squhht$79u$1@dont-email.me>
<bb6d49bb-a676-44bd-9a6d-29386d429454n@googlegroups.com>
<sr0vhm$c4u$1@dont-email.me> <sr114i$1qc$1@newsreader4.netcologne.de>
<sr1dca$70e$1@dont-email.me> <kM%AJ.186634$np6.183460@fx46.iad>
<sr2gf6$64u$1@dont-email.me> <7DpBJ.254731$3q9.63673@fx47.iad>
<sr62tb$u2o$1@dont-email.me> <ss4g91$hvs$1@dont-email.me>
<ss4ktr$dvv$1@dont-email.me> <UZCFJ.4610$uP.4480@fx16.iad>
<bSDFJ.268310$1d1.64158@fx99.iad> <ss78bk$hm6$1@dont-email.me>
<jwvo848n4ud.fsf-monnier+comp.arch@gnu.org> <ss7ila$obl$1@dont-email.me>
<jwv7dawmmb7.fsf-monnier+comp.arch@gnu.org> <ss858d$m0a$1@dont-email.me>
<74f730f1-e138-4124-9fbb-21f388eafeb3n@googlegroups.com>
<ss9p4l$a2i$1@dont-email.me>
<34d3c262-c385-4a42-a6c1-a15c34536b28n@googlegroups.com>
<ssero6$kae$1@dont-email.me>
<6cc6b294-7b58-42a8-8751-4fd808ea034fn@googlegroups.com>
<ssfreu$ck$1@dont-email.me>
<666a6ae5-0bac-421e-81c6-8e3d752192e0n@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Sat, 22 Jan 2022 06:08:00 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="91824058145af369f06251457d1d9841";
logging-data="30236"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19w7CQS7nynz67/eH0vm12S"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.5.0
Cancel-Lock: sha1:OedrFD+Crpl6MPY2VD4i9/nxWqo=
In-Reply-To: <666a6ae5-0bac-421e-81c6-8e3d752192e0n@googlegroups.com>
Content-Language: en-US
 by: Ivan Godard - Sat, 22 Jan 2022 06:08 UTC

On 1/21/2022 7:25 PM, MitchAlsup wrote:
> I seem to be wearing my pessimistic hat today::
> <
> On Friday, January 21, 2022 at 8:51:14 PM UTC-6, Ivan Godard wrote:
> <
>> Something I learned a long time ago: there is a difference between the
>> first and the second sale of any product. To get the first sale you must
>> give them what they want. To get the second sale you must have given
>> them what they needed.
>>
>> In our business, the first sale generally wants bang for the buck - or
>> LINPACK for power/area (area equals price). The Mill ISA was designed to
>> scale beyond any competing chip in total bang, and do it cheaper than
>> any competing chip. To do that the design had to wildly diverge from
>> conventional practice.
> <
> Mill is competing against 3 giants that are willing and able to sell at ½ their
> production costs for years at a time,
> AND
> Until you are selling 1,000,000 per month, you will not be able to get within
> ½ of their production costs.
> <
> So, starting out you are facing a ~4× cost problem.
> When you get to 1,000,000 per month you are only facing a ~2× cost problem.
> <
> Mill may (MAY) end up with a 2× cost advantage due to all the cool things Mill
> architecture is doing. But the gravitational vector is still greatly problematic.
> <
>>
>> The second sale adds RAS as a factor, overlooked in the first sale by
>> most (but not all) customers.
>>
>> It doesn't matter if economics causes resort to emulation: the customer
>> doesn't see the x86 underneath, and his use is not exposed to x86
>> attacks. And non-ISA RAS concerns, like internal ECC, are independent of
>> ISA anyway.
> <
> I wonder if all of the viruses, Spectre, Meltdown, and ROP attacks have reduced
> PC sales by more than a total of 37 chips? I know no cell phone sales have
> been reduced by ARM's vulnerabilities.
> <
> Thus, it seems like 99.996% of chip sales are independent of your argument
> that RAS sells. It may, but not in the volume needed to compete to the point
> where your production costs are comparable with their sales prices.

All true.

It's clear that there exists a market in which RAS sells; the question
is the size of that market and its price elasticity
(https://en.wikipedia.org/wiki/Elasticity_(economics)). There are other
inelastic niche markets that we could own too: maximal single-thread
performance, for example.

For major markets, I expect the usual reaction of a disruptee to a
disruptor. So long as we are small enough that our market share is
immaterial to the Big Boys, they won't cut their margins to drive us
out. In fact, with our cost advantage we can undercut them to go for the
first sale.

That won't last of course. At some point, probably well before we too
are Big Boys, they will start to notice. The first response will not be
to price cut, which would seriously impact their bottom lines (and
option stock price). Instead, they will go for Lawyers, Guns, and Money
(apologies Warren Zevon). At that point, if we do it right, our second
sale advantage will kick in and we should have a defensible (although
possibly small) economic position. I'd prefer better, but 1% of the
world CPU market sounds pretty good.

Re: RISC-V vs. Aarch64

<vMNGJ.13984$mj1.1444@fx02.iad>

https://www.novabbs.com/devel/article-flat.php?id=23037&group=comp.arch#23037

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!feeder1.feed.usenet.farm!feed.usenet.farm!peer01.ams4!peer.am4.highwinds-media.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx02.iad.POSTED!not-for-mail
From: ThatWoul...@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: RISC-V vs. Aarch64
References: <2021Dec24.163843@mips.complang.tuwien.ac.at> <srag0i$2ed$2@dont-email.me> <00add816-93d7-4763-a68b-33a67db6d770n@googlegroups.com> <2022Jan8.101413@mips.complang.tuwien.ac.at> <7557bf3a-61ce-4500-8cf8-ced2dbed7087n@googlegroups.com> <ad2ee700-b604-4565-9e24-3386580b90c8n@googlegroups.com> <4d2fbc82-af69-4388-bfa5-e3b2be652744n@googlegroups.com> <2e706405-006a-49bb-8e8a-f634d749205en@googlegroups.com> <570acc73-a5da-497f-8ec4-810150e0a9f1n@googlegroups.com> <850b7681-204a-4df6-9095-cd6ee816a7d5n@googlegroups.com> <5ea00397-5572-4fbd-bfb3-85c3554f1eb9n@googlegroups.com> <srnkf0$4cb$1@dont-email.me> <b4e98991-4fb9-4ef7-a831-430c3fc10145n@googlegroups.com> <srp3n0$e0a$1@dont-email.me> <srp4lu$hhg$1@gioia.aioe.org> <81c0ddc6-4b46-4b3d-b64b-e65963889214n@googlegroups.com> <srpjgj$541$1@dont-email.me> <1e0b0ba3-e11c-4a14-a0c7-7c074f2f9ba7n@googlegroups.com> <ssep4k$vke$1@dont-email.me> <9d652029-a997-4bec-9182-769002a6bd1dn@googlegroups.com> <ssfco0$hgh$1@dont-email.me>
In-Reply-To: <ssfco0$hgh$1@dont-email.me>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Lines: 78
Message-ID: <vMNGJ.13984$mj1.1444@fx02.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Sat, 22 Jan 2022 06:47:55 UTC
Date: Sat, 22 Jan 2022 01:47:33 -0500
X-Received-Bytes: 5279
 by: EricP - Sat, 22 Jan 2022 06:47 UTC

Ivan Godard wrote:
> On 1/21/2022 9:53 AM, MitchAlsup wrote:
>> On Friday, January 21, 2022 at 11:05:28 AM UTC-6, Stephen Fuld wrote:
>>>
>>> If Terje is right, and he almost always is, it is worth trying to come
>>> up with a better solution for this type of problem. So, as a start, I
>>> came up with what follows. This certainly isn’t the final solution. It
>>> is intended to start a discussion on better ways to do this. And the
>>> usual disclaimer, IANAHG, so this is from a software perspective. But I
>>> did try to fit it “in the spirit” of the MY 66000, and it takes
>>> advantages of that design’s unique capabilities.
>>>
>>> The idea is to add one new instruction, which typically would be in the
>>> shadow of a preceding Carry meta instruction. I called the new
>>> instruction Load Bit Field (LBF).
>>>
>>> It is a two source, one result instruction, but uses the carry register
>>> for an additional source and destination. The syntax is
>>>
>>> LBF Result register, field length (in bits), buffer starting address
>>> (in bytes)
>>>
>>> The carry register contains the offset, in bits, from the start of the
>>> buffer where the desired field starts.
>>>
>>> The instruction computes the start of the desired field by adding all
>>> but the low order three bits of the carry register to the buffer
>>> starting address to get the starting byte, then uses the low order
>>> three bits to get the starting bit number. The instruction extracts the
>>> field, starting at the computed bit address with length as given in the
>>> register specified in the instruction, and right justifies that field
>>> in the result register. The
>>> higher order bits in the result register are set to zero. If the output
>>> bit of the Carry instruction is set, the length value is added to the
>>> Carry register.
>> <
>> A bit more on the CISC side than desired (most of the time)--3
>> exceptions possible, 2 memory accesses. Also note, my original
>> solution can produce signed or unsigned output stream. This is
>> going to take 2 cycles in AGEN, and 2 result register writes.
>>>
>>> In order to speed up this instruction, and given that it will frequently
>>> occur in a fairly tight loop, I think (hope) that the hardware can take
>>> advantage of the “streaming” buffers otherwise used for VVM operations.
>>> Anyway, if one had this instruction, the main loop in the code above
>>> could be something like
>>>
>>>
>>> loop:
>>> LDUB R10,[R1+R9]
>>> CARRY R6,IO
>>> LBF R12,R10,R2 ; I am not sure about R2; it should be the start of the packed buffer.
>>> STD R12,[R3+R9<<3]
>>> ADD R9,R9,#1
>>> CMP R11,R9,R4
>>> BLT R11,loop
>>>
>>> For a savings of about 10 instructions in the I cache, but fewer in
>>> execution (but still significant) depending upon how often the
>>> instructions under the predicate are executed.
>>>
>> I have to admit, this looks fairly juicy--just have to plow my way
>> through and see what comes out.
>
> I must be missing something - what's the advantage of this over a simple
> unaligned LOAD and an EXTRACT?

If the extract/insert field size is 57..64 bits then to handle that
last byte there is a conditional load/store plus some fiddly code
(because it is a 57..64 bit field within a 72 bit container).
Note that the load or store should be careful not to blindly cross over
a page boundary unless the bit field actually does (so it can't just read
8 bytes and extract a field) and shouldn't blindly cross cache lines either.

This can also be handled with double wide register extract/insert
with unaligned load/store and branchy code.
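
To make the quoted LBF description and the wide-field corner case concrete, here is a hedged Python model. It follows the semantics as described in the quoted proposal (carry register holds the bit offset, result is zero-extended and right-justified, carry optionally advances by the field length). The name `lbf`, the `update_carry` flag, the little-endian byte order and the LSB-first bit numbering are assumptions for the sketch, not part of any real ISA definition.

```python
def lbf(buf: bytes, carry: int, width: int, update_carry: bool = True):
    """Model the proposed Load Bit Field: return (field, new_carry).

    carry holds the bit offset of the field from the start of buf.
    Assumptions (not from the thread): little-endian bytes, LSB-first bit
    numbering, 1 <= width <= 64.
    """
    byte_off = carry >> 3    # all but the low three bits: starting byte
    bit_in_byte = carry & 7  # low three bits: starting bit
    # Touch only the bytes the field really occupies, so nothing past the
    # field is read. This reaches 9 bytes for the widest straddling fields.
    nbytes = (bit_in_byte + width + 7) // 8
    chunk = int.from_bytes(buf[byte_off:byte_off + nbytes], "little")
    field = (chunk >> bit_in_byte) & ((1 << width) - 1)  # zero-extended
    return field, (carry + width if update_carry else carry)


# Walk a packed stream of three fields with widths 3, 61 and 2 bits.
value61 = (1 << 61) - 2
packed = (5 | (value61 << 3) | (3 << 64)).to_bytes(9, "little")
carry = 0
for width in (3, 61, 2):
    field, carry = lbf(packed, carry, width)
    print(width, field)
```

Computing `nbytes` from the field itself is one way to respect the constraint that the access should not blindly run past the field into the next cache line or page. A fixed 8-byte read would either over-read short fields or miss the ninth byte of a wide one.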

Re: The type of Mill's belt's slots

<ssgfml$7kb$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=23038&group=comp.arch#23038

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2001-4dd4-df7a-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: The type of Mill's belt's slots
Date: Sat, 22 Jan 2022 08:36:37 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <ssgfml$7kb$1@newsreader4.netcologne.de>
References: <2021Dec24.163843@mips.complang.tuwien.ac.at>
<squhht$79u$1@dont-email.me>
<bb6d49bb-a676-44bd-9a6d-29386d429454n@googlegroups.com>
<sr0vhm$c4u$1@dont-email.me> <sr114i$1qc$1@newsreader4.netcologne.de>
<sr1dca$70e$1@dont-email.me> <kM%AJ.186634$np6.183460@fx46.iad>
<sr2gf6$64u$1@dont-email.me> <7DpBJ.254731$3q9.63673@fx47.iad>
<sr62tb$u2o$1@dont-email.me> <ss4g91$hvs$1@dont-email.me>
<ss4ktr$dvv$1@dont-email.me> <UZCFJ.4610$uP.4480@fx16.iad>
<bSDFJ.268310$1d1.64158@fx99.iad> <ss78bk$hm6$1@dont-email.me>
<jwvo848n4ud.fsf-monnier+comp.arch@gnu.org> <ss7ila$obl$1@dont-email.me>
<jwv7dawmmb7.fsf-monnier+comp.arch@gnu.org> <ss858d$m0a$1@dont-email.me>
<74f730f1-e138-4124-9fbb-21f388eafeb3n@googlegroups.com>
<ss9p4l$a2i$1@dont-email.me>
<34d3c262-c385-4a42-a6c1-a15c34536b28n@googlegroups.com>
<ssero6$kae$1@dont-email.me>
<6cc6b294-7b58-42a8-8751-4fd808ea034fn@googlegroups.com>
<ssfreu$ck$1@dont-email.me>
<666a6ae5-0bac-421e-81c6-8e3d752192e0n@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Sat, 22 Jan 2022 08:36:37 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2001-4dd4-df7a-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2001:4dd4:df7a:0:7285:c2ff:fe6c:992d";
logging-data="7819"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)
 by: Thomas Koenig - Sat, 22 Jan 2022 08:36 UTC

MitchAlsup <MitchAlsup@aol.com> schrieb:
> I seem to be wearing my pessimistic hat today::
><
> On Friday, January 21, 2022 at 8:51:14 PM UTC-6, Ivan Godard wrote:
><
>> Something I learned a long time ago: there is a difference between the
>> first and the second sale of any product. To get the first sale you must
>> give them what they want. To get the second sale you must have given
>> them what they needed.
>>
>> In our business, the first sale generally wants bang for the buck - or
>> LINPACK for power/area (area equals price). The Mill ISA was designed to
>> scale beyond any competing chip in total bang, and do it cheaper than
>> any competing chip. To do that the design had to wildly diverge from
>> conventional practice.
><
> Mill is competing against 3 giants that are willing and able to sell at ½ their
> production costs for years at a time,

Do you have a source for that? And when you say "production cost", what
exactly do you mean? Do you mean that contribution margin 1, 2 or 3
is negative?

(Contribution margin 1 is sales minus variable costs; contribution
margin 2 is sales minus variable and product-specific fixed costs,
which usually include depreciation; contribution margin 3 is
contribution margin 2 minus product-group-specific fixed costs.)
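
To make the distinction concrete, here is a small sketch of the three levels. It assumes the levels are cumulative (each one subtracts a further cost block from the previous one), which is my reading of the definitions above; the struct, the function name and all figures are hypothetical.

```c
/* Multi-level contribution margins; all names here are illustrative. */
struct margins {
    double cm1;  /* sales minus variable costs                  */
    double cm2;  /* cm1 minus product-specific fixed costs      */
    double cm3;  /* cm2 minus product-group-specific fixed costs */
};

struct margins contribution_margins(double sales,
                                    double variable_costs,
                                    double product_fixed,
                                    double group_fixed)
{
    struct margins m;
    m.cm1 = sales - variable_costs;
    m.cm2 = m.cm1 - product_fixed;
    m.cm3 = m.cm2 - group_fixed;
    return m;
}
```

With made-up figures of 100 in sales, 60 variable costs, 50 product-specific fixed costs and 10 group fixed costs, margin 1 is positive (40) while margins 2 and 3 are negative (-10 and -20), so whether this counts as "selling below cost" depends on which level is meant.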

If they are in fact selling below cost, this could be grounds for
an anti-dumping proceeding.

Re: The type of Mill's belt's slots

<0e873a22-bd83-4241-a1ff-9eb5a5aa2e42n@googlegroups.com>


https://www.novabbs.com/devel/article-flat.php?id=23039&group=comp.arch#23039

X-Received: by 2002:ad4:5e87:: with SMTP id jl7mr7092171qvb.130.1642845801018;
Sat, 22 Jan 2022 02:03:21 -0800 (PST)
X-Received: by 2002:a9d:2a82:: with SMTP id e2mr5674083otb.331.1642845800764;
Sat, 22 Jan 2022 02:03:20 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sat, 22 Jan 2022 02:03:20 -0800 (PST)
In-Reply-To: <ssfreu$ck$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fb70:6300:98f5:1880:8e9a:cae;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fb70:6300:98f5:1880:8e9a:cae
References: <2021Dec24.163843@mips.complang.tuwien.ac.at> <sqssff$a9j$1@gioia.aioe.org>
<077afaee-009e-4860-be45-61106126934bn@googlegroups.com> <squhht$79u$1@dont-email.me>
<bb6d49bb-a676-44bd-9a6d-29386d429454n@googlegroups.com> <sr0vhm$c4u$1@dont-email.me>
<sr114i$1qc$1@newsreader4.netcologne.de> <sr1dca$70e$1@dont-email.me>
<kM%AJ.186634$np6.183460@fx46.iad> <sr2gf6$64u$1@dont-email.me>
<7DpBJ.254731$3q9.63673@fx47.iad> <sr62tb$u2o$1@dont-email.me>
<ss4g91$hvs$1@dont-email.me> <ss4ktr$dvv$1@dont-email.me> <UZCFJ.4610$uP.4480@fx16.iad>
<bSDFJ.268310$1d1.64158@fx99.iad> <ss78bk$hm6$1@dont-email.me>
<jwvo848n4ud.fsf-monnier+comp.arch@gnu.org> <ss7ila$obl$1@dont-email.me>
<jwv7dawmmb7.fsf-monnier+comp.arch@gnu.org> <ss858d$m0a$1@dont-email.me>
<74f730f1-e138-4124-9fbb-21f388eafeb3n@googlegroups.com> <ss9p4l$a2i$1@dont-email.me>
<34d3c262-c385-4a42-a6c1-a15c34536b28n@googlegroups.com> <ssero6$kae$1@dont-email.me>
<6cc6b294-7b58-42a8-8751-4fd808ea034fn@googlegroups.com> <ssfreu$ck$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <0e873a22-bd83-4241-a1ff-9eb5a5aa2e42n@googlegroups.com>
Subject: Re: The type of Mill's belt's slots
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Sat, 22 Jan 2022 10:03:21 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 13
 by: Quadibloc - Sat, 22 Jan 2022 10:03 UTC

On Friday, January 21, 2022 at 7:51:14 PM UTC-7, Ivan Godard wrote:

> Something I learned a long time ago: there is a difference between the
> first and the second sale of any product. To get the first sale you must
> give them what they want. To get the second sale you must have given
> them what they needed.

That is very true. And since the first sale is a prerequisite for the second
sale... you have to solve the very difficult problem of giving them *both*
what they want and what they need, and yet at a competitive price.

...which I'm sure you knew, but which is worth _explicitly_ pointing out.

John Savard
