Introducing ForwardCom: An open ISA with variable-length vector registers

<12155892-ec89-4bee-85bb-d1491a5dd20dn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=30617&group=comp.arch#30617
 by: Agner Fog - Mon, 30 Jan 2023 11:03 UTC

I have developed ForwardCom with help and suggestions from many people. This is a project to improve not only the ISA, but the entire ecosystem of software, ABI standard, development tools, etc.

ForwardCom can vectorize array loops in a new way that automatically adjusts to the maximum vector length supported by the CPU. It works in the following way:

A register used as the loop counter is initialized to the array length. The counter is decremented by the vector register length on each iteration, and the loop repeats as long as the counter is positive. The counter register also specifies the vector register length. The vector registers used in the loop have the maximum length as long as the counter exceeds the maximum vector length. There is no loop tail because the vector length is automatically adjusted in the last iteration of the loop to fit the remaining number of array elements.

A special addressing mode can load and store vector registers at an address calculated as the end of the array minus the loop counter.
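
As an editorial illustration (not ForwardCom code), the loop mechanism described above can be modeled in plain C. This is only a sketch: the maximum vector length of 8 elements (MAX_VECTOR_ELEMS) and the function name vector_add are hypothetical, and the inner loop stands in for what would be a single vector instruction.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_VECTOR_ELEMS 8   /* hypothetical hardware maximum, in elements */

/* Computes a[i] = b[i] + c[i] for n elements, handling at most
   MAX_VECTOR_ELEMS elements per pass. Returns the number of passes. */
size_t vector_add(int *a, const int *b, const int *c, size_t n)
{
    assert(a != NULL && b != NULL && c != NULL);

    int *a_end = a + n;
    const int *b_end = b + n;
    const int *c_end = c + n;
    size_t passes = 0;

    /* The counter starts at the array length and counts down. */
    for (size_t counter = n; counter > 0; ) {
        /* Vector length for this pass: the maximum, or whatever remains. */
        size_t vlen = counter < MAX_VECTOR_ELEMS ? counter : MAX_VECTOR_ELEMS;

        /* Address of each operand = end of array minus loop counter. */
        int *pa = a_end - counter;
        const int *pb = b_end - counter;
        const int *pc = c_end - counter;

        /* Stand-in for one vector add of vlen elements. */
        for (size_t lane = 0; lane < vlen; lane++)
            pa[lane] = pb[lane] + pc[lane];

        counter -= vlen;
        passes++;
    }
    return passes;
}
```

With n = 19 and a maximum of 8 elements, the passes cover 8, 8, and 3 elements, so the last pass simply runs with a shorter vector and no separate tail loop is needed.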

This array loop mechanism makes the same software code run optimally on different CPUs with different maximum vector lengths. This is what I call forward compatibility: You don't have to update or recompile the software when you get access to a new CPU with larger vectors.

There is no global vector length register. You can have different vectors with different lengths at the same time. The length is stored in the vector register itself. When you save and restore a vector register, it will only save the part of the register that is actually used.

It is straightforward to call mathematical functions inside a vector loop. A library function takes vector registers as inputs and outputs. The same function can handle scalars and vectors of any length because the length is stored in each register.

ForwardCom is neither RISC nor CISC. It is a combination that gets the best of both worlds. There are few instructions, but many variants of each instruction. The instruction length can be one, two, or three 32-bit words. Decoding is still efficient because the length is specified by just two bits in the first code word.
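
A small C sketch of why a two-bit length field keeps decoding cheap: the decoder only needs to look at one fixed field of the first word to find the next instruction boundary. The field position (bits 31-30) and the mapping of field values to word counts used here are assumptions for illustration only, and the function name instruction_words is made up; the ForwardCom manual is the authority on the actual encoding.

```c
#include <stdint.h>

/* Returns the instruction length in 32-bit words, given the first code word.
   ASSUMPTION: the length field sits in bits 31-30, and values 0 and 1 both
   mean one word, 2 means two words, 3 means three words. */
unsigned instruction_words(uint32_t first_word)
{
    unsigned il = first_word >> 30;   /* extract the two-bit length field */
    return il < 2 ? 1u : il;          /* assumed mapping: 0,1 -> 1; 2 -> 2; 3 -> 3 */
}
```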

The instruction set is orthogonal: Operands can be general purpose registers, vector registers, immediate constants, or memory operands with different addressing modes. The same instruction can handle integers of different sizes and floating point numbers of different precisions.

All instructions are coded according to a flexible and consistent template system. You can use a short single-word form of an instruction if the operands are simple, or a longer instruction format if you need extra registers, large immediate constants, large memory addresses, complex addressing modes, or extra option bits.

The flexible instruction format allows the CPU to do more work per instruction. Various option bits add extra functionality to many instructions, but only as long as this fits into a smooth pipeline structure. Most instructions have a latency of one clock cycle and a throughput of one instruction per clock per execution unit. Multiplication, division, and floating point arithmetic have longer latencies.

The ForwardCom assembly language is simple and immediately intelligible. Adding two integers is as simple as: int r1 = r2 + r3. Branches and loops are written just like in C or Java, for example:
for (int r0 = 0; r0 < 100; r0++) { }

There are many other features, including strong security, efficient memory management, an efficient library system, efficient global register allocation, and more.

See www.forwardcom.info for all the details.

The current status of the ForwardCom project is:

* Instruction set and ABI standards are defined in detail.
* The following binary tools have been developed: high-level assembler, disassembler, linker, library manager, emulator, and debugger. A compiler is not available yet.
* A hardware implementation in an FPGA soft core is available with full documentation. The current version supports integer instructions only.
* Everything is free and open source.

Re: Introducing ForwardCom: An open ISA with variable-length vector registers

<43fbc72b-98c5-4c9e-afeb-6642c4386acbn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=30621&group=comp.arch#30621
 by: MitchAlsup - Mon, 30 Jan 2023 19:07 UTC

On Monday, January 30, 2023 at 5:03:36 AM UTC-6, Agner Fog wrote:
> I have developed ForwardCom with help and suggestions from many people. This is a project to improve not only the ISA, but the entire ecosystem of software, ABI standard, development tools, etc.
>
> ForwardCom can vectorize array loops in a new way that automatically adjusts to the maximum vector length supported by the CPU. It works in the following way:
>
> A register used as loop counter is initialized to the array length. This counter is decremented by the vector register length and repeats as long as it is positive. The counter register also specifies the vector register length. The vector registers used in the loop will have the maximum length as long as the counter exceeds this value. There is no loop tail because the vector length is automatically adjusted in the last iteration of the loop to fit the remaining number of array elements.
>
> A special addressing mode can load and store vector registers at an address calculated as the end of the array minus the loop counter.
>
> This array loop mechanism makes the same software code run optimally on different CPUs with different maximum vector lengths. This is what I call forward compatibility: You don't have to update or recompile the software when you get access to a new CPU with larger vectors.
>
> There is no global vector length register. You can have different vectors with different lengths at the same time. The length is stored in the vector register itself. When you save and restore a vector register, it will only save the part of the register that is actually used.
>
> It is straightforward to call mathematical functions inside a vector loop. A library function takes vector registers as inputs and outputs. The same function can handle scalars and vectors of any length because the length is stored in each register.
>
> ForwardCom is neither RISC nor CISC. It is a combination that gets the best of both worlds. There are few instructions, but many variants of each instruction. The instruction length can be one, two, or three 32-bit words. Decoding is still efficient because the length is specified by just two bits in the first code word.
<
My 66000 is mostly RISC with a few of the better parts of CISC thrown in. I also
think that something between RISC and CISC has the potential to be better than
either. In particular, Brian's LLVM compiler produces My 66000 ASM for
benchmarks (with available source code) that averages 75%-79% of the instruction
count produced by the RISC-V LLVM compiler (with the same optimization settings).
>
> The instruction set is orthogonal: Operands can be general purpose registers, vector registers, immediate constants, or memory operands with different addressing modes. The same instruction can handle integers of different sizes and floating point numbers of different precisions.
<
When I designed My 66000 ISA I developed the instruction set so that every
instruction could have 1 immediate, or 1 displacement, or both; and that
the immediate could be in either operand position. This enables my ISA to encode
<
SLL R7,#1,R9
STD #3.1415926535897932,[Rbase+Rindex<<scale+Displacement]
FMAC R9,R7,#3.2768D5,R9
FMAC R10,R6,R5,#1.7D0
<
as single instructions. There is no lower-power way to deliver an operand into
a calculation than as a constant.
>
> All instructions are coded according to a flexible and consistent template system. You can use a short single-word form of an instruction if the operands are simple, or a longer instruction format if you need extra registers, large immediate constants, large memory addresses, complex addressing modes, or extra option bits.
<
Same.
>
> The flexible instruction format allows the CPU to do more work per instruction.
<
Agreed.
<
> Various option bits add extra functionality to many instructions, but only as long as it fits into a smooth pipeline structure. Most instructions have a latency of one clock cycle and a throughput of one instruction per clock per execution unit.
<
I worked in sign control, too; so::
<
ADD R7,R9,-R14 // is the subtract version
ADD R7,-R9,R14 // is the reverse subtract version
<
> Multiplication, division, and floating point arithmetic have longer latencies.
<
Of course they do.
>
> The ForwardCom assembly language is simple and immediately intelligible. Adding two integers is as simple as: int r1 = r2 + r3. Branches and loops are written just like in C or Java, for example:
> for (int r0 = 0; r0 < 100; r0++) { }
>
> There are many other features, including strong security, efficient memory management, an efficient library system, efficient global register allocation, and more.
>
> See www.forwardcom.info for all the details.
>
> The current status of the ForwardCom project is:
>
> * Instruction set and ABI standards are defined in detail.
> * The following binary tools have been developed: high-level assembler, disassembler, linker, library manager, emulator, and debugger. A compiler is not available yet.
> * A hardware implementation in an FPGA soft core is available with full documentation. The current version supports integer instructions only.
> * Everything is free and open source

Re: Introducing ForwardCom: An open ISA with variable-length vector registers

<96078d6d-75ca-4416-abc7-23b4286fa3ban@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=30628&group=comp.arch#30628
 by: MitchAlsup - Mon, 30 Jan 2023 23:06 UTC

On Monday, January 30, 2023 at 5:03:36 AM UTC-6, Agner Fog wrote:
>
> The ForwardCom assembly language is simple and immediately intelligible. Adding two integers is as simple as: int r1 = r2 + r3. Branches and loops are written just like in C or Java, for example:
> for (int r0 = 0; r0 < 100; r0++) { }
>
This reminds me of CDC 6600 assembly, where essentially all calculations take the form
set x7 = x3 + x2
<
How would one write:: uint64_t R7 = (int32_t)R6 + (uint8_t)R19; ??
And what code would be produced ??
<

Re: Introducing ForwardCom: An open ISA with variable-length vector registers

<20978ba2-65fe-4d62-9762-5acd10f1672dn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=30632&group=comp.arch#30632
 by: Agner Fog - Tue, 31 Jan 2023 07:00 UTC

On 31 January 2023 at 00:06:57 UTC+1, MitchAlsup wrote:
> How would one write:: uint64_t R7 = (int32_t)R6 + (uint8_t)R19; ??
> And what code would be produced ??

The ForwardCom instruction format has one field specifying operand type and precision, so you cannot mix operand sizes in a single instruction. Unused parts of a register are always set to zero, so a 32-bit operation will zero-extend the result to 64 bits in a 64-bit register.
int32 r7 = r6 + r19
will do a 32-bit addition and zero-extend the result to 64 bits in r7. This works if r19 is known to contain an 8-bit unsigned integer. If bits 8-31 of r19 are not known to be zero, you have to cut them off first:
int8 r19 = r19.
There is no implicit sign extension. If you want to sign-extend an integer to a larger size, you must use an extra instruction.
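
As an editorial illustration, the difference between the zero-extending behaviour described above and the C expression in the question can be modeled in plain C. This is a sketch under stated assumptions, not ForwardCom code: the function names are made up, registers are modeled as uint64_t values, and the C version computes the sum in 64 bits to avoid signed overflow.

```c
#include <stdint.h>

/* Models "int8 r19 = r19" followed by "int32 r7 = r6 + r19":
   a 32-bit add whose result is zero-extended into a 64-bit register. */
uint64_t add32_zero_extended(uint64_t r6, uint64_t r19)
{
    uint32_t b = (uint8_t)r19;             /* keep only bits 0-7 of r19 */
    uint32_t sum = (uint32_t)r6 + b;       /* 32-bit wraparound addition */
    return (uint64_t)sum;                  /* upper 32 bits are zero */
}

/* Models the C expression from the question,
   uint64_t R7 = (int32_t)R6 + (uint8_t)R19;
   where a negative 32-bit value is sign-extended on conversion to 64 bits. */
uint64_t add32_c_semantics(uint64_t r6, uint64_t r19)
{
    /* Converting an out-of-range value to int32_t is implementation-defined
       before C23; common compilers wrap modulo 2^32. */
    int64_t a = (int32_t)(uint32_t)r6;
    return (uint64_t)(a + (uint8_t)r19);
}
```

For r6 = 0xFFFFFFFF and r19 = 0, the first function returns 0x00000000FFFFFFFF while the second returns 0xFFFFFFFFFFFFFFFF, which is the case where the extra sign-extension instruction would be needed.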

Re: Introducing ForwardCom: An open ISA with variable-length vector registers

<trdrft$1rac$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=30670&group=comp.arch#30670
 by: Thomas Koenig - Wed, 1 Feb 2023 14:03 UTC

Agner Fog <agfo@dtu.dk> wrote:

> The ForwardCom assembly language is simple and immediately
> intelligible. Adding two integers is as simple as: int r1 =
> r2 + r3. Branches and loops are written just like in C or Java,
> for example:

> for (int r0 = 0; r0 < 100; r0++) { }

Quite interesting, thanks a lot for sharing!

One question: What would be the best way to handle loop-carried
dependencies, let's say a memmove, where operands can overlap,
or C's

void add (const int *a, int *b, int *c, int n)
{

for (int i=0; i<n; i++)
a[i] = b[i] + c[i];
}

in your ISA?
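
To make the concern concrete, here is a C sketch (an editorial illustration, not ForwardCom code) of how a naive chunked loop can diverge from the element-by-element loop when the destination overlaps a source. The chunk size CHUNK and the names add_scalar and add_chunked are hypothetical, and the destination pointer is declared non-const here so that the store compiles.

```c
#include <stddef.h>

#define CHUNK 4   /* hypothetical vector length, in elements */

/* Reference semantics: one element at a time, in order. */
void add_scalar(int *a, const int *b, const int *c, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] + c[i];
}

/* Naive vectorization: compute a whole chunk from b and c first,
   then store the chunk to a. */
void add_chunked(int *a, const int *b, const int *c, size_t n)
{
    for (size_t i = 0; i < n; i += CHUNK) {
        size_t len = n - i < CHUNK ? n - i : CHUNK;
        int tmp[CHUNK];
        for (size_t k = 0; k < len; k++)   /* all loads and adds first */
            tmp[k] = b[i + k] + c[i + k];
        for (size_t k = 0; k < len; k++)   /* then all stores */
            a[i + k] = tmp[k];
    }
}
```

If a points one element past b in the same buffer {1, 2, 3, 4, 5} and c is all zeros, add_scalar propagates the first element and yields {1, 1, 1, 1, 1}, while add_chunked yields {1, 1, 2, 3, 4}. The two agree only when the arrays do not overlap in this way, which is why the overlap case needs an answer from either the compiler or the ISA.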

Re: Introducing ForwardCom: An open ISA with variable-length vector registers

<b0dc17fa-9dca-46fa-9777-fd49ea294904n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=30671&group=comp.arch#30671
 by: Quadibloc - Wed, 1 Feb 2023 14:48 UTC

On Monday, January 30, 2023 at 12:07:31 PM UTC-7, MitchAlsup wrote:

> My 66000 is mostly RISC with a few of the better parts of CISC
> thrown in.

The only thing I'm not happy about in your design is the fact that
some actions, treated as single instructions by the microarchitecture
for efficiency, are specified as multiple instructions in the code.

To me, this appears to massively complicate the decoding of
instructions, making the architecture more difficult to implement.

Of course, though, it doesn't make the processor less efficient,
since instruction decoding can be done well ahead of execution,
so this need not have any impact on latencies.

John Savard

Re: Introducing ForwardCom: An open ISA with variable-length vector registers

<tre3s7$dmsd$2@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=30673&group=comp.arch#30673
 by: Terje Mathisen - Wed, 1 Feb 2023 16:26 UTC

Thomas Koenig wrote:
> Agner Fog <agfo@dtu.dk> schrieb:
>
>> The ForwardCom assembly language is simple and immediately
>> intelligible. Adding two integers is as simple as: int r1 =
>> r2 + r3. Branches and loops are written just like in C or Java,
>> for example:
>
>> for (int r0 = 0; r0 < 100; r0++) { }
>
> Quite interesting, thanks a lot for sharing!
>
> One question: What would be the best way to handle loop-carried
> dependencies (let's say a memmove, where operands can overlap,
> or C's
>
> void add (const int *a, int *b, int *c, int n)
> {
>
> for (int i=0; i<n; i++)
> a[i] = b[i] + c[i];
> }

Does that make sense? Adding a 'const' to the target array?

I could see how noalias would be useful for the compiler but what is
const supposed to do here?

BTW, I would expect ForwardCom to compile this as a size-agnostic SIMD
loop, effectively something like:

while (i < n) {
i += vector_add(a,b,c,n-i);
}
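
The pseudo-code above can be made concrete with a minimal C sketch. The names vector_add and MAX_VL are hypothetical stand-ins: on real hardware the vector length would come from the CPU, not from a constant. The helper handles at most MAX_VL elements per call and returns how many it processed, so the caller needs no separate tail loop. The sketch assumes the three arrays do not overlap.

```c
#include <stddef.h>

/* Hypothetical stand-in for the hardware vector length, in elements. */
enum { MAX_VL = 8 };

/* Emulates one vector instruction whose length is min(remaining, MAX_VL).
   Returns the number of elements actually processed. */
static size_t vector_add(int *a, const int *b, const int *c, size_t remaining)
{
    size_t vl = remaining < MAX_VL ? remaining : MAX_VL;
    for (size_t lane = 0; lane < vl; lane++)
        a[lane] = b[lane] + c[lane];
    return vl;
}

/* Size-agnostic loop: the same code works for any MAX_VL and any n. */
static void add(int *a, const int *b, const int *c, size_t n)
{
    size_t i = 0;
    while (i < n)
        i += vector_add(a + i, b + i, c + i, n - i);
}
```

With n = 19 and MAX_VL = 8 the helper is called three times, processing 8, 8 and 3 elements.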

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: Introducing ForwardCom: An open ISA with variable-length vector registers

<tre6v0$23u8$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=30674&group=comp.arch#30674
 by: Thomas Koenig - Wed, 1 Feb 2023 17:19 UTC

Terje Mathisen <terje.mathisen@tmsw.no> schrieb:
> Thomas Koenig wrote:
>> Agner Fog <agfo@dtu.dk> schrieb:
>>
>>> The ForwardCom assembly language is simple and immediately
>>> intelligible. Adding two integers is as simple as: int r1 =
>>> r2 + r3. Branches and loops are written just like in C or Java,
>>> for example:
>>
>>> for (int r0 = 0; r0 < 100; r0++) { }
>>
>> Quite interesting, thanks a lot for sharing!
>>
>> One question: What would be the best way to handle loop-carried
>> dependencies (let's say a memmove, where operands can overlap,
>> or C's
>>
>> void add (const int *a, int *b, int *c, int n)
>> {
>>
>> for (int i=0; i<n; i++)
>> a[i] = b[i] + c[i];
>> }
>
> Does that make sense? Adding a 'const' to the target array?

Not at all, I had not thought that through, const should be on
the other two arguments.

> I could see how noalias would be useful for the compiler but what is
> const supposed to do here?

> BTW, I would expect ForwardCom to compile this as a size-agnostic SIMD
> loop, effectively something like:
>
> while (i < n) {
> i += vector_add(a,b,c,n-i);
> }

Hm.

Let's assume C semantics without noalias or restrict, and assume that
the function (without the nonsensical const) is called as

int *arr;
add (arr+1, arr, arr+2, n);

Then, the store to a[0] would overwrite arr[1], and in the next
iteration b[1] would be referenced, which would be arr[1], so the
store to a[0] would have to be carried over to the next iteration
of the loop. Classic aliasing (and I made the case especially
obnoxious because not even loop reversal would help).
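
A small C sketch makes the problem visible. It compares the element-by-element loop with a naive "load a whole vector, then store it" version; add_scalar, add_naive_vector and VL are names invented for this illustration, and VL = 4 is an arbitrary vector length.

```c
#include <stddef.h>
#include <string.h>

enum { VL = 4 };  /* hypothetical vector length, in elements */

/* Reference semantics: one element at a time, in order. */
static void add_scalar(int *a, const int *b, const int *c, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] + c[i];
}

/* Naive vectorization: compute up to VL sums from the current memory
   contents, then store them all. This matches add_scalar only if a
   does not overlap b or c. */
static void add_naive_vector(int *a, const int *b, const int *c, size_t n)
{
    for (size_t i = 0; i < n; i += VL) {
        size_t vl = n - i < VL ? n - i : VL;
        int tmp[VL];
        for (size_t lane = 0; lane < vl; lane++)   /* all loads first */
            tmp[lane] = b[i + lane] + c[i + lane];
        memcpy(a + i, tmp, vl * sizeof tmp[0]);    /* then all stores */
    }
}
```

Starting from an array of eight ones and calling each function as (arr + 1, arr, arr + 2, 4), the scalar loop produces 1 2 3 4 5 1 1 1, because every store feeds the next iteration, while the naive vector version produces 1 2 2 2 2 1 1 1.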

Mitch's My66000 would detect the interdependency at runtime and
drop down to scalar mode.

For Fortran, there is a related problem. While arguments cannot
overlap, the language proscribes that the right-hand side of
an assignment is evaluated before the actual assignment.
This leads to generation of temporary arrays for code like

a(n:m) = a(n-1:m-1) + a(n+1:m+1)

or when overlap / non-overlap cannot be detected at compile time.

My question would be: Would this kind of thing be left to the
programmer (or the language definition, or the compiler) or are
there special provisions in the ISA to deal with this kind of
thing one way or another?

Mitch's VVM drops down to scalar if an interdependency is found,
via a hardware check.
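
Nothing in this thread describes how that hardware check works, so the following is only a rough software analogy, not a description of VVM. The function names are invented for the illustration. The code tests at run time whether the destination overlaps either source and falls back to the plain in-order loop if it does; otherwise it uses restrict-qualified pointers so a compiler is free to vectorize.

```c
#include <stddef.h>
#include <stdint.h>

/* True if the n-element int ranges starting at p and q share any bytes. */
static int ranges_overlap(const int *p, const int *q, size_t n)
{
    uintptr_t ps = (uintptr_t)p, qs = (uintptr_t)q;
    uintptr_t bytes = n * sizeof(int);
    return ps < qs + bytes && qs < ps + bytes;
}

/* Returns 1 if the vector-friendly path was taken, 0 for the scalar path. */
static int add_checked(int *a, const int *b, const int *c, size_t n)
{
    if (ranges_overlap(a, b, n) || ranges_overlap(a, c, n)) {
        /* Possible loop-carried dependence: keep strict program order. */
        for (size_t i = 0; i < n; i++)
            a[i] = b[i] + c[i];
        return 0;
    }
    /* No overlap with the destination: iterations are independent. */
    int *restrict ra = a;
    const int *restrict rb = b;
    const int *restrict rc = c;
    for (size_t i = 0; i < n; i++)
        ra[i] = rb[i] + rc[i];
    return 1;
}
```

The check is conservative: an exact in-place update such as a == b would be safe to vectorize but still takes the scalar path here.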

Re: Introducing ForwardCom: An open ISA with variable-length vector registers

<967f0435-af12-452d-9aa9-ddf9f1580760n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=30677&group=comp.arch#30677
 by: MitchAlsup - Wed, 1 Feb 2023 19:05 UTC

On Wednesday, February 1, 2023 at 8:03:45 AM UTC-6, Thomas Koenig wrote:
> Agner Fog <ag...@dtu.dk> schrieb:
> > The ForwardCom assembly language is simple and immediately
> > intelligible. Adding two integers is as simple as: int r1 =
> > r2 + r3. Branches and loops are written just like in C or Java,
> > for example:
>
> > for (int r0 = 0; r0 < 100; r0++) { }
> Quite interesting, thanks a lot for sharing!
>
> One question: What would be the best way to handle loop-carried
> dependencies (let's say a memmove, where operands can overlap,
> or C's
<
MM Rto,Rfrom,Rcnt
>
> void add (const int *a, int *b, int *c, int n)
> {
> for (int i=0; i<n; i++)
> a[i] = b[i] + c[i];
> }
<
I fail to see a loop carried dependence. Nothing loaded or
calculated in the loop is used in the subsequent loop. Do I
have a bad interpretation of "loop carried" ?
>
> in your ISA?

Re: Introducing ForwardCom: An open ISA with variable-length vector registers

<tree26$28u3$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=30679&group=comp.arch#30679
 by: Thomas Koenig - Wed, 1 Feb 2023 19:20 UTC

MitchAlsup <MitchAlsup@aol.com> schrieb:
> On Wednesday, February 1, 2023 at 8:03:45 AM UTC-6, Thomas Koenig wrote:
>> Agner Fog <ag...@dtu.dk> schrieb:
>> > The ForwardCom assembly language is simple and immediately
>> > intelligible. Adding two integers is as simple as: int r1 =
>> > r2 + r3. Branches and loops are written just like in C or Java,
>> > for example:
>>
>> > for (int r0 = 0; r0 < 100; r0++) { }
>> Quite interesting, thanks a lot for sharing!
>>
>> One question: What would be the best way to handle loop-carried
>> dependencies (let's say a memmove, where operands can overlap,
>> or C's
><
> MM Rto,Rfrom,Rcnt
>>
>> void add (const int *a, int *b, int *c, int n)
>> {
>> for (int i=0; i<n; i++)
>> a[i] = b[i] + c[i];
>> }
><
> I fail to see a loop carried dependence. Nothing loaded or
> calculated in the loop is used in the subsequent loop. Do I
> have a bad interpretation of "loop carried" ?

It's not obvious, but (see my reply to Terje) the pointers can
point to overlapping parts of an array. In this case, a
non-obvious loop-carried dependence can be introduced (note the
lack of restrict in the argument list), apart from the
const which is wrong.
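
For comparison, here is a sketch of how the signature might look with const moved to the two inputs and C99 restrict added. The name add_restrict is chosen for this illustration only.

```c
#include <stddef.h>

/* restrict is a promise by the caller: during the call, elements written
   through a are not accessed through b or c. With that promise the
   iterations are independent and the compiler may vectorize freely.
   Calling add_restrict(arr + 1, arr, arr + 2, n) would break the promise
   and is undefined behavior. */
static void add_restrict(int *restrict a, const int *restrict b,
                         const int *restrict c, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] + c[i];
}
```

The qualifier changes nothing at run time by itself; it only removes the obligation to produce the in-order result for overlapping arguments.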

Re: Introducing ForwardCom: An open ISA with variable-length vector registers

<86bkmddvkx.fsf@linuxsc.com>

https://www.novabbs.com/devel/article-flat.php?id=30681&group=comp.arch#30681
 by: Tim Rentsch - Wed, 1 Feb 2023 19:39 UTC

Thomas Koenig <tkoenig@netcologne.de> writes:

[...]

> For Fortran, there is a related problem. While arguments cannot
> overlap, the language proscribes that the right-hand side of
> an assignment is evaluated before the actual assignment.

I expect you mean prescribes, not proscribes.

Re: Introducing ForwardCom: An open ISA with variable-length vector registers

<5c9cef11-d404-44cb-86f1-70174340d082n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=30682&group=comp.arch#30682
 by: MitchAlsup - Wed, 1 Feb 2023 20:25 UTC

On Wednesday, February 1, 2023 at 1:20:41 PM UTC-6, Thomas Koenig wrote:
> MitchAlsup <Mitch...@aol.com> schrieb:
> > On Wednesday, February 1, 2023 at 8:03:45 AM UTC-6, Thomas Koenig wrote:
> >> Agner Fog <ag...@dtu.dk> schrieb:
> >> > The ForwardCom assembly language is simple and immediately
> >> > intelligible. Adding two integers is as simple as: int r1 =
> >> > r2 + r3. Branches and loops are written just like in C or Java,
> >> > for example:
> >>
> >> > for (int r0 = 0; r0 < 100; r0++) { }
> >> Quite interesting, thanks a lot for sharing!
> >>
> >> One question: What would be the best way to handle loop-carried
> >> dependencies (let's say a memmove, where operands can overlap,
> >> or C's
> ><
> > MM Rto,Rfrom,Rcnt
> >>
> >> void add (const int *a, int *b, int *c, int n)
> >> {
> >> for (int i=0; i<n; i++)
> >> a[i] = b[i] + c[i];
> >> }
> ><
> > I fail to see a loop carried dependence. Nothing loaded or
> > calculated in the loop is used in the subsequent loop. Do I
> > have a bad interpretation of "loop carried" ?
<
> It's not obvious, but (see my reply to Terje) the pointers can
> point to overlapping parts of an array. In this case, a
> non-obvious loop-carried dependence can be introduced (note the
> lack of restrict in the argument list), apart from the
> const which is wrong.
<
I can envision an implementation where there are no loop
carried dependences.
>
But if your point was to illustrate that C has bad semantics
wrt array aliasing, I have to agree. The Fortran version has
no loop carried dependence.

Re: Introducing ForwardCom: An open ISA with variable-length vector registers

<6c4b9a52-57af-4b4f-a00b-457ddf825842n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=30683&group=comp.arch#30683
 by: Quadibloc - Wed, 1 Feb 2023 21:19 UTC

On Monday, January 30, 2023 at 4:03:36 AM UTC-7, Agner Fog wrote:

> ForwardCom can vectorize array loops in a new way that automatically
> adjusts to the maximum vector length supported by the CPU. It works
> in the following way:

> A register used as loop counter is initialized to the array length. This
> counter is decremented by the vector register length and repeats as
> long as it is positive. The counter register also specifies the vector
> register length. The vector registers used in the loop will have the
> maximum length as long as the counter exceeds this value. There
> is no loop tail because the vector length is automatically adjusted
> in the last iteration of the loop to fit the remaining number of array
> elements.
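
The counting scheme described in the quoted text can be emulated in plain C. This is only a sketch of the idea, not ForwardCom code: MAX_VL and add_countdown are hypothetical names, and the inner loop stands in for a single vector instruction.

```c
#include <stddef.h>

enum { MAX_VL = 8 };  /* hypothetical maximum vector length, in elements */

static void add_countdown(int *a, const int *b, const int *c, size_t n)
{
    /* remaining plays the role of the loop counter register: it starts at
       the array length, drops by the vector length each time around, and
       the loop repeats while it is positive. */
    for (ptrdiff_t remaining = (ptrdiff_t)n; remaining > 0; remaining -= MAX_VL) {
        /* The counter also selects the vector length, so the last
           iteration shrinks to fit and no tail loop is needed. */
        size_t vl = remaining < MAX_VL ? (size_t)remaining : MAX_VL;
        size_t base = n - (size_t)remaining;   /* first element of this chunk */
        for (size_t lane = 0; lane < vl; lane++)
            a[base + lane] = b[base + lane] + c[base + lane];
    }
}
```

For n = 19 the counter takes the values 19, 11 and 3, giving vector lengths of 8, 8 and 3.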

This isn't completely new. The IBM System/370 at one point added a
vector feature which was largely modelled on that of the Cray-1. But
unlike the Cray-1, it was designed so that one model of the 370 might
have vector registers with 64 elements, and another one might have
vector registers with 256 elements.

To allow code to be written that was independent of the hardware
vector size, a 16-bit field in the vector control register was used as
the loop counter.

This is not the same as the vector feature recently added to the
current IBM z/Architecture mainframes, which is a fixed-width SIMD
facility comparable to AVX on x86 microprocessors.

John Savard

Re: Introducing ForwardCom: An open ISA with variable-length vector registers

<trenlr$2g2p$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=30684&group=comp.arch#30684
 by: Thomas Koenig - Wed, 1 Feb 2023 22:04 UTC

MitchAlsup <MitchAlsup@aol.com> schrieb:
> On Wednesday, February 1, 2023 at 1:20:41 PM UTC-6, Thomas Koenig wrote:
>> MitchAlsup <Mitch...@aol.com> schrieb:
>> > On Wednesday, February 1, 2023 at 8:03:45 AM UTC-6, Thomas Koenig wrote:
>> >> Agner Fog <ag...@dtu.dk> schrieb:
>> >> > The ForwardCom assembly language is simple and immediately
>> >> > intelligible. Adding two integers is as simple as: int r1 =
>> >> > r2 + r3. Branches and loops are written just like in C or Java,
>> >> > for example:
>> >>
>> >> > for (int r0 = 0; r0 < 100; r0++) { }
>> >> Quite interesting, thanks a lot for sharing!
>> >>
>> >> One question: What would be the best way to handle loop-carried
>> >> dependencies (let's say a memmove, where operands can overlap,
>> >> or C's
>> ><
>> > MM Rto,Rfrom,Rcnt
>> >>
>> >> void add (const int *a, int *b, int *c, int n)
>> >> {
>> >> for (int i=0; i<n; i++)
>> >> a[i] = b[i] + c[i];
>> >> }
>> ><
>> > I fail to see a loop carried dependence. Nothing loaded or
>> > calculated in the loop is used in the subsequent loop. Do I
>> > have a bad interpretation of "loop carried" ?
><
>> It's not obvious, but (see my reply to Terje) the pointers can
>> point to overlapping parts of an array. In this case, a
>> non-obvious loop-carried dependence can be introduced (note the
>> lack of restrict in the argument list), apart from the
>> const which is wrong.
><
> I can envision an implementation where there are no loop
> carried dependences.
>>
> But if your point was to illustrate that C has bad semantics
> wrt array aliasing, I have to agree. The Fortran version has
> no loop carried dependence.

The point I was trying to make (but which probably didn't come
across very well) is that loop-carried dependencies which are
allowed by the language can carry large performance penalties,
and that they impede any sort of vectorization - with wrong code
or with performance degradation, or both.

And Fortran has its own issues in that respect, not with arguments,
but with array expressions. Conceptually, everything on the
right-hand side of an assignment is evaluated before assignment
to the left-hand side. This is no problem if there are no
dependencies between the two sides. If there are simple
dependencies, like

a = sin(a)

that is not a problem, because the compiler can easily prove
that this can be scalarized into a simple loop. For
something like

a (1:n) = a(0:n-1)

(array slice notation, where after the assignment a(1) has the
value that a(0) had previously, a(2) previously a(1), etc.) this
is conceptually identical to

tmp(1:n) = a(0:n-1)
a(1:n) = tmp

but the compiler cannot use a forward loop. It could either
actually create a temporary, or reverse the loop.

And when in doubt, the compiler has to err on the side of correct code,
rather than speed.
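
The three options for the shifted assignment a(1:n) = a(0:n-1) can be written out in C as a sketch. The function names are invented for this illustration, and the array is assumed to hold n+1 elements, a[0] through a[n].

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Wrong for this assignment: each read sees a value the previous
   iteration has already overwritten, so a[0] is smeared over the array. */
static void shift_forward(int *a, size_t n)
{
    for (size_t i = 1; i <= n; i++)
        a[i] = a[i - 1];
}

/* Literal reading of the rule: evaluate the whole right-hand side into
   a temporary, then assign. Returns 0 on success, -1 if allocation fails. */
static int shift_with_temp(int *a, size_t n)
{
    if (n == 0)
        return 0;
    int *tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL)
        return -1;
    memcpy(tmp, a, n * sizeof *tmp);      /* tmp(1:n) = a(0:n-1) */
    memcpy(a + 1, tmp, n * sizeof *tmp);  /* a(1:n)   = tmp      */
    free(tmp);
    return 0;
}

/* Same result without a temporary: walk from the top down so every
   element is read before it is overwritten. */
static void shift_reverse(int *a, size_t n)
{
    for (size_t i = n; i >= 1; i--)
        a[i] = a[i - 1];
}
```

Starting from 10 20 30 40 with n = 3, the forward loop yields 10 10 10 10, while the temporary and the reversed loop both yield 10 10 20 30. In C, memmove(a + 1, a, n * sizeof *a) gives the same correct result.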

Re: Introducing ForwardCom: An open ISA with variable-length vector registers

<treorr$hb6f$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=30685&group=comp.arch#30685
 by: Terje Mathisen - Wed, 1 Feb 2023 22:24 UTC

Thomas Koenig wrote:
> MitchAlsup <MitchAlsup@aol.com> schrieb:
>> On Wednesday, February 1, 2023 at 8:03:45 AM UTC-6, Thomas Koenig wrote:
>>> Agner Fog <ag...@dtu.dk> schrieb:
>>>> The ForwardCom assembly language is simple and immediately
>>>> intelligible. Adding two integers is as simple as: int r1 =
>>>> r2 + r3. Branches and loops are written just like in C or Java,
>>>> for example:
>>>
>>>> for (int r0 = 0; r0 < 100; r0++) { }
>>> Quite interesting, thanks a lot for sharing!
>>>
>>> One question: What would be the best way to handle loop-carried
>>> dependencies (let's say a memmove, where operands can overlap,
>>> or C's
>> <
>> MM Rto,Rfrom,Rcnt
>>>
>>> void add (const int *a, int *b, int *c, int n)
>>> {
>>> for (int i=0; i<n; i++)
>>> a[i] = b[i] + c[i];
>>> }
>> <
>> I fail to see a loop carried dependence. Nothing loaded or
>> calculated in the loop is used in the subsequent loop. Do I
>> have a bad interpretation of "loop carried" ?
>
> It's not obvious, but (see my reply to Terje) the pointers can
> point to overlapping parts of an array. In this case, a
> non-obvious loop-carried dependence can be introduced (note the
> lack of restrict in the argument list), apart from the
> const which is wrong.

IMHO, this is exactly like the IBM RLL-decoding using intentionally
overlapping moves (MVC?): You really, really should not do it, and if
you do, then you deserve whatever ills befall you.

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: Introducing ForwardCom: An open ISA with variable-length vector registers

<bc5ec264-5711-4793-9228-b58b5eb2db11n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=30686&group=comp.arch#30686
 by: MitchAlsup - Wed, 1 Feb 2023 23:40 UTC

On Wednesday, February 1, 2023 at 4:04:46 PM UTC-6, Thomas Koenig wrote:
> MitchAlsup <Mitch...@aol.com> schrieb:
>
> > But if your point was to illustrate that C has bad semantics
> > wrt array aliasing, I have to agree. The Fortran version has
> > no loop carried dependence.
<
> The point I was trying to make (but which probably didn't come
> across very well) is that loop-carried dependencies which are
> allowed by the language can carry large performance penalties,
> and that they impede any sort of vectorization - with wrong code
> or with performance degradation, or both.
<
A subtle point you may have missed with VVM is that one CAN
vectorize such loops, and the ones with such a dependency simply
run slower (at the speed the dependency resolves) rather than the
speed the pipeline is capable of running if the dependence were
not present. In both cases, the correct results are obtained.
<
Loop carried dependencies do not prevent VVM vectorization.

Re: Introducing ForwardCom: An open ISA with variable-length vector registers

<trflaf$32et$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=30687&group=comp.arch#30687
 by: Thomas Koenig - Thu, 2 Feb 2023 06:30 UTC

Terje Mathisen <terje.mathisen@tmsw.no> schrieb:
> Thomas Koenig wrote:
>> MitchAlsup <MitchAlsup@aol.com> schrieb:
>>> On Wednesday, February 1, 2023 at 8:03:45 AM UTC-6, Thomas Koenig wrote:
>>>> Agner Fog <ag...@dtu.dk> schrieb:
>>>>> The ForwardCom assembly language is simple and immediately
>>>>> intelligible. Adding two integers is as simple as: int r1 =
>>>>> r2 + r3. Branches and loops are written just like in C or Java,
>>>>> for example:
>>>>
>>>>> for (int r0 = 0; r0 < 100; r0++) { }
>>>> Quite interesting, thanks a lot for sharing!
>>>>
>>>> One question: What would be the best way to handle loop-carried
>>>> dependencies (let's say a memmove, where operands can overlap,
>>>> or C's
>>> <
>>> MM Rto,Rfrom,Rcnt
>>>>
>>>> void add (const int *a, int *b, int *c, int n)
>>>> {
>>>> for (int i=0; i<n; i++)
>>>> a[i] = b[i] + c[i];
>>>> }
>>> <
>>> I fail to see a loop carried dependence. Nothing loaded or
>>> calculated in the loop is used in the subsequent loop. Do I
>>> have a bad interpretation of "loop carried" ?
>>
>> It's not obvious, but (see my reply to Terje) the pointers can
>> point to overlapping parts of an array. In this case, a
>> non-obvious loop-carried dependence can be introduced (note the
>> lack of restrict in the argument list), apart from the
>> const which is wrong.
>
> IMHO, this is exactly like the IBM RLL-decoding using intentionally
> overlapping moves (MVC?): You really, really should not do it, and if
> you do, then you deserve whatever ills befall you.

I concur. The question is who is "you" in your sentence above...

Compiler writers have to get it right, according to the language
specification, so wrong code is not an option. If the language
specification results in pessimized code to cater for a case that
hardly anybody uses intentionally, well... what is a compiler
writer to do?

Re: Introducing ForwardCom: An open ISA with variable-length vector registers

<trfm67$32et$2@newsreader4.netcologne.de>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=30688&group=comp.arch#30688

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2001-4dd6-2711-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: Introducing ForwardCom: An open ISA with variable-length vector
registers
Date: Thu, 2 Feb 2023 06:45:27 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <trfm67$32et$2@newsreader4.netcologne.de>
References: <12155892-ec89-4bee-85bb-d1491a5dd20dn@googlegroups.com>
<trdrft$1rac$1@newsreader4.netcologne.de>
<967f0435-af12-452d-9aa9-ddf9f1580760n@googlegroups.com>
<tree26$28u3$1@newsreader4.netcologne.de>
<5c9cef11-d404-44cb-86f1-70174340d082n@googlegroups.com>
<trenlr$2g2p$1@newsreader4.netcologne.de>
<bc5ec264-5711-4793-9228-b58b5eb2db11n@googlegroups.com>
Injection-Date: Thu, 2 Feb 2023 06:45:27 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2001-4dd6-2711-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2001:4dd6:2711:0:7285:c2ff:fe6c:992d";
logging-data="100829"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)
 by: Thomas Koenig - Thu, 2 Feb 2023 06:45 UTC

MitchAlsup <MitchAlsup@aol.com> schrieb:
> On Wednesday, February 1, 2023 at 4:04:46 PM UTC-6, Thomas Koenig wrote:
>> MitchAlsup <Mitch...@aol.com> schrieb:
>>
>> > But if your point was to illustrate that C has bad semantics
>> > wrt array aliasing, I have to agree. The Fortran version has
>> > no loop carried dependence.
><
>> The point I was trying to make (but which probably didn't come
>> across very well) is that loop-carried dependencies which are
>> allowed by the language can carry large performance penalties,
>> and that they impede any sort of vectorization - with wrong code
>> or with performance degradation, or both.
><
> A subtle point you may have missed with VVM is that one CAN
> vectorize such loops, and the ones with such a dependency simply
> run slower (at the speed the dependency resolves) rather than the
> speed the pipeline is capable of running if the dependence were
> not present. In both cases, the correct results are obtained.

I know that, but didn't explicitly mention it :-)
><
> Loop carried dependencies do not prevent VVM vectorization.

The _really_ good thing is that it only happens when detected at
runtime, so there is no performance penalty when it is not needed.

But if loop carried dependencies occur, they will have a large
performance impact.

Also, VVM faithfully reproduces a special ordering at the loop,
the one that it was written in. This is obviously ideal for
languages which write out explicit loops, like C, but does not
offer improvement for languages (like Fortran :-) which operate
more on vectors or arrays.

Consider the Fortran statement

a(n:m) = a(k:k+m-n) + 2.0

which can be efficiently implemented with either a forward or a backward
loop, depending on the sign of n - k. The compiler has two choices:
Either copy the intermediate result to a temporary, or perform a
run-time test and switch the loop ordering, depending on the result.

Depending on the length of the loop, there is then always the question
of which is faster, because the run-time test adds to the overhead of
the loop startup.
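
For illustration only, the two choices can be sketched in C. This is not compiler output; the function names, the 0-based indices and the `len` parameter (standing for m-n+1 elements) are assumptions made for the sketch.

```c
#include <stddef.h>
#include <stdlib.h>

/* Choice 1: evaluate the whole right-hand side into a temporary first.
   Returns 0 on success, -1 if the temporary cannot be allocated. */
int shifted_add2_temp(float *a, size_t n, size_t k, size_t len)
{
    float *tmp = malloc(len * sizeof *tmp);
    if (tmp == NULL && len > 0)
        return -1;
    for (size_t i = 0; i < len; i++)
        tmp[i] = a[k + i] + 2.0f;
    for (size_t i = 0; i < len; i++)
        a[n + i] = tmp[i];
    free(tmp);
    return 0;
}

/* Choice 2: one run-time test picks a loop direction that never reads
   an element after it has been overwritten. */
void shifted_add2_directed(float *a, size_t n, size_t k, size_t len)
{
    if (n <= k) {
        /* Destination starts at or below the source: go forward. */
        for (size_t i = 0; i < len; i++)
            a[n + i] = a[k + i] + 2.0f;
    } else {
        /* Destination starts above the source: go backward. */
        for (size_t i = len; i-- > 0; )
            a[n + i] = a[k + i] + 2.0f;
    }
}
```

Both variants give the same array contents; the first pays for the extra copy, the second for the compare-and-branch at loop startup.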

This is one case where a genuine vector ISA could offer advantages,
it could do the loop reversal check in hardware, and fast. I simply
don't know if ForwardCom can do that, or not.

Re: Introducing ForwardCom: An open ISA with variable-length vector registers

<dd10199e-9c5a-41bb-9529-4d2df4a45e97n@googlegroups.com>


https://www.novabbs.com/devel/article-flat.php?id=30689&group=comp.arch#30689

X-Received: by 2002:ac8:5bca:0:b0:3b8:6ca5:6df4 with SMTP id b10-20020ac85bca000000b003b86ca56df4mr616202qtb.18.1675321030174;
Wed, 01 Feb 2023 22:57:10 -0800 (PST)
X-Received: by 2002:a05:6870:b28f:b0:163:994c:9495 with SMTP id
c15-20020a056870b28f00b00163994c9495mr227979oao.79.1675321029816; Wed, 01 Feb
2023 22:57:09 -0800 (PST)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!feed1.usenet.blueworldhosting.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Wed, 1 Feb 2023 22:57:09 -0800 (PST)
In-Reply-To: <trfm67$32et$2@newsreader4.netcologne.de>
Injection-Info: google-groups.googlegroups.com; posting-host=128.76.247.189; posting-account=tYjOgQoAAACRs74arwcusKjVVQt_fFMX
NNTP-Posting-Host: 128.76.247.189
References: <12155892-ec89-4bee-85bb-d1491a5dd20dn@googlegroups.com>
<trdrft$1rac$1@newsreader4.netcologne.de> <967f0435-af12-452d-9aa9-ddf9f1580760n@googlegroups.com>
<tree26$28u3$1@newsreader4.netcologne.de> <5c9cef11-d404-44cb-86f1-70174340d082n@googlegroups.com>
<trenlr$2g2p$1@newsreader4.netcologne.de> <bc5ec264-5711-4793-9228-b58b5eb2db11n@googlegroups.com>
<trfm67$32et$2@newsreader4.netcologne.de>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <dd10199e-9c5a-41bb-9529-4d2df4a45e97n@googlegroups.com>
Subject: Re: Introducing ForwardCom: An open ISA with variable-length vector registers
From: agf...@dtu.dk (Agner Fog)
Injection-Date: Thu, 02 Feb 2023 06:57:10 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 3655
 by: Agner Fog - Thu, 2 Feb 2023 06:57 UTC

Thomas Koenig wrote:
> What would be the best way to handle loop-carried dependencies (let's say a memmove, where operands can overlap)

The memmove function allows overlap, unlike memcpy. The memmove library function uses a backwards loop when necessary. The ForwardCom library function will do the same, of course. A backwards loop is not as elegant as a forward loop, but still possible.
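
A minimal byte-wise sketch of that idea in C, for illustration. It is not the ForwardCom library code or any real libc implementation, and the name `sketch_memmove` is made up here.

```c
#include <stddef.h>
#include <stdint.h>

/* Copies n bytes from src to dst, allowing the two regions to overlap. */
void *sketch_memmove(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    if ((uintptr_t)d <= (uintptr_t)s) {
        /* Destination is not above the source: forward loop is safe. */
        for (size_t i = 0; i < n; i++)
            d[i] = s[i];
    } else {
        /* Destination is above the source: copy from the end backwards
           so that no source byte is overwritten before it is read. */
        for (size_t i = n; i-- > 0; )
            d[i] = s[i];
    }
    return dst;
}
```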

An optimizing compiler will not vectorize a loop automatically unless it can prove that there is no aliasing problem. The programmer may help by explicit vectorization, or a __restrict keyword, or a no-aliasing compiler option. ForwardCom does not differ from other SIMD processors in this respect.
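
As a sketch of the programmer's side of this, here is the loop from earlier in the thread with and without the C99 `restrict` qualifier (the standard spelling of the `__restrict` extension). The function names are made up for the example, and `const` is placed on the inputs rather than on the output.

```c
/* Pointers may overlap: the compiler has to assume that a store to a[i]
   can change a later b[j] or c[j] unless it can prove otherwise. */
void add_may_alias(int *a, const int *b, const int *c, int n)
{
    for (int i = 0; i < n; i++)
        a[i] = b[i] + c[i];
}

/* restrict is a promise from the programmer that a, b and c do not
   overlap during the call. Calling this with overlapping arrays is
   undefined behaviour. */
void add_no_alias(int *restrict a, const int *restrict b,
                  const int *restrict c, int n)
{
    for (int i = 0; i < n; i++)
        a[i] = b[i] + c[i];
}
```

A call such as `add_may_alias(x + 1, x, ones, 4)` is legal C and makes every iteration depend on the previous store, which is the non-obvious loop-carried dependence discussed above.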

Quadibloc wrote:
> some actions, treated as single instructions by the microarchitecture for efficiency, are specified as multiple instructions in the code.

The assembler understands the high-level construct:
for (int r1=0; r1<n; r1++) { }
The loop prolog (r1=0) is a single instruction (or two instructions if n is a register and it is not certain that n > 0).
The loop epilog (r1++, jump if r1<n) is a single instruction. It is the high-level assembler that does the translation. The microprocessor sees the epilog as a single instruction and treats it like any other combined ALU-branch instruction, so it has no problem decoding this instruction.
The compiler is free to pass the 'for' loop as a high-level expression to the assembler or translate it to low-level instruction mnemonics.
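
The shape of that translation can be shown as a C analogy. This is not ForwardCom assembler output; the function names and the trip counter are only there to make the sketch testable.

```c
/* The high-level form, as the programmer writes it. */
long trips_for(long n)
{
    long trips = 0;
    for (long r1 = 0; r1 < n; r1++)
        trips++;                 /* loop body */
    return trips;
}

/* The lowered form: a prolog, the body, and a combined epilog. */
long trips_lowered(long n)
{
    long trips = 0;
    long r1 = 0;                 /* prolog: initialise the counter */
    if (r1 < n) {                /* extra prolog check, needed only when
                                    n > 0 is not known in advance */
        do {
            trips++;             /* loop body */
        } while (++r1 < n);      /* epilog: increment, compare, branch */
    }
    return trips;
}
```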

Thomas Koenig wrote:
> This is one case where a genuine vector ISA could offer advantages, it could do the loop reversal check in hardware, and fast. I simply don't know if ForwardCom can do that, or not.

It cannot. ForwardCom treats the code as it is written. A loop reversal check in hardware would require that the entire loop is decoded and stored in a micro-op cache and all addresses calculated first. What if the loop is big, with branches, function calls, nested loops, etc.? My66000 cannot do this either. It can only handle small loops.

Re: Introducing ForwardCom: An open ISA with variable-length vector registers

<b9b97d27-c100-438a-8fe4-7ba285a85c5cn@googlegroups.com>


https://www.novabbs.com/devel/article-flat.php?id=30690&group=comp.arch#30690

X-Received: by 2002:a05:6214:4a47:b0:538:52e3:44ad with SMTP id ph7-20020a0562144a4700b0053852e344admr370179qvb.6.1675321322695;
Wed, 01 Feb 2023 23:02:02 -0800 (PST)
X-Received: by 2002:aca:e188:0:b0:37a:c065:520b with SMTP id
y130-20020acae188000000b0037ac065520bmr129696oig.298.1675321322391; Wed, 01
Feb 2023 23:02:02 -0800 (PST)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!feed1.usenet.blueworldhosting.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Wed, 1 Feb 2023 23:02:02 -0800 (PST)
In-Reply-To: <dd10199e-9c5a-41bb-9529-4d2df4a45e97n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=128.76.247.189; posting-account=tYjOgQoAAACRs74arwcusKjVVQt_fFMX
NNTP-Posting-Host: 128.76.247.189
References: <12155892-ec89-4bee-85bb-d1491a5dd20dn@googlegroups.com>
<trdrft$1rac$1@newsreader4.netcologne.de> <967f0435-af12-452d-9aa9-ddf9f1580760n@googlegroups.com>
<tree26$28u3$1@newsreader4.netcologne.de> <5c9cef11-d404-44cb-86f1-70174340d082n@googlegroups.com>
<trenlr$2g2p$1@newsreader4.netcologne.de> <bc5ec264-5711-4793-9228-b58b5eb2db11n@googlegroups.com>
<trfm67$32et$2@newsreader4.netcologne.de> <dd10199e-9c5a-41bb-9529-4d2df4a45e97n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <b9b97d27-c100-438a-8fe4-7ba285a85c5cn@googlegroups.com>
Subject: Re: Introducing ForwardCom: An open ISA with variable-length vector registers
From: agf...@dtu.dk (Agner Fog)
Injection-Date: Thu, 02 Feb 2023 07:02:02 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 2331
 by: Agner Fog - Thu, 2 Feb 2023 07:02 UTC

> Thomas Koenig wrote:
> This is one case where a genuine vector ISA could offer advantages, it could do the loop reversal check in hardware, and fast. I simply don't know if ForwardCom can do that, or not.
> It cannot. ForwardCom treats the code as it is written. A loop reversal check in hardware would require that the entire loop is decoded and stored in a micro-op cache and all addresses calculated first. What if the loop is big with branches and function calls and nested loops etc. My66000 cannot do this either. It can only handle small loops.

What I meant was, an out-of-order My66000 might do the equivalent of loop reversal in small, simple cases, but not for big or complicated loops.

Re: Introducing ForwardCom: An open ISA with variable-length vector registers

<trgrn9$3qv5$1@newsreader4.netcologne.de>


https://www.novabbs.com/devel/article-flat.php?id=30692&group=comp.arch#30692

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2001-4dd6-2711-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: Introducing ForwardCom: An open ISA with variable-length vector
registers
Date: Thu, 2 Feb 2023 17:26:01 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <trgrn9$3qv5$1@newsreader4.netcologne.de>
References: <12155892-ec89-4bee-85bb-d1491a5dd20dn@googlegroups.com>
<trdrft$1rac$1@newsreader4.netcologne.de>
<967f0435-af12-452d-9aa9-ddf9f1580760n@googlegroups.com>
<tree26$28u3$1@newsreader4.netcologne.de>
<5c9cef11-d404-44cb-86f1-70174340d082n@googlegroups.com>
<trenlr$2g2p$1@newsreader4.netcologne.de>
<bc5ec264-5711-4793-9228-b58b5eb2db11n@googlegroups.com>
<trfm67$32et$2@newsreader4.netcologne.de>
Injection-Date: Thu, 2 Feb 2023 17:26:01 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2001-4dd6-2711-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2001:4dd6:2711:0:7285:c2ff:fe6c:992d";
logging-data="125925"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)
 by: Thomas Koenig - Thu, 2 Feb 2023 17:26 UTC

Thomas Koenig <tkoenig@netcologne.de> schrieb:

> Also, VVM faithfully reproduces a special ordering at the loop,
> the one that it was written in. This is obviously ideal for
> languages which write out explicit loops, like C, but does not
> offer improvement for languages (like Fortran :-) which operate
> more on vectors or arrays.

Actually, that should read

"does not offer additional improvement". Having VVM for Fortran would
already speed up a lot of code.

Re: Introducing ForwardCom: An open ISA with variable-length vector registers

<trgth4$104kd$1@dont-email.me>


https://www.novabbs.com/devel/article-flat.php?id=30693&group=comp.arch#30693

Path: i2pn2.org!i2pn.org!eternal-september.org!reader01.eternal-september.org!.POSTED!not-for-mail
From: sfu...@alumni.cmu.edu.invalid (Stephen Fuld)
Newsgroups: comp.arch
Subject: Re: Introducing ForwardCom: An open ISA with variable-length vector
registers
Date: Thu, 2 Feb 2023 09:56:50 -0800
Organization: A noiseless patient Spider
Lines: 35
Message-ID: <trgth4$104kd$1@dont-email.me>
References: <12155892-ec89-4bee-85bb-d1491a5dd20dn@googlegroups.com>
<trdrft$1rac$1@newsreader4.netcologne.de>
<967f0435-af12-452d-9aa9-ddf9f1580760n@googlegroups.com>
<tree26$28u3$1@newsreader4.netcologne.de>
<5c9cef11-d404-44cb-86f1-70174340d082n@googlegroups.com>
<trenlr$2g2p$1@newsreader4.netcologne.de>
<bc5ec264-5711-4793-9228-b58b5eb2db11n@googlegroups.com>
<trfm67$32et$2@newsreader4.netcologne.de>
<trgrn9$3qv5$1@newsreader4.netcologne.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Thu, 2 Feb 2023 17:56:52 -0000 (UTC)
Injection-Info: reader01.eternal-september.org; posting-host="5e8c85e0f1726ac2a3fd3f0a65e6503c";
logging-data="1053325"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+pm6XicQMKYdo0ZVjd+3wM8hjPrSn3SwM="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101
Thunderbird/102.7.0
Cancel-Lock: sha1:nUicopfa/HcK7pc0Yh+7E6QPmxs=
In-Reply-To: <trgrn9$3qv5$1@newsreader4.netcologne.de>
Content-Language: en-US
 by: Stephen Fuld - Thu, 2 Feb 2023 17:56 UTC

On 2/2/2023 9:26 AM, Thomas Koenig wrote:
> Thomas Koenig <tkoenig@netcologne.de> schrieb:
>
>> Also, VVM faithfully reproduces a special ordering at the loop,
>> the one that it was written in. This is obviously ideal for
>> languages which write out explicit loops, like C, but does not
>> offer improvement for languages (like Fortran :-) which operate
>> more on vectors or arrays.
>
> Actually, that should read
>
> "does not offer additional improvement". Having VVM for Fortran would
> already speed up a lot of code.

I get what you are saying, but I am not sure it is true. Specifically,
the CPU can do only one add at a time (due to the dependency), so
it doesn't offer the advantage of operating on multiple elements
simultaneously, as it does on loops without a loop-carried dependency.

But it may be, and Mitch is the expert here, that the other advantages
of VVM, specifically, the large "prefill" buffers (I'm sorry, I forget
what they are officially called) offer an advantage over pure scalar code.

BTW, whether you call what VVM does for this loop "vectorizing" is, I
think, a matter of one's definition of vectorizing, specifically whether
the definition must include operating on more than one element
simultaneously or not.

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Re: Introducing ForwardCom: An open ISA with variable-length vector registers

<f1073f9d-2853-4e41-84ba-543869aaad02n@googlegroups.com>


https://www.novabbs.com/devel/article-flat.php?id=30694&group=comp.arch#30694

X-Received: by 2002:ac8:7f49:0:b0:3a8:15e1:757 with SMTP id g9-20020ac87f49000000b003a815e10757mr677277qtk.194.1675371146218;
Thu, 02 Feb 2023 12:52:26 -0800 (PST)
X-Received: by 2002:a05:6870:819b:b0:169:df52:c87a with SMTP id
k27-20020a056870819b00b00169df52c87amr335273oae.186.1675371145963; Thu, 02
Feb 2023 12:52:25 -0800 (PST)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!feed1.usenet.blueworldhosting.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Thu, 2 Feb 2023 12:52:25 -0800 (PST)
In-Reply-To: <trgth4$104kd$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:6c8f:6452:e55b:f171;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:6c8f:6452:e55b:f171
References: <12155892-ec89-4bee-85bb-d1491a5dd20dn@googlegroups.com>
<trdrft$1rac$1@newsreader4.netcologne.de> <967f0435-af12-452d-9aa9-ddf9f1580760n@googlegroups.com>
<tree26$28u3$1@newsreader4.netcologne.de> <5c9cef11-d404-44cb-86f1-70174340d082n@googlegroups.com>
<trenlr$2g2p$1@newsreader4.netcologne.de> <bc5ec264-5711-4793-9228-b58b5eb2db11n@googlegroups.com>
<trfm67$32et$2@newsreader4.netcologne.de> <trgrn9$3qv5$1@newsreader4.netcologne.de>
<trgth4$104kd$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <f1073f9d-2853-4e41-84ba-543869aaad02n@googlegroups.com>
Subject: Re: Introducing ForwardCom: An open ISA with variable-length vector registers
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Thu, 02 Feb 2023 20:52:26 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 5965
 by: MitchAlsup - Thu, 2 Feb 2023 20:52 UTC

On Thursday, February 2, 2023 at 11:56:55 AM UTC-6, Stephen Fuld wrote:
> On 2/2/2023 9:26 AM, Thomas Koenig wrote:
> > Thomas Koenig <tko...@netcologne.de> schrieb:
> >
> >> Also, VVM faithfully reproduces a special ordering at the loop,
> >> the one that it was written in. This is obviously ideal for
> >> languages which write out explicit loops, like C, but does not
> >> offer improvement for languages (like Fortran :-) which operate
> >> more on vectors or arrays.
> >
> > Actually, that should read
> >
> > "does not offer additional improvement". Having VVM for Fortran would
> > already speed up a lot of code.
<
> I get what you are saying, but I am not sure it is true. Specifically,
> the CPU can do only one add at a time (due to the dependency),
> so it doesn't offer the advantage of operating on multiple elements
> simultaneously, as it does on loops without a loop-carried dependency.
<
If we are talking about the following loop (snatched from above)::
<
void add (const int *a, int *b, int *c, int n)
{
    for (int i=0; i<n; i++)
        a[i] = b[i] + c[i];
}
<
The only thing preventing performing 2-4-8 adds, at the same time,
is whether (or not) a[] actually aliases b[] or c[]. If/when there is no
aliasing, this can be performed as wide as the implementation has
resources. If aliasing is actually taking place then one is limited by
the actual data-flow.
<
Note that in the case above::
<
add( a, a, &a[37]);
<
does not perform any ACTUAL aliasing up to execution widths of
32 per cycle (8 per cycle FP due to latency).
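
To make the difference between possible and actual aliasing concrete, here is a small C model, offered only as an illustration. `add_chunked` mimics a naive w-wide machine that loads a whole group before storing it; it is not a model of VVM, which is described above as resolving the dependency at run time and still producing correct results. The names and `MAX_W` are made up for the sketch.

```c
#include <assert.h>

enum { MAX_W = 64 };  /* largest group width the model supports */

/* Reference semantics: one element at a time, in program order. */
void add_scalar(int *a, const int *b, const int *c, int n)
{
    for (int i = 0; i < n; i++)
        a[i] = b[i] + c[i];
}

/* Naive w-wide model: compute w sums from loaded values, then store
   them, with no dependency checking between the two phases. */
void add_chunked(int *a, const int *b, const int *c, int n, int w)
{
    int tmp[MAX_W];
    assert(w >= 1 && w <= MAX_W);
    for (int i = 0; i < n; i += w) {
        int len = (n - i < w) ? n - i : w;
        for (int j = 0; j < len; j++)   /* all loads and adds first */
            tmp[j] = b[i + j] + c[i + j];
        for (int j = 0; j < len; j++)   /* then all stores */
            a[i + j] = tmp[j];
    }
}
```

With arguments like `(x, x, x + 37)` both functions produce the same array, because every element is read before anything stores to it. With `(x + 1, x, c)` each iteration needs the previous store, and the naive model gives different results from the scalar loop.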
>
> But it may be, and Mitch is the expert here, that the other advantages
> of VVM, specifically, the large "prefill" buffers (I'm sorry, I forget
> what they are officially called) offer an advantage over pure scalar code.
<
Consider that a processor is comprised of many function units, each
having a characteristic latency:: integer is 1 cycle, LDs are 3, STs are 1
cycle, FP is 4 cycles. Now consider that there is 1 FU of each kind and
that the LOOP instruction executes in 1 cycle. Finally consider that the
underlying machine is 1-wide and in-order (classic 1st generation RISC).
Cache width is 128-bits--not 1st generation RISC, but suitable for
misaligned DoubleWord access at full speed.
>
Here, loops are bound by the number of times a FU is used within the
loop. Consider DAXPY, which has 2 LDs, 1 ST, and 1 FMAC per iteration.
LDs and STs are being performed 128-bits per access, or 2 iterations
per access.
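
For reference, the DAXPY loop in its usual BLAS form, as a C sketch. The post does not spell the loop out, so the exact signature here is an assumption; the comments map the source to the operations counted above.

```c
#include <stddef.h>

/* y = a*x + y over n double-precision elements. */
void daxpy(size_t n, double a, const double *x, double *y)
{
    for (size_t i = 0; i < n; i++)   /* ADD-CMP-BLT: loop overhead */
        y[i] = a * x[i] + y[i];      /* LD x[i], LD y[i], FMAC, ST y[i] */
}
```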
<
On my 1-wide in-order machine, we are reading 128 bits per access for
the 2 LDs, and writing 128 bits for the ST, every 3 (2*) cycles, so with
a single FMAC unit we can perform 2 LDs, 1 FMAC, 1 ST and the
equivalent of 1×(ADD-CMP-BLT) every 3 cycles (7 instructions in 3 cycles
= 2+1/3 IPC). 2.3 IPC is not bad for a 1-wide machine, being about
3.5× the performance of the scalar code all by itself.
<
(*) The nature of cache lines having 1 tag and 4 quadword columns
enables the 2 LDs and the ST to all be performed in 2 cycles since
the cache line is 4 accesses wide and we need 3 tag accesses every
4 cycles. {This may require an additional QuadWord of buffering for
STs.} In order to do this, the column being stored cannot be a column
being read by either of the 2 LDs. 7 effective instructions in 2 cycles
is 3.5 IPC and 5× what scalar code could perform.
<
Also note: The C (or Fortran) source has manually unrolled this loop 4
times. This just multiplies the number of instructions in the loop by
4 while leaving a single LOOP instruction for iteration control.
<
> BTW, whether you call what VVM does for this loop "vectorizing" is, I
> think, a matter of one's definition of vectorizing, specifically whether
> the definition must include operating on more than one element
> simultaneously or not.
>
In the same way a CRAY-like vector machine does not need fetch
bandwidth while its vector instructions are being performed, a VVM
implementation does not need the Fetch-Parse-Decode stages of
the pipeline to be active while the loop is being performed.
>
>
>
> --
> - Stephen Fuld
> (e-mail address disguised to prevent spam)

Re: Introducing ForwardCom: An open ISA with variable-length vector registers

<2023Feb2.234358@mips.complang.tuwien.ac.at>


https://www.novabbs.com/devel/article-flat.php?id=30695&group=comp.arch#30695

Path: i2pn2.org!i2pn.org!eternal-september.org!reader01.eternal-september.org!.POSTED!not-for-mail
From: ant...@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Introducing ForwardCom: An open ISA with variable-length vector registers
Date: Thu, 02 Feb 2023 22:43:58 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 41
Message-ID: <2023Feb2.234358@mips.complang.tuwien.ac.at>
References: <12155892-ec89-4bee-85bb-d1491a5dd20dn@googlegroups.com>
Injection-Info: reader01.eternal-september.org; posting-host="4f6f460af62ee7969d064311fafd2f31";
logging-data="1153873"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/IS8sDZpGiXPNeiGUFSs/d"
Cancel-Lock: sha1:Ef4xyNFeHGHKkqdi8biqY29sDuo=
X-newsreader: xrn 10.11
 by: Anton Ertl - Thu, 2 Feb 2023 22:43 UTC

Agner Fog <agfo@dtu.dk> writes:
>ForwardCom can vectorize array loops in a new way that automatically
>adjusts to the maximum vector length supported by the CPU. It works in
>the following way:
>
>A register used as loop counter is initialized to the array length.
>This counter is decremented by the vector register length and repeats
>as long as it is positive. The counter register also specifies the
>vector register length. The vector registers used in the loop will
>have the maximum length as long as the counter exceeds this value.
>There is no loop tail because the vector length is automatically
>adjusted in the last iteration of the loop to fit the remaining number
>of array elements.
....
>There is no global vector length register. You can have different
>vectors with different lengths at the same time. The length is stored
>in the vector register itself. When you save and restore a vector
>register, it will only save the part of the register that is actually
>used.
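
The loop scheme quoted above can be emulated in plain C, as a rough sketch. `MAX_VLEN`, the function name and the way the element pointer is derived from the counter are assumptions for illustration; the inner loop stands in for a single vector instruction.

```c
enum { MAX_VLEN = 8 };  /* hypothetical maximum vector length, in elements */

/* Adds k to each of the n elements of a. Returns the number of loop
   iterations so that the handling of the last, shorter vector shows. */
int vec_add_const(float *a, int n, float k)
{
    int iterations = 0;
    /* The counter starts at the array length, drops by the maximum
       vector length each time, and the loop repeats while it is positive. */
    for (int count = n; count > 0; count -= MAX_VLEN) {
        /* The counter also sets the vector length: full length while it
           exceeds the maximum, shorter in the last iteration. */
        int vlen = count < MAX_VLEN ? count : MAX_VLEN;
        float *p = a + (n - count);
        for (int j = 0; j < vlen; j++)   /* one "vector instruction" */
            p[j] += k;
        iterations++;
    }
    return iterations;
}
```

With n = 19 and a maximum length of 8 this runs three iterations with lengths 8, 8 and 3, so no separate tail loop is needed.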

In recent times we have had the problem that CPU manufacturers combine
different cores on the same CPU, and want to migrate threads between
these cores. So they use the same SIMD lengths on all cores, even, in
the case of Intel, disabling AVX-512 on CPUs that do not have the
smaller cores, i.e., where all cores could do AVX-512 if it was not
disabled.

It seems to me that you don't have an answer for that. And I guess
that as long as there are architectural (programmer-visible) SIMD
registers, the problem will persist. The SIMD registers must not be
an architectural feature, as in VVM (or maybe Helium?), to make
migration from cores with longer to cores with shorter vectors
possible.

The other option would be for Intel to implement AVX-512 in the small
cores without 512-bit units, the approach that AMD has used time
and again: XMM=2x64 bits on K8, YMM=2x128 bits on later heavy machinery
and Zen1, and ZMM is 2x256 bits on Zen4.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Introducing ForwardCom: An open ISA with variable-length vector registers

<f68efae2-ac92-4f83-9221-191a89438510n@googlegroups.com>


https://www.novabbs.com/devel/article-flat.php?id=30697&group=comp.arch#30697

X-Received: by 2002:a05:620a:98f:b0:724:31b6:632a with SMTP id x15-20020a05620a098f00b0072431b6632amr525210qkx.432.1675403247003;
Thu, 02 Feb 2023 21:47:27 -0800 (PST)
X-Received: by 2002:a05:6870:3050:b0:163:3ab5:b3f with SMTP id
u16-20020a056870305000b001633ab50b3fmr509934oau.218.1675403246508; Thu, 02
Feb 2023 21:47:26 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border-2.nntp.ord.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Thu, 2 Feb 2023 21:47:26 -0800 (PST)
In-Reply-To: <2023Feb2.234358@mips.complang.tuwien.ac.at>
Injection-Info: google-groups.googlegroups.com; posting-host=128.76.247.189; posting-account=tYjOgQoAAACRs74arwcusKjVVQt_fFMX
NNTP-Posting-Host: 128.76.247.189
References: <12155892-ec89-4bee-85bb-d1491a5dd20dn@googlegroups.com> <2023Feb2.234358@mips.complang.tuwien.ac.at>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <f68efae2-ac92-4f83-9221-191a89438510n@googlegroups.com>
Subject: Re: Introducing ForwardCom: An open ISA with variable-length vector registers
From: agf...@dtu.dk (Agner Fog)
Injection-Date: Fri, 03 Feb 2023 05:47:26 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Lines: 22
 by: Agner Fog - Fri, 3 Feb 2023 05:47 UTC

Anton Ertl wrote:
> In recent times we have had the problem that CPU manufacturers combine
> different cores on the same CPU, and want to migrate threads between
> these cores. So they use the same SIMD lengths on all cores, even, in
> the case of Intel, disabling AVX-512 on CPUs that do not have the
> smaller cores, i.e., where all cores could do AVX-512 if it was not
> disabled.
Yes, this was a big mistake in my opinion. The first version of Intel's Alder Lake had access to different cores with and without AVX512 support. Your program will crash if the software detects that it can use AVX512 and later jumps to another core without AVX512. So they had to disable the AVX512 completely, including the new half-precision floating point instructions. Did Intel expect the OS to move a thread back to a capable core in this case? You cannot expect an OS to give special treatment to every new processor on the market with new quirks. I have discussed this problem at https://www.agner.org/forum/viewtopic.php?f=1&t=79
Unfortunately, ARM does the same in their big.LITTLE architecture.

Regarding pointer aliasing:
I think it is the responsibility of the programmer to consider pointer aliasing problems and do a loop backwards if necessary. If you want the hardware to fix all kinds of bad programming, you will soon find yourself down a very deep rabbit hole.
