Around the bike shed: Instruction names and assembler syntax

<svcqrj$jj6$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=23779&group=comp.arch#23779

From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Around the bike shed: Instruction names and assembler syntax
Date: Sat, 26 Feb 2022 09:11:15 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <svcqrj$jj6$1@newsreader4.netcologne.de>
User-Agent: slrn/1.0.3 (Linux)
 by: Thomas Koenig - Sat, 26 Feb 2022 09:11 UTC

This is surely not very important for most people, because few
people read assembler, and fewer write it, but it is a part
of designing an ISA.

Reading assembler for a few architectures over the years, I have
formed a few preferences that I would like to share, with the firm
expectation that others will have different preferences :-)

Instruction names:

If two instructions have a different encoding, they should have a
different name. Apart from making an assembler marginally easier
to write, this also makes clear what size an instruction is, if
it has immediates.

Instruction names should be as regular as possible, made up of

- prefix, for example when special registers are accessed
- operation
- postfix indicating data size or other specialities
(like immediate data)

Ideally, I'd like to be able to look at an instruction I have not
seen before and recognize it, or write it based on the structure
of the instruction names when I only remember similar instructions.
A single regexp for all instructions splitting the parts above
in a unique way would be great.
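
A minimal sketch of the single-regexp idea, in Python. The prefix, operation and postfix sets below are invented for illustration and belong to no real ISA; the point is only that a mnemonic built from three fixed parts, with no dots, can be split by one pattern.

```python
import re

# Hypothetical name parts, invented for illustration (not a real ISA).
PREFIXES = ["s"]                         # e.g. "s" = touches a special register
OPERATIONS = ["add", "sub", "ld", "st", "mov"]
POSTFIXES = ["b", "h", "w", "d", "i"]    # data size, or "i" for immediate

# One pattern for every instruction name: optional prefix, operation,
# optional postfix.
MNEMONIC = re.compile(
    r"(?P<prefix>%s)?(?P<op>%s)(?P<postfix>%s)?"
    % ("|".join(PREFIXES), "|".join(OPERATIONS), "|".join(POSTFIXES))
)

def split_mnemonic(name):
    """Return (prefix, operation, postfix); absent parts are None."""
    m = MNEMONIC.fullmatch(name)
    if m is None:
        raise ValueError("not a valid mnemonic: %r" % name)
    return m.group("prefix"), m.group("op"), m.group("postfix")

print(split_mnemonic("ldw"))    # (None, 'ld', 'w')
print(split_mnemonic("saddi"))  # ('s', 'add', 'i')
```

Whether the split is unique depends on the chosen name parts. With the sets above, no operation plus postfix can also be read as prefix plus another operation, so every name has exactly one parse.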

And I don't like dots in instructions.

Load and store should be called some variant of load and store,
not move, and especially not move based on the type of the
operands.

(The above looks a lot like POWER, but see below).

Assembler syntax:

Registers should be clearly visible, not encoded into numbers like
IBM likes to do. Syntax of arithmetic should be

instr r_target,operand, ...

while load and store should have the register first.

I "grew up" on the AT&T syntax (well, after 6502 assembler, which
hardly counts), so I am used to the offset(register) convention,
but I can see why [register,offset] is more clear, so I have
no clear preference there. I just find the Intel BYTE PTR stuff
strange to look at.

Comments? Other preferences? :-)

Re: Around the bike shed: Instruction names and assembler syntax

<svcsak$1flk$1@gioia.aioe.org>

https://www.novabbs.com/devel/article-flat.php?id=23780&group=comp.arch#23780

From: terje.ma...@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: Around the bike shed: Instruction names and assembler syntax
Date: Sat, 26 Feb 2022 10:36:28 +0100
Organization: Aioe.org NNTP Server
Message-ID: <svcsak$1flk$1@gioia.aioe.org>
References: <svcqrj$jj6$1@newsreader4.netcologne.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:68.0) Gecko/20100101
Firefox/68.0 SeaMonkey/2.53.10.2
 by: Terje Mathisen - Sat, 26 Feb 2022 09:36 UTC

Thomas Koenig wrote:
> This is surely not very important for most people, because few
> people read assembler, and fewer write it, but it is a part
> of designing an ISA.
>
> Reading assembler for a few architectures over the years, I have
> formed a few preferences that I would like to share, with the firm
> expectation that others will have different preferences :-)
>
> Instruction names:
>
> If two instructions have a different encoding, they should have a
> different name. Apart from making an assembler marginally easier
> to write, this also makes clear what size an instruction is, if
> it has immediates.

So this means that you want separate names for a 16-bit short form vs
32-bit long form operation, even if they are doing the exact same op?

>
> Instruction names should be as regular as possible, made up of
>
> - prefix, for example when special registers are accessed
> - operation
> - postfix indicating data size or other specialities
> (like immediate data)

The postfix (or Intel "byte ptr" prefix) is only needed when you don't
have a size implied by register name/type.

>
> Ideally, I'd like to be able to look at an instruction I have not
> seen before and recognize it, or write it based on the structure
> of the instruction names when I only remember similar instructions.
> A single regexp for all instructions splitting the parts above
> in a unique way would be great.

OK.
>
> And I don't like dots in instructions.

Strong plus here.

> Load and store should be called some variant of load and store,
> not move, and especially not move based on the type of the
> operands.

This is a "don't care" for me, either is fine.
>
> (The above looks a lot like POWER, but see below).
>
> Assembler syntax:
>
> Registers should be clearly visible, not encoded into numbers like

Yes!

> IBM likes to do. Syntax of arithmetic should be
>
> instr r_target,operand, ...
>
> while load and store should have the register first.

This is completely wrong!

If you encode all other operations with the target on the left, then you
have to do the same for a store op!
>
> I "grew up" on the AT&T syntax (well, after 6502 assembler, which
> hardly counts), so I am used to the offset(register) convention,
> but I can see why [register,offset] is more clear, so I have
> no clear preference there. I just find the Intel BYTE PTR stuff
> strange to look at.

In almost all (optimized) asm, the byte/word/dword ptr stuff is
superfluous, it is only when you do mem-mem or mem-immediate operations
that you need them.

I.e. after running out of registers in an inner loop, I might have to
increment a memory counter:

inc word ptr [cnt]
or
add word ptr [cnt], 1

but both of those can be avoided if [cnt] has been previously declared
as a word variable.

>
> Comments? Other preferences? :-)

I like to indent all branch instructions, sort of the opposite of the
standard HLL block indent; it still makes these operations stand out.

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: Around the bike shed: Instruction names and assembler syntax

<svd280$fiv$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23782&group=comp.arch#23782

From: iva...@millcomputing.com (Ivan Godard)
Newsgroups: comp.arch
Subject: Re: Around the bike shed: Instruction names and assembler syntax
Date: Sat, 26 Feb 2022 03:17:18 -0800
Organization: A noiseless patient Spider
Lines: 64
Message-ID: <svd280$fiv$1@dont-email.me>
References: <svcqrj$jj6$1@newsreader4.netcologne.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.6.1
In-Reply-To: <svcqrj$jj6$1@newsreader4.netcologne.de>
Content-Language: en-US
 by: Ivan Godard - Sat, 26 Feb 2022 11:17 UTC

On 2/26/2022 1:11 AM, Thomas Koenig wrote:
> This is surely not very important for most people, because few
> people read assembler, and fewer write it, but it is a part
> of designing an ISA.
>
> Reading assembler for a few architectures over the years, I have
> formed a few preferences that I would like to share, with the firm
> expectation that others will have different preferences :-)
>
> Instruction names:
>
> If two instructions have a different encoding, they should have a
> different name. Apart from making an assembler marginally easier
> to write, this also makes clear what size an instruction is, if
> it has immediates.
>
> Instruction names should be as regular as possible, made up of
>
> - prefix, for example when special registers are accessed
> - operation
> - postfix indicating data size or other specialities
> (like immediate data)
>
> Ideally, I'd like to be able to look at an instruction I have not
> seen before and recognize it, or write it based on the structure
> of the instruction names when I only remember similar instructions.
> A single regexp for all instructions splitting the parts above
> in a unique way would be great.

Or you can make the asm be similar to the syntax of a HLL, arguably more
familiar to most readers than a traditional asm syntax, even one with
your suggestions.

The case in hand: conAsm is syntactically C++, using overloading of
operators. Asm instructions are C++ functions using C++ "foo(bar1, bar2,
....)" functional notation.

Asm mnemonics are constructed from the same instruction specification as
are the code generators, simulator, and hardware. A single specification
gives a list of attributes that are relevant to the instruction;
attributes to be supported are expressed as values drawn from an
enumeration, or (notational convenience) as the whole enumeration
implying every element. The instruction is defined for the cross product
of all relevant attributes. Sample:

opPattern(exuBlock, floorOp) << floats << exuArg,

Here "floorOp" is a member of the opCode enumeration, and "floats" is
the whole of "enum floats{binFloat, decFloat}".

Each attribute enum value has a related mnemonic string. Here, the
string for floorOp is "floor" (surprise), while that for binFloat is "f"
and decFloat is "d". The whole instruction mnemonic is just the
concatenation of these strings, so this specification defines the
mnemonics "floorf" and "floord". The remainder of the instruction
specification gives the parts that are explicit arguments instead of
being implied by the mnemonic; "exuArg" (specified elsewhere) says that
the argument is a belt value evaluated in exuPhase, so the full syntax
recognized by the assembler will be for example "floorf(b7)".
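
The concatenation step described above can be sketched in a few lines of Python. This is an analogy only, not conAsm or any Mill tooling: the enum classes and the `mnemonics` helper are made up here to mirror the names in the text, and the argument part of the specification ("exuArg") is left out.

```python
from enum import Enum
from itertools import product

# Each attribute value carries its mnemonic fragment as the enum value.
class OpCode(Enum):
    floorOp = "floor"

class Floats(Enum):
    binFloat = "f"
    decFloat = "d"

def mnemonics(op, *attribute_enums):
    """Build one mnemonic per element of the cross product of the attributes.

    Passing a whole enum class stands for "every element of the enumeration".
    """
    return [
        op.value + "".join(attr.value for attr in combo)
        for combo in product(*attribute_enums)
    ]

print(mnemonics(OpCode.floorOp, Floats))  # ['floorf', 'floord']
```

Adding a second attribute enumeration to the call would multiply the list, which is the "cross product of all relevant attributes" idea in miniature.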

Because each attribute has its own mnemonic fragment, and they are
always glued together in the same way, our system seems to necessarily
satisfy your regularity recommendations. Whether functional notation is
a bug or a feature lies in the eye of the reader, but it is certainly
readable.

Re: Around the bike shed: Instruction names and assembler syntax

<jwvk0dh38aq.fsf-monnier+comp.arch@gnu.org>

https://www.novabbs.com/devel/article-flat.php?id=23787&group=comp.arch#23787

From: monn...@iro.umontreal.ca (Stefan Monnier)
Newsgroups: comp.arch
Subject: Re: Around the bike shed: Instruction names and assembler syntax
Date: Sat, 26 Feb 2022 11:27:58 -0500
Organization: A noiseless patient Spider
Lines: 15
Message-ID: <jwvk0dh38aq.fsf-monnier+comp.arch@gnu.org>
References: <svcqrj$jj6$1@newsreader4.netcologne.de>
Mime-Version: 1.0
Content-Type: text/plain
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/29.0.50 (gnu/linux)
 by: Stefan Monnier - Sat, 26 Feb 2022 16:27 UTC

> instr r_target,operand, ...

That's my pet peeve of asm syntax: you need to know the asm's typical
conventions before you can tell which operand plays which role.
I much prefer it when the source and destination are syntactically
distinct, maybe with

add x <- y, z

or

add y, z -> x
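
A rough Python sketch of why an explicit arrow helps: a reader (or a tool) can tell destination from sources without knowing any per-assembler convention. The `parse` function and its return shape are hypothetical, made up for this illustration.

```python
def parse(line):
    """Split 'op dst <- src, ...' or 'op src, ... -> dst' into its roles.

    Returns (operation, destination, [sources]).
    """
    op, _, rest = line.strip().partition(" ")
    if "<-" in rest:
        dst, _, srcs = rest.partition("<-")
    elif "->" in rest:
        srcs, _, dst = rest.partition("->")
    else:
        # Without an arrow the roles depend on convention, which is the
        # complaint being made.
        raise ValueError("no arrow: cannot tell source from destination")
    return op, dst.strip(), [s.strip() for s in srcs.split(",")]

print(parse("add x <- y, z"))   # ('add', 'x', ['y', 'z'])
print(parse("add y, z -> x"))   # ('add', 'x', ['y', 'z'])
```

Both spellings yield the same result, so the choice between them is purely a matter of reading order.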

-- Stefan

Re: Around the bike shed: Instruction names and assembler syntax

<5ce70d12-791c-44c0-8ee3-f8cbdba4ac09n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=23788&group=comp.arch#23788

Newsgroups: comp.arch
Date: Sat, 26 Feb 2022 09:55:17 -0800 (PST)
In-Reply-To: <svcqrj$jj6$1@newsreader4.netcologne.de>
References: <svcqrj$jj6$1@newsreader4.netcologne.de>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <5ce70d12-791c-44c0-8ee3-f8cbdba4ac09n@googlegroups.com>
Subject: Re: Around the bike shed: Instruction names and assembler syntax
From: MitchAl...@aol.com (MitchAlsup)
Content-Type: text/plain; charset="UTF-8"
Lines: 92
 by: MitchAlsup - Sat, 26 Feb 2022 17:55 UTC

On Saturday, February 26, 2022 at 3:11:18 AM UTC-6, Thomas Koenig wrote:
> This is surely not very important for most people, because few
> people read assembler, and fewer write it, but it is a part
> of designing an ISA.
>
> Reading assembler for a few architectures over the years, I have
> formed a few preferences that I would like to share, with the firm
> expectation that others will have different preferences :-)
>
> Instruction names:
>
> If two instructions have a different encoding, they should have a
> different name. Apart from making an assembler marginally easier
> to write, this also makes clear what size an instruction is, if
> it has immediates.
<
Then you would be eating up a lot of namespace in the programmer's head::
<
ADD R7,R4,R19
ADD R7,R4,-R19
ADD R7,-R4,-R19
ADD R7,R4,#256
ADD R7,R4,#-256
ADD R7,#256,R19
ADD R7,#-256,R19
ADD R7,R4,#0x12345678
ADD R7,R4,#0x1234567890123456
ADD R7,#0x12345678,R19
ADD R7,#0x1234567890123456,R19
<
You eat up a lot of namespace whereas you could completely minimize the
namespace consumption and allow the ASCII to express itself.
<
And I left out the Overflow checking versions and the unsigned distinctions.
<
AND: this is only 1 instruction. Yet, the ASCII is completely readable
to anyone who has ever read assembly language code before. More
importantly, the surprise factor is near zero.
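
As a sketch of how one mnemonic can cover all of these forms, the Python below derives the operand form from the operand text alone. The `classify` and `form` helpers are hypothetical, and the 16/32/64-bit immediate size classes are an assumption made for illustration; nothing here is taken from the actual My 66000 encoding.

```python
import re

# Register (optionally negated) or '#'-prefixed immediate, decimal or hex.
OPERAND = re.compile(
    r"(?P<neg>-)?R(?P<reg>\d+)|#(?P<imm>-?(?:0x[0-9A-Fa-f]+|\d+))"
)

def classify(operand):
    """Return (kind, value) for one operand written as in the list above."""
    m = OPERAND.fullmatch(operand.strip())
    if m is None:
        raise ValueError("unrecognised operand: %r" % operand)
    if m.group("reg") is not None:
        return ("-reg" if m.group("neg") else "reg", int(m.group("reg")))
    value = int(m.group("imm"), 0)
    # Assumed size classes: the smallest signed width that holds the value.
    for bits in (16, 32, 64):
        if -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
            return ("imm%d" % bits, value)
    raise ValueError("immediate does not fit in 64 bits")

def form(line):
    """Return the mnemonic and the tuple of operand kinds for one line."""
    mnemonic, operands = line.split(None, 1)
    return mnemonic, tuple(classify(o)[0] for o in operands.split(","))

print(form("ADD R7,R4,-R19"))                 # ('ADD', ('reg', 'reg', '-reg'))
print(form("ADD R7,#256,R19"))                # ('ADD', ('reg', 'imm16', 'reg'))
print(form("ADD R7,R4,#0x1234567890123456"))  # ('ADD', ('reg', 'reg', 'imm64'))
```

The mnemonic stays `ADD` throughout; the assembler, not the programmer, picks the encoding from the operand kinds.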
>
> Instruction names should be as regular as possible, made up of
>
> - prefix, for example when special registers are accessed
> - operation
> - postfix indicating data size or other specialities
> (like immediate data)
<
Only memory references have size and signedness; everything
else is 64 bits (except for 32-bit floats).
>
> Ideally, I'd like to be able to look at an instruction I have not
> seen before and recognize it, or write it based on the structure
> of the instruction names when I only remember similar instructions.
> A single regexp for all instructions splitting the parts above
> in a unique way would be great.
>
> And I don't like dots in instructions.
>
> Load and store should be called some variant of load and store,
> not move, and especially not move based on the type of the
> operands.
<
Here I completely agree. MOV is reg-to-reg, LD brings forth from
memory, ST sends towards memory.
>
> (The above looks a lot like POWER, but see below).
>
> Assembler syntax:
>
> Registers should be clearly visible, not encoded into numbers like
> IBM likes to do. Syntax of arithmetic should be
>
> instr r_target,operand, ...
>
> while load and store should have the register first.
<
The official name of a register in My 66000 is R[index] where index
goes from 0..31. The assembler has a set of Macros that capture
R0..R31, with aliases for FP = R[30] and SP = R[31]. In memory
reference instructions one can write:: LD Rd,[IP+immed]; using
the current IP as the base register.
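
The naming scheme can be pictured as a plain symbol table. The Python below is an illustrative sketch, not the My 66000 assembler: the table mirrors the names given above (R0..R31, FP, SP), while the `parse_mem` helper and its restriction to a `[base+displacement]` operand with a decimal displacement are simplifying assumptions.

```python
import re

# R0..R31 map to their index; FP and SP are aliases for R30 and R31.
REGISTER_INDEX = {"R%d" % i: i for i in range(32)}
REGISTER_INDEX["FP"] = 30
REGISTER_INDEX["SP"] = 31

# Hypothetical, simplified memory operand: [base+displacement].
MEM = re.compile(r"\[(?P<base>\w+)\+(?P<disp>-?\d+)\]")

def parse_mem(operand):
    """Return (base name, register index or None for IP, displacement)."""
    m = MEM.fullmatch(operand)
    if m is None:
        raise ValueError("unrecognised memory operand: %r" % operand)
    base = m.group("base").upper()
    if base != "IP" and base not in REGISTER_INDEX:
        raise ValueError("unknown base register: %r" % base)
    index = None if base == "IP" else REGISTER_INDEX[base]
    return base, index, int(m.group("disp"))

print(parse_mem("[SP+16]"))  # ('SP', 31, 16)
print(parse_mem("[IP+64]"))  # ('IP', None, 64)
```

Because IP is just another base name in the operand, the load keeps the single mnemonic LD rather than needing a separate IP-relative instruction name.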
<
Your proposal would need LDIP of all 8 flavors and STIP of another
4 flavors
>
> I "grew up" on the AT&T syntax (well, after 6502 assembler, which
> hardly counts), so I am used to the offset(register) convention,
> but I can see why [register,offset] is more clear, so I have
> no clear preference there. I just find the Intel BYTE PTR stuff
> strange to look at.
>
> Comments? Other preferences? :-)
<
Let architects define their own instruction sets.

Re: Around the bike shed: Instruction names and assembler syntax

<svdpt4$or9$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23789&group=comp.arch#23789

From: not_va...@comcast.net (James Van Buskirk)
Newsgroups: comp.arch
Subject: Re: Around the bike shed: Instruction names and assembler syntax
Date: Sat, 26 Feb 2022 11:00:51 -0700
Organization: A noiseless patient Spider
Lines: 1
Message-ID: <svdpt4$or9$1@dont-email.me>
References: <svcqrj$jj6$1@newsreader4.netcologne.de>
Mime-Version: 1.0
Content-Type: text/plain;
format=flowed;
charset="Windows-1252";
reply-type=original
Content-Transfer-Encoding: 7bit
In-Reply-To: <svcqrj$jj6$1@newsreader4.netcologne.de>
X-Newsreader: Microsoft Windows Live Mail 16.4.3528.331
 by: James Van Buskirk - Sat, 26 Feb 2022 18:00 UTC

"Thomas Koenig" wrote in message
news:svcqrj$jj6$1@newsreader4.netcologne.de...

> Instruction names:

> If two instructions have a different encoding, they should have a
> different name. Apart from making an assembler marginally easier
> to write, this also makes clear what size an instruction is, if
> it has immediates.

How do you propose to distinguish between

AND RAX, RBX

and

AND RAX, RBX

The first is encoded as REX.W + 21 /r and the second as REX.W + 23 /r?
Kind of nice for steganography :)
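
For readers unfamiliar with the notation: in Intel's manuals, REX.W is the x86-64 prefix byte that selects 64-bit operand size, and "/r" means the opcode is followed by a ModRM byte naming one register operand and one register-or-memory operand. Opcode 21 is AND r/m64, r64 and opcode 23 is AND r64, r/m64, so a register-to-register AND can be encoded either way. A minimal Python sketch (the function names and the four-register table are made up for illustration; `bytes.hex` with a separator needs Python 3.8 or later):

```python
# Register numbers for a subset of the 64-bit registers; all are below 8,
# so no REX.R or REX.B extension bits are needed.
REG64 = {"RAX": 0, "RCX": 1, "RDX": 2, "RBX": 3}

REX_W = 0x48  # REX prefix with W=1: 64-bit operand size

def modrm(reg, rm):
    """ModRM byte with mod=11, meaning both operands are registers."""
    return 0xC0 | (reg << 3) | rm

def and_rm_r(dst, src):
    """AND r/m64, r64 (REX.W + 21 /r): destination in r/m, source in reg."""
    return bytes([REX_W, 0x21, modrm(REG64[src], REG64[dst])])

def and_r_rm(dst, src):
    """AND r64, r/m64 (REX.W + 23 /r): destination in reg, source in r/m."""
    return bytes([REX_W, 0x23, modrm(REG64[dst], REG64[src])])

print(and_rm_r("RAX", "RBX").hex(" "))  # 48 21 d8
print(and_r_rm("RAX", "RBX").hex(" "))  # 48 23 c3
```

Both byte sequences compute RAX = RAX AND RBX, yet the assembly text for them is identical, which is the difficulty being raised for the "different encoding, different name" rule.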

Re: Around the bike shed: Instruction names and assembler syntax

<svdrds$bu1$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=23790&group=comp.arch#23790

From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: Around the bike shed: Instruction names and assembler syntax
Date: Sat, 26 Feb 2022 18:27:08 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <svdrds$bu1$1@newsreader4.netcologne.de>
References: <svcqrj$jj6$1@newsreader4.netcologne.de>
<svdpt4$or9$1@dont-email.me>
User-Agent: slrn/1.0.3 (Linux)
 by: Thomas Koenig - Sat, 26 Feb 2022 18:27 UTC

James Van Buskirk <not_valid@comcast.net> schrieb:
> "Thomas Koenig" wrote in message
> news:svcqrj$jj6$1@newsreader4.netcologne.de...
>
>> Instruction names:
>
>> If two instructions have a different encoding, they should have a
>> different name. Apart from making an assembler marginally easier
>> to write, this also makes clear what size an instruction is, if
>> it has immediates.
>
> How do you propose to distinguish between
>
> AND RAX, RBX
>
> and
>
> AND RAX, RBX
>
> The first is encoded as REX.W + 21 /r and the second as REX.W + 23 /r?

Sorry, I don't understand what you mean. Who or what is REX.W and what
does + 21 /r and + 23 /r mean?

Re: Around the bike shed: Instruction names and assembler syntax

<7548a9b6-53f4-4089-8c32-bda1c11af7d1n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=23791&group=comp.arch#23791

X-Received: by 2002:ac8:5ace:0:b0:2c9:f9d2:146 with SMTP id d14-20020ac85ace000000b002c9f9d20146mr11801046qtd.216.1645900866873;
Sat, 26 Feb 2022 10:41:06 -0800 (PST)
X-Received: by 2002:a05:6870:c153:b0:d6:fef3:d1a5 with SMTP id
g19-20020a056870c15300b000d6fef3d1a5mr2212620oad.217.1645900866623; Sat, 26
Feb 2022 10:41:06 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sat, 26 Feb 2022 10:41:06 -0800 (PST)
In-Reply-To: <svdrds$bu1$1@newsreader4.netcologne.de>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:8510:89eb:840b:4fba;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:8510:89eb:840b:4fba
References: <svcqrj$jj6$1@newsreader4.netcologne.de> <svdpt4$or9$1@dont-email.me>
<svdrds$bu1$1@newsreader4.netcologne.de>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <7548a9b6-53f4-4089-8c32-bda1c11af7d1n@googlegroups.com>
Subject: Re: Around the bike shed: Instruction names and assembler syntax
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Sat, 26 Feb 2022 18:41:06 +0000
Content-Type: text/plain; charset="UTF-8"
 by: MitchAlsup - Sat, 26 Feb 2022 18:41 UTC

On Saturday, February 26, 2022 at 12:27:11 PM UTC-6, Thomas Koenig wrote:
> James Van Buskirk <not_...@comcast.net> schrieb:
> > "Thomas Koenig" wrote in message
> > news:svcqrj$jj6$1...@newsreader4.netcologne.de...
> >
> >> Instruction names:
> >
> >> If two instructions have a different encoding, they should have a
> >> different name. Apart from making an assembler marginally easier
> >> to write, this also makes clear what size an instruction is, if
> >> it has immediates.
> >
> > How do you propose to distinguish between
> >
> > AND RAX, RBX
> >
> > and
> >
> > AND RAX, RBX
> >
> > The first is encoded as REX.W + 21 /r and the second as REX.W + 23 /r?
> Sorry, I don't understand what you mean. Who or what is REX.W and what
> does + 21 /r and + 23 /r mean?
<
They are MOD/RMs that lead to the same actual calculation.

Re: Around the bike shed: Instruction names and assembler syntax

<svdsi7$19ch$1@gal.iecc.com>

https://www.novabbs.com/devel/article-flat.php?id=23792&group=comp.arch#23792

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!not-for-mail
From: joh...@taugh.com (John Levine)
Newsgroups: comp.arch
Subject: Re: Around the bike shed: Instruction names and assembler syntax
Date: Sat, 26 Feb 2022 18:46:31 -0000 (UTC)
Organization: Taughannock Networks
Message-ID: <svdsi7$19ch$1@gal.iecc.com>
References: <svcqrj$jj6$1@newsreader4.netcologne.de> <svdpt4$or9$1@dont-email.me>
Injection-Date: Sat, 26 Feb 2022 18:46:31 -0000 (UTC)
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="42385"; mail-complaints-to="abuse@iecc.com"
In-Reply-To: <svcqrj$jj6$1@newsreader4.netcologne.de> <svdpt4$or9$1@dont-email.me>
Cleverness: some
X-Newsreader: trn 4.0-test77 (Sep 1, 2010)
Originator: johnl@iecc.com (John Levine)
 by: John Levine - Sat, 26 Feb 2022 18:46 UTC

It appears that James Van Buskirk <not_valid@comcast.net> said:
>"Thomas Koenig" wrote in message
>news:svcqrj$jj6$1@newsreader4.netcologne.de...
>
>> Instruction names:
>
>> If two instructions have a different encoding, they should have a
>> different name. Apart from making an assembler marginally easier
>> to write, this also makes clear what size an instruction is, if
>> it has immediates.
>
>How do you propose to distinguish between
>
>AND RAX, RBX
>
>and
>
>AND RAX, RBX

Personally, I don't.

>The first is encoded as REX.W + 21 /r and the second as REX.W + 23 /r?
>Kind of nice for steganography :)

I understand that the difference is that the first allows the form general,register
and the second is register,general, and since the general operand can be a register
they are two ways to do the same thing.

Not having looked at much assembler output lately, which one will assemblers use?
Is there a way to force the other?

There are a lot of other redundant encodings, e.g., register indirect with or
without the SIB byte. I assume the assembler prefers the shorter version but
is there a way to force it to use the other one?
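
To make the SIB redundancy concrete, here is a minimal Python sketch that builds both byte sequences for MOV RAX, [RBX] (REX.W + 8B /r). The helper name mov_load and the register table are made up for this illustration. It only shows that two distinct encodings exist; it says nothing about which one a given assembler emits or how to force the other.

```python
# Low three bits of the register numbers. RSP and RBP are left out on purpose:
# as a base they have special meanings (SIB escape, disp32/RIP-relative).
REG = {"RAX": 0b000, "RCX": 0b001, "RDX": 0b010, "RBX": 0b011}

def mov_load(dst, base, use_sib=False):
    """Encode MOV dst, [base], optionally through a redundant SIB byte."""
    rex_w, opcode = 0x48, 0x8B
    if not use_sib:
        # mod=00, reg=dst, rm=base: plain register indirect
        modrm = (0b00 << 6) | (REG[dst] << 3) | REG[base]
        return bytes([rex_w, opcode, modrm])
    # rm=100 announces a SIB byte; index=100 in the SIB means "no index"
    modrm = (0b00 << 6) | (REG[dst] << 3) | 0b100
    sib = (0b00 << 6) | (0b100 << 3) | REG[base]
    return bytes([rex_w, opcode, modrm, sib])

print(mov_load("RAX", "RBX").hex())                # 488b03
print(mov_load("RAX", "RBX", use_sib=True).hex())  # 488b0423
```

The SIB form is one byte longer, which is why one would expect the shorter form to be the default.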

--
Regards,
John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly

Re: Around the bike shed: Instruction names and assembler syntax

<sve2vj$nct$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23793&group=comp.arch#23793

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: sfu...@alumni.cmu.edu.invalid (Stephen Fuld)
Newsgroups: comp.arch
Subject: Re: Around the bike shed: Instruction names and assembler syntax
Date: Sat, 26 Feb 2022 12:36:01 -0800
Organization: A noiseless patient Spider
Lines: 54
Message-ID: <sve2vj$nct$1@dont-email.me>
References: <svcqrj$jj6$1@newsreader4.netcologne.de>
<svd280$fiv$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Sat, 26 Feb 2022 20:36:03 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="691d4678c8499e0e49ffe95accdf0931";
logging-data="23965"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18BOioCUlHO9llKoSaXNPkkaP1IRl/trXc="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.6.1
Cancel-Lock: sha1:exUuadR7zC+I1tWfnU7gSUakfX0=
In-Reply-To: <svd280$fiv$1@dont-email.me>
Content-Language: en-US
 by: Stephen Fuld - Sat, 26 Feb 2022 20:36 UTC

On 2/26/2022 3:17 AM, Ivan Godard wrote:
> On 2/26/2022 1:11 AM, Thomas Koenig wrote:
>> This is surely not very important for most people, because few
>> people read assembler, and fewer write it, but it is a part
>> of designing an ISA.
>>
>> Reading assembler for a few architectures over the years, I have
>> formed a few preferences that I would like to share, with the firm
>> expectation that others will have different preferences :-)
>>
>> Instruction names:
>>
>> If two instructions have a different encoding, they should have a
>> different name.  Apart from making an assembler marginally easier
>> to write, this also makes clear what size an instruction is, if
>> it has immediates.
>>
>> Instruction names should be as regular as possible, made up of
>>
>> - prefix, for example when special registers are accessed
>> - operation
>> - postfix indicating data size or other specialities
>>    (like immediate data)
>>
>> Ideally, I'd like to be able to look at an instruction I have not
>> seen before and recognize it, or write it based on the structure
>> of the instruction names when I only remember similar instructions.
>> A single regexp for all instructions splitting the parts above
>> in a unique way would be great.
>
> Or you can make the asm be similar to the syntax of a HLL, arguably more
> familiar to most readers than a traditional asm syntax, even one with
> your suggestions.
>
> The case in hand: conAsm is syntactically C++, using overloading of
> operators. Asm instructions are C++ functions using C++ "foo(bar1, bar2,
> ...)" functional notation.

OK, that surely works, but . . .

For non-Mill architectures, you still have the issue of the order of the
operands, e.g. is the target first or last? Is the order of "move type"
instructions always from-to? If so, while that is consistent with
typical HLL syntax, it puts the register destination for loads after the
source.

Also, while it is certainly C++/function call compatible, I note that if
you simply leave out the parentheses, and require a space after the
"function name", it is syntactically the same as a typical ASM format.

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Re: Around the bike shed: Instruction names and assembler syntax

<svea7m$6f4$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23795&group=comp.arch#23795

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Around the bike shed: Instruction names and assembler syntax
Date: Sat, 26 Feb 2022 16:39:45 -0600
Organization: A noiseless patient Spider
Lines: 208
Message-ID: <svea7m$6f4$1@dont-email.me>
References: <svcqrj$jj6$1@newsreader4.netcologne.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sat, 26 Feb 2022 22:39:50 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="421f448afd5aef2d9430e676fc2948af";
logging-data="6628"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18/9UJOY2ZUhLixzxIoi4V8"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.6.1
Cancel-Lock: sha1:UZLhkjiLfs9NlPQd/8V+tqNc+o8=
In-Reply-To: <svcqrj$jj6$1@newsreader4.netcologne.de>
Content-Language: en-US
 by: BGB - Sat, 26 Feb 2022 22:39 UTC

On 2/26/2022 3:11 AM, Thomas Koenig wrote:
> This is surely not very important for most people, because few
> people read assembler, and fewer write it, but it is a part
> of designing an ISA.
>
> Reading assembler for a few architectures over the years, I have
> formed a few preferences that I would like to share, with the firm
> expectation that others will have different preferences :-)
>

FWIW, my ASM syntax is the way it is, partly because it "evolved" out of
the syntax used by GAS when targeting SH:
OP Src, Dst

I did go from:
@Reg, @(Disp, Reg), @(Reg), ...
Mostly over to:
(Reg), (Reg, Disp), ...

Partly because I found the '@' ugly.

In practice, this was just sort of making the '@' optional and parsing
it roughly the same way as before (where both "@(Disp,Reg)" and
"@(Reg,Disp)" could be used in SH ASM, ...).

Luckily, GAS for SH did not use '%' all over the place (as it does on
x86), which is good because putting '%' everywhere looks like crap IMHO.

I actually slightly prefer:
OP Dst, Src
OP Dst, SrcA, SrcB

Ordering, but this was hindered by the evolution path; BJX1 evolved out
of SH4; and initial forms of BJX2 were basically BJX1 but with the
instruction encoding reworked to be less of an awful mess.

Though, arguably, with the addition of XGPR it is now back to being a
mess, albeit slightly less of one, because all of the "new"
encodings (XGPR, Op64, Op40x2, ...) are basically bit-twiddles of the
existing instruction spaces (and can all reuse the same decoders, 1).

1: As-is, the original 32-bit instruction decoder has in effect
internally expanded to closer to 56 bits (with some adhoc repacking on
the frontend).

Though, the annoyances with the hairy encodings aren't quite enough to
get over the hump of there basically being no way to (cleanly) repack
the ISA "as it is" into a consistent 32-bit instruction format.

Even if I give up the 16-bit encodings, by the time the instruction
format is "orthogonal" (with 6-bit register fields, WEX, 2-bit
predication, ...), I would only have a fraction of the opcode space.

The only other real option would be to redesign it around a bigger
instruction format (40 or 48 bits), but then I already have some
(currently unused) 48-bit encoding spaces in BJX2.

> Instruction names:
>
> If two instructions have a different encoding, they should have a
> different name. Apart from making an assembler marginally easier
> to write, this also makes clear what size an instruction is, if
> it has immediates.
>

Partial disagree.

I prefer to separate instructions more based on behavior than encoding;
if one forces it to be 1:1, that would be way too many mnemonics.

Things like immediate values are "obvious enough" without needing a
separate mnemonic, as 'R31' or 'X28' or similar is different than '31'.

There was also originally an '#' prefix for number immediates, but I
ended up mostly dropping this for similar reasons to '@', as well as
mixing poorly with the C preprocessor and the MSVC / Borland style
inline ASM (GCC style inline ASM is ugly).

In some cases, it gets a little abstract, as in:
MOV Immed, Rn

Can actually be mapped to any number of things by the assembler, as MOV
in this context is more interpreted as "get this constant loaded into
this register, I don't care how".

In the ISA listing, there is instead LDI/LDIZ/LDIN/LDIQ/... but, these
more reflect "how" the CPU gets the value loaded, as expressing all of
these cases as "MOV" would get a bit ambiguous.

The person writing ASM probably doesn't need to care as much how the
constant gets loaded into the register, or the optimal way to do so
(Does the target have Jumbo-Load encoding? Can it be expressed as a
Binary16 immediate? ...). The assembler can deal with this part.
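
A rough Python sketch of the "I don't care how" idea: the assembler takes the constant and picks the smallest form that can hold it. The form names and bit widths here are assumptions for illustration only. They are not the actual LDI/LDIZ/LDIN/LDIQ definitions, which are not spelled out in this post.

```python
def pick_load_form(value):
    """Return a (hypothetical) encoding form for loading `value` into a register."""
    if 0 <= value < (1 << 16):
        return "imm16-zero-extended"   # 16-bit immediate, upper bits zero
    if -(1 << 16) <= value < 0:
        return "imm16-one-extended"    # 16-bit immediate, upper bits all ones
    if -(1 << 32) <= value < (1 << 32):
        return "jumbo-imm33-signed"    # assumed 33-bit signed jumbo form
    if -(1 << 63) <= value < (1 << 64):
        return "full-64-bit-constant"  # fallback: whole 64-bit constant
    raise ValueError("constant does not fit in 64 bits")

for v in (31, -1, 0x12345678, 0x1234567890123456):
    print(hex(v), "->", pick_load_form(v))
```

The person writing "MOV Immed, Rn" never sees this choice; only the disassembly shows which form was picked.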

Unlike ARM/Thumb, no '=' prefix is used, because the '=' is also kinda
pointless IMO.

> Instruction names should be as regular as possible, made up of
>
> - prefix, for example when special registers are accessed
> - operation
> - postfix indicating data size or other specialities
> (like immediate data)
>
> Ideally, I'd like to be able to look at an instruction I have not
> seen before and recognize it, or write it based on the structure
> of the instruction names when I only remember similar instructions.
> A single regexp for all instructions splitting the parts above
> in a unique way would be great.
>

Fair enough.
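
For what the "single regexp" idea could look like, here is a small Python sketch. The naming scheme is entirely hypothetical (it is not from any ISA discussed here): an optional prefix followed by an underscore, a lowercase operation, a mandatory size in bits, and an optional trailing "i" for immediate forms.

```python
import re

# Hypothetical scheme: [prefix_]op<size>[i], e.g. "add64i", "spr_mov32", "ld8".
MNEMONIC = re.compile(
    r"^(?:(?P<prefix>[a-z]+)_)?"  # optional prefix, e.g. special-register access
    r"(?P<op>[a-z]+)"             # operation
    r"(?P<size>8|16|32|64)"       # data size in bits
    r"(?P<imm>i)?$"               # optional immediate marker
)

def split_mnemonic(name):
    """Split a mnemonic into (prefix, operation, size, has_immediate)."""
    m = MNEMONIC.match(name)
    if m is None:
        raise ValueError(f"not a valid mnemonic: {name!r}")
    return (m.group("prefix"), m.group("op"),
            int(m.group("size")), m.group("imm") is not None)

print(split_mnemonic("add64i"))     # (None, 'add', 64, True)
print(split_mnemonic("spr_mov32"))  # ('spr', 'mov', 32, False)
```

Making the size mandatory is what keeps the split unique in this sketch: without the digits in between, an operation name ending in "i" could not be told apart from an immediate form.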

> And I don't like dots in instructions.
>

It is a tradeoff. In my case, they were also inherited from SH ASM:
MOV.B, MOV.L, ...
They had also used other characters in instruction names:
CMP/GE, FCMP/EQ, ...

I generally dropped the '/' though, and in some cases the '.' is optional:
'SHLDQ' is equivalent to 'SHLD.Q'.

While it could be possible to go over to RISC-V style naming for
Load/Store ops, there isn't a huge reason to do so when the distinction
can be determined syntactically.

Say:
(Rm) or [Rm] //We know already this is memory
Rm //Bare register name is not memory

It makes more sense for an assembler which does not have special
notation for memory access, or if the order is ambiguous:
LDB X17, X13, 208
STB X17, X15, 252
OK, whatever then...
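
A minimal Python sketch of telling operand kinds apart purely by syntax, so that one mnemonic can cover register, immediate and memory forms. The patterns are assumptions for illustration: they only cover Rn/Xn style register names and decimal or hex numbers, not special registers such as LR or GBR.

```python
import re

REGISTER = re.compile(r"^[RX]\d+$", re.IGNORECASE)            # R31, X28
MEMORY = re.compile(r"^(?:\(.+\)|\[.+\])$")                    # (Rm), [Rm, Disp]
IMMEDIATE = re.compile(r"^-?(?:0x[0-9a-f]+|\d+)$", re.IGNORECASE)

def classify(operand):
    """Return 'memory', 'register' or 'immediate' for one operand string."""
    text = operand.strip()
    if MEMORY.match(text):
        return "memory"      # parentheses or brackets mark a memory access
    if REGISTER.match(text):
        return "register"    # bare register name is not memory
    if IMMEDIATE.match(text):
        return "immediate"   # bare number, no '#' prefix needed
    raise ValueError(f"unrecognised operand: {operand!r}")

for op in ("(R30)", "[R4, 16]", "R31", "31"):
    print(op, "->", classify(op))
```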

> Load and store should be called some variant of load and store,
> not move, and especially not move based on the type of the
> operands.
>
> (The above looks a lot like POWER, but see below).
>
> Assembler syntax:
>
> Registers should be clearly visible, not encoded into numbers like
> IBM likes to do. Syntax of arithmetic should be
>
> instr r_target,operand, ...
>
> while load and store should have the register first.
>

Something like:
MOVQ R23, (R28)
MOVQ (R30), R23

Would be "good enough" IMO.

For ASM, I also prefer simple untyped registers (though, within BGBCC,
register types come up in some contexts).

Namely:
R0..R63: Basic Register, Context-Dependent
RD0..RD63: Register holds a 32-bit value (assumes sign-extended)
RQ0..RQ63: Register holds a 64-bit value
LR0..LR31: Holds a register pair (Odd encodes R32..R63)
X0..X31: R0..R31, but in "RISC-V Mode" (1)

1: Decided mostly to not go into this pile of hacks. Like, even if it
isn't all that expensive on the FPGA side, the hackery involved is kinda
ugly (and whether or not this is "sane" is yet to be determined).

> I "grew up" on the AT&T syntax (well, after 6502 assembler, which
> hardly counts), so I am used to the offset(register) convention,
> but I can see why [register,offset] is more clear, so I have
> no clear preference there. I just find the Intel BYTE PTR stuff
> strange to look at.
>

All the "BYTE PTR", "DWORD PTR", ... stuff is borderline useless even in
x86.

There are only a rare few cases where it actually matters, and for
pretty much everything else they are unnecessary fluff.

> Comments? Other preferences? :-)
>

Re: Around the bike shed: Instruction names and assembler syntax

<svegkb$bse$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23796&group=comp.arch#23796

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Around the bike shed: Instruction names and assembler syntax
Date: Sat, 26 Feb 2022 18:28:55 -0600
Organization: A noiseless patient Spider
Lines: 243
Message-ID: <svegkb$bse$1@dont-email.me>
References: <svcqrj$jj6$1@newsreader4.netcologne.de>
<5ce70d12-791c-44c0-8ee3-f8cbdba4ac09n@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sun, 27 Feb 2022 00:28:59 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="f1156aedfc4c25f5791605d56cba624a";
logging-data="12174"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18NAB7i26aXthb74S3T2+Hr"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.6.1
Cancel-Lock: sha1:veGYlnujVYFSRRSSOivzP/spYwE=
In-Reply-To: <5ce70d12-791c-44c0-8ee3-f8cbdba4ac09n@googlegroups.com>
Content-Language: en-US
 by: BGB - Sun, 27 Feb 2022 00:28 UTC

On 2/26/2022 11:55 AM, MitchAlsup wrote:
> On Saturday, February 26, 2022 at 3:11:18 AM UTC-6, Thomas Koenig wrote:
>> This is surely not very important for most people, because few
>> people read assembler, and fewer write it, but it is a part
>> of designing an ISA.
>>
>> Reading assembler for a few architectures over the years, I have
>> formed a few preferences that I would like to share, with the firm
>> expectation that others will have different preferences :-)
>>
>> Instruction names:
>>
>> If two instructions have a different encoding, they should have a
>> different name. Apart from making an assembler marginally easier
>> to write, this also makes clear what size an instruction is, if
>> it has immediates.
> <
> Then you would be eating up a lot of namespace in the programmer's head::
> <
> ADD R7,R4,R19
> ADD R7,R4,-R19
> ADD R7,-R4,-R19
> ADD R7,R4,#256
> ADD R7,R4,#-256
> ADD R7,#256,R19
> ADD R7,#-256,R19
> ADD R7,R4,#0x12345678
> ADD R7,R4,#0x1234567890123456
> ADD R7,#0x12345678,R19
> ADD R7,#0x1234567890123456,R19
> <
> You eat up a lot of namespace whereas you could completely minimize the
> namespace consumption and allow the ASCII to express itself.
> <
> And I left out the Overflow checking versions and the unsigned distinctions.
> <
> AND: this is only 1 instruction. Yet, the ASCII is completely readable
> to anyone who has ever read assembly language code before. More
> importantly, the surprise factor is near zero.

Likewise, similar issue:
ADD Rm, Rn //16-bit
ADD Rm, Rn //32-bit
ADD Imm16u, Rn
ADD Imm16n, Rn
ADD Rm, Ro, Rn
ADD Rm, Imm9u, Rn
ADD Rm, Imm9n, Rn

Add in a few more variants with Jumbo and Op64 encodings.
ADD Rm, Ro, Rn //XGPR (R0..R63)
ADD Rm, Ro, Rn //Op64, like XGPR but twice as big
ADD Rm, Imm33s, Rn //Jumbo
ADD Rm, Imm17s, Rn //Op64 (R0..R63)
...

Add more if one considers the predicated forms as separate instructions
(rather than alternate encodings of the same instruction).

....

I don't actually bother to include most of these variants in the
listings, since they can be inferred from the encoding rules.

The CPU doesn't really care much either, in terms of the main decoder,
it might see:
It is in the F0 Block,
Its second word matches the pattern: 16'h1zz0 (ADD Rm, Ro, Rn).
It is in the F2 Block,
Its second word matches the pattern: 16'h0zzz (ADD Rm, Imm, Rn).
...

That there are 10+ encoding paths which happen to land in the F0 block,
doesn't really matter all that much.

Well, and the EX stages see:
NMID=ALU3R, UCIX=ADD
The EX stages and ALU don't really care how we got there.

Trying to list out every encoding of every instruction would likely be
impractical.

ADD Rm, Ro, Rn
F0nm_1eo0 (Base)
F4nm_1eo0 (WEX)
E0nm_1eo0 (?T)
E4nm_1eo0 (?F)
EAnm_1eo0 (WEX?T)
EEnm_1eo0 (WEX?F)
7wnm_1eo0 (XGPR), Ww=0
7Wnm_1eo0 (XGPR, WEX), Ww=1
FFw0_0vii_F0nm_1eo0 (Op64)
FFw0_0vii_F4nm_1eo0 (WEX, Reserved)
FFw0_0vii_E0nm_1eo0 (?T)
FFw0_0vii_E4nm_1eo0 (?F)
FFw0_0vii_EAnm_1eo0 (WEX?T, Reserved)
FFw0_0vii_EEnm_1eo0 (WEX?F, Reserved)

FFw0_0Vii_F0nm_1eo0 (Op64), Vv=1 (N/A 2)
FFw0_0Vii_F4nm_1eo0 (WEX, Reserved), Vv=1 (N/A 2)
FFw0_0Vii_E0nm_1eo0 (?T), Vv=1 (N/A 2)
FFw0_0Vii_E4nm_1eo0 (?F), Vv=1 (N/A 2)
FFw0_0Vii_EAnm_1eo0 (WEX?T, Reserved), Vv=1 (N/A 2)
FFw0_0Vii_EEnm_1eo0 (WEX?F, Reserved), Vv=1 (N/A 2)

FEjj_jjjj_F0nm_1eo0 (Jumbo, N/A 2)
FEjj_jjjj_F4nm_1eo0 (WEX, Reserved)
FEjj_jjjj_E0nm_1eo0 (?T, N/A 2)
FEjj_jjjj_E4nm_1eo0 (?F, N/A 2)
FEjj_jjjj_EAnm_1eo0 (WEX?T, Reserved)
FEjj_jjjj_EEnm_1eo0 (WEX?F, Reserved)

...
Then repeat a bunch of these again for the Op40x2 Encodings.
78ww_vvii_F4nm_1eo0_F0nm_1eo0
...
Maybe add Jumbo96 variants for good measure...
FEjj_jjjj_FEjj_jjjj_F0nm_1eo0 (Yeah...)
...
Maybe add Jumbo+Op64 variants (Reserved, but whatever)
FEjj_jjjj_FFw0_0vii_F0nm_1eo0 (...)
...
...

N/A 2: Technically Valid, but N/A for this instruction, thus reserved.

And, "ADD Rm, Imm, Rn" (F2) or "ADD Imm, Rn" (F2|F8) would have an
entirely different set of patterns.

The assembler doesn't need to deal with every possible case either, it
generates the base encoding, and then modifies it into the needed form
(or fails if the instruction can't be repacked into the requested form).
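
The "generate the base encoding, then modify it" step might be sketched in Python like this. The table of top bytes is copied from the 32-bit forms of "ADD Rm, Ro, Rn" listed above. Everything else (the function name repack, treating the instruction as a pair of 16-bit words, and only touching the top byte) is an assumption made for this illustration, not the actual assembler code.

```python
# Top byte of the first 16-bit word for each 32-bit form of "ADD Rm, Ro, Rn",
# taken from the listing above (F0nm_1eo0, F4nm_1eo0, E0nm_1eo0, ...).
TOP_BYTE = {
    "base": 0xF0, "WEX": 0xF4,
    "?T": 0xE0, "?F": 0xE4,
    "WEX?T": 0xEA, "WEX?F": 0xEE,
}

def repack(words, form):
    """Rewrite a base-form word pair (F0nm, 1eo0) into the requested form."""
    first, second = words
    if first >> 8 != TOP_BYTE["base"]:
        raise ValueError("expected a base encoding in the F0 block")
    if form not in TOP_BYTE:
        raise ValueError(f"cannot repack into form {form!r}")
    # Keep the n and m register nibbles, swap only the block/predicate byte.
    return ((TOP_BYTE[form] << 8) | (first & 0x00FF), second)

base = (0xF012, 0x1030)  # made-up register fields, base form
print([hex(w) for w in repack(base, "WEX?T")])  # ['0xea12', '0x1030']
```

The XGPR, Op64 and Jumbo forms would need more than a byte swap (extra prefix words, wider register fields), so they are left out here.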

Granted, one can't really do a naive "count instructions and multiply by
4 to get code size" with this.

>>
>> Instruction names should be as regular as possible, made up of
>>
>> - prefix, for example when special registers are accessed
>> - operation
>> - postfix indicating data size or other specialities
>> (like immediate data)
> <
> Only memory references have size and signedness, everything
> else is 64-bits (except for 32-bit floats)

More or less similar.

I have:
32-bits, sign or zero extended to 64 bits
64-bits, native
128-bits, register pair

Where relevant, instructions use suffix modifiers:
L: 32-bit variant.
Q: 64-bit variant
X: 128-bit variant
In a few cases, the X has been used as a prefix:
XMOV / XMOV.x
Mostly because putting it as a suffix would be ambiguous.

>>
>> Ideally, I'd like to be able to look at an instruction I have not
>> seen before and recognize it, or write it based on the structure
>> of the instruction names when I only remember similar instructions.
>> A single regexp for all instructions splitting the parts above
>> in a unique way would be great.
>>
>> And I don't like dots in instructions.
>>
>> Load and store should be called some variant of load and store,
>> not move, and especially not move based on the type of the
>> operands.
> <
> Here I completely agree. MOV is reg-to-reg, LD brings forth from
> memory, ST sends towards memory.

I require a type suffix for Load / Store.

Bare 'MOV' is limited to RegReg or ImmReg and similar.

I did merge it down some, vs SH:
LDS, STS, LDC, STC

Which were all folded into MOV:
MOV LR, R1
MOV R3, GBR
...
Obvious enough, don't need special mnemonics.

Assembler can see that I am using LR or GBR and figure this detail out
on its own.

>>
>> (The above looks a lot like POWER, but see below).
>>
>> Assembler syntax:
>>
>> Registers should be clearly visible, not encoded into numbers like
>> IBM likes to do. Syntax of arithmetic should be
>>
>> instr r_target,operand, ...
>>
>> while load and store should have the register first.
> <
> The official name of a register in My 66000 is R[index] where index
> goes from 0..31. The assembler has a set of Macros that capture
> R0..R31, with aliases for FP = R[30] and SP = R[31]. In memory
> reference instructions one can write:: LD Rd,[IP+immed]; using
> the current IP as the base register.
> <
> Your proposal would need LDIP of all 8 flavors and STIP of another
> 4 flavors

As noted, I have:
(Rm,Disp)
(Rm,Ri)
I could have used:
[Rm+Disp]
[Rm+Ri*Sc]

But, either way.

>>
>> I "grew up" on the AT&T syntax (well, after 6502 assembler, which
>> hardly counts), so I am used to the offset(register) convention,
>> but I can see why [register,offset] is more clear, so I have
>> no clear preference there. I just find the Intel BYTE PTR stuff
>> strange to look at.
>>
>> Comments? Other preferences? :-)
> <
> Let architects define their own instruction sets.

Re: Around the bike shed: Instruction names and assembler syntax

<svevu6$ksc$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23798&group=comp.arch#23798

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: not_va...@comcast.net (James Van Buskirk)
Newsgroups: comp.arch
Subject: Re: Around the bike shed: Instruction names and assembler syntax
Date: Sat, 26 Feb 2022 21:49:57 -0700
Organization: A noiseless patient Spider
Lines: 1
Message-ID: <svevu6$ksc$1@dont-email.me>
References: <svcqrj$jj6$1@newsreader4.netcologne.de> <svdpt4$or9$1@dont-email.me> <svdrds$bu1$1@newsreader4.netcologne.de>
Mime-Version: 1.0
Content-Type: text/plain;
format=flowed;
charset="Windows-1252";
reply-type=original
Content-Transfer-Encoding: 7bit
Injection-Date: Sun, 27 Feb 2022 04:50:14 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="e2d5c72125eaaa0f9989bcdbe5e027a0";
logging-data="21388"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/yIUueMEgXXI3Sm6Uij5LA4WETlNqTvTs="
Cancel-Lock: sha1:woiPGQMZg08Pr5PxFMhanZ6ZKGs=
X-MimeOLE: Produced By Microsoft MimeOLE V16.4.3528.331
In-Reply-To: <svdrds$bu1$1@newsreader4.netcologne.de>
X-Newsreader: Microsoft Windows Live Mail 16.4.3528.331
Importance: Normal
X-Priority: 3
X-MSMail-Priority: Normal
 by: James Van Buskirk - Sun, 27 Feb 2022 04:49 UTC

"Thomas Koenig" wrote in message
news:svdrds$bu1$1@newsreader4.netcologne.de...

> James Van Buskirk <not_valid@comcast.net> schrieb:

> > How do you propose to distinguish between

> > AND RAX, RBX

> > and

> > AND RAX, RBX

> > The first is encoded as REX.W + 21 /r and the second as REX.W + 23 /r?

> Sorry, I don't understand what you mean. Who or what is REX.W and what
> does + 21 /r and + 23 /r mean?

REX.W means to use the REX prefix with the W bit set, which selects 64-bit,
rather than 32-bit, operands. This is INT(Z'48',INT8).
The 21 or 23 is the opcode byte, thus INT(Z'21',INT8) or INT(Z'23',INT8).
The /r means bits 3:5 of the MOD R/M byte refer to a register; when
bits 6:7 == B'11', the other argument is a register as well, and bits 0:2
refer to that register.
With opcode 21 the destination is the operand in bits 0:2; with opcode 23
it is the register in bits 3:5.
RAX is register B'000' and RBX is register B'011' (I know :( ) so the MOD R/M
byte in the first instance comes out to be 11 011 000 = INT(Z'D8',INT8) and
in the second case it's 11 000 011 = INT(Z'C3',INT8).
Thus we can encode

AND RAX, RBX

as

db 48h, 21h, 0D8h

or

db 48h, 23h, 0C3h

I found https://defuse.ca/online-x86-assembler.htm#disassembly2
to be convenient for checking this.
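
As a cross-check of the arithmetic above, here is a minimal Python sketch that packs the MOD R/M byte for both opcodes. The helper names modrm and and_rr are made up for this example, and the register table only covers the first four registers.

```python
# Low three bits of the register numbers (no REX.R/REX.B extension needed here).
REG = {"RAX": 0b000, "RCX": 0b001, "RDX": 0b010, "RBX": 0b011}

REX_W = 0x48  # REX prefix with W=1: 64-bit operand size

def modrm(mod, reg, rm):
    """Pack the MOD R/M byte: mod in bits 6:7, reg in bits 3:5, rm in bits 0:2."""
    return (mod << 6) | (reg << 3) | rm

def and_rr(dst, src, opcode):
    """Encode AND dst, src for two 64-bit registers using opcode 0x21 or 0x23."""
    if opcode == 0x21:    # AND r/m64, r64: destination sits in the rm field
        byte = modrm(0b11, REG[src], REG[dst])
    elif opcode == 0x23:  # AND r64, r/m64: destination sits in the reg field
        byte = modrm(0b11, REG[dst], REG[src])
    else:
        raise ValueError("expected opcode 0x21 or 0x23")
    return bytes([REX_W, opcode, byte])

print(and_rr("RAX", "RBX", 0x21).hex())  # 4821d8
print(and_rr("RAX", "RBX", 0x23).hex())  # 4823c3
```

Both byte strings disassemble to the same AND RAX, RBX, which is the point of the question.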

Re: Around the bike shed: Instruction names and assembler syntax

<svgn1j$9nm$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=23804&group=comp.arch#23804

Path: i2pn2.org!i2pn.org!usenet.goja.nl.eu.org!news.freedyn.de!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2a0a-a540-c7b-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: Around the bike shed: Instruction names and assembler syntax
Date: Sun, 27 Feb 2022 20:30:43 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <svgn1j$9nm$1@newsreader4.netcologne.de>
References: <svcqrj$jj6$1@newsreader4.netcologne.de>
<svd280$fiv$1@dont-email.me>
Injection-Date: Sun, 27 Feb 2022 20:30:43 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2a0a-a540-c7b-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2a0a:a540:c7b:0:7285:c2ff:fe6c:992d";
logging-data="9974"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)
 by: Thomas Koenig - Sun, 27 Feb 2022 20:30 UTC

Ivan Godard <ivan@millcomputing.com> schrieb:
> On 2/26/2022 1:11 AM, Thomas Koenig wrote:
>> This is surely not very important for most people, because few
>> people read assembler, and fewer write it, but it is a part
>> of designing an ISA.
>>
>> Reading assembler for a few architectures over the years, I have
>> formed a few preferences that I would like to share, with the firm
>> expectation that others will have different preferences :-)
>>
>> Instruction names:
>>
>> If two instructions have a different encoding, they should have a
>> different name. Apart from making an assembler marginally easier
>> to write, this also makes clear what size an instruction is, if
>> it has immediates.
>>
>> Instruction names should be as regular as possible, made up of
>>
>> - prefix, for example when special registers are accessed
>> - operation
>> - postfix indicating data size or other specialities
>> (like immediate data)
>>
>> Ideally, I'd like to be able to look at an instruction I have not
>> seen before and recognize it, or write it based on the structure
>> of the instruction names when I only remember similar instructions.
>> A single regexp for all instructions splitting the parts above
>> in a unique way would be great.
>
> Or you can make the asm be similar to the syntax of a HLL, arguably more
> familiar to most readers than a traditional asm syntax, even one with
> your suggestions.
>
> The case in hand: conAsm is syntactically C++, using overloading of
> operators. Asm instructions are C++ functions using C++ "foo(bar1, bar2,
> ...)" functional notation.

How does that look in detail? Would you have (assuming conventional
registers, which I know the Mill does not have)

r1 = add(r2,r3)

or

add (r1,r2,r3)

? The latter would not be an advantage over conventional assembler,
IMHO.

Can you maybe share a few examples?

Re: Around the bike shed: Instruction names and assembler syntax

<svgn7t$9nm$2@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=23805&group=comp.arch#23805

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2a0a-a540-c7b-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: Around the bike shed: Instruction names and assembler syntax
Date: Sun, 27 Feb 2022 20:34:05 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <svgn7t$9nm$2@newsreader4.netcologne.de>
References: <svcqrj$jj6$1@newsreader4.netcologne.de>
<5ce70d12-791c-44c0-8ee3-f8cbdba4ac09n@googlegroups.com>
Injection-Date: Sun, 27 Feb 2022 20:34:05 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2a0a-a540-c7b-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2a0a:a540:c7b:0:7285:c2ff:fe6c:992d";
logging-data="9974"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)
 by: Thomas Koenig - Sun, 27 Feb 2022 20:34 UTC

MitchAlsup <MitchAlsup@aol.com> schrieb:
> On Saturday, February 26, 2022 at 3:11:18 AM UTC-6, Thomas Koenig wrote:
>> This is surely not very important for most people, because few
>> people read assembler, and fewer write it, but it is a part
>> of designing an ISA.
>>
>> Reading assembler for a few architectures over the years, I have
>> formed a few preferences that I would like to share, with the firm
>> expectation that others will have different preferences :-)
>>
>> Instruction names:
>>
>> If two instructions have a different encoding, they should have a
>> different name. Apart from making an assembler marginally easier
>> to write, this also makes clear what size an instruction is, if
>> it has immediates.
><
> Then you would be eating up a lot of namespace in the programmers head::
><
> ADD R7,R4,R19
> ADD R7,R4,-R19
> ADD R7,-R4,-R19
> ADD R7,R4,#256
> ADD R7,R4,#-256
> ADD R7,#256,R19
> ADD R7,#-256,R19
> ADD R7,R4,#0x12345678
> ADD R7,R4,#0x1234567890123456
> ADD R7,#0x12345678,R19
> ADD R7,#0x1234567890123456,R19

I agree that it would make less sense for your ISA than
for a more conventional RISC.
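
As a rough illustration of the "one name per encoding" idea on a more conventional RISC, the sketch below picks a mnemonic from the size of the immediate. The mnemonics `addi16`, `addi32` and `addi64` and the function `add_mnemonic_for` are made up for this example; they do not belong to any ISA discussed in the thread.

```cpp
#include <cstdint>
#include <limits>
#include <string>

// Hypothetical naming scheme: each immediate width has its own encoding,
// and therefore its own mnemonic, so the name alone tells the reader how
// large the instruction is.
inline std::string add_mnemonic_for(std::int64_t imm) {
    if (imm >= std::numeric_limits<std::int16_t>::min() &&
        imm <= std::numeric_limits<std::int16_t>::max()) {
        return "addi16";  // immediate fits in 16 signed bits
    }
    if (imm >= std::numeric_limits<std::int32_t>::min() &&
        imm <= std::numeric_limits<std::int32_t>::max()) {
        return "addi32";  // immediate needs up to 32 signed bits
    }
    return "addi64";      // anything larger takes the 64-bit form
}
```

With the immediates from the quoted list, `#256` and `#-256` would map to `addi16`, `#0x12345678` to `addi32`, and `#0x1234567890123456` to `addi64`. An assembler that keeps a single `ADD` name makes the same choice silently.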

[...]

>> Comments? Other preferences? :-)
><
> Let architects define their own instruction sets.

Obviously :-)

Re: Around the bike shed: Instruction names and assembler syntax

<svgqq7$cll$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=23807&group=comp.arch#23807
Path: i2pn2.org!i2pn.org!usenet.goja.nl.eu.org!news.freedyn.de!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2a0a-a540-c7b-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: Around the bike shed: Instruction names and assembler syntax
Date: Sun, 27 Feb 2022 21:35:03 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <svgqq7$cll$1@newsreader4.netcologne.de>
References: <svcqrj$jj6$1@newsreader4.netcologne.de>
<memo.20220226104336.7708a@jgd.cix.co.uk>
Injection-Date: Sun, 27 Feb 2022 21:35:03 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2a0a-a540-c7b-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2a0a:a540:c7b:0:7285:c2ff:fe6c:992d";
logging-data="12981"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)
 by: Thomas Koenig - Sun, 27 Feb 2022 21:35 UTC

John Dallman <jgd@cix.co.uk> schrieb:
> In article <svcqrj$jj6$1@newsreader4.netcologne.de>,
> tkoenig@netcologne.de (Thomas Koenig) wrote:
>
>> If two instructions have a different encoding, they should have a
>> different name. Apart from making an assembler marginally easier
>> to write, this also makes clear what size an instruction is, if
>> it has immediates.
>
> This hinges on what an assembly language is optimised for. Is it a
> higher-level expression of a sequence of instructions, or a low-level
> representation of a program? Those aren't quite the same thing, although
> assembly languages do both duties.

I would say the latter, except when using macros.

> Your view makes perfect sense for a processor designer. I'm someone who
> reads assembly listings and debugger disassembly to figure out where a
> compiler bug has shown up.

I do that as well, but I am also looking for inefficient encoding.

Re: Around the bike shed: Instruction names and assembler syntax

<svh0ls$d6m$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23809&group=comp.arch#23809
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: iva...@millcomputing.com (Ivan Godard)
Newsgroups: comp.arch
Subject: Re: Around the bike shed: Instruction names and assembler syntax
Date: Sun, 27 Feb 2022 15:15:07 -0800
Organization: A noiseless patient Spider
Lines: 82
Message-ID: <svh0ls$d6m$1@dont-email.me>
References: <svcqrj$jj6$1@newsreader4.netcologne.de>
<svd280$fiv$1@dont-email.me> <svgn1j$9nm$1@newsreader4.netcologne.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sun, 27 Feb 2022 23:15:09 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="7ba5d4082cb3d231a79cb55212aa6519";
logging-data="13526"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18/HRZXppfAqTp7ymSYQ8OQ"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.6.1
Cancel-Lock: sha1:1+LQWJR48REzA0kS4xIeDrRLqbI=
In-Reply-To: <svgn1j$9nm$1@newsreader4.netcologne.de>
Content-Language: en-US
 by: Ivan Godard - Sun, 27 Feb 2022 23:15 UTC

On 2/27/2022 12:30 PM, Thomas Koenig wrote:
> Ivan Godard <ivan@millcomputing.com> schrieb:
>> On 2/26/2022 1:11 AM, Thomas Koenig wrote:
>>> This is surely not very important for most people, because few
>>> people read assembler, and fewer write it, but it is a part
>>> of designing an ISA.
>>>
>>> Reading assembler for a few architectures over the years, I have
>>> formed a few preferences that I would like to share, with the firm
>>> expectation that others will have different preferences :-)
>>>
>>> Instruction names:
>>>
>>> If two instructions have a different encoding, they should have a
>>> different name. Apart from making an assembler marginally easier
>>> to write, this also makes clear what size an instruction is, if
>>> it has immediates.
>>>
>>> Instruction names should be as regular as possible, made up of
>>>
>>> - prefix, for example when special registers are accessed
>>> - operation
>>> - postfix indicating data size or other specialities
>>> (like immediate data)
>>>
>>> Ideally, I'd like to be able to look at an instruction I have not
>>> seen before and recognize it, or write it based on the structure
>>> of the instruction names when I only remember similar instructions.
>>> A single regexp for all instructions splitting the parts above
>>> in a unique way would be great.
>>
>> Or you can make the asm be similar to the syntax of a HLL, arguably more
>> familiar to most readers than a traditional asm syntax, even one with
>> your suggestions.
>>
>> The case in hand: conAsm is syntactically C++, using overloading of
>> operators. Asm instructions are C++ functions using C++ "foo(bar1, bar2,
>> ...)" functional notation.
>
> How does that look in detail? Would you have (assuming conventional
> registers, which I know the Mill does not have)
>
> r1 = add(r2,r3)
>
> or
>
> add (r1,r2,r3)
>
> ? The latter would not be an advantage over conventional assembler,
> IMHO.
>
> Can you maybe share a few examples?

Source:
int a;
int foo(int b, int c) {
    if ((b&14) == 0)
        a = c - 1414;
    else
        b += c;
    return b;
}

Compiled conAsm:
F("foo") %0 %1;
formals() %0 %1,
rd(w(1414)) %7,
sub(b2 %1, b0 %7) %4,
eqlb() %3, andl(b2 %0, 14) %2,
storetr(b3 %4, dp, gl("a"), b2 %3),
add(b5 %1, b4 %0) %6,
retnfl(b2 %3, b0 %6),
retn(b5 %0);

Edited to remove compiler-generated documentary comment, such as source
line/column numbers, belt event activity description, etc. I have left
in the "%N" belt value identifiers, although these can be omitted in
manual asm (only the "bN" temporal indices are necessary).

Note that this function comprises one bundle and takes one cycle to
execute on a low-midrange Mill.
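
For readers unfamiliar with the listing, here is one possible reading of it, written as ordinary C++ rather than conAsm. The function `foo_predicated` and the mapping of each line to an operation in the listing are an interpretation made for this sketch, not something taken from Mill documentation: both arms are computed, the store is guarded by the comparison result, and the return value is selected by the same predicate.

```cpp
// Global written by foo, as in the posted source.
int a = 0;

// The posted source, unchanged in behaviour.
int foo(int b, int c) {
    if ((b & 14) == 0)
        a = c - 1414;
    else
        b += c;
    return b;
}

// Hypothetical restatement in the order the listing seems to suggest.
// Both arithmetic results are computed whether or not they are needed,
// so this sketch assumes neither c - 1414 nor b + c overflows.
int foo_predicated(int b, int c) {
    const int  diff = c - 1414;        // rd + sub
    const bool pred = (b & 14) == 0;   // andl + eqlb
    if (pred) a = diff;                // storetr: store only if pred is true
    const int  sum = b + c;            // add
    return pred ? b : sum;             // retnfl (pred false) / retn
}
```

Under the stated no-overflow assumption both functions return the same value and leave `a` in the same state; the second form only makes it easier to see why all operations could be issued together.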

Re: Around the bike shed: Instruction names and assembler syntax

<svh2tg$j53$1@gal.iecc.com>

https://www.novabbs.com/devel/article-flat.php?id=23810&group=comp.arch#23810
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!not-for-mail
From: joh...@taugh.com (John Levine)
Newsgroups: comp.arch
Subject: Re: Around the bike shed: Instruction names and assembler syntax
Date: Sun, 27 Feb 2022 23:53:20 -0000 (UTC)
Organization: Taughannock Networks
Message-ID: <svh2tg$j53$1@gal.iecc.com>
References: <svcqrj$jj6$1@newsreader4.netcologne.de> <svd280$fiv$1@dont-email.me>
Injection-Date: Sun, 27 Feb 2022 23:53:20 -0000 (UTC)
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="19619"; mail-complaints-to="abuse@iecc.com"
In-Reply-To: <svcqrj$jj6$1@newsreader4.netcologne.de> <svd280$fiv$1@dont-email.me>
Cleverness: some
X-Newsreader: trn 4.0-test77 (Sep 1, 2010)
Originator: johnl@iecc.com (John Levine)
 by: John Levine - Sun, 27 Feb 2022 23:53 UTC

It appears that Ivan Godard <ivan@millcomputing.com> said:
>Or you can make the asm be similar to the syntax of a HLL, arguably more
>familiar to most readers than a traditional asm syntax, even one with
>your suggestions.

I liked PL360 which was a S/360 assembler with a syntax similar to Algol.
Wirth used it to write Algol W. I used it some around 1970 at Princeton
which let people run small PL360 and AlgolW programs without needing an account.
I don't think they realized that PL360 was an assembler so you could write some
rather naughty programs if you wanted to.

http://bitsavers.org/pdf/stanford/cs_techReports/STAN-CS-71-215_PL360_Rev_May72.pdf

--
Regards,
John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly

Re: Around the bike shed: Instruction names and assembler syntax

<svh3hj$u5r$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23811&group=comp.arch#23811
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: iva...@millcomputing.com (Ivan Godard)
Newsgroups: comp.arch
Subject: Re: Around the bike shed: Instruction names and assembler syntax
Date: Sun, 27 Feb 2022 16:04:04 -0800
Organization: A noiseless patient Spider
Lines: 19
Message-ID: <svh3hj$u5r$1@dont-email.me>
References: <svcqrj$jj6$1@newsreader4.netcologne.de>
<svd280$fiv$1@dont-email.me> <svh2tg$j53$1@gal.iecc.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 28 Feb 2022 00:04:03 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="7ba5d4082cb3d231a79cb55212aa6519";
logging-data="30907"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+9Ztc4dngzlG4k2UzHs9jE"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.6.1
Cancel-Lock: sha1:8kCiZT5byIe8p2DeKaR6DY1yQSk=
In-Reply-To: <svh2tg$j53$1@gal.iecc.com>
Content-Language: en-US
 by: Ivan Godard - Mon, 28 Feb 2022 00:04 UTC

On 2/27/2022 3:53 PM, John Levine wrote:
> It appears that Ivan Godard <ivan@millcomputing.com> said:
>> Or you can make the asm be similar to the syntax of a HLL, arguably more
>> familiar to most readers than a traditional asm syntax, even one with
>> your suggestions.
>
> I liked PL360 which was a S/360 assembler with a syntax similar to Algol.
> Wirth used it to write Algol W. I used it some around 1970 at Princeton
> which let people run small PL360 and AlgolW programs without needing an account.
> I don't think they realized that PL360 was an assembler so you could write some
> rather naughty programs if you wanted to.
>
> http://bitsavers.org/pdf/stanford/cs_techReports/STAN-CS-71-215_PL360_Rev_May72.pdf
>

The expression-assembler PL360 was the best language work Wirth ever
did. Following it there were several others done for various other ISAs -
PL11, PL516, ... My own Mary language family can be seen as the bastard
child of Algol68 and PL360.

Re: Around the bike shed: Instruction names and assembler syntax

<Fx5TJ.75448$3jp8.56095@fx33.iad>

https://www.novabbs.com/devel/article-flat.php?id=23817&group=comp.arch#23817
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!peer01.ams1!peer.ams1.xlned.com!news.xlned.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx33.iad.POSTED!not-for-mail
From: ThatWoul...@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: Around the bike shed: Instruction names and assembler syntax
References: <svcqrj$jj6$1@newsreader4.netcologne.de> <svd280$fiv$1@dont-email.me> <svh2tg$j53$1@gal.iecc.com>
In-Reply-To: <svh2tg$j53$1@gal.iecc.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 30
Message-ID: <Fx5TJ.75448$3jp8.56095@fx33.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Mon, 28 Feb 2022 15:06:13 UTC
Date: Mon, 28 Feb 2022 10:06:07 -0500
X-Received-Bytes: 2241
 by: EricP - Mon, 28 Feb 2022 15:06 UTC

John Levine wrote:
> It appears that Ivan Godard <ivan@millcomputing.com> said:
>> Or you can make the asm be similar to the syntax of a HLL, arguably more
>> familiar to most readers than a traditional asm syntax, even one with
>> your suggestions.
>
> I liked PL360 which was a S/360 assembler with a syntax similar to Algol.
> Wirth used it to write Algol W. I used it some around 1970 at Princeton
> which let people run small PL360 and AlgolW programs without needing an account.
> I don't think they realized that PL360 was an assembler so you could write some
> rather naughty programs if you wanted to.
>
> http://bitsavers.org/pdf/stanford/cs_techReports/STAN-CS-71-215_PL360_Rev_May72.pdf
>

If this refers to structured assembler, yes that should be supported.
I've used structured VAX macros that implement nested
IF-THEN-ELSIF-ELSE-ENDIF and various kinds of LOOP-ENDLOOP.
That alone goes a long way to de-cluttering assembler.
So that should always be a built in assembler feature.

My preference is that branch (BR and its variants BSR, Bxxx) is for relative
offsets, and jump (JMP and its variants JSR, Jxxx) is for absolute addresses.
I really don't like it when the syntax of the address expression is
somehow used to determine whether the target is absolute or relative.
It is too easy to make a mistake.
Also, I have both "JMP reg" and "BR reg" instructions and variants,
so it needs to be clear which is intended.
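
The distinction can be sketched as follows. The names `Transfer` and `target_address` are invented for this example, and the choice to measure relative offsets from the address of the transfer instruction itself (rather than from the next instruction) is an assumption; real ISAs differ on that point.

```cpp
#include <cstdint>

// Hypothetical model: the mnemonic family, not the operand syntax,
// decides how the operand is interpreted.
enum class Transfer { Branch, Jump };

// Branch: operand is a signed offset added to the current PC.
// Jump:   operand is the absolute target address.
constexpr std::uint64_t target_address(Transfer kind,
                                       std::uint64_t pc,
                                       std::int64_t operand) {
    return kind == Transfer::Branch
        ? pc + static_cast<std::uint64_t>(operand)  // wraps modulo 2^64
        : static_cast<std::uint64_t>(operand);
}

static_assert(target_address(Transfer::Branch, 0x1000, -16) == 0x0FF0,
              "relative: PC plus signed offset");
static_assert(target_address(Transfer::Jump, 0x1000, 0x2000) == 0x2000,
              "absolute: operand is the target");
```

The same register value therefore means two different things under "BR reg" and "JMP reg", which is why the mnemonic has to carry the information.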

Re: Around the bike shed: Instruction names and assembler syntax

<svit72$8pq$1@gal.iecc.com>

https://www.novabbs.com/devel/article-flat.php?id=23819&group=comp.arch#23819
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!not-for-mail
From: joh...@taugh.com (John Levine)
Newsgroups: comp.arch
Subject: Re: Around the bike shed: Instruction names and assembler syntax
Date: Mon, 28 Feb 2022 16:28:18 -0000 (UTC)
Organization: Taughannock Networks
Message-ID: <svit72$8pq$1@gal.iecc.com>
References: <svcqrj$jj6$1@newsreader4.netcologne.de> <svd280$fiv$1@dont-email.me> <svh2tg$j53$1@gal.iecc.com> <Fx5TJ.75448$3jp8.56095@fx33.iad>
Injection-Date: Mon, 28 Feb 2022 16:28:18 -0000 (UTC)
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="9018"; mail-complaints-to="abuse@iecc.com"
In-Reply-To: <svcqrj$jj6$1@newsreader4.netcologne.de> <svd280$fiv$1@dont-email.me> <svh2tg$j53$1@gal.iecc.com> <Fx5TJ.75448$3jp8.56095@fx33.iad>
Cleverness: some
X-Newsreader: trn 4.0-test77 (Sep 1, 2010)
Originator: johnl@iecc.com (John Levine)
 by: John Levine - Mon, 28 Feb 2022 16:28 UTC

According to EricP <ThatWouldBeTelling@thevillage.com>:
>John Levine wrote:
>> I liked PL360 which was a S/360 assembler with a syntax similar to Algol. ...
>>
>> http://bitsavers.org/pdf/stanford/cs_techReports/STAN-CS-71-215_PL360_Rev_May72.pdf
>>
>If this refers to structured assembler, yes that should be supported.

No, it was an assembler with a syntax similar to Algol.

When you looked at the report at the URL in your message, what did you think of it?

--
Regards,
John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly

Re: Around the bike shed: Instruction names and assembler syntax

<167TJ.7057$aT3.6255@fx09.iad>

https://www.novabbs.com/devel/article-flat.php?id=23820&group=comp.arch#23820
Path: i2pn2.org!i2pn.org!usenet.goja.nl.eu.org!news.freedyn.de!newsreader4.netcologne.de!news.netcologne.de!peer03.ams1!peer.ams1.xlned.com!news.xlned.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx09.iad.POSTED!not-for-mail
From: ThatWoul...@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: Around the bike shed: Instruction names and assembler syntax
References: <svcqrj$jj6$1@newsreader4.netcologne.de> <svd280$fiv$1@dont-email.me> <svh2tg$j53$1@gal.iecc.com> <Fx5TJ.75448$3jp8.56095@fx33.iad> <svit72$8pq$1@gal.iecc.com>
In-Reply-To: <svit72$8pq$1@gal.iecc.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 22
Message-ID: <167TJ.7057$aT3.6255@fx09.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Mon, 28 Feb 2022 16:53:17 UTC
Date: Mon, 28 Feb 2022 11:52:53 -0500
X-Received-Bytes: 1689
 by: EricP - Mon, 28 Feb 2022 16:52 UTC

John Levine wrote:
> According to EricP <ThatWouldBeTelling@thevillage.com>:
>> John Levine wrote:
>>> I liked PL360 which was a S/360 assembler with a syntax similar to Algol. ...
>>>
>>> http://bitsavers.org/pdf/stanford/cs_techReports/STAN-CS-71-215_PL360_Rev_May72.pdf
>>>
>> If this refers to structured assembler, yes that should be supported.
>
> No, it was an assembler with a syntax similar to Algol.
>
> When you looked at the report at the URL in your message, what did you think of it?

I hadn't looked at it... I had heard that IBM had a structured assembler
in the past and was hoping that the report summarized its basic description.
However, now that I do look at it, I'd say it's more of an HLL than an assembler.
It reminds me of VAX Bliss, which I never wrote but read on source code
documentation microfiche.

Re: Around the bike shed: Instruction names and assembler syntax

<svj1lm$q0p$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=23821&group=comp.arch#23821
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2a0a-a540-c7b-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: Around the bike shed: Instruction names and assembler syntax
Date: Mon, 28 Feb 2022 17:44:22 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <svj1lm$q0p$1@newsreader4.netcologne.de>
References: <svcqrj$jj6$1@newsreader4.netcologne.de>
<svd280$fiv$1@dont-email.me> <svgn1j$9nm$1@newsreader4.netcologne.de>
<svh0ls$d6m$1@dont-email.me>
Injection-Date: Mon, 28 Feb 2022 17:44:22 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2a0a-a540-c7b-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2a0a:a540:c7b:0:7285:c2ff:fe6c:992d";
logging-data="26649"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)
 by: Thomas Koenig - Mon, 28 Feb 2022 17:44 UTC

Ivan Godard <ivan@millcomputing.com> schrieb:

> int a;
> int foo(int b, int c) {
>     if ((b&14) == 0)
>         a = c - 1414;
>     else
>         b += c;
>     return b;
> }

Let me see if I can read this:

> Compiled conAsm:
> F("foo") %0 %1;
> formals() %0 %1,

Is this an actual instruction, or some sort of
pseudo-instruction?

> rd(w(1414)) %7,

I assume this loads a belt entry with the constant 1414.

> sub(b2 %1, b0 %7) %4,

Hm, if I understand you correctly this is

%4 = sub (b2, b0), correct? Is the target always the
last digit after the percent sign?

> eqlb() %3, andl(b2 %0, 14) %2,
> storetr(b3 %4, dp, gl("a"), b2 %3),
> add(b5 %1, b4 %0) %6,
> retnfl(b2 %3, b0 %6),
> retn(b5 %0);

Without further explanation, this is rather hard to read, certainly
not your average assembler syntax :-)

>
> Edited to remove compiler-generated documentary comment, such as source
> line/column numbers, belt event activity description, etc. I have left
> in the "%N" belt value identifiers, although these can be omitted in
> manual asm (only the "bN" temporal indices are necessary).

> Note that this function comprises one bundle and takes one cycle to
> execute on a low-midrange Mill.

That is rather impressive. It would also be interesting to know
how your compiler fares with some normal Fortran code :-)
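
As background for the "bN" temporal indices in the quoted listing, a toy model of belt-style addressing is sketched below. The class `Belt` and its members are hypothetical, and the sketch does not try to reproduce the belt positions in the listing, which also depend on when each operation issues and retires within the bundle.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Toy model of a belt: a fixed-length queue whose entries are addressed
// by age rather than by name.
template <std::size_t N>
class Belt {
public:
    // Dropping a result makes it b(0); every older value moves one
    // position back and the oldest one falls off the end.
    void drop(std::int64_t value) {
        head_ = (head_ + N - 1) % N;
        slots_[head_] = value;
    }

    // b(0) is the most recent drop, b(1) the one before it, and so on.
    // The caller is assumed to pass age < N.
    std::int64_t b(std::size_t age) const {
        return slots_[(head_ + age) % N];
    }

private:
    std::array<std::int64_t, N> slots_{};
    std::size_t head_ = 0;
};
```

For instance, after dropping 5 and then 7, an add would read `b(1)` and `b(0)` and drop 12, at which point 7 has become `b(1)` and 5 has become `b(2)`. This is why the same value can appear under different "bN" indices in different operations.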

Re: Around the bike shed: Instruction names and assembler syntax

<svj2ki$q0p$4@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=23822&group=comp.arch#23822
Path: i2pn2.org!i2pn.org!usenet.goja.nl.eu.org!news.freedyn.de!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2a0a-a540-c7b-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: Around the bike shed: Instruction names and assembler syntax
Date: Mon, 28 Feb 2022 18:00:50 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <svj2ki$q0p$4@newsreader4.netcologne.de>
References: <svcqrj$jj6$1@newsreader4.netcologne.de>
<svd280$fiv$1@dont-email.me> <svh2tg$j53$1@gal.iecc.com>
Injection-Date: Mon, 28 Feb 2022 18:00:50 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2a0a-a540-c7b-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2a0a:a540:c7b:0:7285:c2ff:fe6c:992d";
logging-data="26649"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)
 by: Thomas Koenig - Mon, 28 Feb 2022 18:00 UTC

John Levine <johnl@taugh.com> schrieb:
> It appears that Ivan Godard <ivan@millcomputing.com> said:
>>Or you can make the asm be similar to the syntax of a HLL, arguably more
>>familiar to most readers than a traditional asm syntax, even one with
>>your suggestions.
>
> I liked PL360 which was a S/360 assembler with a syntax similar to Algol.
> Wirth used it to write Algol W. I used it some around 1970 at Princeton
> which let people run small PL360 and AlgolW programs without needing an account.
> I don't think they realized that PL360 was an assembler so you could write some
> rather naughty programs if you wanted to.
>
> http://bitsavers.org/pdf/stanford/cs_techReports/STAN-CS-71-215_PL360_Rev_May72.pdf

That is an interesting piece of software.

I see they deviated from the usual meaning of

R1 := R2 + R1

which would scare me - one of FORTRAN's great achievements
was to be able to write formulas naturally. But it is
certainly the highest-level assembler I have seen yet.

Re: high and higher level assemblers, was Around the bike shed: Instruction names and assembler syntax

<svj4ol$1j3e$1@gal.iecc.com>

https://www.novabbs.com/devel/article-flat.php?id=23823&group=comp.arch#23823
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!not-for-mail
From: joh...@taugh.com (John Levine)
Newsgroups: comp.arch
Subject: Re: high and higher level assemblers, was Around the bike shed: Instruction names and assembler syntax
Date: Mon, 28 Feb 2022 18:37:09 -0000 (UTC)
Organization: Taughannock Networks
Message-ID: <svj4ol$1j3e$1@gal.iecc.com>
References: <svcqrj$jj6$1@newsreader4.netcologne.de> <Fx5TJ.75448$3jp8.56095@fx33.iad> <svit72$8pq$1@gal.iecc.com> <167TJ.7057$aT3.6255@fx09.iad>
Injection-Date: Mon, 28 Feb 2022 18:37:09 -0000 (UTC)
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="52334"; mail-complaints-to="abuse@iecc.com"
In-Reply-To: <svcqrj$jj6$1@newsreader4.netcologne.de> <Fx5TJ.75448$3jp8.56095@fx33.iad> <svit72$8pq$1@gal.iecc.com> <167TJ.7057$aT3.6255@fx09.iad>
Cleverness: some
X-Newsreader: trn 4.0-test77 (Sep 1, 2010)
Originator: johnl@iecc.com (John Levine)
 by: John Levine - Mon, 28 Feb 2022 18:37 UTC

According to EricP <ThatWouldBeTelling@thevillage.com>:
>>>> I liked PL360 which was a S/360 assembler with a syntax similar to Algol. ...
>>>>
>>>> http://bitsavers.org/pdf/stanford/cs_techReports/STAN-CS-71-215_PL360_Rev_May72.pdf

>I hadn't looked at it... I had heard that IBM had a structured assembler
>in the past and was hoping that the report summarized its basic description.
>However, now that I do look at it, I'd say it's more of an HLL than an assembler.
>It reminds me of VAX Bliss, which I never wrote but read on source code
>documentation microfiche.

IBM has something they call High Level Assembler or HLASM which is an ordinary assembler with
a powerful macro language in which people write some pretty fancy macros.

PL360 was an assembler because everything you wrote had a direct translation into
machine language. For example, these are not the same:

R1 := R1 + R2 // add R2 to R1
R1 := R2 + R1 // load R2 into R1, then add R1 to itself

It also has explicit ways to say what base register to use, what registers to use for
call and return, and so forth.

I never used Bliss32 but I wrote a fair amount of Bliss-10 on a PDP-10. I'd say it was half
a level up from PL360. All of the data references were in terms of the PDP-10's addressing
structure and you could insert specific instructions with pseudo-functions as in PL360, but
it was enough of a compiler to handle expressions that used temporary registers.
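
The difference between the two PL360 statements above can be modelled in a few lines. This is only a sketch of the semantics as described in this post: the type `Regs` and the function `assign_sum` are invented for the example, and the S/360 instruction names in the comments (LR, AR) indicate the kind of instruction involved rather than the exact code PL360 emitted.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Regs = std::array<std::int32_t, 16>;

// Models "Rt := Ra + Rb" as two strictly ordered steps with no
// temporary: load Ra into Rt, then add Rb to Rt.
inline void assign_sum(Regs& r, std::size_t t, std::size_t a, std::size_t b) {
    r[t] = r[a];    // load step (compare S/360 LR)
    r[t] += r[b];   // add step (compare S/360 AR); Rb is read after the load
}
```

Starting from R1 = 1 and R2 = 2, `assign_sum(r, 1, 1, 2)` leaves R1 = 3, while `assign_sum(r, 1, 2, 1)` leaves R1 = 4, because R1 has already been overwritten with R2 by the time it is added. A compiler would introduce a temporary to hide this; an assembler does not.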

--
Regards,
John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly
