
Subject  Author
* An Alternative to Predication  Quadibloc
+- Re: An Alternative to Predication  MitchAlsup
`* Re: An Alternative to Predication  luke.l...@gmail.com
 +* Re: An Alternative to Predication  Terje Mathisen
 |`* Re: An Alternative to Predication  luke.l...@gmail.com
 | +* Re: An Alternative to Predication  Stephen Fuld
 | |`* Re: An Alternative to Predication  luke.l...@gmail.com
 | | `- Re: An Alternative to Predication  Quadibloc
 | `- Re: An Alternative to Predication  Terje Mathisen
 `* Re: An Alternative to Predication  MitchAlsup
  `* Re: An Alternative to Predication  luke.l...@gmail.com
   `* Re: An Alternative to Predication  MitchAlsup
    `* Re: An Alternative to Predication  luke.l...@gmail.com
     `- Re: An Alternative to Predication  Brian G. Lucas

An Alternative to Predication

<7f3be666-ee15-4eca-a53e-52bcd1f0732an@googlegroups.com>

 by: Quadibloc - Thu, 6 Apr 2023 15:52 UTC

Branches are bad!

This is why many ISAs which intend to achieve high performance
include the option of changing the flow of execution through
predicated instructions.

But that doesn't seem to be an ideal solution, because the instructions
that aren't executed still at least get fetched from memory, and likely
are processed in some other ways as well.

Why are branches bad?

Essentially, it's because they cause pipeline flushes - the computer
reads ahead in the code, perhaps assuming that any conditional
branches aren't taken, or using sophisticated branch prediction logic,
so that it's only a branch if it guesses wrong... and now the results
from reading ahead are all useless.
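
A rough way to put numbers on this trade-off, as a minimal sketch only: the branch version pays for one clause plus the expected flush cost, while the predicated version always pays for both clauses. The one-slot-per-instruction model, the 15-cycle flush penalty and the miss rates are all assumptions chosen for illustration, not measurements of any real core.

```python
def branch_cost(then_len, else_len, p_then, miss_rate, flush_cycles):
    """Expected issue slots for an if/else built from a conditional branch.

    Only the clause on the taken path is fetched, but each misprediction
    adds a full pipeline flush.
    """
    clause = p_then * then_len + (1.0 - p_then) * else_len
    return clause + miss_rate * flush_cycles


def predicated_cost(then_len, else_len):
    """Issue slots when both clauses are predicated: both always flow through."""
    return then_len + else_len


# Assumed figures, for illustration only.
FLUSH_CYCLES = 15
for miss_rate in (0.02, 0.10, 0.50):
    b = branch_cost(2, 2, 0.5, miss_rate, FLUSH_CYCLES)
    p = predicated_cost(2, 2)
    print(f"miss rate {miss_rate:.0%}: branch {b:.2f} slots, predicated {p} slots")
```

Under these assumed numbers the break-even point is a miss rate of 2/15, roughly 13%: below that the branch is cheaper, above it the predicated form is.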

Hmm. Perhaps my idea won't solve this problem.

I was thinking that perhaps one could use a pair of instructions -

BLBB, ELBB - begin local branch block, end local branch block

which, like Mitch's VVM instructions, delimit code which otherwise
isn't modified. What it directs is that the block of code be given priority
to be saved in the micro-op cache because it contains lots of little
local branches to carry out some complicated logic.

It can also serve as a warning that the branches are hard to predict, and
so either the computer looks ahead down both paths if possible, or it
avoids looking ahead, letting other threads run instead, so only latency,
but not throughput, is impacted by the branches.

John Savard

Re: An Alternative to Predication

<6c49a640-4df7-4dac-bde1-4fb1e5d7f2ddn@googlegroups.com>

 by: MitchAlsup - Thu, 6 Apr 2023 17:12 UTC

On Thursday, April 6, 2023 at 10:52:46 AM UTC-5, Quadibloc wrote:
> Branches are bad!
>
> This is why many ISAs which intend to achieve high performance
> include the option of changing the flow of execution through
> predicated instructions.
>
> But that doesn't seem to be an ideal solution, because the instructions
> that aren't executed still at least get fetched from memory, and likely
> are processed in some other ways as well.
>
> Why are branches bad?
>
> Essentially, it's because they cause pipeline flushes - the computer
> reads ahead in the code, perhaps assuming that any conditional
> branches aren't taken, or using sophisticated branch prediction logic,
> so that it's only a branch if it guesses wrong... and now the results
> from reading ahead are all useless.
<
Predication accepts reading the instructions not executed in order
to avoid disrupting the front-end. In other words, if the front-end will
fetch the first instruction at the convergence point by the time the
branch in the then-clause would execute, then predication has saved
you the pipeline flush! Pipeline flushes are expensive in power and
sometimes in cycles.
>
> Hmm. Perhaps my idea won't solve this problem.
>
> I was thinking that perhaps one could use a pair of instructions -
>
> BLBB, ELBB - begin local branch block, end local branch block
>
> which, like Mitch's VVM instructions, delimit code which otherwise
> isn't modified. What it directs is that the block of code be given priority
> to be saved in the micro-op cache because it contains lots of little
> local branches to carry out some complicated logic.
<
That is too many instructions. My 66000 Predication uses 1 PRED
instruction to describe the then-clause extent and the else-clause
extent and by implication the sum of then and else clauses which
is the convergence point. And does this in 1 instruction. Anytime
a predicated instruction sequence has both a then-clause and an
else-clause, you have saved 1 branch instruction.
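
The idea of one instruction describing both extents can be sketched as a toy model. This is not the My 66000 encoding or semantics, only an illustration of the description above: the function name, the tuple it returns and the instruction strings are all hypothetical. Every instruction in the shadow is fetched in order, each one takes effect only if it sits in the clause selected by the condition, and the convergence point is simply the sum of the two extents.

```python
def run_with_pred(cond, then_len, else_len, shadow):
    """Return (instructions that take effect, offset of the convergence point).

    `shadow` is the straight-line code following the PRED-like instruction.
    The first `then_len` entries form the then-clause, the next `else_len`
    entries form the else-clause. Nothing redirects the fetch stream.
    """
    if then_len + else_len > len(shadow):
        raise ValueError("clauses extend past the end of the shadow")
    executed = []
    for i, insn in enumerate(shadow[:then_len + else_len]):
        in_then = i < then_len
        if in_then == cond:  # then-clause runs on True, else-clause on False
            executed.append(insn)
    return executed, then_len + else_len


# Hypothetical mnemonics, used only as labels.
SHADOW = ["add r1,r2,r3", "sub r4,r1,r5", "mov r1,#0", "st r1,[r6]"]
print(run_with_pred(True, 2, 1, SHADOW))
print(run_with_pred(False, 2, 1, SHADOW))
```

In both cases execution resumes at offset 3, the store, without any branch having been executed.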
>
> It can also serve as a warning that the branches are hard to predict, and
> so either the computer looks ahead down both paths if possible, or it
> avoids looking ahead, letting other threads run instead, so only latency,
> but not throughput, is impacted by the branches.
>
> John Savard

Re: An Alternative to Predication

<6b6abeac-e2c3-4c2f-bbdf-5d4a9f42e694n@googlegroups.com>

 by: luke.l...@gmail.com - Fri, 7 Apr 2023 12:02 UTC

On Thursday, April 6, 2023 at 4:52:46 PM UTC+1, Quadibloc wrote:
> Branches are bad!
>
> This is why many ISAs which intend to achieve high performance
> include the option of changing the flow of execution through
> predicated instructions.
>
> But that doesn't seem to be an ideal solution, because the instructions
> that aren't executed still at least get fetched from memory, and likely
> are processed in some other ways as well.

look up Zero-Overhead-Loop-Control.

https://ieeexplore.ieee.org/document/1692906
https://www.researchgate.net/publication/3351728_Zero-overhead_loop_controller_that_implements_multimedia_algorithms

it achieved a whopping 45% reduction in instruction execution
count on a simple in-order system compared to more complex
micro-architectures, for complex MPEG motion-estimation.

*SIX* level nested *conditional* zero-overhead-loops.

absolutely astonishing. some benchmarks reduced by *eighty*
percent instruction count.
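
To see where reductions of this kind can come from, here is a minimal counting sketch. It assumes a conventional core spends a fixed number of loop-control instructions (counter update plus compare-and-branch) on every iteration of every loop level, and that zero-overhead loop hardware removes all of them. The trip counts, body size and overhead are invented for illustration and do not reproduce the figures from the cited work.

```python
from math import prod


def dynamic_count(trips, body, overhead_per_iter):
    """Dynamic instruction count for a loop nest.

    `trips` lists trip counts from outermost to innermost, `body` is the
    instruction count of the innermost body, and `overhead_per_iter` is the
    loop-control cost charged on each iteration of each level.
    """
    if not trips:
        raise ValueError("need at least one loop level")
    iterations = [prod(trips[: i + 1]) for i in range(len(trips))]
    return body * iterations[-1] + overhead_per_iter * sum(iterations)


# Assumed nest: 2 x 3 x 4 iterations, 4-instruction body, 2 control instructions.
conventional = dynamic_count([2, 3, 4], body=4, overhead_per_iter=2)
zero_overhead = dynamic_count([2, 3, 4], body=4, overhead_per_iter=0)
print(conventional, zero_overhead, 1 - zero_overhead / conventional)
```

The smaller the innermost body relative to the loop control, the larger the fraction that disappears.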

ST Micro were so impressed they actually put it into a
production silicon product.

l.

Re: An Alternative to Predication

<u0p75o$r8ji$1@dont-email.me>

 by: Terje Mathisen - Fri, 7 Apr 2023 13:51 UTC

luke.l...@gmail.com wrote:
> On Thursday, April 6, 2023 at 4:52:46 PM UTC+1, Quadibloc wrote:
>> Branches are bad!
>>
>> This is why many ISAs which intend to achieve high performance
>> include the option of changing the flow of execution through
>> predicated instructions.
>>
>> But that doesn't seem to be an ideal solution, because the instructions
>> that aren't executed still at least get fetched from memory, and likely
>> are processed in some other ways as well.
>
> look up Zero-Overhead-Loop-Control.
>
> https://ieeexplore.ieee.org/document/1692906
> https://www.researchgate.net/publication/3351728_Zero-overhead_loop_controller_that_implements_multimedia_algorithms
>
> it achieved a whopping 45% reduction in instruction execution
> count on a simple in-order system compared to more complex
> micro-architectures, for complex MPEG motion-estimation.
>
> *SIX* level nested *conditional* zero-overhead-loops.
>
> absolutely astonishing. some benchmarks reduced by *eighty*
> percent instruction count.
>
> ST Micro were so impressed they actually put it into a
> production silicon product.

The most obvious limitation here is probably that they are comparing
against scalar multimedia code, the specific case where SIMD already
provides the largest speedups. I.e. the Pentium-MMX with 8-wide byte
operations overlaid on the 64-bit x87 FP mantissas was fast enough
that the Zoran SoftDVD player was able to keep up with zero frame skips.
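
As a sketch of what "8-wide byte operations" in a 64-bit register buy you, the following emulates a wrapping packed byte add (in the spirit of MMX's packed byte add) on a plain 64-bit integer and checks it against a lane-by-lane scalar loop. The function names are made up for this example, and this is the classic SWAR bit trick rather than how any hardware implements it.

```python
LANES = 8
HIGH = 0x8080808080808080  # top bit of every byte lane
LOW = 0x7F7F7F7F7F7F7F7F   # low seven bits of every byte lane
MASK64 = 0xFFFFFFFFFFFFFFFF


def packed_add_bytes(a, b):
    """Add eight unsigned bytes packed in 64-bit ints, wrapping per lane."""
    # Adding only the low 7 bits keeps carries from crossing lane boundaries.
    low_sum = (a & LOW) + (b & LOW)
    # The top bit of each lane is a XOR b XOR carry-in, with no carry out.
    return (low_sum ^ ((a ^ b) & HIGH)) & MASK64


def scalar_add_bytes(a, b):
    """Reference version: one lane at a time, as scalar code would do it."""
    out = 0
    for lane in range(LANES):
        shift = 8 * lane
        s = ((a >> shift) & 0xFF) + ((b >> shift) & 0xFF)
        out |= (s & 0xFF) << shift
    return out


x, y = 0x01FF7F80FE001234, 0x0101018003FFEEDD
print(hex(packed_add_bytes(x, y)), packed_add_bytes(x, y) == scalar_add_bytes(x, y))
```

One packed operation replaces eight scalar adds plus their masking, which is the per-operation work that loop-control tricks do not touch.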

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: An Alternative to Predication

<421a829c-15b9-4e4d-b7f2-7e26e1414333n@googlegroups.com>

 by: MitchAlsup - Fri, 7 Apr 2023 14:40 UTC

On Friday, April 7, 2023 at 7:02:37 AM UTC-5, luke.l...@gmail.com wrote:
> On Thursday, April 6, 2023 at 4:52:46 PM UTC+1, Quadibloc wrote:
> > Branches are bad!
> >
> > This is why many ISAs which intend to achieve high performance
> > include the option of changing the flow of execution through
> > predicated instructions.
> >
> > But that doesn't seem to be an ideal solution, because the instructions
> > that aren't executed still at least get fetched from memory, and likely
> > are processed in some other ways as well.
> look up Zero-Overhead-Loop-Control.
<
This is what the LOOP instruction does in VVM.
>
> https://ieeexplore.ieee.org/document/1692906
> https://www.researchgate.net/publication/3351728_Zero-overhead_loop_controller_that_implements_multimedia_algorithms
>
> it achieved a whopping 45% reduction in instruction execution
> count on a simple in-order system compared to more complex
> micro-architectures, for complex MPEG motion-estimation.
>
> *SIX* level nested *conditional* zero-overhead-loops.
>
> absolutely astonishing. some benchmarks reduced by *eighty*
> percent instruction count.
>
> ST Micro were so impressed they actually put it into a
> production silicon product.
>
> l.

Re: An Alternative to Predication

<634e2c11-ac55-481f-b3c7-8cc12b6e72a6n@googlegroups.com>

 by: luke.l...@gmail.com - Fri, 7 Apr 2023 23:00 UTC

On Friday, April 7, 2023 at 2:51:55 PM UTC+1, Terje Mathisen wrote:

> The most obvious limitation here is probably that they are comparing
> against scalar multimedia code, the specific case where SIMD already
> provides the largest speedups. I.e. the Pentium-MMX with 8-wide byte
> operations overlaid on the 64-bit x87 FP mantissas was fast enough
> that the Zoran SoftDVD player was able to keep up with zero frame skips.

it would be unfair to compare apples v oranges. thinking about it:
if ZOLC were to be applied to a SIMD baseline core it would equally
cut the number of branch-conditionals etc. but yes not the amount
of operations done *inside* any given loop.

but the specific problem with MPEG motion-estimation is that
it is an absolute bitch-of-a-task. there's six levels of nested loops
in the algorithm, with conditional jumping in and out of two or
more levels *and back*, which wreaks absolute havoc with any
traditional CPU branch-prediction, regardless of whether it has
SIMD or not.
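
a minimal sketch of why short, irregular loops hurt a simple predictor: a single 2-bit saturating counter on one loop-back branch. the trip-count patterns are invented, not taken from any MPEG encoder, and real predictors (global history, dedicated loop predictors) do considerably better than this toy.

```python
def loop_branch_outcomes(trip_counts):
    """Yield loop-back branch outcomes: taken (trips - 1) times, then not taken."""
    for trips in trip_counts:
        yield from [True] * (trips - 1)
        yield False


def mispredictions(outcomes, state=2):
    """Count misses of a 2-bit saturating counter (states 0..3, taken if >= 2)."""
    misses = 0
    for taken in outcomes:
        predicted_taken = state >= 2
        if predicted_taken != taken:
            misses += 1
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return misses


steady = [8] * 100    # assumed: the same trip count every time
jumpy = [1, 3] * 50   # assumed: short trip counts that keep changing
for name, trips in (("steady", steady), ("jumpy", jumpy)):
    total = sum(trips)  # one loop-back branch per iteration
    print(name, mispredictions(loop_branch_outcomes(trips)), "misses of", total)
```

the steady loop costs one miss per loop exit; the jumpy one mispredicts most of its branches, and every one of those is a flush.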

the bit you missed Terje is that the CPU that had ZOLC added to
it *had no branch-prediction* and yet still managed to piss all over
cores that did.

l.

Re: An Alternative to Predication

<77a5f439-b719-477d-94d8-bc9aa0582ee1n@googlegroups.com>

 by: luke.l...@gmail.com - Fri, 7 Apr 2023 23:04 UTC

On Friday, April 7, 2023 at 3:40:29 PM UTC+1, MitchAlsup wrote:

> This is what the LOOP instruction does in VVM.

aiui, only one level, though, Mitch (only one LOOP which
needs no test-and-branch-back). ZOLC had as many
*nested* loops on a stack that was as large as your
architectural state context-switch could tolerate,
and the ability to make decisions to push/pop/swap
which loop to jump into or out of, at any time, in
a 100% forward-predictable fashion that resulted
in *zero stalls* even though ZOLC Loops could be
6-deep-nested.
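
the loop-stack idea can be sketched as a toy interpreter. this is not the ZOLC design from the paper: the ("loop", count, last_pc) setup op, the program layout and the depth limit of 6 are assumptions made for the example, and it leaves out the conditional entry and exit across levels described above. it only shows nested counted loops running with no test-and-branch-back instructions in the stream.

```python
def run(program, max_depth=6):
    """Execute `program` with a hardware-style loop stack; return the op trace.

    ("loop", count, last_pc) sets up a loop over pc+1 .. last_pc (count >= 1).
    ("op", name) is an ordinary instruction, recorded in the trace.
    """
    stack = []  # entries: [start_pc, end_pc, remaining_iterations]
    trace = []
    pc = 0
    while pc < len(program):
        insn = program[pc]
        if insn[0] == "loop":
            _, count, last_pc = insn
            if len(stack) == max_depth:
                raise RuntimeError("loop stack overflow")
            stack.append([pc + 1, last_pc, count])
        else:
            trace.append(insn[1])
        next_pc = pc + 1
        # Fetch-side check: at the end of the innermost active loop, either
        # go round again or pop and re-check the enclosing loop.
        while stack and stack[-1][1] == pc:
            if stack[-1][2] > 1:
                stack[-1][2] -= 1
                next_pc = stack[-1][0]
                break
            stack.pop()
        pc = next_pc
    return trace


PROGRAM = [
    ("loop", 2, 4),  # outer loop: 2 iterations over pc 1..4
    ("op", "A"),
    ("loop", 3, 3),  # inner loop: 3 iterations over pc 3..3
    ("op", "B"),
    ("op", "C"),
    ("op", "D"),
]
print("".join(run(PROGRAM)))
```

the next fetch address is always known from the stack alone, which is what makes this style of control forward-predictable.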

that's... insanely impressive.

l.

Re: An Alternative to Predication

<64ff5aaf-73bd-4e70-9587-46c8e50c0924n@googlegroups.com>

 by: MitchAlsup - Sat, 8 Apr 2023 01:10 UTC

On Friday, April 7, 2023 at 6:04:04 PM UTC-5, luke.l...@gmail.com wrote:
> On Friday, April 7, 2023 at 3:40:29 PM UTC+1, MitchAlsup wrote:
>
>
> > This is what the LOOP instruction does in VVM.
> aiui, only one level, though, Mitch (only one LOOP which
> needs no test-and-branch-back). ZOLC had as many
> *nested* loops on a stack that was as large as your
> architectural state context-switch could tolerate,
<
90% of the gain is in the innermost loop.
<
> and the ability to make decisions to push/pop/swap
> which loop to jump into or out of, at any time, in
> a 100% forward-predictable fashion that resulted
> in *zero stalls* even though ZOLC Loops could be
> 6-deep-nested.
>
> that's... insanely impressive.
>
> l.

Re: An Alternative to Predication

<u0qvtq$15qdg$1@dont-email.me>

 by: Stephen Fuld - Sat, 8 Apr 2023 06:00 UTC

On 4/7/2023 4:00 PM, luke.l...@gmail.com wrote:
> On Friday, April 7, 2023 at 2:51:55 PM UTC+1, Terje Mathisen wrote:
>
>> The most obvious limitation here is probably that they are comparing
>> against scalar multimedia code, the specific case where SIMD already
>> provides the largest speedups. I.e. the Pentium-MMX with 8-wide byte
>> operations overlaid on the 64-bit x87 FP mantissas was fast enough
>> that the Zoran SoftDVD player was able to keep up with zero frame skips.
>
> it would be unfair to compare apples v oranges. thinking about it:
> if ZOLC were to be applied to a SIMD baseline core it would equally
> cut the number of branch-conditionals etc. but yes not the amount
> of operations done *inside* any given loop.
>
> but the specific problem with MPEG motion-estimation is that
> it is an absolute bitch-of-a-task. there's six levels of nested loops
> in the algorithm, with conditional jumping in and out of two or
> more levels *and back*, which wreaks absolute havoc with any
> traditional CPU branch-prediction, regardless of whether it has
> SIMD or not.

If there is source code readily available for this, it seems like it
might be an interesting benchmark program, both for compilers and for
architectures.

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Re: An Alternative to Predication

<9a5cfc5b-8492-4736-af58-4a279dea523bn@googlegroups.com>

 by: luke.l...@gmail.com - Sat, 8 Apr 2023 08:28 UTC

On Saturday, April 8, 2023 at 7:00:30 AM UTC+1, Stephen Fuld wrote:

> > but the specific problem with MPEG motion-estimation is that
> > it is an absolute bitch-of-a-task. there's six levels of nested loops
> > in the algorithm, with conditional jumping in and out of two or
> > more levels *and back*, which wreaks absolute havoc with any
> > traditional CPU branch-prediction, regardless of whether it has
> > SIMD or not.
> If there is source code readily available for this, it seems like it
> might be an interesting benchmark program, both for compilers and for
> architectures.

it was an extremely complex research project, funded back in
2007-2008 by the European Union. it took me a long time to
piece together, and because the people who were funded pulled the
contents off the public website pretty much immediately after
the grant was finished, i had to email one of the authors to
get a copy of the VHDL.

the compiler-augmentation scripts, however, which subdivide
(mark) programs identifying Basic Blocks, etc. etc., are all
still available online.

yes of course MPEG source code is readily available,
try ffmpeg.

l.

Re: An Alternative to Predication

<b23eb0a8-d4ec-4411-beb1-26a825e6bc41n@googlegroups.com>

 by: luke.l...@gmail.com - Sat, 8 Apr 2023 08:31 UTC

On Saturday, April 8, 2023 at 2:10:26 AM UTC+1, MitchAlsup wrote:

> 90% of the gain is in the innermost loop.

indeed. VVM is pretty high bang-per-buck in that regard,
and doesn't have the (associated) disadvantage of ZOLC
nested-looping that the compilers have no idea how it works.
the original researchers had to create special workaround
tools that inserted assembler "shims" into binaries.

instead VVM's single-loop benefits are brain-dead-simple
for compilers to add.
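
the "innermost loop" point can be illustrated with a small counting sketch. it only looks at how often each level's loop-control code runs, assuming every level pays the same per-iteration overhead; the uniform trip count of 10 is an assumption picked for illustration, and the sketch says nothing about other sources of gain.

```python
from itertools import accumulate
from operator import mul


def overhead_share_by_level(trips):
    """Fraction of loop-control executions attributable to each nesting level.

    `trips` lists trip counts from outermost to innermost.
    """
    iterations = list(accumulate(trips, mul))  # iterations run at each level
    total = sum(iterations)
    return [n / total for n in iterations]


shares = overhead_share_by_level([10] * 6)  # assumed: six levels, 10 trips each
for level, share in enumerate(shares, start=1):
    print(f"level {level}: {share:.4%}")
```

with these assumed trip counts the innermost level accounts for about 90% of all loop-control executions, so removing only that level's overhead already captures most of what is available.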

l.

Re: An Alternative to Predication

<u0rj1v$1894j$1@dont-email.me>

 by: Terje Mathisen - Sat, 8 Apr 2023 11:26 UTC

luke.l...@gmail.com wrote:
> On Friday, April 7, 2023 at 2:51:55 PM UTC+1, Terje Mathisen wrote:
>
>> The most obvious limitation here is probably that they are comparing
>> against scalar multimedia code, the specific case where SIMD already
> >> provides the largest speedups. I.e. the Pentium-MMX with 8-wide byte
> >> operations overlaid on the 64-bit x87 FP mantissas was fast enough
> >> that the Zoran SoftDVD player was able to keep up with zero frame skips.
>
> it would be unfair to compare apples v oranges. thinking about it:
> if ZOLC were to be applied to a SIMD baseline core it would equally
> cut the number of branch-conditionals etc. but yes not the amount
> of operations done *inside* any given loop.
>
> but the specific problem with MPEG motion-estimation is that
> it is an absolute bitch-of-a-task. there's six levels of nested loops
> in the algorithm, with conditional jumping in and out of two or
> more levels *and back*, which wreaks absolute havoc with any
> traditional CPU branch-prediction, regardless of whether it has
> SIMD or not.
>
> the bit you missed Terje is that the CPU that had ZOLC added to
> it *had no branch-prediction* and yet still managed to piss all over
> cores that did.

I did not miss that, and yes, it is impressive. I am just saying that
(just like FPGA) it is not quite that large an improvement when you
compare against optimized/competent code.
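
As a rough illustration of the loop structure described in the quoted text (six nested levels with conditional exits), here is a minimal C sketch of full-search block matching with early termination. It is not the code from the ZOLC research. The names estimate_motion and MotionVector, the 8x8 block size and the +/-2 pixel search range are all assumptions chosen for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Illustrative sizes only; neither value comes from the thread. */
enum { BLOCK = 8, RANGE = 2 };

/* Hypothetical result type: best displacement and its SAD. */
typedef struct { int dx, dy; unsigned sad; } MotionVector;

/* For every BLOCK x BLOCK block of `cur`, find the displacement into `ref`
 * with the smallest sum of absolute differences (SAD).
 * `out` must hold (width / BLOCK) * (height / BLOCK) entries. */
void estimate_motion(const unsigned char *cur, const unsigned char *ref,
                     int width, int height, MotionVector *out)
{
    assert(width % BLOCK == 0 && height % BLOCK == 0);
    int blocks_per_row = width / BLOCK;

    for (int by = 0; by < height; by += BLOCK) {            /* level 1 */
        for (int bx = 0; bx < width; bx += BLOCK) {         /* level 2 */
            MotionVector best = { 0, 0, UINT_MAX };

            for (int dy = -RANGE; dy <= RANGE; dy++) {      /* level 3 */
                for (int dx = -RANGE; dx <= RANGE; dx++) {  /* level 4 */
                    /* Skip candidates that fall outside the reference. */
                    if (by + dy < 0 || by + dy + BLOCK > height ||
                        bx + dx < 0 || bx + dx + BLOCK > width)
                        continue;

                    unsigned sad = 0;
                    for (int y = 0; y < BLOCK; y++) {       /* level 5 */
                        for (int x = 0; x < BLOCK; x++) {   /* level 6 */
                            int c = cur[(by + y) * width + bx + x];
                            int r = ref[(by + dy + y) * width + bx + dx + x];
                            sad += (unsigned)abs(c - r);
                        }
                        /* Data-dependent exit: this candidate cannot win,
                         * so abandon the row loop and try the next one. */
                        if (sad >= best.sad)
                            goto next_candidate;
                    }

                    best = (MotionVector){ dx, dy, sad };
                    /* Exact match: leave both search loops (levels 3, 4). */
                    if (sad == 0)
                        goto block_done;
next_candidate:     ;
                }
            }
block_done:
            out[(by / BLOCK) * blocks_per_row + bx / BLOCK] = best;
        }
    }
}
```

Both exits depend on pixel data, so how many rows and candidates get visited varies from block to block, which is the kind of control flow a conventional branch predictor struggles with.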

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: An Alternative to Predication

<1e3722f7-5b92-4b2a-b054-8d364ea44f9fn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=31584&group=comp.arch#31584

X-Received: by 2002:ad4:4aee:0:b0:56f:6e1b:fac1 with SMTP id cp14-20020ad44aee000000b0056f6e1bfac1mr1261291qvb.2.1680965319769;
Sat, 08 Apr 2023 07:48:39 -0700 (PDT)
X-Received: by 2002:a05:6870:12cf:b0:17a:b713:63e9 with SMTP id
15-20020a05687012cf00b0017ab71363e9mr1046756oam.4.1680965319541; Sat, 08 Apr
2023 07:48:39 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sat, 8 Apr 2023 07:48:39 -0700 (PDT)
In-Reply-To: <9a5cfc5b-8492-4736-af58-4a279dea523bn@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fb71:2b00:a5e6:1dfd:ade5:870d;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fb71:2b00:a5e6:1dfd:ade5:870d
References: <7f3be666-ee15-4eca-a53e-52bcd1f0732an@googlegroups.com>
<6b6abeac-e2c3-4c2f-bbdf-5d4a9f42e694n@googlegroups.com> <u0p75o$r8ji$1@dont-email.me>
<634e2c11-ac55-481f-b3c7-8cc12b6e72a6n@googlegroups.com> <u0qvtq$15qdg$1@dont-email.me>
<9a5cfc5b-8492-4736-af58-4a279dea523bn@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <1e3722f7-5b92-4b2a-b054-8d364ea44f9fn@googlegroups.com>
Subject: Re: An Alternative to Predication
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Sat, 08 Apr 2023 14:48:39 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 1909
 by: Quadibloc - Sat, 8 Apr 2023 14:48 UTC

On Saturday, April 8, 2023 at 2:28:51 AM UTC-6, luke.l...@gmail.com wrote:

> it was an extremely complex research project, funded back in
> 2007-2008 by the European Union. it took me a long time to
> piece together and because the people funded pulled the
> contents off the public website pretty much immediately after
> the grant was finished,

Would it be possible that the Wayback Machine might help with that?

John Savard

Re: An Alternative to Predication

<u1705p$15vs$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=31681&group=comp.arch#31681

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: bage...@gmail.com (Brian G. Lucas)
Newsgroups: comp.arch
Subject: Re: An Alternative to Predication
Date: Wed, 12 Apr 2023 14:18:17 -0500
Organization: A noiseless patient Spider
Lines: 21
Message-ID: <u1705p$15vs$1@dont-email.me>
References: <7f3be666-ee15-4eca-a53e-52bcd1f0732an@googlegroups.com>
<6b6abeac-e2c3-4c2f-bbdf-5d4a9f42e694n@googlegroups.com>
<421a829c-15b9-4e4d-b7f2-7e26e1414333n@googlegroups.com>
<77a5f439-b719-477d-94d8-bc9aa0582ee1n@googlegroups.com>
<64ff5aaf-73bd-4e70-9587-46c8e50c0924n@googlegroups.com>
<b23eb0a8-d4ec-4411-beb1-26a825e6bc41n@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Wed, 12 Apr 2023 19:18:17 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="9e73f85ee371708ba37abc81bda52c5c";
logging-data="38908"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+qRwLt0H13LqJ8ewORnkXG"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
Thunderbird/102.7.1
Cancel-Lock: sha1:r0mUYrJ+uOGHnPiudJvlRpRKJhs=
In-Reply-To: <b23eb0a8-d4ec-4411-beb1-26a825e6bc41n@googlegroups.com>
Content-Language: en-US
 by: Brian G. Lucas - Wed, 12 Apr 2023 19:18 UTC

On 4/8/23 03:31, luke.l...@gmail.com wrote:
> On Saturday, April 8, 2023 at 2:10:26 AM UTC+1, MitchAlsup wrote:
>
>> 90% of the gain is in the innermost loop.
>
> indeed. VVM is pretty high bang-per-buck in that regard,
> and doesn't have the (associated) disadvantage of ZOLC
> nested-looping: compilers have no idea how it works.
> the original researchers had to create special workaround
> tools that inserted assembler "shims" into binaries.
>
> instead VVM's single-loop benefits are brain-dead-simple
> for compilers to add.
>
If only "compilers" could do it by themselves (ChatGPT?). Otherwise the poor
compiler author must do it. I'm not as smart as I used to be, but I'm not
"brain-dead" yet :-)

brian
> l.
