Subject: What's purpose of "gather" instructions?
From: Branimir Maksimovic
Newsgroups: comp.lang.asm.x86
Organization: usenet-news.net
Date: Thu, 27 May 2021 10:37 UTC
Message-ID: <FDKrI.93269$Sx7.77772@fx18.iad>
I tried them recently and they are slow, slow,
slower than loading manually ;)
I mean, like the "loop" instruction: useless ;)

--
current job title: senior software engineer
skills: x86 assembler,c++,c,rust,go,nim,haskell...

press any key to continue or any other to quit...



Subject: Re: What's purpose of "gather" instructions?
From: Terje Mathisen
Newsgroups: comp.lang.asm.x86
Organization: Aioe.org NNTP Server
Date: Thu, 27 May 2021 14:23 UTC
Message-ID: <s8oa20$kio$1@gioia.aioe.org>
References: <FDKrI.93269$Sx7.77772@fx18.iad>
Branimir Maksimovic wrote:
> I tried them recently and they are slow, slow,
> slower than loading manually ;)
> I mean, like the "loop" instruction: useless ;)

Gather is supposed to run at a minimum of one word per cycle, but preferably all loads that come from the same cache line should complete in a single cycle. That way, looking up values in a compact structure should be reasonably fast, and much faster than scalar loads.
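To make the cache-line point concrete, here is a minimal C sketch. It is a toy model, not a description of any real microarchitecture: `gather_ref` is a scalar reference for what an 8-lane gather computes, and `distinct_lines` counts how many cache lines one gather would touch. The names, the 64-byte line size, and the assumption that the table starts on a line boundary are all illustrative.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 8        /* 8 x 32-bit lanes, as in a 256-bit vector (assumption) */
#define CACHE_LINE 64  /* assumed cache-line size in bytes */

/* Scalar reference for what a gather computes: dst[i] = base[idx[i]]. */
static void gather_ref(int32_t dst[LANES], const int32_t *base,
                       const int32_t idx[LANES])
{
    for (int i = 0; i < LANES; i++)
        dst[i] = base[idx[i]];
}

/* Toy cost model: number of distinct cache lines touched by one gather.
   Line numbers are counted relative to base, which is assumed to be
   aligned to a cache-line boundary. */
static int distinct_lines(const int32_t idx[LANES])
{
    int64_t seen[LANES];
    int n = 0;

    for (int i = 0; i < LANES; i++) {
        assert(idx[i] >= 0);  /* the model only handles non-negative indices */
        int64_t line = ((int64_t)idx[i] * (int64_t)sizeof(int32_t)) / CACHE_LINE;
        int found = 0;
        for (int j = 0; j < n; j++) {
            if (seen[j] == line) {
                found = 1;
                break;
            }
        }
        if (!found)
            seen[n++] = line;
    }
    return n;
}
```

With 4-byte elements, indices 0 to 15 share one line, so a gather over a compact structure can hit a single line, while indices spaced 16 apart touch one line per lane. Under the ideal described above, the first case could finish in about one cycle and the second would need about eight.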

The first Larrabee CPU had gather implemented in an external chip, so it was effectively a coprocessor. The idea was that you would set up a bunch of these as part of a big processing loop, then stream the results through.

I.e. typical GPU design: optimizing for bandwidth, not latency.

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"



Subject: Re: What's purpose of "gather" instructions?
From: Anton Ertl
Newsgroups: comp.lang.asm.x86
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Date: Thu, 27 May 2021 14:56 UTC
Message-ID: <2021May27.165651@mips.complang.tuwien.ac.at>
References: <FDKrI.93269$Sx7.77772@fx18.iad>
Branimir Maksimovic <branimir.maksimovic@nospicedham.gmail.com> writes:
> I tried them recently and they are slow, slow,
> slower than loading manually ;)
> I mean, like the "loop" instruction: useless ;)

Possible explanations:

1) An instruction set designer thought that this could be implemented
   better than by using scalar loads, but

   a) the hardware designers did not get around to it.
   b) the hardware designers tried, but the result was buggy, and was
      disabled in delivered hardware.

   Still, there is a slight benefit to having these instructions: If
   there ever is a useful hardware implementation, software people can
   use it in the knowledge that their code will at least run on a
   variety of hardware (some may have a switch between using gather
   instructions and scalar code, but not everyone can afford
   development time for all CPU variations).

2) The instruction already worked better than the scalar code in the
   Xeon Phi (I dimly remember reading something like that, although
   looking at the cycle numbers I found the claim questionable), and
   was added to other CPUs to support software that uses the
   instruction.  The problem with this theory is that Xeon Phi
   supports (a variant of) AVX-512, but Haswell and Skylake
   (client) support only AVX2.
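The "switch between using gather instructions and scalar code" mentioned under 1) could look roughly like the following C sketch. It assumes GCC or Clang on x86 (`__builtin_cpu_supports` and the `target` attribute are compiler extensions); the function names `gather8`, `gather8_avx2` and `gather8_scalar` are made up for illustration. On other compilers or architectures it falls back to the scalar loop.

```c
#include <stdint.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <immintrin.h>
#define HAVE_X86_DISPATCH 1  /* hypothetical feature macro for this sketch */
#endif

/* Scalar version: eight independent loads. */
static void gather8_scalar(int32_t dst[8], const int32_t *base,
                           const int32_t idx[8])
{
    for (int i = 0; i < 8; i++)
        dst[i] = base[idx[i]];
}

#ifdef HAVE_X86_DISPATCH
/* AVX2 version: one vpgatherdd with 32-bit indices, scaled by 4 bytes. */
__attribute__((target("avx2")))
static void gather8_avx2(int32_t dst[8], const int32_t *base,
                         const int32_t idx[8])
{
    __m256i vidx = _mm256_loadu_si256((const __m256i *)idx);
    __m256i v = _mm256_i32gather_epi32((const int *)base, vidx, 4);
    _mm256_storeu_si256((__m256i *)dst, v);
}
#endif

/* Run-time switch: use the gather instruction only if the CPU has AVX2. */
static void gather8(int32_t dst[8], const int32_t *base, const int32_t idx[8])
{
#ifdef HAVE_X86_DISPATCH
    if (__builtin_cpu_supports("avx2")) {
        gather8_avx2(dst, base, idx);
        return;
    }
#endif
    gather8_scalar(dst, base, idx);
}
```

Both paths produce the same result, so a caller only sees `gather8`. Whether the AVX2 path is actually faster on a given CPU is exactly the open question in this thread, and a real program would want to measure before preferring it.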

- anton
--
M. Anton Ertl                    Some things have to be seen to be believed
anton@mips.complang.tuwien.ac.at Most things have to be believed to be seen
http://www.complang.tuwien.ac.at/anton/home.html


