Rocksolid Light

Welcome to novaBBS (click a section below)

mail  files  register  newsreader  groups  login

Message-ID:  

* dpkg ponders: 'C++' should have been called 'D' -- #Debian


devel / comp.arch / Re: Check if all 256 bits are clear or set ?

SubjectAuthor
* Check if all 256 bits are clear or set ?skybuck2000
`* Re: Check if all 256 bits are clear or set ?skybuck2000
 +* Re: Check if all 256 bits are clear or set ?skybuck2000
 |`* Re: Check if all 256 bits are clear or set ?skybuck2000
 | `* Re: Check if all 256 bits are clear or set ?skybuck2000
 |  `- Re: Check if all 256 bits are clear or set ?skybuck2000
 `* Re: Check if all 256 bits are clear or set ?MitchAlsup
  `- Re: Check if all 256 bits are clear or set ?skybuck2000

1
Check if all 256 bits are clear or set ?

<ea75e6dc-dece-4b30-98f6-58fe7cd4ae31n@googlegroups.com>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=21752&group=comp.arch#21752

  copy link   Newsgroups: comp.arch
X-Received: by 2002:a05:6214:d88:: with SMTP id e8mr7539573qve.14.1635468479435;
Thu, 28 Oct 2021 17:47:59 -0700 (PDT)
X-Received: by 2002:a05:6808:19a8:: with SMTP id bj40mr5515788oib.37.1635468479074;
Thu, 28 Oct 2021 17:47:59 -0700 (PDT)
Path: i2pn2.org!i2pn.org!aioe.org!news.mixmin.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Thu, 28 Oct 2021 17:47:58 -0700 (PDT)
Injection-Info: google-groups.googlegroups.com; posting-host=84.25.28.171; posting-account=np6u_wkAAADxbE7UBGUIOm-csir6aX02
NNTP-Posting-Host: 84.25.28.171
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <ea75e6dc-dece-4b30-98f6-58fe7cd4ae31n@googlegroups.com>
Subject: Check if all 256 bits are clear or set ?
From: skybuck2...@hotmail.com (skybuck2000)
Injection-Date: Fri, 29 Oct 2021 00:47:59 +0000
Content-Type: text/plain; charset="UTF-8"
 by: skybuck2000 - Fri, 29 Oct 2021 00:47 UTC

I have a data structure in one of my applications which is an array of 32 bytes.

So basically 256 bits.

The code needs to check if all bits are empty or if all bits are full. So basically all clear or all set.

Currently this data structure uses an additional counter of 1 byte to smartly keep track of when a bit is set or cleared individually and it acts upon it's conclusion =D
(secret code =D)

So basically this code uses 1 branch to check the counter, and sometimes it does a little decrementing and sometimes a little incrementing.

So the code itself is pretty efficient.

There is ofcourse a slight little "problem" / "inefficiency" with this technique it consumes 1 additional byte.

So 1 out of 33 bytes. Maybe there are other implicit penalties where this 33 data structure is misaligned because of this 33 byte structure.

So reducing this structure to just 32 bytes could not only safe memory but it might also increase performance a little bit, to maybe a lot ?

So at least memory wise this would give a 3% reduction of memory consumption.

Performance wise I don't know.

Currently checking 256 bits with code with be much worse.

Possibilities:

1. Check each bit individually, this would be pretty stupid and slow.
2. Check each byte, so 32 comparisons, still pretty slow.
3. Check each word, so 16 comparisons, a bit better.
4. Check each longword, so 8 comparisons, a lot better but still slow.
5. Check each quadword, so 4 comparisons, only applicable in 64 bit application.

On some systems 64 bit integer compare can still take two clock cycles.

Is there something better and faster ?

Is there a 128 bit compare in a single clock or two ?

Is there a 256 bit compare in a single clock or two ?

Just wondering ! ;)

Anyway in case such an instruction would exist for example:

if Are256BitsClear( DataStructure )

if Are256BitsSet( DataStructure )

then the counter could most likely/ofcourse be removed =D
and the data structure would be a nice 32 byte fit and a bit more efficient.

Perhaps it might even make sense to use 4 comparisons if the alignment somehow gives it more performance then the overhead of 4 comparisons lol.

Bye,
Skybuck.

Re: Check if all 256 bits are clear or set ?

<10686b1b-868d-430f-9f35-0c5460d8f75cn@googlegroups.com>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=21753&group=comp.arch#21753

  copy link   Newsgroups: comp.arch
X-Received: by 2002:ac8:7f4a:: with SMTP id g10mr1435099qtk.33.1635469346311;
Thu, 28 Oct 2021 18:02:26 -0700 (PDT)
X-Received: by 2002:a9d:764c:: with SMTP id o12mr6356725otl.129.1635469346042;
Thu, 28 Oct 2021 18:02:26 -0700 (PDT)
Path: i2pn2.org!i2pn.org!aioe.org!news.mixmin.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Thu, 28 Oct 2021 18:02:25 -0700 (PDT)
In-Reply-To: <ea75e6dc-dece-4b30-98f6-58fe7cd4ae31n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=84.25.28.171; posting-account=np6u_wkAAADxbE7UBGUIOm-csir6aX02
NNTP-Posting-Host: 84.25.28.171
References: <ea75e6dc-dece-4b30-98f6-58fe7cd4ae31n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <10686b1b-868d-430f-9f35-0c5460d8f75cn@googlegroups.com>
Subject: Re: Check if all 256 bits are clear or set ?
From: skybuck2...@hotmail.com (skybuck2000)
Injection-Date: Fri, 29 Oct 2021 01:02:26 +0000
Content-Type: text/plain; charset="UTF-8"
 by: skybuck2000 - Fri, 29 Oct 2021 01:02 UTC

I was just scanning the internet/googling it and I came across this, which is a pretty neat trick to start with lol:

https://github.com/holiman/uint256/blob/master/uint256.go

Particularly this code, written in go, I have never programmed in go but I can still read/understand some of it:

// Cmp compares z and x and returns:
//
// -1 if z < x
// 0 if z == x
// +1 if z > x
//
func (z *Int) Cmp(x *Int) (r int) {
// z < x <=> z - x < 0 i.e. when subtraction overflows.
d0, carry := bits.Sub64(z[0], x[0], 0)
d1, carry := bits.Sub64(z[1], x[1], carry)
d2, carry := bits.Sub64(z[2], x[2], carry)
d3, carry := bits.Sub64(z[3], x[3], carry)
if carry == 1 {
return -1
}
if d0|d1|d2|d3 == 0 {
return 0
}
return 1
}

The interesting part is in:

if d0|d1|d2|d3 == 0 {

Apperently what this code does is it retrieves 4x64 bit quantities.

OR-s them together.

And then checks the result of that with a single branch.

So at the assembler/instruction/compiler generated code level this would probably result in something like:
mov register1, 64 bits from array[0]
mov register2, 64 bits from array[1]
mov register3, 64 bits from array[2]
mov register4, 64 bits from array[3]
or register1, register2
or register1, register3
or register1, register4
jump if zero/jump if not zero.

Something like that, so this could be pretty efficient ! at least "branch times" are avoided/costly comparisons.
The jump zero flag instruction could also be pretty efficient, even more efficient than a cmp, because the or instruction may already have set it.

The only slight downside to this would be 7 extra instructions taking up some instruction cache.

All in all not to bad for a first google ! ;) This might be quite quick actually ! LOL and might be worth it.

And the nicest part about it is it doesn't require any special assembler or avx or mmx or sse instructions... hmmmm...

(And ofcourse the compiler has to be slightly efficient at generating this exact assembler code, but it can probably do that ! ;) maybe it has even some more tricks
up it's sleeve.)

Can you do better than this though ? Wondering...

Bye,
Skybuck.

Re: Check if all 256 bits are clear or set ?

<3e3feb27-ea8c-4abb-bbee-63ae6cd8e404n@googlegroups.com>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=21756&group=comp.arch#21756

  copy link   Newsgroups: comp.arch
X-Received: by 2002:ac8:5794:: with SMTP id v20mr8483473qta.243.1635469971266; Thu, 28 Oct 2021 18:12:51 -0700 (PDT)
X-Received: by 2002:a05:6808:1814:: with SMTP id bh20mr3703665oib.0.1635469970991; Thu, 28 Oct 2021 18:12:50 -0700 (PDT)
Path: i2pn2.org!i2pn.org!aioe.org!feeder1.feed.usenet.farm!feed.usenet.farm!tr2.eu1.usenetexpress.com!feeder.usenetexpress.com!tr3.iad1.usenetexpress.com!border1.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Thu, 28 Oct 2021 18:12:50 -0700 (PDT)
In-Reply-To: <10686b1b-868d-430f-9f35-0c5460d8f75cn@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=84.25.28.171; posting-account=np6u_wkAAADxbE7UBGUIOm-csir6aX02
NNTP-Posting-Host: 84.25.28.171
References: <ea75e6dc-dece-4b30-98f6-58fe7cd4ae31n@googlegroups.com> <10686b1b-868d-430f-9f35-0c5460d8f75cn@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <3e3feb27-ea8c-4abb-bbee-63ae6cd8e404n@googlegroups.com>
Subject: Re: Check if all 256 bits are clear or set ?
From: skybuck2...@hotmail.com (skybuck2000)
Injection-Date: Fri, 29 Oct 2021 01:12:51 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 31
 by: skybuck2000 - Fri, 29 Oct 2021 01:12 UTC

This would make a fine implementation for Are256BitsClear.

This leaves Are256BitsSet still to be implemented...

Hmm can the same trick be done for AllSet ?

Probably with AND and checking for FF also decimally known as 255 :)

So for Are256BitsSet this code may work:

mov register1, 64 bits from array[0]
mov register2, 64 bits from array[1]
mov register3, 64 bits from array[2]
mov register4, 64 bits from array[3]
and register1, register2
and register1, register3
and register1, register4

jump if zero/jump if not zero.

^ Not sure about the jump if not zero instruction... this would probably be invalid.

It needs to be 255 or nothing at all.

My initially thought was to maybe negate it or something, turn 255 specific to zero somehow...

Maybe subtract 255 from it to see if it's zero... if not then it's not full.

Hmmm ?

Bye for now,
Skybuck.

Re: Check if all 256 bits are clear or set ?

<5ec5a28a-2d91-457c-bebd-8f10d32d185bn@googlegroups.com>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=21757&group=comp.arch#21757

  copy link   Newsgroups: comp.arch
X-Received: by 2002:a05:6214:28f:: with SMTP id l15mr8003411qvv.16.1635470181783;
Thu, 28 Oct 2021 18:16:21 -0700 (PDT)
X-Received: by 2002:aca:ac0b:: with SMTP id v11mr11275217oie.155.1635470181574;
Thu, 28 Oct 2021 18:16:21 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Thu, 28 Oct 2021 18:16:21 -0700 (PDT)
In-Reply-To: <10686b1b-868d-430f-9f35-0c5460d8f75cn@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:90fe:8a18:ad83:1d7a;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:90fe:8a18:ad83:1d7a
References: <ea75e6dc-dece-4b30-98f6-58fe7cd4ae31n@googlegroups.com> <10686b1b-868d-430f-9f35-0c5460d8f75cn@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <5ec5a28a-2d91-457c-bebd-8f10d32d185bn@googlegroups.com>
Subject: Re: Check if all 256 bits are clear or set ?
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Fri, 29 Oct 2021 01:16:21 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 71
 by: MitchAlsup - Fri, 29 Oct 2021 01:16 UTC

On Thursday, October 28, 2021 at 8:02:27 PM UTC-5, skybuck2000 wrote:
> I was just scanning the internet/googling it and I came across this, which is a pretty neat trick to start with lol:
>
> https://github.com/holiman/uint256/blob/master/uint256.go
>
> Particularly this code, written in go, I have never programmed in go but I can still read/understand some of it:
>
> // Cmp compares z and x and returns:
> //
> // -1 if z < x
> // 0 if z == x
> // +1 if z > x
> //
> func (z *Int) Cmp(x *Int) (r int) {
> // z < x <=> z - x < 0 i.e. when subtraction overflows.
> d0, carry := bits.Sub64(z[0], x[0], 0)
> d1, carry := bits.Sub64(z[1], x[1], carry)
> d2, carry := bits.Sub64(z[2], x[2], carry)
> d3, carry := bits.Sub64(z[3], x[3], carry)
> if carry == 1 {
> return -1
> }
> if d0|d1|d2|d3 == 0 {
> return 0
> }
> return 1
> }
>
> The interesting part is in:
>
> if d0|d1|d2|d3 == 0 {
>
> Apperently what this code does is it retrieves 4x64 bit quantities.
>
> OR-s them together.
>
> And then checks the result of that with a single branch.
>
> So at the assembler/instruction/compiler generated code level this would probably result in something like:
> mov register1, 64 bits from array[0]
> mov register2, 64 bits from array[1]
> mov register3, 64 bits from array[2]
> mov register4, 64 bits from array[3]
> or register1, register2
> or register1, register3
> or register1, register4
> jump if zero/jump if not zero.
>
> Something like that, so this could be pretty efficient ! at least "branch times" are avoided/costly comparisons.
> The jump zero flag instruction could also be pretty efficient, even more efficient than a cmp, because the or instruction may already have set it.
>
> The only slight downside to this would be 7 extra instructions taking up some instruction cache.
>
> All in all not to bad for a first google ! ;) This might be quite quick actually ! LOL and might be worth it.
>
> And the nicest part about it is it doesn't require any special assembler or avx or mmx or sse instructions... hmmmm...
>
> (And ofcourse the compiler has to be slightly efficient at generating this exact assembler code, but it can probably do that ! ;) maybe it has even some more tricks
> up it's sleeve.)
>
> Can you do better than this though ? Wondering...
mov register1, 64 bits from array[0]
mov register2, 64 bits from array[1]
mov register3, 64 bits from array[2]
mov register4, 64 bits from array[3]
or register1, register2
or register3, register4
or register1, register3
//the fist two ors can execute in parallel.
>
> Bye,
> Skybuck.

Re: Check if all 256 bits are clear or set ?

<db6c1cc3-adfc-4225-8614-40355cad2240n@googlegroups.com>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=21759&group=comp.arch#21759

  copy link   Newsgroups: comp.arch
X-Received: by 2002:ac8:6112:: with SMTP id a18mr8653653qtm.401.1635473936434;
Thu, 28 Oct 2021 19:18:56 -0700 (PDT)
X-Received: by 2002:a9d:3a2:: with SMTP id f31mr891305otf.144.1635473936149;
Thu, 28 Oct 2021 19:18:56 -0700 (PDT)
Path: i2pn2.org!i2pn.org!aioe.org!news.mixmin.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Thu, 28 Oct 2021 19:18:55 -0700 (PDT)
In-Reply-To: <3e3feb27-ea8c-4abb-bbee-63ae6cd8e404n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=84.25.28.171; posting-account=np6u_wkAAADxbE7UBGUIOm-csir6aX02
NNTP-Posting-Host: 84.25.28.171
References: <ea75e6dc-dece-4b30-98f6-58fe7cd4ae31n@googlegroups.com>
<10686b1b-868d-430f-9f35-0c5460d8f75cn@googlegroups.com> <3e3feb27-ea8c-4abb-bbee-63ae6cd8e404n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <db6c1cc3-adfc-4225-8614-40355cad2240n@googlegroups.com>
Subject: Re: Check if all 256 bits are clear or set ?
From: skybuck2...@hotmail.com (skybuck2000)
Injection-Date: Fri, 29 Oct 2021 02:18:56 +0000
Content-Type: text/plain; charset="UTF-8"
 by: skybuck2000 - Fri, 29 Oct 2021 02:18 UTC

I found it fun to scan the internet to see if there is anybody that has/needs these same methods.

I came across unity documentation:

https://docs.unity3d.com/Packages/com.unity.render-pipelines.core@7.2/api/UnityEngine.Rendering.BitArray256.html

Describing

allTrue

True if all bits are 1.
Declaration

public bool allTrue { get; }

and it took a while to find the exact source code but it's here:

https://github.com/needle-mirror/com.unity.render-pipelines.core/blob/master/Runtime/Utilities/BitArray.cs

public bool allTrue => data1 == ulong.MaxValue && data2 == ulong.MaxValue && data3 == ulong.MaxValue && data4 == ulong.MaxValue;

from this class to give it a bit more context:
/// <summary>
/// Bit array of size 256.
/// </summary>
[Serializable]
[System.Diagnostics.DebuggerDisplay("{this.GetType().Name} {humanizedData}")]
public struct BitArray256 : IBitArray
{
[SerializeField]
ulong data1;
[SerializeField]
ulong data2;
[SerializeField]
ulong data3;
[SerializeField]
ulong data4;

/// <summary>Number of elements in the bit array.</summary>
public uint capacity => 256u;
/// <summary>True if all bits are 0.</summary>
public bool allFalse => data1 == 0uL && data2 == 0uL && data3 == 0uL && data4 == 0uL;
/// <summary>True if all bits are 1.</summary>
public bool allTrue => data1 == ulong.MaxValue && data2 == ulong.MaxValue && data3 == ulong.MaxValue && data4 == ulong.MaxValue;
/// <summary>Returns the bit array in a human readable form.</summary>

So let's go back to this code:

public bool allTrue => data1 == ulong.MaxValue && data2 == ulong.MaxValue && data3 == ulong.MaxValue && data4 == ulong.MaxValue;

It's a bit funny to see both solutions for allFalse and AllTrue the naming is nice, in my case you could think of it as AllEmpty and AllFull hint hint about secret code lol.

Anyway

At least the allFalse code could be written a bit more efficient HAHA ! =D
as already explored.

But can the allTrue code be written more efficient as well ? I think so...

At least the and's could be done on the data itself

(data1 && data2 && data3 && data4) == ulong.MaxValue
3's and's
1 comparison to max value.

Bye for now,
Skybuck.

Re: Check if all 256 bits are clear or set ?

<b919ef62-c7e2-4d1f-8fe7-f517ed1ecf0dn@googlegroups.com>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=21765&group=comp.arch#21765

  copy link   Newsgroups: comp.arch
X-Received: by 2002:a05:620a:818:: with SMTP id s24mr8476444qks.395.1635509257814;
Fri, 29 Oct 2021 05:07:37 -0700 (PDT)
X-Received: by 2002:a05:6808:1597:: with SMTP id t23mr7904406oiw.78.1635509257584;
Fri, 29 Oct 2021 05:07:37 -0700 (PDT)
Path: i2pn2.org!i2pn.org!paganini.bofh.team!news.dns-netz.com!news.freedyn.net!newsreader4.netcologne.de!news.netcologne.de!peer02.ams1!peer.ams1.xlned.com!news.xlned.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Fri, 29 Oct 2021 05:07:37 -0700 (PDT)
In-Reply-To: <db6c1cc3-adfc-4225-8614-40355cad2240n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=84.25.28.171; posting-account=np6u_wkAAADxbE7UBGUIOm-csir6aX02
NNTP-Posting-Host: 84.25.28.171
References: <ea75e6dc-dece-4b30-98f6-58fe7cd4ae31n@googlegroups.com>
<10686b1b-868d-430f-9f35-0c5460d8f75cn@googlegroups.com> <3e3feb27-ea8c-4abb-bbee-63ae6cd8e404n@googlegroups.com>
<db6c1cc3-adfc-4225-8614-40355cad2240n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <b919ef62-c7e2-4d1f-8fe7-f517ed1ecf0dn@googlegroups.com>
Subject: Re: Check if all 256 bits are clear or set ?
From: skybuck2...@hotmail.com (skybuck2000)
Injection-Date: Fri, 29 Oct 2021 12:07:37 +0000
Content-Type: text/plain; charset="UTF-8"
X-Received-Bytes: 17273
 by: skybuck2000 - Fri, 29 Oct 2021 12:07 UTC

I have created a "professional" github page, so you guys can download the example source code properly and quickly using GIT and don't have to hassle with newsgroup crappy messaging formats:

TData and TLink optimization project

https://github.com/SkybuckFlying/TDataTLinkOptimizationProject

The aim of the project is to:

Optimize the IsEmpty and IsFull routines for TData and TLink.
Optimize the Empty and Full routines for TData and TLink.
Implement WIN64 platform support by implementing BitSet and BitReset.

YES BABY ! =D It's 2021 ! Time to start sharing code a little bit more professionally and effectively and actually good usuable.

However in case you don't know how to use GIT and still want to have a look at the source code then I will post it here anyway,

we can also discuss further ideas for optimization on newsgroups, I am particularly interested in any high speed assembler assistence for x86 and/or x64 instruction set !

unit UnitTDataTLink;

{

TestProgram/Optimization Project for TData and TLink

version 0.02 created on 29 october 2021 by Skybuck Flying

TData contains an array of 32 bytes, in other words 256 bits.
TLink contains an array of 256 pointers.

The mission of this project is to optimize the IsEmpty and IsFull functions
to run as fast as possible on modern x86/x64 processors.

Bonus would be to optimize Empty and Full routines, however beware
the TLink structure uses special pointer values for empty and full.

The verification routines do not need to be optimized and should not
be optimized and are only there to verify any optimized code.

I will sure this code in github for easy downloading and uploading.

Would be cool to get some help with optimizing this code.

The original code used a mCount : byte; member to optimize the code
however this increases the TData structure to 33 bytes which could
be causing misalignment and consumes 3% more memory.

The TLink structure would also fit better if mCount is illiminated.

Both structures contained a mCount : byte variable and have been removed.

Now the same functionality to check for empty and full should be implemented
with as fast as code as possible without using any extra member fields
for the structures to keep them memory efficient like this.

Perhaps these link functions can be optimized with AVX-512 or so ?

}

{

version 0.03 created on 29 october 2021 by Skybuck Flying

Console program turned into multi-device-application project for
multi-platform support.

Project and Source files seperated into seperate folders to avoid
file clutter.

Only WIN32 platform currently implemented.

Other platforms possible thanks to multi-device-project/applicaation.

For some reason it was not possible to add platforms to a console project.

So I modified code so it can work with a multi-device-application, this
added a lot of platforms.

WIN64 platform support is desired !

The rest would be a bonus !

}

interface

uses
Classes; // for TStrings

type
TData = packed record
mData : packed array[0..31] of byte;

procedure Empty;
procedure Full;

function IsEmpty : boolean;
function IsFull : boolean;

// debug routines
function VerifyIsEmpty : boolean;
function VerifyIsFull : boolean;
function VerifyOne : boolean;
function VerifyRandomButOne : boolean;
procedure VerifyAll;
end;

TLink = packed record
mLink : packed array[0..255] of pointer;

procedure Empty;
procedure Full;

function IsEmpty : boolean;
function IsFull : boolean;

// debug routines
function VerifyIsEmpty : boolean;
function VerifyIsFull : boolean;
function VerifyOne : boolean;
function VerifyRandomButOne : boolean;
procedure VerifyAll;
end;

procedure MainTestProgram( ParaOutput : TStrings );

implementation

uses
SysUtils; // for IntToStr

const
ConstDataTestLoops = 1000000; // one million ! =D
ConstLinkTestLoops = 100000; // one hundred thousand ! =D

var
const_empty : pointer;
const_full : pointer;

var
vOutput : TStrings;

procedure OutputMessage( ParaString : string );
begin
vOutput.Add( ParaString );
end;

// only works on win32 platform, need win64 platform versions !

{$ifdef WIN32}
// unlocked versions (multi-threading-unsafe)
procedure BitReset( ParaMemory : pointer; ParaBitIndex : integer );
asm
btr [eax], edx
end;

procedure BitSet( ParaMemory : pointer; ParaBitIndex : integer );
asm
bts [eax], edx
end;

// locked versions (multi-threading-safe)
procedure LockedBitReset( ParaMemory : pointer; ParaBitIndex : integer );
asm
LOCK btr [eax], edx
end;

procedure LockedBitSet( ParaMemory : pointer; ParaBitIndex : integer );
asm
LOCK bts [eax], edx
end;
{$endif}

{$ifdef WIN64}
// unlocked versions (multi-threading-unsafe)
// optimize me ?
procedure BitReset( ParaMemory : pointer; ParaBitIndex : integer );
begin
// to be implemented
ERROR NOT IMPLEMENTED
end;

procedure BitSet( ParaMemory : pointer; ParaBitIndex : integer );
begin
// to be implemented
ERROR NOT IMPLEMENTED
end;

// not really required for this test program but would be nice to have a 64 bit version of this as well.
procedure LockedBitReset( ParaMemory : pointer; ParaBitIndex : integer );
begin

end;

procedure LockedBitSet( ParaMemory : pointer; ParaBitIndex : integer );
begin

end;
{$endif}

{$ifdef WIN32}
procedure TData.Empty;
type
TUInt32Array = packed array[0..7] of UInt32;
begin
TUInt32Array(mData)[0] := 0;
TUInt32Array(mData)[1] := 0;
TUInt32Array(mData)[2] := 0;
TUInt32Array(mData)[3] := 0;
TUInt32Array(mData)[4] := 0;
TUInt32Array(mData)[5] := 0;
TUInt32Array(mData)[6] := 0;
TUInt32Array(mData)[7] := 0;
end;
{$endif}

{$ifdef WIN64}
procedure TData.Empty;
type
TUInt64Array = packed array[0..3] of UInt64;
begin
TUInt64Array(mData)[0] := 0;
TUInt64Array(mData)[1] := 0;
TUInt64Array(mData)[2] := 0;
TUInt64Array(mData)[3] := 0;
end;
{$endif}

{$ifdef WIN32}
procedure TData.Full;
type
TUInt32Array = packed array[0..7] of longword;
begin
// 11223344
TUInt32Array(mData)[0] := $FFFFFFFF;
TUInt32Array(mData)[1] := $FFFFFFFF;
TUInt32Array(mData)[2] := $FFFFFFFF;
TUInt32Array(mData)[3] := $FFFFFFFF;
TUInt32Array(mData)[4] := $FFFFFFFF;
TUInt32Array(mData)[5] := $FFFFFFFF;
TUInt32Array(mData)[6] := $FFFFFFFF;
TUInt32Array(mData)[7] := $FFFFFFFF;
end;
{$endif}

{$ifdef WIN64}
procedure TData.Full;
type
TUInt64Array = packed array[0..3] of UInt64;
begin
// 1122334455667788
TUInt64Array(mData)[0] := $FFFFFFFFFFFFFFFF;
TUInt64Array(mData)[1] := $FFFFFFFFFFFFFFFF;
TUInt64Array(mData)[2] := $FFFFFFFFFFFFFFFF;
TUInt64Array(mData)[3] := $FFFFFFFFFFFFFFFF;
end;
{$endif}

{$ifdef WIN32}
function TData.IsEmpty : boolean;
type
TUInt32Array = packed array[0..7] of longword;
var
vResult : longword;
begin
result := False;

vResult :=
TUInt32Array(mData)[0] or
TUInt32Array(mData)[1] or
TUInt32Array(mData)[2] or
TUInt32Array(mData)[3] or
TUInt32Array(mData)[4] or
TUInt32Array(mData)[5] or
TUInt32Array(mData)[6] or
TUInt32Array(mData)[7];

if vResult = 0 then
begin
result := True;
end;
end;
{$endif}

{$ifdef WIN64}
function TData.IsEmpty : boolean;
type
TUInt64Array = packed array[0..3] of UInt64;
var
vResult : UInt64;
begin
result := False;

vResult :=
TUInt64Array(mData)[0] or
TUInt64Array(mData)[1] or
TUInt64Array(mData)[2] or
TUInt64Array(mData)[3];

if vResult = 0 then
begin
result := True;
end;
end;
{$endif}

{$ifdef WIN32}
function TData.IsFull : boolean;
type
TUInt32Array = packed array[0..7] of longword;
var
vResult : UInt64;
begin
result := False;

vResult :=
TUInt32Array(mData)[0] and
TUInt32Array(mData)[1] and
TUInt32Array(mData)[2] and
TUInt32Array(mData)[3] and
TUInt32Array(mData)[4] and
TUInt32Array(mData)[5] and
TUInt32Array(mData)[6] and
TUInt32Array(mData)[7];

// 11223344
if vResult = $FFFFFFFF then
begin
result := True;
end;
end;
{$endif}

{$ifdef WIN64}
function TData.IsFull : boolean;
type
TUInt64Array = packed array[0..3] of UInt64;
var
vResult : UInt64;
begin
result := False;

vResult :=
TUInt64Array(mData)[0] and
TUInt64Array(mData)[1] and
TUInt64Array(mData)[2] and
TUInt64Array(mData)[3];

// 1122334455667788
if vResult = $FFFFFFFFFFFFFFFF then
begin
result := True;
end;
end;
{$endif}

// TData verification routines

function TData.VerifyIsEmpty : boolean;
var
vIndex : integer;
begin
result := True;
for vIndex := 0 to 31 do
begin
if mData[vIndex] <> 0 then
begin
result := False;
end;
end;
end;


Click here to read the complete article
Re: Check if all 256 bits are clear or set ?

<8567e4fd-7953-4acb-aaec-145a297f6291n@googlegroups.com>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=21766&group=comp.arch#21766

  copy link   Newsgroups: comp.arch
X-Received: by 2002:a05:622a:13cc:: with SMTP id p12mr11505378qtk.227.1635509448115;
Fri, 29 Oct 2021 05:10:48 -0700 (PDT)
X-Received: by 2002:a9d:d12:: with SMTP id 18mr8016807oti.85.1635509447912;
Fri, 29 Oct 2021 05:10:47 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Fri, 29 Oct 2021 05:10:47 -0700 (PDT)
In-Reply-To: <5ec5a28a-2d91-457c-bebd-8f10d32d185bn@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=84.25.28.171; posting-account=np6u_wkAAADxbE7UBGUIOm-csir6aX02
NNTP-Posting-Host: 84.25.28.171
References: <ea75e6dc-dece-4b30-98f6-58fe7cd4ae31n@googlegroups.com>
<10686b1b-868d-430f-9f35-0c5460d8f75cn@googlegroups.com> <5ec5a28a-2d91-457c-bebd-8f10d32d185bn@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <8567e4fd-7953-4acb-aaec-145a297f6291n@googlegroups.com>
Subject: Re: Check if all 256 bits are clear or set ?
From: skybuck2...@hotmail.com (skybuck2000)
Injection-Date: Fri, 29 Oct 2021 12:10:48 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 30
 by: skybuck2000 - Fri, 29 Oct 2021 12:10 UTC

> > Can you do better than this though ? Wondering...
> mov register1, 64 bits from array[0]
> mov register2, 64 bits from array[1]
> mov register3, 64 bits from array[2]
> mov register4, 64 bits from array[3]
> or register1, register2
> or register3, register4
> or register1, register3
> //the fist two ors can execute in parallel.

Thank you for this suggestion.

Today I just wrote some quick code and some verification code, see other message in this thread.

For now I have implemented it as a bunch of or statements in the Delphi programming language like so:

A or
B or
C or
D;

and also a 32 bit version A,B,C,D,E,F,G,H and or's in between.

It will be interesting to examine later if Delphi's compiler will apply your optimization suggestion !

If not then perhaps parenthesis or explicit code can be written to apply your parallel optimization trick ! =D

I haven't had time yet to look into it, busy written the code, but I will later, probably tomorrow =D

Bye for now,
Skybuck.

Re: Check if all 256 bits are clear or set ?

<945c4ab8-863b-4650-a45a-0bb450d20f5cn@googlegroups.com>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=22132&group=comp.arch#22132

  copy link   Newsgroups: comp.arch
X-Received: by 2002:a05:620a:754:: with SMTP id i20mr5946351qki.312.1637685246877; Tue, 23 Nov 2021 08:34:06 -0800 (PST)
X-Received: by 2002:a9d:764c:: with SMTP id o12mr5846033otl.129.1637685246658; Tue, 23 Nov 2021 08:34:06 -0800 (PST)
Path: i2pn2.org!i2pn.org!aioe.org!feeder5.feed.usenet.farm!feeder1.feed.usenet.farm!feed.usenet.farm!tr3.eu1.usenetexpress.com!feeder.usenetexpress.com!tr1.iad1.usenetexpress.com!border1.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Tue, 23 Nov 2021 08:34:06 -0800 (PST)
In-Reply-To: <b919ef62-c7e2-4d1f-8fe7-f517ed1ecf0dn@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=84.25.28.171; posting-account=np6u_wkAAADxbE7UBGUIOm-csir6aX02
NNTP-Posting-Host: 84.25.28.171
References: <ea75e6dc-dece-4b30-98f6-58fe7cd4ae31n@googlegroups.com> <10686b1b-868d-430f-9f35-0c5460d8f75cn@googlegroups.com> <3e3feb27-ea8c-4abb-bbee-63ae6cd8e404n@googlegroups.com> <db6c1cc3-adfc-4225-8614-40355cad2240n@googlegroups.com> <b919ef62-c7e2-4d1f-8fe7-f517ed1ecf0dn@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <945c4ab8-863b-4650-a45a-0bb450d20f5cn@googlegroups.com>
Subject: Re: Check if all 256 bits are clear or set ?
From: skybuck2...@hotmail.com (skybuck2000)
Injection-Date: Tue, 23 Nov 2021 16:34:06 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 131
 by: skybuck2000 - Tue, 23 Nov 2021 16:34 UTC

Conceptually this is very smart to invert the logic !

Your idea (Peter 'Shaggy' Haywood) has given me a new idea for the code:

Instead of wasting cpu cycles/times checking everything instead only check what is needed to come to a certain "inverted conclusions".

It is indeed not necessary to check all bits to see if they are all clear, if one of them is not clear then obviously the data is not empty, and the function can exit earlier.

It is indeed not necessary to check all bits to see if they are all set, if one of them is not set then obviously the data is not full, and the function can exit earlier.

HAHA ! Pretty smart and brilliant ! ;) =D

Or is it ? hmmmm

Maybe not.... it depends...

Let's analyze code below:

{$ifdef WIN32}
function TData.IsEmpty : boolean;
type
TUInt32Array = packed array[0..7] of longword;
var
vResult : longword;
begin
result := False;

vResult :=
TUInt32Array(mData)[0] or
TUInt32Array(mData)[1] or
TUInt32Array(mData)[2] or
TUInt32Array(mData)[3] or
TUInt32Array(mData)[4] or
TUInt32Array(mData)[5] or
TUInt32Array(mData)[6] or
TUInt32Array(mData)[7];

if vResult = 0 then
begin
result := True;
end;
end;
{$endif}

Theoretically it needs to pull in 8x 32 bit integers, this could strain the memory bus somewhat, slow down the cpu, while it's waiting for data.... and it may pollute the l1 data cache unnecessarily.

On the other hand it does only one branch comparison.

Now the alternative would be to do 8 branch comparisons. Perhaps there is a golden ground in the middle, maybe only do 2 or 4 comparisons that is a possibility as well.

So different combinations could be tried to see which one performs best.

Here is where a little bit of artificial intelligence could be handy.

The application could measure all versions during active/real world usage with statisticall data tracking... for the first few minutes or so...

Then once another variations and statistics collected it can switch to non-statistical methods and pick the one which it believes is the fastest.

Anyway let's create a worst case counter example and analyze that, untested code:

To cool thing is this also gets rid of the extra variable.

function TData.IsEmpty : boolean;
type
TUInt32Array = packed array[0..7] of longword;
begin
result := False;

// if not empty then exit.
if TUInt32Array(mData)[0]) <> 0 then exit;
if TUInt32Array(mData)[1]) <> 0 then exit;
if TUInt32Array(mData)[2]) <> 0 then exit;
if TUInt32Array(mData)[3]) <> 0 then exit;
if TUInt32Array(mData)[4]) <> 0 then exit;
if TUInt32Array(mData)[5]) <> 0 then exit;
if TUInt32Array(mData)[6]) <> 0 then exit;
if TUInt32Array(mData)[7]) <> 0 then exit;

result := True; // if execution reaches here then return true.
end;
{$endif}

Pretty cool,

These are still 8 data accesses, no or instructions, 8 comparisons, possibly 8 jumps

On average this will probably run at 4,4,4

It will take up some more branch predicators, less data cache

Hard to say which one will run faster ;) for now my bet would be on the first one, because or instructions are 1 cycle and produces can be 15 or something.

However data access can be in the range of 400 nanoseconds, maybe 400 cycles, that could severely impact performance.

And this this newer version might run faster, depends on how could the L1 data cache lining is, if it's good maybe 10 cycles per data access... for a average total of 40.

Hmmm... I may have to benchmark this sometime ! ;)

And the isFull version:

// untested code

{$ifdef WIN32}
function TData.IsFull : boolean;
type
TUInt32Array = packed array[0..7] of longword;
begin
result := False;

// if not full exit
if TUInt32Array(mData)[0] <> $FFFFFFFF then exit;
if TUInt32Array(mData)[1] <> $FFFFFFFF then exit;
if TUInt32Array(mData)[2] <> $FFFFFFFF then exit;
if TUInt32Array(mData)[3] <> $FFFFFFFF then exit;
if TUInt32Array(mData)[4] <> $FFFFFFFF then exit;
if TUInt32Array(mData)[5] <> $FFFFFFFF then exit;
if TUInt32Array(mData)[6] <> $FFFFFFFF then exit;
if TUInt32Array(mData)[7] <> $FFFFFFFF then exit;

result := True;
end;
{$endif}

Ok interesting,

Now we truely have something to benchmark ! ;) =D

I am kinda busy so this will have to wait time a further time, but other people are welcome to benchmark this themselfes ! ;) :)

Bye for now,
And thanks for the feedback and ideas !
Skybuck Flying ! =D

1
server_pubkey.txt

rocksolid light 0.9.8
clearnet tor