Re: Mixed EGU/EGO floating-point

From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Mixed EGU/EGO floating-point
Date: Sun, 15 May 2022 03:03:07 -0500
Organization: A noiseless patient Spider
Lines: 232
Message-ID: <t5qc6a$2q7$1@dont-email.me>
References: <de735513-d303-42b6-b375-916b89ddafcan@googlegroups.com>
<2a41be98-e9c8-4159-bef7-ba7999e278ecn@googlegroups.com>
<t5p52v$shq$1@dont-email.me>
<30d6cdc7-ee4c-4b55-9ba8-8e7f236cd49fn@googlegroups.com>
In-Reply-To: <30d6cdc7-ee4c-4b55-9ba8-8e7f236cd49fn@googlegroups.com>

On 5/14/2022 4:29 PM, MitchAlsup wrote:
> On Saturday, May 14, 2022 at 3:57:06 PM UTC-5, BGB wrote:
>> On 5/13/2022 9:06 AM, MitchAlsup wrote:
>>> On Thursday, May 12, 2022 at 11:23:44 PM UTC-5, Quadibloc wrote:
>>>> I tried sending an E-mail about this brainstorm of mine to posithub.org, and in
>>>> my unsuccessful attempt to do so, I discovered that John L. Gustafson's E-mail
>>>> address at the University of Singapore is no longer valid.
>>>> Also, the most recent news of him I have heard of was that he was hired by
>>>> ClearSpeed Technology, but that company no longer exists; another company
>>>> with a similar name makes AI software, not chips.
>>>>
>>>> Anyways, the brainstorm I was inspired with by reading the discussion about
>>>> NaNs is this:
>>>>
>>>> Let us consider a floating-point format which is identical to the IEEE 754 floating-point format... when the first three bits of the excess-n exponent
>>>> field are in the range [001 ... 110].
>>>>
>>>> So for the "inner 75%" of the range of IEEE 754 floats, not one bit of
>>>> precision is lost.
>>>>
>>>> In the bottom 12.5%, we have Extremely Gradual Underflow, which I have
>>>> described before.
>>>>
>>>> So if the exponent field starts with 000, _and_ the first bit of the
>>>> significand field is 1, then:
>>>>
>>>> - that first bit of the significand field corresponds to the first bit of the
>>>> significand, and is no longer "hidden", and
>>>> - the effective value of the exponent is adjusted the same way the
>>>> effective value of an all-zeroes exponent is adjusted for conventional
>>>> IEEE-754 floats; one is added to it.
>>>>
>>>> Nothing new here, except that a bit seems to be wasted for no good
>>>> reason.
>>>> But if you remember how EGU works, or if you're familiar with posits,
>>>> you will know what comes next.
>>>> If the significand field begins with "01", now the effective value of the
>>>> exponent is adjusted by how it needs to be adjusted so that the range of
>>>> floats with "01" significands is tacked on neatly to the low end of the
>>>> range of floats with "1" significands.
>>>>
>>>> Or, in posit language, exponents starting with 000 are followed by a
>>>> "regime" field before a significand field (still with a hidden first bit)
>>>> starts.
>>>>
>>>> And the same happens, but on the high (EGO - Extremely Gradual
>>>> Overflow) end, for exponents starting with 111.
>>> <
>>> Why not a hierarchy of infinities ?
>>>>
>>>> The order of bits can be shuffled around to make the resulting number
>>>> look more like a posit, if you like. (That is, put the regime field
>>>> immediately after the 000 and 111, then put in the rest of the exponent,
>>>> before starting the significand, instead of putting the regime field
>>>> after the whole exponent.)
>>>>
>>>> The idea is that now one can switch to this... 25% solution of posits...
>>>> without the complaint that, oh, noes, we're losing one bit of precision.
>>>>
>>>> Now it's only at the outer 25% of the old floating point range, which is
>>>> obviously totally irrelevant if you don't even believe that posits are
>>>> necessary and/or useful!
>>> <
>>> The most useful thing about posits is that they do not come with all
>>> the IEEE 754 baggage.
>> One could go the other direction:
>> Drop any semantic distinction between Inf and NaN;
>> Effectively, Inf is just a special case of NaN.
> <
> Infinity has a defined magnitude :: bigger than any IEEE representable number
> NaN has a defined non-value :: not comparable to any IEEE representable
> number (including even itself).
> <
> I don't see why you would want Infinity to just be another NaN.
> <

If either Inf or NaN happens, it usually means "something has gone
wrong". Having them as distinct cases adds more special cases that need
to be detected and handled by hardware, without contributing much beyond
slightly different ways of saying "Well, the math has broken".

In practice, the distinction contributes little either to practical use
cases or to debugging.

Though, I am still generally in favor of keeping NaN around.

>> Define "Denormal As Zero" as canonical;
> <
> This no longer saves circuitry.
> <

Denormal numbers are only really "free" if one has an FMA unit rather
than separate FADD/FMUL (so, e.g., both FMUL and FADD can share the same
renormalization logic).

Though, the cheaper option here is seemingly to not have FMA at all, in
which case denormal support is no longer free.

Though, I would assume DAZ is paired with FTZ (Flush To Zero) on the
results, since the interpretation where the result exponent may be
random garbage leads to other (generally worse) semantic issues.

It is also technically cheaper to implement FMUL and FADD in a way where
most of the low-order bits which "fall off the bottom" are effectively
discarded from the calculation, because only a relatively limited number
of bits below the ULP are likely to have much effect on the rounded result.

Though, FADD does need a mantissa large enough internally to deal with
integer conversion (so, say, 66 bits to deal with Binary64<->Int64
conversion).

Reusing FADD for conversion makes more sense, since FADD already has
most of the logic needed for doing conversions, and this is cheaper than
repeating the logic for a dedicated module.

For FMUL, given the relatively limited dynamic range of the results (for
normalized inputs), the renormalization step is very minimal:
  Result is 1<=x<2: all is good (do nothing);
  Result is 2<=x<4: shift right by 1 bit and add 1 to the exponent.

The main expensive part of FMUL is the "multiply the two mantissas
together" step.

>> ...
>>
>> FPU operations:
>> ADD/SUB/MUL
>> CMP, CONV
>>
>> Rounding:
>> Mostly Undefined (Rounding modes may be unsupported or ignored)
>> ADD/SUB/MUL are ULP +/- 1.5 or 2 or similar
> <
> Even the GPUs are migrating towards full IEEE 754 compliance.

This is more likely due to GPGPU uses than due to full 754 being
particularly useful for graphics processing and similar.

> <
>> Conversion to integer is always "truncate towards zero".
> <
> i = ICEIL( x );
> ...

There are ways to implement floor/ceil/... that don't depend on having
multiple rounding modes in hardware, or multiple float->int conversions.

For example, it is frequently useful to implement float->int conversion
in a way that rounds towards negative infinity, but usually this is
handled by doing something like, say:
long floor_to_long(double x)
{
    if(x<0)
    {
        /* bias just below 1.0, then truncate; approximate when the
           fractional part is smaller than the bias epsilon */
        return(-(long)((-x)+0.999999999999));
    }
    return((long)x);
}

Or similar...

>>
>> Could still require that, for the same inputs, the operators will still
>> produce the same output each time.
>>
>> Divide and Square-Root are software, and "somewhere in generally the
>> right area" is regarded as sufficient.
> <
> Unlikely to be accepted by the market.

This likely depends on what the processor can pull off effectively.

I have yet to figure out a "good" and "cheap" way to do FDIV and FSQRT
moderately quickly in hardware.

So, doing it in software is still faster in my case.

And, in software, one can use versions which cut back on the number of
N-R stages. Say, for example, one finds that for a given calculation,
two N-R stages are sufficient (and we don't want to spend the cycles to
converge it all the way to the ULP).

>>
>>
>> Or, basically, the cheapest FPU possible which is still sufficient as to
>> be basically usable.
>>
> Will end up with the same "avid" following as FIAT here in USA.

Dunno.

>>
>> Possible optional additions:
>> Denormalized formats, which lack a hidden bit
>> More like the x87 long-double format;
>> Intermediate precision formats, such as:
>> Binary48 (S.E11.F36)
>> Binary24 (S.E8.F15).
>> If stored in a 32 or 64 bit container:
>> Will use the same format as Binary32 or Binary64
>> Will ignore the low order bits.
> <
> It seems to me that this is not a job of CPU architects, but a job
> for people who want to use quality FP implementations.

These options would serve as faster or cheaper alternatives to the
full-width versions.

Though, trying to pass them off as their full-width siblings is unlikely
to go unnoticed.

But, for example, semantically truncating Binary32 to 24 bits could be
useful for SIMD in cases where Binary16 is insufficient but full
Binary32 precision isn't needed, and where the truncated form can be
handled in fewer clock cycles.
