Rocksolid Light


computers / comp.arch / latest

Re: Misc: Idle thoughts for cheap and fast(ish) GPU. (thread)

comp.arch

Posted: 2 Hours 27 Minutes ago by: MitchAlsup

< Why are the 9 above instructions not a single instruction ? < -------------------------------- See above.

Misc: Idle thoughts for cheap and fast(ish) GPU.

comp.arch

Posted: 4 Hours 57 Minutes ago by: BGB

Getting GLQuake above 10fps on a 50MHz CPU core is non-trivial. A dedicated GPU is possible, but I have a practical resource limit (on the XC7A100T) of around 15-20 kLUT for something like this. To get the sort of frame-rates I want

Re: Ryzen 7000 Described, but not detailed (thread)

comp.arch

Posted: 5 Hours 16 Minutes ago by: Torbjorn Lindgren

I suspect that's an old statement that made sense THEN, long before the motherboards and CPUs came out - at that point everyone expected DDR5 to only have a small premium. And at that point the unmatchable bandwidth that DDR5 does give you

Re: Ryzen 7000 Described, but not detailed (thread)

comp.arch

Posted: 6 Hours 23 Minutes ago by: MitchAlsup

< Perhaps driven by DRAM manufacturers ?

Re: Ryzen 7000 Described, but not detailed (thread)

comp.arch

Posted: 7 Hours 3 Minutes ago by: Quadibloc

True, but previous news reports said that Intel was telling motherboard makers to "emphasize" DDR5 over DDR4 quite strongly, so it seems like it doesn't want to accept the gift. John Savard

Re: Ryzen 7000 Described, but not detailed (thread)

comp.arch

Posted: 7 Hours 51 Minutes ago by: Torbjorn Lindgren

No, what they showed is that it's POSSIBLE to take ONE Zen4 CPU up to at least 5.5GHz. The slides only say at least one SKU will hit "5GHz+" single-core boost speed; usually one would expect this to also be the high-core variant but that's

Re: Ryzen 7000 Described, but not detailed (thread)

comp.arch

Posted: 17 Hours 14 Minutes ago by: Anton Ertl

L3 cache was tripled to 96MB in the 5800X3D. But yes, between the clock rate increase and the IPC increase from the larger L2, I expect little other IPC increases. But increasing the clock to 5.5GHz is quite a feat. I wonder how they d

Ryzen 7000 Described, but not detailed

comp.arch

Posted: 18 Hours 12 Minutes ago by: Quadibloc

Lisa Su's Computex keynote noted some facts about the Ryzen 7000 series in general, but she didn't give the details on each SKU in the forthcoming lineup. The 16-core version has, apparently, a turbo speed of around 5.5 GHz, as shown in a

Re: branchless binary search (thread)

comp.arch

Posted: 20 Hours 4 Minutes ago by: Terje Mathisen

This is the real solution, i.e. as long as I as a cloud customer know that only my own code runs on the ~100K cpu instances I'm using at any given point in time, then I don't care about these types of attacks. It is the mixing of my ow

Re: branchless binary search (thread)

comp.arch

Posted: 1 Day 3 Hours ago by: Andy Valencia

The virtual server hosting market is really big, and growing. So far, the CPU makers--along with their counterpart OS devs--pretend like they're staying ahead. And so far, we pretend like we believe them. This participatory delusion ma

Re: branchless binary search (thread)

comp.arch

Posted: 1 Day 8 Hours ago by: MitchAlsup

< And also application vendors against the attacks of its customers. <

Re: branchless binary search (thread)

comp.arch

Posted: 1 Day 12 Hours ago by: Stefan Monnier

I'm also wondering if it'll happen. So far hardware manufacturers have been rather keen on improving "security" when it comes to protecting the interest of manufacturers against "attacks" by the device owners (e.g. HDCP, Secure Boot, the

Re: Spectre fix (was: branchless binary search) (thread)

comp.arch

Posted: 1 Day 14 Hours ago by: Michael S

As far as HW is concerned, spectre is not a vulnerability, it's a feature. It should be documented rather than fixed. As a customer, I am not willing to pay for HW fix for spectre, even if payment is small. Only if we consider spectre

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 1 Day 16 Hours ago by: aph

Looking at the Neoverse N2 Software Optimization Guide, I see that basic ALU ops have 1 clock latency, 4 ops/clock throughput. Basic ALU ops with condition run at only 1 op/clock. These predicated instructions run on the Integer Multicycl

Spectre fix (was: branchless binary search) (thread)

comp.arch

Posted: 1 Day 17 Hours ago by: Anton Ertl

[reformatted to satisfy Usenet line length conventions] Possibly. The same could be said about any other vulnerability. Should we stop fixing them? In contrast to software vulnerabilities, which seem to have an endless supply given the

Re: branchless binary search (thread)

comp.arch

Posted: 2 Days 7 Hours ago by: Thomas Koenig

Amen. https://xkcd.com/2166/ comes to mind.

Re: branchless binary search (thread)

comp.arch

Posted: 2 Days 8 Hours ago by: Michael S

The 3rd. Except that I don't see increased power as major concern. Reduced performance - yes. Increased area - yes. Not worth design resources - yes, yes, yes Also, futility of the effort. A new sub-channel with the same effect except,

Re: Spectre and resource comtention (was: branchless binary search) (thread)

comp.arch

Posted: 2 Days 8 Hours ago by: MitchAlsup

< < Yes, s/meltdown/rowhammer/ < No, you missed my point. When the L3 is the read and write buffers to DRAM, then one can CFLUSH as many times as one likes, but DRAM only gets written once. The L3 gets written k times, Dram 1 time. Nobod

Re: Spectre and resource comtention (was: branchless binary search) (thread)

comp.arch

Posted: 2 Days 17 Hours ago by: Anton Ertl

More precisely, the core can access the bit speculatively through some gadget; this is just the same as the first speculative load in Spectre V1 or Spectre V2, only the side channel afterwards is different. The side channel is not about

Re: Spectre and resource comtention (thread)

comp.arch

Posted: 2 Days 17 Hours ago by: Anton Ertl

My countermeasure gives priority to non-speculative accesses, so the speculative access then just would not happen, and the attacker could not measure it. The attack would not work in the presence of my countermeasure. For non-specula

Re: Spectre and resource comtention (thread)

comp.arch

Posted: 3 Days 3 Hours ago by: MitchAlsup

< Basically, everything (other than the first 5 instructions executed) are speculative. One STILL has speculative instructions in the pipeline after MOST branch mispredictions. So, there is no getting away from this. Just assume everythin

Re: Spectre and resource comtention (was: branchless binary search) (thread)

comp.arch

Posted: 3 Days 3 Hours ago by: MitchAlsup

< Phraseology:: a LD either hits in L1 or it does not (and so on) I fail to see how secret bit is present in the core being attacked unless the core performs a Flush to L3 on a line recently in L1. {So, you prevent this attack by not flush

Re: Spectre and resource comtention (thread)

comp.arch

Posted: 3 Days 5 Hours ago by: Stefan Monnier

Why does the attacker need to (constantly) access L3 speculatively? Couldn't they just as well access L3 (constantly) non-speculatively? [ If so, your counter measure wouldn't work, right? ] Stefan

Spectre and resource comtention (was: branchless binary search) (thread)

comp.arch

Posted: 3 Days 7 Hours ago by: Anton Ertl

That helps against attacks that then test for the presence of some data in a cache. But I also want to be immune to an attack like the following: One core contains your secret bit, and somehow the secret bit is accessed speculatively (t

Re: branchless binary search (thread)

comp.arch

Posted: 3 Days 8 Hours ago by: MitchAlsup

< < < < One can Fetch down a non-architectural path, feed the data to the LD instruction, and allow the LD instruction (transitively) to start new Fetches. What one cannot do is put the fetched non-architectural data in the data cache. You

Re: branchless binary search (thread)

comp.arch

Posted: 3 Days 9 Hours ago by: Stephen Fuld

I am not sure what you are saying here. ISTM that statement could mean one of three things. 1. It is impossible to design any OoO CPU, even from scratch that is not susceptible to Spectre. I think Mitch and others would disagree, 2.

Re: branchless binary search (thread)

comp.arch

Posted: 3 Days 9 Hours ago by: Anton Ertl

[...] If it speculates 20 iterations of the loop, it fetches 20 cache lines speculatively, and in the unpredictable case on average only one speculative fetch is actually used, i.e. 5%. And if we de-Spectre-ize it such that i
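The 5% figure can be checked with a short calculation. The sketch below is a simplified model, not something from the post: it assumes each loop branch is predicted correctly with independent probability p (0.5 for an unpredictable search) and that the k-th speculative fetch is useful only if the first k predictions were all correct. The function name is hypothetical.

```c
#include <assert.h>

/* Expected number of useful speculative fetches when `depth` iterations are
   speculated and each branch is predicted correctly with probability p.
   Assumption: predictions are independent, and fetch k is useful only if
   predictions 1..k were all correct, so E = p + p^2 + ... + p^depth. */
double expected_useful_fetches(int depth, double p)
{
    assert(depth >= 0 && p >= 0.0 && p <= 1.0);
    double expected = 0.0;
    double all_correct = 1.0;   /* probability that the first k predictions hit */
    for (int k = 1; k <= depth; k++) {
        all_correct *= p;
        expected += all_correct;
    }
    return expected;
}
```

With depth 20 and p = 0.5 the sum is just under 1, so roughly 1 in 20 speculative fetches is used under this model, which matches the figure quoted above.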

Re: branchless binary search (thread)

comp.arch

Posted: 3 Days 10 Hours ago by: Michael S

de-Spectre-ization is a pipe dream. Not going to happen.

Re: branchless binary search (thread)

comp.arch

Posted: 3 Days 11 Hours ago by: EricP

The branching version depends on flooding the memory hierarchy with speculative requests to get as many overlapped cache line fetches going at once as possible, and then on average uses maybe 50% of those fetches, but that is still a win. Now if we

Re: branchless binary search (thread)

comp.arch

Posted: 3 Days 14 Hours ago by: Terje Mathisen

Yeah, that has been my real-life experience. I.e. I have written branchless binary search code at least a couple of times without finding it to be a win. OTOH, if you have a lot of these searches, with variable history leading up to th

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 3 Days 20 Hours ago by: BGB

Debugging program code is a lot easier in general, since its state often reduces down more easily to a binary choice of "works" or "doesn't work". One can generally look at the state of the program, when and where it misbehaves, and fi

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 4 Days ago by: Quadibloc

This is true. But since Excel is an _application_, as opposed to an _operating system_, the problem of testing it adequately is not necessarily as impossible as the problem of testing a Windows Update. John Savard

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 4 Days ago by: MitchAlsup

< Extensive testing may still be inadequate. < I was brought in as a consultant (1992) to figure out why a design was taking so long to debug. After only a couple of weeks I put my finger on the issue. There were 2 large state machines, ea

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 4 Days ago by: George Neuner

According to the book by Steve Maguire, Excel was extensively tested. Maguire worked for Microsoft, so you can make of that what you will. George

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 4 Days 21 Hours ago by: BGB

Some more fiddling later, I am now thinking yeah, maybe it is actually due to a bug... Still annoying though, but at least I have a partial workaround. Well, in my case, I still have enough of my own bugs. Spent a while trying to hu

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 5 Days 10 Hours ago by: Quadibloc

PC-DOS 1.0? Paper-tape BASIC for the Altair? Otherwise, I can't _think_ of any examples offhand, but I didn't want to go too far, in case I might be accused of bias against Microsoft. John Savard

Re: binary search vs. hash tables (was: Ill-advised use of CMOVE) (thread)

comp.arch

Posted: 5 Days 11 Hours ago by: Michael S

I am pretty sure that it's unrelated. It seems Google found a way to make newer versions of Firefox unusable by exploiting the way they handle mistakes in Cross-Origin Requests. Under certain conditions Firefox just stops reacting, not

Re: branchless binary search (thread)

comp.arch

Posted: 5 Days 20 Hours ago by: Anton Ertl

Yes, a CPU with the Spectre fix that Mitch Alsup and I have advocated would load the cache line speculatively, but would not put the result in the (lower-level) cache unless the load was committed. So no evictions for mispredicted loads.

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 6 Days ago by: MitchAlsup

< like they've done how many times before < Has MS EVER released a product AFTER adequate testing has been applied ? < < Not that Apple has that stellar of a record........

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 6 Days ago by: MitchAlsup

< I have tried to steer clear how My 66000 implementations predict branches and when and if. Low end machines may have no prediction at all (just search ahead in the instruction buffer and prefetch.) Higher end machines will be taking more

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 6 Days ago by: BGB

Possible, though having the OS automatically delete drivers and then refuse to let them be reinstalled by end users would seemingly imply that "something" is going on here (as opposed to it being a normal software bug).

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 6 Days 1 Hour ago by: Quadibloc

No malice need be suspected when Microsoft being clumsy and stupid and releasing updates without adequate testing, like they've done how many times before, is sufficient. Although, that may be unfair to Microsoft, as the number of poss

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 6 Days 1 Hour ago by: Quadibloc

It's not like a computer architecture has feelings that could be hurt by doing so. John Savard

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 6 Days 3 Hours ago by: Ivan Godard

Two on Gold, but shared with the case bodies so likely zero.

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 6 Days 3 Hours ago by: Ivan Godard

Well, sort of, for the way you handle branches and prediction anyway. But not so much when you can have a four-way branch in a single bundle, with less than one prediction for that. The reason search can be smaller than table is that m

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 6 Days 3 Hours ago by: Ivan Godard

Still is on Mill, but then Mill uses exit prediction, not branch prediction, and so has many fewer predictions to make - and misses to pay for - than BGBCC. Sort of a static version of trace caches.

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 6 Days 3 Hours ago by: MitchAlsup

< My 66000 implementations avoid the LD-Align stage as the instruction buffer can absorb cache-access-width every cycle, avoids forwarding since it knows the displacement will be added to IP (twice), so doing this stuff in FETCH greatly s

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 6 Days 4 Hours ago by: BGB

Just the "BRA (PC, Rt)" by itself takes 8 cycles, since it effectively flushes the pipeline... The variability in this case is due to which branch it goes down and whether it hits or misses. My emulator doesn't really model the br

Re: branchless binary search (thread)

comp.arch

Posted: 6 Days 4 Hours ago by: Stefan Monnier

If your CPU is careful to avoid Spectre attacks, this should not be the case, right? Stefan

Re: binary search vs. hash tables (was: Ill-advised use of CMOVE) (thread)

comp.arch

Posted: 6 Days 4 Hours ago by: Brett

There was a corrupt spam post since deleted that broke the NewTap reader on iOS, had to delete comp.arch, re-add and restart the app.

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 6 Days 5 Hours ago by: MitchAlsup

< Which is 12 too many........ < Mine takes 4 cycles uniformly. < Note: I was only talking about switches in general and not participating in the search portions of this thread. < By encoding labels as word displacements off the IP, one ca

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 6 Days 5 Hours ago by: BGB

OK. In my case, it would be something like: SUB Ri, Base, Rt CMPHI Limit, Rt //unsigned Rt>Limit BT .default ADD Rt, Rt //(assuming a table of 32-bit branch ops) BRA (PC, Rt) //AKA: "BRAF Rt" ..
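The sequence above (subtract the base, one unsigned compare against the limit, then an indexed branch) can be sketched in C. This is only an illustration of the general range-check-plus-table idea: the handler functions, the case range 10..12, and the use of a function-pointer table instead of a table of branch instructions are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

typedef int (*handler_fn)(void);

/* Hypothetical case bodies for "case 10", "case 11", "case 12". */
static int handle_10(void) { return 100; }
static int handle_11(void) { return 101; }
static int handle_12(void) { return 102; }

#define CASE_BASE  10   /* lowest case label (assumed) */
#define CASE_LIMIT 2u   /* highest label minus base (assumed) */

/* Returns the selected handler's result, or -1 for the default case. */
int switch_dispatch(int32_t i)
{
    static const handler_fn table[] = { handle_10, handle_11, handle_12 };
    assert(sizeof table / sizeof table[0] == CASE_LIMIT + 1u);

    /* Unsigned subtract: values below the base wrap around to huge numbers,
       so a single unsigned compare rejects both too-small and too-large. */
    uint32_t t = (uint32_t)i - (uint32_t)CASE_BASE;
    if (t > CASE_LIMIT)
        return -1;              /* the ".default" path */
    return table[t]();          /* indexed jump through the table */
}
```

For example, `switch_dispatch(11)` goes through the table, while `switch_dispatch(9)` and `switch_dispatch(-5)` both fall to the default path after the same single compare.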

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 6 Days 6 Hours ago by: BGB

Looks into it a little more: Turns out the R4300 in the N64 did have rounding-modes / ..., it was more just a common emulator issue to not bother with respecting the rounding-modes, sometimes leading to issues if the behavior was diff

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 6 Days 6 Hours ago by: MitchAlsup

< My 66000 has tabularized jumps directly (and PIC, too) For a short switch clause (less than 256 instructions in the switch clause) The tabularized jump is ALWAYS smaller than a series of if-elses. < So, the instruction looks like:: <

branchless binary search (was: Ill-advised use of CMOVE) (thread)

comp.arch

Posted: 6 Days 7 Hours ago by: Anton Ertl

I searched around a bit, and found the following code in <https://stackoverflow.com/questions/11360831/about-the-branchless-binary-search> by BeeOnRope (Travis Downs): int needle; // value we are searching for int *base = ...; // base po
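The quoted code is cut off in this listing, so what follows is not a reproduction of it. It is a minimal sketch of the general branchless binary search idea under my own assumptions: a sorted `int` array, a lower-bound style result, and a hypothetical function name. Whether the conditional advance compiles to a conditional move rather than a branch depends on the compiler and target.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the index of the first element >= needle in the sorted array
   base[0..n-1], or n if every element is smaller. */
size_t lower_bound_branchless(const int *base, size_t n, int needle)
{
    assert(base != NULL || n == 0);
    const int *p = base;
    size_t len = n;
    while (len > 1) {
        size_t half = len / 2;
        /* The loop trip count depends only on n, not on the data; the only
           data-dependent step is this select, which compilers typically
           (but not necessarily) turn into a CMOV-style instruction. */
        p += (p[half - 1] < needle) ? half : 0;
        len -= half;
    }
    /* At most one candidate element is left. */
    if (len == 1 && *p < needle)
        p++;
    return (size_t)(p - base);
}
```

For the array {1, 3, 5, 7, 9}, searching for 6 returns index 3 (the position of 7), and searching for 10 returns 5 (one past the end).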

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 6 Days 8 Hours ago by: BGB

Agreed. Before I added jump-tables to BGBCC, binary search was the primary form of "switch()". Say: 1..5 case labels: Use if-Else 6+: Divide space in half, recurse on left and right halves. Compare-and-branch to right
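As a rough illustration of the strategy described above, here is a run-time analogue in C: halve the sorted label set while more than five labels remain, then finish with a short chain of equality tests. A compiler would emit this as nested compare-and-branch code rather than a loop, and the function name, the return convention, and the exact split point are assumptions for the example, not details of BGBCC.

```c
#include <assert.h>
#include <stddef.h>

/* Looks up `key` among n sorted, distinct case labels.
   Returns the index of the matching label, or -1 for "default". */
int find_case_label(const int *labels, size_t n, int key)
{
    assert(labels != NULL || n == 0);
    size_t lo = 0;
    /* 6 or more labels: split the range in half and keep one side. */
    while (n > 5) {
        size_t half = n / 2;
        if (key >= labels[lo + half]) {   /* compare-and-branch to the right half */
            lo += half;
            n -= half;
        } else {
            n = half;
        }
    }
    /* 5 or fewer labels: plain if-else chain. */
    for (size_t i = 0; i < n; i++)
        if (labels[lo + i] == key)
            return (int)(lo + i);
    return -1;
}
```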

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 6 Days 8 Hours ago by: BGB

Excluding NaNs, yeah, sorta... The FCMPGT operator is basically: Perform an unsigned integer compare; Twiddle the result based on the sign bits. 00: A>B 01: 1 10: 0 11: !(A>B) In terms of unsigned numbers, Inf,
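The sign-bit table above can be tried out in C. This is a sketch under several assumptions: `float` is IEEE 754 binary32, NaNs are excluded as stated, and the function name is made up. One deliberate deviation: for the both-negative case the listing gives `!(A>B)`, while the sketch uses `B>A` on the raw bits so that two equal negative inputs compare as not-greater; that is my guess at the intent, not something the excerpt confirms. Note also that, like the table, the sketch reports +0.0 as greater than -0.0.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a float's bits as an unsigned integer (assumes binary32). */
static uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* a > b using only an unsigned integer compare plus sign-bit fix-ups.
   NaN inputs are not handled. */
int fcmpgt_sketch(float a, float b)
{
    assert(sizeof(float) == sizeof(uint32_t));
    uint32_t ua = float_bits(a), ub = float_bits(b);
    unsigned signs = ((ua >> 31) << 1) | (ub >> 31);
    switch (signs) {
    case 0:  return ua > ub;   /* 00: both non-negative, compare directly */
    case 1:  return 1;         /* 01: a non-negative, b negative */
    case 2:  return 0;         /* 10: a negative, b non-negative */
    default: return ub > ua;   /* 11: both negative, order is reversed */
    }
}
```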

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 6 Days 9 Hours ago by: MitchAlsup

< A) do not do predication with condition codes ! B) do not waste bits in each instruction to support predication ! C) is anything more needed ?!?

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 6 Days 9 Hours ago by: MitchAlsup

< Any machine that is Turing complete can run FORTRAN. That does not mean that the machines are efficient at running FORTRAN or designed for running FORTRAN, or as good at running FORTRAN as a machine that actually was.

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 6 Days 9 Hours ago by: MitchAlsup

< Do these get the "right answer" when NaNs, infinities, and denorms are used ?

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 6 Days 10 Hours ago by: MitchAlsup

< In My 66000 case: FCMP Rt,R4,R4 // at this point Rt<6> tells you if the operands are comparable // That is neither are NaN. // at this point Rt<7> tells you if the operands are uncomparable

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 6 Days 10 Hours ago by: EricP

I'm not suggesting otherwise. ARM has had at least 4 major implementations of OoO conditional execution and predication, and they would have learned a thing or two along the way. It might be nice to see where the warts are, and any ideas

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 6 Days 12 Hours ago by: Thomas Koenig

Suitable to be adapted to the task at hand, which is what software is for. How to deal with missing data is totally dependent on the use case, and a one-size-fits-all solution is bound to bring nothing useful to the majority of use cases

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 6 Days 12 Hours ago by: Thomas Koenig

There are quite a few according to ISO/IEC 1539:2018, subclause 3.113.

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 6 Days 12 Hours ago by: Anton Ertl

Maybe not with babies (but that differs by culture), but it's common for architectures. I can understand why AMD called it x86-64 at first, but * it's based on the x86 misconception (as if IA-32 was an extended 8086 instruction set).

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 6 Days 12 Hours ago by: Thomas Koenig

Splines (or, more general, interpolation) are a big field.

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 6 Days 16 Hours ago by: Michael S

That does not fit well into a typical bsearch implementation. Besides, on this particular CPU it would be harmful, because it can process only 1 Dcache miss at a time. You're beginning to see the light!

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 6 Days 16 Hours ago by: Michael S

Not worth a complexity of it. I don't mean "complexity" in "computation complexity" sense, but in "programmer's effort" sense. Also in a sense of learning new discipline, non-related to application domain.

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 6 Days 17 Hours ago by: Ivan Godard

And not worth building a perfectHash on the fly? There are dynamic construction algorithms that are O(n) IIRC.

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 6 Days 17 Hours ago by: aph

It could work if a JIT had an accurate model of the branch predictor, or (better) a way to ask a branch predictor for its opinion. But branch predictors are the most secret kind of secret sauce, so that won't happen. On the other hand, a

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 6 Days 17 Hours ago by: Anton Ertl

For binary search it's relatively cheap to do without (at least someone showed code here many years ago that looked convincing). Well, that's the other problem with binary search: In a hash table you typically notice inequality by looki

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 6 Days 17 Hours ago by: Michael S

It never changes *after program initialization*. But it does not have to be identical on each activation of the program.

Re: binary search vs. hash tables (was: Ill-advised use of CMOVE) (thread)

comp.arch

Posted: 6 Days 17 Hours ago by: Michael S

More the strangerer

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 6 Days 17 Hours ago by: Ivan Godard

If never changed, why not build a perfectHash table?

Re: binary search vs. hash tables (was: Ill-advised use of CMOVE) (thread)

comp.arch

Posted: 6 Days 17 Hours ago by: Michael S

Sorry for all the 'test' posts. I am trying to figure out if Google, in their eternal wisdom, had broken posting on Google Groups completely, or if they did it selectively, just on Firefox. So far it looks like the latter.

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 6 Days 18 Hours ago by: Michael S

6 cycles on the CPU that was used in this project. The CPU in question has no CMOVE/Select. Emulating it will take 4 instructions. However my point is that most of the time is consumed in strcmp(). bsearch loop itself is relatively i

Re: binary search vs. hash tables (was: Ill-advised use of CMOVE) (thread)

comp.arch

Posted: 6 Days 18 Hours ago by: Michael S

test

Re: binary search vs. hash tables (was: Ill-advised use of CMOVE) (thread)

comp.arch

Posted: 6 Days 18 Hours ago by: Michael S

test

Re: binary search vs. hash tables (was: Ill-advised use of CMOVE) (thread)

comp.arch

Posted: 6 Days 18 Hours ago by: Michael S

test

Re: "Tachyum Universal Processor" (thread)

comp.arch

Posted: 6 Days 18 Hours ago by: Quadibloc

Oh, that's a pity. I saw the part about its vector capabilities, and was strongly encouraged by it. John Savard

Re: binary search vs. hash tables (was: Ill-advised use of CMOVE) (thread)

comp.arch

Posted: 6 Days 18 Hours ago by: Michael S

test

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 6 Days 19 Hours ago by: BGB

Agreed. Could they be useful? Maybe. Do they make sense to add to FPU semantics? Doubtful. Could make sense to add it via adding something like a NaN box type-tag checking instruction, which could also be potentially used to aide code

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 6 Days 19 Hours ago by: BGB

FABS is at least cheap, albeit infrequently used. Did "make the cut" in my case, mostly due to its cheapness. There is also FNEG, which is also cheap, and more commonly used. FMIN and FMAX are less cheap, also rarely used. They do no

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 6 Days 19 Hours ago by: BGB

Hmm, my case, could do it something like: FCMPEQ R4, R4 FADD?T R8, R4, R8 This would filter out all NaNs because only NaNs compare not-equal to themselves... I guess it is theoretically possible one could have a tag-checking i
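The same self-compare trick works in C, since a NaN is the only value for which `x == x` is false. The sketch below sums an array while skipping NaNs; the function name is hypothetical, and it assumes the compiler honours IEEE comparison semantics (options such as `-ffast-math` may optimize the test away).

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Sums v[0..n-1], ignoring NaN entries. */
double sum_skip_nan(const double *v, size_t n)
{
    assert(v != NULL || n == 0);
    double acc = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (v[i] == v[i])      /* false only for NaN: the FCMPEQ R4, R4 step */
            acc += v[i];       /* the predicated FADD */
    }
    return acc;
}
```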

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 6 Days 23 Hours ago by: BGB

Proper solution is to glue both names together and call him "Joe-Bob". But, one could be like, "By IA-32, do you really mean iAPX 386?..." Or, debate between the relative merits of AMD64, Intel64, or EM64T. I mostly prefer the names

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 7 Days ago by: BGB

The ISA is still 64-bit in these profiles, just using a 32-bit physically-mapped address space. With bits (63:32) being ignored for memory accesses and similar. Generally, there is very little RAM or ROM because: The relevant FPGA

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 7 Days 3 Hours ago by: Quadibloc

IA-64, of course, for the benefit of those not familiar with these abbreviations, being the infamous Itanium. Oh, dear, that's alliterative. Perhaps it should be tied up with the crimson bands of Cytorrak for safekeeping. John Savard

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 7 Days 3 Hours ago by: MitchAlsup

I just don't want to invest anything more in a 32-bit world. As long as the 64-bit world has simple processors where one can fit 100 of them on a 7nm die. Each processor in real silicon is now so tiny they put dozens of great big complic

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 7 Days 4 Hours ago by: BGB

I was more following an approach more like that used by things like MSP430 or Cortex-M or similar, where it is assumed that code is built for a processor core with a matching feature-set. One can then enable or disable features to get

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 7 Days 4 Hours ago by: George Neuner

SQL DBMS typically will zero fields which are null - but the value used is not relevant: each field in a record tuple has associated metadata that indicates (among other things) whether its value is valid. George

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 7 Days 5 Hours ago by: Anton Ertl

First, AMD called it x86-64, later AMD64 (until now). Later Intel called it IA32e, then EM64T, and now Intel 64. Microsoft calls it x64. I may have missed/forgotten some names. The Intel names are funny, especially how they tried to d

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 7 Days 5 Hours ago by: MitchAlsup

< < You don't name a baby "bob" and then years later rename him "joe" < < < Would it have been possible for Java to define FP calculations in a way that x87 would have been acceptable as was ? < If so, why was this not done ?

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 7 Days 5 Hours ago by: MitchAlsup

< It was known internally as x86-64 even before it became known as x86-64 externally. In fact, your use of AMD64 caught me completely by surprise !!

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 7 Days 5 Hours ago by: MitchAlsup

< Perhaps because their predication model was not well thought out !! < This does not mean that all forms of predication are difficult in all kinds of instruction queue devices and mechanisms.

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 7 Days 5 Hours ago by: Ivan Godard

Sounds like a 1401.

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 7 Days 6 Hours ago by: Anton Ertl

The 387 instructions were already available on IA-32. x86-64 is an older name for AMD64. What is a problem in compilers is the conversion that happens when storing a 387 register with a 53-bit mantissa to a 64-bit location in memory on

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 7 Days 6 Hours ago by: EricP

This was based on their observation that operand data values arrive before predicate values. They believed that freeing up reservation stations ASAP was worth the extra logic in their design. "It" and "this" means their proposed design.

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 7 Days 6 Hours ago by: Terje Mathisen

In one DB where I had to reverse engineer the binary storage format, all variables used a leading byte count, followed by a little-endian array of bytes, just large enough to hold the actual value. I.e. null as the leading byte always
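A decoder for a field laid out that way (one count byte, then that many little-endian payload bytes) might look like the sketch below. It is not the actual format being described: the 8-byte maximum width, the function name, and the error convention are assumptions, and since the excerpt is cut off before saying how a zero count is interpreted, the sketch simply yields the value 0 for it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decodes one field from buf: buf[0] is the payload byte count, followed by
   that many little-endian bytes. Stores the value in *out and returns the
   number of bytes consumed, or 0 if the buffer is too short or the payload
   is wider than 8 bytes (an assumed limit). */
size_t decode_counted_le(const uint8_t *buf, size_t buflen, uint64_t *out)
{
    assert(buf != NULL && out != NULL);
    if (buflen < 1)
        return 0;
    size_t count = buf[0];
    if (count > 8 || buflen < 1 + count)
        return 0;
    uint64_t value = 0;
    for (size_t i = 0; i < count; i++)
        value |= (uint64_t)buf[1 + i] << (8 * i);   /* least significant byte first */
    *out = value;
    return 1 + count;
}
```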

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 7 Days 6 Hours ago by: Quadibloc

Or, if you don't want to confuse people by expecting them to know the up-to-date names for these things, but instead want to use the original names, perhaps something like EM64T/AMD64 (now known as x86-64) would be clearest. John Sava

Re: binary search vs. hash tables (was: Ill-advised use of CMOVE) (thread)

comp.arch

Posted: 7 Days 7 Hours ago by: Michael S

test

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 7 Days 8 Hours ago by: MitchAlsup

< < Is it not x86-64 (AMD64) ? < < This is a problem with the way Java defined FP not in x87. <

Re: binary search vs. hash tables (was: Ill-advised use of CMOVE) (thread)

comp.arch

Posted: 7 Days 8 Hours ago by: John Levine

Nope. There's SEARCH (serial), SEARCH ALL (binary), and SORT (sort it so you can do binary search.) My impression is that for more complicated stuff, you put it in a database and use SQL.

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 7 Days 8 Hours ago by: MitchAlsup

Code compiled for one member of My 66000 will run on all members of My 66000. I chose differently, so did Mill but they solved the problem differently. My 66000 does not have E or F or C or A My 66000 only has 32×64-bit registers and 4×6

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 7 Days 8 Hours ago by: EricP

I can't find any details on those ARM uArch's at all. None. Almost all predication studies are proposals by Intel for doing OoO Itanium. The only paper I found is not by ARM but looks at OoO predication implementation, aka Guarded Execut

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 7 Days 8 Hours ago by: MitchAlsup

< Arithmetically: none; But it takes 1 cycle whereas -x takes 0 cycles. < FMAX Rd,Rx,-Rx < That's reap (not rip). <

Re: binary search vs. hash tables (was: Ill-advised use of CMOVE) (thread)

comp.arch

Posted: 7 Days 8 Hours ago by: John Levine

Maybe, but remember that COBOL was designed for memories that seem impossibly small now. Binary search needs no extra storage, and the code is very simple, needing no fancy arithmetic for the hashing or dealing with collisions. I can thi

Re: binary search vs. hash tables (was: Ill-advised use of CMOVE) (thread)

comp.arch

Posted: 7 Days 8 Hours ago by: Stephen Fuld

You make good points. I don't know if current versions of COBOL offer a hash table as an alternative for the search verb.

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 7 Days 9 Hours ago by: John Levine

That's how SQL works, so other SQL databases like PostgreSQL and MySQL and MS SQL server handle nulls, too. Dunno what DB2 does for nulls but MySQL keeps a bit map external to the data, so there are no reserved values. You can null any k

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 7 Days 9 Hours ago by: EricP

Yes, for compare operations SQL's NULL handling could be a good starting point. But SQL's arithmetic defines that any expression with a null operand is null, which does not sound like what people might want. It sounded like "Value + Null
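The difference between the two behaviours can be made concrete with a small C sketch. It models a nullable value as a value plus a validity flag, with one function that propagates null the way SQL arithmetic does and one that skips nulls the way SQL's SUM aggregate does. The type and function names are hypothetical and this is not how any particular DBMS stores or evaluates nulls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    double value;    /* meaningful only when is_null is false */
    bool   is_null;
} nullable;

/* SQL-style arithmetic: the result is null if either operand is null. */
nullable null_add(nullable a, nullable b)
{
    nullable r = { 0.0, true };
    if (!a.is_null && !b.is_null) {
        r.value = a.value + b.value;
        r.is_null = false;
    }
    return r;
}

/* Aggregate-style sum: null entries are skipped; the result is null only
   if there was no non-null entry at all (as with SQL SUM). */
nullable sum_skip_null(const nullable *v, size_t n)
{
    assert(v != NULL || n == 0);
    nullable r = { 0.0, true };
    for (size_t i = 0; i < n; i++) {
        if (!v[i].is_null) {
            r.value += v[i].value;
            r.is_null = false;
        }
    }
    return r;
}
```

With the inputs 1, NULL, 2, chaining `null_add` yields NULL, while `sum_skip_null` yields 3.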

binary search vs. hash tables (was: Ill-advised use of CMOVE) (thread)

comp.arch

Posted: 7 Days 10 Hours ago by: Anton Ertl

Sounds to me like it is common because it is in Cobol, not the other way round. Certainly associative arrays in more recent languages are typically implemented as hash tables, so much that they have sometimes (Perl, Ruby and Seed7) been

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 7 Days 10 Hours ago by: Anton Ertl

Slow and complex. The question is how predictable your searches are. If they are unpredictable, branches cost ~10 cycles (~20 per misprediction) on average on a typical desktop CPU, and a "branchless" variant is much faster. A simple

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 7 Days 11 Hours ago by: Stephen Fuld

You might want to look at DB2 (IBM's relational database software). It has supported missing values, called "Nulls", for decades, and it supports what are called aggregate functions such as sum, max, count, etc. Note also that Nulls

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 7 Days 13 Hours ago by: EricP

Binary search is one of the decompositions for SWITCH statements with discontiguous blocks.

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 7 Days 13 Hours ago by: EricP

According to page, the column for which ARMv7 are OoO lists: Cortex-A9, Cortex-A12, Cortex-A15, Cortex-A17, Qualcomm Scorpion (partial), Qualcomm Krait, Swift https://en.wikipedia.org/wiki/Comparison_of_ARMv7-A_cores I'll see if I can f

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 7 Days 16 Hours ago by: Stephen Fuld

In business applications, it at least used to be common enough that COBOL actually has source code statements to tell the compiler that an array (though they don't call it that) is sorted and that when searching it, generate binary sea

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 7 Days 16 Hours ago by: Anton Ertl

One example that comes to my mind is IA-32/AMD64. They have the 387 registers (80 bits wide), and they have xmm/ymm/zmm registers, which can contain 32-bit or 64-bit FP numbers. The 387 part has the option to limit the mantissa to 53 or

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 7 Days 17 Hours ago by: BGB

Only way to not eat multiple lanes with 128-bit SIMD would be for each lane to have a 128-bit operand path (with 128-bit register ports, ...). As a result of the 64-bit data widths, running 128-bit data through the pipeline this requ

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 7 Days 18 Hours ago by: Michael S

Sorted array + binary search is a reasonable option for data structure+algorithm when [past initialization phase] the dictionary is either totally or almost totally static. I used it, for example, for generation of Dynamic HTML on resourc

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 7 Days 18 Hours ago by: Anton Ertl

.... Please demonstrate "much better". Show assembly code produced for the Fortran code above, and show that it is much better than the code produced for s = sum(a) (which Terje Mathisen would do for computing the same result, if FP nu

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 7 Days 19 Hours ago by: Anton Ertl

T32 includes if-then-else instructions, so it has to deal with predicated instructions, too. As for "primary instruction set", my understanding is that T32 was designed such that it could be easily decoded into the same uops as A32, and

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 7 Days 20 Hours ago by: Anton Ertl

Anything wrong with x & 0x7fffffffffffffff ? Given that you have to bear the costs of sign-magnitude, you should also rip the benefits. - anton
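
The masking idea in this post can be sketched in C as follows. This is only an illustration, assuming `double` is IEEE 754 binary64 with the sign in bit 63; `fabs_by_mask` is a hypothetical helper name, not something from the thread or from a library.

```c
#include <stdint.h>
#include <string.h>

/* Absolute value of a binary64 by clearing the sign bit.
   fabs_by_mask is a hypothetical name used for illustration only. */
static double fabs_by_mask(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);        /* well-defined type punning */
    bits &= UINT64_C(0x7fffffffffffffff);  /* clear bit 63, the sign bit */
    memcpy(&x, &bits, sizeof x);
    return x;
}
```

Because the format is sign-magnitude, no comparison or negation is needed: the magnitude bits pass through untouched.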

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 7 Days 20 Hours ago by: Anton Ertl

As is often the case, this depends very much on the inputs. And then there's the question of why one uses binary search in the first place. The only time I remember using it is when searching for the try-catch block of an exception in a

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 7 Days 20 Hours ago by: Anton Ertl

I.e., none. - anton

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 7 Days 22 Hours ago by: Quadibloc

I checked; the situation is described on the page http://www.quadibloc.com/arch/ar010102.htm and there _is_ a potential problem even with IEEE 754 floats. Since they're in internal form in registers, they may have a few extra guard bi

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 7 Days 22 Hours ago by: Quadibloc

My original Concertina architecture has that characteristic for _some_ floating-point types. For IEEE 754, though: - when they're stored in the regular floating-point registers, they're stored in an "internal form", similar to temporar

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 7 Days 23 Hours ago by: BGB

Probably assumes that one has NaN's that contain useful information of some sort. Possibly, though generating -0 (such as from the result of multiplying a negative number with 0) seems to cause Quake's "progs.dat" VM to break... But

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 8 Days ago by: MitchAlsup

< < Arguments in containers smaller than 64-bit are enlarged to 64-bit containers. Arguments larger than 64-bit containers are enlarged next larger MOD-64-bit container. < First 8 containers are passed in R1-R9 {arg[0]..arg[8]} Remaining

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 8 Days ago by: Stephen Fuld

That was Anton's and my point. Compilers aren't very good at doing this, so my suggestion is don't ask them to.

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 8 Days ago by: MitchAlsup

< < But you don't even need to encode which bit to clear.

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 8 Days ago by: Ivan Godard

FABS is a bitClear instruction.

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 8 Days ago by: Ivan Godard

Idle curiosity: what do you do for a function that has more arguments than registers? Also: how do you do VARARGS?

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 8 Days 3 Hours ago by: MitchAlsup

< Yes, but I am looking both forwards and backwards at the same time. I have SIMD of 64×2^k for some integer k determined by the implementation. So, on lowest end machines, VVM runs 1 iteration per cycle, in middle range implementations

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 8 Days 3 Hours ago by: MitchAlsup

< You do not have to perform a comparison (cost of integer adder) as all you need to do is make sure 1-bit is clear (0). < < In higher end machines it is performed by suppressing the HoB from asserting 1 on the forwarding path (about 1/4rd

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 8 Days 4 Hours ago by: BGB

No SIMD file in my case, only GPRs. SIMD operations exist, in one of several "flavors": Those that operate on 64 bits, and use a single GPR; Those that operate on 128-bits, and use a pair. Which in turn has other effects: Ope

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 8 Days 4 Hours ago by: Stefan Monnier

Interesting. Do you happen to know exactly where these power savings come from? I guess it depends on how the instruction is implemented, of course (i.e. does it have its own separate implementation in the FP-ALU, or is it decoded into som

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 8 Days 5 Hours ago by: MitchAlsup

< < Essentially all of my FORTRAN exposure after 1983 was FORTRAN programmers and compiler writers coming to me asking what are the proper instructions for "this" piece of FORTRAN code, and the converse: Why did the FORTRAN compiler spit o

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 8 Days 5 Hours ago by: BGB

While it is not "difficult" to check whether the mantissa is 0, it isn't entirely free either. In my case, I used the top 4 bits of the mantissa: 0000: Inf (Assumes the rest of the bits are also zero) Else: NaN No real practical

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 8 Days 5 Hours ago by: MitchAlsup

< My 66000 does all SIMD stuff as vectorized loops. No need for a SIMD register file, either. < # define int int64_t Vectorization, instead. < Not with VVM. PLUS: when you widen up the machine capabilities you don't need new register resou

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 8 Days 6 Hours ago by: MitchAlsup

< Sooner or later you will realize that when operands can all be NaNs, you should choose one over the rest. My 66000 has the following rules: 3-operands:: Rs3 is chosen over Rs2 over Rs1 2-operands:: Rs2 is chosen over Rs1 1-operand:: Rs1

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 8 Days 6 Hours ago by: BGB

I know of at least several major ISA's with this particular design issue (typically between dedicated FPU registers, and doing FPU operations in SIMD registers). I am currently in the "everything goes in GPRs camp": If one has 64-b

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 8 Days 6 Hours ago by: Thomas Koenig

I assume you didn't use much Fortran 90+ compilers, judging by some of your previous comments :-) And my actual point: No sense in implementing something in a floating point format that can be much better handled by code like the one

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 8 Days 6 Hours ago by: Michael S

But most of them (to my best knowledge, all of them) don't have "classic" A32 ISA as their primary instruction set. The older ones are meant to run T32 code fast, the newer optimized for aarch64. So, I wouldn't be surprised if all existin

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 8 Days 7 Hours ago by: BGB

Possibly, though I had found that if one generates a -0, some software will misbehave. It is seemingly necessary to force all 0 to be positive zero for software compatibility reasons. This means one has a few cases, eg: A or B is Na

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 8 Days 7 Hours ago by: MitchAlsup

< I have used plenty of processors that had quality implementations of FORTRAN but did not have that as an instruction.

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 8 Days 8 Hours ago by: John Dallman

If I describe them, you'll be able to identify the vendor, which I'd prefer to avoid. It's hardly fair to damn them now for a bad presentation about 20 years ago. John

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 8 Days 8 Hours ago by: Thomas Koenig

Binary search was also recently quoted here as an example where CMOV makes a lot of sense - running a fixed number of iterations for each size (to keep the branch predictors happy). Ideally, a compiler would contain heuristics when a b
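
For reference, one common way to write such a search in C is shown below: the loop count depends only on the array size, and the only data-dependent step is a select that a compiler may (but is not guaranteed to) turn into CMOV. This is a generic sketch, not code from the thread; `lower_bound_branchless` is a hypothetical name.

```c
#include <stddef.h>

/* Index of the first element >= key in sorted a[0..n-1], or n if none.
   The number of iterations is a function of n alone. */
static size_t lower_bound_branchless(const int *a, size_t n, int key)
{
    const int *base = a;
    size_t len = n;
    while (len > 1) {
        size_t half = len / 2;
        /* select between two pointers; intended to become a CMOV */
        base = (base[half - 1] < key) ? base + half : base;
        len -= half;
    }
    /* at most one candidate element remains */
    return (size_t)(base - a) + (size_t)(n != 0 && base[0] < key);
}
```

Whether this beats a branchy version depends on how predictable the search keys are, which is the point being argued in the thread.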

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 8 Days 8 Hours ago by: Anton Ertl

It certainly needs to have several instructions under consideration when decoding such an if-then-else and the instructions it covers. .... .... Ok, but as soon as the predicate is then available, the result is made available. So this

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 8 Days 8 Hours ago by: Thomas Koenig

Any processor which implements Fortran.

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 8 Days 8 Hours ago by: Anton Ertl

Which CPU has this instruction? Set up a straw man ... .... and now beat on it. - anton

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 8 Days 9 Hours ago by: MitchAlsup

< Or allocate those values into registers. < < Just because one has good predication does not remove the utility of CMOV. < CMOV is for those hard to predict cases. Do not "bet against" your branch predictor.

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 8 Days 9 Hours ago by: MitchAlsup

< My 66000 compare instruction has a result set of TRUE, FALSE, and NOT_COMPARABLE. When NOT_COMPARABLE, both the TRUE and FALSE conditions are set to zero. < My 66000 compare to zero Branch (and PRED) instructions provide Branch on (!oper

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 8 Days 9 Hours ago by: EricP

Yes, that is the idiom that forces it to always load Ry,Rz,Ra,Rb and then skips calculation. Like CMOV it requires preparing data for both paths, then skips using it. It's slightly better than CMOV because it can potentially prune the MUL

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 8 Days 9 Hours ago by: Thomas Koenig

Sounds like a broken design to me. What exactly were the different kinds of FP registers, and how were they different?

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 8 Days 9 Hours ago by: MitchAlsup

< Do you also avoid incrementing the number of values in the summation {So that you can compute average} ?

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 8 Days 9 Hours ago by: MitchAlsup

< This is an argument against having more than 1 register set.

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 8 Days 9 Hours ago by: Anton Ertl

This predication implementation is about reducing latency, not about reducing resource consumption. I think that in most code latency rather than resources limits the performance. Register allocation? But sure, if you have resource-li

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 8 Days 9 Hours ago by: MitchAlsup

< Needing to test for an explicit -0.0 has similar problems. < Then you are not compliant with IEEE 754-2008 so why bother with the rest of it. < Yes, and FMAC is a bit larger than a FMUL and an FADD. But not hideously so. < Only if you

Re: Suresh kumar Devanathan fundamental theorem of Transistor Design (thread)

comp.arch

Posted: 8 Days 9 Hours ago by: Suresh Devanathan

Ids = K[(Vgs - Vt) Vds - 1/2 Vds^2] + deltax * int_{-K}^{K} 0 dx; Ids = K[(Vgs - Vt) Vds - 1/2 Vds^2] + A

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 8 Days 9 Hours ago by: Stephen Fuld

After going through all of the responses in this thread, I am led to the following: CMOV was probably a good idea when it was first developed, but the development of highly accurate hardware branch predictors has lessened its utility

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 8 Days 9 Hours ago by: MitchAlsup

< Yes, and especially when written this way:: < PREDcnd {TE} ADD R3,Ry,Rz MUL R3,Ra,Rb

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 8 Days 9 Hours ago by: EricP

If all the vector values are None then the sum is None. The count could be either 0.0 or None. And the average could be None/0.0 = Nan or None, or None/None = None.

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 8 Days 10 Hours ago by: Thomas Koenig

This so strongly depends on the application that any attempt to standardize via a new numeric type is likely to fail in the huge majority of cases.

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 8 Days 10 Hours ago by: Stephen Fuld

If you can compute the sum of the non-missing values, you can certainly compute the count of non-missing values, thus the average is as meaningful as the sum. And, you can get an idea of "how meaningful" by comparing the count of non-

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 8 Days 10 Hours ago by: EricP

None/Null missing values occur all the time in datasets. Applications that can have missing values already deal with this. Currently they just all deal with it individually. It might be nice to standardize the behavior. For example, how

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 8 Days 11 Hours ago by: EricP

So this handles things like: x = cond? y + z : a * b; This merges a PHI and two alternate uOps to eliminate the temp register allocations and copies. But it requires all the source operands y,z,a,b to be loaded so the savings are min
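
Written out in C, the "prepare both paths, then select" shape of that expression looks roughly like this. It is only a source-level illustration of the data flow (all four operands are read regardless of `cond`), not a model of any particular uOp fusion; `select_both_paths` is a hypothetical name.

```c
/* x = cond ? y + z : a * b, with both alternatives computed up front. */
static long select_both_paths(int cond, long y, long z, long a, long b)
{
    long then_value = y + z;   /* needs y and z even when cond is false */
    long else_value = a * b;   /* needs a and b even when cond is true */
    return cond ? then_value : else_value;  /* the PHI / CMOV-style merge */
}
```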

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 8 Days 11 Hours ago by: EricP

It looks like a uOp fusion. It might help in limited situations. This was proposed for OoO predication. It frees up reservation stations ASAP by executing uOps when their data operands are ready and not wait for the guarding predicate.

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 8 Days 11 Hours ago by: Thomas Koenig

s = sum(a,mask=.not. ieee_is_nan(a)) works fine. Simply ignoring values for summation would start leading to "interesting" results when you want to have an average instead of a sum, for example. Introducing new classes of numbers in
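
A C counterpart of that masked sum, which also keeps the count of values actually summed so an average stays meaningful. This is a sketch under the assumption that missing values are encoded as NaN; the struct and function names are hypothetical.

```c
#include <math.h>
#include <stddef.h>

struct present_sum {
    double sum;    /* sum of the non-NaN elements */
    size_t count;  /* how many elements contributed */
};

/* Sum a[0..n-1], skipping NaNs, like sum(a, mask=.not. ieee_is_nan(a)). */
static struct present_sum sum_present(const double *a, size_t n)
{
    struct present_sum r = {0.0, 0};
    for (size_t i = 0; i < n; i++) {
        if (!isnan(a[i])) {
            r.sum += a[i];
            r.count++;
        }
    }
    return r;
}
```

The average is then `r.sum / r.count` when `r.count` is nonzero; what to report when every value is missing is an application decision.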

"Tachyum Universal Processor"

comp.arch

Posted: 8 Days 11 Hours ago by: John Dallman

Does anyone here know anything about this? https://www.tachyum.com/ So far, it seems to consist of PR and FPGAs, with very little hard information available. The claim that it can emulate x86, ARM and RISC-V faster than native hardware is

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 8 Days 11 Hours ago by: John Dallman

It is. A manufacturer once told us we could not expect bit-identical results between two different kinds of FP registers on the same machine, both usable at the same time. For ISVs who use compilers, rather than assembler, that's deadly.

Re: Suresh kumar Devanathan fundamental theorem of Transistor Design (thread)

comp.arch

Posted: 8 Days 13 Hours ago by: Suresh Devanathan

in the day of michael, perun, zuse. 80/20 rule

Suresh kumar Devanathan fundamental theorem of Transistor Design

comp.arch

Posted: 8 Days 13 Hours ago by: Suresh Devanathan

Ids = K[(Vgs - Vt) Vds - 1/2 Vds^2]; dIds = K[(Vgs - Vt) - Vds] dVds; dIds = K[(Vgs - Vt) - Vds] dVds + deltax; Ids = K[(Vgs - Vt) Vds - 1/2 Vds^2] + deltax * int_{-K}^{K} 0 dx; deltax * lim K->infinity int_{-K}^{K} 0 dx = deltax

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 8 Days 15 Hours ago by: Terje Mathisen

I've been back & forth on this issue but lean towards letting None be non-sticky (i.e. let it disappear when that makes sense) vs NaN which is 100% sticky. Terje

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 8 Days 17 Hours ago by: BGB

Though, FWIW, expecting bit-identical floating-point results between architectures or target machines is asking for trouble. Even an inexact implementation of the FPU will still give the same results for each run of a given program on

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 8 Days 18 Hours ago by: BGB

If either Inf or NaN happens, it usually means "something has gone wrong". Having them as distinct cases adds more special cases that need to be detected and handled by hardware, without contributing much beyond slightly different ways

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 8 Days 19 Hours ago by: Anton Ertl

No, we already have NaNs for that. He wants to be able to compute the sum/product of all present data, and what he proposes works for that. Having a result that tells me that there is something missing but not bogus (the distinction you

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 8 Days 23 Hours ago by: Ivan Godard

It preserves the distinction between missing data and bogus data. There are algorithms that are massively sped up by having a None (with suitable None-aware operations present). While bogus is just bogus no matter what you do with it.

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 9 Days ago by: MitchAlsup

< Which makes None just another kind of NaN. {Which nobody uses anyway.}

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 9 Days ago by: MitchAlsup

< This may simplify software implementations, but HW implementations find it easy enough to detect the fraction == 0 as Infinity. In any event, HW does not process calculations with infinities, just special case them to decide what the ans

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 9 Days 3 Hours ago by: Ivan Godard

Don't you want x+None or x*None to be None?

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 9 Days 4 Hours ago by: Terje Mathisen

Afair ieee754 can be traced back to around 1978, i.e. the design of the 8087, didn't Intel consult with Kahan around that timeframe? The original 1985 standard mostly blessed those already-implemented versions of the prior drafts, this

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 9 Days 4 Hours ago by: Stefan Monnier

Indeed, for pretty much all software, the direction has historically been firmly towards ever more bit-for-bit reproducibility (makes a lot of things much easier), and over the years we've learned how to get that at reasonable cost. Neur

Re: Perfect roudning of trogonometric functions (thread)

comp.arch

Posted: 9 Days 4 Hours ago by: Terje Mathisen

It is only exact (i.e. the actual/delivered result) if both guard and sticky are zero, any other combination means that rounding _will_ happen. Terje
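
To make the guard/sticky rule concrete, here is a small C sketch that drops low-order bits from an integer significand with round-to-nearest-even and reports exactness. It is a generic illustration, not code from the paper or the thread; `shift_right_rne` is a hypothetical name and `shift` is assumed to be in 1..63.

```c
#include <stdbool.h>
#include <stdint.h>

/* Discard the low `shift` bits of value, rounding to nearest, ties to even.
   *inexact is set when guard or sticky is nonzero. Requires 1 <= shift <= 63. */
static uint64_t shift_right_rne(uint64_t value, unsigned shift, bool *inexact)
{
    uint64_t kept = value >> shift;
    bool guard = (value >> (shift - 1)) & 1;                        /* first dropped bit */
    bool sticky = (value & ((UINT64_C(1) << (shift - 1)) - 1)) != 0; /* OR of the rest */
    *inexact = guard || sticky;
    if (guard && (sticky || (kept & 1)))
        kept++;   /* above halfway, or exactly halfway with an odd result */
    return kept;
}
```

For example, 22 shifted by 2 is 5.5, a tie that rounds to 6, while 20 shifted by 2 is exactly 5 with both bits zero.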

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 9 Days 5 Hours ago by: MitchAlsup

< Infinity has a defined magnitude :: bigger than any IEEE representable number NaN has a defined non-value :: not comparable to any IEEE representable number (including even itself). < I don't see why you would want Infinity to just be a

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 9 Days 5 Hours ago by: BGB

One could go the other direction: Drop any semantic distinction between Inf and NaN; Effectively, Inf is just a special case of NaN. Define "Denormal As Zero" as canonical; ... FPU operations: ADD/SUB/MUL CMP, CONV

Re: Perfect roudning of trogonometric functions (thread)

comp.arch

Posted: 9 Days 6 Hours ago by: Stefan Monnier

Indeed, it's just the same. For that reason I put it between quotes. In the blog (and in the article), they do point out that it's not just like the classic round to even. They probably kept the name that was used in https://hal.ar

Re: Perfect roudning of trogonometric functions (thread)

comp.arch

Posted: 9 Days 6 Hours ago by: Terje Mathisen

Their two extra bits sound exactly the same as guard & sticky, which is what every FP implementation (hardware or software) needs as a minimum solution in order to handle any rounding mode correctly. I would not call it "round to odd" t

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 9 Days 7 Hours ago by: Michael S

Depends on definition of "significant". Supercomputer vendors (Cray, NEC, other Japanese vendors and one or two Soviet) were not selling many machines, but a rather significant number of FLOPs. Of course, significant by standards of the 1st

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 9 Days 7 Hours ago by: Michael S

That is true for rather narrow definition of "CPU" that does not include TMS320C30/C40 series that definitely was sold in higher numbers than S/360 and all dwarfs combined, and likely by orders of magnitude. I think, but not sure, that Mo

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 9 Days 7 Hours ago by: MitchAlsup

< You CoUlD make a CMOV predictor or you could make CMOV deliver whichever operand arrives first, and then backup and rerun if the prediction is wrong. With the proper infrastructure, this would not take more cycles than simply waiting.

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 9 Days 8 Hours ago by: Anton Ertl

Yes, that's what I was thinking of (of course only if the architectural register name is the same). No, I don't. I pass the result on as soon as it exists (and the predicate is satisfied). If an earlier instruction traps, that will ca

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 9 Days 9 Hours ago by: MitchAlsup

< < < < I give the same physical register name to instructions in the then-clause and in the else clause. One will get delivered, the other one will get suppressed. < < You can have a smart decoder that fin

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 9 Days 9 Hours ago by: MitchAlsup

< Instructions are installed into the cache of instructions in observed order. When a branch is encountered (1 per cycle) if the predictor agrees, then the instructions are played out in the same order as observed. If not, then control is

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 9 Days 12 Hours ago by: EricP

https://en.wikipedia.org/wiki/Weitek

Perfect roudning of trogonometric functions (was: Mixed EGU/EGO floating-point) (thread)

comp.arch

Posted: 9 Days 12 Hours ago by: Stefan Monnier

https://blog.sigplan.org/2022/04/28/one-polynomial-approximation-to-produce-correctly-rounded-results-for-multiple-representations-and-rounding-modes/ They're not arguing for this "round to odd" behavior as the best choice for the ac

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 9 Days 12 Hours ago by: JimBrakefield

https://en.wikipedia.org/wiki/Floating_Point_Systems

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 9 Days 13 Hours ago by: EricP

Ok, so explicit reset only. Functions Exact() and Inexact() to clear or set it, and IsExact() to test it. But the Posit inexact bit, which John is referring to, is in the lsb of the mantissa not in a separate status register. And if it

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 9 Days 13 Hours ago by: Terje Mathisen

For the exact same reason, the new Augmented[Addition|Multiplication] operations (two-in, two-out) will report exact results for everything except borderline operations, like gradual underflow of subnormals. Your CARRY is of course a wo

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 9 Days 17 Hours ago by: Anton Ertl

Probably with a different predictor. See <https://ict.iitk.ac.in/wp-content/uploads/CS422-Computer-Architecture-agree_predictor.pdf>. - anton

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 9 Days 18 Hours ago by: Thomas Koenig

I have to confess I do not understand that. What would agree / disagree with what?

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 9 Days 18 Hours ago by: Anton Ertl

Yes, I doubt that the extra uOps cause a slowdown in most cases and a significant slowdown in the rest. And I think that the independence of the other source is so important for performance that this is a good approach for predication, t

Re: finite but unbounded? (thread)

comp.arch

Posted: 9 Days 18 Hours ago by: Thomas Koenig

Who's left? NGOs? Research institutes (who would then have to go into manufacturing)?

Re: Tachyum Launches Prodigy Universal Processor

comp.arch

Posted: 9 Days 18 Hours ago by: Thomas Koenig

It is astonishingly rich in marketing babble and astonishingly poor in technical detail. They make a big thing about their 5.7 GHz clock cycle - unless they have built a super-duper 3nm fab in their back yard when nobody was looking, thi

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 9 Days 18 Hours ago by: John Dallman

Also on Data General's bigger machines, and some other minis. Those are recognizably ancestors of IEEE. They have a few more mantissa bits and correspondingly fewer in the exponent, but work the same way. Lots of smaller players, wi

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 9 Days 19 Hours ago by: Anton Ertl

True, but the claim did not mention MS-DOS. The IBM PC, PC XT, and its clones were popular and had an 8087 socket, so these machines were the way to go if you needed to do a lot of FP and could not afford a minicomputer with an FPU (e.g.

Re: Tachyum Launches Prodigy Universal Processor

comp.arch

Posted: 9 Days 19 Hours ago by: aph

This is giving me Transmeta flashbacks. Andrew.

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 10 Days 2 Hours ago by: John Levine

MS-DOS was certainly popular in the 1980s. I was one of the authors of Javelin, a time-series modelling package for MS DOS in the mid-1980s, and all of the numeric stuff depended on 8087 arithmetic. There was a hack to trap to an emulato

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 10 Days 3 Hours ago by: JimBrakefield

I think the 754 committee overlooked the need (and support thereof) for extended precision. Also the typical status register should be replaced by a "residue" register, whether within the register file or separate; collecting floating-po

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 10 Days 4 Hours ago by: MitchAlsup

< You store something that is exact into the memory/register. < < Obvious rules don't need to be specified. < < One of the crazy things about IEEE 754 is that if you go out of your way to write exact floating point arithmetic sequences (tw

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 10 Days 4 Hours ago by: JimBrakefield

The other choice is to have distinct load and store floating instructions for exact and inexact. Yeah!

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 10 Days 4 Hours ago by: EricP

The Posit inexact bits were sticky so how did they get reset? I looked through that Posit book and didn't see a rule. Because it looked to me as though once they got set, they would propagate through a calculation. And if so then they wou

Re: Petahertz, femtosecond logic gates (thread)

comp.arch

Posted: 10 Days 5 Hours ago by: BGB

I may need to look at it some more. Looking at the code, it both updates MOSI and samples MISO on the falling edge, then did nothing on the rising edge. This is sorta working, but prone to break if I try to go much over 12.5 MHz. Whe

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 10 Days 6 Hours ago by: MitchAlsup

< < All of the first generation RISC machines were IEEE 754 with some allowance to compliance. These came out contemporaneously with 68881 and x87 but later. By 1985 there was nobody making CPUs, and trying to sell them in the millions*, t

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 10 Days 6 Hours ago by: MitchAlsup

< This gains nothing !! < Consider that you lose 1 bit from the fraction in order to represent inexact. The inexact bit tells you to prepare an operand that is 1.fraction+epsilon Which is exactly what you would have had if you simply roun

Re: Petahertz, femtosecond logic gates (thread)

comp.arch

Posted: 10 Days 7 Hours ago by: David Schultz

SPI outputs data on one edge and clocks it in on the other. The details for SD cards have been kept out of the free version of the specification since version 1.0. I found it in a MMC document long ago but of course details are a bit d

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 10 Days 8 Hours ago by: Quadibloc

Huh? I was thinking that you could get proper sorting of numbers with an inexact bit as follows: 1) Put the inexact bit at the end of the number. 0 = exact, 1 = inexact. 2) Round inexacts to the halfway points between successive exact

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 10 Days 8 Hours ago by: Quadibloc

Thank you very much! Incidentally, I've now created a page at http://www.quadibloc.com/comp/cp020101.htm which includes a more fully fleshed-out description of this proposed numeric format. John Savard

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 10 Days 8 Hours ago by: Anton Ertl

This particular claim is easy to refute: IEEE 754 won before Windows, so it did not become popular because of Windows. Basically all new architectures after IEEE 754 (1985) and a few before (e.g., 8087 (1980), 68881 (1984)) adopted IEEE

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 10 Days 8 Hours ago by: JimBrakefield

The revised Posit standard now fixes the number of "es" bits at two for all data sizes. Simplifies the memory to/from register format conversion somewhat. https://posithub.org/ Still think there is room for further improvement. Howeve

Re: Petahertz, femtosecond logic gates (thread)

comp.arch

Posted: 10 Days 9 Hours ago by: BGB

Pretty much the whole point of my "assume it moves more or less instantly" thing. The wavelength is a lot longer than the lengths of wire one is dealing with, so generally "everything works OK". Well, as noted, except things like SD

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 10 Days 9 Hours ago by: Quadibloc

I would not be able to define the arithmetic rules fully, given that the continuum problem hasn't been solved. Actually, though, that's probably not what you mean, and instead you're talking about infinities all with the cardinality of

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 10 Days 9 Hours ago by: Ivan Godard

Terje, you have a dirty mind!

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 10 Days 12 Hours ago by: MitchAlsup

Oh and BTW: John's email (as of Feb) is John Gustafson (johngustafson@earthlink.net)

Re: Mixed EGU/EGO floating-point (thread)

comp.arch

Posted: 10 Days 12 Hours ago by: MitchAlsup

< Why not a hierarchy of infinities ? < The most useful thing about posits is that they do not come with all the IEEE 754 baggage.

Re: Architecture comparison (was: Upcoming DFP support in clang/LLVM) (thread)

comp.arch

Posted: 10 Days 13 Hours ago by: Torbjorn Lindgren

The V20/V30 also has additional instructions not found in either and a 8080 emulation mode, but critically it *IS* possible to make a fully PC compatible machine with it while the same isn't possible using an 80188/80186. The 8018x model

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 10 Days 16 Hours ago by: Terje Mathisen

When doing a predicated increment my preferred (asm) solution is to place the condition in the Carry flag, then just do ADC x,0 This is of course far faster than any CMOV version. If the increment is more or less random then we used
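The predicated increment described in this post can be approximated in C by adding the 0-or-1 result of a comparison instead of branching on it. This is only a sketch: `count_if_greater` and its parameters are hypothetical names chosen for illustration, and whether a compiler actually emits ADC, SETcc plus ADD, or something else depends on the target and optimization level.

```c
#include <stddef.h>
#include <stdint.h>

/* Count elements greater than a threshold without a data-dependent branch.
 * A comparison yields 0 or 1 in C, so adding it acts as a predicated
 * increment (hypothetical helper, not taken from the post). */
size_t count_if_greater(const int32_t *v, size_t n, int32_t threshold)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        count += (size_t)(v[i] > threshold);
    return count;
}
```

The loop branch itself remains, but it is highly predictable; only the unpredictable, data-dependent decision has been folded into arithmetic.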

Re: Petahertz, femtosecond logic gates (thread)

comp.arch

Posted: 10 Days 16 Hours ago by: Terje Mathisen

1-5 MHz seems almost like DC these days, when individual clock pulses can reach more than 100m, a computer enclosure seems tiny. :-) It is the same scaling which means that almost any conceivable sound processing can be handled in sw th

Re: WOW, SKYBUCK, PREDICTED CORONA on 4 OCTOBER 2019 !

comp.arch

Posted: 10 Days 18 Hours ago by: Quadibloc

That's nothing. Laurie Garrett predicted it in 1994 - that's when the book "The Coming Plague" was published. John Savard

Mixed EGU/EGO floating-point

comp.arch

Posted: 10 Days 22 Hours ago by: Quadibloc

I tried sending an E-mail about this brainstorm of mine to posithub.org, and in my unsuccessful attempt to do so, I discovered that John L. Gustafson's E-mail address at the University of Singapore is no longer valid. Also, the most rece

Re: Petahertz, femtosecond logic gates (thread)

comp.arch

Posted: 10 Days 23 Hours ago by: BGB

Likely the only real way to do this would be for the processor core to be a fully 3D structure, so that routing distances can be shorter. But, if this could be done, the issue of "where to send the waste heat" would likely be even more

Re: Petahertz, femtosecond logic gates (thread)

comp.arch

Posted: 10 Days 23 Hours ago by: MitchAlsup

< What mechanism do you want the designers to apply to make wires faster ? a) you can use 2 wires side by side (1.5× capacitance ½ resistance) for a small gain. {Top/Bottom does not work because alternating metal layers run N-S and L-R.}

Re: Tachyum Launches Prodigy Universal Processor

comp.arch

Posted: 10 Days 23 Hours ago by: Quadibloc

This at least _sounds_ like the sort of processor chip I'm interested in seeing. John Savard

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 11 Days ago by: MitchAlsup

< While a set (handful) of instructions can have a predicate cast over them, One can have several of these sets with different predicates for each set (some sets partially overlapping). && and || are easily accommodated. < There are a coup

Re: Petahertz, femtosecond logic gates (thread)

comp.arch

Posted: 11 Days ago by: Quadibloc

You've noted this fact before. But if faster logic were available, surely more effort and expenditure would be devoted to reducing wire delays than at present. John Savard

Re: finite but unbounded? (thread)

comp.arch

Posted: 11 Days 1 Hour ago by: Brett

And does this theory explain the horizontal distribution as well as the vertical distribution, like the single slit theory does? Here’s a hint, NO. PHd stands for Pile it Higher and Deeper, it being bullshit. Look behind the curtain

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 11 Days 2 Hours ago by: Ivan Godard

It is worth it IMO. Mill has predicated forms for both load and store. It is if you want to do aggressive speculation. How you fit it into your instruction set can vary a lot though: some (ARM) ISAs predicate everything; Mill has ind

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 11 Days 5 Hours ago by: BGB

In my case, once it reaches EX1, if the predicate is false, the instruction's opcode turns into a NOP (except for branches, where it turns into a "NoBranch" instruction, and the BRA/BRANB logic may then initiate a branch based on wheth

Re: Petahertz, femtosecond logic gates (thread)

comp.arch

Posted: 11 Days 6 Hours ago by: BGB

Yeah. Free wires work OK so long as they are roughly length matched. Like, near the end of IDE/PATA's lifespan, there was a thing for a little while of IDE cables where, instead of a flat ribbon cable, they would have a bundle of loose

Re: finite but unbounded? (thread)

comp.arch

Posted: 11 Days 8 Hours ago by: EricP

That is not a demonstration of quantum phenomena. Young's 1801 double slit experiment works fine as a demonstration of the wave nature of light or sound. The puzzles for the double slit experiments came when you try to explain it using par

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 11 Days 10 Hours ago by: MitchAlsup

< With Intel being the parent company !!! That is a BIG assumption. < This is how predication is defined on My 66000--result cancellation. < < You are expecting "sane" behavior from a company like Intel ?!?!?!

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 11 Days 11 Hours ago by: EricP

Right and for what I understood to be the take-away from Alpha's CMOV and it looks like x86 too is that a reg<-reg CMOV by itself is not sufficiently useful. There are other problems, like if (cond) x = x + 1; has conditional m

Re: Petahertz, femtosecond logic gates (thread)

comp.arch

Posted: 11 Days 11 Hours ago by: Andy Valencia

I remember my eyes widening with astonishment, the first time I opened up a SuperPET 9000 and saw how they wired the 6809 companion processor into the unit. https://vintagecomputer.ca/wp-content/uploads/2015/10/SuperPET-internals-close-u

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 11 Days 11 Hours ago by: EricP

It would appear so. I understand NOW that they look at it that way. The way I saw it is that the whole point is to skip the expensive memory access part if condition is false. I have LDC & STC Load & Store Conditional instructions in m

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 11 Days 19 Hours ago by: Anton Ertl

I just tried it on a Skylake, and "cmovne (%rsi), %eax" with %rsi=0 segfaults whether ZF is set or not. Suppressing the exception if the condition is false would have been useful for the idiom above. AFAIK predicated instructions tend t

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 11 Days 20 Hours ago by: Terje Mathisen

Mainly this! 2 cycles of forced latency, on top of typically some extra work to setup the two alternatives, was just too long. Yeah, see above. I don't think this is the case, but mainly that branch predictors are just too good: c

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 11 Days 20 Hours ago by: Terje Mathisen

CMOV is a MOV, so it always accesses the source, but then it might discard the result? The key here is probably that all load-op instructions on x86 perform the load part in an earlier phase than the operation itself, so it makes perf

Re: Petahertz, femtosecond logic gates (thread)

comp.arch

Posted: 11 Days 22 Hours ago by: BGB

I was more meaning with things like running MHz on a bread-board or over repurposed 24 AWG solid-core wire from CAT5e wires or similar. One can almost get along assuming that signals move instantly, but wire lengths need to be kept ev

Re: Petahertz, femtosecond logic gates (thread)

comp.arch

Posted: 12 Days 1 Hour ago by: MitchAlsup

< With a 50ns clock, 5ns was in the staging flip-flops, 3ns in clock jitter and skew leaving 42ns of logic. One could get 400 microns in a single clock on unbuffered wire--this was the width of the chip. < With a 1ns clock, 100ps was in t

Re: Petahertz, femtosecond logic gates (thread)

comp.arch

Posted: 12 Days 2 Hours ago by: BGB

Possibly. When hand wiring stuff, it seems like for a lot of MHz-level stuff, some amount of variability is acceptable. Though, one might still need to care about the wires being approximately the same length for things like SPI links. B

Re: Petahertz, femtosecond logic gates (thread)

comp.arch

Posted: 12 Days 4 Hours ago by: MitchAlsup

< There was significant and measurable wire delay at 20 MHz............ < Capacitance (and resistance) only slow the edge speed (FET logic) by themselves, they do not eat voltage. < Yes, wire is the problem. < Like mercury delay lines ? <

Re: Petahertz, femtosecond logic gates (thread)

comp.arch

Posted: 12 Days 4 Hours ago by: BGB

I have my doubts here as to how useful this could be "in general". Even if one can make the laser pulses fast, and the switching fast, unless the whole system is very compact, it seems like things like wire propagation delays and simi

Re: Petahertz, femtosecond logic gates (thread)

comp.arch

Posted: 12 Days 4 Hours ago by: MitchAlsup

< Back in 2004-ish, AMD was predicting 6ps inverter delays in 65nm process. At these edge speeds, one needs to model Ampère's law in simulation (above and beyond simple skin effect stuff). < But to temper expectations: If you took the cur

Petahertz, femtosecond logic gates

comp.arch

Posted: 12 Days 6 Hours ago by: EricP

A group has created petahertz (10^15 Hz), femtosecond logic gates. In a lab, and it needs lasers to drive it, and it's really more like tens of femtoseconds, but still... neat. Laser bursts drive fastest-ever logic gates 2022-May-11 https:

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 12 Days 6 Hours ago by: MitchAlsup

< Yes, but the very vast majority of pointer checks against zero are predicted with great accuracy. < CMOV is for unpredictable branches not the predictable ones.

Re: finite but unbounded? (thread)

comp.arch

Posted: 12 Days 6 Hours ago by: Brett

The studies saying statins improve outcomes have not been reproduced. Your brain needs cholesterol to work, taking statins leads to cognitive impairment. A one month study will not test for or see this, or the long term effects of the mu

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 12 Days 7 Hours ago by: EricP

Hmmm... looks like it could be either way. Intel Vol-2 manual CMOV description says: "If the condition is not satisfied, a move is not performed and execution continues with the instruction following the CMOVcc instruction." which seems

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 12 Days 8 Hours ago by: BGB

Yeah, dunno there. If one had enough context, lots of things could be possible. The current scheme is sorta: Take low bits of address, xor them with the recent global branch history; Use this as an index into the table of 3-bit states
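The indexing scheme outlined in this post (low address bits XORed with recent global branch history, used to index a table of small state values) resembles what is commonly called a gshare predictor. A minimal sketch in C follows. The table size, the names, and above all the use of a plain 3-bit saturating counter are assumptions for illustration; the post does not say what the 3-bit states actually encode.

```c
#include <stdint.h>

#define PRED_BITS 10u                  /* assumed table size: 1024 entries */
#define PRED_SIZE (1u << PRED_BITS)
#define PRED_MASK (PRED_SIZE - 1u)

typedef struct {
    uint8_t  state[PRED_SIZE];  /* assumed 3-bit counters, 0..7; >= 4 means taken */
    uint32_t history;           /* recent global outcomes, newest in bit 0 */
} predictor;

/* Low address bits XOR global history selects the table entry. */
static uint32_t pred_index(const predictor *p, uint32_t pc)
{
    return (pc ^ p->history) & PRED_MASK;
}

int pred_taken(const predictor *p, uint32_t pc)
{
    return p->state[pred_index(p, pc)] >= 4;
}

void pred_update(predictor *p, uint32_t pc, int taken)
{
    /* Select the entry with the history as it was at prediction time. */
    uint8_t *s = &p->state[pred_index(p, pc)];
    if (taken) { if (*s < 7) (*s)++; }
    else       { if (*s > 0) (*s)--; }
    /* Shift the new outcome into the global history. */
    p->history = ((p->history << 1) | (taken ? 1u : 0u)) & PRED_MASK;
}
```

Because the history takes part in the index, a branch that is always taken only settles on a single table entry once the history register has filled with ones.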

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 12 Days 9 Hours ago by: Michael S

I don't think so. If I am not mistaken, x86 architecture specifies that load should always be executed. Or maybe that's just an implementation thing, but since it's documented in the manuals future architects will be always afraid to c

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 12 Days 12 Hours ago by: EricP

For CMOV 1 or 2 extra uOps in a 100+ instruction queue is not a problem but for full predication this approach would not do and it requires smarter uOps. Note that Alpha CMOV is only reg<-reg x86/x64 allows reg<-reg or reg<-mem (conditi

Re: Architecture comparison (was: Upcoming DFP support in clang/LLVM) (thread)

comp.arch

Posted: 12 Days 16 Hours ago by: John Dallman

Actually, it was incompatible with software written for the IBM PC. This wasn't an instruction set issue. The 80186 had several integrated peripherals that weren't PC-compatible. John

Re: finite but unbounded? (thread)

comp.arch

Posted: 12 Days 17 Hours ago by: David Brown

Agreed. Health services should be funded publicly (usually at a national level, but perhaps in the USA or other federated states it might be state level) based on taxes of some sort. But they should be independent of politics at all

Re: finite but unbounded? (thread)

comp.arch

Posted: 12 Days 17 Hours ago by: David Brown

No, that's not how it works (most of the time). A study shows that a drug is not giving much benefit, so the study /and/ the drug are pulled and the drug company works on something else instead. Other researchers cannot then benefit f

Re: Architecture comparison (was: Upcoming DFP support in clang/LLVM) (thread)

comp.arch

Posted: 12 Days 18 Hours ago by: BGB

Probably true. My comment was ambiguous in this case, more in relation to the 80186 being not entirely binary compatible with software written for the 8086 (so IBM went from 8086 to 80286). Goes and looks at some of the chips, and de

Re: Architecture comparison (was: Upcoming DFP support in clang/LLVM) (thread)

comp.arch

Posted: 12 Days 19 Hours ago by: Stephen Fuld

8085 was software compatible with the 8080. And from a hardware perspective needing only one voltage in instead of three for the 8080 was more than a "slight" improvement. https://en.wikipedia.org/wiki/Intel_8085

Re: finite but unbounded? (thread)

comp.arch

Posted: 13 Days ago by: MitchAlsup

< Which, by the way, is why medicine should not be left to "for profit" companies or corporations (and especially not to *.gov).

Re: finite but unbounded? (thread)

comp.arch

Posted: 13 Days ago by: Stefan Monnier

I wonder why you'd be disappointed. I mean they are large companies, just like Nestlé, Apple, etc... They're designed to maximize their own profit, not other people's well being. Stefan

Re: Architecture comparison (was: Upcoming DFP support in clang/LLVM) (thread)

comp.arch

Posted: 13 Days 2 Hours ago by: BGB

Yeah, 8085 is sorta like the 80186. Sorta existed, mostly forgotten about. I guess also things "like its predecessor, but only slight improvements, but not backwards compatible" is also sort of a deal-breaker. So, for example, people

Re: finite but unbounded? (thread)

comp.arch

Posted: 13 Days 2 Hours ago by: Brett

Yes, the policy of pulling funding from drug studies that are not fraudulent shows the fraud. So you have four studies on statins, three show downsides with no useful effect and are canceled, and the fraudulent study gets published. Cong

Re: Architecture comparison (was: Upcoming DFP support in clang/LLVM) (thread)

comp.arch

Posted: 13 Days 3 Hours ago by: Stephen Fuld

It was used in a lot of embedded systems, as it led to lower cost systems than the 8080 and had a few features that helped there. Embedded processors rarely make a "splash" in consumer consciousness.

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 13 Days 3 Hours ago by: MitchAlsup

As I have related in the past: The Mc 88120 had a branch predictor which is not based on taken/not-taken, but upon agree/disagree. This allows different branches which map to the same counters one taken one not-taken to use the same code i

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 13 Days 3 Hours ago by: BGB

It is mostly that range-coders need to use pipelined multiplies and conditional re-normalization (say, when the high order bits of the high and low range become equal). It isn't as bad with predication as it is with branches, but it st

Re: Architecture comparison (was: Upcoming DFP support in clang/LLVM) (thread)

comp.arch

Posted: 13 Days 4 Hours ago by: Quadibloc

Not enough machines were made using it so that software applications were written that used 8085 instructions that the 8080 didn't have. It didn't make much of a historical 'splash', even if at least one book was written about it. John

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 13 Days 4 Hours ago by: BGB

I can note here that my branch predictor ended up with states both for predicting branches which are always the same, and for branches which are nearly always the opposite. The "nearly always the same" case being the more common, more

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 13 Days 5 Hours ago by: MitchAlsup

< if( 0 <= x && x <= MAX ) {then-clause} < CMP Rt,Rx,Rmax BRIN Rt,end-if // RIN is Really In // then-clause end-if:
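For readers without a compare-and-branch-if-in-range instruction, a related and well-known trick in C folds the two-sided test into one unsigned comparison. This is a different technique from the CMP/BRIN sequence in the post, shown only for comparison; `in_range` is a hypothetical name and the sketch assumes `max` is non-negative.

```c
#include <stdint.h>

/* Returns 1 when 0 <= x <= max, assuming max >= 0.
 * Converting to unsigned maps every negative x to a value above INT32_MAX,
 * so a single unsigned comparison covers both bounds. */
int in_range(int32_t x, int32_t max)
{
    return (uint32_t)x <= (uint32_t)max;
}
```

The single comparison leaves one branch (or one predicate) to resolve instead of two.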

Re: Architecture comparison (thread)

comp.arch

Posted: 13 Days 6 Hours ago by: Michael S

Green500 has the same rules as Top500. The benchmark used is Linpack’s “Highly Parallel Computing” benchmark. FAQ: https://www.top500.org/resources/frequently-asked-questions/ The complete rules can be found here: http://www.netlib.

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 13 Days 6 Hours ago by: BGB

In my case, something like: CMPEQ R4, R5 ADD?F R4, 1, R4 Will take 2 cycles. And: CMPEQ R4, R5 BT .L0 ADD R4, 1, R4 .L0: Will take either 3 or 4 cycles (if predicted correctly), or 8 cycles (mispredict). In a lot of
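The cycle counts quoted in this post can be turned into an expected cost per execution. The helper below is a rough sketch with hypothetical names; it assumes a simple two-outcome model (correctly predicted or mispredicted) and ignores everything else in the pipeline.

```c
/* Expected cycles for a branch sequence, given a misprediction rate in [0, 1].
 * Simplified two-outcome model; not a description of any real pipeline. */
double expected_branch_cycles(double mispredict_rate,
                              double predicted_cycles,
                              double mispredicted_cycles)
{
    return (1.0 - mispredict_rate) * predicted_cycles
         + mispredict_rate * mispredicted_cycles;
}
```

With the post's figures of 3 cycles when predicted and 8 when mispredicted, a 10% misprediction rate gives 0.9 × 3 + 0.1 × 8 = 3.5 cycles. Even a perfectly predicted branch costs 3 cycles under this model, against 2 for the predicated form.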

Re: Architecture comparison (thread)

comp.arch

Posted: 13 Days 6 Hours ago by: BGB

Such is the problem... Too many of these sorts of benchmarks are basically people doing vector operations in tight loops with little else going on (such as accessing memory), because doing anything else would ruin their GFLOPs numbers

Re: Architecture comparison (thread)

comp.arch

Posted: 13 Days 6 Hours ago by: MitchAlsup

< I assume these are with the memory reference footprint. But are they ? <

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 13 Days 6 Hours ago by: MitchAlsup

< CMOV cannot begin executing until: a) both operands are available b) the condition is available Instructions dependent on CMOV cannot begin execution until c) CMOV delivers its result(s). < It is often this (c) that makes CMOV appear to

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 13 Days 6 Hours ago by: Anton Ertl

Normal profiles don't tell you how predictable a branch is. Of course always-taken is predictable, but 50% taken might be unpredictable, or perfectly predictable. So: compilers are bad at knowing predictability (even with profile feedba

Re: Architecture comparison (thread)

comp.arch

Posted: 13 Days 7 Hours ago by: Michael S

The current Green500 champion (custom ASIC) delivers 39.379 GFlops/watts == 25.4 pJ/FLOP ~= 50 pJ/FMADD The best GPGPU is not far behind at 33.983 GFlops/watts The best CPU is further down at 16.876 GFlops/watts. Only 2 years (4 lists) ag
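The conversion used in this post follows from the units: 1 GFLOPS/W is 10^9 FLOP per joule, so the energy per FLOP in picojoules is 1000 divided by the GFLOPS/W figure. A small sketch, with hypothetical function names:

```c
/* Energy per floating-point operation in picojoules.
 * 1 GFLOPS/W = 1e9 FLOP/J, so pJ/FLOP = 1000 / (GFLOPS/W). */
double pj_per_flop(double gflops_per_watt)
{
    return 1000.0 / gflops_per_watt;
}

/* A fused multiply-add is counted here as two FLOPs. */
double pj_per_fmadd(double gflops_per_watt)
{
    return 2.0 * pj_per_flop(gflops_per_watt);
}
```

For 39.379 GFLOPS/W this gives about 25.4 pJ/FLOP and about 50.8 pJ per FMADD, in line with the numbers quoted. The same formula gives roughly 29.4 pJ/FLOP for the 33.983 GFLOPS/W entry and roughly 59.3 pJ/FLOP for the 16.876 GFLOPS/W entry.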

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 13 Days 8 Hours ago by: Anton Ertl

Probably not enough bad CMOVs around for that to make sense, although maybe with such a feature that might change: For every CMOV, keep track of how predictable it is (and maybe when its three inputs become available relative to the inpu

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 13 Days 8 Hours ago by: Stephen Fuld

So if you don't use hardware "condition prediction", and the compiler by itself doesn't know how well a particular branch will be predicted, we are left with the aforementioned programmer provided hints, or perhaps some form of profile

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 13 Days 8 Hours ago by: Anton Ertl

I would like to see some empirical support for that statement. It's pretty easy to design a micro benchmark where CMOV wins by a large margin. I guess what he meant is that in a typical microbenchmark aimed at some other characteristic

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 13 Days 8 Hours ago by: Stefan Monnier

In the present case it seems that indeed the problem is that the condition depends on an amount of code comparable (if not larger) than the rest of the iteration. With branch prediction, the computation of those conditions can be performed

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 13 Days 9 Hours ago by: Stephen Fuld

Thank you. So, would it make sense to develop some kind of "CMOV predictor", sort of like a branch predictor?

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 13 Days 9 Hours ago by: MitchAlsup

< Yes:: One can issue through the predicated set of instructions, and sort out which ones should be executed later. < Whereas: < The general philosophy of branches is to predict them and then attempt to get to the target rapidly. Thus, h

Re: Architecture comparison (thread)

comp.arch

Posted: 13 Days 9 Hours ago by: MitchAlsup

< In your typical integer data path, the flip-flops (and clock driving them) consume more power than the calculations (excepting MUL and DIV).

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 13 Days 10 Hours ago by: Michael S

It used to be a significant reason on Pentium4. After that the answer is pretty much always "No". Yes, but that's something that would cause slowdown by 10-20% rather than 2x. Unless compiler did something obviously stupid that also ha

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 13 Days 10 Hours ago by: Stephen Fuld

I am not questioning the truth of that, but I am trying to figure out why. Since the CMOV is a pretty big win when the branch is mispredicted, it must be a loss when the prediction would have been correct. Is this because The CMOV it

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 13 Days 11 Hours ago by: Marcus

Any particular reason why an explicit PNE (Predicate Not Equal, I assume) instruction is better than repurposing a branch instruction and interpreting it as a predication instruction in the front end? E.g: BEQ 1f // Skip nex

Re: The purpose of our universe simulation was just made clear to me

comp.arch

Posted: 13 Days 12 Hours ago by: Michael S

This branch (or branches) of math did not interest me even when I was young and even when treated by legitimate thinkers rather than by cranks. Somehow, my main instinctive emotion is "Who cares?". I wonder if people like Euler and Lag

Re: The purpose of our universe simulation was just made clear to me

comp.arch

Posted: 13 Days 12 Hours ago by: JimBrakefield

Am struggling to find a point of view where Muckenheim makes sense. Then again I have better things to do with my time. As one ages it becomes a contest between the plumbing and the wiring.

Re: The purpose of our universe simulation was just made clear to me

comp.arch

Posted: 13 Days 13 Hours ago by: David Brown

The paper appears to be gobbledegook. You don't need to get further than the introduction to see a mix-up between natural numbers, fractions and real numbers, poorly defined terms, unsubstantiated claims and invalid conclusions drawn

Re: finite but unbounded? (thread)

comp.arch

Posted: 13 Days 15 Hours ago by: David Brown

It is also the basic principle of how experimental science works. One group does some experiments or measurements, and finds something interesting. They publish the results. Other groups in the field see these, and try to replicate

Re: Architecture comparison (thread)

comp.arch

Posted: 13 Days 20 Hours ago by: Thomas Koenig

Interesting figure. Looking at the 94 GB per second with four memory channels with DDR 2933 cited in https://www.intel.com/content/www/us/en/support/articles/000056722/processors/intel-core-processors.html gives you about 12 W of power
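The roughly 12 W figure is consistent with combining the 94 GB/s bandwidth with the energy cost of about 1000 pJ per 64-bit load quoted earlier in this thread. That the post used exactly this calculation is an assumption, since the preview is truncated. A sketch of the arithmetic, with a hypothetical function name:

```c
/* Power in watts spent on memory loads at a sustained bandwidth.
 * bytes_per_second / bytes_per_load gives loads per second;
 * multiplying by pJ per load and 1e-12 converts to joules per second. */
double load_power_watts(double bytes_per_second,
                        double bytes_per_load,
                        double pj_per_load)
{
    return bytes_per_second / bytes_per_load * pj_per_load * 1e-12;
}
```

With 94e9 bytes/s, 8 bytes per load, and 1000 pJ per load, this evaluates to 11.75 W.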

Re: The purpose of our universe simulation was just made clear to me

comp.arch

Posted: 14 Days 6 Hours ago by: JimBrakefield

Have a copy of GEB, never got beyond a hundred pages. Sorta like a botanist's description of his plants. I think the general subject area has some potential for mathematical treatment: Monoids: as in unification, as in how Newton unified

Re: The purpose of our universe simulation was just made clear to me through communication with time.

comp.arch

Posted: 14 Days 8 Hours ago by: Anton Ertl

If only I could buy back the time I wasted on reading it! This book was hugely popular in its day (I fell for it, too), but it was deservedly pretty much forgotten a few years later. - anton

Re: The purpose of our universe simulation was just made clear to me

comp.arch

Posted: 14 Days 11 Hours ago by: David Brown

I have ordered it now. If only there was a webshop where I could buy the time to read it! :-)

Re: finite but unbounded? (thread)

comp.arch

Posted: 14 Days 13 Hours ago by: Andy Valencia

And for decades, reproducibility was not given much attention. Why work on something hard when it is--by the standards of your industry--not valuable? I'm hearing from friends and family across quite a breadth of the industry that this

Re: finite but unbounded? (thread)

comp.arch

Posted: 14 Days 13 Hours ago by: Stefan Monnier

The way you state this makes me think you're probably not a scientist (or if so, probably not a good one): a scientist would know that none of those elements are truly "known" with any kind of certainty. They're just the best available ex

Re: Architecture comparison (thread)

comp.arch

Posted: 14 Days 17 Hours ago by: BGB

Probably... Say, BJX2 core executing 4x Binary32 SIMD ops, So 4 FLOPs in 10 cycles. So, could theoretically pull off 20 MFLOP at 50MHz if doing it all with FP-SIMD ops. Now, say, it is in DRAM and operating on one big array and putt

Re: Architecture comparison (thread)

comp.arch

Posted: 14 Days 19 Hours ago by: Terje Mathisen

This is particularly galling when we consider how the relative costs of memory access vs FPU operations have changed: I have seen quotes of over 1000 pJ (1200?) to load a 64-bit double from RAM, vs less than 10 pJ for an FMUL. I.e. th

Re: Architecture comparison (was: Upcoming DFP support in clang/LLVM) (thread)

comp.arch

Posted: 14 Days 19 Hours ago by: BGB

I was thinking in the DMIPS sense. Say, one has a core which is running: 39M bundles/sec, at ~ 1.3 instructions per bundle; But, still only gets ~ 0.9 DMIPS/MHz... Having briefly looked at VAX, it makes a little more sense. Abil

Re: finite but unbounded? (thread)

comp.arch

Posted: 14 Days 20 Hours ago by: Brett

The Chicxulub impact was ~300,000 years before the KT boundary that killed the dinosaurs. This has been known since day one, but excuses were made of tsunami’s, the story of Chicxulub was just too popular, so the press went with the cli

Re: finite but unbounded? (thread)

comp.arch

Posted: 14 Days 20 Hours ago by: Brett

Publish or perish leads to just random crap getting published. No stupidity required. Drug trials are notorious for slants that are just short of criminal, in that you can’t prove criminal intent, but the results always slant that dire

Re: Architecture comparison (was: Upcoming DFP support in clang/LLVM) (thread)

comp.arch

Posted: 15 Days ago by: MitchAlsup

< Some think an instruction can cause more than one state change. {But this all gets murky when you add in HW tablewalked TLBs.}

Re: Architecture comparison (was: Upcoming DFP support in clang/LLVM) (thread)

comp.arch

Posted: 15 Days ago by: MitchAlsup

< !!! EVEN without having 'bar' or 'foo' in registers !!! < < This is a question about how one defines work. Floating point people do not count loads and stores--just FLOPs. < There are ways (now) of making VAX run above 1 instruction per

Re: Architecture comparison (was: Upcoming DFP support in clang/LLVM) (thread)

comp.arch

Posted: 15 Days ago by: Ivan Godard

Yes. But then, what's an instruction?

Re: finite but unbounded? (thread)

comp.arch

Posted: 15 Days ago by: MitchAlsup

Never attribute to malfeasance that which can be attributed to stupidity.

Re: Architecture comparison (was: Upcoming DFP support in clang/LLVM) (thread)

comp.arch

Posted: 15 Days ago by: BGB

Hmm... I guess I can note that I had previously designed some 3R bytecode ISA's along a vaguely similar line, but still didn't go quite as far as VAX did in this direction. It does also sorta remind me of EFI ByteCode, though after fi

Re: finite but unbounded? (thread)

comp.arch

Posted: 15 Days 1 Hour ago by: Stefan Monnier

Most researchers would make more money doing something else than research, so I don't buy this "money" argument very much. In most cases you don't need to resort to money, politics, or fraud, to explain the impossibility to reproduce a pa

Re: finite but unbounded? (thread)

comp.arch

Posted: 15 Days 1 Hour ago by: Brett

Re: finite but unbounded? (thread)

comp.arch

Posted: 15 Days 1 Hour ago by: Ivan Godard

Perhaps you mean "foisted"?

Re: finite but unbounded? (thread)

comp.arch

Posted: 15 Days 2 Hours ago by: Brett

Half of research papers cannot be reproduced. Google it. Add in money or politics and the fraud quickly goes epic. Cheap available effective treatments are slandered in favor of patented trillion dollar “cures” that only last six mon

Re: Architecture comparison (was: Upcoming DFP support in clang/LLVM) (thread)

comp.arch

Posted: 15 Days 2 Hours ago by: MitchAlsup

< Exactly what it was designed for:: except the interpreter was microcode. < Compared to modern wide-issue machines yes. Compared to its competitors of the day:: no -- a resounding NO Its competitors were doing about 1 instruction every 4

Re: Architecture comparison (was: Upcoming DFP support in clang/LLVM) (thread)

comp.arch

Posted: 15 Days 3 Hours ago by: BGB

I have now found something (in the form of a scanned copy of an 80s era ISA manual), it appears to be a variable-length byte-oriented encoding. Most other stuff I had seen previously does not go below the level of its ASM notation. A

Re: Architecture comparison (was: Upcoming DFP support in clang/LLVM) (thread)

comp.arch

Posted: 15 Days 3 Hours ago by: MitchAlsup

< https://view.officeapps.live.com/op/view.aspx?src=https%3A%2F%2Fusers.cs.jmu.edu%2Fabzugcx%2FPublic%2FStudent-Prod

Re: Architecture comparison (was: Upcoming DFP support in clang/LLVM) (thread)

comp.arch

Posted: 15 Days 4 Hours ago by: Bill Findlay

O tempora, o mores! 8-)

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 15 Days 5 Hours ago by: MitchAlsup

< < Predication works by conservation of fetch bandwidth. One has a certain amount of code that can be read each cycle, and a certain sized repository of that code (instruction buffer). When code needs some flow control, but the span of th

Re: Architecture comparison (was: Upcoming DFP support in clang/LLVM) (thread)

comp.arch

Posted: 15 Days 5 Hours ago by: BGB

Yeah, but similar-looking ASM, apart from splitting up the registers and some other things. All 3 ISA designs have in common that they are built around 16-bit instruction words, often with a similar-ish layout, eg: op,rn,mode,rm Or:

Re: finite but unbounded? (thread)

comp.arch

Posted: 15 Days 5 Hours ago by: MitchAlsup

< https://www.youtube.com/watch?v=dr6nNvw55C4 <

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 15 Days 7 Hours ago by: Terje Mathisen

It particularly helps when your cpu is wide enough to execute both branches at the same time, and they are of approximately equal length. The final requirement is that the branch join operation needs to be fast, and CMOV have been 2 cy

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 15 Days 7 Hours ago by: Ivan Godard

Too dependent on other things in the app, the architecture, and the hardware to generalize. Pipe depth, issue, dispatch and LS queue sizes impact miss cost. Loop-heavy vs. open code impacts miss frequency. Micro- vs real- benchmarks i

Re: finite but unbounded? (thread)

comp.arch

Posted: 15 Days 8 Hours ago by: MitchAlsup

Cute !

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 15 Days 8 Hours ago by: MitchAlsup

< Conversely: it is not hard to find places where predication to conditionally execute a "few" instructions (or not) is beneficial.

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 15 Days 8 Hours ago by: Anton Ertl

Does it help to use __builtin_expect ? - anton
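For readers unfamiliar with the builtin mentioned here, a minimal sketch of how `__builtin_expect` is typically used with GCC or Clang follows. It is not the Emacs code from the thread: the `LIKELY`/`UNLIKELY` macros, `struct node`, and `bounded_length` are hypothetical names chosen for illustration. The hint only tells the compiler which outcome to treat as the common case; whether it then emits a branch or a conditional move is still up to the compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical convenience macros around the GCC/Clang builtin. */
#define LIKELY(x)   __builtin_expect(!!(x), 1)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)

/* Hypothetical singly linked list node, for illustration only. */
struct node { struct node *next; };

/* Count the nodes of a list, capped at `limit` so that a circular
   list cannot loop forever. */
static size_t bounded_length(const struct node *p, size_t limit)
{
    size_t n = 0;
    assert(limit > 0);
    while (LIKELY(p != NULL)) {        /* common case: keep walking */
        if (UNLIKELY(n == limit))      /* rare case: too long or circular */
            return limit;
        n++;
        p = p->next;
    }
    return n;
}
```

Comparing the generated assembly with and without the hints (for example with `-O2 -S`) is one way to see whether the compiler changed its choice.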

Re: finite but unbounded? (thread)

comp.arch

Posted: 15 Days 9 Hours ago by: George Neuner

In fact, the double-slit experiment has been replicated in the 'macro' world using easily visible objects. https://www.youtube.com/watch?v=sGCtMKthRh4

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 15 Days 11 Hours ago by: Terje Mathisen

This has actually been the rule for almost all code since the pentiumPro! It is extremely hard to find micro benchmarks where using CMOV to eliminate a branch is a win, somewhat better for larger/full programs but still very rare. The

Re: Ill-advised use of CMOVE (thread)

comp.arch

Posted: 15 Days 11 Hours ago by: Thomas Koenig

It would probably be good to submit a PR for this. I'm not overly optimistic about the chances of finding a good heuristic for when code can be well predicted by a branch predictor, but it's worth a shot.

Ill-advised use of CMOVE

comp.arch

Posted: 15 Days 11 Hours ago by: Stefan Monnier

We recently bumped into a funny performance behavior in Emacs. Some code computing the length of a (possibly circular) list ended up macroexpanded to something like: for (struct for_each_tail_internal li = { list, 2, 0, 2 };

Architecture comparison (was: Upcoming DFP support in clang/LLVM) (thread)

comp.arch

Posted: 15 Days 12 Hours ago by: Anton Ertl

Both have two-address instructions where both operands can be in memory. The M68k does not have general-purpose registers. Apart from MOVE, all instructions have only one memory operand. VAX has three-operand instructions, and all ope

Re: The purpose of our universe simulation was just made clear to me

comp.arch

Posted: 16 Days 8 Hours ago by: Terje Mathisen

Afair DH wasn't even a full professor at the time GEB came out, due to this thread I've just gone back and read his Wikipedia entry: https://en.wikipedia.org/wiki/Douglas_Hofstadter Wow! Terje PS. My wife Tone & I have taken ballroom

Re: The purpose of our universe simulation was just made clear to me

comp.arch

Posted: 16 Days 8 Hours ago by: Terje Mathisen

It is one of a very small set of books that I have ordered (in hardcover) from the US and paid to have shipped to Norway. Terje

Re: The purpose of our universe simulation was just made clear to me through communication with time.

comp.arch

Posted: 16 Days 8 Hours ago by: Stefan Monnier

I can't recommend it enough, indeed. Stefan

Re: The purpose of our universe simulation was just made clear to me

comp.arch

Posted: 16 Days 9 Hours ago by: Stephen Fuld

Oh, you definitely should. I still remember with joy, over 40 years later, first reading it in my hotel room one evening while working at a customer's site, encountering "Crab Canon", and having my mind just totally blown at the incre

Re: The purpose of our universe simulation was just made clear to me

comp.arch

Posted: 16 Days 9 Hours ago by: David Brown

I knew someone else would have a suggestion when I did not - and when both you and Ivan agree, then I take that as a strong recommendation. I will see about ordering it myself.

Re: The purpose of our universe simulation was just made clear to me through communication with time.

comp.arch

Posted: 16 Days 10 Hours ago by: Stefan Monnier

Indeed, we humans can occasionally encounter non-computable numbers that we can describe (another example is for some (bounded) increasing sequences, where the least upper bound is not computable even though each of the numbers in th

Re: The purpose of our universe simulation was just made clear to me

comp.arch

Posted: 16 Days 13 Hours ago by: Terje Mathisen

It doesn't matter if you care about the details or not, everyone should read Hofstadter's GEB - An Eternal Golden Braid! The book won a Pulitzer Prize as well afair? Terje

Re: The purpose of our universe simulation was just made clear to me

comp.arch

Posted: 16 Days 14 Hours ago by: David Brown

The fun thing is that we can identify and describe some non-computable real numbers without ever having a way to actually compute them! <https://en.wikipedia.org/wiki/Chaitin%27s_constant>

Re: finite but unbounded? (thread)

comp.arch

Posted: 16 Days 16 Hours ago by: David Brown

Is that how you see your position? Certainly it is more generous to yourself than how you come across in your posts. If it is more accurate or not, only you can say - but if it /is/ more accurate, then you do not do yourself justice in

Re: The purpose of our universe simulation was just made clear to me through communication with time.

comp.arch

Posted: 16 Days 18 Hours ago by: Andreas Eder

That paper is by Wolfgang Mückenheim. He is a well known crank. 'Andreas

Re: The purpose of our universe simulation was just made clear to me

comp.arch

Posted: 16 Days 20 Hours ago by: Quadibloc

The paper claims to prove that there exist natural numbers that can't be identified by a finite string of characters. That is simply false. Every natural number, no matter how large, can be identified by a string of digits. The fact tha

Re: finite but unbounded? (thread)

comp.arch

Posted: 17 Days ago by: BGB

Pretty much. The modern solution would probably be throwing an LZ77 variant at it. Though, for a columnar numeric data or similar, LZ77 may not be optimal. For general data compression, some major options (LZ77 based) are: Byte-ori

Re: The purpose of our universe simulation was just made clear to me

comp.arch

Posted: 17 Days ago by: JimBrakefield

Flash news: Dark reals magnify mathematical ignorance. Actually it is worse than that. There are dark natural numbers: https://www.hs-augsburg.de/~mueckenh/Transfinity/Dark%20Numbers.pdf Delving further into the subject leads to politics

Re: The purpose of our universe simulation was just made clear to me

comp.arch

Posted: 17 Days 2 Hours ago by: MitchAlsup

< Planck data shows that something like dark matter is really there. We just don't know how to detect it (other than by watching galactic clusters magnify background galaxies.)

Re: The purpose of our universe simulation was just made clear to me through communication with time.

comp.arch

Posted: 17 Days 3 Hours ago by: Stefan Monnier

The numbers that human will encounter are pretty much all among the computable numbers (of which there are aleph0 "only") anyway. The rest is a bit like dark matter: the theory says it's there but we can't see it. Stefan

Re: The purpose of our universe simulation was just made clear to me

comp.arch

Posted: 17 Days 4 Hours ago by: JimBrakefield

It has been said that almost all integers are large, as opposed to almost all numbers humans encounter are small. The same can not be said for reals as they are uncountable?

Re: finite but unbounded? (thread)

comp.arch

Posted: 17 Days 6 Hours ago by: MitchAlsup

< I was hired to build a compression program for a semiconductor "chip" tester. < It took me a couple of hours. I noticed that there was essentially no correspondence reading left-to-right (or right-to-left) but more than 90% of each colum

Re: finite but unbounded? (thread)

comp.arch

Posted: 17 Days 7 Hours ago by: BGB

If observing a point where time was moving faster (relative to the observer), there would be a higher density of photons, at higher energy levels, leading to a blue-shift. This would be similar to if that point was rapidly approaching

Re: finite but unbounded? (thread)

comp.arch

Posted: 17 Days 7 Hours ago by: Brett

Take a look at Young’s Single Slit Experiment from 1802. You get the same distribution, the effect is interaction with the atoms of the slit. No need for ridiculous theories like negative time. This also explains the horizontal distribut

Re: finite but unbounded? (thread)

comp.arch

Posted: 17 Days 8 Hours ago by: Brett

A more generous and accurate interpretation is that we are all muddling through life. Don’t look too closely at your heroes, as they are just as muddled. ;)

Re: Corona Update 25, Insurance Fraud, Killing the old and sick to

comp.arch

Posted: 17 Days 9 Hours ago by: MitchAlsup

< An ICU nurse told me about 5 months ago that 90% of her ICU patients were over 300 pounds and 100% of them were unvaccinated. < Perhaps COVID is a solution to the problem of taking in more calories than your body burns.

Re: The purpose of our universe simulation was just made clear to me

comp.arch

Posted: 17 Days 11 Hours ago by: Ivan Godard

"Godel, Escher, Bach" - Hofstadter

Re: finite but unbounded? (thread)

comp.arch

Posted: 17 Days 11 Hours ago by: Ivan Godard

+1

Re: finite but unbounded? (thread)

comp.arch

Posted: 17 Days 17 Hours ago by: David Brown

Keep an open mind, but not /so/ open that your brains dribble out. There is a difference between having a bit of healthy scepticism and being critical about picking good sources of information, and the kind of absurd delusions popular

Re: The purpose of our universe simulation was just made clear to me

comp.arch

Posted: 17 Days 17 Hours ago by: David Brown

That is the correct Unicode symbol to use, and is the one I have been using too. (Unicode mathematics symbols are not always the same as the normal language letters they are based on.) Mitch has been using the symbol "ℌ", which is

Re: The purpose of our universe simulation was just made clear to me

comp.arch

Posted: 17 Days 17 Hours ago by: David Brown

We know how many real numbers there are - that's easy. It's the cardinality of the power set of the integers. It turns out, however, that both the continuum hypothesis and its negation are consistent with ZF (and ZFC) set theory. It

Re: finite but unbounded? (thread)

comp.arch

Posted: 18 Days 1 Hour ago by: MitchAlsup

< < In the book "Schrodinger's Kittens" John Gribbin explains essentially how Steven Weinberg won his Nobel Prize. The trick, Weinberg determined, has to do with the interpretation of the Schrodinger equations with negative values in the

Re: finite but unbounded? (thread)

comp.arch

Posted: 18 Days 1 Hour ago by: robf...@gmail.com

More units of time per wavelength equals a lower frequency. I think it would result in apparent red-shifting to the remote observer. Point is I believe that time may not pass at the same rate everywhere at cosmic distances. It could expla

Re: finite but unbounded? (thread)

comp.arch

Posted: 18 Days 5 Hours ago by: Brett

An ape who sees no evil, hears no evil, and speaks no evil. You sound like my astronomy professor who stated that we should spend more money on theorists like him instead of billions on experiments. I pointed out that Theorists are like

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 18 Days 5 Hours ago by: BGB

My preferred style is mostly: All declarations at top of function (like in C89); Either /* ... */ or // Usually /* ... */ for semantic / informational comments. Usually // for commenting out code lines. Will generally u

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 18 Days 6 Hours ago by: Terje Mathisen

I'm not a language lawyer! I just started using single-line comments and block-local variable declarations back when this was allowed in C++ but not in C. There might be some other stuff as well but nothing that specifically comes to m

Re: finite but unbounded? (thread)

comp.arch

Posted: 18 Days 7 Hours ago by: BGB

Yeah. One can't see stuff this far, only guess... So, yeah, at the edge of the light cone, it exceeds light speed, and going past this point, at a large enough distance, the speed could approach infinity. As expansion gets closer to

Re: finite but unbounded? (thread)

comp.arch

Posted: 18 Days 7 Hours ago by: BGB

Yeah, if they existed, they might be easier to find in the Lagrange points... On an actual planet, most/all would likely sink into the planetary core and be effectively undetectable.

Re: finite but unbounded? (thread)

comp.arch

Posted: 18 Days 7 Hours ago by: MitchAlsup

< The Lagrange points are planetary gravitational wells.

Re: finite but unbounded? (thread)

comp.arch

Posted: 18 Days 7 Hours ago by: MitchAlsup

< Once it exceeds the speed of light it is no longer visible to us. C < ∞ They (scientists) have recently had 2 photons collide (at an extraordinarily low rate) At the energy level of these things, one would expect gasses. < Energy disto

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 18 Days 8 Hours ago by: MitchAlsup

< I use: volatile (exceedingly lightly) [u]type_t's // compound literals (lightly) I do not use: inline complex variable length arrays array[n] {I still use K&R C array[];} designated initializers variadic macros restrict IEEE FP extensio

Re: finite but unbounded? (thread)

comp.arch

Posted: 18 Days 8 Hours ago by: BGB

I then thought, if such objects existed, they could pile up within the gravity well of planets, and loosely follow along with the planet, potentially forming things like invisible mountain ranges or similar. If the particles do not int

Re: The purpose of our universe simulation was just made clear to me

comp.arch

Posted: 18 Days 9 Hours ago by: Quadibloc

I'm using the character U+2135, "Alef Symbol", since the Hebrew letter Alef plays tricks with character ordering in display. John Savard

Re: The purpose of our universe simulation was just made clear to me

comp.arch

Posted: 18 Days 9 Hours ago by: Quadibloc

It's true that there are surfaces in Euclidean geometry which follow the laws of Riemannian/spherical or Lobatchevskian/hyperbolic geometry. However, it would _seem_ that questions like _how many well-orderings of the integers are the

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 18 Days 11 Hours ago by: Tim Rentsch

Can you be more specific? Presumably that includes many or most of the features introduced in C99. Do you know which ones (or which ones not)? Do you use any of the features introduced in C11 (or later)?

Re: The purpose of our universe simulation was just made clear to me through communication with time.

comp.arch

Posted: 18 Days 12 Hours ago by: Andreas Eder

Well, it shows that the continuum hypothesis is independent of the other axioms of ZF. If you are a platonist you may think this shows a deficiency in our understanding of set theory. I would say it shows that there is a wide landscape if

Re: finite but unbounded? (thread)

comp.arch

Posted: 18 Days 14 Hours ago by: robf...@gmail.com

I have thought that the apparent red-shift in distant light may be due to time passing faster the further away something is. So the universe may not be expanding at the rate suggested by the red-shift. Time is not a dimension although it

Re: The purpose of our universe simulation was just made clear to me

comp.arch

Posted: 18 Days 14 Hours ago by: Terje Mathisen

Mitch, you are wrong here: The proof that the cardinality of real numbers is greater than Aleph-0, the number of integers (as well as rationals by the triangle countability proof) was one of the first findings of Cantor afaik. (I did re

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 18 Days 14 Hours ago by: Terje Mathisen

Or the alternative "Never Twice the Same Color". It is about as horrible as the prevailing Norwegian electricity setup, with 230-240V 3-phase and no defined ground, just a floating wire in the more or less center of the phase triangle.

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 18 Days 14 Hours ago by: Terje Mathisen

I bought my very first calculator in high school, it had just dropped 40% in price, to NOK 565,- from over 1K. (This was the SR-50A version of the SR-50 which originally cost $170 in 1974). In today's money that is about $400, so a sig

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 18 Days 15 Hours ago by: Terje Mathisen

As I've stated before I tend to use C(+), where all the (+) features have typically been included in more recent updates to the C standard. Ditto. :-) Terje

Re: The purpose of our universe simulation was just made clear to me

comp.arch

Posted: 18 Days 15 Hours ago by: David Brown

Gödel's incompleteness theorem shows that mathematics will /always/ be incomplete. Why? You can do some interesting maths assuming the continuum hypothesis is true. You can do some assuming it is /not/ true. You can do some inter

Re: finite but unbounded? (thread)

comp.arch

Posted: 18 Days 17 Hours ago by: BGB

I guess, idle thoughts: For parts of the universe outside of Earth's light-cone, how do we know that the (effective) expansion rate (from our perspective) does not approach infinity?... Though, I guess, this would depend on the size an

Re: The purpose of our universe simulation was just made clear to me

comp.arch

Posted: 18 Days 18 Hours ago by: David Brown

Re: The purpose of our universe simulation was just made clear to me

comp.arch

Posted: 18 Days 18 Hours ago by: Quadibloc

That just shows that our understanding of set theory is incomplete. What we need is something that makes absolutely clear the exact relationship between the cardinality of the continuum and aleph-one, in the same way that Cantor's diago

Re: finite but unbounded? (thread)

comp.arch

Posted: 18 Days 18 Hours ago by: David Brown

I haven't bothered looking at your links - I am not interested in conspiracy theories and ridiculous scaremongering generalisations about the news. At the bleeding edge, science is always somewhat speculative - data is incomplete, evid

Re: The purpose of our universe simulation was just made clear to me through communication with time.

comp.arch

Posted: 18 Days 19 Hours ago by: Andreas Eder

That is the continuum problem of Cantor. It was solved (in a way) by Gödel and Cohen who proved that it is independent of the axioms of set theory (ZF). So you can have it both ways, whichever you prefer. 'Andreas

Re: The purpose of our universe simulation was just made clear to me

comp.arch

Posted: 18 Days 22 Hours ago by: Quadibloc

Visit my page at: http://www.quadibloc.com/math/infint.htm Cantor's diagonal proof is shown there. John Savard

Re: tiny little computers, was Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 18 Days 22 Hours ago by: John Levine

I can see how that might sort of work, but urrgh. Since Unix processes are pretty cheap, you could use a bunch of processes to get the effect of a bigger program, e.g., the C compiler was three passes plus the assembler, with each pass

Re: tiny little computers, was Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 19 Days ago by: EricP

I vaguely recall reading about PDP-11 OS, RSX I think, used thunking for a semiautomatic segment swapping. IIUC you manually arrange the subroutines into 8 kB segments and then the linker and run-time would use thunks to check if the pr

Re: tiny little computers, was Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 19 Days ago by: John Levine

The /10 and /05, which were the same machine, were a lower cost reimplementation of a /20 and had as far as I recall the same instruction set as the /20. The terminal emulator for our early bitmap terminals ran on a /10. Unix never did

Re: tiny little pages, was Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 19 Days ago by: John Levine

Sure, but each segment was swapped as a unit, not paged.

Re: tiny little computers, was Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 19 Days 1 Hour ago by: Ivan Godard

Hardware segs were on the B5000 circa 1961

Re: finite but unbounded? (thread)

comp.arch

Posted: 19 Days 3 Hours ago by: Brett

Dark matter does not exist: https://www.scientificamerican.com/article/dark-matter-may-be-missing-from-this-newfound-galaxy-astronomers-say/ And scientists have known this since Kalnajis (1983) who did the correct disc math instead of s

Re: tiny little computers, was Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 19 Days 3 Hours ago by: MitchAlsup

< PDP-11/20 did not have separate code/data. /40, /45, and /75 did; don't remember about the /10 and /05. < < < < RSTS allowed user programs to perform their own I/O (via UniBus access). < < < I got an entire BASIC interpreter in less than

Re: tiny little computers, was Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 19 Days 3 Hours ago by: BGB

OK. I wasn't sure. I know that both 32-bit x86, and ARM, had this. M68K apparently had it as sort of a chipset feature (until it was integrated with the CPU). Also SH4 had a software managed TLB, and the BJX2 TLB design is still part

Re: tiny little computers, was Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 19 Days 4 Hours ago by: Bill Findlay

If by "the time", you mean 1972, when the 11/45 went on sale, ICL's System 4/75, and the 1900 models 4A and 6A had been paging for years. By the time of Unix V5 the entire 2900 Series had paging and segmentation, and there were many pagi

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 19 Days 5 Hours ago by: BGB

More viable than DFP at least... I can at least imagine how one could do it within a plausible cost range in hardware. It would mostly just be slower and more expensive than Binary64. Same issue as my Binary96 format (truncated Bina

Re: tiny little computers, was Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 19 Days 5 Hours ago by: John Levine

Right. The PDP-11's didn't have paging, just a much cruder scheme that mapped the 64K address space as eight 8K chunks, which was much too big to treat as pages. At the time swapping was quite common. CTSS did it as did the PDP-6/10 opera
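As a rough illustration of the mapping described here, a 16-bit virtual address covering 64K splits into a 3-bit chunk number (eight chunks) and a 13-bit offset (8K bytes each). The sketch below is a simplified model, not the actual PDP-11 memory management unit: `struct seg_map`, `translate`, and the flat 32-bit base addresses are assumptions made for the example, and it ignores length limits and access control.

```c
#include <assert.h>
#include <stdint.h>

enum {
    OFF_BITS = 13,               /* 2^13 = 8K bytes per chunk      */
    SEG_BITS = 3,                /* 2^3  = 8 chunks in 64K         */
    NSEGS    = 1 << SEG_BITS
};

/* Hypothetical per-process map: physical base of each 8K chunk. */
struct seg_map { uint32_t base[NSEGS]; };

/* Top 3 bits of the virtual address select the chunk. */
static unsigned seg_index(uint16_t va)
{
    return (unsigned)va >> OFF_BITS;
}

/* Low 13 bits are the byte offset within the chunk. */
static unsigned seg_offset(uint16_t va)
{
    return va & ((1u << OFF_BITS) - 1u);
}

/* Simplified translation: chunk base plus offset. */
static uint32_t translate(const struct seg_map *m, uint16_t va)
{
    unsigned s = seg_index(va);
    assert(s < NSEGS);           /* always holds for a 16-bit address */
    return m->base[s] + seg_offset(va);
}
```

With only eight entries to work with, each one covers a large share of the address space, which is why these units were swapped whole rather than demand-paged.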

Re: The purpose of our universe simulation was just made clear to me

comp.arch

Posted: 19 Days 6 Hours ago by: Ivan Godard

You are mis-remembering. The number of integers is the same as the number of *rationals*, but less than the number of reals, and it is unknown whether there is a number between that of integers and that of reals. For any set S, the po

Re: The purpose of our universe simulation was just made clear to me

comp.arch

Posted: 19 Days 6 Hours ago by: MitchAlsup

< I was taught (1974) that the number of integers and the number of reals is the same both ℌ₀. You have to find a counting problem that causes k×ℌ₀×ℌ₀ = ℌ₁ (for any

Re: tiny little computers, was Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 19 Days 7 Hours ago by: BGB

Looks some, yeah, this stuff is a bit more minimal than I have been tending to be doing in TestKern and similar. I guess, in this sense, it probably makes a little more sense how they fit stuff into less space. Also apparently Unix v

Re: The purpose of our universe simulation was just made clear to me

comp.arch

Posted: 19 Days 7 Hours ago by: Quadibloc

Of course, we know what ℵ₀ is, the number of years in forever - the cardinality of the natural numbers. While we also have a definition for ℵ₁ - which is the number of well-orderings of the natural numbers - we don't know if the

Re: tiny little computers, was Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 19 Days 7 Hours ago by: John Levine

Our machine at Yale had core but we also had a third party add-in cache card which sped it up considerably. The 11/70 was an 11/45 with a built in cache and a new memory bus that allowed 4MB of RAM.

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 19 Days 9 Hours ago by: Thomas Koenig

They're far too often at the version that causes grief, which is why snaps were invented. Never mind that you should not even have a shared library if it is only used by a single executable. All it does then is add overhead.

Re: tiny little computers, was Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 19 Days 9 Hours ago by: Bill Findlay

Then wikipedia is almost right. The 11/45 could have up to 128KW of memory, most of which had to be core. 11/45 RAM, either ~450ns MOS or ~300ns bipolar, was limited to 32KW. We had 32KW of MOS which we put at the low end of the physica

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 19 Days 10 Hours ago by: Thomas Koenig

These guys were really good, and they made a _lot_ of the limited resources that they had. The lack of resources was a prime reason why UNIX was so lean - they had no room for unnecessary features. Ken Thompson makes mention of that in

Re: Fortran archaeology, Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 19 Days 10 Hours ago by: MitchAlsup

< Just like any dynamically linked library (hierarchy). You dynamically load each format subroutine as it gets called--that is all the calls were part of the object module but all of the called subroutines were dynamically linked individua

Re: tiny little computers, was Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 19 Days 11 Hours ago by: EricP

According to Wikipedia, the 11/45 was introduced in 1972 and could have 256 kB of semiconductor memory. A chip of that day was the Intel 1103 1kb*1 DRAM in an 18 pin DIP, takes say 0.5 sq" (inch) of board space, plus 0.5 sq" for wires.

Re: tiny little computers, was Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 19 Days 13 Hours ago by: Robert Swindells

The source to UNIX v7 is available, you could build it for your CPU.

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 19 Days 15 Hours ago by: Quadibloc

No, it stands for National Television Standards Committee. However, it was indeed susceptible to phase shift issues, which was what had led to the Europeans investigating ways to mitigate them, leading to PAL (Phase Alternation Line) and

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 19 Days 16 Hours ago by: Michael S

23.6B per Quarter! Total server market, as estimated by Gartner, is more like 100B USD per year. That's, according to my understanding, not including servers that hyperscalers order from ODMs. We can only guess the size of this chunk. M

Re: tiny little computers, was Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 19 Days 17 Hours ago by: aph

It's not that mysterious. The source code for UNIX 6th Edition is available here https://warsus.github.io/lions-/ Andrew.

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 19 Days 17 Hours ago by: John Dallman

However, we have teams who are absolutely convinced that using C++17 and C++20 will improve their developers' productivity, for their particular arcane field of programming. It's hard to gainsay them in their specific case. John

Re: tiny little computers, was Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 19 Days 17 Hours ago by: BGB

Curious then... I am also going to guess that the 16K x 1b or 64K x 1b DIP16 RAM chips weren't a thing yet... Best I can tell is that they caught on "some time in the 70s" (replacing magnetic core memory which had in-turn replaced d

Re: finite but unbounded? (thread)

comp.arch

Posted: 19 Days 18 Hours ago by: David Brown

Well, that's the current theory. If you look at the curves according to scientists' best understanding, the speed of expansion has varied - an explosion at the start, then a ridiculous "inflation" speed, then relatively constant for a

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 19 Days 20 Hours ago by: Thomas Koenig

I did some digging. Apparently, IBM is selling around 2 billion dollars' worth of POWER systems per year, according to https://www.itjungle.com/2022/03/07/the-low-down-on-ibms-power-systems-sales/ compared to a total server market of aro

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 19 Days 20 Hours ago by: BGB

With NTSC, compatible would have been doing what they did (with colorburst modulation). In some of my Sci-Fi stories, I had imagined a world where color TV had, instead of colorburst+QAM, or color-keyed frames, had instead worked by i

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 19 Days 20 Hours ago by: robf...@gmail.com

I remember those small red LED calculators when they became popular. Built a two-bit adder out of transistors when I was a teenager, which was a bit retro. Having become interested in digital electronics. Bought a bunch of digital electro

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 19 Days 20 Hours ago by: Thomas Koenig

Doesn't NTSC stand for "Never The Same Color"?

Re: Fortran archaeology, Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 19 Days 20 Hours ago by: Thomas Koenig

Sure. FORTRAN 77 did not have derived types, these were introduced in Fortran 90, and derived type I/O was only added in Fortran 2003 (and its specification is a mess even now). But if you didn't know in advance which of the pre-selecte

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 19 Days 22 Hours ago by: Quadibloc

This was actually tried for television - by CBS. Eventually, though, the FCC reconsidered, and NBC's color system, "compatible color", won out, as NTSC. Europe waited a while and went with PAL and SECAM. John Savard

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 19 Days 23 Hours ago by: Quadibloc

Ordinary people did interact with electronic devices personally. Even in the 1920s, whenever they turned on the radio, because it used vacuum tubes to amplify the signal. What most people didn't do personally in the 1960s, though, was

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 19 Days 23 Hours ago by: Quadibloc

Yes, that's mostly right. One additional detail, though, is that electronic amplification is required to play 45 RPM and 33 RPM records; only the 78 RPM records could be played mechanically. Also, there was no such thing as mechanical

Re: Graphics in the old days, was Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 19 Days 23 Hours ago by: John Levine

Uh, no. Until the early 1970s, computer graphics used an oscilloscope screen with the images drawn by a processor that interpreted a display list that showed points, lines, and maybe circles. A few of the fancier ones had some way to do t

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 19 Days 23 Hours ago by: BGB

I meant in the "modern sense", where one is prone to develop an ever increasing mass of old electronics and cables. And, a world where everyday people could interact with electronics devices and similar directly, rather than, say, read

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 20 Days ago by: Bill Findlay

Believe it or not, we had one of those devices for a grad student to play with. Quite what voice synthesis had to do with screws I never fully understood. Yes, our 11/40 was replaced by a 128K 11/45. It easily supported 16 users. (Wit

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 20 Days 1 Hour ago by: BGB

OK. Not that sure of the specifics, but I think these wires were used on the modulator boxes of some of the early game consoles (Atari 2600 and similar). Not really dealt with this console myself, but IIRC I watched the AVGN guy ran

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 20 Days 1 Hour ago by: Brian G. Lucas

Bill, I must be older. I picked up 5th edition Unix by going to Ken and Dennis's lab in NJ and copying a RK05. It all fit on one, with source (because all the comments had been deleted). I had a second RK05 so they copied some other

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 20 Days 1 Hour ago by: Ivan Godard

The Mary2 compiler compiled itself on a DG Nova1200 with 64k of core that you shared with the OS. I was spendthrift at that - compare the Algol60 compiler on the GEAR: 1K words, and a drum that could swap memory and a track.

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 20 Days 2 Hours ago by: Bill Findlay

This discussion makes me feel old (perhaps because I am 8-). Unix V6 ran in 48KW, on two RK05 disks, each of 2.5MB. I ran it on an 11/40 of that specification in 1976. The 11/40 addressed up to 128KW, of which 124KW were usable for RAM,

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 20 Days 2 Hours ago by: Stephen Fuld

snip You really ought to learn some history. That "no real tech" was sufficient to design, build and control rockets that put men on the moon. Record players date to long before the 1960s. There were other color movies even befor

Re: finite but unbounded? (thread)

comp.arch

Posted: 20 Days 2 Hours ago by: Brett

“Universe grew faster than the speed of light” Universe speeding up, etc. These fools just can’t stop dividing by zero and violating their own rules. Red shift by distance is better explained by decay. There is a good paper on it.

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 20 Days 4 Hours ago by: BGB

Yeah. I would not exist for around another decade after this... Was initially confused some trying to narrow down just what sort of hardware stats people were dealing with at the time. It looks like: PDP-11: 8x 16-bit Va

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 20 Days 4 Hours ago by: Quadibloc

There _was_ a black and white CRT that was used - mostly with the LINC-8 and pdp-12 - which worked with a raster display, using a 4 by 6 matrix for characters, the VR12. Yes; one used 300 ohm cable from aerials, as opposed to 75 ohm im

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 20 Days 5 Hours ago by: MitchAlsup

< When forced to use C++, I use the C subset. And I still use printf() instead of <<.

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 20 Days 5 Hours ago by: BGB

I don't get the point personally. Almost better I think to stick with the minimal feature set needed to accomplish a given task (being conservative about what is used and when). One could almost assume sticking to C90 for pretty much e

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 20 Days 5 Hours ago by: Quadibloc

The S-100 backplane dates specifically from December 1974, when the January 1975 issue of Popular Electronics hit the newsstands; it is the backplane of the Altair 8800 computer. How could anyone not remember this? (Oh, yes, the obvious

Re: Fortran archaeology, Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 20 Days 5 Hours ago by: MitchAlsup

< Yes, but you could not invent a random-precision floating point type in a character string. You were restricted to the format types already installed. < It is one thing to have to compile (or interpret) a known set of format capabilities

Re: tiny little computers, was Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 20 Days 5 Hours ago by: John Levine

Yes, we did. The most memory you could put on a PDP-11/45 was 248K, the 256K bus address space minus 8K for the I/O devices. Each process was limited to 64K of code and 64K of data. Memory was expensive and many real systems had 128KB

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 20 Days 5 Hours ago by: BGB

OK, this turns it back into being a mystery how it could fit... When I was trying to gather information, looking up PDP-11 was turning up things like PDP-11 processor boards for an S-100 backplane with 2 MB of RAM and similar. Thoug

Re: Fortran archaeology, Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 20 Days 5 Hours ago by: John Levine

Sure it did. Fortran 77 let you put your format statement in a character variable. There weren't that many format specifiers, so the F77 compiler I wrote just put them all in the library and interpreted formats at runtime, both static

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 20 Days 6 Hours ago by: MitchAlsup

< The FORTRAN of that era did not have those. I don't know what the modern FORTRANs have. But our (S.E.L.) FORTRAN had a format compiler that read character strings and emitted a sequence of code that performed formatting chores (much of

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 20 Days 6 Hours ago by: Thomas Koenig

How did you deal with formats created at run-time? Shared libraries. You can see how well they work from the proliferation of dockers, snaps, and whatnot.

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 20 Days 8 Hours ago by: John Dallman

My employers have various teams that get bees in their bonnets about C++ features, though some teams definitely do this more than others. There's enough code shared between different products that it's worth coordinating company-wide.

Re: finite but unbounded? (thread)

comp.arch

Posted: 20 Days 8 Hours ago by: robf...@gmail.com

Where did the original mass come from? Why would there be only one explosion? I forget how e=mc^2 was derived. I believe it was a very close approximation power series. e = mc^2 + nc + k ( where nc and k can be dropped). A circle does n

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 20 Days 9 Hours ago by: MitchAlsup

< In 1982 I was on a project that built a FORTRAN runtime library. We built the WRITE routines so you only pulled in the data formatters you actually used, not the entire suite of potentials. < Computers are more flexible today, why can't

Re: finite but unbounded? (thread)

comp.arch

Posted: 20 Days 9 Hours ago by: Thomas Koenig

Citation needed.

Re: finite but unbounded? (thread)

comp.arch

Posted: 20 Days 9 Hours ago by: MitchAlsup

< What does it matter if the visible parts of the universe are only a small portion of the entire universe ? < < Then where is that new mass coming from ? If you believe in e=mc^2, then where is the energy used to create that mass coming fr

Re: finite but unbounded? (thread)

comp.arch

Posted: 20 Days 9 Hours ago by: MitchAlsup

< But the universe's expansion is accelerating--right now. <

Re: finite but unbounded? (thread)

comp.arch

Posted: 20 Days 12 Hours ago by: robf...@gmail.com

This is a bit off-topic. I think our cognition of the universe has more to do with our ability to perceive than the characteristics of the universe itself. What we are measuring is essentially human perceptual limits. Do we really unders

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 20 Days 12 Hours ago by: Stephen Fuld

snip Not strange at all once you understand the history. The 8 bit "byte" was "invented" by IBM in the early 1960s as part of the S/360 development. Even then, there was disagreement within IBM over how many bits to use for a charac

Re: finite but unbounded? (thread)

comp.arch

Posted: 20 Days 14 Hours ago by: David Brown

The fact that portions of the universe are not /currently/ observable does not mean it is unbounded. Even if (as current measurements and theory suggest) it continues to grow, that does not in itself imply there is no bound to its siz

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 20 Days 15 Hours ago by: Quadibloc

Did such machines exist? Well, you could get an IBM 360/195 with up to 2 MB of RAM. And even up to 16 MB of RAM if you accepted the special slow core. If we discount the PDP-7 on which Unix was originally developed, however, on the basi

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 20 Days 16 Hours ago by: Michael S

What about quad-precision binary floating point? Right now, it seems that on x64 Linux __float128 is supported by the compiler. However, I was unable to figure out whether the run-time support library is provided by LLVM or the compiler relies on syst

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 20 Days 17 Hours ago by: Thomas Koenig

This turns out not to be the case. The PDP-11 they started serious UNIX development on had 24 kB of memory and a half-megabyte disk. Source: UNIX: A History and a Memoir, Brian Kernighan, page 52.

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 20 Days 18 Hours ago by: Terje Mathisen

I read the proposal: due to the need for cross-compiling and having multiple targets, clang will be forced to include code (software emulation) for both BID and DPD encodings, as well as conversions between them. This will probably l

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 20 Days 18 Hours ago by: BGB

For a small team or single-developer C compiler, trying to play catch-up and implement all this stuff is basically no-go. It is more viable, say: Mostly or exclusively focus on C; Cherry pick those features which seem useful, and ignor

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 20 Days 19 Hours ago by: BGB

Not every compiler feature necessarily needs to be present in hardware. For example, C was/is often compiled for machines where only a select few arithmetic operators could actually be implemented in hardware, and pretty much eve

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 20 Days 19 Hours ago by: John Dallman

IBM is abandoning its own C and C++ compilers and replacing them with Clang. Presumably Clang rather than GCC because of the more permissive license for vendor-specific add-ons. The IBM XL family of compilers was stopped at C++98 for q

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 20 Days 21 Hours ago by: Thomas Koenig

IBM is funding a lot of open source compiler development. In software especially, a little money (by corporate standards) can go a long way in getting an excellent result. This is why POWER is a primary platform for gcc, for example. (an

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 20 Days 21 Hours ago by: Thomas Koenig

Nice floating point format :-) Are there any sources for the number of POWER processors sold these days?

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 20 Days 23 Hours ago by: Quadibloc

Well, if GCC has it, clearly clang has to have it too! Otherwise, clang would fall behind, and no longer be an active competitor to GCC! John Savard

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 20 Days 23 Hours ago by: John Levine

Maybe, but why clang? There's already DFP support in the IBM compilers people are likely to use on machines with DFP hardware and in gcc.

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 21 Days ago by: Guillaume

Yes, but you'll be able to represent this fraction exactly using DFP. =)

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 21 Days ago by: MitchAlsup

< Probably banks--COBOLholics.

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 21 Days ago by: John Levine

I was wondering who's funding this. Who thinks it's worth the effort? R's, John

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 21 Days 1 Hour ago by: MitchAlsup

< Representing 0.000,1% of chips containing CPUs; sold annually.

Re: finite but unbounded? (thread)

comp.arch

Posted: 21 Days 1 Hour ago by: MitchAlsup

< The amount of energy in the Big Bang was finite. Thus the energy or mass of the universe must be finite. < However, during inflation the scale of the universe grew faster than the speed of light, so there are significant portions of the

finite but unbounded?

comp.arch

Posted: 21 Days 3 Hours ago by: Ivan Godard

@Mitch: Why not one of the other possibilities?

Re: Upcoming DFP support in clang/LLVM (thread)

comp.arch

Posted: 21 Days 4 Hours ago by: Quadibloc

According to that, the project has just begun. It is reasonable that this project _should_ begin, given that the current C standard has provision for DFP, and there are some machines out there with DFP support. John Savard

Re: The purpose of our universe simulation was just made clear to me

comp.arch

Posted: 21 Days 5 Hours ago by: MitchAlsup

< Neither of these is true in a mathematical sense. Both of these are true in a computer arithmetic sense. < So, it is not math that has the problem, but computer arithmetics. < Off your lithium again ? < Universe is finite but unbounded.

Re: The purpose of our universe simulation was just made clear to me

comp.arch

Posted: 21 Days 5 Hours ago by: David Brown

Pick /one/ hobby: 1. Using the internet. 2. Recreational pharmaceuticals. Choose whatever makes you feel good, but please stop mixing them.

Re: 0.0 / 0.0 = -NAN ?

comp.arch

Posted: 21 Days 5 Hours ago by: MitchAlsup

< < Some calculation can produce a NaN, and then overly aggressive optimization creates the opportunity for the program to simply fail unexpectedly. No compiler should ever remove a comparison designed to ferret NaNs from values.

Re: 0.0 / 0.0 = -NAN ?

comp.arch

Posted: 21 Days 5 Hours ago by: MitchAlsup

< A == A -> FALSE means A is a NaN. < Given that NaNs exist, NaN < A can never be considered to be !(NaN >= A) < What happens is that these kinds of comparisons change which clause the NaNs go into (should be the else clause but inversion
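A small C sketch of the point above, assuming IEEE 754 doubles and a compiler that has not been told to ignore NaNs (for example via -ffinite-math-only). The function names are made up for illustration. It shows that `x < a` and `!(x >= a)` agree for ordinary values but disagree when x is a NaN, which is how inverting a comparison moves NaNs into the other clause.

```c
#include <assert.h>

/* Direct comparison: false whenever x is a NaN. */
int is_less(double x, double a) { return x < a; }

/* "Inverted" form: true whenever x is a NaN, because x >= a is false. */
int is_not_greater_equal(double x, double a) { return !(x >= a); }

/* Hypothetical helper that documents the behaviour with assertions. */
void check_nan_ordering(void) {
    volatile double zero = 0.0;
    double nan_value = zero / zero;   /* a NaN on IEEE 754 hardware */

    /* Ordinary values: both forms give the same answer. */
    assert(is_less(0.5, 1.0) == is_not_greater_equal(0.5, 1.0));
    assert(is_less(2.0, 1.0) == is_not_greater_equal(2.0, 1.0));

    /* NaN: the two forms disagree. */
    assert(is_less(nan_value, 1.0) == 0);
    assert(is_not_greater_equal(nan_value, 1.0) == 1);
}
```

Calling check_nan_ordering() should pass silently under those assumptions; a compiler run with NaN-ignoring options may treat the two forms as interchangeable.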

Re: 0.0 / 0.0 = -NAN ?

comp.arch

Posted: 21 Days 5 Hours ago by: Michael S

Are you sure? According to my reading of the standard, result of 0/0 has to be NaN, but there are no requirements for sign of the NaN. From the standard: "When either an input or result is NaN, this standard does not interpret the sign of

Upcoming DFP support in clang/LLVM

comp.arch

Posted: 21 Days 6 Hours ago by: Ivan Godard

https://discourse.llvm.org/t/rfc-decimal-floating-point-support-iso-iec-ts-18661-2-and-c23/62152

Re: 0.0 / 0.0 = -NAN ?

comp.arch

Posted: 21 Days 9 Hours ago by: Anton Ertl

If they specify to the compiler that they don't use NaNs, always returning 0 for isnan() is correct. Where's the danger? Note that the gcc manual says about this option: This option is not turned on by any -O option since it can result

Re: 0.0 / 0.0 = -NAN ?

comp.arch

Posted: 21 Days 10 Hours ago by: Niklas Holsti

Rather, and as you no doubt meant, with the silly C double-equal: int isnan(double x) { return !(x==x); }

Re: 0.0 / 0.0 = -NAN ?

comp.arch

Posted: 21 Days 10 Hours ago by: Thomas Koenig

Both are somewhat dangerous, as people may take the code one day and run it through a compiler which removes these tests, for one reason or another, specifying an option like -ffinite-math-only being one of them.
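To make the concern concrete, here is a hedged C sketch of two NaN tests; the names are invented for illustration and IEEE 754 binary64 doubles are assumed. The first is the self-comparison idiom discussed in this thread. The second inspects the bit pattern with integer operations only, so it does not rely on floating-point comparison semantics; whether a given compiler leaves it alone under options like -ffinite-math-only is an assumption, not a guarantee.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Self-comparison idiom: only a NaN compares unequal to itself. */
int isnan_by_compare(double x) { return x != x; }

/* Bit-pattern test: exponent field all ones and a non-zero fraction. */
int isnan_by_bits(double x) {
    uint64_t bits;
    assert(sizeof bits == sizeof x);          /* binary64 layout assumed */
    memcpy(&bits, &x, sizeof bits);           /* well-defined type pun */
    return (bits & 0x7FF0000000000000ULL) == 0x7FF0000000000000ULL
        && (bits & 0x000FFFFFFFFFFFFFULL) != 0;
}
```

Infinities share the all-ones exponent but have a zero fraction, so isnan_by_bits reports them as non-NaN.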

Re: 0.0 / 0.0 = -NAN ?

comp.arch

Posted: 21 Days 15 Hours ago by: Anton Ertl

You assume method to this madness. However, x <> NaN is specified to produce true (in contrast to the other 5 conventional comparison operators). Seems overly complicated int isnan(double x) { return x!=x; } Even without the extra ma

Re: 0.0 / 0.0 = -NAN ?

comp.arch

Posted: 21 Days 18 Hours ago by: Terje Mathisen

Not at all: By definition, comparing anything to NaN should return false. This is actually a way to determine if a number _is_ a NaN! It is better to have a supported isnan(x) test function, but you can write your own like this: bool

Re: 0.0 / 0.0 = -NAN ?

comp.arch

Posted: 21 Days 19 Hours ago by: Anton Ertl

Actually, both comparisons produce false. - anton

Re: 0.0 / 0.0 = -NAN ?

comp.arch

Posted: 21 Days 23 Hours ago by: MitchAlsup

< Careful:: < -0.0 × +infinity does equal -NaN.

Re: Changing the Width of Memory is Easy (thread)

comp.arch

Posted: 22 Days 19 Hours ago by: Quadibloc

And asking myself the question of why I didn't do it that way all along, instead of going to the somewhat elaborate lengths described at http://www.quadibloc.com/arch/per14.htm has allowed me to articulate what the "problem" was I was

Re: Changing the Width of Memory is Easy (thread)

comp.arch

Posted: 22 Days 19 Hours ago by: Quadibloc

Whatever the problem is, I think I've finally, at long last, found an acceptable solution, as shown at the bottom of the page: http://www.quadibloc.com/arch/per14.htm However, that question _is_ food for thought. What _is_ the problem?

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 23 Days 1 Hour ago by: Quadibloc

(quoting Anton Ertl) However, while this shows that he wasn't _quite_ right, it hardly refutes Anton Ertl's main point. Since instruction sets that include decimal floating-point as a feature on a computer which does binary arithmetic (

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 23 Days 2 Hours ago by: Quadibloc

Another example, also predating IBM's modern version of DFP, is the Wang VS series of computers, which had decimal floating point based on packed decimal, with the format otherwise the same as the IBM 360's normal floating-point. John Sa

Re: Changing the Width of Memory is Easy (thread)

comp.arch

Posted: 23 Days 2 Hours ago by: Quadibloc

And now, in a Kekulé benzene moment, it has finally come to me how a conventional dual-channel arrangement, with each channel 128 bits wide, can be used, and meet the conditions that I have! It just required a little bit of original thi

Re: Finally, Some "Good" News (thread)

comp.arch

Posted: 23 Days 4 Hours ago by: antispam

Unfortunately this is very short on details. If they built an ideal diode, then this would be a huge revolution for power generation (rectify Johnson noise to beat the second law of thermodynamics). If this is non-ideal, in what sense is it "one-wa

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 23 Days 5 Hours ago by: Anton Ertl

Interesting, but I expect that they did not implement IEEE 754 decimal formats (which AFAIK were only standardized in the 2008 revision). So the only hardware implementation of that remains IBM's. - anton

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 23 Days 6 Hours ago by: mac

Not exactly nobody. Honeywell 6600 (later Honeywell DPS8) had decimal floating point in the extended instruction set. Binary exponent, though. Implemented in hardware in 6600, microcode in DPS8. A variant of this architecture ran Multics.

Re: 0.0 / 0.0 = -NAN ?

comp.arch

Posted: 23 Days 8 Hours ago by: MitchAlsup

< IEEE 754 states:: division by zero creates properly signed NaNs. +0.0/+0.0 = +NaN -0.0/+0.0 = -NaN +0.0/-0.0 = -NaN -0.0/-0.0 = +NaN < One of those 0.0s got a negative sign. But How ? Must be a Delphi thing.

Re: Finally, Some "Good" News (thread)

comp.arch

Posted: 23 Days 8 Hours ago by: MitchAlsup

< If only room temperature was that of liquid helium................

Re: Changing the Width of Memory is Easy (thread)

comp.arch

Posted: 23 Days 10 Hours ago by: Quadibloc

And now this page has been augmented with an additional diagram, showing an additional alternative: a conventional dual-channel arrangement, with each channel 192 bits wide. John Savard

Finally, Some "Good" News

comp.arch

Posted: 23 Days 11 Hours ago by: Quadibloc

This article on TechSpot https://www.techspot.com/news/94398-physicists-discover-impossible-one-way-superconductor-could-lead.html reports on a laboratory experiment in which a new kind of material was created which could, conceivably, lead

Re: Changing the Width of Memory is Easy (thread)

comp.arch

Posted: 23 Days 19 Hours ago by: Quadibloc

I have finally organized my thoughts on this matter, and come up with two arrangements that meet the conditions I seek; one favors conventional memory widths, and is quad-channel, with each channel being 64 bits wide, the other favors th

Re: Changing the Width of Memory is Easy (thread)

comp.arch

Posted: 24 Days 4 Hours ago by: Quadibloc

With further thought, I see that if I don't need to handle unaligned items wider than 60 bits, I don't need to have the memory with that wide a path to the CPU. Essentially, I'm looking for a way to have a system: Suitable for operati

Re: Changing the Width of Memory is Easy (thread)

comp.arch

Posted: 24 Days 7 Hours ago by: BGB

Presumably, one could do a mapping like: P_LineB = (V_Line>>1)*3 So, V_Line 0/1 map to P_Line 0/1/2, 2/3 to 3/4/5, ... The LSB of V_Line selecting whether to use the low or high half of this 3-line pair. Though, this is assuming th
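The mapping above can be sketched in C as follows. This is only an illustration under stated assumptions: the 64-byte physical line size, the struct, and the function name are all hypothetical, and the "low or high half" is read as the first or second 96 bytes of a three-line group.

```c
#include <assert.h>
#include <stdint.h>

#define PHYS_LINE_BYTES 64u   /* assumed physical line size */

typedef struct {
    uint32_t p_line_base;   /* first of the three physical lines */
    uint32_t byte_offset;   /* offset of the virtual line inside that group */
} phys_loc;

/* Two virtual lines (1.5 physical lines each) share three physical lines. */
phys_loc map_virtual_line(uint32_t v_line) {
    phys_loc loc;
    assert((v_line >> 1) <= UINT32_MAX / 3u);      /* avoid wrap-around */
    loc.p_line_base = (v_line >> 1) * 3u;          /* P_LineB = (V_Line>>1)*3 */
    loc.byte_offset = (v_line & 1u) * (3u * PHYS_LINE_BYTES / 2u);
    return loc;
}
```

With these assumptions, virtual lines 0 and 1 both start from physical line 0 (offsets 0 and 96), and virtual lines 2 and 3 start from physical line 3.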

Re: Changing the Width of Memory is Easy (thread)

comp.arch

Posted: 24 Days 7 Hours ago by: Quadibloc

I don't modify the memory to accommodate this, except for the fact that the memory can handle unaligned data items with no additional delay, any more than conventional computer systems would do so. This would be true both for 64-bit mode

Re: Changing the Width of Memory is Easy (thread)

comp.arch

Posted: 24 Days 8 Hours ago by: MitchAlsup

< ? < < This is the wrong way of thinking about the problem:: The right way:: If you pay 20ns to get at the data you require (RAS), You should spend 20ns funneling data in/out (multiple CAS). This has nothing to do with the width of DRAM (

Re: Changing the Width of Memory is Easy (thread)

comp.arch

Posted: 24 Days 8 Hours ago by: BGB

With DIMMs, maybe. With typical FPGA boards, it is more often 4/8/16 bits. You might get 32 bits if buying a more expensive FPGA board. In my case, L2 line size is 64B (512 bits), mostly because this had lower overheads for DRAM transf

Re: Changing the Width of Memory is Easy (thread)

comp.arch

Posted: 24 Days 8 Hours ago by: Quadibloc

But what about the case when unaligned data is _inside_ a single block, instead of crossing block boundaries? Then, we don't want to fetch two consecutive words in a single block. So we need to have the odd words on one side and the even

Re: Changing the Width of Memory is Easy (thread)

comp.arch

Posted: 24 Days 9 Hours ago by: Quadibloc

I am going to need a divide by three circuit, then, for conventional memory accesses. But I can save a few pins and go to "four-channel" memory of a sort. That is, I can have only four address buses out of the CPU. Two will be associate

Re: Changing the Width of Memory is Easy (thread)

comp.arch

Posted: 24 Days 9 Hours ago by: Quadibloc

It allows the entirety of an _unaligned_ word to be fetched in a single memory read operation. Otherwise, I would have to do two reads in the case of an unaligned word that crosses a memory word boundary. John Savard

Re: Changing the Width of Memory is Easy (thread)

comp.arch

Posted: 24 Days 9 Hours ago by: Quadibloc

Another thing occurred to me. I figured it would also help if the external memory was dual-channel. But if my goal is to take advantage of the fact that I can choose which word in a DRAM line to fetch first, so that I can get at a specif

Re: Changing the Width of Memory is Easy (thread)

comp.arch

Posted: 24 Days 14 Hours ago by: Theo Markettos

Agreed, depending on the density of the DRAM chip. You want to get as much of a row as your line size allows. John is proposing a scheme where words aren't a power-of-two sized, eg 48 bits. 48 = 3*2^4, so that prime factor of 3 will m

Re: Changing the Width of Memory is Easy (thread)

comp.arch

Posted: 24 Days 14 Hours ago by: Michael S

I'm not sure that John can explain what the problem is :( What you call LLC width, most people call cache line size. And, indeed, most popular sizes nowadays are 64B (512bits) and 128B (1024 bits). Both much smaller than DRAM DIMM row (

Re: Changing the Width of Memory is Easy (thread)

comp.arch

Posted: 24 Days 15 Hours ago by: Anton Ertl

It seems to me that cache line sizes are very much influenced by spatial locality (or the lack of it). We have seen bandwidths soar and latencies stagnate, so one might expect cache lines to become longer, yet cache lines on general-purp

Re: Changing the Width of Memory is Easy (thread)

comp.arch

Posted: 24 Days 16 Hours ago by: Theo Markettos

I haven't read it if this came up in a previous thread, but I'm not sure what the problem is. Your DRAM *bus* might be 64 bits wide, but your DRAM rows aren't. Your LLC would ideally be as wide as your DRAM row, or if not a power of two

Changing the Width of Memory is Easy

comp.arch

Posted: 24 Days 17 Hours ago by: Quadibloc

On further reflection, I see that I have been overthinking things a bit. How can one get 48-bit words in a 64-bit world? If one is reconciled to the fact that DRAM is only efficient when blocks of memory are accessed, then it should be cle

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 25 Days 4 Hours ago by: BGB

Haven't really used GMP so don't know that much how it works. I guess I can note that within my compiler, some similar sort of cases have come up, but is handled in different ways at different levels. Decided to leave out a bunch of

Re: The Green Star of Hope for RISC-V (thread)

comp.arch

Posted: 25 Days 7 Hours ago by: MitchAlsup

< If you are good with the → key you can get all the information content in just over 5 minutes...... <

Re: The Green Star of Hope for RISC-V (thread)

comp.arch

Posted: 25 Days 9 Hours ago by: Guillaume

Yep this is nothing really new here. And the fact those are RISC-V cores? Yeah? There are now tens if not hundreds of RISC-V cores out there. The successful chips are still few, though. Talking about an interesting approach with many

Re: The Green Star of Hope for RISC-V (thread)

comp.arch

Posted: 25 Days 17 Hours ago by: Anton Ertl

s/performance/importance/ You see what's important to me:-) - anton

Re: The Green Star of Hope for RISC-V (thread)

comp.arch

Posted: 25 Days 19 Hours ago by: Anton Ertl

If you count commercial success for evaluating performance, then yes, Mill is 1000 times more important than this chip and this chip is 1000 times more important than Mill: 1001*0 = 0 0 = 1001*0 - anton

Re: The Green Star of Hope for RISC-V (thread)

comp.arch

Posted: 25 Days 21 Hours ago by: Quadibloc

That's entirely possible. This chip being sampled may turn out to be not important at all, so if the Mill is successful, it would be more than a thousand times as important. John Savard

Re: The Green Star of Hope for RISC-V (thread)

comp.arch

Posted: 25 Days 21 Hours ago by: Quadibloc

Even so, the article does claim that those cores are real RISC-V cores, with the full instruction set (at least the basic part of it). They may indeed have so little memory available that all they can do is guide the computations of the

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 26 Days 6 Hours ago by: Terje Mathisen

OK. I accept that it does work to keep everything in twos-complement, but in my own libraries I've found it easier to keep the signs separate, partly because that makes it much easier to compare the magnitude of two numbers. Terje

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 26 Days 6 Hours ago by: Terje Mathisen

One of the key differences between BID and GMP is that the 128-bit binary decimal fp is exactly specified, and the 128-bit size is fixed and will therefore always fit in two 64-bit (unsigned) regs. GMPs limbs are by definition the indi

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 26 Days 8 Hours ago by: Michael S

Other ECC. Not Error Correction Codes, but Elliptic Curve Cryptography

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 26 Days 8 Hours ago by: BGB

Don't do this. This is a case where low-high ordering makes more sense. You could simplify things by making all of the bignums be power-of-2 sizes, and by padding inputs to the same size in cases where they differ. An alternative (

Re: IBM features, Extended double precision decimal floating point (thread)

comp.arch

Posted: 26 Days 8 Hours ago by: EricP

It might be easier if memory block instructions were defined like async IO - that the input and output buffers are considered "in use" and must not be touched or modified until the full operation completes. Like async IO, it is software'

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 26 Days 8 Hours ago by: MitchAlsup

< SECDED works with 2^n+n+2 containers. SECDED on 64-bits uses a 128-bit code matrix and uses the 72-bit subset. < Thus, ECC usually works on non-power of 2 sizes, or degenerate next power of 2 sizes. < There are higher forms of ECC that u
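As a quick check of the 2^n + n + 2 rule quoted above, here is a tiny C sketch; the function names are made up for illustration. For n = 6 (64 data bits) it gives the familiar 72-bit container with 8 check bits.

```c
#include <assert.h>

/* Check bits for SECDED over 2^n data bits: n + 1 Hamming bits plus 1 overall parity bit. */
unsigned secded_check_bits(unsigned n) {
    return n + 2u;
}

/* Total container width: 2^n data bits plus the check bits. */
unsigned secded_container_bits(unsigned n) {
    assert(n < 31u);                  /* keep the shift within range */
    return (1u << n) + secded_check_bits(n);
}
```

For example, secded_container_bits(5) evaluates to 39 and secded_container_bits(6) to 72.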

Re: IBM features, Extended double precision decimal floating point (thread)

comp.arch

Posted: 26 Days 10 Hours ago by: Thomas Koenig

Patches welcome.

Re: IBM features, Extended double precision decimal floating point (thread)

comp.arch

Posted: 26 Days 10 Hours ago by: Thomas Koenig

Having read this, it's more a transaction server where the transactions come in via https. But you can call this a web server if you so desire ;-)

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 26 Days 10 Hours ago by: Anton Ertl

Unless you use a complete byte for the sign bit, the overhead is the same as in the code I showed. And if you use a complete byte, you have to load these bytes, and store the result byte, which is also overhead. No overhead only if you

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 26 Days 10 Hours ago by: Michael S

Pre-scaling likely not needed, at least for multiplication. Post-scaling is because upfront you don't know an exact number of words in result. If the number is stored with LS word first then post-scaling only needs to update metadata. Bu

Re: IBM features, Extended double precision decimal floating point (thread)

comp.arch

Posted: 26 Days 12 Hours ago by: EricP

In this case IBM refers to both Deflate and Crypto features as "accelerators" which I suspect is IBM-speak for "even though the hardware is already present in your machine, we charge extra to enable this". While their operations may be t

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 26 Days 12 Hours ago by: Anton Ertl

.... [Again, reformatted] Why would big-integer arithmetic have any scaling? For multiplication, multiplying a two-word integer a with a two-word integer b (giving a four-word integer c) should display all the interesting cases, so I'll
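For readers who want the shape of the two-word by two-word case in code, here is a hedged C sketch of plain schoolbook multiplication on unsigned magnitudes. It is not the code from this post (which is cut off above). It assumes 32-bit words stored least-significant first so that a 64-bit intermediate can hold each partial product, and it leaves out sign handling entirely, which is the part under debate in this thread. The function name is invented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* c[0..na+nb-1] = a[0..na-1] * b[0..nb-1], words least-significant first. */
void mul_words(const uint32_t *a, size_t na,
               const uint32_t *b, size_t nb,
               uint32_t *c) {
    assert(na > 0 && nb > 0);
    for (size_t k = 0; k < na + nb; k++)
        c[k] = 0;

    for (size_t i = 0; i < na; i++) {
        uint64_t carry = 0;
        for (size_t j = 0; j < nb; j++) {
            /* product + accumulated word + carry cannot overflow 64 bits */
            uint64_t t = (uint64_t)a[i] * b[j] + c[i + j] + carry;
            c[i + j] = (uint32_t)t;     /* low word stays in place */
            carry = t >> 32;            /* high word moves up */
        }
        c[i + nb] = (uint32_t)carry;    /* top word of this row */
    }
}
```

With na = nb = 2 the result occupies four words, matching the two-word times two-word case described above.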

Re: IBM features, Extended double precision decimal floating point (thread)

comp.arch

Posted: 26 Days 15 Hours ago by: Anton Ertl

Fortran accepts the operator "×"? Fortran's array sub-language goes mostly in the right direction, but is not typical of programming languages. Also, Thomas Koenig tells us that gcc converts the array notation into loops of scalar oper

Re: The Green Star of Hope for RISC-V (thread)

comp.arch

Posted: 26 Days 23 Hours ago by: Brett

It’s hard to make use of a thousand CPUs as the individual RAM bandwidth is tiny being divided by a thousand. You need lots of medium sized tasks that feed into a decision tree, and you have to map them roughly correctly to die layo

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 27 Days 4 Hours ago by: Michael S

Did you read my question?

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 27 Days 4 Hours ago by: MitchAlsup

void Long_multiplication( uint64_t multiplicand[], multiplier[], sum[], ilength, jlength ) { for( uint64_t i = 0;

Re: IBM features, Extended double precision decimal floating point (thread)

comp.arch

Posted: 27 Days 4 Hours ago by: MitchAlsup

< < You can microcode the execution pipeline OR You can microcode a function unit When you microcode a function unit, every other function unit is free to run other code while one function unit crunches on its current instruction. < Microc

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 27 Days 5 Hours ago by: Michael S

So, what would be your code for, say, multiplication, when numbers are stored as two's complement? Can you sketch it here? Leave out management, memory allocation, prescaling, postscaling, etc... Just show the two inner loops of convolution.

Re: IBM features, Extended double precision decimal floating point (thread)

comp.arch

Posted: 27 Days 5 Hours ago by: Anton Ertl

The question is what benefit microcode buys over architectural code. If you do just the same things, microcode is not any faster on current machines. I remember one case (don't remember if it was an ARM instruction or an Intel instructio

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 27 Days 6 Hours ago by: Anton Ertl

[reformatted for conventional Usenet line length] I don't see why you would need such instructions for implementing twos-complement big-integers. I don't see a single reason for sign-magnitude big-integer representation. - anton

Re: IBM features, Extended double precision decimal floating point (thread)

comp.arch

Posted: 27 Days 6 Hours ago by: John Levine

IBM sure seems to promote linux on Z as a high performance web server: https://www.ibm.com/downloads/cas/POB59BLE

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 27 Days 8 Hours ago by: Michael S

Another reason to prefer sign-magnitude over two-complement as internal format for Big Integer library is weak support for mixed signed-unsigned arithmetic in many instruction sets and even worse support in all popular HLLs that are like

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 27 Days 10 Hours ago by: MitchAlsup

< Yes, indeed. < Given a vector of integers considered as a single big-num, at most 1 of these is signed while the n-1 are unsigned. That is one reason unsigned numbers are more important than signed.

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 27 Days 11 Hours ago by: Stefan Monnier

Ah, sorry. You were talking about the unsigned fixed size integers inside the bignums (what GMP calls the "limbs" IIRC), where I was thinking about the bignums themselves. Stefan

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 27 Days 13 Hours ago by: Terje Mathisen

Because you have to work with arrays of words, and I really don't want those words to be signed! There will of course be a sign bit somewhere, but it does not take any direct part in the core operations which happen on arrays of unsig

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 27 Days 13 Hours ago by: Stefan Monnier

Why *unsigned* binary? Stefan

Re: IBM features, Extended double precision decimal floating point (thread)

comp.arch

Posted: 27 Days 17 Hours ago by: BGB

Does raise the question if there would be a good set of ISA extensions to help with codec tasks. Huffman would be an obvious place to look, but generally this is more bottlenecked by L1 misses and similar than by the ISA itself. Most

Re: IBM features, Extended double precision decimal floating point (thread)

comp.arch

Posted: 27 Days 20 Hours ago by: Thomas Koenig

zlib is also using this. Sure, for a given computer, it could be useful both for file compression and for http decoding. Given the huge disparity in price between a computer based on an Intel or AMD CPU and a zSystem, I simply doubt th

Re: IBM features, Extended double precision decimal floating point (thread)

comp.arch

Posted: 27 Days 21 Hours ago by: Thomas Koenig

You described a decimal floating point with more digits and less exponent range than the ones currently in use.

Re: IBM features, Extended double precision decimal floating point (thread)

comp.arch

Posted: 28 Days ago by: John Levine

That's not surprising. The rule of thumb used to be that CVB and CVD were a win if you were going to do two arithmetic operations.

Re: IBM features, Extended double precision decimal floating point (thread)

comp.arch

Posted: 28 Days ago by: John Levine

There aren't very many deflate/gzip libraries so I would hope they'd have versions of the libraries that use the instruction, but I haven't been sufficiently interested to go look. Possibly, but http potentially does deflate encoding f

Re: IBM features, Extended double precision decimal floating point (thread)

comp.arch

Posted: 28 Days 4 Hours ago by: Quadibloc

In which case, perhaps vector COBOL _should_ be a thing, so that they don't have to keep writing DB/2 in assembler language. John Savard

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 28 Days 4 Hours ago by: Quadibloc

However, given that people did billing and payroll back before DFP was invented, and fixed-point decimal arithmetic suited those applications just fine, *I* am still in doubt about the validity of your point. John Savard

Re: IBM features, Extended double precision decimal floating point (thread)

comp.arch

Posted: 28 Days 4 Hours ago by: BGB

This seems like a possibility, though for decimal fixed-point, one doesn't really need to store the decimal-point per-se, as it can be managed entirely by the compiler (so doesn't really need to exist at runtime). I have realized th

Re: IBM features, Extended double precision decimal floating point (thread)

comp.arch

Posted: 28 Days 5 Hours ago by: MitchAlsup

< < You can ALSO consider: that unlike binary integers, the uses of decimal data tends a lot more to singular than multiple. {Whereas binary integers are used 1.2 times on average--decimal fixed point tend to be used closer to 1.03 times

Re: IBM features, Extended double precision decimal floating point (thread)

comp.arch

Posted: 28 Days 6 Hours ago by: robf...@gmail.com

I am getting the impression that my time would be better spent working on fixed point BCD arithmetic primitives. It would probably be better to have those in the ISA than DFP. For a BCD format 128 bits could be used for 36 significant dig

Re: IBM features, Extended double precision decimal floating point (thread)

comp.arch

Posted: 28 Days 6 Hours ago by: Thomas Koenig

This is something they had previously as "zEnterprise Data Compression". Apparently, you can use this from Java (among others). Seems like it is possible to create a compressed data set on MVS^H^H^HzOs. If so, it makes sense to have t

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 28 Days 6 Hours ago by: BGB

For the scheme I came up with, have currently defined 16 and 32 digit BCD (as 64 and 128 bits). In concept, there could be a 36 or 38 digit 128-bit DPD variant. Say: ( 59: 0): Digits 0..17 ( 63: 60): Digit 18 (123: 64): Digit

Re: IBM features, Extended double precision decimal floating point (thread)

comp.arch

Posted: 28 Days 7 Hours ago by: John Dallman

Bulk database updates? DB/2 can use decimal formats, as can many databases. John

Re: The Green Star of Hope for RISC-V (thread)

comp.arch

Posted: 28 Days 8 Hours ago by: Guillaume

Yep. And the fact all the money seems to go to "AI"/machine learning these days is pretty sad actually. It doesn't look like "hope" much to me. Whether it's RISC-V-based or not doesn't change much here.

Re: IBM features, Extended double precision decimal floating point (thread)

comp.arch

Posted: 28 Days 8 Hours ago by: John Levine

New models of the z series have had some oddly specific additions, like the DEFLATE instruction that does the inner part of gzip and the digital signature instruction that does elliptic curve signing and verifying. I realize those are bot

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 28 Days 9 Hours ago by: Thomas Koenig

That had a lot to do with Backus, if he is to be believed. As the author of Speedcode for the 701, he was acutely aware that people needed floating point. Apparently, he went to lots of meetings for the 704 design, and was a bit disappo

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 28 Days 9 Hours ago by: Thomas Koenig

No. I hear that this is a quaint custom in Leftpondia, where bank transfers between employers and employees are frowned upon. Yes.

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 28 Days 10 Hours ago by: MitchAlsup

< Then we are in violent agreement.

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 28 Days 10 Hours ago by: MitchAlsup

< For example: Say we have an account that is accruing interest over a period of time and someone wants to draw from that account. One takes the starting date, the ending date, does a date modulo, then applies the date-modulo to a floating

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 28 Days 10 Hours ago by: Ivan Godard

Sounds like we are in violent agreement.

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 28 Days 10 Hours ago by: Ivan Godard

I think you mistook my post. OP asserted that the *only* use for DFP was spreadsheets; I offered counterexamples. My point was that there are other uses for DFP, not that billing and payroll are done on spreadsheets.

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 28 Days 10 Hours ago by: MitchAlsup

< The same way the floating point library computes these. You go to the definition of the transcendental, investigate all the suitable algorithms, and choose a series of instructions that perform the required calculation. Some of the time

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 28 Days 10 Hours ago by: MitchAlsup

< < Binary has limitations generally associated with CPU registers (32-bits, 64-bits,...) whereas decimal fixed point generally does not (IBM has 31 decimal digits in 16-bytes). {Aside: it is pretty clear that 31 decimal digits is a bit to

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 28 Days 10 Hours ago by: MitchAlsup

, < No company in their right mind would use a spreadsheet as a billing program. < Billing programs need audit logs, report generation, financial summaries,... You might use a relational database as if it were a spreadsheet--but a) they ar

Re: The Green Star of Hope for RISC-V (thread)

comp.arch

Posted: 28 Days 10 Hours ago by: Thomas Koenig

You didn't mention that this is a chip for machine learning. I suppose that all these "efficiency cores" are doing is driving FMA units. It is not a general purpose CPU.

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 28 Days 10 Hours ago by: MitchAlsup

< < It is a product with 4 digits behind the decimal point (2 from the first operand 2 from the second). COBOL compilers from the 1970s were adept at keeping track of these. Note--the fixed point decimal math calculates (without rounding)
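As a rough illustration of the scale tracking described above, here is a small sketch of decimal fixed point held in binary integers with the scale carried alongside. The `fixed_dec` type, both function names, and the round-half-away-from-zero rule are assumptions made for the example; real billing rules dictate their own rounding.

```c
#include <assert.h>
#include <stdint.h>

/* Decimal fixed point: value = units / 10^scale (hypothetical type). */
typedef struct {
    int64_t units;
    int     scale;   /* digits behind the decimal point */
} fixed_dec;

/* Exact product: the scales simply add, no rounding happens here.
 * Overflow of the 64-bit product is not checked in this sketch. */
fixed_dec fixed_mul(fixed_dec x, fixed_dec y)
{
    fixed_dec r = { x.units * y.units, x.scale + y.scale };
    return r;
}

/* Reduce to new_scale digits, rounding half away from zero
 * (an assumed rule, chosen only for illustration). */
fixed_dec fixed_round(fixed_dec x, int new_scale)
{
    assert(new_scale >= 0 && new_scale <= x.scale);

    int64_t div = 1;
    for (int i = new_scale; i < x.scale; i++)
        div *= 10;

    int64_t q = x.units / div;          /* truncates toward zero */
    int64_t r = x.units % div;
    if (r < 0)
        r = -r;
    if (2 * r >= div)
        q += (x.units < 0) ? -1 : 1;

    fixed_dec out = { q, new_scale };
    return out;
}
```

Multiplying 1721.25 by 3.75 this way gives units 64546875 at scale 4, that is 6454.6875 with nothing lost; rounding back to two digits gives 6454.69.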

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 28 Days 10 Hours ago by: MitchAlsup

< < You misread the context. Binary integers were not in the question, nor was binary floating point:: The question was between decimal fixed point and decimal floating point. < Their decimal fixed point does not provide enough digits (31

The Green Star of Hope for RISC-V

comp.arch

Posted: 28 Days 12 Hours ago by: Quadibloc

https://www.theregister.com/2022/04/22/samsung_esperanto_riscv/ A company called Esperanto Technologies has sampled a new chip it developed to Samsung and other companies. This chip is a RISC-V CPU with both performance and ef

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 28 Days 13 Hours ago by: EricP

The IBM Decimal Arithmetic group moved to this web site: http://speleotrove.com/decimal/ It has links to various decimal packages including open source. You might be able to reuse some of their code.

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 28 Days 16 Hours ago by: Terje Mathisen

The same as for FP? I.e. the most obvious algorithms for exp & log use effectively fixed-point operations all the way, with different polynomials for each range. log2() or log10() starts by extracting the exponent part, bringing the

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 28 Days 16 Hours ago by: Ivan Godard

I have not worked extensively in the commercial world, but in what work I have done I have never seen EXP or LOG actually used. Sure, they show up in the formulae in econ textbooks, but enterprises don't do the mathematically correct c

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 28 Days 17 Hours ago by: Ivan Godard

That's where the "preferred quantum" comes in. The ops can round to the quantum automatically for you, without risk of precision overflow. You only tumble into actual *floating* point if you get overflow in the units you are using, whi

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 28 Days 17 Hours ago by: robf...@gmail.com

I am all for fixed point, I added a fixed point multiply to the ISA a while ago. But how does one handle exponentiation or logarithm functions using fixed point? As needed for mortgage, annuity payments, and other financial projections.

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 28 Days 17 Hours ago by: Michael S

Except when [after a few muls or one div] you reach 34 digits (or 16, for the 64-bit variant) DFP becomes a real floating-point with all advantages and disadvantages of normal FP and with one special disadvantage of coarse precision steps.

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 28 Days 18 Hours ago by: Terje Mathisen

This is the crux right here: Fixed-point decimal, using unsigned binary variables and external scale, will almost always be faster than both binary FP and decimal FP, but you need to hand-code (or get your compiler to do it for you?)

Re: Financial arithmetic, was Extended double precision decimal (thread)

comp.arch

Posted: 28 Days 18 Hours ago by: Terje Mathisen

I did almost the same thing, using the same ideas, in 1982/83: I had a bunch of gift cards issued by my father-in-law's business, when they were returned we had to reimburse the face value minus a (possibly different per customer) surc

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 28 Days 19 Hours ago by: Ivan Godard

Ever get a paycheck? Ever pay a phone bill?

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 28 Days 19 Hours ago by: Ivan Godard

But intermediate results don't. 3.75% of $1721.25 is a product with a different exponent, and the rules say whether you have to keep or round in subsequent calculation. This is not precision-preserving like BFP. It is really auto-scale

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 28 Days 19 Hours ago by: Anton Ertl

IBM's application is to have a USP for their expensive machines that their salesmen and buyers (however they were convinced to buy these machines) can point to as a justification for the decision to buy these machines. Are there any spr

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 28 Days 20 Hours ago by: Anton Ertl

For commercial work it's usually determined by regulations what the exponent is, and it is therefore easy; that's because the regulations are based on the old times when people were computing by hand or with mechanical calculators. E.g.,

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 28 Days 22 Hours ago by: robf...@gmail.com

My decimal float multiplier takes up to about 400 clock cycles to complete a 128-bit multiply on random data. It uses repeated addition with shifting, the standard algorithm. It is using an adder that takes three clocks per addition. It s

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 28 Days 23 Hours ago by: Quadibloc

I agree with you that for any commercial application that I can think of, if decimal arithmetic is required, then integers will be used. After all, floating-point will prevent proper rounding just as effectively as the use of binary. Ho

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 29 Days 1 Hour ago by: MitchAlsup

< < I buy the argument that it is easier for HW to track the exponent when it varies from -308..+308. < But the very vast majority of commercial decimal arithmetic have an exponent of +2 or +3 (almost invariably fixed).

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 29 Days 2 Hours ago by: John Levine

Yeah, that's why von Neumann didn't put floating point in the IAS machine, and probably why Amdahl's group did put it in the IBM 704.

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 29 Days 2 Hours ago by: Ivan Godard

And binary integer is faster than BFP, so if you can track the exponent yourself you should use fixed-point. For both binary and decimal, we let the hardware keep track of the exponents because it's a pain to do it manually.

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 29 Days 2 Hours ago by: Ivan Godard

Commercial work (aside from I/O) is usually pretty trivial. Take $100 from your account - read the balance, a compare for balance avail and a subtract, and write it back. The compare and the subtract are DFP, while the get and put are t

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 29 Days 6 Hours ago by: MitchAlsup

< Assuming decimal arithmetic is 4× faster than decimal floating point:: Can you name a commercial system for which decimal arithmetic is insufficient while decimal floating point would be sufficient ? Remembering this is about speed and

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 29 Days 6 Hours ago by: Quadibloc

It is true that since higher speed is attainable with binary FP, that's what would be used for the truly speed-critical applications, like, say, simulating the workings of a star. But if you need decimal FP for serious work, like, say,

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 29 Days 8 Hours ago by: BGB

There are several algorithms I am aware of. I am not sure if I know which one you have in mind. Binary to Decimal can be done via multiplying by a fixed point value which "squeezes" groups of one or more digits above the decimal point.
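One way to read the "squeeze digits above the point" idea is sketched below for a four-digit value. This is not claimed to be the algorithm any poster had in mind; the function name, the 8.24 fixed-point split, and the constant 16778 (chosen as ceil(2^24 / 1000)) are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Write the four decimal digits of n (0..9999) into out[0..3] as ASCII,
 * most significant first, without any division (hypothetical helper). */
void u32_to_4digits(uint32_t n, char out[4])
{
    assert(n < 10000u);

    /* ceil(2^24 / 1000): n * SCALE is n/1000 in 8.24 fixed point,
     * rounded slightly up. The excess stays below 0.001, less than one
     * unit of the last digit, so it never changes any of the digits. */
    const uint32_t SCALE = 16778u;
    const uint32_t FRAC_MASK = (1u << 24) - 1u;

    uint32_t f = n * SCALE;             /* at most 9999 * 16778 < 2^32 */
    for (int i = 0; i < 4; i++) {
        out[i] = (char)('0' + (f >> 24));  /* digit above the binary point */
        f = (f & FRAC_MASK) * 10u;         /* bring the next digit up */
    }
}
```

Each step costs one multiply by 10, a shift, and a mask. Wider inputs need a wider fraction and a recheck of the error bound.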

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 29 Days 8 Hours ago by: John Levine

Either that or the rest of the work was so trivial that the decimal emulation was all that was left.

Re: Financial arithmetic, was Extended double precision decimal floating point (thread)

comp.arch

Posted: 29 Days 9 Hours ago by: John Levine

Actually, that's how you price any bond in the secondary market. If the current market rate is different from the coupon rate on the bond, which it usually is, the price of the bond goes up or down correspondingly. The same formula shoul

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 29 Days 9 Hours ago by: EricP

I built a bond pricing server and system 25 years ago for a customer. It would listen to the price ticker and reprice the bond securities database in real time and broadcast the price changes to a front end. Written in C on (IIRC) a Penti

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 29 Days 13 Hours ago by: Thomas Koenig

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 29 Days 13 Hours ago by: Terje Mathisen

I agree with pretty much all your estimates above, i.e. the common case of DFPADD of equal-magnitude/exponent values could be done in ~10 cycles, matching the stated 12 for POWER9, but falling far behind when having to rescale. It wou

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 29 Days 15 Hours ago by: Michael S

It would be very dependent on operands. Cases that need no change of scale on either of operands will be very fast. Those that have to multiply one of the inputs by power of 10 will be slower. And those that have to re-normalize (==divide

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 29 Days 17 Hours ago by: Terje Mathisen

You have seen the algorithm I invented 20+ years ago to do really fast conversion from binary to BCD or ASCII? Michael S has tweaked it a bit more since then: You get the full 32-bit unsigned to 10 ASCII digits in less than the time f

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 29 Days 17 Hours ago by: BGB

This is probably a lot more true of Binary FP than Decimal FP ... If done poorly (or at the bare minimum) in hardware, it is also possible for things to be slower than if they were done in software (software having the ability to use

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 29 Days 18 Hours ago by: Terje Mathisen

Sorry, no, it never came up during the 2016-2019 round afair. I strongly doubt Oracle's numbers though: Spending 40% here seems like a clear case of badly written SW. Terje

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 29 Days 18 Hours ago by: Terje Mathisen

I used exactly this split when I wrote my own q&d 128-bit binary FP library over a weekend in Jan 1995: I needed this to verify our sw FDIV workaround code, including the FPATAN replacement. Working with a mantissa having 3 32-bit valu

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 29 Days 18 Hours ago by: Thomas Koenig

Where did you find them? It would be interesting to look at them. For POWER9, I have the handbook before me. It gives 12 cycles latency for daddq (quad add DFP), same as xsaddqp, the version for a 128-bit IEEE floating point add. How

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 29 Days 19 Hours ago by: BGB

2-cycle BCD ADD/SUB (for 16 digits) is doable... This would do 32 digits in 3 cycles. I eventually got my attempt to pass timing, turns out I needed to drop the Carry-Select to operating on 2 digits at a time... I had also switched it

Re: Extended double precision decimal floating point (thread)

comp.arch

Posted: 29 Days 19 Hours ago by: Quadibloc

While that may be true for an experimental floating-point format that is not likely to see much use, in general that wouldn't be a sane option, or even an option at all. Because you will never get the fastest possible speed attainable f

602 recent articles found.

rocksolid light 0.7.2