novaBBS - comp.arch.embedded - Re: How to write a simple driver in bare metal systems: volatile, memory barrier, critical sections and so on

Don Y <blockedofcourse@foo.invalid> wrote:
> On 10/26/2021 5:20 PM, antispam@math.uni.wroc.pl wrote:
> > Don Y <blockedofcourse@foo.invalid> wrote:
> >> On 10/25/2021 2:32 PM, antispam@math.uni.wroc.pl wrote:
> >>> Don Y <blockedofcourse@foo.invalid> wrote:
> >>>> On 10/24/2021 4:14 AM, Dimiter_Popoff wrote:
> >>>>>> Disable interrupts while accessing the fifo. you really have to.
> >>>>>> alternatively you'll often get away not using a fifo at all,
> >>>>>> unless you're blocking for a long while in some part of the code.
> >>>>>
> >>>>> Why would you do that. The fifo write pointer is only modified by
> >>>>> the interrupt handler, the read pointer is only modified by the
> >>>>> interrupted code. Has been done so for times immemorial.
> >>>>
> >>>> The OPs code doesn't differentiate between FIFO full and empty.
> >>>
> >>> If you read carefully what he wrote you would know that he does.
> >>> The trick he uses is that his indices may point outside buffer:
> >>> empty is equal indices, full is difference equal to buffer
> >>
> >> Doesn't matter as any index can increase by any amount and
> >> invalidate the "reality" of the buffer's contents (i.e.
> >> actual number of characters that have been transferred to
> >> that region of memory).
> >
> > AFAIK OP considers this not a problem in his application.
>
> And I don't think I have to test for division by zero -- as
> *my* code is the code that is passing numerator and denominator
> to that operator, right?

Well, I do not test for zero if I know that the divisor must be
nonzero. To put it differently, having zero in such a place
is a bug and there is already enough machinery so that
such a bug will not remain undetected. Having an extra test
adds no value.

OTOH if zero is possible, then handling it is part of the program
logic and a test is needed to take the correct action.

> Can you remember all of the little assumptions you've made in
> any non-trivial piece of code -- a week later? a month later?
> 6 months later (when a bug manifests or a feature upgrade
> is requested)?

Well, my normal practice is that there are no "little assumptions".
To put it differently, code is structured to make things clear,
even if this requires more code than some "clever" solution.
There may be "big assumptions", that is highly nontrivial facts
used by the code. Some of them are considered "well known";
with proper naming in code it is easy to recall them years later.
Some deserve comments or references.

> Do not check the inputs of routines for validity -- assume everything is
> correct (cuz YOU wrote it to be so, right?).

Well, correct inputs are part of the contract. Some things (like
array indices inside bounds) are checked, but in general you can
expect garbage if you pass incorrect input. Most of my code is
of the sort where the called routine can not really check validity of
input (there are complex invariants). Note: here I am talking mostly
about my non-embedded code (which is the majority of my coding).
In most of my coding I have a pretty comfortable situation: for
a human it is quite clear what is valid and what is invalid.
So the code makes a lot of effort to handle valid (but possibly quite
unusual) cases. User input is normally checked to give a sensible
error message, but some things are deemed too tricky/expensive
to check. Other routines are deemed "system level", and there
it is up to the user/caller to respect the contract.

My embedded code consists of rather small systems, and normally
there are no explicit validity checks. To clarify: when the system
receives commands it recognizes and handles valid commands.
So there is an implicit check: anything not recognized as valid
is invalid. OTOH frequently there is nothing to do in case
of errors: if there is no display to print an error message,
no persistent store to log the error, and shutting down is not helpful,
then what else would a potential error handler do?
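
To illustrate the implicit check, here is a minimal sketch. The
one-letter command set and the names `handle_command` and `led_state`
are made up for the example and are not taken from any real system:
only recognized commands have an effect, and everything else falls
through to the default case and is ignored.

```c
#include <stdbool.h>

/* Hypothetical device state, for illustration only. */
enum { LED_OFF, LED_ON };
static int led_state = LED_OFF;

/* Handle one command byte. Returns true if it was recognized.
 * There is no separate validation step: anything that does not
 * match a known command is, by definition, invalid. */
static bool handle_command(char cmd)
{
    switch (cmd) {
    case 'n':                 /* hypothetical "LED on" command  */
        led_state = LED_ON;
        return true;
    case 'f':                 /* hypothetical "LED off" command */
        led_state = LED_OFF;
        return true;
    default:                  /* not recognized: silently ignored */
        return false;
    }
}
```

The return value is there only so that a caller with some way to
report errors could use it; a system with no display and no log can
simply discard it.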

I do not check if a 12-bit ADC really returns numbers in range.
My 'print_byte' routine takes an integer argument and blindly
truncates it to 8 bits without worrying about possible
spurious upper bits. "Safety critical" folks may be worried
by such practice, but my embedded code is fairly non-critical.
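
A sketch of what such blind truncation can look like. This is not the
actual 'print_byte'; it assumes, purely for illustration, that the
byte is formatted as two hex digits into a caller-supplied buffer, and
the name `format_byte` is hypothetical.

```c
#include <stdint.h>

static const char hex_digits[] = "0123456789abcdef";

/* Format the low 8 bits of v as two hex digits plus a terminator.
 * Upper bits of v are discarded without any range check. */
static void format_byte(int v, char out[3])
{
    uint8_t b = (uint8_t)v;          /* conversion is modulo 256 */
    out[0] = hex_digits[b >> 4];
    out[1] = hex_digits[b & 0x0f];
    out[2] = '\0';
}
```

With this, 0x1234 and 0x34 give the same output, which is exactly the
behaviour a stricter routine would reject or report.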

> Do not handle error conditions -- because they can't exist (because
> you wrote the code and feel confident that you've anticipated
> every contingency -- including those for future upgrades).
>
> Ignore compiler warnings -- surely you know better than a silly
> "generic" program!
>
> Would you hire someone who viewed your product's quality (and
> your reputation) in this regard?

Well, you do not know what OP's code is doing. I would prefer
my code to be robust and I feel that I am doing reasonably
well here. OTOH, coming back to serial communication, it
is not hard to design a communication protocol such that in
normal operation there is no possibility of buffer
overflow. It would still make sense to add a single line
to, say, drop excess characters. But it does not make
sense to make a big story of the lack of this line. In particular
the issue that OP wanted to discuss is still valid.
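
For concreteness, a sketch of where that single line would go. This is
not OP's code: the structure and the names `fifo_put` and `fifo_get`
are hypothetical, and it assumes a 128-byte buffer with free-running
8-bit indices, one producer (the interrupt handler) and one consumer
(the main loop) on a single-core MCU. Everything shared is declared
volatile here to keep the sketch simple; whether that is enough on a
given target is a separate question.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed size. With 8-bit free-running indices it must be a
 * power of two not exceeding 128. */
#define FIFO_SIZE 128u

struct fifo {
    volatile uint8_t in;              /* modified only by the interrupt handler */
    volatile uint8_t out;             /* modified only by the main-loop reader  */
    volatile uint8_t buf[FIFO_SIZE];
};

/* Interrupt side. Returns false if the byte was dropped. */
static bool fifo_put(struct fifo *f, uint8_t c)
{
    uint8_t in = f->in;
    if ((uint8_t)(in - f->out) >= FIFO_SIZE)
        return false;                 /* the extra line: drop when full */
    f->buf[in % FIFO_SIZE] = c;
    f->in = (uint8_t)(in + 1);        /* publish only after the data is stored */
    return true;
}

/* Main-loop side. Returns false if the buffer is empty. */
static bool fifo_get(struct fifo *f, uint8_t *c)
{
    uint8_t out = f->out;
    if (f->in == out)
        return false;                 /* empty: indices are equal */
    *c = f->buf[out % FIFO_SIZE];
    f->out = (uint8_t)(out + 1);
    return true;
}
```

Without the check in `fifo_put` the rest still works as long as the
writer never gets more than FIFO_SIZE bytes ahead of the reader.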

> > Of course, if such changes were a problem he would need to
> > add test preventing writing to full buffer (he already has
> > test preventing reading from empty buffer).
> >
> >> Buffer size is 128, for example. in is 127, out is 127.
> >> What's that mean?
> >
> > Empty buffer.
>
> No, it means you can't sort out *if* there have been any characters
> received, based solely on this fact (and, what other facts are there
> to observe?)

Of course you can connect to the system and change values of variables
in a debugger, so specific values mean nothing. I am telling
you what the protocol is. If all parts of the system (including parts
that OP skipped) obey the protocol, then you have the meaning above.
If something misbehaves (say a cosmic ray flipped a bit), it does
not mean that the protocol is incorrect. Simply, _if_ the probability
of misbehaviour is too high you need to fix the system (add
radiation shielding, an appropriate seal to avoid tampering with
internals, extra checks inside, etc). But whether and what to
fix is for OP to decide.

> >> Can you tell me what has happened prior
> >> to this point in time? Have 127 characters been received?
> >> Or, 383? Or, 1151?
> >
> > Does not matter.
>
> Of course it does! Something has happened that the code MIGHT have
> detected in other circumstances (e.g., if uart_task had been invoked
> more frequently). The world has changed and the code doesn't know it.
> Why write code that only *sometimes* works?

All code works only sometimes. Paraphrasing the famous answer to
Napoleon: first you need a processor. A lot of conditions
must hold for code to work as intended. Granted, I would
not skip a needed check in real code. But this is an obvious
thing to add. You are somewhat making OP's code out to be "broken
beyond repair". Well, as the discussion showed, OP had a problem
using "volatile" and that IMHO is much more important to
fix.

> >> How many characters have been removed from the buffer?
> >> (same numeric examples).
> >
> > The same as has been stored. Point is that received is
> > always bigger or equal to removed and does not exceed
> > removed by more than 128. So you can exactly recover
> > difference between received and removed.
>
> If it can wrap, then "some data" can look like "no data".
> If "no data", then NOTHING has been received -- from the
> viewpoint of the code.
>
> Tell me what prevents 256 characters from being received
> after .in (and .out) are initially 0 -- without any
> indication of their presence. What "limits" the difference
> to "128"? Do you see any conditionals in the code that
> do so? Is there some magic in the hardware that enforces
> this?

That is the protocol. How to avoid violation is a different
matter: dropping characters _may_ be a solution. But dropping
characters means that some data is lost, and how to deal
with lost data is a different issue. As is, OP's code will lose
some old data. It is OP's problem to decide which failure
mode is more problematic and how many extra checks are
needed.
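
The arithmetic behind the protocol can be sketched as follows,
assuming 8-bit free-running counters and a 128-byte buffer (the
function name `fifo_count` is hypothetical). While the number of
received characters exceeds the number removed by at most 128, the
wrapped difference is the exact count. Once that bound is violated,
the information is gone.

```c
#include <stdint.h>

#define FIFO_SIZE 128u   /* assumed buffer size */

/* Characters stored but not yet removed. Exact as long as
 * 0 <= received - removed <= FIFO_SIZE, even after the 8-bit
 * counters have wrapped around. */
static uint8_t fifo_count(uint8_t in, uint8_t out)
{
    return (uint8_t)(in - out);   /* difference modulo 256 */
}
```

For example, in = 5 and out = 250 gives 11: the writer has wrapped
and the reader has not, and the count is still right. If 256
characters arrive with nothing removed, `in` comes back to its old
value and the count reads 0, which is the violation case argued about
above.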

> This is how you end up with bugs in your code. The sorts
> of bugs that you can witness -- with your own eyes -- and
> never reproduce (until the code has been released and
> lots of customers' eyes witness it as well).

IME it is the issues that you can not predict that catch you.
The above is an obvious issue, and should not be a problem
(unless the designer is seriously incompetent and misjudged
what can happen).

> >> The biggest practical limitation is that of expectations of
> >> other developers who may inherit (or copy) his code expecting
> >> the FIFO to be "well behaved".
> >
> > Well, personally I would avoid storing to full buffer. And
> > even on small MCU it is not clear for me if his "savings"
> > are worth it. But his core design is sound.
> >
> > Concerning other developers, I always work on the assumption
> > that code is "as is" and any claims about what it is doing are of
> > limited value unless there is convincing argument (proof
> > or outline of proof) what it is doing.
>
> Ever worked on 100KLoC projects? 500KLoC? Do you personally examine
> the entire codebase before you get started?

Of course I do not read all the code before I start. But I accept
the risk that code may turn out to be faulty and I may be forced
to fix or abandon it. My main project has 450K wc lines.
I know that parts are wrong and I am working on fixing that
(which will probably involve a substantial rewrite). I worked
a little on gcc and I can tell you that the only sure thing in
such projects is that there are bugs. Of course, despite
bugs gcc is quite useful. But I also met a Modula 2 compiler
that carefully checked programs for violation of language
rules, but miscompiled nested function calls.

> Do you purchase source
> licenses for every library that you rely upon in your design?
> (or, do you just assume software vendors are infallible?)

Well, for several years I have worked exclusively with open source code.
I see a lot of defects. While my experience with commercial code
is limited, I do not think that commercial code has fewer defects
than open source code. In fact, there are reasons to suspect
that there are more defects in commercial code.

> How would you feel if a fellow worker told you "yeah, the previous
> guy had a habit of cutting corners in his FIFO management code"?
> Or, "the previous guy always assumed malloc would succeed and
> didn't even build an infrastructure to address the possibility
> of it failing"

Well, there is a lot of bad code. Sometimes the best solution is simply
to throw it out. In other cases (likely in your malloc scenario above)
there may be a simple workaround (replace malloc by a checking version).
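
A sketch of such a checking version, under the assumption that
aborting is an acceptable reaction to running out of memory (on a
hosted system it often is; on a small embedded target it may not be).
The name `xmalloc` is a common convention, not something taken from
the code under discussion.

```c
#include <stdio.h>
#include <stdlib.h>

/* Allocate n bytes or terminate the program. Callers never see
 * NULL, so code that forgot to check malloc's result becomes
 * safe once its calls are redirected here. */
static void *xmalloc(size_t n)
{
    /* malloc(0) is allowed to return NULL, so ask for at least 1 byte. */
    void *p = malloc(n ? n : 1);
    if (p == NULL) {
        fprintf(stderr, "out of memory (%zu bytes requested)\n", n);
        abort();
    }
    return p;
}
```

The existing calls can then be switched over mechanically, which is
where grep or a parser-based transformation comes in.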

> You could, perhaps, grep(1) for "malloc" or "FIFO" and manually
> examine those code fragments.

Yes, that is one of the possible approaches.

> What about division operators?

I have a C parser. In desperation I could try to search the parse
tree or transform the program. Or, more likely, decide that the program
is broken beyond repair.

> Or, verifying that data types never overflow their limits? Or...

Well, one thing is to look at the structure of the program. Code may
look complicated, but some programs are reasonably testable:
a few random inputs can give some confidence that the "main"
execution path computes correct values. Then you look at whether
you can hit limits. Actually, much of my coding is in
arbitrary precision, so overflow is impossible. Instead
the program may run out of memory. But there are parts that, for
speed, use fixed precision. If I computed the limits correctly,
overflow is impossible. But this is a big if.

> > Fact that code
> > worked well in past system(s) is rather unconvincing.
> > I have seen small (few lines) pieces of code that contained
> > multiple bugs. And that code was in "production" use
> > for several years and passed its tests.
> >
> > Certainly code like FIFO-s where there are multiple tradeoffs
> > and actual code tends to be relatively small deserves
> > examination before re-use.
>
> It's not "FIFO code". It's a UART driver. Do you examine every piece
> of code that might *contain* a FIFO? How do you know that there *is* a FIFO
> in a piece of code -- without manually inspecting it? What if it is a
> FIFO mechanism but not explicitly named as a FIFO?
>
> One wants to be able to move towards the goal of software *components*.
> You don't want to have to inspect the design of every *diode* that
> you use; you want to look at its overall specifications and decide
> if those fit your needs.

Sure, I would love to see really reusable components. But IMHO we
are quite far from that. There are some things which are reusable
if you accept modest to severe overhead. For example, things tend
to compose nicely if you dynamically allocate everything and use
garbage collection. But the performance cost may be substantial.
And in an embedded setting garbage collection may be unacceptable.
In some cases I have found that I can get much better
speed by joining things that could be done as a composition of library
operations into a single big routine. In other cases I fixed
bugs by replacing a composition of library routines by a single
routine: there were interactions making simple composition
incorrect. The correct alternative was a single routine.

As I wrote, my embedded programs are simple and small. But I
use almost no external libraries. Trying some existing libraries,
I have found that some produce rather large programs, linking
in a lot of unneeded stuff. Of course, writing from scratch
will not scale to bigger programs. OTOH, I feel that with
proper tooling it would be possible to retain efficiency and
small code size at least for a large class of microcontroller
programs (but existing tools and libraries do not support this).

> Unlikely that this code will describe itself as "works well enough
> SOME of the time..."
>
> And, when/if you stumble on such faults, good luck explaining to
> your customer why it's going to take longer to fix and retest the
> *existing* codebase before you can get on with your modifications...

Commercial vendors like to say how good their programs are. But
market reality is that a program may be quite bad and still sell.

--
Waldek Hebisch
