
Re: Embedding a Checksum in an Image File

Newsgroups: comp.arch.embedded
 by: Brian Cockburn - Fri, 21 Apr 2023 23:56 UTC

On Saturday, April 22, 2023 at 1:02:28 AM UTC+10, David Brown wrote:
> On 21/04/2023 14:12, Rick C wrote:
> >
> > This is simply to be able to say this version is unique, regardless
> > of what the version number says. Version numbers are set manually
> > and not always done correctly. I'm looking for something as a backup
> > so that if the checksums are different, I can be sure the versions
> > are not the same.
> >
> > The less work involved, the better.
> >
> Run a simple 32-bit crc over the image. The result is a hash of the
> image. Any change in the image will show up as a change in the crc.
David, a hash and a CRC are not the same thing. They both produce a reasonably unique result though. Any change would show in either (unless as a result of intentional tampering).
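
A minimal sketch of the 32-bit CRC suggestion quoted above, assuming Python's standard zlib module (CRC-32) and two made-up images. The names image_crc32, build_a and build_b are illustrative only and do not come from anyone's actual toolchain.

```python
import zlib


def image_crc32(image: bytes) -> int:
    """Return the CRC-32 of an image as an unsigned 32-bit value."""
    return zlib.crc32(image) & 0xFFFFFFFF


# Two hypothetical builds that differ in a single bit.
build_a = bytes(range(256)) * 4
build_b = bytearray(build_a)
build_b[100] ^= 0x01
build_b = bytes(build_b)

print(f"build A: {image_crc32(build_a):08X}")
print(f"build B: {image_crc32(build_b):08X}")
```

A CRC always catches a single-bit difference, so these two values differ. For two unrelated images a match is only improbable (roughly 1 in 2^32), not impossible.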

Re: Embedding a Checksum in an Image File

 by: Don Y - Sat, 22 Apr 2023 00:29 UTC

On 4/21/2023 7:50 AM, David Brown wrote:
> On 21/04/2023 13:39, Don Y wrote:
>> On 4/21/2023 3:43 AM, David Brown wrote:
>>>> Note that you want to choose a polynomial that doesn't
>>>> give you a "win" result for "obviously" corrupt data.
>>>> E.g., if data is all zeros or all 0xFF (as these sorts of
>>>> conditions can happen with hardware failures) you probably
>>>> wouldn't want a "success" indication!
>>>
>>> No, that is pointless for something like a code image.  It just adds
>>> needless complexity to your CRC algorithm.
>>
>> Perhaps you've forgotten that you don't just use CRCs (secure hashes, etc.)
>> on "code images"?
>
> No - but "code images" is the topic here.

So, anything unrelated to CRCs as applied to code images is off limits...
per order of the "Internet Police"?

If *all* you use CRCs for is checking *a* code image at POST, you're
wasting a valuable resource.

Do you not think data/parameters need to be safeguarded? Program images?
Communication protocols?

Or, do you develop yet another technique for *each* of those?

> However, in almost every case where CRC's might be useful, you have additional
> checks of the sanity of the data, and an all-zero or all-one data block would
> be rejected.  For example, Ethernet packets use CRC for integrity checking, but
> an attempt to send a packet type 0 from MAC address 00:00:00:00:00:00 to
> address 00:00:00:00:00:00, of length 0, would be rejected anyway.

Why look at "data" -- which may be suspect -- and *then* check its CRC?
Run the CRC first. If it fails, decide how you are going to proceed
or recover.

["Data" can be code or parameters]

I treat blocks of "data" (carefully arranged) with individual CRCs,
based on their relative importance to the operation. If the CRC check
fails, I have no idea *where* the error lies -- as it could
be anything in the checked block. So, one has to (typically)
restore some defaults (or, invoke a reconfigure operation) which
recreates *a* valid dataset.

This is particularly useful when power to a device can be
removed at arbitrary points in time (or, some other abrupt
crash). Before altering anything in a block, take deliberate
steps to invalidate the CRC, make your changes, then "fix"
the CRC. So, an interrupted process causes the CRC check to fail
and remedial action to be taken.

Note that replacing a FLASH image (mostly code) falls under
such a mechanism.
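
The invalidate / modify / fix sequence described above could be sketched roughly as follows. This is only an illustration under assumed conventions: the block is a bytearray whose last four bytes hold a little-endian CRC-32, and power loss is simulated with a flag. None of the names come from the original post.

```python
import struct
import zlib

CRC_SIZE = 4  # assumed layout: payload followed by a 4-byte CRC-32


def seal(block: bytearray) -> None:
    """Store the correct CRC of the payload in the trailing bytes."""
    block[-CRC_SIZE:] = struct.pack("<I", zlib.crc32(block[:-CRC_SIZE]))


def block_is_valid(block: bytearray) -> bool:
    """Check the stored CRC against a freshly computed one."""
    stored = struct.unpack("<I", block[-CRC_SIZE:])[0]
    return stored == zlib.crc32(block[:-CRC_SIZE])


def invalidate(block: bytearray) -> None:
    """Deliberately store a wrong CRC (the bitwise inverse of the right one)."""
    wrong = zlib.crc32(block[:-CRC_SIZE]) ^ 0xFFFFFFFF
    block[-CRC_SIZE:] = struct.pack("<I", wrong)


def update(block: bytearray, offset: int, new_bytes: bytes,
           power_fails: bool = False) -> None:
    invalidate(block)                                   # step 1
    block[offset:offset + len(new_bytes)] = new_bytes   # step 2
    if power_fails:
        return            # simulated power loss: CRC never repaired
    seal(block)                                         # step 3


block = bytearray(b"param=1;" + bytes(CRC_SIZE))
seal(block)
update(block, 6, b"2")                     # completes normally
ok_after_update = block_is_valid(block)
update(block, 6, b"3", power_fails=True)   # interrupted
ok_after_crash = block_is_valid(block)
print(ok_after_update, ok_after_crash)
```

On real flash the ordering of the writes matters as much as the arithmetic; the sketch only shows why an interrupted update leaves a block that fails its check.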

> I can't think of any use-cases where you would be passing around a block of
> "pure" data that could reasonably take absolutely any value, without any type
> of "envelope" information, and where you would think a CRC check is appropriate.

I append a *version specific* CRC to each packet of marshalled data
in my RMIs. If the data is corrupted in transit *or* if the
wrong version API ends up targeted, the operation will abend
because we know the data "isn't right".

I *could* put a header saying "this is version 4.2". And, that
tells me nothing about the integrity of the rest of the data.
OTOH, ensuring the CRC reflects "4.2" does -- if the recipient
expects it to be so.
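
One way a "version specific" CRC might work is to seed the CRC with the version tag, so the tag itself never travels in the packet. The sketch below assumes Python's zlib.crc32, whose optional second argument is a starting value; the packet layout and the names pack and unpack are hypothetical, not the poster's RMI format.

```python
import zlib


def versioned_crc(payload: bytes, api_version: bytes) -> int:
    """CRC-32 of the payload, seeded with the CRC of the version tag."""
    seed = zlib.crc32(api_version)
    return zlib.crc32(payload, seed)


def pack(payload: bytes, api_version: bytes) -> bytes:
    """Append the version-dependent CRC; the version is not transmitted."""
    return payload + versioned_crc(payload, api_version).to_bytes(4, "little")


def unpack(packet: bytes, expected_version: bytes) -> bytes:
    """Return the payload, or raise if it is corrupt or the version differs."""
    payload, received = packet[:-4], int.from_bytes(packet[-4:], "little")
    if versioned_crc(payload, expected_version) != received:
        raise ValueError("corrupt packet or API version mismatch")
    return payload


packet = pack(b"set_speed(42)", b"4.2")
print(unpack(packet, b"4.2"))
```

A receiver built for "4.1" rejects this packet exactly as it would reject a corrupted one; it cannot tell the two cases apart, which matches the behaviour described in the post.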

>>>> You can also "salt" the calculation so that the residual
>>>> is deliberately nonzero.  So, for example, "success" is
>>>> indicated by a residual of 0x474E.  :>
>>>
>>> Again, pointless.
>>>
>>> Salt is important for security-related hashes (like password hashes), not
>>> for integrity checks.
>>
>> You've missed the point.  The correct "sum" can be anything.
>> Why is "0" more special than any other value?  As the value is
>> typically meaningless to anything other than the code that verifies
>> it, you couldn't look at an image (or the output of the verifier)
>> and gain anything from seeing that obscure value.
>
> Do you actually know what is meant by "salt" in the context of hashes, and why
> it is useful in some circumstances?  Do you understand that "salt" is added
> (usually prepended, or occasionally mixed in in some other way) to the data
> /before/ the hash is calculated?

What term would you have me use to indicate a "bias" applied to a CRC
algorithm?

> I have not given the slightest indication to suggest that "0" is a special
> value.  I fully agree that the value you get from the checking algorithm does
> not have to be 0 - I already suggested it could be compared to the stored
> value.  I.e., you build your image file as "data ++ crc(data)", and check it by
> re-calculating "crc(data)" on the received image and comparing the result to
> the received crc.  There is no necessity or benefit in having a crc run
> calculated over the received data plus the received crc being 0.
>
> "Salt" is used in cases where the original data must be kept secret, and only
> the hashes are transmitted or accessible - by adding salt to the original data
> before hashing it, you avoid a direct correspondence between the hash and the
> original data.  The prime use-case is to stop people being able to figure out a
> password by looking up the hash in a list of pre-computed hashes of common
> passwords.

See above.

>> OTOH, if the CRC yields something familiar -- or useful -- then
>> it can tell you something about the image.  E.g., salt the algorithm
>> with the product code, version number, your initials, 0xDEADBEEF, etc.
>
> You are making no sense at all.  Are you suggesting that it would be a good
> idea to add some value to the start of the image so that the resulting crc
> calculation gives a nice recognisable product code?  This "salt" would be
> different for each program image, and calculated by trial and error.  If you
> want a product code, version number, etc., in the program image (and it's a
> good idea), just put these in the program image!

Again, that tells you nothing about the rest of the image!
See the RMI description.

[Note that the OP is expecting the checksum to help *him*
identify versions: "Just put these in the program image!" Eh?]

>>>>> So now you have a new extended block   |....data....|crc|
>>>>>
>>>>> Now if you compute a new CRC on the extended block, the resulting
>>>>> value /should/ come out to zero. If it doesn't, either your data or
>>>>> the original CRC value appended to it has been changed/corrupted.
>>>>
>>>> As there is usually a lack of originality in the algorithms
>>>> chosen, you have to consider if you are also hoping to use
>>>> this to safeguard the *integrity* of your image (i.e.,
>>>> against intentional modification).
>>>
>>> "Integrity" has nothing to do with the motivation for change. /Security/ is
>>> concerned with intentional modifications that deliberately attempt to defeat
>>> /integrity/ checks.  Integrity is about detecting any changes.
>>>
>>> If you are concerned about the possibility of intentional malicious changes,
>>
>> Changes don't have to be malicious.
>
> Accidental changes (such as human error, noise during data transfer, memory
> cell errors, etc.) do not pass integrity tests unnoticed.

That's not true. The role of the *test* is to notice these. If the test
is blind to the types of errors that are likely to occur, then it CAN'T
notice them.

A CRC (hash, etc.) reduces a large block of data to a small bit of
data. So, by definition, there are multiple DIFFERENT sets of data that
map to the same CRC/hash/etc. (on average 2^(data_size - CRC_size) of them, sizes in bits).

E.g., simply summing the values in a block of memory will yield "0"
for ANY condition that results in the block having identical values
for ALL members, if the block size is a power of 2 at least as large
as the sum's modulus (e.g., 256 bytes for an 8-bit sum). So, a block
of 0xFF, 0x00, 0xFE, 0x27, 0x88, etc. will all yield the same sum.
Clearly a bad choice of test!

OTOH, "salting" the calculation so that it is expected to yield
a value of 0x13 means *those* situations will be flagged as errors
(and a different set of situations will sneak by, undetected).
The trick (engineering) is to figure out which types of
failures/faults/errors are most common to occur and guard
against them.
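
A small sketch of the point about plain sums, assuming an 8-bit additive checksum over a 256-byte block. The expected value 0x13 is taken from the post; the helper make_block is hypothetical.

```python
def sum8(block: bytes) -> int:
    """Plain additive checksum: sum of all bytes modulo 256."""
    return sum(block) & 0xFF


# Any 256-byte block filled with one repeated value sums to 0 (mod 256).
constant_sums = {fill: sum8(bytes([fill]) * 256)
                 for fill in (0xFF, 0x00, 0xFE, 0x27, 0x88)}
print(constant_sums)

EXPECTED = 0x13  # a "good" block is arranged to sum to this, not to 0


def make_block(payload: bytes) -> bytes:
    """Append one check byte so the whole block sums to EXPECTED."""
    check = (EXPECTED - sum(payload)) & 0xFF
    return payload + bytes([check])


good = make_block(bytes(range(255)))    # 255 payload bytes + 1 check byte
print(sum8(good) == EXPECTED)           # valid block passes
print(sum8(bytes(256)) == EXPECTED)     # erased/stuck block fails
```

Choosing a nonzero expected value only moves the blind spot. It rejects the stuck-at patterns, but other corruptions that happen to sum to 0x13 still pass.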

> To be more accurate,
> the chances of them passing unnoticed are of the order of 1 in 2^n, for a good
> n-bit check such as a CRC check.  Certain types of error are always detectable,
> such as single and double bit errors.  That is the point of using a checksum or
> hash for integrity checking.
>
> /Intentional/ changes are a different matter.  If a hacker changes the program
> image, they can change the transmitted hash to their own calculated hash.  Or
> for a small CRC, they could change a different part of the image until the
> original checksum matched - for a 16-bit CRC, that only takes 65,535 attempts
> in the worst case.
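
The brute-force argument in the quoted paragraph can be illustrated with a 16-bit CRC. The sketch assumes Python's binascii.crc_hqx (CRC-CCITT) and an image with two spare bytes at the end; both images are made up.

```python
import binascii


def crc16(data: bytes) -> int:
    """CRC-16/CCITT with initial value 0xFFFF, from the standard library."""
    return binascii.crc_hqx(data, 0xFFFF)


def forge(tampered_body: bytes, target_crc: int) -> bytes:
    """Append two bytes so the result has the same CRC as the original."""
    state = crc16(tampered_body)        # CRC state after the altered body
    for patch in range(0x10000):        # at most 65,536 candidates
        tail = patch.to_bytes(2, "big")
        if binascii.crc_hqx(tail, state) == target_crc:
            return tampered_body + tail
    raise RuntimeError("no matching patch found")


original = b"genuine firmware" + b"\x00\x00"   # hypothetical image
forged = forge(b"hostile firmware", crc16(original))
print(f"{crc16(original):04X} {crc16(forged):04X}")
```

Two free bytes are always enough to hit a given 16-bit CRC, so a small CRC guards against accidents but not against someone deliberately matching it.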


Re: Embedding a Checksum in an Image File

 by: Rick C - Sat, 22 Apr 2023 03:14 UTC

On Friday, April 21, 2023 at 11:02:28 AM UTC-4, David Brown wrote:
> On 21/04/2023 14:12, Rick C wrote:
> >
> > This is simply to be able to say this version is unique, regardless
> > of what the version number says. Version numbers are set manually
> > and not always done correctly. I'm looking for something as a backup
> > so that if the checksums are different, I can be sure the versions
> > are not the same.
> >
> > The less work involved, the better.
> >
> Run a simple 32-bit crc over the image. The result is a hash of the
> image. Any change in the image will show up as a change in the crc.

No one is trying to detect changes in the image. I'm trying to label the image in a way that can be read in operation. I'm using the checksum simply because that is easy to generate. I've had problems with version numbering in the past. It will be used, but I want it supplemented with a number that will change every time the design changes, at least with a high probability, such as 1 in 64k.


Re: Embedding a Checksum in an Image File

 by: Rick C - Sat, 22 Apr 2023 03:23 UTC

On Friday, April 21, 2023 at 7:52:27 PM UTC-4, Brian Cockburn wrote:
> On Friday, April 21, 2023 at 10:12:49 PM UTC+10, Rick C wrote:
> > On Friday, April 21, 2023 at 4:53:18 AM UTC-4, Brian Cockburn wrote:
> > > On Thursday, April 20, 2023 at 12:06:36 PM UTC+10, Rick C wrote:
> > > > This is a bit of the chicken and egg thing. If you want to embed a checksum in a code module to report the checksum, is there a way of doing this? It's a bit like being your own grandfather, I think.
> > > >
> > > > I'm not thinking anything too fancy, like a CRC, but rather a simple modulo N addition, maybe N being 2^16.
> > > >
> > > > I keep thinking of using a placeholder, but that doesn't seem to work out in any useful way. Even if you try to anticipate the impact of adding the checksum, that only gives you a different checksum, that you then need to anticipate further... ad infinitum.
> > > >
> > > > I'm not thinking of any special checksum generator that excludes the checksum data. That would be too messy.
> > > >
> > > > I keep thinking there is a different way of looking at this to achieve the result I want...
> > > >
> > > > Maybe I can prove it is impossible. Assume the file checksums to X when the checksum data is zero. The goal would then be to include the checksum data value Y in the file, that would change X to Y. Given the properties of the modulo N checksum, this would appear to be impossible for the general case, unless... Add another data value, called a checksum normalizer. This data value checksums with the original checksum to give the result zero. Then, when the checksum is also added, the resulting checksum is, in fact, the checksum. Another way of looking at this is to add a value that combines with the added checksum, to be zero, leaving the original checksum intact.
> > > >
> > > > This might be inordinately hard for a CRC, but a simple checksum would not be an issue, I think. At least, this could work in software, where data can be included in an image file as itself. In a device like an FPGA, it might not be included in the bit stream file so directly... but that might depend on where in the device it is inserted. Memory might have data that is stored as itself. I'll need to look into that.
> > > Rick, What is the purpose of this? Is it (1) to be able to externally identify a binary, as one might a ROM image by computing a checksum? Is it (2) for a run-able binary to be able to check itself? This would of course only be able to detect corruption, not tampering. Is it (3) for the loader (whatever that might be) to be able to say 'this binary has the correct checksum' and only jump to it if it does? Again this would only be able to detect corruption, not tampering. Are you hoping for more than corruption detection?
> > This is simply to be able to say this version is unique, regardless of what the version number says. Version numbers are set manually and not always done correctly. I'm looking for something as a backup so that if the checksums are different, I can be sure the versions are not the same.
> >
> > The less work involved, the better.
> Rick, so you want the executable to, as part of its execution, print on the console the 'checksum' of itself? Or do you want to be able to inspect the executable with some other tool to calculate its 'checksum'? For the latter there are lots of tools to do that (your OS or PROM programmer for instance), for the former you need to embed the calculation code into the executable (along with the length over which to calculate) and run this when asked. Neither of these involve embedding the 'checksum' value.
> And just to be sure I understand what you wrote in a somewhat convoluted way. When you have two binary executables that report the same version number you want to be able to distinguish them with a 'checksum', right?

Yes, I want the checksum to be readable while operating. Calculation code??? Not going to happen. That's why I want to embed the checksum.

Yes, two compiled files which ended up with the same version number by error. We are using an 8 bit version number, so two hex digits. Negative numbers are lab versions, positive numbers are releases, so 64 of each. We don't do a lot of actual work on the hardware. This code usually is 99.9% working by the time it is tested on hardware. So no need for lots of rev numbers. But sometimes, in the lab, the rev number is not bumped when it should be. The checksum will tell us if we are working with different revisions in that case.

So far, it looks like a simple checksum is the way to go. Include the checksum and the 2's complement of the checksum (in locations that were zeros), and the checksum will not change.
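
The checksum-plus-complement idea can be sketched as below. It assumes a modulo 2^16 sum of little-endian 16-bit words, an even image length, and four reserved zero bytes at a word-aligned offset; the function names and the sample image are illustrative only.

```python
MOD = 1 << 16


def checksum16(image: bytes) -> int:
    """Sum of little-endian 16-bit words, modulo 2**16."""
    if len(image) % 2:
        raise ValueError("image length must be even")
    return sum(int.from_bytes(image[i:i + 2], "little")
               for i in range(0, len(image), 2)) % MOD


def embed_checksum(image: bytes, offset: int) -> bytes:
    """Write the checksum and its two's complement into 4 reserved zero bytes."""
    if offset % 2 or image[offset:offset + 4] != bytes(4):
        raise ValueError("need 4 zero bytes at a word-aligned offset")
    value = checksum16(image)
    complement = (-value) % MOD          # value + complement == 0 (mod 2**16)
    patched = bytearray(image)
    patched[offset:offset + 2] = value.to_bytes(2, "little")
    patched[offset + 2:offset + 4] = complement.to_bytes(2, "little")
    return bytes(patched)


# Hypothetical image with a reserved, zero-filled slot at offset 8.
image = bytearray(range(1, 65))
image[8:12] = bytes(4)
image = bytes(image)

patched = embed_checksum(image, 8)
embedded = int.from_bytes(patched[8:10], "little")
print(f"{checksum16(image):04X} {checksum16(patched):04X} {embedded:04X}")
```

Because the two inserted words cancel modulo 2^16, the patched file has the same checksum as the unpatched one, and that checksum is exactly the value stored in the file.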


Re: Embedding a Checksum in an Image File

 by: Brian Cockburn - Sat, 22 Apr 2023 14:07 UTC

Rick,
>> Rick, so you want the executable to, as part of its execution, print on the console the 'checksum' of itself? Or do you want to be able to inspect the executable with some other tool to calculate its 'checksum'? For the latter there are lots of tools to do that (your OS or PROM programmer for instance), for the former you need to embed the calculation code into the executable (along with the length over which to calculate) and run this when asked. Neither of these involve embedding the 'checksum' value.
>> And just to be sure I understand what you wrote in a somewhat convoluted way. When you have two binary executables that report the same version number you want to be able to distinguish them with a 'checksum', right?
>
> Yes, I want the checksum to be readable while operating. Calculation code??? Not going to happen. That's why I want to embed the checksum.

Can you expand on what you mean or expect by 'readable while operating' please? Are you planning to use some sort of tool to inspect the executing binary to 'read' this thing, or provoke output to the console in some way like:

$ run my-binary-thing --checksum
10FD
$

This would be as distinct from:
$ run my-binary-thing --version
-52
$
> Yes, two compiled files which ended up with the same version number by error. We are using an 8 bit version number, so two hex digits. Negative numbers are lab versions, positive numbers are releases, so 64 of each.

Signed 8-bit numbers range from -128 to +127 (0x80 to 0x7F) so probably a few more than 64.

> ... sometimes, in the lab, the rev number is not bumped when it should be.

This may be an indicator that better procedures are needed for code review-for-release, and that an independent pair of eyes should be doing the review against an agreed check list.

> So far, it looks like a simple checksum is the way to go. Include the checksum and the 2's complement of the checksum (in locations that were zeros), and the checksum will not change.

How will the checksum 'not change'? It will be different for every build won't it?

Cheers, Brian.

Re: Embedding a Checksum in an Image File

 by: Richard Damon - Sat, 22 Apr 2023 14:31 UTC

On 4/22/23 10:07 AM, Brian Cockburn wrote:
> Rick,
>
>> So far, it looks like a simple checksum is the way to go. Include the checksum and the 2's complement of the checksum (in locations that were zeros), and the checksum will not change.
>
> How will the checksum 'not change'? It will be different for every build won't it?
>
> Cheers, Brian.

He means the checksum of the file for a given build after the
modification will be the same as the checksum of the file before the
modification.

Re: Embedding a Checksum in an Image File

<u20sli$3ag7l$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=1468&group=comp.arch.embedded#1468

 by: David Brown - Sat, 22 Apr 2023 14:57 UTC

On 22/04/2023 02:29, Don Y wrote:
> On 4/21/2023 7:50 AM, David Brown wrote:
>> On 21/04/2023 13:39, Don Y wrote:
>>> On 4/21/2023 3:43 AM, David Brown wrote:
>>>>> Note that you want to choose a polynomial that doesn't
>>>>> give you a "win" result for "obviously" corrupt data.
>>>>> E.g., if data is all zeros or all 0xFF (as these sorts of
>>>>> conditions can happen with hardware failures) you probably
>>>>> wouldn't want a "success" indication!
>>>>
>>>> No, that is pointless for something like a code image.  It just adds
>>>> needless complexity to your CRC algorithm.
>>>
>>> Perhaps you've forgotten that you don't just use CRCs (secure hashes,
>>> etc.) on "code images"?
>>
>> No - but "code images" is the topic here.
>
> So, anything unrelated to CRC's as applied to code images is off limits...
> "per order of the Internet Police"?
>

No, it's fine to discuss them - threads on Usenet often wander, and
that's often good. (At least, that's my opinion - some people get their
knickers in a twist if people stray from answering their original question.)

But you have to assume that people are on topic unless it's clear that
the topic is being expanded. We were discussing CRC's for code images,
and so it is appropriate to take advantage of the features of code
images. If you want to expand and talk about other uses of CRC's, I've
no problem with that - but you need to say so.

> If *all* you use CRCs for is checking *a* code image at POST, you're
> wasting a valuable resource.
>
> Do you not think data/parameters need to be safeguarded?  Program images?
> Communication protocols?

Sure. Many things need integrity checks. And CRC's are flexible enough
to be useful in many circumstances.

>
> Or, do you develop yet another technique for *each* of those?

Sometimes, yes. CRC's are, as I wrote, flexible. But they don't cover
everything. Maybe you need a specific type of check to match existing
protocols or requirements. Maybe you want forward error correction, not
just error detection. Maybe you are guarding against malicious
interference. Maybe you are guarding against different kinds of errors
- CRC's are great for spotting a few damaged bits, but a poor choice if
the risk is dropped bytes in transmission.

But often CRC's will be a first choice, because they are simple and
effective in a wide range of uses.

>
>> However, in almost every case where CRC's might be useful, you have
>> additional checks of the sanity of the data, and an all-zero or
>> all-one data block would be rejected.  For example, Ethernet packets
>> use CRC for integrity checking, but an attempt to send a packet type 0
>> from MAC address 00:00:00:00:00:00 to address 00:00:00:00:00:00, of
>> length 0, would be rejected anyway.
>
> Why look at "data" -- which may be suspect -- and *then* check its CRC?
> Run the CRC first.  If it fails, decide how you are going to proceed
> or recover.
>

That is usually the order, yes. Sometimes you want "fail fast", such as
dropping a packet that was not addressed to you (it doesn't matter if it
was received correctly but for someone else, or it was addressed to you
but the receiver address was corrupted - you are dropping the packet
either way). But usually you will run the CRC then look at the data.

But the order doesn't matter - either way, you are still checking for
valid data, and if the data is invalid, it does not matter if the CRC
only passed by luck or by all zeros.

> ["Data" can be code or parameters]
>
> I treat blocks of "data" (carefully arranged) with individual CRCs,
> based on their relative importance to the operation.  If the CRC is
> corrupt, I have no idea *where* the error lies -- as it could
> be anything in the checked block.  So, one has to (typically)
> restore some defaults (or, invoke a reconfigure operation) which
> recreates *a* valid dataset.
>
> This is particularly useful when power to a device can be
> removed at arbitrary points in time (or, some other abrupt
> crash).  Before altering anything in a block, take deliberate
> steps to invalidate the CRC, make your changes, then "fix"
> the CRC.  So, an interrupted process causes the CRC to fail
> and remedial action to be taken.
>
> Note that replacing a FLASH image (mostly code) falls under
> such a mechanism.
>

That's all standard stuff. (Maybe it's new to some people in this group
- although most of the regular posters here are experienced embedded
developers, it's nice to think there might be some people reading these
posts and learning!)

If you have the space in your flash, eeprom, etc., then it is also
common to have two slots for your configuration data or code. You don't
"invalidate" anything - you keep a version counter with your data, and
write your new data to the slot with the oldest version. When your
system starts, it checks both slots - and uses the one with the newest
version for which the CRC check passes.
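
The two-slot scheme can be sketched roughly as follows. This is only an illustration in Python (a real target would more likely use C), and the slot layout (a 4-byte little-endian version counter, the payload, then a trailing CRC-32) and the names make_slot, parse_slot and pick_active are assumptions made for the example, not anything specified in this thread. Version counter wrap-around is ignored.

```python
import struct
import zlib

# Hypothetical slot layout: [version: u32 LE][payload ...][crc32: u32 LE]

def make_slot(version: int, payload: bytes) -> bytes:
    """Serialise one slot; the CRC covers the version counter and the payload."""
    body = struct.pack("<I", version) + payload
    return body + struct.pack("<I", zlib.crc32(body))

def parse_slot(slot: bytes):
    """Return (version, payload) if the CRC matches, otherwise None."""
    if len(slot) < 8:
        return None
    body = slot[:-4]
    stored = struct.unpack("<I", slot[-4:])[0]
    if zlib.crc32(body) != stored:
        return None
    version = struct.unpack("<I", body[:4])[0]
    return version, body[4:]

def pick_active(slot_a: bytes, slot_b: bytes):
    """Pick the valid slot with the newest version, or None if both fail."""
    valid = [s for s in (parse_slot(slot_a), parse_slot(slot_b)) if s is not None]
    return max(valid, key=lambda s: s[0]) if valid else None
```

A write that is interrupted part-way leaves the newer slot with a failing CRC, so pick_active falls back to the older, still intact slot without any explicit "invalidate" step.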

>> I can't think of any use-cases where you would be passing around a
>> block of "pure" data that could reasonably take absolutely any value,
>> without any type of "envelope" information, and where you would think
>> a CRC check is appropriate.
>
> I append a *version specific* CRC to each packet of marshalled data
> in my RMIs.  If the data is corrupted in transit *or* if the
> wrong version API ends up targeted, the operation will abend
> because we know the data "isn't right".

Using a version-specific CRC sounds silly. Put the version information
in the packet.

>
> I *could* put a header saying "this is version 4.2".  And, that
> tells me nothing about the integrity of the rest of the data.
> OTOH, ensuring the CRC reflects "4.2" does -- if the recipient
> expects it to be so.

Now you don't know if the data is corrupted, or for the wrong version -
or occasionally, corrupted /and/ the wrong version but passing the CRC
anyway.

Unless you are absolutely desperate to save every bit you can, your
system will be simpler, clearer, and more reliable if you separate your
purposes.
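
As a rough sketch of what "separating your purposes" could look like: the version travels as an explicit field, and an ordinary CRC-32 covers the header and the payload. The packet layout, the EXPECTED_VERSION constant and the function names are assumptions for illustration only; nothing here is taken from the RMI system described in the quoted text.

```python
import struct
import zlib

EXPECTED_VERSION = (4, 2)  # hypothetical API version, echoing the "4.2" above

def build_packet(version, payload: bytes) -> bytes:
    """Layout (assumed): [major: u8][minor: u8][payload ...][crc32: u32 LE]."""
    body = struct.pack("<BB", *version) + payload
    return body + struct.pack("<I", zlib.crc32(body))

def check_packet(packet: bytes) -> str:
    """Report integrity and version problems as separate outcomes."""
    if len(packet) < 6:
        return "corrupt"
    body = packet[:-4]
    stored = struct.unpack("<I", packet[-4:])[0]
    if zlib.crc32(body) != stored:
        return "corrupt"
    if struct.unpack("<BB", body[:2]) != EXPECTED_VERSION:
        return "wrong version"
    return "ok"
```

With this split, the receiver can tell a damaged packet from an intact packet built for a different API version, which a single version-specific CRC cannot do.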

>
>>>>> You can also "salt" the calculation so that the residual
>>>>> is deliberately nonzero.  So, for example, "success" is
>>>>> indicated by a residual of 0x474E.  :>
>>>>
>>>> Again, pointless.
>>>>
>>>> Salt is important for security-related hashes (like password
>>>> hashes), not for integrity checks.
>>>
>>> You've missed the point.  The correct "sum" can be anything.
>>> Why is "0" more special than any other value?  As the value is
>>> typically meaningless to anything other than the code that verifies
>>> it, you couldn't look at an image (or the output of the verifier)
>>> and gain anything from seeing that obscure value.
>>
>> Do you actually know what is meant by "salt" in the context of hashes,
>> and why it is useful in some circumstances?  Do you understand that
>> "salt" is added (usually prepended, or occasionally mixed in in some
>> other way) to the data /before/ the hash is calculated?
>
> What term would you have me use to indicate a "bias" applied to a CRC
> algorithm?

Well, first I'd note that any kind of modification to the basic CRC
algorithm is pointless from the viewpoint of its use as an integrity
check. (There have been, mostly historically, some justifications in
terms of implementation efficiency. For example, bit and byte
re-ordering could be done to suit hardware bit-wise implementations.)

Otherwise I'd say you are picking a specific initial value if that is
what you are doing, or modifying the final value (inverting it or
xor'ing it with a fixed value). There are, AFAIK, no specific terms for
these - and I don't see any benefit in having one. Misusing the term
"salt" from cryptography is certainly not helpful.
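
To illustrate the point about initial and final values, here is a small sketch using the CRC-32 from Python's zlib module, which (as far as I know) uses an all-ones initial value and a final inversion. It shows both ways of checking an image built as "data ++ crc(data)": comparing against the stored CRC, and running the CRC over data plus stored CRC. With this CRC variant the second method yields a fixed non-zero residue rather than 0, so the "success" value is just a property of the chosen parameters. The function names are made up for the example, and the stored CRC is assumed to be little-endian.

```python
import struct
import zlib

def append_crc(data: bytes) -> bytes:
    """Build "data ++ crc(data)" with the CRC stored little-endian."""
    return data + struct.pack("<I", zlib.crc32(data))

def check_by_compare(image: bytes) -> bool:
    """Recalculate the CRC over the data part and compare with the stored CRC."""
    data = image[:-4]
    stored = struct.unpack("<I", image[-4:])[0]
    return zlib.crc32(data) == stored

def residue(image: bytes) -> int:
    """CRC over data plus stored CRC; the same constant for any intact image."""
    return zlib.crc32(image)
```

Either check accepts the same images; the compare form simply avoids having to know the residue constant for the CRC variant in use.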

>
>> I have not given the slightest indication to suggest that "0" is a
>> special value.  I fully agree that the value you get from the checking
>> algorithm does not have to be 0 - I already suggested it could be
>> compared to the stored value.  I.e., you build your image file as
>> "data ++ crc(data)", and check it by re-calculating "crc(data)" on the
>> received image and comparing the result to the received crc.  There is
>> no necessity or benefit in having a crc run calculated over the
>> received data plus the received crc being 0.
>>
>> "Salt" is used in cases where the original data must be kept secret,
>> and only the hashes are transmitted or accessible - by adding salt to
>> the original data before hashing it, you avoid a direct correspondence
>> between the hash and the original data.  The prime use-case is to stop
>> people being able to figure out a password by looking up the hash in a
>> list of pre-computed hashes of common passwords.
>
> See above.
>
>>> OTOH, if the CRC yields something familiar -- or useful -- then
>>> it can tell you something about the image.  E.g., salt the algorithm
>>> with the product code, version number, your initials, 0xDEADBEEF, etc.
>>
>> You are making no sense at all.  Are you suggesting that it would be a
>> good idea to add some value to the start of the image so that the
>> resulting crc calculation gives a nice recognisable product code?
>> This "salt" would be different for each program image, and calculated
>> by trial and error.  If you want a product code, version number, etc.,
>> in the program image (and it's a good idea), just put these in the
>> program image!
>
> Again, that tells you nothing about the rest of the image!


Re: Embedding a Checksum in an Image File

<u20srm$3ag7l$2@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=1469&group=comp.arch.embedded#1469

 by: David Brown - Sat, 22 Apr 2023 15:01 UTC

On 22/04/2023 01:56, Brian Cockburn wrote:
> On Saturday, April 22, 2023 at 1:02:28 AM UTC+10, David Brown wrote:
>> On 21/04/2023 14:12, Rick C wrote:
>>>
>>> This is simply to be able to say this version is unique,
>>> regardless of what the version number says. Version numbers are
>>> set manually and not always done correctly. I'm looking for
>>> something as a backup so that if the checksums are different, I
>>> can be sure the versions are not the same.
>>>
>>> The less work involved, the better.
>>>
>> Run a simple 32-bit crc over the image. The result is a hash of
>> the image. Any change in the image will show up as a change in the
>> crc.
> David, a hash and a CRC are not the same thing.

A CRC is a type of hash - but hash is a more generic term.

> They both produce a
> reasonably unique result though. Any change would show in either
> (unless as a result of intentional tampering).

Exactly. Thus a CRC is a hash.

It is not a cryptographically secure hash, and is not suitable for
protecting against intentional tampering. But it /is/ a hash.

Re: Embedding a Checksum in an Image File

<u20tin$3alrj$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=1470&group=comp.arch.embedded#1470

 by: David Brown - Sat, 22 Apr 2023 15:13 UTC

On 22/04/2023 05:14, Rick C wrote:
> On Friday, April 21, 2023 at 11:02:28 AM UTC-4, David Brown wrote:
>> On 21/04/2023 14:12, Rick C wrote:
>>>
>>> This is simply to be able to say this version is unique,
>>> regardless of what the version number says. Version numbers are
>>> set manually and not always done correctly. I'm looking for
>>> something as a backup so that if the checksums are different, I
>>> can be sure the versions are not the same.
>>>
>>> The less work involved, the better.
>>>
>> Run a simple 32-bit crc over the image. The result is a hash of
>> the image. Any change in the image will show up as a change in the
>> crc.
>
> No one is trying to detect changes in the image. I'm trying to label
> the image in a way that can be read in operation. I'm using the
> checksum simply because that is easy to generate. I've had problems
> with version numbering in the past. It will be used, but I want it
> supplemented with a number that will change every time the design
> changes, at least with a high probability, such as 1 in 64k.
>

Again - use a CRC. It will give you what you want.

You might want to go for 32-bit CRC rather than a 16-bit CRC, depending
on the kind of program, how often you build it, and what consequences a
hash collision could have. With a 16-bit CRC, you have a 5% chance of a
collision after 82 builds. If collisions only matter for releases, and
you only release a couple of updates, fine - but if they matter during
development builds, you are getting a more significant risk. Since a
32-bit CRC is quick and easy, it's worth using.
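
The 5% figure can be checked with the usual birthday calculation. The sketch below assumes each build's hash is an independent, uniformly distributed value, which is an idealisation; the function name is invented for the example.

```python
def collision_probability(builds: int, bits: int) -> float:
    """Probability that at least two of `builds` uniform hashes of `bits` bits collide."""
    space = 2 ** bits
    p_all_unique = 1.0
    for i in range(builds):
        # The (i+1)-th build must avoid the i values already taken.
        p_all_unique *= (space - i) / space
    return 1.0 - p_all_unique

if __name__ == "__main__":
    for bits in (16, 32):
        print(bits, "bits, 82 builds:", collision_probability(82, bits))
```

Under that assumption, 82 builds with a 16-bit value come out at roughly 5%, while the same number of builds with a 32-bit value gives a probability below one in a million.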

Re: Embedding a Checksum in an Image File

<08922ed4-b36b-4698-bce1-b8615d2ccfd9n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=1471&group=comp.arch.embedded#1471

 by: Rick C - Sat, 22 Apr 2023 16:54 UTC

On Saturday, April 22, 2023 at 10:07:37 AM UTC-4, Brian Cockburn wrote:
> Rick,
> >> Rick, so you want the executable to, as part of its execution, print on the console the 'checksum' of itself? Or do you want to be able to inspect the executable with some other tool to calculate its 'checksum'? For the latter there are lots of tools to do that (your OS or PROM programmer for instance), for the former you need to embed the calculation code into the executable (along with the length over which to calculate) and run this when asked. Neither of these involve embedding the 'checksum' value.
> >> And just to be sure I understand what you wrote in a somewhat convoluted way. When you have two binary executables that report the same version number you want to be able to distinguish them with a 'checksum', right?
> >
> > Yes, I want the checksum to be readable while operating. Calculation code??? Not going to happen. That's why I want to embed the checksum.
> Can you expand on what you mean or expect by 'readable while operating' please? Are you planning to use some sort of tool to inspect the executing binary to 'read' this thing, or provoke output to the console in some way like:
>
> $ run my-binary-thing --checksum
> 10FD
> $
>
> This would be as distinct from:
>
> $ run my-binary-thing --version
> -52
> $

More like $ run my-binary thing
Hello, master. Would you like to achieve world domination today?
> No, thank you, can you display the contents of registers 26 and 27 in hex please?
That would be X0FE38
> Thank you.

> > Yes, two compiled files which ended up with the same version number by error. We are using an 8 bit version number, so two hex digits. Negative numbers are lab versions, positive numbers are releases, so 64 of each.
> Signed 8-bit numbers range from -128 to +127 (0x80 to 0x7F) so probably a few more than 64.

See? This is why I need the checksum. I make mistakes.

> > ... sometimes, in the lab, the rev number is not bumped when it should be.
>
> This may be an indicator that better procedures are needed for code review-for-release. And that an independent pair of eyes should be doing the review against an agreed check list.

Or that I need a checksum. This is a lab compile, not a release. Let's try to stay on task.

> > So far, it looks like a simple checksum is the way to go. Include the checksum and the 2's complement of the checksum (in locations that were zeros), and the checksum will not change.
> How will the checksum 'not change'? It will be different for every build won't it?

It won't be changed by including the checksum and the complement because they add up to zero.
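
A minimal sketch of that idea, assuming a 16-bit additive checksum over little-endian 16-bit words, an image of even length, and two adjacent zeroed placeholder words at a known even offset. The word size, layout and function names are all assumptions for illustration, not details given in this thread.

```python
import struct

def checksum16(image: bytes) -> int:
    """Sum of little-endian 16-bit words, modulo 2**16 (image length must be even)."""
    words = struct.unpack("<%dH" % (len(image) // 2), image)
    return sum(words) & 0xFFFF

def embed_checksum(image: bytes, offset: int) -> bytes:
    """Write the checksum and its two's complement into two zeroed words at `offset`."""
    if image[offset:offset + 4] != b"\x00" * 4:
        raise ValueError("placeholder words must be zero before patching")
    c = checksum16(image)
    # c + (-c mod 2**16) is 0 modulo 2**16, so the total sum is unchanged.
    patch = struct.pack("<HH", c, (-c) & 0xFFFF)
    return image[:offset] + patch + image[offset + 4:]
```

After patching, an external tool that sums the whole file still reports the same checksum, and the running device can read that value back from the first placeholder word.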

--

Rick C.

-+- Get 1,000 miles of free Supercharging
-+- Tesla referral code - https://ts.la/richard11209

Re: Embedding a Checksum in an Image File

<3986aac8-aec3-4cde-8bb8-8a61c20084c9n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=1472&group=comp.arch.embedded#1472

 by: Rick C - Sat, 22 Apr 2023 16:56 UTC

On Saturday, April 22, 2023 at 11:13:32 AM UTC-4, David Brown wrote:
> On 22/04/2023 05:14, Rick C wrote:
> > On Friday, April 21, 2023 at 11:02:28 AM UTC-4, David Brown wrote:
> >> On 21/04/2023 14:12, Rick C wrote:
> >>>
> >>> This is simply to be able to say this version is unique,
> >>> regardless of what the version number says. Version numbers are
> >>> set manually and not always done correctly. I'm looking for
> >>> something as a backup so that if the checksums are different, I
> >>> can be sure the versions are not the same.
> >>>
> >>> The less work involved, the better.
> >>>
> >> Run a simple 32-bit crc over the image. The result is a hash of
> >> the image. Any change in the image will show up as a change in the
> >> crc.
> >
> > No one is trying to detect changes in the image. I'm trying to label
> > the image in a way that can be read in operation. I'm using the
> > checksum simply because that is easy to generate. I've had problems
> > with version numbering in the past. It will be used, but I want it
> > supplemented with a number that will change every time the design
> > changes, at least with a high probability, such as 1 in 64k.
> >
> Again - use a CRC. It will give you what you want.

Again - as will a simple addition checksum.

> You might want to go for 32-bit CRC rather than a 16-bit CRC, depending
> on the kind of program, how often you build it, and what consequences a
> hash collision could have. With a 16-bit CRC, you have a 5% chance of a
> collision after 82 builds. If collisions only matter for releases, and
> you only release a couple of updates, fine - but if they matter during
> development builds, you are getting a more significant risk. Since a
> 32-bit CRC is quick and easy, it's worth using.

Or, I might want to go with a simple checksum.

Thanks for your comments.

--

Rick C.

-++ Get 1,000 miles of free Supercharging
-++ Tesla referral code - https://ts.la/richard11209

Re: Embedding a Checksum in an Image File

<u21eln$25c$1@reader2.panix.com>

https://www.novabbs.com/devel/article-flat.php?id=1473&group=comp.arch.embedded#1473

 by: Grant Edwards - Sat, 22 Apr 2023 20:05 UTC

On 2023-04-22, David Brown <david.brown@hesbynett.no> wrote:

> A simple addition checksum might be okay much of the time, but it
> doesn't have the resolving power of a CRC. If the source code changes
> "a = 1; b = 2;" to "a = 2; b = 1;", the addition checksum is likely to
> be exactly the same despite the change in the source. In general, you
> will have a much higher chance of collisions, though I think it would be
> very hard to quantify that.

I remember a long discussion about this a few decades ago. An N-bit
additive checksum maps the source data into the same hash space
as an N-bit CRC.

Therefore, for two randomly chosen sets of input bits, they both have
a 1 in 2^N chance of a collision. I think that means that for random
changes to an input set of unspecified properties, they would both
have the same chance that the hash is unchanged.
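
The difference shows up for changes that are not random. As a small illustration of the quoted "a = 1; b = 2;" example, the sketch below treats the two source strings as stand-ins for the image bytes (an assumption; in a real build it would be the compiled constants that swap places) and compares a byte-wise additive checksum with zlib's CRC-32.

```python
import zlib

def additive16(data: bytes) -> int:
    """Byte-wise additive checksum, modulo 2**16."""
    return sum(data) & 0xFFFF

before = b"a = 1; b = 2;"
after = b"a = 2; b = 1;"

# Swapping two bytes leaves the sum unchanged, because addition is commutative.
same_sum = additive16(before) == additive16(after)

# The CRC depends on byte positions, so the swap changes it.
same_crc = zlib.crc32(before) == zlib.crc32(after)

if __name__ == "__main__":
    print("additive checksum unchanged:", same_sum)
    print("CRC-32 unchanged:", same_crc)
```

So both map into a space of the same size, but reordering-type changes land on the same additive value every time, while the CRC tells them apart.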

However... IIRC, somebody (probably at somewhere like Bell Labs)
noticed that errors in data transmitted over media like phone lines
and microwave links are _not_ random. Errors tend to be "bursty" and
can be statistically characterized. And it was shown that for the
common error modes for _those_ media, CRCs were better at detecting
real-world failures than additive checksums. And (this is also
important) a CRC is far, far simpler to implement in hardware than an
additive checksum. For the same reasons, CRCs tend to get used for
things like Ethernet frames, disc sectors, etc.

Later people seem to have adopted CRCs for detecting failures in other
very dissimilar media (e.g. EPROMs) where implementing a CRC is _more_
work than an additive checksum. If the failure modes for EPROM are
similar to those studied at <wherever> when CRCs were chosen, then
CRCs are probably also a good choice for EPROMs despite the additional
overhead. If the failure modes for EPROMs are significantly different,
then CRCs might be both sub-optimal and unnecessarily expensive.

I have no hard data either way, but it was never obvious to me that
the arguments people use in favor of CRCs (better at detecting burst
errors on transmission media) necessarily applied to EPROMs.

That said, I do use CRCs rather than additive checksums for things
like EPROM and flash.

--
Grant

Re: Embedding a Checksum in an Image File

<ich84idn1vub7t4lggn1rkdrq00crmh3rc@4ax.com>

https://www.novabbs.com/devel/article-flat.php?id=1474&group=comp.arch.embedded#1474

 by: boB - Sat, 22 Apr 2023 20:41 UTC

On Sat, 22 Apr 2023 19:54:54 +0200, David Brown
<david.brown@hesbynett.no> wrote:

>On 22/04/2023 18:56, Rick C wrote:
>> On Saturday, April 22, 2023 at 11:13:32 AM UTC-4, David Brown wrote:
>>> On 22/04/2023 05:14, Rick C wrote:
>>>> On Friday, April 21, 2023 at 11:02:28 AM UTC-4, David Brown wrote:
>>>>> On 21/04/2023 14:12, Rick C wrote:
>>>>>>
>>>>>> This is simply to be able to say this version is unique,
>>>>>> regardless of what the version number says. Version numbers are
>>>>>> set manually and not always done correctly. I'm looking for
>>>>>> something as a backup so that if the checksums are different, I
>>>>>> can be sure the versions are not the same.
>>>>>>
>>>>>> The less work involved, the better.
>>>>>>
>>>>> Run a simple 32-bit crc over the image. The result is a hash of
>>>>> the image. Any change in the image will show up as a change in the
>>>>> crc.
>>>>
>>>> No one is trying to detect changes in the image. I'm trying to label
>>>> the image in a way that can be read in operation. I'm using the
>>>> checksum simply because that is easy to generate. I've had problems
>>>> with version numbering in the past. It will be used, but I want it
>>>> supplemented with a number that will change every time the design
>>>> changes, at least with a high probability, such as 1 in 64k.
>>>>
>>> Again - use a CRC. It will give you what you want.
>>
>> Again - as will a simple addition checksum.
>
>A simple addition checksum might be okay much of the time, but it
>doesn't have the resolving power of a CRC. If the source code changes
>"a = 1; b = 2;" to "a = 2; b = 1;", the addition checksum is likely to
>be exactly the same despite the change in the source. In general, you
>will have much higher chance of collisions, though I think it would be
>very hard to quantify that.
>
>Maybe it will be good enough for you. Simple checksums were popular
>once, and can still make sense if you are very short on program space.
>But there are good reasons why they fell out of favour in many uses.
>
>>
>>
>>> You might want to go for 32-bit CRC rather than a 16-bit CRC, depending
>>> on the kind of program, how often you build it, and what consequences a
>>> hash collision could have. With a 16-bit CRC, you have a 5% chance of a
>>> collision after 82 builds. If collisions only matter for releases, and
>>> you only release a couple of updates, fine - but if they matter during
>>> development builds, you are getting a more significant risk. Since a
>>> 32-bit CRC is quick and easy, it's worth using.

Totally agree! I stopped using simple checksums years ago.
Many processors these days also have a CRC peripheral that makes it
easy to use. And I can simply chop the result down to 16 bits, or even
24 bits, if I don't want to transmit all 32.
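
To put numbers on the widths mentioned here, the following is a minimal sketch of the birthday-style collision estimate behind the "5% after 82 builds" figure quoted above. It assumes check values are uniformly distributed, which a truncated CRC only approximates; the function name collision_probability is made up for this illustration.

```c
/* Probability that at least two of `builds` images end up with the
 * same `bits`-wide check value, ASSUMING the values are uniformly
 * distributed over the whole 2^bits space (an idealisation). */
double collision_probability(unsigned builds, unsigned bits)
{
    double space = 1.0;
    for (unsigned i = 0; i < bits; i++)
        space *= 2.0;                     /* space = 2^bits */

    double all_distinct = 1.0;
    for (unsigned i = 1; i < builds; i++) {
        if ((double)i >= space)
            return 1.0;                   /* more builds than values */
        /* build number i+1 must avoid the i values already taken */
        all_distinct *= 1.0 - (double)i / space;
    }
    return 1.0 - all_distinct;
}
```

Under that assumption, 82 builds give a little under 5% for 16 bits, about 0.02% for 24 bits, and less than one in a million for 32 bits.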

boB

>>
>> Or, I might want to go with a simple checksum.
>>
>> Thanks for your comments.
>>
>
>
>It's your choice (obviously). I only point out the weaknesses in case
>anyone else is listening in to the thread.
>
>If you like, I can post code for a 32-bit CRC. It's a table, and a few
>lines of C code.
>
>
>

Re: Embedding a Checksum in an Image File

<u2171f$3cbcu$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=1475&group=comp.arch.embedded#1475

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.arch.embedded
Subject: Re: Embedding a Checksum in an Image File
Date: Sat, 22 Apr 2023 19:54:54 +0200
Organization: A noiseless patient Spider
Lines: 65
Message-ID: <u2171f$3cbcu$1@dont-email.me>
References: <116ff07e-5e25-469a-90a0-9474108aadd3n@googlegroups.com>
<66e9fb2a-b9e6-4597-9afa-6572caaa8ca1n@googlegroups.com>
<a4495c87-c68d-4c87-94aa-701c21bdd19cn@googlegroups.com>
<u1u8hu$2ps79$1@dont-email.me>
<ef5ad4e6-57ed-4950-baed-3a6746b9d16en@googlegroups.com>
<u20tin$3alrj$1@dont-email.me>
<3986aac8-aec3-4cde-8bb8-8a61c20084c9n@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Sat, 22 Apr 2023 17:54:55 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="36ac789b885e447ecbd5a7ad2828e367";
logging-data="3550622"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX191IGZGJX8j02JwbmnB/jjo1FVAOkcC9Rk="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
Thunderbird/102.7.1
Cancel-Lock: sha1:WTQ1pQV04FV3i3mzdrZpUq/BNjE=
In-Reply-To: <3986aac8-aec3-4cde-8bb8-8a61c20084c9n@googlegroups.com>
Content-Language: en-GB
 by: David Brown - Sat, 22 Apr 2023 17:54 UTC

On 22/04/2023 18:56, Rick C wrote:
> On Saturday, April 22, 2023 at 11:13:32 AM UTC-4, David Brown wrote:
>> On 22/04/2023 05:14, Rick C wrote:
>>> On Friday, April 21, 2023 at 11:02:28 AM UTC-4, David Brown wrote:
>>>> On 21/04/2023 14:12, Rick C wrote:
>>>>>
>>>>> This is simply to be able to say this version is unique,
>>>>> regardless of what the version number says. Version numbers are
>>>>> set manually and not always done correctly. I'm looking for
>>>>> something as a backup so that if the checksums are different, I
>>>>> can be sure the versions are not the same.
>>>>>
>>>>> The less work involved, the better.
>>>>>
>>>> Run a simple 32-bit crc over the image. The result is a hash of
>>>> the image. Any change in the image will show up as a change in the
>>>> crc.
>>>
>>> No one is trying to detect changes in the image. I'm trying to label
>>> the image in a way that can be read in operation. I'm using the
>>> checksum simply because that is easy to generate. I've had problems
>>> with version numbering in the past. It will be used, but I want it
>>> supplemented with a number that will change every time the design
>>> changes, at least with a high probability, such as 1 in 64k.
>>>
>> Again - use a CRC. It will give you what you want.
>
> Again - as will a simple addition checksum.

A simple addition checksum might be okay much of the time, but it
doesn't have the resolving power of a CRC. If the source code changes
"a = 1; b = 2;" to "a = 2; b = 1;", the addition checksum is likely to
be exactly the same despite the change in the source. In general, you
will have much higher chance of collisions, though I think it would be
very hard to quantify that.
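
As a rough illustration of the swap case, here is a sketch that runs both kinds of check over the two source strings themselves. That is an assumption made for the example: in practice the check runs over the compiled image, where the same kind of value swap can occur. The function names sum16 and crc32_bitwise are invented here, and the CRC is the common reflected CRC-32 computed one bit at a time.

```c
#include <stddef.h>
#include <stdint.h>

/* 16-bit additive checksum: sum of all bytes, wrapped modulo 65536. */
uint16_t sum16(const unsigned char *data, size_t len)
{
    uint16_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum = (uint16_t)(sum + data[i]);
    return sum;
}

/* Bit-at-a-time CRC-32 (reflected polynomial 0xEDB88320, initial
 * value and final XOR of 0xFFFFFFFF). Slow but short. */
uint32_t crc32_bitwise(const unsigned char *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 1u)
                crc = (crc >> 1) ^ 0xEDB88320u;
            else
                crc >>= 1;
        }
    }
    return crc ^ 0xFFFFFFFFu;
}
```

Calling sum16 on the 13 bytes of "a = 1; b = 2;" and of "a = 2; b = 1;" gives the same value, because the same bytes are merely in a different order, while crc32_bitwise returns different values for the two strings.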

Maybe it will be good enough for you. Simple checksums were popular
once, and can still make sense if you are very short on program space.
But there are good reasons why they fell out of favour in many uses.

>
>
>> You might want to go for 32-bit CRC rather than a 16-bit CRC, depending
>> on the kind of program, how often you build it, and what consequences a
>> hash collision could have. With a 16-bit CRC, you have a 5% chance of a
>> collision after 82 builds. If collisions only matter for releases, and
>> you only release a couple of updates, fine - but if they matter during
>> development builds, you are getting a more significant risk. Since a
>> 32-bit CRC is quick and easy, it's worth using.
>
> Or, I might want to go with a simple checksum.
>
> Thanks for your comments.
>

It's your choice (obviously). I only point out the weaknesses in case
anyone else is listening in to the thread.

If you like, I can post code for a 32-bit CRC. It's a table, and a few
lines of C code.
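
The code offered here was not posted in this part of the thread, so what follows is only a sketch of one common table-driven CRC-32 formulation, not the poster's own code. It assumes the reflected polynomial 0xEDB88320 with an initial value and final XOR of 0xFFFFFFFF; the names crc_table, crc32_build_table and crc32_image are hypothetical. The table is built at first use to keep the listing short, whereas a small target would more likely keep a const table in flash.

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t crc_table[256];
static int crc_table_ready = 0;

/* Precompute the CRC of every possible byte value once. */
static void crc32_build_table(void)
{
    for (uint32_t n = 0; n < 256; n++) {
        uint32_t c = n;
        for (int bit = 0; bit < 8; bit++)
            c = (c & 1u) ? (c >> 1) ^ 0xEDB88320u : (c >> 1);
        crc_table[n] = c;
    }
    crc_table_ready = 1;
}

/* CRC-32 over a whole image: one table lookup per input byte. */
uint32_t crc32_image(const unsigned char *data, size_t len)
{
    if (!crc_table_ready)
        crc32_build_table();

    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++)
        crc = crc_table[(crc ^ data[i]) & 0xFFu] ^ (crc >> 8);
    return crc ^ 0xFFFFFFFFu;
}
```

With these parameters the nine ASCII bytes "123456789" give 0xCBF43926, the usual check value for this CRC-32 variant, which makes a convenient self-test.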

Re: Embedding a Checksum in an Image File

<u23jc6$3s2qo$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=1476&group=comp.arch.embedded#1476

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.arch.embedded
Subject: Re: Embedding a Checksum in an Image File
Date: Sun, 23 Apr 2023 17:37:41 +0200
Organization: A noiseless patient Spider
Lines: 92
Message-ID: <u23jc6$3s2qo$1@dont-email.me>
References: <116ff07e-5e25-469a-90a0-9474108aadd3n@googlegroups.com>
<66e9fb2a-b9e6-4597-9afa-6572caaa8ca1n@googlegroups.com>
<a4495c87-c68d-4c87-94aa-701c21bdd19cn@googlegroups.com>
<u1u8hu$2ps79$1@dont-email.me>
<ef5ad4e6-57ed-4950-baed-3a6746b9d16en@googlegroups.com>
<u20tin$3alrj$1@dont-email.me>
<3986aac8-aec3-4cde-8bb8-8a61c20084c9n@googlegroups.com>
<u2171f$3cbcu$1@dont-email.me> <u21eln$25c$1@reader2.panix.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sun, 23 Apr 2023 15:37:42 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="0ea7176da4100d35ba975bc405962607";
logging-data="4066136"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+ulc3Vbh55lwiupMdH0Ku52Pn+gGZTM3w="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
Thunderbird/102.7.1
Cancel-Lock: sha1:NlT38Ck6PjyBWBPhg2n6zj4+VkI=
In-Reply-To: <u21eln$25c$1@reader2.panix.com>
Content-Language: en-GB
 by: David Brown - Sun, 23 Apr 2023 15:37 UTC

On 22/04/2023 22:05, Grant Edwards wrote:
> On 2023-04-22, David Brown <david.brown@hesbynett.no> wrote:
>
>> A simple addition checksum might be okay much of the time, but it
>> doesn't have the resolving power of a CRC. If the source code changes
>> "a = 1; b = 2;" to "a = 2; b = 1;", the addition checksum is likely to
>> be exactly the same despite the change in the source. In general, you
>> will have much higher chance of collisions, though I think it would be
>> very hard to quantify that.
>
> I remember a long discussion about this a few decades ago. An N bit
> additive checksum maps the source data into the same hash space
> as a N-bit crc.
>
> Therefore, for two randomly chosen sets of input bits, they both have
> a 1 in 2^N chance of a collision. I think that means that for random
> changes to an input set of unspecified properties, they would both
> have the same chance that the hash is unchanged.
>
> However... IIRC, somebody (probably at somewhere like Bell Labs)
> noticed that errors in data transmitted over media like phone lines
> and microwave links are _not_ random. Errors tend to be "bursty" and
> can be statistically characterized. And it was shown that for the
> common error modes for _those_ media, CRCs were better at detecting
> real-world failures than additive checksum. And (this is also
> important) a CRC is far, far simpler to implement in hardware than an
> additive checksum. For the same reasons, CRCs tend to get used for
> things like Ethernet frames, disc sectors, etc.
>
> Later people seem to have adopted CRCs for detecting failures in other
> very dissimilar media (e.g. EPROMs) where implementing a CRC is _more_
> work than an additive checksum. If the failure modes for EPROM are
> similar to those studied at <wherever> when CRCs were chosen, then
> CRCs are probably also a good choice for EPROMs despite the additional
> overhead. If the failure modes for EPROMs are significantly different,
> then CRCs might be both sub-optimal and unnecessarily expensive.
>
> I have no hard data either way, but it was never obvious to me that
> the arguments people use in favor of CRCs (better at detecting burst
> errors on transmission media) necessarily applied to EPROMs.
>
> That said, I do use CRCs rather than additive checksums for things
> like EPROM and flash.
>

That's a lot of good points. You are absolutely correct that CRC's are
better for the types of errors that are often seen in transmission
systems. The person at Bell Labs that you are thinking about is
probably Claude Shannon, famous for his quantitative definition of
information and work on the information capacity of communication
channels with noise.

Another thing you can look at is the distribution of checksum outputs,
for random inputs. For an additive checksum, you can consider your
input as N independent 0-255 random values, added together. The result
will be an approximately normal distribution of the checksum. If you have, say, a 100
byte data block and a 16-bit checksum, it's clear that you will never
get a checksum value greater than 25500, and that you are much more
likely to get a value close to 12750. This kind of clustering means
that the 16-bit checksum contains a lot less than 16 bits of
information. Real data - program images, data telegrams, etc., - are
not fully random and the result is even more clustering and less
information in the checksum.
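
The clustering in the 100-byte example can be checked numerically. The sketch below computes the exact distribution of the sum of 100 independent, uniformly random bytes (an idealised input, as the paragraph above notes) and then finds how narrow a band around 12750 holds a given share of the probability. All names are invented for the illustration.

```c
#define BLOCK_LEN 100
#define MAX_SUM   (255 * BLOCK_LEN)      /* 25500: largest possible sum */

static double dist[MAX_SUM + 1];         /* dist[s] = P(sum == s) */
static double prefix[MAX_SUM + 2];       /* running totals of dist */

/* Build the distribution by adding one uniform byte at a time. */
void byte_sum_distribution(void)
{
    for (int s = 0; s <= MAX_SUM; s++)
        dist[s] = 0.0;
    dist[0] = 1.0;                       /* zero bytes summed so far */

    for (int n = 0; n < BLOCK_LEN; n++) {
        prefix[0] = 0.0;
        for (int s = 0; s <= MAX_SUM; s++)
            prefix[s + 1] = prefix[s] + dist[s];
        /* new P(s) = average of old P(s - v) over v = 0..255 */
        for (int s = 0; s <= MAX_SUM; s++) {
            int lo = (s >= 255) ? s - 255 : 0;
            dist[s] = (prefix[s + 1] - prefix[lo]) / 256.0;
        }
    }
}

/* Smallest w such that sums in [12750 - w, 12750 + w] carry at least
 * `mass` of the total probability. Call byte_sum_distribution first. */
int half_width_for_mass(double mass)
{
    int centre = MAX_SUM / 2;            /* 12750 */
    double total = dist[centre];
    int w = 0;
    while (total < mass && w < centre) {
        w++;
        total += dist[centre - w] + dist[centre + w];
    }
    return w;
}
```

For a mass of 0.99 the half-width comes out at around 1900, so 99% of random 100-byte blocks land on only a few thousand of the 65536 values a 16-bit checksum could take.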

Taking the additive checksum over a larger range, then "folding" the
distribution back by wrapping the checksum to 8-bit or 16-bit will
greatly reduce the clustering. That will help a lot if you have a
program image and use a 16-bit additive checksum, but if you need more
than "1 in 65536" integrity, it's hard to get.

A particular weakness of purely additive checksums is that they only
consider the values of the bytes, not their order - re-arranging the
order of the same data gives the same additive checksum.

CRC's are not as good as more advanced hashes like SHA or MD5. But
their distributions are vastly better than additive checksums, and they
provide integrity checks for a wider variety of possible errors.

Of course, for some uses, an additive checksum might be considered good
enough. There's no need to be more complicated than you need to be.
But since CRC's are usually very simple and efficient to calculate, they
give an option that is a lot better than an additive checksum for little
extra cost, while going beyond them to MD5 or SHA involves significantly
more effort. (SHA is your first choice if you are protecting against
malicious changes.)

Re: Embedding a Checksum in an Image File

<u23qc9$g6p$1@reader2.panix.com>

https://www.novabbs.com/devel/article-flat.php?id=1477&group=comp.arch.embedded#1477

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!panix!.POSTED.localhost!not-for-mail
From: inva...@invalid.invalid (Grant Edwards)
Newsgroups: comp.arch.embedded
Subject: Re: Embedding a Checksum in an Image File
Date: Sun, 23 Apr 2023 17:37:13 -0000 (UTC)
Organization: PANIX Public Access Internet and UNIX, NYC
Message-ID: <u23qc9$g6p$1@reader2.panix.com>
References: <116ff07e-5e25-469a-90a0-9474108aadd3n@googlegroups.com>
<66e9fb2a-b9e6-4597-9afa-6572caaa8ca1n@googlegroups.com>
<a4495c87-c68d-4c87-94aa-701c21bdd19cn@googlegroups.com>
<u1u8hu$2ps79$1@dont-email.me>
<ef5ad4e6-57ed-4950-baed-3a6746b9d16en@googlegroups.com>
<u20tin$3alrj$1@dont-email.me>
<3986aac8-aec3-4cde-8bb8-8a61c20084c9n@googlegroups.com>
<u2171f$3cbcu$1@dont-email.me> <u21eln$25c$1@reader2.panix.com>
<u23jc6$3s2qo$1@dont-email.me>
Injection-Date: Sun, 23 Apr 2023 17:37:13 -0000 (UTC)
Injection-Info: reader2.panix.com; posting-host="localhost:::1";
logging-data="16601"; mail-complaints-to="abuse@panix.com"
User-Agent: slrn/1.0.3 (Linux)
 by: Grant Edwards - Sun, 23 Apr 2023 17:37 UTC

On 2023-04-23, David Brown <david.brown@hesbynett.no> wrote:

> Another thing you can look at is the distribution of checksum outputs,
> for random inputs. For an additive checksum, you can consider your
> input as N independent 0-255 random values, added together. The result
> will be a normal distribution of the checksum. If you have, say, a 100
> byte data block and a 16-bit checksum, it's clear that you will never
> get a checksum value greater than 25500, and that you are much more
> likely to get a value close to 12750.

It never occurred to me that for an N-bit checksum, you would sum
something other than N-bit "words" of the input data.

--
Grant

Re: Embedding a Checksum in an Image File

<u248tk$3vk28$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=1478&group=comp.arch.embedded#1478

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.arch.embedded
Subject: Re: Embedding a Checksum in an Image File
Date: Sun, 23 Apr 2023 23:45:24 +0200
Organization: A noiseless patient Spider
Lines: 25
Message-ID: <u248tk$3vk28$1@dont-email.me>
References: <116ff07e-5e25-469a-90a0-9474108aadd3n@googlegroups.com>
<66e9fb2a-b9e6-4597-9afa-6572caaa8ca1n@googlegroups.com>
<a4495c87-c68d-4c87-94aa-701c21bdd19cn@googlegroups.com>
<u1u8hu$2ps79$1@dont-email.me>
<ef5ad4e6-57ed-4950-baed-3a6746b9d16en@googlegroups.com>
<u20tin$3alrj$1@dont-email.me>
<3986aac8-aec3-4cde-8bb8-8a61c20084c9n@googlegroups.com>
<u2171f$3cbcu$1@dont-email.me> <u21eln$25c$1@reader2.panix.com>
<u23jc6$3s2qo$1@dont-email.me> <u23qc9$g6p$1@reader2.panix.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sun, 23 Apr 2023 21:45:24 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="0ea7176da4100d35ba975bc405962607";
logging-data="4182088"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/EdHEds2tK++/DUoWUzUAEMZVajq19vm8="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
Thunderbird/102.7.1
Cancel-Lock: sha1:Sd1gTdTkhXOIIXH/76zs5mZlOl8=
In-Reply-To: <u23qc9$g6p$1@reader2.panix.com>
Content-Language: en-GB
 by: David Brown - Sun, 23 Apr 2023 21:45 UTC

On 23/04/2023 19:37, Grant Edwards wrote:
> On 2023-04-23, David Brown <david.brown@hesbynett.no> wrote:
>
>> Another thing you can look at is the distribution of checksum outputs,
>> for random inputs. For an additive checksum, you can consider your
>> input as N independent 0-255 random values, added together. The result
>> will be a normal distribution of the checksum. If you have, say, a 100
>> byte data block and a 16-bit checksum, it's clear that you will never
>> get a checksum value greater than 25500, and that you are much more
>> likely to get a value close to 12750.
>
> It never occurred to me that for an N-bit checksum, you would sum
> something other than N-bit "words" of the input data.
>

Usually - in my experience - you sum bytes, using an unsigned integer
8-bit or 16-bit wide. Simple additive checksums are often used on small
8-bit microcontrollers where CRC's are seen (rightly or wrongly) as too
demanding. Perhaps other people have different experiences.

You could certainly sum 16-bit words to get your 16-bit additive
checksum, and that would give a different kind of clustering - maybe
better, maybe not.

Re: Embedding a Checksum in an Image File

<u249ml$3vn9o$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=1479&group=comp.arch.embedded#1479

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.arch.embedded
Subject: Re: Embedding a Checksum in an Image File
Date: Sun, 23 Apr 2023 23:58:45 +0200
Organization: A noiseless patient Spider
Lines: 92
Message-ID: <u249ml$3vn9o$1@dont-email.me>
References: <116ff07e-5e25-469a-90a0-9474108aadd3n@googlegroups.com>
<66e9fb2a-b9e6-4597-9afa-6572caaa8ca1n@googlegroups.com>
<a4495c87-c68d-4c87-94aa-701c21bdd19cn@googlegroups.com>
<u1u8hu$2ps79$1@dont-email.me>
<ef5ad4e6-57ed-4950-baed-3a6746b9d16en@googlegroups.com>
<u20tin$3alrj$1@dont-email.me>
<3986aac8-aec3-4cde-8bb8-8a61c20084c9n@googlegroups.com>
<u2171f$3cbcu$1@dont-email.me>
<99aaf8df-1f0e-4911-9706-0bac770e3d1cn@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Sun, 23 Apr 2023 21:58:45 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="0ea7176da4100d35ba975bc405962607";
logging-data="4185400"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX192A+gUqur979EZY2eAENNOM0v+qqo6M8c="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
Thunderbird/102.7.1
Cancel-Lock: sha1:61F1xjA/n4h6ZGRw3GfsbLT/vh8=
In-Reply-To: <99aaf8df-1f0e-4911-9706-0bac770e3d1cn@googlegroups.com>
Content-Language: en-GB
 by: David Brown - Sun, 23 Apr 2023 21:58 UTC

On 23/04/2023 19:34, Rick C wrote:
> On Saturday, April 22, 2023 at 1:55:01 PM UTC-4, David Brown wrote:
>> On 22/04/2023 18:56, Rick C wrote:
>>> On Saturday, April 22, 2023 at 11:13:32 AM UTC-4, David Brown
>>> wrote:
>>>> On 22/04/2023 05:14, Rick C wrote:
>>>>> On Friday, April 21, 2023 at 11:02:28 AM UTC-4, David Brown
>>>>> wrote:
>>>>>> On 21/04/2023 14:12, Rick C wrote:
>>>>>>>
>>>>>>> This is simply to be able to say this version is unique,
>>>>>>> regardless of what the version number says. Version
>>>>>>> numbers are set manually and not always done correctly.
>>>>>>> I'm looking for something as a backup so that if the
>>>>>>> checksums are different, I can be sure the versions are
>>>>>>> not the same.
>>>>>>>
>>>>>>> The less work involved, the better.
>>>>>>>
>>>>>> Run a simple 32-bit crc over the image. The result is a
>>>>>> hash of the image. Any change in the image will show up as
>>>>>> a change in the crc.
>>>>>
>>>>> No one is trying to detect changes in the image. I'm trying
>>>>> to label the image in a way that can be read in operation.
>>>>> I'm using the checksum simply because that is easy to
>>>>> generate. I've had problems with version numbering in the
>>>>> past. It will be used, but I want it supplemented with a
>>>>> number that will change every time the design changes, at
>>>>> least with a high probability, such as 1 in 64k.
>>>>>
>>>> Again - use a CRC. It will give you what you want.
>>>
>>> Again - as will a simple addition checksum.
>> A simple addition checksum might be okay much of the time, but it
>> doesn't have the resolving power of a CRC. If the source code
>> changes "a = 1; b = 2;" to "a = 2; b = 1;", the addition checksum
>> is likely to be exactly the same despite the change in the source.
>> In general, you will have much higher chance of collisions, though
>> I think it would be very hard to quantify that.
>>
>> Maybe it will be good enough for you. Simple checksums were
>> popular once, and can still make sense if you are very short on
>> program space. But there are good reasons why they fell out of
>> favour in many uses.
>>>
>>>
>>>> You might want to go for 32-bit CRC rather than a 16-bit CRC,
>>>> depending on the kind of program, how often you build it, and
>>>> what consequences a hash collision could have. With a 16-bit
>>>> CRC, you have a 5% chance of a collision after 82 builds. If
>>>> collisions only matter for releases, and you only release a
>>>> couple of updates, fine - but if they matter during development
>>>> builds, you are getting a more significant risk. Since a 32-bit
>>>> CRC is quick and easy, it's worth using.
>>>
>>> Or, I might want to go with a simple checksum.
>>>
>>> Thanks for your comments.
>>>
>> It's your choice (obviously). I only point out the weaknesses in
>> case anyone else is listening in to the thread.
>>
>> If you like, I can post code for a 32-bit CRC. It's a table, and a
>> few lines of C code.
>
> You know nothing of the project I am working on or those that I
> typically work on. But thanks for the advice.
>

You haven't given much to go on. It is still not really clear (to me,
at least) if you are asking about checksums or how to manipulate binary
images as part of a build process, or what you are really asking.

When someone wants a checksum on an image file, the appropriate choice
in most cases is a CRC. If security is an issue, then a secure hash is
needed. For a very limited system, additive checksums might be the
only realistic choice.

But more often, the reason people pick additive checksums rather than
CRCs is because they don't realise that CRCs are actually very simple
and efficient to implement. People unfamiliar with them might have read
a little, and think they need to do calculations for each bit (which is
possible but /slow/), or that they would have to understand the theory
of binary polynomial division rings (they don't). They think CRC's are
complicated and advanced, and shy away from them.

There are a number of people who read this group - maybe some of them
have learned a little from this thread.

Re: Embedding a Checksum in an Image File

<A6i1M.2358052$iU59.1633184@fx14.iad>

https://www.novabbs.com/devel/article-flat.php?id=1480&group=comp.arch.embedded#1480

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!feeder1.feed.usenet.farm!feed.usenet.farm!peer01.ams4!peer.am4.highwinds-media.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx14.iad.POSTED!not-for-mail
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:102.0)
Gecko/20100101 Thunderbird/102.10.0
Subject: Re: Embedding a Checksum in an Image File
Content-Language: en-US
Newsgroups: comp.arch.embedded
References: <116ff07e-5e25-469a-90a0-9474108aadd3n@googlegroups.com>
<66e9fb2a-b9e6-4597-9afa-6572caaa8ca1n@googlegroups.com>
<a4495c87-c68d-4c87-94aa-701c21bdd19cn@googlegroups.com>
<u1u8hu$2ps79$1@dont-email.me>
<ef5ad4e6-57ed-4950-baed-3a6746b9d16en@googlegroups.com>
<u20tin$3alrj$1@dont-email.me>
<3986aac8-aec3-4cde-8bb8-8a61c20084c9n@googlegroups.com>
<u2171f$3cbcu$1@dont-email.me> <u21eln$25c$1@reader2.panix.com>
<u23jc6$3s2qo$1@dont-email.me> <u23qc9$g6p$1@reader2.panix.com>
<u248tk$3vk28$1@dont-email.me>
From: Rich...@Damon-Family.org (Richard Damon)
In-Reply-To: <u248tk$3vk28$1@dont-email.me>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Lines: 30
Message-ID: <A6i1M.2358052$iU59.1633184@fx14.iad>
X-Complaints-To: abuse@easynews.com
Organization: Forte - www.forteinc.com
X-Complaints-Info: Please be sure to forward a copy of ALL headers otherwise we will be unable to process your complaint properly.
Date: Sun, 23 Apr 2023 18:16:00 -0400
X-Received-Bytes: 2821
 by: Richard Damon - Sun, 23 Apr 2023 22:16 UTC

On 4/23/23 5:45 PM, David Brown wrote:
> On 23/04/2023 19:37, Grant Edwards wrote:
>> On 2023-04-23, David Brown <david.brown@hesbynett.no> wrote:
>>
>>> Another thing you can look at is the distribution of checksum outputs,
>>> for random inputs.  For an additive checksum, you can consider your
>>> input as N independent 0-255 random values, added together.  The result
>>> will be a normal distribution of the checksum.  If you have, say, a 100
>>> byte data block and a 16-bit checksum, it's clear that you will never
>>> get a checksum value greater than 25500, and that you are much more
>>> likely to get a value close to 12750.
>>
>> It never occurred to me that for an N-bit checksum, you would sum
>> something other than N-bit "words" of the input data.
>>
>
> Usually - in my experience - you sum bytes, using an unsigned integer
> 8-bit or 16-bit wide.  Simple additive checksums are often used on small
> 8-bit microcontrollers where CRC's are seen (rightly or wrongly) as too
> demanding.  Perhaps other people have different experiences.
>
> You could certainly sum 16-bit words to get your 16-bit additive
> checksum, and that would give a different kind of clustering - maybe
> better, maybe not.
>
>

I have seen 16-bit checksums done both ways. Summing 16-bit units does
eliminate the issue of clustering, and makes adjacent byte swaps
detectable.
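
A small sketch of the two variants, for comparison. It assumes little-endian 16-bit words and treats an odd trailing byte as the low byte of a final word; both choices, and the function names, are made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* 16-bit checksum formed by summing individual bytes. */
uint16_t sum16_bytes(const unsigned char *data, size_t len)
{
    uint16_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum = (uint16_t)(sum + data[i]);
    return sum;
}

/* 16-bit checksum formed by summing little-endian 16-bit words. */
uint16_t sum16_words(const unsigned char *data, size_t len)
{
    uint16_t sum = 0;
    size_t i = 0;
    for (; i + 1 < len; i += 2) {
        unsigned word = data[i] | ((unsigned)data[i + 1] << 8);
        sum = (uint16_t)(sum + word);
    }
    if (i < len)                          /* odd trailing byte */
        sum = (uint16_t)(sum + data[i]);
    return sum;
}
```

For the bytes 12 34 56 78 (hex) against 34 12 56 78, sum16_bytes gives the same result while sum16_words does not, since the first word changes from 0x3412 to 0x1234. Swapping two whole words, as in 56 78 12 34, still goes unnoticed by sum16_words, so the order weakness is reduced rather than removed.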

Re: Embedding a Checksum in an Image File

<070fc693-6b70-4767-af26-2d3ccb4f7919n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=1481&group=comp.arch.embedded#1481

X-Received: by 2002:ac8:5a54:0:b0:3bf:da0f:ed7c with SMTP id o20-20020ac85a54000000b003bfda0fed7cmr3795879qta.11.1682288680999;
Sun, 23 Apr 2023 15:24:40 -0700 (PDT)
X-Received: by 2002:ad4:559e:0:b0:5ef:5059:1ae5 with SMTP id
f30-20020ad4559e000000b005ef50591ae5mr1805198qvx.7.1682288680611; Sun, 23 Apr
2023 15:24:40 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!news.uzoreto.com!peer03.ams4!peer.am4.highwinds-media.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch.embedded
Date: Sun, 23 Apr 2023 15:24:40 -0700 (PDT)
In-Reply-To: <u249ml$3vn9o$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=65.207.89.54; posting-account=I-_H_woAAAA9zzro6crtEpUAyIvzd19b
NNTP-Posting-Host: 65.207.89.54
References: <116ff07e-5e25-469a-90a0-9474108aadd3n@googlegroups.com>
<66e9fb2a-b9e6-4597-9afa-6572caaa8ca1n@googlegroups.com> <a4495c87-c68d-4c87-94aa-701c21bdd19cn@googlegroups.com>
<u1u8hu$2ps79$1@dont-email.me> <ef5ad4e6-57ed-4950-baed-3a6746b9d16en@googlegroups.com>
<u20tin$3alrj$1@dont-email.me> <3986aac8-aec3-4cde-8bb8-8a61c20084c9n@googlegroups.com>
<u2171f$3cbcu$1@dont-email.me> <99aaf8df-1f0e-4911-9706-0bac770e3d1cn@googlegroups.com>
<u249ml$3vn9o$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <070fc693-6b70-4767-af26-2d3ccb4f7919n@googlegroups.com>
Subject: Re: Embedding a Checksum in an Image File
From: gnuarm.d...@gmail.com (Rick C)
Injection-Date: Sun, 23 Apr 2023 22:24:40 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 7796
 by: Rick C - Sun, 23 Apr 2023 22:24 UTC

On Sunday, April 23, 2023 at 5:58:51 PM UTC-4, David Brown wrote:
> On 23/04/2023 19:34, Rick C wrote:
> > On Saturday, April 22, 2023 at 1:55:01 PM UTC-4, David Brown wrote:
> >> On 22/04/2023 18:56, Rick C wrote:
> >>> On Saturday, April 22, 2023 at 11:13:32 AM UTC-4, David Brown
> >>> wrote:
> >>>> On 22/04/2023 05:14, Rick C wrote:
> >>>>> On Friday, April 21, 2023 at 11:02:28 AM UTC-4, David Brown
> >>>>> wrote:
> >>>>>> On 21/04/2023 14:12, Rick C wrote:
> >>>>>>>
> >>>>>>> This is simply to be able to say this version is unique,
> >>>>>>> regardless of what the version number says. Version
> >>>>>>> numbers are set manually and not always done correctly.
> >>>>>>> I'm looking for something as a backup so that if the
> >>>>>>> checksums are different, I can be sure the versions are
> >>>>>>> not the same.
> >>>>>>>
> >>>>>>> The less work involved, the better.
> >>>>>>>
> >>>>>> Run a simple 32-bit crc over the image. The result is a
> >>>>>> hash of the image. Any change in the image will show up as
> >>>>>> a change in the crc.
> >>>>>
> >>>>> No one is trying to detect changes in the image. I'm trying
> >>>>> to label the image in a way that can be read in operation.
> >>>>> I'm using the checksum simply because that is easy to
> >>>>> generate. I've had problems with version numbering in the
> >>>>> past. It will be used, but I want it supplemented with a
> >>>>> number that will change every time the design changes, at
> >>>>> least with a high probability, such as 1 in 64k.
> >>>>>
> >>>> Again - use a CRC. It will give you what you want.
> >>>
> >>> Again - as will a simple addition checksum.
> >> A simple addition checksum might be okay much of the time, but it
> >> doesn't have the resolving power of a CRC. If the source code
> >> changes "a = 1; b = 2;" to "a = 2; b = 1;", the addition checksum
> >> is likely to be exactly the same despite the change in the source.
> >> In general, you will have much higher chance of collisions, though
> >> I think it would be very hard to quantify that.
> >>
> >> Maybe it will be good enough for you. Simple checksums were
> >> popular once, and can still make sense if you are very short on
> >> program space. But there are good reasons why they fell out of
> >> favour in many uses.
> >>>
> >>>
> >>>> You might want to go for 32-bit CRC rather than a 16-bit CRC,
> >>>> depending on the kind of program, how often you build it, and
> >>>> what consequences a hash collision could have. With a 16-bit
> >>>> CRC, you have a 5% chance of a collision after 82 builds. If
> >>>> collisions only matter for releases, and you only release a
> >>>> couple of updates, fine - but if they matter during development
> >>>> builds, you are getting a more significant risk. Since a 32-bit
> >>>> CRC is quick and easy, it's worth using.
> >>>
> >>> Or, I might want to go with a simple checksum.
> >>>
> >>> Thanks for your comments.
> >>>
> >> It's your choice (obviously). I only point out the weaknesses in
> >> case anyone else is listening in to the thread.
> >>
> >> If you like, I can post code for a 32-bit CRC. It's a table, and a
> >> few lines of C code.
> >
> > You know nothing of the project I am working on or those that I
> > typically work on. But thanks for the advice.
> >
> You haven't given much to go on. It is still not really clear (to me,
> at least) if you are asking about checksums or how to manipulate binary
> images as part of a build process, or what you are really asking.
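
Two of the numerical claims quoted above can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and the collision estimate assumes each build's 16-bit value is uniformly random and independent, which a real sequence of builds may not satisfy.

```python
import zlib


def additive_checksum(data: bytes) -> int:
    """16-bit additive checksum: sum of all bytes, truncated to 16 bits."""
    return sum(data) & 0xFFFF


def collision_probability(builds: int, bits: int) -> float:
    """Chance that at least two of `builds` random `bits`-wide values coincide."""
    space = 1 << bits
    p_all_distinct = 1.0
    for k in range(builds):
        p_all_distinct *= (space - k) / space
    return 1.0 - p_all_distinct


before = b"a = 1; b = 2;"
after = b"a = 2; b = 1;"

# Same bytes in a different order: the additive checksum cannot tell them apart.
same_sum = additive_checksum(before) == additive_checksum(after)

# CRC-32 (via zlib) depends on byte positions, so the two values differ.
same_crc = zlib.crc32(before) == zlib.crc32(after)

# Birthday-style estimate for a 16-bit value over 82 builds.
p82 = collision_probability(82, 16)
```

Here `same_sum` is true and `same_crc` is false, and `p82` comes out just under 0.05, in line with the "5% after 82 builds" figure.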

If you don't understand, you are making this far more complicated than it is. I don't know what to tell you. There are no other details that are relevant. Don't read into this what is not there.

> When someone wants a checksum on an image file, the appropriate choice
> in most cases is a CRC.

Why? What makes a CRC an "appropriate" choice? Normally, when I design something, I establish the requirements. What requirements are you assuming that would make the CRC more desirable than a simple checksum?

> If security is an issue, then a secure hash is
> needed. For a very limited system, additive checksums might be the
> only realistic choice.

What have I said that makes you think security is an issue??? I don't recall ever mentioning anything about security. Do you recall what I did say?

> But more often, the reason people pick additive checksums rather than
> CRCs is because they don't realise that CRCs are actually very simple
> and efficient to implement.

The fact that they are "simple and efficient" is not a reason to use them. I repeat, what are the requirements?

> People unfamiliar with them might have read
> a little, and think they need to do calculations for each bit (which is
> possible but /slow/), or that they would have to understand the theory
> of binary polynomial division rings (they don't). They think CRC's are
> complicated and advanced, and shy away from them.
>
> There are a number of people who read this group - maybe some of them
> have learned a little from this thread.
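
For readers who have not seen one, the sketch below shows the table-plus-loop structure referred to in the quoted text, next to the slower bit-at-a-time form. It is written in Python for brevity and is not the code offered in the thread; it assumes the common reflected CRC-32 (polynomial 0xEDB88320, initial value and final XOR 0xFFFFFFFF), which is the variant `zlib.crc32` computes.

```python
import zlib

POLY = 0xEDB88320  # reflected form of the CRC-32 polynomial 0x04C11DB7


def make_table() -> list:
    """Precompute the CRC of every possible byte value (256 entries)."""
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            crc = (crc >> 1) ^ POLY if crc & 1 else crc >> 1
        table.append(crc)
    return table


TABLE = make_table()


def crc32_table(data: bytes) -> int:
    """Table-driven CRC-32: one lookup, one shift and two XORs per byte."""
    crc = 0xFFFFFFFF
    for b in data:
        crc = (crc >> 8) ^ TABLE[(crc ^ b) & 0xFF]
    return crc ^ 0xFFFFFFFF


def crc32_bitwise(data: bytes) -> int:
    """Bit-at-a-time CRC-32: same result, eight inner steps per byte."""
    crc = 0xFFFFFFFF
    for b in data:
        crc ^= b
        for _ in range(8):
            crc = (crc >> 1) ^ POLY if crc & 1 else crc >> 1
    return crc ^ 0xFFFFFFFF


check_value = crc32_table(b"123456789")
```

In C the 256-entry table of 32-bit values occupies 1 KB, which is where the size quoted elsewhere in the thread comes from. Both functions agree with `zlib.crc32` on the standard test string "123456789".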

I suppose there is that possibility. But when people make claims about something being good or "better", without substantiation, there's not much to learn.

If you think a discussion of CRC calculations would be useful, why don't you open a thread and discuss them, instead of insisting they are the right solution to my problem, when you don't even know what the problem requirements are? It's all here in the thread. You only need to read, without projecting your opinions on the problem statement.

--

Rick C.

+-+ Get 1,000 miles of free Supercharging
+-+ Tesla referral code - https://ts.la/richard11209

Re: Embedding a Checksum in an Image File

<u25a69$87j1$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=1482&group=comp.arch.embedded#1482

From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.arch.embedded
Subject: Re: Embedding a Checksum in an Image File
Date: Mon, 24 Apr 2023 09:13:13 +0200
Message-ID: <u25a69$87j1$1@dont-email.me>

On 24/04/2023 00:16, Richard Damon wrote:
> On 4/23/23 5:45 PM, David Brown wrote:
>> On 23/04/2023 19:37, Grant Edwards wrote:
>>> On 2023-04-23, David Brown <david.brown@hesbynett.no> wrote:
>>>
>>>> Another thing you can look at is the distribution of checksum outputs,
>>>> for random inputs.  For an additive checksum, you can consider your
>>>> input as N independent 0-255 random values, added together.  The result
>>>> will be an approximately normal distribution of the checksum.  If you have, say, a 100
>>>> byte data block and a 16-bit checksum, it's clear that you will never
>>>> get a checksum value greater than 25500, and that you are much more
>>>> likely to get a value close to 12750.
>>>
>>> It never occurred to me that for an N-bit checksum, you would sum
>>> something other than N-bit "words" of the input data.
>>>
>>
>> Usually - in my experience - you sum bytes, using an unsigned integer
>> 8 or 16 bits wide.  Simple additive checksums are often used on
>> small 8-bit microcontrollers where CRC's are seen (rightly or wrongly)
>> as too demanding.  Perhaps other people have different experiences.
>>
>> You could certainly sum 16-bit words to get your 16-bit additive
>> checksum, and that would give a different kind of clustering - maybe
>> better, maybe not.
>>
>>
>
> I have seen 16-bit checksums done both ways. Summing 16 bit units does
> eliminate the issue of clustering, and makes adjacent byte swaps
> detectable.
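
The two ways of forming a 16-bit additive checksum mentioned above can be compared directly. A small sketch, assuming little-endian 16-bit words and an even-length block; the function names are illustrative only.

```python
def sum_bytes(data: bytes) -> int:
    """16-bit additive checksum formed by adding individual bytes."""
    return sum(data) & 0xFFFF


def sum_words(data: bytes) -> int:
    """16-bit additive checksum formed by adding little-endian 16-bit words."""
    total = 0
    for i in range(0, len(data), 2):
        total += data[i] | (data[i + 1] << 8)
    return total & 0xFFFF


# Largest byte-wise sum for a 100-byte block: far below the 16-bit limit,
# so most of the 16-bit range can never appear.
max_byte_sum = sum_bytes(bytes([0xFF] * 100))

original = bytes([0x12, 0x34, 0x56, 0x78])
swapped = bytes([0x34, 0x12, 0x56, 0x78])  # first two bytes exchanged

byte_sum_detects = sum_bytes(original) != sum_bytes(swapped)
word_sum_detects = sum_words(original) != sum_words(swapped)
```

With these inputs `max_byte_sum` is 25500, the byte-wise sum misses the swap, and the word-wise sum catches it.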

Long ago, there used to be a definite risk of mixing up endianness when
dealing with program images burned to flash or eeprom. Popular "hex"
formats like Intel Hex and Motorola SRecord could differ in endianness.
So byte swaps in the entire image were a real possibility, and good to
guard against. But it's hard to imagine how an individual byte swap
could occur - I see bigger movements and re-arrangements being more
likely, and using 16-bit units will not help much there. Still, I think
there is little doubt that using 16-bit units is better than using 8-bit
units in many ways (except for efficient implementation on small 8-bit
devices).
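
The point about larger re-arrangements can be illustrated the same way. In this sketch (helper names are hypothetical, words are assumed little-endian, and `zlib.crc32` stands in for whatever CRC a project would use), a 16-bit word sum catches a whole-image endianness swap but not two words changing places, while the CRC catches both.

```python
import zlib


def sum_words(data: bytes) -> int:
    """16-bit additive checksum over little-endian 16-bit words."""
    total = 0
    for i in range(0, len(data), 2):
        total += data[i] | (data[i + 1] << 8)
    return total & 0xFFFF


def swap_bytes_in_words(data: bytes) -> bytes:
    """Exchange the two bytes of every 16-bit word (an endianness mix-up)."""
    out = bytearray(data)
    out[0::2], out[1::2] = data[1::2], data[0::2]
    return bytes(out)


image = bytes.fromhex("12345678")
endian_swapped = swap_bytes_in_words(image)  # 34 12 78 56
words_moved = image[2:4] + image[0:2]        # 56 78 12 34

sum_sees_endian_swap = sum_words(image) != sum_words(endian_swapped)
sum_sees_moved_words = sum_words(image) != sum_words(words_moved)
crc_sees_endian_swap = zlib.crc32(image) != zlib.crc32(endian_swapped)
crc_sees_moved_words = zlib.crc32(image) != zlib.crc32(words_moved)
```

Addition is commutative, so any re-ordering of whole summing units leaves an additive checksum unchanged regardless of the unit width.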

Re: Embedding a Checksum in an Image File

<u25ae7$87j1$2@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=1483&group=comp.arch.embedded#1483

From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.arch.embedded
Subject: Re: Embedding a Checksum in an Image File
Date: Mon, 24 Apr 2023 09:17:27 +0200
Message-ID: <u25ae7$87j1$2@dont-email.me>

On 24/04/2023 00:24, Rick C wrote:
> On Sunday, April 23, 2023 at 5:58:51 PM UTC-4, David Brown wrote:
>
>> When someone wants a checksum on an image file, the appropriate
>> choice in most cases is a CRC.
>
> Why? What makes a CRC an "appropriate" choice? Normally, when I
> design something, I establish the requirements. What requirements
> are you assuming that would make the CRC more desirable than a
> simple checksum?
>

I've already explained this in quite a lot of detail in this thread (as
have others). If you don't like my explanation, or didn't read it,
that's okay. You are under no obligation to learn about CRCs. Or if
you prefer to look it up in other sources, that's obviously also an option.

>
>> If security is an issue, then a secure hash is needed. For a very
>> limited system, additive checksums might be the only realistic
>> choice.
>
> What have I said that makes you think security is an issue??? I
> don't recall ever mentioning anything about security. Do you recall
> what I did say?
>
>
> If you think a discussion of CRC calculations would be useful, why
> don't you open a thread and discuss them, instead of insisting they
> are the right solution to my problem, when you don't even know what
> the problem requirements are? It's all here in the thread. You only
> need to read, without projecting your opinions on the problem
> statement.
>

I've asked you this before - are you /sure/ you understand how Usenet works?

Re: Embedding a Checksum in an Image File

<u25bas$8fh0$2@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=1484&group=comp.arch.embedded#1484

From: blockedo...@foo.invalid (Don Y)
Newsgroups: comp.arch.embedded
Subject: Re: Embedding a Checksum in an Image File
Date: Mon, 24 Apr 2023 00:32:42 -0700
Message-ID: <u25bas$8fh0$2@dont-email.me>

On 4/22/2023 7:57 AM, David Brown wrote:
>>> However, in almost every case where CRC's might be useful, you have
>>> additional checks of the sanity of the data, and an all-zero or all-one data
>>> block would be rejected.  For example, Ethernet packets use CRC for
>>> integrity checking, but an attempt to send a packet type 0 from MAC address
>>> 00:00:00:00:00:00 to address 00:00:00:00:00:00, of length 0, would be
>>> rejected anyway.
>>
>> Why look at "data" -- which may be suspect -- and *then* check its CRC?
>> Run the CRC first.  If it fails, decide how you are going to proceed
>> or recover.
>
> That is usually the order, yes.  Sometimes you want "fail fast", such as
> dropping a packet that was not addressed to you (it doesn't matter if it was
> received correctly but for someone else, or it was addressed to you but the
> receiver address was corrupted - you are dropping the packet either way).  But
> usually you will run the CRC then look at the data.
>
> But the order doesn't matter - either way, you are still checking for valid
> data, and if the data is invalid, it does not matter if the CRC only passed by
> luck or by all zeros.

You're assuming the CRC is supposed to *vouch* for the data.
The CRC can be there simply to vouch for the *transport* of a
datagram.

>>> I can't think of any use-cases where you would be passing around a block of
>>> "pure" data that could reasonably take absolutely any value, without any
>>> type of "envelope" information, and where you would think a CRC check is
>>> appropriate.
>>
>> I append a *version specific* CRC to each packet of marshalled data
>> in my RMIs.  If the data is corrupted in transit *or* if the
>> wrong version API ends up targeted, the operation will abend
>> because we know the data "isn't right".
>
> Using a version-specific CRC sounds silly.  Put the version information in the
> packet.

The packet routed to a particular interface is *supposed* to
conform to "version X" of an interface. There are different stubs
generated for different versions of EACH interface. The OCL for
the interface defines (and is used to check) the form of that
interface to that service/mechanism.

The parameters are checked on the client side -- why tie up the
transport medium with data that is inappropriate (redundant)
to THAT interface? Why tie up the server verifying that data?
The stub generator can perform all of those checks automatically
and CONSISTENTLY based on the OCL definition of that version
of that interface (because developers make mistakes).

So, at the instant you schedule the marshalled data for transmission,
you *know* the parameters are "appropriate" and compliant with
the constraints of THAT version of THAT interface.

Now, you have to ensure the packet doesn't get corrupted (altered) in
transmission. If it remains intact, then there is no need to check
the parameters on the server side.

NONE OF THE PARAMETERS... including the (implied) "interface version" field!

Yet, folks make mistakes. So, you want some additional reassurance
that this is at least intended for this version of the interface,
ESPECIALLY IF THAT CAN BE MADE AVAILABLE FOR ZERO COST (i.e., check
to see if the residual is 0xDEADBEEF instead of 0xB16B00B5).

Why burden the packet with a "protocol version" parameter?

So, use a version-specific CRC on the packet. If it fails, then
either the data in the packet has been corrupted (which could just
as easily have involved an embedded "interface version" parameter);
or the packet was formed with the wrong CRC.

If the CRC is correct FOR THAT VERSION OF THE PROTOCOL, then
why bother looking at a "protocol version" parameter? Would
you ALSO want to verify all the rest of the parameters?
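
One way to picture a "version-specific CRC" is to fold a version tag into the CRC calculation without transmitting it. The sketch below is an assumption about how such a scheme could look, not the mechanism used in the poster's system; `seal`, `check`, and the tag layout are invented for illustration, and `zlib.crc32` stands in for the actual CRC.

```python
import struct
import zlib


def version_tag(major: int, minor: int) -> bytes:
    """Bytes that represent the interface version (hypothetical layout)."""
    return struct.pack("<HH", major, minor)


def seal(payload: bytes, major: int, minor: int) -> bytes:
    """Append a CRC-32 over (version tag + payload); the tag is not sent."""
    crc = zlib.crc32(version_tag(major, minor) + payload)
    return payload + struct.pack("<I", crc)


def check(packet: bytes, major: int, minor: int) -> bool:
    """Recompute the CRC using the version the receiver expects."""
    payload = packet[:-4]
    received = struct.unpack("<I", packet[-4:])[0]
    return zlib.crc32(version_tag(major, minor) + payload) == received


packet = seal(b"set_temp 21", 4, 2)
damaged = bytes([packet[0] ^ 0x01]) + packet[1:]  # one bit flipped in transit

ok_same_version = check(packet, 4, 2)
ok_other_version = check(packet, 4, 3)
ok_damaged = check(damaged, 4, 2)
```

The packet is only four bytes longer than the payload, and a receiver expecting version 4.3 rejects it. A mismatch returns the same `False` whether the payload was damaged or the receiver expected a different version.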

>> I *could* put a header saying "this is version 4.2".  And, that
>> tells me nothing about the integrity of the rest of the data.
>> OTOH, ensuring the CRC reflects "4.2" does -- if the recipient
>> expects it to be so.
>
> Now you don't know if the data is corrupted, or for the wrong version - or
> occasionally, corrupted /and/ the wrong version but passing the CRC anyway.

You don't know if the parameters have been corrupted in a manner that
allows a packet intended for the correct interface to appear as correct.
What's your point?

> Unless you are absolutely desperate to save every bit you can, your system will
> be simpler, clearer, and more reliable if you separate your purposes.

Yes. You verify the correct interface at the client side -- where
it is invoked by the client and enforced in the OCL generated stub.
Thereafter, the server is concerned with corruption during transport
and the version specific CRC just gives another reassurance of
correct version without adding another cost.

[Imagine EVERY subroutine function call in your system having
such overhead. Would you want to push an "interface version"
onto the stack along with all of the arguments for that
subr/ftn? Or, would you just hope everything was intact?]

>>>>>> You can also "salt" the calculation so that the residual
>>>>>> is deliberately nonzero.  So, for example, "success" is
>>>>>> indicated by a residual of 0x474E.  :>
>>>>>
>>>>> Again, pointless.
>>>>>
>>>>> Salt is important for security-related hashes (like password hashes), not
>>>>> for integrity checks.
>>>>
>>>> You've missed the point.  The correct "sum" can be anything.
>>>> Why is "0" more special than any other value?  As the value is
>>>> typically meaningless to anything other than the code that verifies
>>>> it, you couldn't look at an image (or the output of the verifier)
>>>> and gain anything from seeing that obscure value.
>>>
>>> Do you actually know what is meant by "salt" in the context of hashes, and
>>> why it is useful in some circumstances?  Do you understand that "salt" is
>>> added (usually prepended, or occasionally mixed in in some other way) to the
>>> data /before/ the hash is calculated?
>>
>> What term would you have me use to indicate a "bias" applied to a CRC
>> algorithm?
>
> Well, first I'd note that any kind of modification to the basic CRC algorithm
> is pointless from the viewpoint of its use as an integrity check.  (There have
> been, mostly historically, some justifications in terms of implementation
> efficiency.  For example, bit and byte re-ordering could be done to suit
> hardware bit-wise implementations.)
>
> Otherwise I'd say you are picking a specific initial value if that is what you
> are doing, or modifying the final value (inverting it or xor'ing it with a
> fixed value).  There are, AFAIK, no specific terms for these - and I don't see
> any benefit in having one.  Misusing the term "salt" from cryptography is
> certainly not helpful.

Salt just ensures that you can differentiate between functionally identical
values. I.e., in a CRC, it differentiates the "0x0000" that CRC-1
generates from the "0x0000" that CRC-2 generates.

You don't see the parallel to ensuring that *my* use of "Passw0rd" is
encoded in a different manner than *your* use of "Passw0rd"?

>> See the RMI description.
>
> I'm sorry, I have no idea what "RMI" is or where it is described. You've
> mentioned that abbreviation twice, but I can't figure it out.

<https://en.wikipedia.org/wiki/RMI>
<https://en.wikipedia.org/wiki/OCL>

Nothing magical with either term.

>> OTOH, "salting" the calculation so that it is expected to yield
>> a value of 0x13 means *those* situations will be flagged as errors
>> (and a different set of situations will sneak by, undetected).
>
> And that gives you exactly /zero/ benefit.

See above.

> You run your hash algorithm, and check for the single value that indicates no
> errors.  It does not matter if that number is 0, 0x13, or - often more
-----------^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

As you've admitted, it doesn't matter. So, why wouldn't I opt to have
an algorithm for THIS interface give me a result that is EXPECTED
for this protocol? What value picking "0"?

> conveniently - the number attached at the end of the image as the expected
> result of the hash of the rest of the data.

>>> To be more accurate, the chances of them passing unnoticed are of the order
>>> of 1 in 2^n, for a good n-bit check such as a CRC check. Certain types of
>>> error are always detectable, such as single and double bit errors.  That is
>>> the point of using a checksum or hash for integrity checking.
>>>
>>> /Intentional/ changes are a different matter.  If a hacker changes the
>>> program image, they can change the transmitted hash to their own calculated
>>> hash.  Or for a small CRC, they could change a different part of the image
>>> until the original checksum matched - for a 16-bit CRC, that only takes
>>> 65,535 attempts in the worst case.
>>
>> If the approach used is "typical", then you need far fewer attempts to
>> produce a correct image -- without EVER knowing where the CRC is stored.
>
> It is difficult to know what you are trying to say here, but if you believe
> that different initial values in a CRC algorithm make it harder to modify an
> image to make it pass the integrity test, you are simply wrong.
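
The brute-force figure in the quoted text is easy to reproduce for a 16-bit CRC. The sketch below assumes the CRC-CCITT variant provided by Python's `binascii.crc_hqx` and a hypothetical layout in which the last two bytes of the image are free to change; it searches those two bytes until the altered image has the same CRC as the original.

```python
import binascii


def crc16(data: bytes, crc: int = 0) -> int:
    """16-bit CRC-CCITT, continuing from `crc` as the running value."""
    return binascii.crc_hqx(data, crc)


def find_patch(body: bytes, target_crc: int):
    """Try every 2-byte suffix until crc16(body + suffix) equals target_crc."""
    state = crc16(body)  # CRC of the fixed part, computed once
    for attempts, candidate in enumerate(range(0x10000), start=1):
        patch = candidate.to_bytes(2, "big")
        if crc16(patch, state) == target_crc:
            return patch, attempts
    raise RuntimeError("no patch found")


original = b"threshold=10;" + b"\x00\x00"  # image with two spare bytes
altered_body = b"threshold=99;"            # changed content, spare bytes open

patch, attempts = find_patch(altered_body, crc16(original))
forged = altered_body + patch
```

The search never needs more than 65,536 tries, and the initial value or expected residual only changes which suffix is found, not how long the search takes.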


Re: Embedding a Checksum in an Image File

<cbfefcb9-2d39-4e75-bb4f-cbda15e1b91en@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=1485&group=comp.arch.embedded#1485

From: gnuarm.d...@gmail.com (Rick C)
Newsgroups: comp.arch.embedded
Subject: Re: Embedding a Checksum in an Image File
Date: Mon, 24 Apr 2023 01:07:13 -0700 (PDT)
Message-ID: <cbfefcb9-2d39-4e75-bb4f-cbda15e1b91en@googlegroups.com>

On Monday, April 24, 2023 at 3:17:33 AM UTC-4, David Brown wrote:
> On 24/04/2023 00:24, Rick C wrote:
> > On Sunday, April 23, 2023 at 5:58:51 PM UTC-4, David Brown wrote:
> >
> >> When someone wants a checksum on an image file, the appropriate
> >> choice in most cases is a CRC.
> >
> > Why? What makes a CRC an "appropriate" choice? Normally, when I
> > design something, I establish the requirements. What requirements
> > are you assuming that would make the CRC more desirable than a
> > simple checksum?
> >
> I've already explained this in quite a lot of detail in this thread (as
> have others). If you don't like my explanation, or didn't read it,
> that's okay. You are under no obligation to learn about CRCs. Or if
> you prefer to look it up in other sources, that's obviously also an option.

Hmmm... I ask you a question about why you think CRC is better for my application and you respond oddly. So you can't explain why the CRC would be better for my application? OK, thanks anyway.

> >> If security is an issue, then a secure hash is needed. For a very
> >> limited system, additive checksums might be the only realistic
> >> choice.
> >
> > What have I said that makes you think security is an issue??? I
> > don't recall ever mentioning anything about security. Do you recall
> > what I did say?
> >
> >
> > If you think a discussion of CRC calculations would be useful, why
> > don't you open a thread and discuss them, instead of insisting they
> > are the right solution to my problem, when you don't even know what
> > the problem requirements are? It's all here in the thread. You only
> > need to read, without projecting your opinions on the problem
> > statement.
> >
> I've asked you this before - are you /sure/ you understand how Usenet works?

I will say this again: rather than burying your comments on CRC in this thread about checksums, why not open a new thread and allow the world to read what you have to say, instead of commenting as a side topic in a thread where most people tuned out long ago? You can use an appropriate subject line like, "Why CRC is better than checksums for some applications".

Or you can continue to muddy up the waters here by discussing something that is of no value in this application.

--

Rick C.

++- Get 1,000 miles of free Supercharging
++- Tesla referral code - https://ts.la/richard11209

Re: Embedding a Checksum in an Image File

<u2646q$clg2$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=1486&group=comp.arch.embedded#1486

From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.arch.embedded
Subject: Re: Embedding a Checksum in an Image File
Date: Mon, 24 Apr 2023 16:37:14 +0200
Message-ID: <u2646q$clg2$1@dont-email.me>

On 24/04/2023 09:32, Don Y wrote:
> On 4/22/2023 7:57 AM, David Brown wrote:
>>>> However, in almost every case where CRC's might be useful, you have
>>>> additional checks of the sanity of the data, and an all-zero or
>>>> all-one data block would be rejected.  For example, Ethernet packets
>>>> use CRC for integrity checking, but an attempt to send a packet type
>>>> 0 from MAC address 00:00:00:00:00:00 to address 00:00:00:00:00:00,
>>>> of length 0, would be rejected anyway.
>>>
>>> Why look at "data" -- which may be suspect -- and *then* check its CRC?
>>> Run the CRC first.  If it fails, decide how you are going to proceed
>>> or recover.
>>
>> That is usually the order, yes.  Sometimes you want "fail fast", such
>> as dropping a packet that was not addressed to you (it doesn't matter
>> if it was received correctly but for someone else, or it was addressed
>> to you but the receiver address was corrupted - you are dropping the
>> packet either way).  But usually you will run the CRC then look at the
>> data.
>>
>> But the order doesn't matter - either way, you are still checking for
>> valid data, and if the data is invalid, it does not matter if the CRC
>> only passed by luck or by all zeros.
>
> You're assuming the CRC is supposed to *vouch* for the data.
> The CRC can be there simply to vouch for the *transport* of a
> datagram.

I am assuming that the CRC is there to determine the integrity of the
data in the face of possible unintentional errors. That's what CRC
checks are for. They have nothing to do with the content of the data,
or the type of the data package or image.

As an example of the use of CRC's in messaging, look at Ethernet frames:

<https://en.wikipedia.org/wiki/Ethernet_frame>

The CRC does not care about the content of the data it protects.

>
> So, use a version-specific CRC on the packet.  If it fails, then
> either the data in the packet has been corrupted (which could just
> as easily have involved an embedded "interface version" parameter);
> or the packet was formed with the wrong CRC.
>
> If the CRC is correct FOR THAT VERSION OF THE PROTOCOL, then
> why bother looking at a "protocol version" parameter?  Would
> you ALSO want to verify all the rest of the parameters?
>

I'm sorry, I simply cannot see your point. Identifying the version of a
protocol, or other protocol type information, is a totally orthogonal
task to ensuring the integrity of the data. The concepts should be
handled separately.
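
A minimal sketch of the separated approach argued for here, assuming a made-up two-byte version header and `zlib.crc32` as the integrity check (neither comes from the thread). The version travels in the packet and the CRC covers header and payload, so the receiver can report which of the two things went wrong.

```python
import struct
import zlib

HEADER = struct.Struct("<BB")  # major, minor (hypothetical layout)


def seal(payload: bytes, major: int, minor: int) -> bytes:
    """Prefix the version, then append a CRC-32 over header and payload."""
    body = HEADER.pack(major, minor) + payload
    return body + struct.pack("<I", zlib.crc32(body))


def check(packet: bytes, expected=(4, 2)) -> str:
    """Integrity first, then version: two separate questions, two answers."""
    body = packet[:-4]
    received = struct.unpack("<I", packet[-4:])[0]
    if zlib.crc32(body) != received:
        return "corrupt"
    if HEADER.unpack_from(body) != expected:
        return "wrong version"
    return "ok"


good = seal(b"set_temp 21", 4, 2)
old = seal(b"set_temp 21", 4, 1)
damaged = good[:2] + bytes([good[2] ^ 0x01]) + good[3:]  # one bit flipped

results = (check(good), check(old), check(damaged))
```

The cost is two extra bytes per packet; the gain is that `results` distinguishes an intact packet for another version from a damaged one.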

>>> What term would you have me use to indicate a "bias" applied to a CRC
>>> algorithm?
>>
>> Well, first I'd note that any kind of modification to the basic CRC
>> algorithm is pointless from the viewpoint of its use as an integrity
>> check.  (There have been, mostly historically, some justifications in
>> terms of implementation efficiency.  For example, bit and byte
>> re-ordering could be done to suit hardware bit-wise implementations.)
>>
>> Otherwise I'd say you are picking a specific initial value if that is
>> what you are doing, or modifying the final value (inverting it or
>> xor'ing it with a fixed value).  There are, AFAIK, no specific terms
>> for these - and I don't see any benefit in having one.  Misusing the
>> term "salt" from cryptography is certainly not helpful.
>
> Salt just ensures that you can differentiate between functionally identical
> values.  I.e., in a CRC, it differentiates the "0x0000" that CRC-1
> generates from the "0x0000" that CRC-2 generates.

Can we agree that this is called an "initial value", not "salt" ?

>
> You don't see the parallel to ensuring that *my* use of "Passw0rd" is
> encoded in a different manner than *your* use of "Passw0rd"?

No. They are different things.

An important difference is that adding "salt" to a password hash is an
important security feature. Picking a different initial value for a CRC
instead of having appropriate protocol versioning in the data (or a
surrounding envelope) is a misfeature.

The second difference is the purpose of the hashing. The CRC here is
for data integrity - spotting mistakes in the data during transfer or
storage. The hash in a password is for security, avoiding the password
ever being transmitted or stored in plain text.

Any coincidence in the way these might be implemented is just that -
coincidence.
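
The difference in purpose shows up in code as well. In this sketch, which uses Python's `hashlib.pbkdf2_hmac` with fixed example salts and an arbitrary iteration count (a real system would generate a random salt per user), the same password yields unrelated stored hashes, whereas a CRC of the same bytes is identical everywhere, which is exactly what an integrity check needs.

```python
import hashlib
import zlib

ITERATIONS = 100_000  # arbitrary value chosen for this example


def password_hash(password: str, salt: bytes) -> bytes:
    """Salted, deliberately slow hash intended for storing passwords."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)


salt_mine = bytes.fromhex("00112233445566778899aabbccddeeff")
salt_yours = bytes.fromhex("ffeeddccbbaa99887766554433221100")

hash_mine = password_hash("Passw0rd", salt_mine)
hash_yours = password_hash("Passw0rd", salt_yours)

# An integrity check has no per-user secret: same bytes, same CRC.
crc_mine = zlib.crc32(b"Passw0rd")
crc_yours = zlib.crc32(b"Passw0rd")
```

The salt is stored next to the hash so the check can be repeated later; its job is to stop identical passwords from producing identical records, not to detect transmission errors.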

>
>>> See the RMI description.
>>
>> I'm sorry, I have no idea what "RMI" is or where it is described.
>> You've mentioned that abbreviation twice, but I can't figure it out.
>
> <https://en.wikipedia.org/wiki/RMI>
> <https://en.wikipedia.org/wiki/OCL>
>
> Nothing magical with either term.

I looked up RMI on Wikipedia before asking, and saw nothing of relevance
to CRC's or checksums. I noticed no mention of "OCL" in your posts, and
looking it up on Wikipedia gives no clues.

So for now, I'll assume you don't want anyone to know what you meant and
I can safely ignore anything you write in connection with the terms.

>
>>> OTOH, "salting" the calculation so that it is expected to yield
>>> a value of 0x13 means *those* situations will be flagged as errors
>>> (and a different set of situations will sneak by, undetected).
>>
>> And that gives you exactly /zero/ benefit.
>
> See above.

I did. Zero benefit.

Actually, it is worse than useless - it makes it harder to identify the
protocol, and reduces the information content of the CRC check.

>
>> You run your hash algorithm, and check for the single value that
>> indicates no errors.  It does not matter if that number is 0, 0x13, or
>> - often more
> -----------^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> As you've admitted, it doesn't matter.  So, why wouldn't I opt to have
> an algorithm for THIS interface give me a result that is EXPECTED
> for this protocol?  What value picking "0"?
>

A /single/ result does not matter (other than needlessly complicating
things). Having multiple different valid results /does/ matter.
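
As an illustration of checking for a single expected value, here is a
sketch of a table-driven CRC-32 in C with its 1 KB table. It assumes the
common CRC-32 parameters (reflected polynomial 0xEDB88320, initial value and
final XOR both 0xFFFFFFFF) and a frame that carries the CRC after the
payload, least significant byte first. The names crc32_init_table,
crc32_calc and frame_ok are invented for this example.

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t crc_table[256];   /* 256 entries x 4 bytes = 1 KB */

/* Build the lookup table for the reflected polynomial 0xEDB88320. */
static void crc32_init_table(void)
{
    for (uint32_t i = 0; i < 256; i++) {
        uint32_t c = i;
        for (int k = 0; k < 8; k++)
            c = (c & 1u) ? (c >> 1) ^ 0xEDB88320u : (c >> 1);
        crc_table[i] = c;
    }
}

/* Standard CRC-32: initial value 0xFFFFFFFF, final XOR 0xFFFFFFFF.
 * crc32_init_table() must have been called first. */
static uint32_t crc32_calc(const uint8_t *p, size_t n)
{
    uint32_t c = 0xFFFFFFFFu;
    while (n--)
        c = crc_table[(c ^ *p++) & 0xFFu] ^ (c >> 8);
    return c ^ 0xFFFFFFFFu;
}

/* Running this CRC over payload + appended CRC (LSB first) gives one
 * fixed value for every error-free frame, whatever the payload is. */
#define CRC32_RESIDUE 0x2144DF1Cu

static int frame_ok(const uint8_t *frame, size_t n)
{
    return n >= 4 && crc32_calc(frame, n) == CRC32_RESIDUE;
}
```

The receiver compares against exactly one constant. For this parameter set
that constant happens to be 0x2144DF1C rather than 0; which single number it
is makes no difference to the error detection, as long as there is only one.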

>>>> That is why you need to distinguish between the two possibilities.
>>>> If you don't have to worry about malicious attacks, a 32-bit CRC
>>>> takes a dozen lines of C code and a 1 KB table, all running
>>>> extremely efficiently.  If security is an issue, you need digital
>>>> signatures - an RSA-based signature system is orders of magnitude
>>>> more effort in both development time and in run time.
>>>
>>> It's considerably more expensive AND not fool-proof -- esp if the
>>> attacker knows you are signing binaries.  "OK, now I need to find
>>> WHERE the signature is verified and just patch that "CALL" out
>>> of the code".
>>
>> I'm not sure if that is a straw-man argument, or just showing your
>> ignorance of the topic.  Do you really think security checks are done
>> by the program you are trying to send securely?  That would be like
>> trying to have building security where people entering the building
>> look at their own security cards.
>
> Do YOU really think we all design applications that run in PCs where some
> CLOSED OS performs these tests in a manner that can't be subverted?

Do you bother to read my posts at all? Or do you prefer to make up
things that you imagine I write, so that you can make nonsensical
attacks on them? Certainly there is no sane reading of my posts
(written and sent from an /open/ OS) where "do not rely on security by
obscurity" could be taken to mean "rely on obscured and closed platforms".

> *WE* (tend to) write ALL the code in the products developed, here.
> So, whether it's the POST WE wrote that is performing the test or
> the loader WE wrote, it's still *our* program.
>
> Yes, we ARE looking at our own security cards!
>
> Manufacturers *try* to hide ("obscurity") details of these mechanisms
> in an attempt to improve effective security.  But, there's nothing
> that makes these guarantees.

Why are you trying to "persuade" me that manufacturer obscurity is a bad
thing? You have been promoting obscurity of algorithms as though it
were helpful for security - I have made clear that it is not. Are you
getting your own position mixed up with mine?

>
> Give me the sources for Windows (Linux, *BSD, etc.) and I can
> subvert all the state-of-the-art digital signing used to ensure
> binaries aren't altered.  Nothing *outside* the box is involved
> so, by definition, everything I need has to reside *in* the box.

