Rocksolid Light

Welcome to novaBBS (click a section below)

mail  files  register  newsreader  groups  login

Message-ID:  

The nation that controls magnetism controls the universe. -- Chester Gould/Dick Tracy


devel / comp.lang.c / simple compression mathod to code itself?

SubjectAuthor
* simple compression mathod to code itself?fir
`* Re: simple compression mathod to code itself?David LaRue
 +* Re: simple compression mathod to code itself?Paul
 |`- Re: simple compression mathod to code itself?fir
 `* Re: simple compression mathod to code itself?fir
  +* Re: simple compression mathod to code itself?fir
  |+* Re: simple compression mathod to code itself?bart
  ||`- Re: simple compression mathod to code itself?fir
  |`* Re: simple compression mathod to code itself?David LaRue
  | `* Re: simple compression mathod to code itself?Scott Lurndal
  |  `- Re: simple compression mathod to code itself?BGB
  `* Re: simple compression mathod to code itself?David LaRue
   `- Re: simple compression mathod to code itself?fir

1
simple compression mathod to code itself?

<uupflr$5t1n$1@i2pn2.org>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=34941&group=comp.lang.c#34941

  copy link   Newsgroups: comp.lang.c
Path: i2pn2.org!.POSTED!not-for-mail
From: fir...@grunge.pl (fir)
Newsgroups: comp.lang.c
Subject: simple compression mathod to code itself?
Date: Fri, 05 Apr 2024 20:24:51 +0200
Organization: i2pn2 (i2pn.org)
Message-ID: <uupflr$5t1n$1@i2pn2.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Fri, 5 Apr 2024 18:25:01 -0000 (UTC)
Injection-Info: i2pn2.org;
logging-data="193591"; mail-complaints-to="usenet@i2pn2.org";
posting-account="+ydHcGjgSeBt3Wz3WTfKefUptpAWaXduqfw5xdfsuS0";
User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:27.0) Gecko/20100101 Firefox/27.0 SeaMonkey/2.24
X-Spam-Checker-Version: SpamAssassin 4.0.0
 by: fir - Fri, 5 Apr 2024 18:24 UTC

i not code at all in recent years
(recently i coded half of my compuler with a plan to write second half
but its to ambitious to lazy coding mood i got recently)
but recently i started this slamm coding modd and found
ite pleasurable

searching for lazy side things i could code in such mood
i thought maybe i wopuld like to have compression routine
but i would like to write it myself sortla like i use quicksort
or so but i also want to write it myself

so is there maybe some method i could use i mean some simple
routine that compresses and uncompresses an array of bytes but
maybe something a bit more complex than RLE (rune length encoding)
- only thing i know from this domain

Re: simple compression mathod to code itself?

<XnsB14BC6028E8BAhueydlltampabayrrcom@135.181.20.170>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=34942&group=comp.lang.c#34942

  copy link   Newsgroups: comp.lang.c
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: huey....@tampabay.rr.com (David LaRue)
Newsgroups: comp.lang.c
Subject: Re: simple compression mathod to code itself?
Date: Fri, 5 Apr 2024 23:28:12 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 46
Message-ID: <XnsB14BC6028E8BAhueydlltampabayrrcom@135.181.20.170>
References: <uupflr$5t1n$1@i2pn2.org>
Injection-Date: Fri, 05 Apr 2024 23:28:13 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="1c2b76cfe394269a5091a3e37ee7224f";
logging-data="1757956"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX185h/QH9X6ICzOdmatxLy3S"
User-Agent: Xnews/2006.08.24
Cancel-Lock: sha1:BajGN+Ho0FJ2vPHcDTybd0OrRsk=
 by: David LaRue - Fri, 5 Apr 2024 23:28 UTC

fir <fir@grunge.pl> wrote in news:uupflr$5t1n$1@i2pn2.org:

> i not code at all in recent years
> (recently i coded half of my compuler with a plan to write second half
> but its to ambitious to lazy coding mood i got recently)
> but recently i started this slamm coding modd and found
> ite pleasurable
>
> searching for lazy side things i could code in such mood
> i thought maybe i wopuld like to have compression routine
> but i would like to write it myself sortla like i use quicksort
> or so but i also want to write it myself
>
> so is there maybe some method i could use i mean some simple
> routine that compresses and uncompresses an array of bytes but
> maybe something a bit more complex than RLE (rune length encoding)
> - only thing i know from this domain
>

Hello fir,

A method a bit more complex that you might try builds a table of all bytes
as you scan them from the input. The compressed output is a reference to
the table you just built (wnhem repeated byte strings are found, otherwise
feed the input to the compressed image so that the expansion method can
build the sme dynamic table the encoder built. The table generally has a
limit on the number of entries (usually a good size) and allows the table
of bytes to dynamically change as new patterns are read from the input.

This is a well known and documented compression/expansion algorithm. PKZIP
and other engines use this as one of their compression methodz. Look for
the description of that if you need more details to figure out what you
need to write.

Expansion is the reverse. Read the source (now the compressed image) and
build the compression table from the bytes. As encoded references to the
compression table are read from the compressed image output the source byte
sequences. The output should be the same as what your encoder originally
read.

A good check on the final code is to compare the original input with the
eventual output and make sure they agree exactly.

Have fun,

David

Re: simple compression mathod to code itself?

<uuq7sa$1mutb$1@dont-email.me>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=34943&group=comp.lang.c#34943

  copy link   Newsgroups: comp.lang.c
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: nos...@needed.invalid (Paul)
Newsgroups: comp.lang.c
Subject: Re: simple compression mathod to code itself?
Date: Fri, 5 Apr 2024 21:18:01 -0400
Organization: A noiseless patient Spider
Lines: 57
Message-ID: <uuq7sa$1mutb$1@dont-email.me>
References: <uupflr$5t1n$1@i2pn2.org>
<XnsB14BC6028E8BAhueydlltampabayrrcom@135.181.20.170>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Injection-Date: Sat, 06 Apr 2024 01:18:03 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="a05746dbe29120f23f6f980acb67ef40";
logging-data="1801131"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+qEpjUbFzTUo1ygj2Adr5uqnqHkBdeGDE="
User-Agent: Ratcatcher/2.0.0.25 (Windows/20130802)
Cancel-Lock: sha1:DXyktwObUGF9e4KmNtbx52ggiIc=
Content-Language: en-US
In-Reply-To: <XnsB14BC6028E8BAhueydlltampabayrrcom@135.181.20.170>
 by: Paul - Sat, 6 Apr 2024 01:18 UTC

On 4/5/2024 7:28 PM, David LaRue wrote:
> fir <fir@grunge.pl> wrote in news:uupflr$5t1n$1@i2pn2.org:
>
>> i not code at all in recent years
>> (recently i coded half of my compuler with a plan to write second half
>> but its to ambitious to lazy coding mood i got recently)
>> but recently i started this slamm coding modd and found
>> ite pleasurable
>>
>> searching for lazy side things i could code in such mood
>> i thought maybe i wopuld like to have compression routine
>> but i would like to write it myself sortla like i use quicksort
>> or so but i also want to write it myself
>>
>> so is there maybe some method i could use i mean some simple
>> routine that compresses and uncompresses an array of bytes but
>> maybe something a bit more complex than RLE (rune length encoding)
>> - only thing i know from this domain
>>
>
> Hello fir,
>
> A method a bit more complex that you might try builds a table of all bytes
> as you scan them from the input. The compressed output is a reference to
> the table you just built (wnhem repeated byte strings are found, otherwise
> feed the input to the compressed image so that the expansion method can
> build the sme dynamic table the encoder built. The table generally has a
> limit on the number of entries (usually a good size) and allows the table
> of bytes to dynamically change as new patterns are read from the input.
>
> This is a well known and documented compression/expansion algorithm. PKZIP
> and other engines use this as one of their compression methodz. Look for
> the description of that if you need more details to figure out what you
> need to write.
>
> Expansion is the reverse. Read the source (now the compressed image) and
> build the compression table from the bytes. As encoded references to the
> compression table are read from the compressed image output the source byte
> sequences. The output should be the same as what your encoder originally
> read.
>
> A good check on the final code is to compare the original input with the
> eventual output and make sure they agree exactly.
>
> Have fun,
>
> David
>

Some people have written compression codes, purely for educational purposes.
That's why I got a copy of this, some years ago. For fun.

https://github.com/grtamayo/RLE

gtrle35.c # Run Length Encoding, one of the simpler compressors

Paul

Re: simple compression mathod to code itself?

<uurcqh$8fsp$1@i2pn2.org>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=34944&group=comp.lang.c#34944

  copy link   Newsgroups: comp.lang.c
Path: i2pn2.org!.POSTED!not-for-mail
From: fir...@grunge.pl (fir)
Newsgroups: comp.lang.c
Subject: Re: simple compression mathod to code itself?
Date: Sat, 06 Apr 2024 13:48:38 +0200
Organization: i2pn2 (i2pn.org)
Message-ID: <uurcqh$8fsp$1@i2pn2.org>
References: <uupflr$5t1n$1@i2pn2.org> <XnsB14BC6028E8BAhueydlltampabayrrcom@135.181.20.170> <uuq7sa$1mutb$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sat, 6 Apr 2024 11:48:33 -0000 (UTC)
Injection-Info: i2pn2.org;
logging-data="278425"; mail-complaints-to="usenet@i2pn2.org";
posting-account="+ydHcGjgSeBt3Wz3WTfKefUptpAWaXduqfw5xdfsuS0";
User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:27.0) Gecko/20100101 Firefox/27.0 SeaMonkey/2.24
X-Spam-Checker-Version: SpamAssassin 4.0.0
In-Reply-To: <uuq7sa$1mutb$1@dont-email.me>
 by: fir - Sat, 6 Apr 2024 11:48 UTC

Paul wrote:
> On 4/5/2024 7:28 PM, David LaRue wrote:
>> fir <fir@grunge.pl> wrote in news:uupflr$5t1n$1@i2pn2.org:
>>
>>> i not code at all in recent years
>>> (recently i coded half of my compuler with a plan to write second half
>>> but its to ambitious to lazy coding mood i got recently)
>>> but recently i started this slamm coding modd and found
>>> ite pleasurable
>>>
>>> searching for lazy side things i could code in such mood
>>> i thought maybe i wopuld like to have compression routine
>>> but i would like to write it myself sortla like i use quicksort
>>> or so but i also want to write it myself
>>>
>>> so is there maybe some method i could use i mean some simple
>>> routine that compresses and uncompresses an array of bytes but
>>> maybe something a bit more complex than RLE (rune length encoding)
>>> - only thing i know from this domain
>>>
>>
>> Hello fir,
>>
>> A method a bit more complex that you might try builds a table of all bytes
>> as you scan them from the input. The compressed output is a reference to
>> the table you just built (wnhem repeated byte strings are found, otherwise
>> feed the input to the compressed image so that the expansion method can
>> build the sme dynamic table the encoder built. The table generally has a
>> limit on the number of entries (usually a good size) and allows the table
>> of bytes to dynamically change as new patterns are read from the input.
>>
>> This is a well known and documented compression/expansion algorithm. PKZIP
>> and other engines use this as one of their compression methodz. Look for
>> the description of that if you need more details to figure out what you
>> need to write.
>>
>> Expansion is the reverse. Read the source (now the compressed image) and
>> build the compression table from the bytes. As encoded references to the
>> compression table are read from the compressed image output the source byte
>> sequences. The output should be the same as what your encoder originally
>> read.
>>
>> A good check on the final code is to compare the original input with the
>> eventual output and make sure they agree exactly.
>>
>> Have fun,
>>
>> David
>>
>
> Some people have written compression codes, purely for educational purposes.
> That's why I got a copy of this, some years ago. For fun.
>
> https://github.com/grtamayo/RLE
>
> gtrle35.c # Run Length Encoding, one of the simpler compressors
>
> Paul
>
rle i think i could write by hand but i would like something a bit more
eleborate than this - not much but somewhat

Re: simple compression mathod to code itself?

<uurd34$8ga0$1@i2pn2.org>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=34945&group=comp.lang.c#34945

  copy link   Newsgroups: comp.lang.c
Path: i2pn2.org!.POSTED!not-for-mail
From: fir...@grunge.pl (fir)
Newsgroups: comp.lang.c
Subject: Re: simple compression mathod to code itself?
Date: Sat, 06 Apr 2024 13:53:14 +0200
Organization: i2pn2 (i2pn.org)
Message-ID: <uurd34$8ga0$1@i2pn2.org>
References: <uupflr$5t1n$1@i2pn2.org> <XnsB14BC6028E8BAhueydlltampabayrrcom@135.181.20.170>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sat, 6 Apr 2024 11:53:08 -0000 (UTC)
Injection-Info: i2pn2.org;
logging-data="278848"; mail-complaints-to="usenet@i2pn2.org";
posting-account="+ydHcGjgSeBt3Wz3WTfKefUptpAWaXduqfw5xdfsuS0";
User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:27.0) Gecko/20100101 Firefox/27.0 SeaMonkey/2.24
In-Reply-To: <XnsB14BC6028E8BAhueydlltampabayrrcom@135.181.20.170>
X-Spam-Checker-Version: SpamAssassin 4.0.0
 by: fir - Sat, 6 Apr 2024 11:53 UTC

David LaRue wrote:
> fir <fir@grunge.pl> wrote in news:uupflr$5t1n$1@i2pn2.org:
>
>> i not code at all in recent years
>> (recently i coded half of my compuler with a plan to write second half
>> but its to ambitious to lazy coding mood i got recently)
>> but recently i started this slamm coding modd and found
>> ite pleasurable
>>
>> searching for lazy side things i could code in such mood
>> i thought maybe i wopuld like to have compression routine
>> but i would like to write it myself sortla like i use quicksort
>> or so but i also want to write it myself
>>
>> so is there maybe some method i could use i mean some simple
>> routine that compresses and uncompresses an array of bytes but
>> maybe something a bit more complex than RLE (rune length encoding)
>> - only thing i know from this domain
>>
>
> Hello fir,
>
> A method a bit more complex that you might try builds a table of all bytes
> as you scan them from the input. The compressed output is a reference to
> the table you just built (wnhem repeated byte strings are found, otherwise
> feed the input to the compressed image so that the expansion method can
> build the sme dynamic table the encoder built. The table generally has a
> limit on the number of entries (usually a good size) and allows the table
> of bytes to dynamically change as new patterns are read from the input.
>
> This is a well known and documented compression/expansion algorithm. PKZIP
> and other engines use this as one of their compression methodz. Look for
> the description of that if you need more details to figure out what you
> need to write.
>
> Expansion is the reverse. Read the source (now the compressed image) and
> build the compression table from the bytes. As encoded references to the
> compression table are read from the compressed image output the source byte
> sequences. The output should be the same as what your encoder originally
> read.
>
> A good check on the final code is to compare the original input with the
> eventual output and make sure they agree exactly.
>
> Have fun,
>
> David
>
this could be good but i dont quite understood that .. but eventually
could be good...

i thinged something abut that if rle search for repetitions of 1
byte then maybe after that search for repetitions of 2 bytes, then 3
bytes, 4 bytes and so on.. then do some "report" how many found and then
find a way to encode that

need to think a bit becouse if rle only stores repetitions that are
one after another then this method should store repetitions that have
various distances among them

i also do nto want to spend a much time on this 1-2 days eventually

Re: simple compression mathod to code itself?

<uurefe$8i04$1@i2pn2.org>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=34946&group=comp.lang.c#34946

  copy link   Newsgroups: comp.lang.c
Path: i2pn2.org!.POSTED!not-for-mail
From: fir...@grunge.pl (fir)
Newsgroups: comp.lang.c
Subject: Re: simple compression mathod to code itself?
Date: Sat, 06 Apr 2024 14:16:51 +0200
Organization: i2pn2 (i2pn.org)
Message-ID: <uurefe$8i04$1@i2pn2.org>
References: <uupflr$5t1n$1@i2pn2.org> <XnsB14BC6028E8BAhueydlltampabayrrcom@135.181.20.170> <uurd34$8ga0$1@i2pn2.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sat, 6 Apr 2024 12:16:46 -0000 (UTC)
Injection-Info: i2pn2.org;
logging-data="280580"; mail-complaints-to="usenet@i2pn2.org";
posting-account="+ydHcGjgSeBt3Wz3WTfKefUptpAWaXduqfw5xdfsuS0";
User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:27.0) Gecko/20100101 Firefox/27.0 SeaMonkey/2.24
X-Spam-Checker-Version: SpamAssassin 4.0.0
In-Reply-To: <uurd34$8ga0$1@i2pn2.org>
 by: fir - Sat, 6 Apr 2024 12:16 UTC

fir wrote:
> David LaRue wrote:
>> fir <fir@grunge.pl> wrote in news:uupflr$5t1n$1@i2pn2.org:
>>
>>> i not code at all in recent years
>>> (recently i coded half of my compuler with a plan to write second half
>>> but its to ambitious to lazy coding mood i got recently)
>>> but recently i started this slamm coding modd and found
>>> ite pleasurable
>>>
>>> searching for lazy side things i could code in such mood
>>> i thought maybe i wopuld like to have compression routine
>>> but i would like to write it myself sortla like i use quicksort
>>> or so but i also want to write it myself
>>>
>>> so is there maybe some method i could use i mean some simple
>>> routine that compresses and uncompresses an array of bytes but
>>> maybe something a bit more complex than RLE (rune length encoding)
>>> - only thing i know from this domain
>>>
>>
>> Hello fir,
>>
>> A method a bit more complex that you might try builds a table of all
>> bytes
>> as you scan them from the input. The compressed output is a reference to
>> the table you just built (wnhem repeated byte strings are found,
>> otherwise
>> feed the input to the compressed image so that the expansion method can
>> build the sme dynamic table the encoder built. The table generally has a
>> limit on the number of entries (usually a good size) and allows the table
>> of bytes to dynamically change as new patterns are read from the input.
>>
>> This is a well known and documented compression/expansion algorithm.
>> PKZIP
>> and other engines use this as one of their compression methodz. Look for
>> the description of that if you need more details to figure out what you
>> need to write.
>>
>> Expansion is the reverse. Read the source (now the compressed image) and
>> build the compression table from the bytes. As encoded references to the
>> compression table are read from the compressed image output the source
>> byte
>> sequences. The output should be the same as what your encoder originally
>> read.
>>
>> A good check on the final code is to compare the original input with the
>> eventual output and make sure they agree exactly.
>>
>> Have fun,
>>
>> David
>>
> this could be good but i dont quite understood that .. but eventually
> could be good...
>
> i thinged something abut that if rle search for repetitions of 1
> byte then maybe after that search for repetitions of 2 bytes, then 3
> bytes, 4 bytes and so on.. then do some "report" how many found and then
> find a way to encode that
>
> need to think a bit becouse if rle only stores repetitions that are
> one after another then this method should store repetitions that have
> various distances among them
>
>
> i also do nto want to spend a much time on this 1-2 days eventually

though maybe i should start from RLE indeed..in fact ihis shouldnt be so
bad as for some of my eventuall funny needs - also it maybe seem ok to
be first step until something more elaborate

if someone want to talk on compression adn how to code it i could like
to read it (as reading net articles on this may be seem to hard for my
old sick head and posts are much are easier to get into it)

Re: simple compression mathod to code itself?

<uuril0$23mkc$1@dont-email.me>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=34947&group=comp.lang.c#34947

  copy link   Newsgroups: comp.lang.c
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: bc...@freeuk.com (bart)
Newsgroups: comp.lang.c
Subject: Re: simple compression mathod to code itself?
Date: Sat, 6 Apr 2024 14:28:01 +0100
Organization: A noiseless patient Spider
Lines: 98
Message-ID: <uuril0$23mkc$1@dont-email.me>
References: <uupflr$5t1n$1@i2pn2.org>
<XnsB14BC6028E8BAhueydlltampabayrrcom@135.181.20.170>
<uurd34$8ga0$1@i2pn2.org> <uurefe$8i04$1@i2pn2.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Sat, 06 Apr 2024 13:28:00 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="720c88f66c1ad2d04dad0254b37d503e";
logging-data="2218636"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/A6zrvwdE6XIA815bIyYR6"
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:AVLf5e7HAAxyYB39qBlDf4gtLcE=
Content-Language: en-GB
In-Reply-To: <uurefe$8i04$1@i2pn2.org>
 by: bart - Sat, 6 Apr 2024 13:28 UTC

On 06/04/2024 13:16, fir wrote:
> fir wrote:
>> David LaRue wrote:
>>> fir <fir@grunge.pl> wrote in news:uupflr$5t1n$1@i2pn2.org:
>>>
>>>> i not code at all in recent years
>>>> (recently i coded half of my compuler with a plan to write second half
>>>> but its to ambitious to lazy coding mood i got recently)
>>>> but recently i started this slamm coding modd and found
>>>> ite pleasurable
>>>>
>>>> searching for lazy side things i could code in such mood
>>>> i thought maybe i wopuld like to have compression routine
>>>> but i would like to write it myself sortla like i use quicksort
>>>> or so but i also want to write it myself
>>>>
>>>> so is there maybe some method i could use i mean some simple
>>>> routine that compresses and uncompresses an array of bytes but
>>>> maybe something a bit more complex than RLE (rune length encoding)
>>>> - only thing i know from this domain
>>>>
>>>
>>> Hello fir,
>>>
>>> A method a bit more complex that you might try builds a table of all
>>> bytes
>>> as you scan them from the input.  The compressed output is a
>>> reference to
>>> the table you just built (wnhem repeated byte strings are found,
>>> otherwise
>>> feed the input to the compressed image so that the expansion method can
>>> build the sme dynamic table the encoder built.  The table generally
>>> has a
>>> limit on the number of entries (usually a good size) and allows the
>>> table
>>> of bytes to dynamically change as new patterns are read from the input.
>>>
>>> This is a well known and documented compression/expansion algorithm.
>>> PKZIP
>>> and other engines use this as one of their compression methodz.  Look
>>> for
>>> the description of that if you need more details to figure out what you
>>> need to write.
>>>
>>> Expansion is the reverse.  Read the source (now the compressed image)
>>> and
>>> build the compression table from the bytes.  As encoded references to
>>> the
>>> compression table are read from the compressed image output the source
>>> byte
>>> sequences.  The output should be the same as what your encoder
>>> originally
>>> read.
>>>
>>> A good check on the final code is to compare the original input with the
>>> eventual output and make sure they agree exactly.
>>>
>>> Have fun,
>>>
>>> David
>>>
>> this could be good but i dont quite understood that .. but eventually
>> could be good...
>>
>> i thinged something abut that if rle search for repetitions of 1
>> byte then maybe after that search for repetitions of 2 bytes, then 3
>> bytes, 4 bytes and so on.. then do some "report" how many found and then
>> find a way to encode that
>>
>>   need to think a bit becouse if rle only stores repetitions that are
>> one after another then this method should store  repetitions that have
>> various distances among them
>>
>>
>> i also do nto want to spend a much time on this 1-2 days eventually
>
>
> though maybe i should start from RLE indeed..in fact ihis shouldnt be so
> bad as for some of my eventuall funny needs - also it maybe seem ok to
> be first step until something more elaborate

What sort of data are you compressing?

If it is computer generated imagery with no noise or artefacts, then RLE
will probably work well.

If it is a noisy image captured from a camera then it'll be rubbish.

Stuff like text files will likely be mildly compressed, but probably not
enough to be worth the trouble.

Decent compression is hard; you're not going to come up with anything in
1-2 days that will give worthwhile results across a range of inputs.

> if someone want to talk on compression adn how to code it i could like
> to read it (as reading net articles on this may be seem to hard for my
> old sick head and posts are much are easier to get into it)

Re: simple compression mathod to code itself?

<uurkku$8pn1$1@i2pn2.org>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=34948&group=comp.lang.c#34948

  copy link   Newsgroups: comp.lang.c
Path: i2pn2.org!.POSTED!not-for-mail
From: fir...@grunge.pl (fir)
Newsgroups: comp.lang.c
Subject: Re: simple compression mathod to code itself?
Date: Sat, 06 Apr 2024 16:02:12 +0200
Organization: i2pn2 (i2pn.org)
Message-ID: <uurkku$8pn1$1@i2pn2.org>
References: <uupflr$5t1n$1@i2pn2.org> <XnsB14BC6028E8BAhueydlltampabayrrcom@135.181.20.170> <uurd34$8ga0$1@i2pn2.org> <uurefe$8i04$1@i2pn2.org> <uuril0$23mkc$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sat, 6 Apr 2024 14:02:06 -0000 (UTC)
Injection-Info: i2pn2.org;
logging-data="288481"; mail-complaints-to="usenet@i2pn2.org";
posting-account="+ydHcGjgSeBt3Wz3WTfKefUptpAWaXduqfw5xdfsuS0";
User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:27.0) Gecko/20100101 Firefox/27.0 SeaMonkey/2.24
In-Reply-To: <uuril0$23mkc$1@dont-email.me>
X-Spam-Checker-Version: SpamAssassin 4.0.0
 by: fir - Sat, 6 Apr 2024 14:02 UTC

bart wrote:
> On 06/04/2024 13:16, fir wrote:
>> fir wrote:
>>> David LaRue wrote:
>>>> fir <fir@grunge.pl> wrote in news:uupflr$5t1n$1@i2pn2.org:
>>>>
>>>>> i not code at all in recent years
>>>>> (recently i coded half of my compuler with a plan to write second half
>>>>> but its to ambitious to lazy coding mood i got recently)
>>>>> but recently i started this slamm coding modd and found
>>>>> ite pleasurable
>>>>>
>>>>> searching for lazy side things i could code in such mood
>>>>> i thought maybe i wopuld like to have compression routine
>>>>> but i would like to write it myself sortla like i use quicksort
>>>>> or so but i also want to write it myself
>>>>>
>>>>> so is there maybe some method i could use i mean some simple
>>>>> routine that compresses and uncompresses an array of bytes but
>>>>> maybe something a bit more complex than RLE (rune length encoding)
>>>>> - only thing i know from this domain
>>>>>
>>>>
>>>> Hello fir,
>>>>
>>>> A method a bit more complex that you might try builds a table of all
>>>> bytes
>>>> as you scan them from the input. The compressed output is a
>>>> reference to
>>>> the table you just built (wnhem repeated byte strings are found,
>>>> otherwise
>>>> feed the input to the compressed image so that the expansion method can
>>>> build the sme dynamic table the encoder built. The table generally
>>>> has a
>>>> limit on the number of entries (usually a good size) and allows the
>>>> table
>>>> of bytes to dynamically change as new patterns are read from the input.
>>>>
>>>> This is a well known and documented compression/expansion algorithm.
>>>> PKZIP
>>>> and other engines use this as one of their compression methodz.
>>>> Look for
>>>> the description of that if you need more details to figure out what you
>>>> need to write.
>>>>
>>>> Expansion is the reverse. Read the source (now the compressed
>>>> image) and
>>>> build the compression table from the bytes. As encoded references
>>>> to the
>>>> compression table are read from the compressed image output the source
>>>> byte
>>>> sequences. The output should be the same as what your encoder
>>>> originally
>>>> read.
>>>>
>>>> A good check on the final code is to compare the original input with
>>>> the
>>>> eventual output and make sure they agree exactly.
>>>>
>>>> Have fun,
>>>>
>>>> David
>>>>
>>> this could be good but i dont quite understood that .. but eventually
>>> could be good...
>>>
>>> i thinged something abut that if rle search for repetitions of 1
>>> byte then maybe after that search for repetitions of 2 bytes, then 3
>>> bytes, 4 bytes and so on.. then do some "report" how many found and then
>>> find a way to encode that
>>>
>>> need to think a bit becouse if rle only stores repetitions that are
>>> one after another then this method should store repetitions that have
>>> various distances among them
>>>
>>>
>>> i also do nto want to spend a much time on this 1-2 days eventually
>>
>>
>> though maybe i should start from RLE indeed..in fact ihis shouldnt be
>> so bad as for some of my eventuall funny needs - also it maybe seem ok
>> to be first step until something more elaborate
>
> What sort of data are you compressing?
>
> If it is computer generated imagery with no noise or artefacts, then RLE
> will probably work well.
>
> If it is a noisy image captured from a camera then it'll be rubbish.
>
> Stuff like text files will likely be mildly compressed, but probably not
> enough to be worth the trouble.
>
> Decent compression is hard; you're not going to come up with anything in
> 1-2 days that will give worthwhile results across a range of inputs.
>
>> if someone want to talk on compression adn how to code it i could like
>> to read it (as reading net articles on this may be seem to hard for my
>> old sick head and posts are much are easier to get into it)
>
at the initial idea i want to add this as a method to my "bytes"
microcontainer (where you could put or load anything)

i just tried to think what usebale metods i could add and some
pack/unpack method caould be handy

though even i dont know if to do it todaye/tomorrow or maybe more in
future..discussing something abut it would be interesting imo

Re: simple compression mathod to code itself?

<XnsB14C767BEE016hueydlltampabayrrcom@135.181.20.170>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=34952&group=comp.lang.c#34952

  copy link   Newsgroups: comp.lang.c
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: huey....@tampabay.rr.com (David LaRue)
Newsgroups: comp.lang.c
Subject: Re: simple compression mathod to code itself?
Date: Sat, 6 Apr 2024 15:39:11 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 86
Message-ID: <XnsB14C767BEE016hueydlltampabayrrcom@135.181.20.170>
References: <uupflr$5t1n$1@i2pn2.org> <XnsB14BC6028E8BAhueydlltampabayrrcom@135.181.20.170> <uurd34$8ga0$1@i2pn2.org>
Injection-Date: Sat, 06 Apr 2024 15:39:11 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="1c2b76cfe394269a5091a3e37ee7224f";
logging-data="2280863"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19kr7V7F8XIXf+mzQoKM44k"
User-Agent: Xnews/2006.08.24
Cancel-Lock: sha1:gQU3HCwVKYGxOS9hiYjpOtU940s=
 by: David LaRue - Sat, 6 Apr 2024 15:39 UTC

fir <fir@grunge.pl> wrote in news:uurd34$8ga0$1@i2pn2.org:

> David LaRue wrote:
>> fir <fir@grunge.pl> wrote in news:uupflr$5t1n$1@i2pn2.org:
>>
>>> i not code at all in recent years
>>> (recently i coded half of my compuler with a plan to write second half
>>> but its to ambitious to lazy coding mood i got recently)
>>> but recently i started this slamm coding modd and found
>>> ite pleasurable
>>>
>>> searching for lazy side things i could code in such mood
>>> i thought maybe i wopuld like to have compression routine
>>> but i would like to write it myself sortla like i use quicksort
>>> or so but i also want to write it myself
>>>
>>> so is there maybe some method i could use i mean some simple
>>> routine that compresses and uncompresses an array of bytes but
>>> maybe something a bit more complex than RLE (rune length encoding)
>>> - only thing i know from this domain
>>>
>>
>> Hello fir,
>>
>> A method a bit more complex that you might try builds a table of all
>> bytes as you scan them from the input. The compressed output is a
>> reference to the table you just built (wnhem repeated byte strings are
>> found, otherwise feed the input to the compressed image so that the
>> expansion method can build the sme dynamic table the encoder built.
>> The table generally has a limit on the number of entries (usually a
>> good size) and allows the table of bytes to dynamically change as new
>> patterns are read from the input.
>>
>> This is a well known and documented compression/expansion algorithm.
>> PKZIP and other engines use this as one of their compression methodz.
>> Look for the description of that if you need more details to figure out
>> what you need to write.
>>
>> Expansion is the reverse. Read the source (now the compressed image)
>> and build the compression table from the bytes. As encoded references
>> to the compression table are read from the compressed image output the
>> source byte sequences. The output should be the same as what your
>> encoder originally read.
>>
>> A good check on the final code is to compare the original input with
>> the eventual output and make sure they agree exactly.
>>
>> Have fun,
>>
>> David
>>
> this could be good but i dont quite understood that .. but eventually
> could be good...
>
> i thinged something abut that if rle search for repetitions of 1
> byte then maybe after that search for repetitions of 2 bytes, then 3
> bytes, 4 bytes and so on.. then do some "report" how many found and then
> find a way to encode that
>
> need to think a bit becouse if rle only stores repetitions that are
> one after another then this method should store repetitions that have
> various distances among them
>
>
> i also do nto want to spend a much time on this 1-2 days eventually
>

Hello fir,

The method above can do the same thing for each repetition of a single
byte. Alone it has the advantage that compression and decompression are
single pass operations. It is similar to RLE but allows for more
variations to be used along the way. The encoder also needs some table
searching to determine best matches, new byte sequences, and improved
length sequences. An ordered table of byte strings is typically used.
I've also seen more advanced search methods that use branch tables to
represent the encoded sequences with the final entry being the value to put
in the compressed output. Very easy to search and gives some improvements
to the basic search design to make performance better. This can be done in
a few days or less, depending on how much time you spend on deciding the
table size and so on. I've also seen the search part as a coding
requirement for a one hour test to see how the coder breaks the idea down.

I like all the ideas you have.

David

Re: simple compression mathod to code itself?

<XnsB14C7BB22DDD2hueydlltampabayrrcom@135.181.20.170>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=34953&group=comp.lang.c#34953

  copy link   Newsgroups: comp.lang.c
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: huey....@tampabay.rr.com (David LaRue)
Newsgroups: comp.lang.c
Subject: Re: simple compression mathod to code itself?
Date: Sat, 6 Apr 2024 16:09:55 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 32
Message-ID: <XnsB14C7BB22DDD2hueydlltampabayrrcom@135.181.20.170>
References: <uupflr$5t1n$1@i2pn2.org> <XnsB14BC6028E8BAhueydlltampabayrrcom@135.181.20.170> <uurd34$8ga0$1@i2pn2.org> <uurefe$8i04$1@i2pn2.org>
Injection-Date: Sat, 06 Apr 2024 16:09:55 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="1c2b76cfe394269a5091a3e37ee7224f";
logging-data="2295475"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/Bl4YkabZa884XElrQzOqz"
User-Agent: Xnews/2006.08.24
Cancel-Lock: sha1:DVkRSUhKtXZCLJgppy1B1Ln8i44=
 by: David LaRue - Sat, 6 Apr 2024 16:09 UTC

fir <fir@grunge.pl> wrote in news:uurefe$8i04$1@i2pn2.org:

> fir wrote:
<snip>
>
> if someone want to talk on compression adn how to code it i could like
> to read it (as reading net articles on this may be seem to hard for my
> old sick head and posts are much are easier to get into it)

There are several good books on search and compression methods that provide
examples of complexity and discussions about the complexity and performance.
I have book on just algorithms that I bought years ago. One of the first
chapters discussed the absurdaty of an OS/App requiring seperate Account and
Password entries when only one is needed. The result is the same and takes
one less character to enter. I found that book in a Barnes and Noble years
ago and loved reading and trying to understand the suggestions they made. A
great place to look for such books is in a college book store; the one for
books that are used for the second year or higher students. I've not found
much online EXCEPT for the papers and examples published by Nicholas Wirth or
by Knuth. Knuth's publications are best if you don't mind reading for a
while and then deciding what is best to code/do. Wirth had books on
algorithms and for specific languages. Much easier to read for a beginner.

The Knuth discussions are well organized and usually available for free
online. He covered an enormous variety on topics in his many books/papers.
They have complete descriptions about why something was done and then
discusses how to improve them. Again this is deep material but well worth
the effort once you've mastered a language or two. C gives the ideas needed
to read and und1erstand his comments about angorithms and language design.
Very good stuff, IMHO.

David

Re: simple compression mathod to code itself?

<mXeQN.385487$q3F7.369853@fx45.iad>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=34954&group=comp.lang.c#34954

  copy link   Newsgroups: comp.lang.c
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!news.bawue.net!npeer.as286.net!npeer-ng0.as286.net!peer02.ams1!peer.ams1.xlned.com!news.xlned.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx45.iad.POSTED!not-for-mail
X-newsreader: xrn 9.03-beta-14-64bit
Sender: scott@dragon.sl.home (Scott Lurndal)
From: sco...@slp53.sl.home (Scott Lurndal)
Reply-To: slp53@pacbell.net
Subject: Re: simple compression mathod to code itself?
Newsgroups: comp.lang.c
References: <uupflr$5t1n$1@i2pn2.org> <XnsB14BC6028E8BAhueydlltampabayrrcom@135.181.20.170> <uurd34$8ga0$1@i2pn2.org> <uurefe$8i04$1@i2pn2.org> <XnsB14C7BB22DDD2hueydlltampabayrrcom@135.181.20.170>
Lines: 26
Message-ID: <mXeQN.385487$q3F7.369853@fx45.iad>
X-Complaints-To: abuse@usenetserver.com
NNTP-Posting-Date: Sat, 06 Apr 2024 16:41:54 UTC
Organization: UsenetServer - www.usenetserver.com
Date: Sat, 06 Apr 2024 16:41:54 GMT
X-Received-Bytes: 1635
 by: Scott Lurndal - Sat, 6 Apr 2024 16:41 UTC

David LaRue <huey.dll@tampabay.rr.com> writes:
>fir <fir@grunge.pl> wrote in news:uurefe$8i04$1@i2pn2.org:
>
>> fir wrote:
><snip>
>>
>> if someone want to talk on compression adn how to code it i could like
>> to read it (as reading net articles on this may be seem to hard for my
>> old sick head and posts are much are easier to get into it)
>
>There are several good books on search and compression methods that provide
>examples of complexity and discussions about the complexity and performance.

https://en.wikipedia.org/wiki/Lossless_compression

One of the earliest:

https://en.wikipedia.org/wiki/Huffman_coding

Former colleague wrote this one:

https://en.wikipedia.org/wiki/Gzip

This was once under patent to a former employer:

https://en.wikipedia.org/wiki/LZ77_and_LZ78

Re: simple compression mathod to code itself?

<uus9eh$9k9a$1@i2pn2.org>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=34957&group=comp.lang.c#34957

  copy link   Newsgroups: comp.lang.c
Path: i2pn2.org!.POSTED!not-for-mail
From: fir...@grunge.pl (fir)
Newsgroups: comp.lang.c
Subject: Re: simple compression mathod to code itself?
Date: Sat, 06 Apr 2024 21:57:13 +0200
Organization: i2pn2 (i2pn.org)
Message-ID: <uus9eh$9k9a$1@i2pn2.org>
References: <uupflr$5t1n$1@i2pn2.org> <XnsB14BC6028E8BAhueydlltampabayrrcom@135.181.20.170> <uurd34$8ga0$1@i2pn2.org> <XnsB14C767BEE016hueydlltampabayrrcom@135.181.20.170>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sat, 6 Apr 2024 19:57:06 -0000 (UTC)
Injection-Info: i2pn2.org;
logging-data="315690"; mail-complaints-to="usenet@i2pn2.org";
posting-account="+ydHcGjgSeBt3Wz3WTfKefUptpAWaXduqfw5xdfsuS0";
User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:27.0) Gecko/20100101 Firefox/27.0 SeaMonkey/2.24
In-Reply-To: <XnsB14C767BEE016hueydlltampabayrrcom@135.181.20.170>
X-Spam-Checker-Version: SpamAssassin 4.0.0
 by: fir - Sat, 6 Apr 2024 19:57 UTC

David LaRue wrote:
> fir <fir@grunge.pl> wrote in news:uurd34$8ga0$1@i2pn2.org:
>
>> David LaRue wrote:
>>> fir <fir@grunge.pl> wrote in news:uupflr$5t1n$1@i2pn2.org:
>>>
>>>> i not code at all in recent years
>>>> (recently i coded half of my compuler with a plan to write second half
>>>> but its to ambitious to lazy coding mood i got recently)
>>>> but recently i started this slamm coding modd and found
>>>> ite pleasurable
>>>>
>>>> searching for lazy side things i could code in such mood
>>>> i thought maybe i wopuld like to have compression routine
>>>> but i would like to write it myself sortla like i use quicksort
>>>> or so but i also want to write it myself
>>>>
>>>> so is there maybe some method i could use i mean some simple
>>>> routine that compresses and uncompresses an array of bytes but
>>>> maybe something a bit more complex than RLE (rune length encoding)
>>>> - only thing i know from this domain
>>>>
>>>
>>> Hello fir,
>>>
>>> A method a bit more complex that you might try builds a table of all
>>> bytes as you scan them from the input. The compressed output is a
>>> reference to the table you just built (wnhem repeated byte strings are
>>> found, otherwise feed the input to the compressed image so that the
>>> expansion method can build the sme dynamic table the encoder built.
>>> The table generally has a limit on the number of entries (usually a
>>> good size) and allows the table of bytes to dynamically change as new
>>> patterns are read from the input.
>>>
>>> This is a well known and documented compression/expansion algorithm.
>>> PKZIP and other engines use this as one of their compression methodz.
>>> Look for the description of that if you need more details to figure out
>>> what you need to write.
>>>
>>> Expansion is the reverse. Read the source (now the compressed image)
>>> and build the compression table from the bytes. As encoded references
>>> to the compression table are read from the compressed image output the
>>> source byte sequences. The output should be the same as what your
>>> encoder originally read.
>>>
>>> A good check on the final code is to compare the original input with
>>> the eventual output and make sure they agree exactly.
>>>
>>> Have fun,
>>>
>>> David
>>>
>> this could be good but i dont quite understood that .. but eventually
>> could be good...
>>
>> i thinged something abut that if rle search for repetitions of 1
>> byte then maybe after that search for repetitions of 2 bytes, then 3
>> bytes, 4 bytes and so on.. then do some "report" how many found and then
>> find a way to encode that
>>
>> need to think a bit becouse if rle only stores repetitions that are
>> one after another then this method should store repetitions that have
>> various distances among them
>>
>>
>> i also do nto want to spend a much time on this 1-2 days eventually
>>
>
> Hello fir,
>
> The method above can do the same thing for each repetition of a single
> byte. Alone it has the advantage that compression and decompression are
> single pass operations. It is similar to RLE but allows for more
> variations to be used along the way. The encoder also needs some table
> searching to determine best matches, new byte sequences, and improved
> length sequences. An ordered table of byte strings is typically used.
> I've also seen more advanced search methods that use branch tables to
> represent the encoded sequences with the final entry being the value to put
> in the compressed output. Very easy to search and gives some improvements
> to the basic search design to make performance better. This can be done in
> a few days or less, depending on how much time you spend on deciding the
> table size and so on. I've also seen the search part as a coding
> requirement for a one hour test to see how the coder breaks the idea down.
>
> I like all the ideas you have.
>
> David
>
okay though i will answer to it alter as i decided to do it but not
today yet

Re: simple compression mathod to code itself?

<uuuugs$2vq34$1@dont-email.me>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=34969&group=comp.lang.c#34969

  copy link   Newsgroups: comp.lang.c
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.lang.c
Subject: Re: simple compression mathod to code itself?
Date: Sun, 7 Apr 2024 15:08:57 -0500
Organization: A noiseless patient Spider
Lines: 221
Message-ID: <uuuugs$2vq34$1@dont-email.me>
References: <uupflr$5t1n$1@i2pn2.org>
<XnsB14BC6028E8BAhueydlltampabayrrcom@135.181.20.170>
<uurd34$8ga0$1@i2pn2.org> <uurefe$8i04$1@i2pn2.org>
<XnsB14C7BB22DDD2hueydlltampabayrrcom@135.181.20.170>
<mXeQN.385487$q3F7.369853@fx45.iad>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sun, 07 Apr 2024 20:09:01 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="9796307e8b8c0234504f926a63004b70";
logging-data="3139684"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+VqSyNdT2ppsxXf07+b1AZ1vpCfcMacG0="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:k3kB2UI5MJijuutE0isC7T4clnw=
In-Reply-To: <mXeQN.385487$q3F7.369853@fx45.iad>
Content-Language: en-US
 by: BGB - Sun, 7 Apr 2024 20:08 UTC

On 4/6/2024 11:41 AM, Scott Lurndal wrote:
> David LaRue <huey.dll@tampabay.rr.com> writes:
>> fir <fir@grunge.pl> wrote in news:uurefe$8i04$1@i2pn2.org:
>>
>>> fir wrote:
>> <snip>
>>>
>>> if someone want to talk on compression adn how to code it i could like
>>> to read it (as reading net articles on this may be seem to hard for my
>>> old sick head and posts are much are easier to get into it)
>>
>> There are several good books on search and compression methods that provide
>> examples of complexity and discussions about the complexity and performance.
>
> https://en.wikipedia.org/wiki/Lossless_compression
>
> One of the earliest:
>
> https://en.wikipedia.org/wiki/Huffman_coding
>
> Former colleague wrote this one:
>
> https://en.wikipedia.org/wiki/Gzip
>
> This was once under patent to a former employer:
>
> https://en.wikipedia.org/wiki/LZ77_and_LZ78

Also LZ4 is a moderately simple format, and works reasonably OK.

I have another (similar category but different design) format that gets
slightly better compression to LZ4 at a similar decode speed.

I guess, if I were to try to quickly design another LZ format that was
simple to decode, say:
Tag is a 16 bit word (may fetch more bits though);
LSB=0: Short Run
(15:8): Match Distance (if non 0)
( 7:5): Raw Length (0..7)
( 4:1): Match Length (4..19)
LSB=01: Longer Run
(31:16): Match Distance (up to 64K)
(15:11): Raw Length (0..31)
(10: 2): Match Length (4..515)

With a few special cases:
tag=0x0000: EOB
if md==0:
if(ml!=4)
rl+=(ml-3)*X; //longer run of literal bytes

So, main decoder might look something like (pseudo C):
cs=src;
ct=dst;
while(1)
{
tag=*(u32 *)cs; //assume misaligned safe and little endian
if(!(tag&1))
{
rl=(tag>>5)&7;
ml=((tag>>1)&15)+4;
md=(tag>>8)&255;
cs+=2;
if(!md)
{
if(!tag)
break; //EOB
if(ml!=4)
rl+=(ml-3)*8;
}
}else if(!(tag&2))
{
rl=(tag>>11)&31;
ml=((tag>>2)&511)+4;
md=(tag>>16)&65535;
cs+=4;
if(!md)
{
if(ml!=4)
rl+=(ml-3)*32;
}
}else
{
//maybe more cases
}

if(rl)
{
memcpy(ct, cs, rl);
cs+=rl; ct+=rl;
}

if(md)
{
matchcopy(ct, ct-md, ml);
ct+=ml;
}
}

The matchcopy function would resemble memcpy, but have different
semantics in the case of self-overlap.

Say, a simple version (not tuned for performance):
void matchcopy(byte *dst, byte *src, int len)
{
if((src+len)<dst)
{
//no overlap
memcpy(dst, src, len);
return;
}
while(len--)
*dst++=*src++;
}

The match copy operation tends to bigger and more complicated if one
tries to make it faster (where generally byte-for-byte copies are rather
slow, and the case of long repeating RLE like runs isn't particularly rare).

Designing an LZ encoder is more of a challenge though.
Where typically, speed, compression, and code simplicity, are all
mutually opposed (a simple LZ encoder will be either slow or give crappy
compression, though speed is a matter of perspective).

Note that comparably, Deflate is a fairly complicated format.

Huffman coding is effective, but as noted, comes at a cost in terms of
both speed and code complexity (but is still much faster than something
like range coding).

Though, partial ways to make Huffman decoding a little faster (in the
design of a compressor):
Limit maximum symbol length to around 12 or 13 bits, which allows lookup
tables to fit in the L1 caches of typical CPUs;
Decode blocks of symbols in advance, then fetch symbols as-needed from
these blocks (this allows overlapping the Huffman symbol decoding,
making more effective use of the CPU pipeline).

These designs had sort of ended up resembling an entropy-coded version
of LZ4 (usually with Huffman coded blocks for tags, literal bytes, and
length/distance prefixes; using a scheme similar to Deflate's distance
coding scheme for longer match or raw lengths, as well as for match
distance). Note that after decoding the symbol blocks, the bitstream
would only contain "extra bits" for the distance coding.

But that said, on a modern PC style CPU, it is rather difficult to get
Huffman coded designs much over about 600 or 700 MB/sec for decode
speed, even with these tricks (if one skips an entropy-coding stage,
typically several GB/sec is possible for LZ decoding).

Though, one possibility could be a pseudo-entropy scheme, say:
Sort all the symbols into descending frequency order;
Stored directly as a table of 128 bytes (skip uncommon bytes).
Encode the symbol blocks with bytes encoding table indices, say:
00..78: Symbol Pair (0..10)
7F: Escaped Symbol (encoded as a byte)
80..FF: Direct Index (0..127)

With the block being encoded as a blob of raw bytes if the
pseudo-entropy scheme didn't save enough space (say, if there isn't a
significant probability skew). Idea being that this would be faster to
decode than blobs of Huffman-coded symbols.

Then, say, block is encoded as a 16-bit prefix, say:
0000..3FFF: Up to 16K, raw byte symbols
4000..7FFF: Up to 16K, pseudo-entropy.
8000..FFFF: Escape other block types.

Though, efficiently dealing with the dual-symbol case would be harder,
most obvious answers being one of:
Lookup index pair in a lookup table;
itmp=indexpairlut[relidx];
i0=itmp&255;
i1=itmp>>8;
Or, try to split it up directly.
Naive:
i1=relidx/11; //can be turned into multiply by reciprocal
i1=(relidx*0x175)>>12; //alternate
i0=relidx-(i1*11);
Though, unclear if this would be fast enough to be worthwhile.

Could maybe also abandon the use of a bitstream for the extra-bits, to
using nybbles or bytes instead (where a nybble or byte stream is faster
than a bitstream). Decided not to go into the specifics of
length/distance coding here.

But, say, in this case, each LZ coded block would consist of 4 symbol
blocks:
Tag Symbols
Literal Symbols
Distance Symbols
Distance Extra (probably always raw bytes)

TBD if it could be performance competitive with the purely byte-oriented
alternatives though, which would still have the advantage of being
simpler in any case.

Well, among any number of possibilities.
Could probably fiddle with it...

But, all this is getting a bit more advanced...

1
server_pubkey.txt

rocksolid light 0.9.8
clearnet tor