Welcome to novaBBS (click a section below) |
mail files register nodelist faq login |

Subject | Author |

Re: Typical set vs. smallest delta-sufficient subset | Frederico Guth |

Re: Typical set vs. smallest delta-sufficient subset | Eli the Bearded |

1 |

X-Received: by 2002:a37:f612:: with SMTP id y18mr36678566qkj.436.1621977056286;

Tue, 25 May 2021 14:10:56 -0700 (PDT)

X-Received: by 2002:a25:addc:: with SMTP id d28mr48318990ybe.448.1621977055891;

Tue, 25 May 2021 14:10:55 -0700 (PDT)

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail

Newsgroups: comp.compression

Date: Tue, 25 May 2021 14:10:55 -0700 (PDT)

In-Reply-To: <36da806e-3612-44e1-9157-c667816de78c@l16g2000yqo.googlegroups.com>

Injection-Info: google-groups.googlegroups.com; posting-host=189.6.81.232; posting-account=2Cg1wQoAAACsEilJTAtGFd4E7AYMYsa9

NNTP-Posting-Host: 189.6.81.232

References: <36da806e-3612-44e1-9157-c667816de78c@l16g2000yqo.googlegroups.com>

User-Agent: G2/1.0

MIME-Version: 1.0

Message-ID: <fa4d39eb-808b-4d45-a3b4-fe7d283dda41n@googlegroups.com>

Subject: Re: Typical set vs. smallest delta-sufficient subset

From: fredg...@fredguth.com (Frederico Guth)

Injection-Date: Tue, 25 May 2021 21:10:56 +0000

Content-Type: text/plain; charset="UTF-8"

Content-Transfer-Encoding: quoted-printable

View all headersTue, 25 May 2021 14:10:56 -0700 (PDT)

X-Received: by 2002:a25:addc:: with SMTP id d28mr48318990ybe.448.1621977055891;

Tue, 25 May 2021 14:10:55 -0700 (PDT)

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail

Newsgroups: comp.compression

Date: Tue, 25 May 2021 14:10:55 -0700 (PDT)

In-Reply-To: <36da806e-3612-44e1-9157-c667816de78c@l16g2000yqo.googlegroups.com>

Injection-Info: google-groups.googlegroups.com; posting-host=189.6.81.232; posting-account=2Cg1wQoAAACsEilJTAtGFd4E7AYMYsa9

NNTP-Posting-Host: 189.6.81.232

References: <36da806e-3612-44e1-9157-c667816de78c@l16g2000yqo.googlegroups.com>

User-Agent: G2/1.0

MIME-Version: 1.0

Message-ID: <fa4d39eb-808b-4d45-a3b4-fe7d283dda41n@googlegroups.com>

Subject: Re: Typical set vs. smallest delta-sufficient subset

From: fredg...@fredguth.com (Frederico Guth)

Injection-Date: Tue, 25 May 2021 21:10:56 +0000

Content-Type: text/plain; charset="UTF-8"

Content-Transfer-Encoding: quoted-printable

I am studying this myself. Here is my take. A Typical subset is a delta-sufficient subset. But not all delta-sufficient subsets are typical. Why? Because the elements of the typical subset have similar probabilities, while the smallest delta-sufficient subset, for example, is not typical (by construction. You chose the most probable elements).

Why to compress with the typical and not with the smallest delta-sufficient subset:

Imagine a very naive compression of a text. You may choose to compress using the smallest delta-sufficient subset of the letters of the alphabet. So, e is the most probable letter and you will keep it. q is the least used letter and it is not in your subset. The problem is that the least used letters are more informative than the most used letters. You can probabily infr a txt without the lttr 'e'. The typical set is the most informative subset of your alphabet.

Fred

On Wednesday, February 25, 2009 at 10:26:20 AM UTC-3, Simba wrote:

Why to compress with the typical and not with the smallest delta-sufficient subset:

Imagine a very naive compression of a text. You may choose to compress using the smallest delta-sufficient subset of the letters of the alphabet. So, e is the most probable letter and you will keep it. q is the least used letter and it is not in your subset. The problem is that the least used letters are more informative than the most used letters. You can probabily infr a txt without the lttr 'e'. The typical set is the most informative subset of your alphabet.

Fred

On Wednesday, February 25, 2009 at 10:26:20 AM UTC-3, Simba wrote:

Hi,

I studied the source coding theorem, chapter 4 of the MacKay's book,

but I have some perplexities.

The source coding theorem implies that, for large N, the cardinality

of the smallest delta-sufficient subset is about 2^(NH). So, in this

set we have the 2^(NH) most probable outcomes.

There is also the asymptotic equipartition principle, which implies

that, for large N, the typical set, whose elements have probability of

'about' 2^(-NH), contains almost all the probability. So, the typical

set has about 2^(NH) elements.

So, we have that the typical set and the smallest delta-sufficient

subset for large N have approximately the same cardinality, 2^(NH),

right?

But the typical set doesn't contain the most probable outcomes (as for

the other set), so I imagine it as the smallest delta-sufficient

subset "shifted" a bit towards the less probable outcomes, where

"shifted" refers to a picture in which the outcomes are represented as

points on segment, with probability increasing from left to right: the

smallest delta-sufficient subset is at the extreme right, while the

typical set is a bit more centered, right?

So, the book says that we can (and probably should) define a

compression algorithm that gives a distinct name of length NH bits to

each element of the typical set. That's ok, but my question is: why

the typical set and not the smallest delta-sufficient set? I know that

it should be equivalent, because both sets contains almost all the

probability, but why should we choose the typical set?

Thanks

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!panix!qz!not-for-mail

From: *...@eli.users.panix.com (Eli the Bearded)

Newsgroups: comp.compression

Subject: Re: Typical set vs. smallest delta-sufficient subset

Date: Tue, 25 May 2021 21:56:22 +0000 (UTC)

Organization: Some absurd concept

Lines: 20

Message-ID: <eli$2105251756@qaz.wtf>

References: <36da806e-3612-44e1-9157-c667816de78c@l16g2000yqo.googlegroups.com> <fa4d39eb-808b-4d45-a3b4-fe7d283dda41n@googlegroups.com>

NNTP-Posting-Host: panix5.panix.com

X-Trace: reader1.panix.com 1621979782 9340 166.84.1.5 (25 May 2021 21:56:22 GMT)

X-Complaints-To: abuse@panix.com

NNTP-Posting-Date: Tue, 25 May 2021 21:56:22 +0000 (UTC)

X-Liz: It's actually happened, the entire Internet is a massive game of Redcode

X-Motto: "Erosion of rights never seems to reverse itself." -- kenny@panix

X-US-Congress: Moronic Fucks.

X-Attribution: EtB

XFrom: is a real address

Encrypted: double rot-13

User-Agent: Vectrex rn 2.1 (beta)

View all headersFrom: *...@eli.users.panix.com (Eli the Bearded)

Newsgroups: comp.compression

Subject: Re: Typical set vs. smallest delta-sufficient subset

Date: Tue, 25 May 2021 21:56:22 +0000 (UTC)

Organization: Some absurd concept

Lines: 20

Message-ID: <eli$2105251756@qaz.wtf>

References: <36da806e-3612-44e1-9157-c667816de78c@l16g2000yqo.googlegroups.com> <fa4d39eb-808b-4d45-a3b4-fe7d283dda41n@googlegroups.com>

NNTP-Posting-Host: panix5.panix.com

X-Trace: reader1.panix.com 1621979782 9340 166.84.1.5 (25 May 2021 21:56:22 GMT)

X-Complaints-To: abuse@panix.com

NNTP-Posting-Date: Tue, 25 May 2021 21:56:22 +0000 (UTC)

X-Liz: It's actually happened, the entire Internet is a massive game of Redcode

X-Motto: "Erosion of rights never seems to reverse itself." -- kenny@panix

X-US-Congress: Moronic Fucks.

X-Attribution: EtB

XFrom: is a real address

Encrypted: double rot-13

User-Agent: Vectrex rn 2.1 (beta)

In comp.compression, Frederico Guth <fredguth@fredguth.com> wrote:

Top-posted reply 12 years after the original post.

By construction, the least useful subset of replies are those that are

composed long after an answer is expected.

Elijah

------

need to wait a few more decades to for the smallest possible usefulness

I am studying this myself. Here is my take. A Typical subset is a....

delta-sufficient subset. But not all delta-sufficient subsets are

typical. Why? Because the elements of the typical subset have similar

probabilities, while the smallest delta-sufficient subset, for example,

is not typical (by construction. You chose the most probable elements).

On Wednesday, February 25, 2009 at 10:26:20 AM UTC-3, Simba wrote:

Hi,

I studied the source coding theorem, chapter 4 of the MacKay's book,

but I have some perplexities.

Top-posted reply 12 years after the original post.

By construction, the least useful subset of replies are those that are

composed long after an answer is expected.

Elijah

------

need to wait a few more decades to for the smallest possible usefulness

1 |

clearneti2ptor