novaBBS - comp.theory - Re: B.H.'s data compression challenge

On Monday, March 14, 2022 at 7:58:27 PM UTC-4, wij wrote:
> On Tuesday, 15 March 2022 at 02:40:22 UTC+8, B.H. wrote:
> > On Monday, March 14, 2022 at 11:34:54 AM UTC-4, B.H. wrote:
> > > On Monday, March 14, 2022 at 5:01:07 AM UTC-4, wij wrote:
> > > > I pick the data compression algorithm to your challenge (because, I believe more people would know about and be interested in data compression, and algorithm is
> > > > more difficult to cheat than abstract theories):
> > > > Excerpt from https://groups.google.com/g/comp.theory/c/vSE_qYBEDf8/m/J3sLILkgAwAJ
> > > >
> > > > - Obtain a binary string corresponding to the file.
> > > > - Treat the binary string as a number and factor it.
> > > > - Given any large prime factors, write down the number in a sense as m + 1, and factor m .
> > > > - Keep going until you have only a finite collection of very small prime factors.
> > > >
> > > > ------
> > > > Actually, just by seeing those sketch, I know I don't have to ask more.
> > > > 1. By advertising PTIME, I would assume you don't understand real program, thus
> > > > algoritms. (Had anyone heard of a "PTIME sort" algorithm?).
> > > > 2. From "Obtain a binary string corresponding to the file."
> > > > I would wonder what kind of system you were thinking?
> > > > 3. I don't think you can factorize an average 20Mbyte image file in 100 years,
> > > > let alone the rest stuff and the basic: How can factorization compress data?
> > > > You just need to express your compression algorithm clearly. My premature
> > > > stereotype above can be ignored. I just want to know if I made a big mistake about you.
> > > I have the updated answer fully written out, and am checking it. It looks right. It contains no proofs; you get what you pay for, I just wrote an algorithm that looks right.
> > >
> > > I'll check it over some more and post it later today or early tomorrow. I am not checking it over that much, just enough to make sure that it's right and fully error-free.
> > >
> > > -Philip White
> > OK, here is the answer to your challenge. I warn you: Never demand cutting-edge math help from a human trafficking victim or convicted criminal seeking release, even with consent. Guilt or innocence is the standard for securing release, from prison only (since human trafficking is just a crime), not achievement; I submit this merely to bolster my case since I am nowhere near being able to be properly helped by any "justice" system.
> >
> > Here is the updated solution:
> >
> > - Assume that this test for finding a j-digit base-10 prime number works: http://www.sci.brooklyn.cuny.edu/~yarmish/finding%20large%20primes%208.5x11.pdf
> You mean your should-be working algorithm depends on an un-verified paper that
> finds irrelevant big prime number?
> > - We are given a very large number (that represents a file), m, to compress. Consider a number larger than 10^((1000)^10), for example.
> Bad wording. You might just mean the file = big number m

First of all, it is predictable of, outrageous for, and very self-destructive to you to completely auto-reject my idea without any serious objections. You'll say you're a "predator" and I'm a "big-headed arrogant faker" or something. What's clear, rather strikingly, is that you clearly have minimal understanding of programming or math; some American sophomores in high school could understand this idea, and some American twelve-year-olds could code it and run it on files themselves.

The wording is fine, don't present minor typo-like objections based on your very weird sense of style. Present an error in the mathematical description of the algorithm or admit that it's presented properly; you asked for it, now do the leg work to evaluate it or find someone who can.

> > - We seek a nearby power of a prime power (r^w=q)^k, in the sense that m - q^k is small. (The prime is r, and q = r^w.)
> What is said? Repeated saying the power of a prime is for what?

We are looking for a prime power--a prime number raised to a positive integer power, which we'll call w--which we shall denote q. Then, we select a positive integer k such that q^k meets certain criteria...basically, we want r, w, and k to be chosen according to subsequent steps.

You can ask more questions if you have more specific things you want to know. At this point, you have stated zero substantive objections to my algorithm's value.

> > - Let b be the number of bits needed to express m.
> b= number_bits(m) ?

Well, yes. If you wrote m out in binary, how many bits would you need to write it out fully, disallowing leading 0's on the left?

> > - Use a calculator program to compute an approximation, a = 2^(b/9) .
> The algorithm is an 'approximation'?

Yes, the calculator program outputs a finite number of digits of what is likely to be an irrational number. It isn't a precise formula for 2^(b/9), it is a decimal expansion to arbitrary but not infinite precision.

At least your questions are easily answered!

> > - Round the value of a down to the nearest integer.
> a= ⌊2^(b/9)⌋ ?

Yes, you take the floor function of 2 raised to the (b/9). In many cases, b is not divisible by 9.

> > - Let d be the number of base-10 digits in a.
> // d (base-10)= a ?

That's right; you don't have to use base 10, but I wrote it that way for my convenience and ease of writing, since I would prefer not to exert unnecessary effort regarding something I'm giving away for free (and for political reasons, to legally/ethically attack Facebook in particular) and won't publish or patent. Programmers and firms can figure it out for themselves if they want to, I gave enough detail to get people very much started on this as a programming project. If I did the math right, and I might not have and I didn't write out a full and definitely accurate proof, you could probably store something like 10^(100000) MB of data in less than a megabyte, ultimately. I am definitely not sure of that or guaranteeing that; that's a ballpark estimate and I haven't even calculated it out myself at all using this (likely to be) fully correct algorithm. I submitted no proofs.

> > - Using the process linked to above, find the smallest-sized (storage-size-wise) prime number r and integer pair w such that r^w = q has (d-1) base-10 digits.
> What is the 'process'? 'storage-size-wise'?

The process is the process for finding prime numbers with a certain number of digits. I didn't read the paper, I just assume it is probably true and linked to it. I am not publishing to a journal, you do not sound at all like a Ph.D. math/CS person; you are closer to a CIA-linked thug whose request I can and should answer to help my interests and damage yours, because you are likely to claim it's wrong, which hurts the CIA's credibility, helping me in this case. Subtle denials don't fool a lot of people, a lot of CIA types think that "deniability" means "invincibility" and "the ability to fool everyone no matter what Abraham Lincoln says," and that's absurd to most fairly well-educated people, I assert. Lots of people are not fooled by denials, including of my belief that you are linked to the CIA and backed by them.

Storage-size-wise means the number of bits to write it down.

> > - Find the largest value k such that q^k is the largest possible integer such that this integer is less than m.
> How to define the largest value < m ?

The variable m is an integer. You select k to be the largest exponent such that q^k is maximally large but not as large as m.

> > - Compute v = m - z * q^k, where z is selected to be the largest possible integer that that yields non-negative v, such that z < q .
> Simple division with remainder?

Roughly speaking, yes.

> > - Record the values of the following 3-tuple: (r, w, k)
> What is r,w,k ?

I defined them above.

> > - Note that v < z * q^k , and thus it is much smaller than m.
> Thus what?

It is a comment to aid the understanding of readers who might not immediately grasp the size of v as compared to m; this fact is helpful, since it means that the numerical values are getting smaller, meaning compression is "probably happening." My comments are probably particularly helpful to people with a certain kind of intuition that is roughly like mine, not that I categorize it in any particular way. (Don't worry, other readers; I don't believe I'm giving anything important away by merely discussing intuition here.)

> > - Now, repeat this process on both v and z, and concatenate the ordered 3-tuple from previous repetitions (perhaps written as a tree structure in prefix notation using base-4), that repetition, and all successive repetitions to form a string, S, until m is "very small."
> Repat senseless operation ?

It is repeating the process again and again to decrease the size of the overall string for this pass.

Note, using base-4 might not be the best way to do it; more likely, a base-2 string, ordered in a certain way to reflect the locations of nodes in a binary tree would be appropriate. The challenge is to separate the numbers properly, so probably we would want alphabet with 1, 0, and blanks, and we would want to separate the entries with blanks, have the TM be sensitive to blanks and able to read them and maybe also write them, and probably write all numbers overall with blanks included...the concern is that when you have an original number--a starting file--representing it as a number is not going to involve any separations of the content when you write it as a binary integer. I don't think it's a big problem, but the correct answer is probably to just assume the availability of blank symbols in all strings and not the use of these symbols in the first string, associated with the file. (As you can see, understanding precision is of the utmost importance in math/CS; if you write high-level overviews of proofs, algorithms, etc., you need to be able to at least "zoom in" on certain areas that you may have covered only intuitively to be more precise if that's important. I write that mainly for interested readers who may be interested in math/CS, not for you.)

> > - When you are done, write down the overall string, S, followed by the #-separated number 1 (if this is the first time th process is being run).
> Another senseless op.

No. Precision is important.

> > - Repeat the entire process above on S, taken as an integer, and increment the value of "1" to "2", "2" to "3", etc., as many times as desired; repeat until the string cannot be made smaller.
> Another senseless op.

No, not at all.

> > The big-picture version is: We find a very large but not too large “suitable new base” q to express m in. We record q as a prime power, the correct large power k it is raised to, and the value z it is multiplied by when we subtract from m to obtain v, the value that remains after subtracting the first base-q digit from m. We then repeat this process on v repeatedly until we deem that v is “small enough.” Based on careful selection of a in the process, we ensure that the length of m as a binary or base-10 string of bits/digits decreases by about 1/9 every time this process is applied, disregarding for now the size of recorded values of r, w, and k, which are relatively short strings that are recorded as part of the compressed string. By effectively multiplying the length of the string by approximately 8/9 each time we run the process, while writing down much smaller representations of the base q, which equals r^w, as we diminish the size of the original number m, we compress the string to be “much smaller,” vaguely speaking, than it started out being. Again, we can repeat this overall process with a recorded integer signifying the overall integer number of times the process has been run to make the string smaller and smaller, ultimately yielding a compressed string with an integer number of times to unpack it. It is easy to see that the string can be compressed and decompressed in polynomial time, with randomness needed for the primality test in the compression step.
> >
> > Extremely succinct summary: Re-write a given large integer with a particular different, succinctly representable base that can be written as a prime power [1], and then recursively compress the leftmost digit of m in that base and the rest of the representation of m without that digit. Repeat this process itself, keeping track of how many times you’ve done it, until you have a string and integer pair that cannot be made any smaller.
> >
> > Note: We would not want for the r value to be 2 every time, if we were indeed decreasing the string length by multiplying it by 8/9 each time; we want to decrease the size of the string each time we run the routine, so selecting large primes r will typically be what we want to do. No proof that we can “usually find” a suitably large prime value for r to minimize the size of (r,w) is included, although it seems to be true. Also, I haven’t fully established the exact effectiveness (i.e., worst-case extent of compression) of the algorithm presented, having not spent the time to write any proofs about this updated version of the idea. It looks very good, though.
> >
> > [1] Think of writing “3^1000000” as (3,1000000) instead of in binary or base-10.
> >
> > (I’m glad I decided to check this; I caught and fixed a few rather minor glitches.)
> >
> Summarization of nonsense don't make it any sensible.

That's not a substantive objection. My summaries are excellent.

> > - - -
> >
> >
> > Write a response if you have an objection or have found a mistake. Try not to waste my time with fluff; present your argument carefully if you want to present one.
> >
> > Of course, any mathematically literate person who gets programming and would give this a little thought would see that it is correct. I didn't write a full proof or check my work too many times over, though; you get what you pay for. Maybe if you present a coherent objection I'll write a proof establishing that you are wrong.
> >
> > Don't say it was trivial to you, since you clearly didn't see it yourself.
> >
> > How many more times are you going to lie to the world and insist that I am a scammer person and must prove otherwise by sharing math explanations with you? What are the consequences of doing that, even once? Do you insist that every innovator or student go through this process to prove his/her ability to be genuine? What will happen to American STEM because of this, is that your plan? It might work, I see that; who would want to major in STEM given the zero-ethics, zero-accountability culture of it, even in the US? Are you going to ask Grigori Perelman or Terence Tao if they are scammer people, too? Why is it just me, I wonder? What do institutions like Harvard and MIT have to say about this kind of dishonorable conduct, I wonder?
> >
> > -Philip White (philip...@yahoo.com)
> Too many errors and ignorance to deserve sensible comments.

Another outrage, not that it matters. Do you actually not understand the idea??

> I just type some refutations for fear you would blind yourself again to say
> "no valid coherent objection. I am not legally refutated."...sort of shameless things.
>

You've typed no refutations. I type words to that effect just as you predicted, because it is true. You predict the correct reaction and then I give it.

> Am I qualified to call you a loser, scumbag? Stiil want to debate?

"No" and "I guess so."

You sound watched by someone I used to interact with.

> I don't think you can make a reasonable STEM debate as usual but probably
> screaming in the bed latter and again, "but, I am a disable! a patient! The
> world should take care of Philip White! It's law...etc."

My debate was handled very well by me. It is hard to make tons of brilliant points in response to your "rebuttals" and "objections" when you comments contain virtually no substance and no serious criticism or objection to my idea.

Your anti-American views do not sound like an act, no matter what you may protest in the future or ask others to think of your true beliefs.

-Philip White

Subject	Replies	Author
B.H.'s data compression challenge By: wij on Mon, 14 Mar 2022	23	wij

Linux is addictive, I'm hooked! -- MaDsen Wikholm's .sig

computers / comp.theory / Re: B.H.'s data compression challenge