Lossless Image Compression Through Super-Resolution

beagle3 | 384 points

This is utterly fascinating.

To be clear -- it stores a low-res version in the output file, uses neural networks to predict the full-res version, then encodes the difference between the predicted full-res version and the actual full-res version, and stores that difference as well. (Technically, multiple iterations of this.)

I've been wondering when image and video compression would start utilizing standard neural network "dictionaries" to achieve greater compression, at the (small) cost of requiring a local NN file that encodes all the standard image "elements".

This seems like a great step in that direction.
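
A minimal sketch of the scheme as described above, assuming a grayscale image stored as nested lists with even dimensions. Nearest-neighbor upsampling stands in for the neural network predictor, and the function names (`downsample`, `predict`, `encode`, `decode`) are made up for illustration, not taken from the project:

```python
def downsample(img):
    """Halve the resolution by averaging each 2x2 block (integer division)."""
    h, w = len(img), len(img[0])
    return [
        [(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) // 4
         for x in range(0, w, 2)]
        for y in range(0, h, 2)
    ]

def predict(low):
    """Stand-in predictor: nearest-neighbor upsampling (assumption, not a neural net)."""
    out = []
    for row in low:
        wide = [v for v in row for _ in range(2)]  # repeat each pixel horizontally
        out.append(wide)
        out.append(list(wide))                     # repeat each row vertically
    return out

def encode(img):
    """Store the low-res image plus the prediction error at full resolution."""
    low = downsample(img)
    pred = predict(low)
    residual = [[a - p for a, p in zip(ra, rp)] for ra, rp in zip(img, pred)]
    return low, residual

def decode(low, residual):
    """Re-run the same predictor and add the stored error back."""
    pred = predict(low)
    return [[p + r for p, r in zip(rp, rr)] for rp, rr in zip(pred, residual)]

image = [
    [10, 12, 200, 202],
    [11, 13, 201, 203],
    [50, 52, 90, 92],
    [51, 53, 91, 93],
]
low, residual = encode(image)
restored = decode(low, residual)
```

The round trip is exact for any deterministic predictor; a better predictor only makes the residual values smaller, and so cheaper to entropy-code.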

crazygringo | 2 months ago

Interesting. It sounds like the idea is fundamentally like factoring out knowledge of "real image" structure into a neural net. In a way, this is similar to the perceptual models used to discard data in lossy compression.

acjohnson55 | 2 months ago

This is really interesting but out of my league technically. I understand that super-resolution is the technique of inferring a higher-resolution truth from several lower-resolution captured photos, but I'm not sure how this is used to turn a high-resolution image into a lower-resolution one. Can someone explain this to an educated layman?

propter_hoc | 2 months ago

I asked a question about a similar idea on Computer Science Stack Exchange in 2014. https://cs.stackexchange.com/questions/22317/does-there-exis...

They did not have any idea and they were dicks about it as usual.

ilaksh | 2 months ago

This technology is super awesome... and it's been available for a while.

A few years ago, I worked for #bigcorp on a product which, among other things, optimized and productized a super resolution model and made it available to customers.

For anyone looking for it: it should be available in several open source libraries (and closed source #bigcorp packages) as an already-trained model that is ready to deploy.

Der_Einzige | 2 months ago

On the order of 10% smaller than WebP, substantially slower encode/decode.

trevyn | 2 months ago

Reminds me of this.

https://en.wikipedia.org/wiki/Jan_Sloot

Gave me a comical thought if such things can be permitted.

You split into rgb and b/w, turn the pictures into blurred vector graphics. Generate and use an incredibly large spectrum of compression formulas made up of separable approaches that each are sorted in such a way that one can dial into the most movie-like result.

3d models for the top million famous actors and 10 seconds of speech then deepfake to infinite resolution.

Speech to text with plot analysis since most movies are pretty much the same.

Sure, it won't be lossless, but replacing a few unknown actors with famous ones and having a few accidental happy endings seems entirely reasonable.

6510 | 2 months ago

Related, for another domain: lossless text compression using LSTM: https://bellard.org/nncp/

(This is by Fabrice Bellard; one wonders how he can achieve so much.)

m3at | 2 months ago

This is a lot like "waifu2x".[1] That's super-resolution for anime images.

[1] https://github.com/nagadomi/waifu2x

Animats | 2 months ago

Reminds me of RAISR (https://ai.googleblog.com/2016/11/enhance-raisr-sharp-images...).

I remember talking with the team and they had production apps using it and reducing bandwidth by 30%, while only adding a few hundred kb to the app binary.

asciimike | 2 months ago

And what's the size of the neural network you have to ship for this to work? Has anyone done the math on the break-even point compared to other compression tools?

Edit: actually, a better metric would be how much it compresses compared to doing the resolution increase with plain Lanczos in place of the neural net, keeping the delta part intact.

LoSboccacc | 2 months ago

Does anyone know how much better the compression ratio is compared to PNG, which is also a lossless format?

nojvek | 2 months ago

I wonder how well this technique works when the depth of field is infinite?

Out of focus parts of an image should be pretty darned easy to compress using what is effectively a thumbnail.

That said, the idea of having an image format where 'preview' code barely has to do any work at all is pretty damned cool.

hinkley | 2 months ago

Would massive savings be achieved if an image sharing app like say, Instagram were to adopt it, considering a lot of user-uploaded travel photos of popular destinations look more or less the same?

tjchear | 2 months ago

I believe a big issue with this will be floating point differences. Due to the network being essentially recursive, tiny errors in the initial layers can grow to yield an unrecognizably different result in the final layers.

That's why most compression algorithms use fixed point mathematics.

There are ways to quantize neural networks to make them use integer coefficients, but that tends to lose quite a lot of performance.

Still, this is a very promising lead to explore. Thank you for sharing :)
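
To make the floating-point concern concrete, here is a small sketch showing that float addition depends on evaluation order while fixed-point (integer) addition does not. The 16-bit scale and the `to_fixed` helper are arbitrary choices for illustration; real encoder/decoder mismatches would come from different hardware or libraries reordering operations inside the network:

```python
# Same three numbers, summed in two different orders.
a = (0.1 + 0.2) + 0.3
b = 0.1 + (0.2 + 0.3)
float_match = (a == b)  # floats: rounding depends on the order of operations

# Fixed point: scale to integers once, then do exact integer arithmetic.
SCALE = 1 << 16  # hypothetical 16 fractional bits

def to_fixed(x):
    """Convert a float to a fixed-point integer (rounded once, up front)."""
    return round(x * SCALE)

fa = (to_fixed(0.1) + to_fixed(0.2)) + to_fixed(0.3)
fb = to_fixed(0.1) + (to_fixed(0.2) + to_fixed(0.3))
fixed_match = (fa == fb)  # integer addition is associative, so order is irrelevant

print("float sums equal:", float_match)
print("fixed-point sums equal:", fixed_match)
```

If the decoder's prediction differs from the encoder's by even one unit anywhere, the stored correction no longer lands on the original pixel, which is why bit-exact arithmetic matters here.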

fxtentacle | 2 months ago

Is this actually lossless - that is, the same pixels as the original are recovered, guaranteed? I'm surprised such guarantees can be made from a neural network.

eximius | 2 months ago

I thought super-resolution used multiple input files to "enhance", for example extracting a high-res image from a video clip.

jbverschoor | 2 months ago

This is interesting but I'm not sure if the economics of it will ever work out. It'll only be practical when the computation costs become lower than storage costs

ackbar03 | 2 months ago

How do ML based lossy codecs compare to state of the art lossy compression? Intuitively it sounds like something AI will do much better. But this is rather cool.

dvirsky | 2 months ago

Looks like FLIF has a slight edge on compression ratio according to the paper, but this method beats the other common compression schemes, which is impressive.

slaymaker1907 | 2 months ago

How does it work for data other than Open Images, if trained on Open Images? If it recognizes fur, it's going to be great on cat videos.

Animats | 2 months ago

It seems like "lossless" isn't quite right; some of the information (as opposed to just the algo) seems to be in the NN?

Is a soft-link a lossless compression?

It's like the old joke about a pub where they optimise by numbering all the jokes. The joke number alone isn't enough: it can be used to losslessly recover the joke, but only by relying on the community's storage to hold the data.
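
The shared-storage point can be sketched with zlib's preset dictionary support: both sides hold the same dictionary out of band, the output is still bit-exact, and the compressed message shrinks because part of the information lives in the dictionary. The `shared` and `message` contents and the helper names are made up for illustration:

```python
import zlib

# Data both sides already hold (analogous to the shipped network weights).
shared = b"A man walks into a pub and orders a drink. " * 3
message = b"A man walks into a pub and orders a drink."

def compress(data, zdict=None):
    """Deflate `data`, optionally priming the compressor with a preset dictionary."""
    c = zlib.compressobj(zdict=zdict) if zdict else zlib.compressobj()
    return c.compress(data) + c.flush()

def decompress(blob, zdict=None):
    """Inflate `blob`; the same dictionary must be supplied if one was used."""
    d = zlib.decompressobj(zdict=zdict) if zdict else zlib.decompressobj()
    return d.decompress(blob) + d.flush()

without_dict = compress(message)
with_dict = compress(message, shared)

print(len(message), len(without_dict), len(with_dict))
roundtrip = decompress(with_dict, shared)
```

The round trip recovers the message exactly, so it is lossless in the usual sense; the cost has just moved into something the receiver must already have.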

pbhjpbhj | 2 months ago