
Doesn’t sound so hard, right? Well, there are still a bunch of logistical issues to overcome that might not be so obvious at first! Let’s dig into those further.

The first problem we immediately encountered is that the characters of our text don’t line up 1:1 with the blocks of the redaction. This means that a given correct guess might actually have some wrong blocks on the right-most edge. To see what I mean, check out this example: the letters “t” and “h” share a column of blocks. So, if we try to make a guess for the letter “t”, the left-most column of blocks turns out correct, but the right-most ones are a bit wrong. The reason the second column is wrong is that the letter “h” is there messing things up. If we just looked at this alone, we might conclude that the letter “t” was an incorrect first letter, since it gets almost half of the blocks totally wrong.

The first thing we tried was to avoid counting the right-most column of blocks of any guess. It’s the column that will have the most bleed-over and can have quite a bit of error. The problem with this was that, in practice, it reduced the total size of our guess by so much that we started receiving false positives. There’s always a chance that your letter will accidentally line up and produce a match by pure chance, and this chance goes way up when there are fewer blocks to consider.

So instead, what we did was to chop off the comparison block at the boundary of the letter itself. The quality of our match goes way up when we chop the comparison off at the edge of where the “t” ends, since we’re including less of that incorrect area on the right. The benefit to doing it this way is that the more our guess character extends into the block, the more likely the block is to be a good guess, and so we keep more of the block.
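To make the boundary-chopping idea a little more concrete, here is a minimal sketch on a one-dimensional strip of pixels. Everything in it is an assumption made for illustration: the pixel values, the `guess_error` helper, the background value of 0, and the use of mean absolute difference as the score. It is not the scoring code from the actual tool.

```python
def pixelate_1d(pixels, block_size):
    """Replace every block of `block_size` pixels with the block's average."""
    out = []
    for start in range(0, len(pixels), block_size):
        block = pixels[start:start + block_size]
        out.extend([sum(block) / len(block)] * len(block))
    return out


def guess_error(guess, redacted, block_size, crop_at_letter, background=0):
    """Mean absolute difference between a pixelated guess and the redaction.

    Assumes `guess` is non-empty and no longer than `redacted`.
    If `crop_at_letter` is True, only pixels up to the right edge of the
    guessed letter are compared. Otherwise every block the letter touches
    is compared in full, including the bleed-over block.
    """
    # Pad the guess with background so it can be pixelated on the same grid.
    padded = guess + [background] * (len(redacted) - len(guess))
    blurred = pixelate_1d(padded, block_size)
    if crop_at_letter:
        end = len(guess)
    else:
        blocks_touched = -(-len(guess) // block_size)  # ceiling division
        end = min(len(redacted), blocks_touched * block_size)
    diffs = [abs(blurred[i] - redacted[i]) for i in range(end)]
    return sum(diffs) / len(diffs)


# Hypothetical letters: "t" is 6 pixels wide, "h" is 2 pixels wide,
# so with 4-pixel blocks they share the second block.
t_pixels = [4, 8, 4, 8, 4, 8]
h_pixels = [8, 8]
redacted = pixelate_1d(t_pixels + h_pixels, 4)

whole_block_error = guess_error(t_pixels, redacted, 4, crop_at_letter=False)
cropped_error = guess_error(t_pixels, redacted, 4, crop_at_letter=True)
print(whole_block_error, cropped_error)
```

In this toy case the correct guess “t” scores better once the comparison stops at the letter’s right edge, because fewer pixels from the shared block are counted against it.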
In cryptographic terms, we’d say the redaction has no diffusion. A change of one pixel somewhere in the original image ONLY impacts the redacted block it belongs to, meaning that we can (mostly) guess the image character by character. We’ll do a recursive depth-first search on each character, scoring each guess by how well it marginally matches up to the redacted text.

Basically, we guess the letter “a”, pixelate that letter, and see how well it matches up to our redacted image.
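Here is a rough sketch of what a character-by-character depth-first search could look like on a toy problem. The three-letter `TOY_FONT`, the one-dimensional “image”, and the exact-match pruning rule are all assumptions for illustration. A real attack would render actual glyphs and use a tolerance-based score rather than exact equality, and this toy only compares blocks that the guessed prefix covers completely.

```python
# Hypothetical 1-D "font": each character is a short strip of pixel values.
TOY_FONT = {
    "a": [9, 1, 9],
    "b": [2, 8],
    "c": [5, 5, 0, 7],
}


def render(text):
    """Concatenate the pixel strips of every character in `text`."""
    pixels = []
    for ch in text:
        pixels.extend(TOY_FONT[ch])
    return pixels


def block_averages(pixels, block_size):
    """Average each run of `block_size` pixels (the last run may be shorter)."""
    averages = []
    for start in range(0, len(pixels), block_size):
        block = pixels[start:start + block_size]
        averages.append(sum(block) / len(block))
    return averages


def close(a, b, tol=1e-9):
    """True if two lists of block averages agree within `tol`."""
    return len(a) == len(b) and all(abs(x - y) <= tol for x, y in zip(a, b))


def search(target, total_pixels, block_size, prefix=""):
    """Depth-first search for every string whose pixelation equals `target`."""
    pixels = render(prefix)
    if len(pixels) > total_pixels:
        return []  # the guess is wider than the redacted region
    # Only blocks fully covered by the prefix can be checked reliably.
    full = len(pixels) // block_size
    if not close(block_averages(pixels, block_size)[:full], target[:full]):
        return []  # prune: this prefix already disagrees with the redaction
    if len(pixels) == total_pixels:
        matches = close(block_averages(pixels, block_size), target)
        return [prefix] if matches else []
    results = []
    for ch in TOY_FONT:
        results.extend(search(target, total_pixels, block_size, prefix + ch))
    return results


secret_pixels = render("cab")
target = block_averages(secret_pixels, 4)
recovered = search(target, len(secret_pixels), 4)
print(recovered)
```

Because each block depends only on the pixels inside it, a wrong prefix gets pruned as soon as one fully covered block disagrees, which keeps the search tree small.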

In our challenge text, you can see a few words right above the pixelated text that give us this information.

The Many Problems to Beating the Redaction

The key thing we’re focusing on is that the redaction process is inherently local.
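The locality of the redaction is easy to check on a toy example. The sketch below uses a made-up one-dimensional strip of pixel values and a hypothetical `changed_blocks` helper: it changes a single pixel and reports which block averages moved.

```python
def block_averages(pixels, block_size):
    """Average each run of `block_size` pixels."""
    averages = []
    for start in range(0, len(pixels), block_size):
        block = pixels[start:start + block_size]
        averages.append(sum(block) / len(block))
    return averages


def changed_blocks(before, after, block_size):
    """Indices of the blocks whose average differs between two images."""
    old = block_averages(before, block_size)
    new = block_averages(after, block_size)
    return [i for i, (x, y) in enumerate(zip(old, new)) if x != y]


original = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120]
tweaked = list(original)
tweaked[5] = 0  # change one pixel; it sits in block 1 (pixels 4 to 7)

print(changed_blocks(original, tweaked, 4))  # prints [1]
```

Only the block containing the modified pixel changes, and every other block is untouched.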

These are fairly reasonable assumptions, I would assert, since the attacker in a realistic scenario would likely have received a full report, with just one piece redacted out.

The algorithm is pretty simple: you divide up your image into a grid of a given block size (the example above is 8x8). Then, for each block, you set the redacted image’s color equal to the average color of the original for that same area.
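As a minimal sketch of that averaging step, assuming a grayscale image stored as a plain list of rows (the `pixelate` name and the data layout are illustrative choices, not taken from any particular tool):

```python
def pixelate(image, block_size):
    """Return a copy of `image` with each block replaced by its average value.

    `image` is a list of rows, each row a list of grayscale values.
    Blocks on the right and bottom edges may be smaller than `block_size`.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    out = [row[:] for row in image]
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            rows = range(top, min(top + block_size, height))
            cols = range(left, min(left + block_size, width))
            total = sum(image[y][x] for y in rows for x in cols)
            average = total / (len(rows) * len(cols))
            # Paint the whole block with the single averaged value.
            for y in rows:
                for x in cols:
                    out[y][x] = average
    return out


tiny = [
    [0, 2, 10, 10],
    [4, 6, 10, 10],
]
print(pixelate(tiny, 2))
```

A color image would apply the same averaging to each channel separately.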
