Python CRC-16 collision brute-forcer

How common are CRC-16 collisions? Let’s find out with some Python magic.

Library’s used: crcmod .

from itertools import product
import crcmod.predefined
from math import factorial
import string

# Init variables
found = False
currLen = 1
txtTarget = ""
crcTarget = 0

# Init functions
def crc(data):
        crcInstance = crcmod.predefined.Crc("crc-16")
        crcInstance.update(str(data).encode("utf-8", "replace"))
        return crcInstance.crcValue

def nCr(n,r):
    f = factorial
    return round(f(n) / f(r) / f(n-r))

# Get input
txtTarget = input("Enter string to find CRC16 collisions for: ")
crcTarget = crc(txtTarget)
print ("CRC of target: " + hex(crcTarget))

availableChars = input("""
Enter level of detail:
         1. Only lowercase letters  (abc)
         2. Letters  (ABc)
         3. Letters and numbers  (ABc123)
         4. Letters, numbers, punctuation and whitespace  (ABc123 .,)
""")
if availableChars == "1":
        availableChars = string.ascii_lowercase
elif availableChars == "2":
        availableChars = string.ascii_letters
elif availableChars == "3":
        availableChars = string.ascii_letters + string.digits
elif availableChars == "4":
        availableChars = string.printable
else:
        print ("Incorrect choice, detail set to 1")
        availableChars = string.ascii_lowercase

# Main loop
while True:

        # Get new combinations
        print ("\nGenerating CRC's of all combinations of " + str(currLen) + " characters...")
        print ("         (Total of " + str(nCr(len(availableChars),currLen)) + " combinations)")

        for possibleCombination in product(availableChars, repeat=currLen):
                currTxt = "".join(possibleCombination)
                currCrc = crc(currTxt)
                if currCrc == crcTarget:
                        found = True
                        if currTxt == txtTarget:
                                print ("         ORIGINAL FOUND (" + currTxt + ": " + hex(currCrc) + ")")
                        else:
                                print ("         COLLISION FOUND: " + currTxt + ": " + hex(currCrc))

        currLen = currLen + 1

        if found:
                if input("\nContinue searching? Y/N ").upper() == "Y":
                        found = False
                else:
                        break
        else:
                print ("         Nothing found.")

In the start of this script I generate the target CRC and define the available characters to create combinations with. Because Python 3.x uses unicode strings by default and the crc lib doesn’t like that, I have to convert it to an utf-8 byte array. (See highlighted line 15.)

After the initialization, I just loop through all available combinations and check their CRC’s for a match. When something is found, you can abort or continue searching for more.

Example results:

Enter string to find CRC16 collisions for: sample text
CRC of target: 0xb40d

Enter level of detail:
         1. Only lowercase letters               (abc)
         2. Letters                              (ABc)
         3. Letters and numbers                  (ABc123)
         4. Letters, numbers and special chars   (ABc123# %€)
1

Generating CRC's of all combinations of 1 characters...
         (Total of 26 combinations)
         Nothing found.

Generating CRC's of all combinations of 2 characters...
         (Total of 325 combinations)
         Nothing found.

Generating CRC's of all combinations of 3 characters...
         (Total of 2600 combinations)
         Nothing found.

Generating CRC's of all combinations of 4 characters...
         (Total of 14950 combinations)
         COLLISION FOUND: cgxj: 0xb40d
         COLLISION FOUND: ckxo: 0xb40d
         COLLISION FOUND: csxe: 0xb40d
         COLLISION FOUND: scti: 0xb40d
         COLLISION FOUND: sotl: 0xb40d
         COLLISION FOUND: swtf: 0xb40d
         COLLISION FOUND: wgwj: 0xb40d
         COLLISION FOUND: wkwo: 0xb40d
         COLLISION FOUND: wswe: 0xb40d

Continue searching? Y/N N

So that’s 9 collisions in 17901 combinations in ~5 minutes.