Tag Archives: Python

Python Downloaded File Helper

I still see way too many people who don’t know the difference between a .exe and a .jpeg. I mean, you downloaded that cute cat picture, right? You only get viruses by torrenting illegal stuff, right?

NOPE. These so-called ‘extensions’ tell Windows what type a file is and how to open it. A .jpeg is an image and is opened by an image viewer. A .exe is an executable file and is run directly by Windows.

If a download that should not be an executable turns out to be a .exe, then you’re probably dealing with something dark and scary. But how can you prevent these kinds of accidents from happening?

I made a Python script which monitors your Downloads folder for new files and checks their extensions.

import os
import win32file
import win32con
from os.path import expanduser

from tkinter import *
from tkinter import messagebox

root = Tk().withdraw()

# Extensions that should practically never arrive as a download
neverRun = {
	".BAT" : "Interpreted script",
	".CMD" : "Interpreted script",
	".COM" : "MS-DOS executable",
	".CPL" : "Control Panel executable",
	".PIF" : "Link to MS-DOS executable which can contain executable code",
	".APPLICATION" : "MS ClickOnce executable",
	".MSI" : "Application installer",
	".MSP" : "Application patcher",
	".SCR" : "Screensaver, rename of .EXE",
	".HTA" : "Unsandboxed browser executable",
	".MSC" : "Management Console executable",
	".VB" : "Interpreted script",
	".VBS" : "Interpreted script",
	".VBE" : "Encrypted script",
	".JS" : "Unsandboxed Javascript",
	".JSE" : "Unsandboxed encrypted Javascript",
	".WS" : "Interpreted script",
	".WSF" : "Interpreted script",
	".WSC" : "WS component",
	".WSH" : "WS host control",
	".PS1" : "Interpreted script",
	".PS1XML" : "Interpreted script",
	".PS2" : "Interpreted script",
	".PS2XML" : "Interpreted script",
	".PSC1" : "Interpreted script",
	".PSC2" : "Interpreted script",
	".MSH" : "Interpreted script",
	".MSH1" : "Interpreted script",
	".MSH2" : "Interpreted script",
	".MSHXML" : "Interpreted script",
	".MSH1XML" : "Interpreted script",
	".MSH2XML" : "Interpreted script",
	".SCF" : "Explorer shortcut, can contain malicious arguments",
	".LNK" : "Link to an executable, can execute",
	".INF" : "Autorun script, can execute",
	".REG" : "Registry changing file"
}

# Common executable types that are often legitimate, but still worth a second look
possibleDanger = {
	".EXE" : "Most used executable",
	".GADGET" : "Executable, installed as Windows Gadget",
	".APPLICATION" : "MS ClickOnce executable",
	".MSI" : "Application installer",
	".MSP" : "Application patcher",
	".SCR" : "Screensaver, rename of .EXE"
}

#  1 : "Created",
#  2 : "Deleted",
#  3 : "Updated",
#  4 : "Renamed from something",
#  5 : "Renamed to something"

path_to_watch = os.path.join(expanduser("~"), "Downloads")
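# Access right for listing a directory's contents
FILE_LIST_DIRECTORY = 0x0001
# Open a handle to the folder itself (FILE_FLAG_BACKUP_SEMANTICS is required for directories)
hDir = win32file.CreateFile (
  path_to_watch,
  FILE_LIST_DIRECTORY,
  win32con.FILE_SHARE_READ | win32con.FILE_SHARE_WRITE | win32con.FILE_SHARE_DELETE,
  None,
  win32con.OPEN_EXISTING,
  win32con.FILE_FLAG_BACKUP_SEMANTICS,
  None
)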

messagebox.showinfo(message = "Now monitoring " + path_to_watch + " in the background. Have fun.", title = "Downloaded File Helper running!")

while True:
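	# Blocks until a file name changes inside the watched folder
	results = win32file.ReadDirectoryChangesW (
		hDir,
		1024,   # size of the buffer for the returned change records
		False,  # do not watch subfolders
		win32con.FILE_NOTIFY_CHANGE_FILE_NAME,
		None,
		None
	)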
	for action, file in results:
		if action == 1:
			full_filename = os.path.join(path_to_watch, file)
			_, file_extension = os.path.splitext(file)
			file_extension = file_extension.upper()
			if file_extension in neverRun:
				# tkinter's messagebox cannot rename the OK button, so only message and title are passed
				messagebox.showwarning(message = "YOU DOWNLOADED AN EXECUTABLE FILE:\n" +
					full_filename + " : " + neverRun[file_extension] +
					"\nDO NOT run this file if you're not 100% certain why you need this and what it does.",
					title = "High risk download")
			elif file_extension in possibleDanger:
				messagebox.showwarning(message = "YOU DOWNLOADED AN EXECUTABLE FILE:\n" +
					full_filename + " : " + possibleDanger[file_extension] +
					"\nThis filetype is really common, so there's a good chance nothing is wrong.\nJust check if what you're doing needs anything to be executed or installed.\n99% of the malware is shipped as .exe, so still be careful.",
					title = "Possible risk download")

I use the win32 API to monitor the folder for changes. The extensions I marked as dangerous only include the ones that will run without any other software. For example, ‘.jar’ will run when Java is installed, but it is not listed.
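The extension check at the heart of the script can be pulled out into a small function that is easy to try without Windows or a real download. This is only a minimal sketch: `classify` is a hypothetical helper that is not part of the script, and the two dictionaries are shortened stand-ins for the tables above. It also shows why only the last extension counts: a file called ‘cat.jpeg.exe’ is still an executable.

```python
import os

# Hypothetical, shortened stand-ins for the neverRun / possibleDanger tables
NEVER_RUN = {".SCR": "Screensaver, rename of .EXE", ".VBS": "Interpreted script"}
POSSIBLE_DANGER = {".EXE": "Most used executable"}

def classify(filename):
    """Return (risk, description) for a file name, based on its last extension."""
    _, extension = os.path.splitext(filename)
    extension = extension.upper()
    if extension in NEVER_RUN:
        return "high", NEVER_RUN[extension]
    if extension in POSSIBLE_DANGER:
        return "possible", POSSIBLE_DANGER[extension]
    return "none", ""

for name in ("cat.jpeg", "cat.jpeg.exe", "invoice.vbs"):
    print(name, "->", classify(name))
```

Keeping the lookup separate from the message boxes makes it possible to test new extensions before adding them to the tables.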

Download a compiled version HERE. Note that this file is a .exe! I created a self-extracting archive with 7-Zip to save some space.

HOW TO: Python libraries

This little tutorial is for anyone who is new to Python and doesn’t know how to install libraries yet.

The easiest way to install a new Python library is to use pip.

Open a command prompt, and enter

pip install package_name

… to install a new library.

To update one, use the ‘--upgrade’ option:

pip install package_name --upgrade

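When any errors occur, try updating pip itself. On Windows it is safest to run pip through the interpreter, so that it can replace its own executable:

python -m pip install --upgrade pip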

If a library isn’t available through pip, you have to install it yourself.

Download the library and extract it, then open a command prompt in your new directory (shift + right click inside the folder -> Open command prompt here). Install it by running:

python setup.py install

If there is no ‘setup.py’ script available, you could try to just copy the library into your Python site-packages folder. It is most commonly located at C:\Python3x\Lib\site-packages.
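If you are not sure where site-packages lives on your machine, you can ask Python itself. A minimal sketch using the standard `sysconfig` module; the variable name `site_packages` is just my own choice:

```python
import sys
import sysconfig

# "purelib" is the install location for pure-Python packages
site_packages = sysconfig.get_paths()["purelib"]

print("Interpreter:  ", sys.executable)
print("site-packages:", site_packages)
```

Run it with the same interpreter you want to install the library for, because every Python installation has its own folder.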

Python script to single .exe

It’s plain annoying when you want to show someone your amazing Python script but they don’t have Python (and all of your libraries) installed. It’d be nice to just bundle everything in one executable, wouldn’t it?

Download py2exe!

When installed, open a command prompt and enter:

build_exe "the_name_of_my_script.py" --bundle-files 0 -O -c

…which will gather all needed libraries, optimize them (the -O option), compress them (-c) and bundle them into one big file inside a folder called ‘dist’.

Python HashMark

Because I recently made a CRC-16 brute forcer, I wanted to know how fast Python really is when it comes to hashing. So I created a little benchmarking script which tries to create as many hashes as possible in a set amount of time, for all available algorithms.

Libraries used: tabulate (to print the results in a nice readable way)

import time
import hashlib
import sys
from tabulate import tabulate

deltaTime = 0.1 # Seconds

hashPerSecond = dict()
algorithms = hashlib.algorithms_available

filteredAlgorithms = list()
seen = set()
seen.add("SHA") # Weird undocumented SHA algorithm
for algorithm in algorithms:
	algorithmNormalized = algorithm.upper()
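	if algorithmNormalized not in seen:
		# First time this name shows up (ignoring case): keep it
		seen.add(algorithmNormalized)
		filteredAlgorithms.append(algorithm)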

print ("HASHING BENCHMARK\n{0} available hashing algorithms will be tested for {1} seconds each".format(str(len(filteredAlgorithms)), str(deltaTime)))

input("Press enter to start >")

for algorithm in filteredAlgorithms:
	counter = 0
	startTime = time.time()
	endTime = startTime + deltaTime

	print ("Testing {0}".format(algorithm.upper()))

	while (time.time() < endTime):
		h = hashlib.new(algorithm)
		h.update(b"Hello World!")
		counter += 1

	hashPerSecond[algorithm] = round(counter/deltaTime)

algorithmsSorted = sorted(hashPerSecond, key=lambda key: hashPerSecond[key], reverse=True)

output = list()
for algorithm in algorithmsSorted:
	line = [algorithm.upper(), str(hashPerSecond[algorithm])]
	output.append(line)

headers = ["Algorithm", "Hashes per second"]

print (tabulate(output, headers, tablefmt="psql"))

As you can see, first I get all of the available algorithms. Because there are duplicates in the list (the same name in different casing), I filter the list. I have also removed an obscure ‘SHA’ algorithm for which I couldn’t find any documentation and which doesn’t give the same results as SHA1.

Then I iterate through the rest and let each one generate as many hashes as it can in ‘deltaTime’ seconds. I set it to a low value because setting it to higher values didn’t make much of a difference: the hashes per second stayed roughly the same.

When done, I sort the results and generate some output lines to feed to tabulate.
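One thing to keep in mind: the loop above never calls digest(). As far as I can tell, update() on such a short message mostly just buffers the bytes, and the real work happens when the hash is finalized. Here is a hedged sketch of a variant that finalizes every hash. The function name `hashes_per_second` and the use of `hashlib.algorithms_guaranteed` are my own choices, not part of the script above.

```python
import hashlib
import time

def hashes_per_second(algorithm, duration=0.1, data=b"Hello World!"):
    """Count complete hashes (new + update + digest) finished in `duration` seconds."""
    counter = 0
    end_time = time.time() + duration
    while time.time() < end_time:
        h = hashlib.new(algorithm)
        h.update(data)
        h.digest()  # finalize, so the whole hash is really computed
        counter += 1
    return round(counter / duration)

# Assumption: only the guaranteed algorithms are tested. SHAKE is skipped
# because its digest() needs an explicit output length.
names = sorted(n for n in hashlib.algorithms_guaranteed if not n.startswith("shake"))

for name in names:
    try:
        rate = hashes_per_second(name, duration=0.05)
    except ValueError:
        continue  # algorithm disabled in this Python build
    print("{0:<12} {1:>10}".format(name.upper(), rate))
```

The numbers will differ from the table below, since more work is done per iteration and the set of algorithms is not the same.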

Example results:

14 available hashing algorithms will be tested for 0.1 seconds each
Press enter to start >
Testing RIPEMD160
Testing SHA1
Testing SHA224
Testing DSA
Testing MD5
Testing MD4
Testing SHA256
Testing DSA-SHA
Testing SHA512
Testing SHA384
| Algorithm       |   Hashes per second |
| MD5             |              453440 |
| MD4             |              449250 |
| DSA             |              433300 |
| SHA1            |              429570 |
| DSA-SHA         |              425580 |
| DSAWITHSHA      |              424790 |
| DSAENCRYPTION   |              422710 |
| ECDSA-WITH-SHA1 |              421060 |
| SHA256          |              404070 |
| SHA224          |              401780 |
| RIPEMD160       |              396920 |
| WHIRLPOOL       |              360560 |
| SHA384          |              349780 |
| SHA512          |              349200 |

So. Not that fast, really.

Let’s try again using PyPy – because it should be way quicker, right?

C:\Users\PiPro\pypy3-2.4.0-win32>pypy.exe "HashMark.py"
Traceback (most recent call last):
  File "HashMark.py", line 4, in <module>
    from tabulate import tabulate
ImportError: No module named tabulate

But after making tabulate available to PyPy (see the PyPy post below for how to copy libraries) …

14 available hashing algorithms will be tested for 0.1 seconds each
Press enter to start >
Testing SHA1
Testing SHA224
Testing SHA384
Testing SHA256
Testing SHA512
Testing MD4
Testing MD5
Testing DSA-SHA
Testing DSA
Testing RIPEMD160
| Algorithm       |   Hashes per second |
| DSA-SHA         |              144060 |
| MD4             |              142910 |
| SHA256          |              142660 |
| ECDSA-WITH-SHA1 |              142200 |
| DSAENCRYPTION   |              141560 |
| DSAWITHSHA      |              140800 |
| RIPEMD160       |              140050 |
| SHA1            |              137250 |
| MD5             |              133770 |
| DSA             |              129750 |
| WHIRLPOOL       |              126450 |
| SHA512          |              123780 |
| SHA384          |              119360 |
| SHA224          |              116380 |

Apparently that’s not the case. MD5 (the fastest under regular Python) is ~3.4 times slower under PyPy, and even PyPy’s fastest result, DSA-SHA, is ~3 times slower than the same algorithm under regular Python.


Python Java update checker

Keeping Java up-to-date isn’t always straightforward. Especially if you’ve disabled the automatic update checker because it eats up your RAM.

import subprocess, time, sys
from urllib.request import urlopen

proc = subprocess.Popen("java -version", shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, stdin=subprocess.PIPE)
localVersion = (proc.stdout.read() + proc.stderr.read()).decode("utf-8").split('"')[1]

print ("Your local Java version is {0}.".format(localVersion))

print ("Checking online version...")
try:
	# The text file contains only the version number plus a trailing line break
	tmp = urlopen("http://www.java.com/applet/JreCurrentVersion2.txt").read().decode("utf-8")
	latestVersion = tmp.strip()
	print ("Java version online: " + latestVersion)

	if not localVersion == latestVersion:
		print ("\n     There is a newer version available: {0} .".format(latestVersion))
		if input("     Do you want to open the webpage? Y/N ").upper() == "Y":
			print ("     Opening the webpage...")
			subprocess.Popen("start http://java.com/nl/download/manual.jsp", shell=True)
	else:
		print ("\n     You are up-to-date!")

except Exception as e:
	print ("\nAn error occurred: {0}.\nPlease check your internet connection.".format(str(e)))


I’m using subprocess.Popen to execute command-line commands. First I read the installed Java version, then I grab the current version from this page and compare them. Because both the subprocess pipes and urlopen return bytes in Python 3, I have to decode everything I read.
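The script treats any difference between the two strings as ‘newer version available’, even when the local version is the newer one. A possible refinement is to compare the numbers instead. This is a minimal sketch that assumes version strings shaped like ‘1.8.0_25’; `parse_java_version` and `is_outdated` are hypothetical helpers, not part of the script.

```python
import re

def parse_java_version(version):
    """Turn a version string such as '1.8.0_25' into a tuple of ints: (1, 8, 0, 25)."""
    return tuple(int(part) for part in re.findall(r"\d+", version))

def is_outdated(local_version, latest_version):
    """True only if the local version is strictly older than the latest one."""
    return parse_java_version(local_version) < parse_java_version(latest_version)

print(is_outdated("1.8.0_25", "1.8.0_31"))
print(is_outdated("1.8.0_31", "1.8.0_31"))
print(is_outdated("1.8.0_9", "1.8.0_25"))
```

Comparing tuples of integers avoids the trap of plain string comparison, where ‘1.8.0_9’ would sort after ‘1.8.0_25’.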

PyPy – Extremely fast Python

When using Python you may want some more speed sometimes. Maybe when brute-forcing CRCs and such (hehe).

PyPy comes with a Just-In-Time compiler, which can make pure-Python code run way faster than under the standard CPython interpreter.


  1. Download and extract PyPy
  2. Open cmd and cd into your PyPy directory
  3. Use pypy.exe just like you’d use python.exe .
  4. Copy any needed libraries from your main Python site-packages folder (e.g. C:\Python34\Lib\site-packages\) to the PyPy site-packages folder (e.g. C:\pypy3-2.4.0-win32\site-packages\).
  5. ??
  6. Profit
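To see the difference for yourself, run the same file once with python.exe and once with pypy.exe. The following is only a sketch with a made-up workload (`busy_loop` is a hypothetical function, not my CRC script); it prints which interpreter is running and how long a simple pure-Python loop took.

```python
import platform
import time

def busy_loop(n):
    """A deliberately simple pure-Python loop: the sum of all squares below n."""
    total = 0
    for i in range(n):
        total += i * i
    return total

start = time.time()
result = busy_loop(1000000)
elapsed = time.time() - start

print("{0} {1}: {2:.3f} s".format(
    platform.python_implementation(), platform.python_version(), elapsed))
```

Loops like this are where a JIT tends to help; code that spends its time inside C libraries may not get faster at all.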

My CRC brute forcer suddenly finished within 30 seconds instead of 5 minutes.
I ran it again, but this time with ‘detail-mode’ set to 4 so it’d use ‘Letters, numbers, punctuation and whitespace’. This brings the total number of combinations the script reports up to a whopping 166750, which is ~9 times more than on mode 1.

Enter string to find CRC16 collisions for: sample text
CRC of target: 0xb40d

Enter level of detail:
         1. Only lowercase letters  (abc)
         2. Letters  (ABc)
         3. Letters and numbers  (ABc123)
         4. Letters, numbers, punctuation and whitespace  (ABc123 .,)

Generating CRC's of all combinations of 1 characters...
         (Total of 100 combinations)
         Nothing found.

Generating CRC's of all combinations of 2 characters...
         (Total of 4950 combinations)
         Nothing found.

Generating CRC's of all combinations of 3 characters...
         (Total of 161700 combinations)
         COLLISION FOUND: 3Qe: 0xb40d
         COLLISION FOUND: cQY: 0xb40d
         COLLISION FOUND: oQ\: 0xb40d
         COLLISION FOUND: wQV: 0xb40d
         COLLISION FOUND: GQB: 0xb40d
         COLLISION FOUND: KQG: 0xb40d
         COLLISION FOUND: SQM: 0xb40d
         COLLISION FOUND: 'Qj: 0xb40d
         COLLISION FOUND: +Qo: 0xb40d
         COLLISION FOUND: ?Q`: 0xb40d
         COLLISION FOUND: _QH: 0xb40d
         COLLISION FOUND: {QS: 0xb40d

Continue searching? Y/N N

It finished in about 2 minutes. Amazing.

Python CRC-16 collision brute-forcer

How common are CRC-16 collisions? Let’s find out with some Python magic.

Libraries used: crcmod.

from itertools import product
import crcmod.predefined
from math import factorial
import string

# Init variables
found = False
currLen = 1
txtTarget = ""
crcTarget = 0

# Init functions
def crc(data):
        crcInstance = crcmod.predefined.Crc("crc-16")
        crcInstance.update(str(data).encode("utf-8", "replace"))
        return crcInstance.crcValue

def nCr(n,r):
    f = factorial
    return round(f(n) / f(r) / f(n-r))

# Get input
txtTarget = input("Enter string to find CRC16 collisions for: ")
crcTarget = crc(txtTarget)
print ("CRC of target: " + hex(crcTarget))

availableChars = input("""
Enter level of detail:
         1. Only lowercase letters  (abc)
         2. Letters  (ABc)
         3. Letters and numbers  (ABc123)
         4. Letters, numbers, punctuation and whitespace  (ABc123 .,)
""")
if availableChars == "1":
        availableChars = string.ascii_lowercase
elif availableChars == "2":
        availableChars = string.ascii_letters
elif availableChars == "3":
        availableChars = string.ascii_letters + string.digits
elif availableChars == "4":
        availableChars = string.printable
else:
        print ("Incorrect choice, detail set to 1")
        availableChars = string.ascii_lowercase

# Main loop
while True:

        # Get new combinations
        print ("\nGenerating CRC's of all combinations of " + str(currLen) + " characters...")
        print ("         (Total of " + str(nCr(len(availableChars),currLen)) + " combinations)")

        for possibleCombination in product(availableChars, repeat=currLen):
                currTxt = "".join(possibleCombination)
                currCrc = crc(currTxt)
                if currCrc == crcTarget:
                        found = True
                        if currTxt == txtTarget:
                                print ("         ORIGINAL FOUND (" + currTxt + ": " + hex(currCrc) + ")")
                        else:
                                print ("         COLLISION FOUND: " + currTxt + ": " + hex(currCrc))

        currLen = currLen + 1

        if found:
                if input("\nContinue searching? Y/N ").upper() == "Y":
                        found = False
                else:
                        # Stop once the user has seen enough
                        break
        else:
                print ("         Nothing found.")

At the start of this script I generate the target CRC and define the available characters to create combinations with. Because Python 3.x uses unicode strings by default and the crc lib doesn’t like that, I have to convert the text to a UTF-8 byte string first. (See the crcInstance.update line in the crc function.)
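For the curious, this is roughly what the crc helper computes. It is a sketch under the assumption that crcmod’s predefined "crc-16" is the common CRC-16/ARC variant (reflected polynomial 0xA001, initial value 0); `crc16_arc` is a hypothetical name and not part of crcmod. It works on bytes, which is why the text has to be encoded first.

```python
def crc16_arc(data):
    """Bit-by-bit CRC-16/ARC over a bytes object (assumed equal to crcmod's "crc-16")."""
    crc = 0x0000
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                # Lowest bit set: shift right and apply the reflected polynomial
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc

# "123456789" is the conventional check input for CRC algorithms
print(hex(crc16_arc("123456789".encode("utf-8"))))
```

A table-driven version like the one in crcmod is much faster, but the bit-by-bit loop makes it easier to see that the result can only ever take 65536 different values.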

After the initialization, I just loop through all available combinations and check their CRCs for a match. When something is found, you can abort or continue searching for more.

Example results:

Enter string to find CRC16 collisions for: sample text
CRC of target: 0xb40d

Enter level of detail:
         1. Only lowercase letters               (abc)
         2. Letters                              (ABc)
         3. Letters and numbers                  (ABc123)
         4. Letters, numbers and special chars   (ABc123# %€)

Generating CRC's of all combinations of 1 characters...
         (Total of 26 combinations)
         Nothing found.

Generating CRC's of all combinations of 2 characters...
         (Total of 325 combinations)
         Nothing found.

Generating CRC's of all combinations of 3 characters...
         (Total of 2600 combinations)
         Nothing found.

Generating CRC's of all combinations of 4 characters...
         (Total of 14950 combinations)
         COLLISION FOUND: cgxj: 0xb40d
         COLLISION FOUND: ckxo: 0xb40d
         COLLISION FOUND: csxe: 0xb40d
         COLLISION FOUND: scti: 0xb40d
         COLLISION FOUND: sotl: 0xb40d
         COLLISION FOUND: swtf: 0xb40d
         COLLISION FOUND: wgwj: 0xb40d
         COLLISION FOUND: wkwo: 0xb40d
         COLLISION FOUND: wswe: 0xb40d

Continue searching? Y/N N

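So that’s 9 collisions in ~5 minutes. The 17901 combinations the script reports (26 + 325 + 2600 + 14950) are an undercount, though: itertools.product also yields strings with repeated characters, in every order, so the loop actually tested 26 + 26^2 + 26^3 + 26^4 = 475254 strings.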
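The totals the script prints come from nCr, while the loop itself runs over itertools.product, and those are two different counts. A small sketch comparing them for detail level 1; the variable names are my own, and the collision estimate assumes CRC values are spread evenly over all 16-bit numbers.

```python
from itertools import product
from math import comb
import string

chars = string.ascii_lowercase  # detail level 1: 26 characters

for length in range(1, 5):
    reported = comb(len(chars), length)  # what nCr() prints
    tested = len(chars) ** length        # what product(..., repeat=length) yields
    print(length, reported, tested)

# Sanity check on a small case: product really yields n ** r strings
pairs = sum(1 for _ in product(chars, repeat=2))
print("Pairs generated:", pairs)

total_tested = sum(len(chars) ** length for length in range(1, 5))
print("Strings tested in total:", total_tested)
print("Expected collisions for a 16-bit CRC:", round(total_tested / 2 ** 16, 1))
```

With roughly seven collisions expected under that assumption, finding 9 is about what you would hope to see.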