October 30, 2013
In my last post, I wrote a small Python script called domain_finder.py, which finds unregistered .com domain names. It worked pretty well, so I decided to move it into a folder where I archive my small projects.
After moving it, I ran it again just for fun. But this time, it yielded a series of numbers before printing its real output:
    $ ./domain_finder.py
    -102 -100 1 4 5 6 8 101 103
    monial.com
    medinid.com <-- Available
    pluvious.com
    caribbed.com <-- Available
    infang.com
    ...
Where were those numbers coming from? I scanned the script for print statements:
    domain = word + ".com"
    if available(domain):
        print domain + " <-- Available"
    else:
        print domain
It clearly wasn’t those. So if I wasn’t printing those numbers, who was? I took a look at my imports:
    from collections import defaultdict
    from random import random, choice
    from string import ascii_lowercase
    from subprocess import Popen, PIPE
    from time import time, sleep
Those are all from the standard library—they should just work, no matter what, right? But I knew it had to be one of them. I started removing them one by one. The numbers disappeared when I commented this one out:
    from collections import defaultdict
    from random import random, choice
    from string import ascii_lowercase
    # from subprocess import Popen, PIPE
    from time import time, sleep
subprocess was printing those numbers? Weird. That’s a widely used Python library; surely people would have complained about this before me. Something wasn’t right. I opened a Python shell and tested it out:
    >>> import subprocess
    -102 -100 1 4 5 6 8 101 103
Okay, at least there was nothing wrong with my script. So what’s subprocess doing? Is my installation of Python broken? I Googled around, but nothing came up. Why would moving the script cause the program to change behavior? I looked in the directory for clues.
    $ ls
    domain_finder.py  select.py  merger.py
There were two other files, select.py (my linear-time implementation of the k-select algorithm) and merger.py (an algorithm I invented for a school project). It then dawned on me: select is a system call that is used to wait on file descriptors, and there is a corresponding Python module of the same name. I bet it is imported by subprocess, which would mean my select.py was being loaded in its place and its output printed on import.
To test my hypothesis, I renamed select.py to kselect.py. Sure enough, that fixed it.
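
What happens here is that Python puts the directory of the script being run at the front of sys.path, so a local file wins over a standard-library module of the same name. Below is a minimal sketch that reproduces the effect in a throwaway directory. It assumes Python 3 (the script in this post was Python 2), and the names run_with_local_module, main.py, and the local colorsys.py are made up for the demonstration. It shadows colorsys rather than select because, as far as I know, select can be compiled into the interpreter on some builds, where a local file would not take its place.

```python
import os
import subprocess
import sys
import tempfile


def run_with_local_module(module_name):
    """Run a script that sits next to a file named <module_name>.py.

    Returns the lines the child interpreter printed: first whatever the
    local file prints on import, then the path the module was loaded from.
    """
    with tempfile.TemporaryDirectory() as workdir:
        # A local file that happens to share its name with a stdlib module.
        local_file = os.path.join(workdir, module_name + ".py")
        with open(local_file, "w") as f:
            f.write("print('hello from the local file')\n")

        # A script in the same directory that imports the module by name
        # and reports which file the import resolved to.
        script = os.path.join(workdir, "main.py")
        with open(script, "w") as f:
            f.write(f"import {module_name}\nprint({module_name}.__file__)\n")

        # PYTHONSAFEPATH (Python 3.11+) keeps the script directory off
        # sys.path, so drop it to get the default behavior.
        env = {k: v for k, v in os.environ.items() if k != "PYTHONSAFEPATH"}
        result = subprocess.run(
            [sys.executable, script],
            capture_output=True,
            text=True,
            check=True,
            env=env,
        )
        return result.stdout.splitlines()


if __name__ == "__main__":
    for line in run_with_local_module("colorsys"):
        print(line)
```

On a default setup, the second line printed should point into the temporary directory, not into the standard library. Printing a module's __file__ like this is also a quick way to check where a suspicious import really came from.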
But this didn’t sit well with me. I never directly imported select, and I only used modules in the standard library. My program should work anywhere. When you use the subprocess module, there is an unwritten rule that you are not allowed to have a file named select.py. Maybe I should have just not been stupid enough to name a file select.py, but does that mean I’m supposed to know the name of every module in the standard library before naming a script?