October 31, 2013
Note: All of the domain names mentioned below were unregistered at the time of this writing.
I recently wrote a script to find unregistered .com domains (see this post for details). It builds a simple statistical model of the English lexicon and generates word-like strings according to that model. I let the program run for a few hours, and it generated thousands of unregistered domain names. The results were unexpectedly entertaining.
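The model itself is only described here as "a simple statistical model of the English lexicon," so the following is just a minimal sketch of one way such a generator could work, assuming a character-level Markov chain. The function names `train` and `generate`, the `order` parameter, and the tiny word list are all hypothetical stand-ins, and the step that checks whether a .com is actually unregistered is left out entirely.

```python
import random
from collections import defaultdict

START, END = "^", "$"  # hypothetical padding symbols marking word boundaries

def train(words, order=2):
    """Record which character follows each `order`-character context."""
    model = defaultdict(list)
    for word in words:
        padded = START * order + word.lower() + END
        for i in range(len(padded) - order):
            context = padded[i:i + order]
            model[context].append(padded[i + order])
    return model

def generate(model, order=2, max_len=15, rng=random):
    """Sample one word-like string, one character at a time."""
    context = START * order
    out = []
    while len(out) < max_len:
        nxt = rng.choice(model[context])
        if nxt == END:
            break
        out.append(nxt)
        context = (context + nxt)[-order:]
    return "".join(out)

# Toy stand-in for a real dictionary file (an assumption for illustration).
sample_words = ["speedboat", "dewdrop", "honeysweet", "milkmaid", "shipwreck"]
model = train(sample_words)
rng = random.Random(0)
for _ in range(3):
    print(generate(model, rng=rng) + ".com")
```

With a full dictionary in place of `sample_words`, a higher `order` tends to give more word-like output at the cost of reproducing real words more often.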
Here are a few of the ones I especially liked:
antispicy.com brideweed.com dewdropping.com evilship.com honeysweetways.com milkgrower.com neckmold.com outjinx.com piewipe.com shakespearmaid.com speedboatman.com subpubic.com weirdlike.com
Could be real websites
The following domains seem to me like they could be actual startups, although I don’t know what these startups would do. Another description for this category is “domains that I think are worth more than their list price.”
blamework.com bordable.com cojustify.com factioner.com fribbly.com goosebeak.com harmproof.com hairsplice.com jawfall.com lowmeter.com mistag.com parabold.com pipewalk.com polycook.com punnable.com ruinproof.com songfulness.com stoplifted.com tipmost.com uglified.com voiled.com yardkeep.com
Found in a textbook
I must have trained the model on a rather technical dictionary, because many of the domains it produced sound like they came straight out of a textbook.
ganglionlike.com isoporic.com monozygous.com myotropism.com ozonomer.com phonozoa.com postasis.com protonometer.com scabiosis.com
Could be real words
Many of the generated names look like they could be actual words, though they are not.
embowel.com introscopy.com isolatry.com pederate.com sejunctive.com varicate.com
Signal vs. noise
Some of these domains are better than others. For example, overfamed.com is probably worth more than subcollembolize.com. I would like to train a classifier, such as a support vector machine, on some hand-labeled examples and use it to filter out the bad names. What features would we give this classifier?
- Number of characters
- Number of vowels
- Number of Google search results
- Number of repeated characters
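As a rough sketch of how the first, second, and fourth features might be computed, here is a small helper. The name `name_features` is hypothetical, and "repeated characters" is interpreted here as the number of characters that duplicate an earlier one, which is only one possible reading. The Google search count is omitted because it needs an external lookup.

```python
def name_features(domain):
    """Return simple numeric features for a domain such as 'overfamed.com'."""
    name = domain.lower().split(".")[0]  # drop the TLD
    return {
        "n_chars": len(name),
        "n_vowels": sum(ch in "aeiou" for ch in name),
        # Characters that duplicate an earlier character (one interpretation).
        "n_repeated": len(name) - len(set(name)),
    }

for d in ["overfamed.com", "subcollembolize.com"]:
    print(d, name_features(d))
```

Feature dictionaries like these could then be turned into vectors and passed, along with the hand-labeled examples, to whichever classifier is chosen.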
Hold on—I have an idea for a research project. Stay tuned!