## Generating domain names with Markov chains

October 29, 2013

Coming up with good product names is a weakness of mine, and finding names that are available as *.com* domains is especially difficult. I wrote a small Python script that trains an n-gram Markov chain on the English lexicon, generates word-like strings according to that model, and checks if the strings are available as *.com* domains. Here is some sample output:

```
$ ./domain_finder.py
sibility.com
caliology.com
mandarch.com
hopefulness.com
spinical.com <-- Available!
opinate.com
motion.com
synous.com
torrency.com <-- Available!
mullock.com
genetic.com
apophysis.com
intrated.com <-- Available!
calycule.com <-- Available!
amido.com
swiveless.com <-- Available!
bajury.com <-- Available!
tendother.com <-- Available!
embole.com
argasia.com
stroot.com
...
```

The code uses the `whois` command and the `/usr/share/dict/words` wordlist (both are available by default on Linux and Mac OS X installations).

```python
#!/usr/bin/python -O

from collections import defaultdict
from random import random, choice
from string import ascii_lowercase
from subprocess import Popen, PIPE
from time import time, sleep

# get a list of words with only ASCII characters
words = [w.strip().lower() for w in open("/usr/share/dict/words").readlines()]
words = [w for w in words if all([c in ascii_lowercase for c in w])]
words = ["^" + w + "$" for w in words if w != ""]

# construct a discrete-time markov chain of n-grams
n = 5  # this is the "n" in n-grams, try adjusting this for different results
transitions = defaultdict(lambda: defaultdict(float))
for word in words:
    if len(word) >= n:
        transitions[""][word[:n]] += 1.0
        for i in range(len(word) - n):
            gram = word[i : i + n]
            next = word[i + 1 : i + n + 1]
            transitions[gram][next] += 1.0

# normalize the probabilities
for gram in transitions:
    total = sum([transitions[gram][next] for next in transitions[gram]])
    for next in transitions[gram]:
        transitions[gram][next] /= total

# sample a probability mass function (dict from elements to probabilities)
def sample(pmf):
    sample = random()
    cdf = 0.0
    for e in pmf:
        cdf += pmf[e]
        if cdf >= sample:
            return e
    return choice(pmf.keys())

# generate a word according to the markov chain
def gen_word():
    # start with a prefix
    word = sample(transitions[""])
    # wait until the markov chain adds a terminator to the word
    while word[-1] != "$":
        # append a new letter chosen according to the markov chain
        gram = word[-n:]
        if gram in transitions:
            word += sample(transitions[gram])[-1:]
        else:
            word += choice(ascii_lowercase + "$")
        # optional: allow multi-word domains
        if word[-1] == "$" and random() > 0.7 and len(word) < 8:
            word += sample(transitions[""])
    # remove the boundary markers and return the word
    return word.replace("^", "").replace("$", "")

# check whether a domain is available (e.g., "example.com")
# returns a bool indicating if the domain is available or None on timeout
def available(domain):
    # use the "whois" command to determine availability
    process = Popen(["whois", domain], stdout=PIPE, stderr=PIPE)
    end_time = time() + 4.0  # timeout after 4 seconds
    while time() < end_time:
        if process.poll() is not None:
            return "No match for" in process.stdout.read()
        sleep(0.1)
    try:
        process.kill()
    except:
        pass
    return None

# generate domain names forever
while True:
    # generate a few words and pick the smallest
    word = sorted([gen_word() for i in range(3)], key=lambda x: len(x))[0]
    # report whether the domain is available
    domain = word + ".com"
    if available(domain):
        print domain + " <-- Available"
    else:
        print domain
```

As always, feel free to use this for anything. See the follow-up post.

## Region-based memory management

October 19, 2013

If I were to design a programming language, it probably would be very different from mainstream languages. Every day new wacky ideas about programming languages come to me—it’s an unhealthy obsession. Here’s what I was thinking about today.

### Manual memory management

Before I spill the beans, let me give you some context. In the beginning, there was manual memory management, like in this C example:

```c
// allocate a kilobyte of memory
char* buffer = malloc(1024);
...
// free the memory
free(buffer);
```

For every call to `malloc`, you need a corresponding call to `free` to release the memory back to the operating system. This is an annoying requirement, and sometimes software engineers forget to call `free`, resulting in memory leaks.

### Garbage collection

So a famous computer scientist named John McCarthy invented the garbage collector, a runtime system that periodically frees memory you are no longer using so you don't have to. For example, here is what the example above would look like in Java, a programming language with a garbage collector:

```java
// allocate a kilobyte of memory
byte[] buffer = new byte[1024];
...
// the memory is automatically released
```

Notice we didn’t have to deallocate the memory—but this comes at a cost. Garbage collectors can incur noticeable performance hits, especially if they are poorly written. For example, my cell phone runs Android, and I frequently notice my phone lock up for a few hundred milliseconds while the garbage collector reclaims unused memory. Applications that have real-time requirements (e.g., audio DSPs, games, operating systems, pacemakers, autopilots, etc.) typically do not use garbage collection, because the degradation in performance or lack of predictability is unacceptable.

### Resource Acquisition Is Initialization

So do we have to revert to C-style manual memory management for these kinds of programs? C++ has a different solution to this problem, called Resource Acquisition Is Initialization (RAII). The idea is that objects allocate whatever memory they need in their constructor and release that memory in their destructor, which automatically runs when the object goes out of scope. Check this out:

```cpp
// allocate a kilobyte of memory
vector<char> buffer(1024);
...
// the memory is automatically released
```

It looks a lot like garbage collection, right? But here the memory is released exactly when `buffer` goes out of scope. So instead of a garbage collector which runs at unpredictable times and frees a large number of objects at once (which is slow), objects are automatically released one at a time, in a deterministic, predictable order, as they would be if you wrote it in C.

### Limitations of RAII

In RAII, heap-allocated memory is tied to the lifetime of an object on the stack. But if we’re doing that, why don’t we just allocate all memory on the stack? There are two issues with that:

- Maybe the stack isn’t big enough. A typical stack size is 8 megabytes. What if I want a gigabyte?
- Maybe I want the memory to outlive the current stack frame, e.g., I want to return it from a function.

RAII is a good solution to *Issue 1*. Does it also fix *Issue 2*? Not really—the whole point of RAII is that the memory is released automatically when the function returns. So if we held onto that memory after the function returned, when would it get released? C++ has some mechanisms (e.g., smart pointers, copy constructors, move semantics, etc.) to let you transfer ownership of that memory from the function call to the surrounding scope. But in my opinion, these non-orthogonal concepts are inelegant hacks that are only necessary because the language is poorly designed (and overly complicated!), and they remind me too much of manual memory management. What concerns me more is that using these workarounds is optional—the possibility of memory leaks still exists.

What I want is a system where a) heap allocations are possible, b) memory leaks are fundamentally impossible (in the sense that we can easily pinpoint exactly where every piece of memory is released), and c) there is no unpredictable performance penalty due to a garbage collector running in the background. It's easy to design a system with any two of these properties, but can we have all three?

### Region-based memory management

Consider this code in a Java-like (call-by-sharing) language that does not have garbage collection:

```java
int[] compute_big_array() {
    int[] data = new int[1048576];
    ...
    return data;
}

void foobar() {
    int[] big_array = compute_big_array();
    ...
    delete [] big_array;
}
```

Without garbage collection, it is the engineer's responsibility to release `big_array` at the end of `foobar`. We want to return the array from `compute_big_array`, so vanilla RAII can't help us here.

RAII ties heap-allocated memory to lexical scopes. But oftentimes memory should be shared between multiple, unrelated lexical scopes. In the example above, an RAII-like approach to memory management would free `data` when `compute_big_array` returns (perhaps after making a copy of it). But we want the original memory to be available to `foobar` after `compute_big_array` returns. So this memory shouldn't be bound to the lexical scope of `compute_big_array`, but it also shouldn't be tied to the lexical scope of `foobar`.

Notice, however, that we only need the memory to exist during the execution of `foobar`. But this may include calls to other functions defined elsewhere (in unrelated lexical scopes), and we may want to make the memory available to those functions as well. To use the domain lingo, we want the memory to be available in the *dynamic scope* of `foobar`. Can we design a system that ties the memory to this dynamic (not necessarily lexical) scope, guaranteeing that it will be freed when the program execution leaves that scope?

Let's come up with a new strategy that allows us to specify a dynamic scope for each memory allocation, which dictates the lifetime of that memory.

```java
int[] compute_big_array(MemoryAllocator<int> allocator) {
    int[] data = allocator(1048576);
    ...
    return data;
}

void foobar() {
    MemoryAllocator<int> allocator = makeAllocator<int>();
    int[] big_array = compute_big_array(allocator);
    ...
}
```

Here’s how it works. The language gives you a parametrically-polymorphic `makeAllocator` function that you can use to create a memory allocator (and the lifetime of this allocator is called a *region*). If you want to allocate some memory, you have to use an allocator. When an allocator goes out of scope (i.e., the program leaves the region), its destructor frees any memory that it allocated. Every variable goes out of scope at some point, so any memory you allocate is guaranteed to be freed. But unlike garbage collection, there is no extra performance penalty! It may be a little cumbersome to pass around memory allocators, but for some applications the strong safety guarantees make it worth it.

### Case study: a web server

Web servers are long-lived programs that serve web pages to clients, and it’s not uncommon for a web server to run continuously for months or longer. It is crucial that web servers do not leak memory, because otherwise they would quickly consume all available resources and crash. With region-based memory management, we can be sure that no memory is leaked from one request to the next, which is a stronger guarantee than most garbage collectors provide. Consider a simple single-threaded web server:

```java
void main() {
    while (true) {
        MemoryAllocator<int> allocator = makeAllocator<int>();
        // listen on port 80 and handle requests, using allocator as necessary
    }
}
```

When the server is done processing a request, the allocator goes out of scope and all memory allocated for that request is deallocated. With most garbage collectors, you have no guarantees about when the collector will run and what it will free. So a) our web server will have a tiny memory footprint, b) it is guaranteed not to leak memory from one request to the next, and c) it does not have the performance cost or unpredictability of a garbage collector.
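Region-style lifetimes can be approximated in existing languages, too. As an illustration only (not the hypothetical language above), here is a minimal Python sketch in which a `with` block plays the role of a region and a `Region` object plays the role of the allocator; the names are made up:

```python
class Region:
    """A toy region: every allocation made through this object is
    tracked and released together when the region's scope ends."""

    def __init__(self):
        self.allocations = []

    def alloc(self, size):
        # a bytearray stands in for a raw heap allocation
        buf = bytearray(size)
        self.allocations.append(buf)
        return buf

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # the region ends: release everything it allocated, all at once
        self.allocations.clear()
        return False


def compute_big_array(region):
    # the caller's region owns this memory, so returning it is safe
    return region.alloc(1048576)


with Region() as region:
    big_array = compute_big_array(region)
    big_array[0] = 42  # use the memory freely inside the region
# here, everything allocated through `region` has been released
```

The memory cannot outlive its region, and nothing inside the `with` block has to remember to free it, mirroring the guarantee described above.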

### Conclusion

Here are some cool things about region-based memory management:

- Regions cannot leak their memory to the outside world. All heap allocations are performed inside regions, so no memory is ever leaked.
- Often, when functions in C return pointers to memory, it’s not clear who is responsible for deallocating it. Systems based on regions do not have this problem: if a function allocates and returns memory, you have to provide it with an allocator that is responsible for this memory.
- There is no performance penalty due to a garbage collector running in the background. Memory allocation is deterministic and predictable.
- Writing a good garbage collector is hard. Writing a good region-based allocator is easy.

However, there are a couple of disadvantages as well:

- Region-based memory management requires more bookkeeping (you have to declare regions and keep track of memory allocators). This can be alleviated by having functions declare implicit regions around their bodies, and allocating memory to those regions by default.
- Sometimes, the lifetime of objects does not tie well to scopes—neither lexical nor dynamic. For example, consider a web server that needs some data to persist between requests, but that data grows and some of it needs to be freed occasionally (a memory pool might be a good solution to this). Another example is data that is shared between threads of a program.

## Algorithmic scheduling

October 10, 2013

A few weeks ago I had an idea.

All of my scheduling constraints are imposed by other people. For example, I have to go to an interview at 1pm because that’s when the recruiter is available. Or I have to go to class at 2:30 because that’s when my university scheduled it. Everything on my calendar has to happen at a specific, rigid time because that’s when someone else scheduled it—or that’s the time we previously agreed upon.

Sometimes, scheduling is difficult. I remember trying to create a rehearsal schedule for my music group was a nightmare—it seemed like there was no rehearsal time that worked for everyone. Similarly, I was recently trying to organize a banquet potluck and found it obnoxiously frustrating to coordinate.

I asked myself: *where do everyone’s time constraints come from?* Surely, there must be some fountain of scheduling constraints that everyone is drinking from. I quickly realized that scheduling constraints are almost always artificial—ultimately, it all comes down to when people agree to meet. There’s a cost associated with rescheduling—and this cost is mostly one of bookkeeping and communication.

For example, the reason I can’t meet with you tomorrow at 3pm is because I have a meeting with my coworker from 2-4pm. But that meeting with my coworker was scheduled for an arbitrary time that worked with both of our schedules—why can’t I just reschedule it? Because that’s the only time that works for him. But why can’t he reschedule his other meetings so I can reschedule my meeting with him? Simply because this is a logistical challenge that takes time and energy to resolve.

Fundamentally, there’s only one strict constraint: *you can’t be at two events at the same time.* So we want to maximize the number of items on our agenda such that the items never overlap.

This is just a textbook computer science problem in disguise! Optimization, network flow, graph theory, constraint satisfaction, artificial intelligence, something like that. I have the tools to solve this problem.

So imagine this: a computer program that arbitrates everyone’s schedules to find the globally optimal solution. People are really good at acting like greedy algorithms and finding locally optimal solutions. But we could be so much more efficient if this was done algorithmically—a computer has no problem working out the logistical details of coordinating everyone’s schedules. So you tell it who you want to meet with, rather than when you want to meet. It knows your schedule, it knows their schedules, it knows how to work it out. It frames the scheduling problem in terms of a social graph. Scheduling is a social activity, after all.

I have a million other ideas that piggyback off of this one. Perhaps you have some preference for when you want to meet. Maybe you want to meet in the afternoon, because that’s when you can focus the best. You can associate some cost function with this event that penalizes the algorithm for not scheduling it in the afternoon. Maybe it’s a parabola centered at 2pm. You could control the focal length of this parabola—this lets you tell the algorithm how much you care about this preference.
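That preference can be sketched as a cost function (the 2pm center and the weight are just the example numbers from above; the function name is made up):

```python
def time_preference_cost(hour, preferred=14.0, weight=1.0):
    """Parabolic penalty for scheduling an event away from the
    preferred hour; `weight` controls how sharply the cost grows,
    playing the role of the parabola's focal length."""
    return weight * (hour - preferred) ** 2

print(time_preference_cost(14.0))  # 0.0: exactly at 2pm, no penalty
print(time_preference_cost(16.0))  # 4.0: two hours off the preference
```

A scheduler would then minimize the sum of such penalties over all events, subject to the hard no-overlap constraint.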

Maybe you can tell it that you have a homework assignment due at 5pm on Friday. You know it will take you four hours to finish—but you don’t care which four hours. The algorithm finds time for you to work on it, or makes time if necessary.

Maybe you have other kinds of constraints. You know that you need to pick up your cousin’s gift from the UPS Store before you can go to her birthday party. The algorithm will make sure these events happen in the correct order. Or maybe you want to have lunch and dinner every day, but not within 4 hours of each other. The algorithm should be able to space those events apart.

I could go on forever. But the point is that I’m trying to convince you that the way we think about scheduling is *outdated*. Finding times to do things is an algorithmic process, and computers should do it for us. What we care about is *who* we want to meet with. And that’s the primary input to this algorithm.

A few concerns:

*People will be disgruntled with the idea of a computer shuffling around their calendars.* For example, you don’t want to be in the middle of a 2-hour drive to a board meeting, only to find out that the meeting has been rescheduled for tomorrow because the algorithm was trying to schedule the CEO’s dentist appointment today. Solution: freeze all events that are within, say, 24 hours of the present. This window should be configurable.

*UI/UX challenges.* What is a user-friendly way to input all this data into an application? I have a few ideas. For example, start with a Google Calendar-like interface. When you create a new event, you see a heading entitled “Event Time.” Underneath this heading you see a time/date picker control, but this is grayed out by default. To the right of the time picker, you see a slider control that defaults to the far right—indicating “Don’t care.” If you drag this slider to the left, you are indicating that you have some preference for the time. Then the time picker control becomes enabled. You can drag the slider all the way to the left, indicating that the event has to happen exactly at the time you specify. Or you can leave the slider somewhere in the middle, indicating a mild preference. I think, if done right, this would be fairly intuitive.

*People will be too lazy to input all this information into an app.* That’s fine—they can use it like a traditional calendar if they like. But the more information you give it, the smarter it becomes—and the more it can do for you.

*The “dating site problem”—the app is only useful if everyone uses it already.* This isn’t really true—when you’re the only one using it, it’s no worse than Google Calendar (it’s better: the app can still help automate your schedule). In my vision for the product, the app can act as your secretary to the outside world, even if the world doesn’t use it. If someone wants to schedule a meeting with you, you can point them to some webpage where they can book a time on your calendar (without necessarily revealing what’s on your calendar), and you don’t have to be a part of the process. This saves you time, and also promotes the product.

*How can it be monetized?* Two ideas come to mind. First, advertisements that target you based on your schedule—which should be much more effective than normal ads, because they will be more relevant to your life (e.g., “Hey! Why don’t you go to this concert on Saturday? You’re not doing anything else this weekend!”). Second, an enterprise version of the product. A company could save a lot of man-hours using a product like this—and as a bonus, the app could automatically reserve conference rooms and ensure that they don’t get double-booked. Additionally, it would enable the company to do all kinds of crazy scheduling feats—like scheduling a 400-person conference on a moment’s notice. Everyone’s schedules would be automatically re-optimized to make it possible.

## Let there be light!

May 27, 2013

I usually write about computer science, engineering, or music, but today I’m going to digress and talk about physics. In particular, I’ll explain how Maxwell’s equations predict the existence of electromagnetic waves.

I will assume you have a basic familiarity with vector calculus and partial differential equations, but no knowledge of electrodynamics.

### Electric and magnetic fields

There are two important vector fields you need to know about: the electric field \( \mathbf{E} \) and the magnetic field \( \mathbf{B} \). What is the significance of these two fields? Mathematically, they describe the forces that are exerted on charged particles due to the charges of other particles. This is called the Lorentz force, and is given by:

\[ \mathbf{F} = q \left( \mathbf{E} + \mathbf{v} \times \mathbf{B} \right) \]

Here, \( \mathbf{v} \) is the velocity of the particle. Of course, this velocity depends on the choice of reference frame. Due to Faraday’s Law and Ampère’s Law (both described below), we’ll see that the choice of reference frame doesn’t actually matter (phew!). The factor \( q \) is the charge of the particle. For an electron, \( q \approx -1.602 \times 10^{-19} \) Coulombs. Notice that this value is negative, so electrons tend to accelerate in the opposite direction of the electric field.

### Maxwell’s equations

Maxwell’s equations relate electric fields, magnetic fields, charges, and current. They come in various forms—here I’ll describe them in their microscopic, differential form. We’ll start with Gauss’s Law:

\[ \boldsymbol{\nabla} \cdot \mathbf{E} = \frac{ \rho }{ \epsilon_0 } \]

This states that for any point in space, the divergence of the electric field \( \mathbf{E} \) is proportional to the charge density \( \rho \) (the constant factor \( \epsilon_0 \) is called the permittivity of free space). More simply, Gauss’s Law tells us that charges create electric fields. There is also a similar law for the magnetic field \( \mathbf{B} \):

\[ \boldsymbol{\nabla} \cdot \mathbf{B} = 0 \]

In other words, magnetic “charges” do not exist; magnetic field lines are always loops. Speaking of magnetic fields, the next law states that, in addition to charges, a time-varying magnetic field can also affect the electric field:

\[ \boldsymbol{\nabla} \times \mathbf{E} = - \frac{ \partial \mathbf{B} }{ \partial t } \]

This is called Faraday’s Law. For example, if I pick an inertial reference frame \( X \) with a stationary electron and move a nearby magnet (which generates a magnetic field) through space, the magnetic field will be time-varying, so an electric field is induced due to Faraday’s Law. Let’s say this electric field imparts a Lorentz force \( f \) on the electron. From the reference frame \( Y \) of the magnet, the magnetic field appears stationary but the electron is moving and will feel the same Lorentz force \( f \) due to the second term in the Lorentz force. So the choice of reference frame does not matter.

The final equation is Ampère’s Law:

\[ \boldsymbol{\nabla} \times \mathbf{B} = \mu_0 \left( \mathbf{J} + \epsilon_0 \frac{ \partial \mathbf{E} }{ \partial t } \right) \]

The constant \( \mu_0 \) is called the permeability of free space. \( \mathbf{J} \) is the current density, which describes how current is locally flowing at a point in space. In English, Ampère’s Law states that a magnetic field is generated by electric current or by a time-varying electric field (the latter term is called “Maxwell’s correction”). Just as with Faraday’s Law, the derivative term will depend on the choice of reference frame, but in the end the Lorentz force always comes out to be the same.

Amazingly, from these simple laws, you can derive all of classical electrodynamics!

### The Helmholtz wave equation

In a vacuum, there is no charge or current density. Then we can reduce Maxwell’s equations to:

\[ \boldsymbol{\nabla} \cdot \mathbf{E} = 0 \]
\[ \boldsymbol{\nabla} \cdot \mathbf{B} = 0 \]
\[ \boldsymbol{\nabla} \times \mathbf{E} = - \frac{ \partial \mathbf{B} }{ \partial t } \]
\[ \boldsymbol{\nabla} \times \mathbf{B} = \mu_0 \epsilon_0 \frac{ \partial \mathbf{E} }{ \partial t } \]

Before we continue, we need the following vector calculus identity:

\[ \boldsymbol{\nabla} \times \left( \boldsymbol{\nabla} \times \mathbf{E} \right) = \boldsymbol{\nabla} \left( \boldsymbol{\nabla} \cdot \mathbf{E} \right) - \boldsymbol{\nabla}^2 \mathbf{E} \]

In other words, the curl of the curl is equal to the gradient of the divergence minus the vector Laplacian (the proof follows directly from the definition of the vector Laplacian). For this identity, I chose the letter \( \mathbf{E} \) because we will be taking the curl of the curl of the electric field:

\[ \boldsymbol{\nabla} \times \left( \boldsymbol{\nabla} \times \mathbf{E} \right) \]

But first, let’s take the curl of Faraday’s Law to get this in terms of the magnetic field:

\[ \boldsymbol{\nabla} \times \left( \boldsymbol{\nabla} \times \mathbf{E} \right) = - \frac{ \partial \left( \boldsymbol{\nabla} \times \mathbf{B} \right) }{ \partial t } \]

Now we apply our vector identity to the left-hand side:

\[ \boldsymbol{\nabla} \left( \boldsymbol{\nabla} \cdot \mathbf{E} \right) - \boldsymbol{\nabla}^2 \mathbf{E} = - \frac{ \partial \left( \boldsymbol{\nabla} \times \mathbf{B} \right) }{ \partial t } \]

By Gauss’s Law in a vacuum, we can eliminate the leftmost term:

\[ \boldsymbol{\nabla}^2 \mathbf{E} = \frac{ \partial \left( \boldsymbol{\nabla} \times \mathbf{B} \right) }{ \partial t } \]

Now we can use Ampère’s Law in a vacuum on the right-hand side:

\[ \boldsymbol{\nabla}^2 \mathbf{E} = \mu_0 \epsilon_0 \frac{ \partial^2 \mathbf{E} }{ \partial t^2 } \]

This is called the *Helmholtz wave equation*. It states that the second spatial derivative (the vector Laplacian) of the electric field is proportional to the second time derivative, where the proportionality constant is \( \mu_0 \epsilon_0 \).

### Electromagnetic waves

In a classical setting (no relativistic or quantum effects), Maxwell’s equations always hold—and we’ve just shown how they imply the Helmholtz wave equation in a vacuum. Let’s solve the wave equation. Once we have a solution, we’ll check that it agrees with Maxwell’s equations (not all do).

One solution is \( \mathbf{E} = \mathbf{0} \) (which implies \( \mathbf{B} = \mathbf{0} \)), but it’s a bit prosaic to call that a *wave*. Still, a quick look at Maxwell’s equations reveals that this solution does not violate the laws of classical electrodynamics, so we can have \( \mathbf{E} = \mathbf{B} = \mathbf{0} \) in a vacuum. In other words, darkness is physically possible, which is great because otherwise I would have difficulty sleeping at night.

The nice thing about Maxwell’s equations is that they are *linear* in \( \mathbf{E} \) and \( \mathbf{B} \). In other words, if we find multiple solutions to these equations, their superposition is also a solution. This fact is very useful in harmonic analysis. In particular, we can represent almost all functions as the superposition of basic sinusoidal waves with the Fourier series. Conveniently, the form of the Helmholtz wave equation suggests a sinusoid might be a solution. If it works and agrees with Maxwell’s equations, then the superposition principle allows us to predict the existence of all kinds of electromagnetic waves.

With that in mind, let’s try a solution of the form \( \mathbf{E} = \hat{x} E_0 \cos \left( \omega t - k z - \phi \right) \), for some unknown \( E_0 \) (amplitude), \( \omega \) (frequency), \( k \) (wavenumber), and \( \phi \) (phase shift). We say that this wave is \( \hat{x} \)-polarized and propagates in the \( \hat{z} \) direction.

First, we’ll try it with the wave equation:

\[ \boldsymbol{\nabla}^2 \mathbf{E} = \mu_0 \epsilon_0 \frac{ \partial^2 \mathbf{E} }{ \partial t^2 } \]
\[ \boldsymbol{\nabla}^2 \left( \hat{x} E_0 \cos \left( \omega t - k z - \phi \right) \right) = \mu_0 \epsilon_0 \frac{ \partial^2 }{ \partial t^2 } \left( \hat{x} E_0 \cos \left( \omega t - k z - \phi \right) \right) \]
\[ - \hat{x} E_0 k^2 \cos \left( \omega t - k z - \phi \right) = - \hat{x} \mu_0 \epsilon_0 E_0 \omega^2 \cos \left( \omega t - k z - \phi \right) \]
\[ k^2 = \mu_0 \epsilon_0 \omega^2 \]
\[ \frac{k}{\omega} = \sqrt{\mu_0 \epsilon_0} \]

So \( \mathbf{E} = \hat{x} E_0 \cos \left( \omega t - k z - \phi \right) \) is a solution to the wave equation as long as \( \frac{k}{\omega} = \sqrt{\mu_0 \epsilon_0} \). The quotient of the frequency \( \omega \) and the wavenumber \( k \) is the speed \( c \) of the wave, so we can see that in a vacuum \(c = \frac{\omega}{k} = \frac{1}{\sqrt{ \mu_0 \epsilon_0 }} \). In SI units, that gives 299,792,458 m/s, which seems just right.
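We can sanity-check that value numerically, plugging in the standard SI values of the two constants (where \( \mu_0 = 4\pi \times 10^{-7} \) H/m):

```python
from math import pi, sqrt

mu_0 = 4 * pi * 1e-7          # permeability of free space (H/m)
epsilon_0 = 8.854187817e-12   # permittivity of free space (F/m)

c = 1 / sqrt(mu_0 * epsilon_0)
print(c)  # approximately 2.9979e8 m/s, the speed of light
```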

Let’s test our solution against Maxwell’s equations in a vacuum. Taking the divergence of \( \mathbf{E} \) yields \( 0 \) because the wave is \( \hat{x} \)-polarized but has no dependence on \( \hat{x} \). Therefore Gauss’s Law is satisfied. The rest of Maxwell’s equations involve \( \mathbf{B} \), so we must find a \( \mathbf{B} \) that works with our trial solution \( \mathbf{E} \).

Let’s start with Faraday’s law:

\[ \boldsymbol{\nabla} \times \mathbf{E} = - \frac{ \partial \mathbf{B} }{ \partial t } \]
\[ \boldsymbol{\nabla} \times \left( \hat{x} E_0 \cos \left( \omega t - k z - \phi \right) \right) = - \frac{ \partial \mathbf{B} }{ \partial t } \]
\[ \hat{y} E_0 k \sin \left( \omega t - k z - \phi \right) = - \frac{ \partial \mathbf{B} }{ \partial t } \]

This tells us that the time derivative of \( \mathbf{B} \) is a sinusoid polarized in the \( \hat{y} \) direction. Therefore:

\[ \mathbf{B} = \hat{y} E_0 \frac{k}{\omega} \cos \left( \omega t - k z - \phi \right) + \mathbf{C} \]

The vector \( \mathbf{C} \) is the constant of integration. From the Helmholtz equation we saw that \( \frac{k}{\omega} = \sqrt{\mu_0 \epsilon_0} \), so we can rewrite \( \mathbf{B} \) as:

\[ \mathbf{B} = \hat{y} E_0 \sqrt{\mu_0 \epsilon_0} \cos \left( \omega t - k z - \phi \right) + \mathbf{C} \]

Now let’s test \( \mathbf{E} \) and \( \mathbf{B} \) against Ampère’s Law:

\[ \boldsymbol{\nabla} \times \mathbf{B} = \mu_0 \epsilon_0 \frac{ \partial \mathbf{E} }{ \partial t } \]
\[ \boldsymbol{\nabla} \times \left( \hat{y} E_0 \sqrt{\mu_0 \epsilon_0} \cos \left( \omega t - k z - \phi \right) + \mathbf{C} \right) = \mu_0 \epsilon_0 \frac{ \partial }{ \partial t } \left( \hat{x} E_0 \cos \left( \omega t - k z - \phi \right) \right) \]
\[ - \hat{x} E_0 k \sqrt{\mu_0 \epsilon_0} \sin \left( \omega t - k z - \phi \right) = - \hat{x} \mu_0 \epsilon_0 \omega E_0 \sin \left( \omega t - k z - \phi \right) \]
\[ \frac{k}{\omega} = \sqrt{ \mu_0 \epsilon_0 } \]

This is the same constraint that we got from the Helmholtz equation (so it works). Finally, we check Gauss’s Law for Magnetism:

\[ \boldsymbol{\nabla} \cdot \mathbf{B} = 0 \]
\[ \boldsymbol{\nabla} \cdot \left( \hat{y} E_0 \frac{k}{\omega} \cos \left( \omega t - k z - \phi \right) + \mathbf{C} \right) = 0 \]
\[ 0 = 0 \]

So our sinusoidal \( \mathbf{E} \) and \( \mathbf{B} \) fields agree with all of Maxwell’s equations. Since \( E_0 \), \( \omega \), and \( \phi \) were arbitrary (and we saw that \( k \) depends on \( \omega \)), any amplitude, frequency, and phase shift is valid (but the wave always has constant velocity \(c = \frac{1}{\sqrt{ \mu_0 \epsilon_0 }} \) in a vacuum). By the superposition principle, we can take any linear combination of waves of this form without violating Maxwell’s equations. This means, at the very least, any wave which can be decomposed into basic sinusoids (almost every wave) is valid. Furthermore, the polarization of the wave \( \hat{x} \) and the direction of propagation \( \hat{z} \) were also chosen arbitrarily—Maxwell’s equations do not have a preference, as long as they are perpendicular.

So Maxwell’s equations predict that a lot of electromagnetic waves are possible. Let’s experimentally verify our findings—oh, you’ve done that by reading this article!

## Pirates of the Caribbean and Bach

May 27, 2013

My favorite classical composer is Bach. One of my favorite contemporary composers is Hans Zimmer, who has scored the soundtracks for countless movies. I often play Zimmer’s music on piano, and I would play Bach too if I had the technical ability.

I was listening to one of Zimmer’s pieces for Pirates of the Caribbean: Dead Man’s Chest called *The Kraken*, and I noticed that it sounded very... Baroque!

At 1:29, I figured out why. Amidst the barrage of pirate contention surrounding the *Black Pearl* and the *Flying Dutchman*, I could hear an homage to my second-favorite\( ^1 \) Bach piece—the well-known *Toccata and Fugue in D minor*! You will surely recognize it.

Same key, same instrumentation (pipe organ), same motif. Now I know why I like Hans Zimmer so much—we have the same inspiration.

[1] My top favorite Bach piece is his “Little” Fugue in G minor. You can hear this in *The Kraken* too, if you listen with a bit of imagination.