## Highest note on a guitar

December 29, 2012

I just bought a new guitar, and I asked myself:

*what is the highest note you can play in standard tuning?*When plucked, the high E string vibrates at \(440 \times 2^{-\frac{5}{12}}\) Hz, or approximately 329.628 Hz.

If we pluck this string while fingering the innermost fret (assuming there are 24 frets), we hit the E a major 18th above middle C, with frequency \(440 \times 2^{\frac{19}{12}}\) Hz, or approximately 1318.510 Hz. We can also place our finger just past this fret and hit the next F or maybe even the F# before we run out of fretboard.

However, we can also pluck the string on the other side of the fret to get even higher notes. The highest note we can produce this way happens when we pluck the high E string on the far side of the first fret, with the finger pressing the string against this fret from the inward side.

This note will be higher than any audible harmonics you will be able to produce. Let’s calculate its pitch. To do this, we need to know the width of the space between the nut and the first fret (the string is effectively this long when we press it against the first fret from the other side).

Halving the length of a string raises its pitch by an octave. This generalizes, of course, to arbitrary intervals and length ratios. If a string is length \(x\), we can raise its pitch by \(n\) semitones by reducing its length to \(\left( \frac{1}{2} \right)^{\frac{n}{12}} x\).

Using this we can calculate the width of the first segment of the fretboard. Each segment of the fretboard constitutes a semitone in pitch. If we want to raise the high E string (of length \(x\)) by a semitone, we neet to shorten it to \(\left( \frac{1}{2} \right)^{\frac{1}{12}} x\), which is about \(0.944x\). This means the piece we cut off, which is the part we’re will be playing, has length \(\left( 1 - \left( \frac{1}{2} \right)^{\frac{1}{2}} \right) x\), or about \(0.056x\).

This means we changed the length of the string by a factor of about 0.056. We can use the formula above to calculate how many semitones by which we have increased. Solving \(1 - \left( \frac{1}{2} \right)^{\frac{1}{2}} = \left( \frac{1}{2} \right)^{\frac{n}{12}}\) for \(n\), we find that we have gone up about \(n = 49.862\) semitones (about 4 octaves and a major 2nd) from the E, which brings us approximately to a really high F#, a tritone above the highest note on a piano. The frequency of this note is exactly

\[ 440 \times 2 ^ { \left( 12 \log \left( 1 - \left( \frac{1}{2} \right) ^ { \frac{1}{12} } \right) / \log \left( \frac{1}{2} \right) - 5 \right) / 12 } \text{ Hz,} \]

which is roughly 5.873 kHz. Some adults will not be able to hear this note.

As my collegue Ben Sena points out, you can hit even higher notes by pressing the string against two frets and plucking in between. You can pluck the highest note then by plucking between the innermost two frets on the high E string. Let’s calculate the frequency of this note, assuming the guitar has 24 frets. If the string has length \(x\), then the length of this piece is \( \left( \left( \frac{1}{2} \right) ^ { \frac{23}{12} } - \left( \frac{1}{2} \right) ^ { \frac{24}{12} } \right) x \), which is about \( \left( 1.487 \times 10^{-2} \right) x\). Using the method we used above, we solve for \(n\) in the equation \( \left( \frac{1}{2} \right) ^ { \frac{23}{12} } - \left( \frac{1}{2} \right) ^ { \frac{24}{12} } = \left( \frac{1}{2} \right) ^ { \frac{n}{12} } \). We find that \( n = 12 \log \left( \left( \frac{1}{2} \right) ^ { \frac{23}{12} } - \left( \frac{1}{2} \right) ^ { \frac{24}{12} } \right) / \log \frac{1}{2} \), which is about 72.862 semitones, raising the high E string by 6 octaves and almost a minor 2nd. This means that, assuming a 24 fret quitar, the highest note you can produce is approximately an F. The frequency of this note is exactly

\[ 440 \times 2 ^ { \left( 7 + 12 \log \left( \left( \frac{1}{2} \right) ^ { \frac{23}{12} } - \left( 1 / 2 \right) ^ { \frac{24}{12} } \right) / \log \left( \frac{1}{2} \right) \right) / 12 } \text{ Hz,} \]

which is about

**44.347 kHz**. You will not be able to hear this note!This analysis uses a very simple model of guitar. We assume that the strings are perfectly elastic and ignore strain forces, air resistance, and other factors that are likely to affect the sound in these extreme conditions.

**TLDR:**F, with some waving of hands.

## Unicode library for C++

October 30, 2012

I wrote a lightweight, low-level ASCII/Unicode processing library in C++ out of dissatisfaction with existing solutions. It’s very minimalistic and has no external dependencies. It supports the bare minimum features for dealing with text encodings:

// detect the encoding for a string encoding_type detect_encoding(const std::string &input); // determine whether a string is valid in a particular encoding bool is_valid(const std::string &input, encoding_type encoding); // convert a string from one encoding to another std::string convert_encoding(const std::string &input, encoding_type input_encoding, encoding_type output_encoding, bool include_bom); // get the number of code points in a string size_t get_length(const std::string &input, encoding_type encoding); // return the number of bytes of the code point at pos, or 0 if the byte index does not refer to a valid code point size_t get_char_size(const std::string &input, size_t pos, encoding_type encoding); // get the code point at a particular byte index uint32_t get_char(const std::string &input, size_t pos, encoding_type encoding); // set the code point at a particular byte index void set_char(std::string &input, size_t pos, uint32_t code_point, encoding_type encoding); // add a code point to the end of a string void add_char(std::string &input, uint32_t code_point, encoding_type encoding); // determine whether a code point is a letter bool is_alpha(uint32_t code_point); // determine whether a code point is uppercase bool is_upper(uint32_t code_point); // determine whether a code point is lowercase bool is_lower(uint32_t code_point); // determine whether a code point is titlecase bool is_title(uint32_t code_point); // determine whether a code point is a number bool is_numeric(uint32_t code_point); // determine whether a code point is whitespace bool is_whitespace(uint32_t code_point); // determine whether a code point is a line separator bool is_newline(uint32_t code_point); // convert a code point to uppercase (return the input if no uppercase form exists) uint32_t to_upper(uint32_t code_point); // convert a code point to lowercase (return the input if no lowercase form exists) uint32_t to_lower(uint32_t code_point); // convert a code point to titlecase (return the input if no titlecase form exists) uint32_t to_title(uint32_t code_point);

Get it here!

## Finite automata and regular expressions

October 28, 2012

Anyone who has taken a class on computability theory, compilers, or finite automata knows that regular expressions can be parsed in linear time by generating a nondeterministic finite automaton (NFA) from the regular expression and converting it into a deterministic one (DFA).

Unfortunately, regular expression libraries like Python’s

`re`

module frequently generate exponential-time parsers. Why? The reason is that regex libraries like `re`

produce backtracking parsers that can do more than just parse regular languages. They can parse some non-regular languages and support extra features like backreferences, which finite automata alone cannot do.To combat this, I’ve built the backbone of a true regular expressions library: a finite automata engine. It can be used to build a regex parser generator that takes worst-case linear time, as a true regex parser should do.

## Algebraic data types

August 21, 2012

### Primitive types

A type system starts with some primitive types. Some examples include:

`Int`

(the type of integers, perhaps limited to some range)`Real`

(the type of real numbers, perhaps represented as IEEE 754 floating-point numbers)`Char`

(the type of Unicode code points or ASCII characters)`Bool`

(logical`true`

and`false`

)

Any good programming language will let us construct compound types from simple ones like these. Let’s examine some of these operations, which are called type constructors.

### Product types

Most programmers will be familiar with product types, but in programming they are more commonly called

*tuples*(the name “tuple” is a generalization of the terms “double”, “triple”, etc.). A product type is a compound type that is formed by a sequence of types, and is commonly denoted`(T1, T2, ..., Tn)`

or `T1 × T2 × ... × Tn`

. In set theory parlance, tuples correspond to cartesian products, and that is why they’re called “product types.”As an example, we could represent a point in 3-space by a Cartesian triple:

`(Real, Real, Real)`

.Most programming languages will allow the elements of a tuple to be named (rather than indexed). This has the same type theoretic meaning, but named tuples are often called

*records*,*structs*, or*classes*, depending on the programming language and context.*Note:*In type theory, we usually do not distinguish between a 1-tuple (called a “singleton”) and the type of its element (however, most programming languages do make this distinction). This means that, for example,

`(Int)`

is usually equivalent to `Int`

in the literature.A special case of tuples is the 0-tuple, which is also called the unit type. The unit type can only hold one value (which means it stores zero information). It is sometimes called

`Void`

or `Unit`

, and its value is often called `null`

, `nil`

, `none`

, or `nothing`

.Common properties of the type product constructor:

- Closure: the product of two types is a type
- Associativity (sometimes):
`(A × B) × C = A × (B × C)`

- Identity element:
`Unit`

Depending on the type system, types and their constructors can form various algebraic structures, such as a monoid.

### Sum types

Sum types, also called

*tagged unions*,*disjoint unions*, or*variants*, are used to describe values that could be of one of several types (the corresponding set theoretic operation is union of disjoint sets). We usually write sum types as`T1 + T2 + .. + Tn`

, `T1 | T2 | ... | Tn`

, or `T1 ∪ T2 ∪ ... ∪ Tn`

.Many programmers often use sum types without knowing it. For example, what is the return type of the following Python function?

def f(flag): return 5 if flag else "Hello"

It seems that

`f`

could return an `int`

or a `str`

, so its type must be `int + str`

.As a special case, if we take the sum of zero types, we get a special type called

`Bottom`

or `⊥`

, which can take on no values. A function which does not return (i.e., does not halt) is said to have a return type of `Bottom`

. Likewise, the sum of all types is called `Top`

or `⊤`

, and it is sometimes called `Any`

, `Object`

, or `General`

, because it can take on any value.Note that, unlike product types, sum types can be defined inductively, as we shall see with lists and trees.

Properties of the union constructor:

- Closure: the union of two types is a type
- Associativity (sometimes):
`(A + B) + C = A + (B + C)`

- Commutativity:
`A + B = B + A`

- Identity element:
`Bottom`

You can see that types and the sum operation might form a commutative monoid, depending on whether

`+`

is associative.### Algebraic data types

We can represent a number of different data structures using

*algebraic data types*, which are simply sums of products of types. For example, we can represent a list of integers asIntList = Unit + Int × IntList

where the product operation (

`×`

) has a higher precedence than the sum operation (`+`

), and `Unit`

is the unit type (which represents an empty list here). In English, this means: *a list of integers is either the empty list or an integer paired with a list of integers.*Using a nominative type system, we can also represent enumerated types as the union of disjoint unit types:

Clubs = Unit Diamonds = Unit Hearts = Unit Spades = Unit Suit = Clubs + Diamonds + Hearts + Spades

This only works in a nominative type system because, for example,

`Clubs`

and `Diamonds`

need to be disjoint unit types or else the their union would still be the unit type (in type theory we talk about *the*unit type, but in a nominative type system we can have several unit types). In a structural type system, we would have`Suit = Clubs = Diamonds = Hearts = Spades`

, which is not what we want.We might define a binary tree of integers like this:

IntTree = Unit + Int × IntTree × IntTree

This reads:

*a binary tree can be the empty tree or an integer and two subtrees.*### Fun with algebra

The title of this article suggests that we can manipulate expressions involving types algebraically, as we do with numbers. To make this easier, let’s adopt the following notation:

`1`

is the unit type`T`

is a primitive type`X + Y`

is the sum of two types`X × Y`

is the product of two types (which has a higher syntactic precedence than sum)

Additionally, we can let the rest of the natural numbers represent sums of unit types (e.g.

`2 = 1 + 1 = bool`

). Repeated products can be written as exponentials (e.g. `X × X × X = X^3`

). Now let’s describe some common data structures.`1 + T`

is a nullable type (also called an option type or a maybe type)`L = 1 + T × L`

is the type of lists`B = 1 + T × B × B`

is the type of binary trees, like`IntTree`

defined above

Take a look at that second one (lists). With a bit of hand-waving (treating type addition and multiplication like the corresponding operations on reals), we can solve for

`L`

:L = 1 + T × L L - T × L = 1 L × (1 - T) = 1 L = 1 / (1 - T) L = 1 + T + T^2 + T^3 + ...

Interpreting that last line in set theoretic terms, we can translate it into English as:

*a list is either empty, or it contains one element, or two elements, or three elements, etc.*Pretty cool, huh? If we’re treating these symbols as numbers, then they correspond to the number of values that the corresponding types can take on (i.e. the sizes of the corresponding sets). For example, the unit type has 1 possible value, so in this notation, it becomes`1`

. The fact that `L = 1 + T + T^2 + T^3 + ...`

does not converge simply means that there are an infinite number of list topologies we can construct. In general, the algebraic forms of inductive datatypes do not converge.I want to mention one more thing you can do with this notation. In 1997, a computer scientist named Gérard Huet suggested that we can also take derivatives of types in an unpublished work on zipper types. Oddly, the derivative laws look very much like their classical counterparts from differential calculus! The derivative of a type describes the “one-hole context” of a type, which is a data structure that represents a pointer to an item in a larger container type. For example, one-hole contexts can be used to mark the position of an item in a linked list or a binary tree and can be used to efficiently implement such containers in a purely functional fashion.

## My electric unicycle

August 21, 2012

### Introduction

I recently built a partially self-balancing electric unicycle called “Bullet,” featuring:

- A custom MIG-welded steel chassis
- A 450 Watt electric motor
- Two 7 Ah 12 Volt batteries
- A 5DOF inertial measurement unit
- The OSMC H-bridge
- An ATmega328P microcontroller

I say “partially” self-balancing because it only balances along one axis (forward/backward), and the rider still needs to balance left and right (it’s analagous to riding a bicycle “no hands”). It operates much like a Segway—you lean forward to accelerate, and lean back to brake. The top speed is about 15 mph, and it easily goes 5 miles on a single charge. This was my primary mode of transportation on the MIT campus (see the “Press and Aftermath” section below).

The thing in my right hand (in the video at the bottom) is a “kill switch”—if I let go of it, the unicycle deactivates the motor, much like a treadmill safety key.

Also, you may notice that the circuit is fully exposed. When the unicycle falls forward, the circuit has about two inches of ground clearance (so it never hits the floor). However, I’m currently building a proper case for more protection.

### Details

To estimate its orientation, Bullet integrates readings from the gyro and accelerometer using a complementary filter. To balance, the angle estimate is fed through a PID loop (with no integral term). The loop runs at 625 Hz. The output from this stage determines the duty cycle of a 1.22 kHz PWM signal, which is connected to the H-bridge. The code was written in C, and is in the public domain.

The circuit itself is essentially a pretty standard microcontroller setup—filtering capacitors on the power rails, RESET pin held high (for AVR microcontrollers), external 20 MHz crystal oscillator, some indicator LEDs, and the IMU connected to the ADC pins. Two of the digital output pins are wired to the motor controller, which is connected to the motor and the batteries. The motor controller has an onboard switching voltage regulator that powers the logic subcircuit (and also the charge pump for the high-side MOSFETs of the H-bridge). I expect that most undergraduates in electrical engineering could probably reconstruct the circuit given this description, burn the code onto the microcontroller, and have a working electric unicycle controller. A trip to the machine shop to construct the chassis and voilà, you’ve made your very own self-balancing unicycle!

### Learning to ride

Unfortunately, one cannot simply pick up a self-balancing unicycle and ride it with ease. It took me several hours to be able to ride in a straight line without crashing, and it took several days to learn how to turn in a controlled manner. Many of my friends have tried riding it, usually with little success (including some actual unicyclers). However, my talented friend Harry can ride it about as well as I do.

Here are some riding tips, in case you ever build one:

- Don’t let go of the safety switch unless you crash. Bullet is designed to help you balance—let it do it’s job (even while mounting). If you ever let go of the safety switch while riding, you will fall (however, it was designed so that if you accidentally let go for a fraction of a second, you can usually recover).
- Bumps in the terrain are
*always*bigger than they appear. - Put most of your wait on the seat, not the pedals.
- Gain weight! An inverted pendulum with a high center of gravity is easier to balance than one where most of the weight comes from the motor and wheel. To illustrate this fact, try balancing a long broom on your hand (with the broomstick in your palm) vs. a short broom.
- When you first press the safety switch, Bullet will assume an upright position as fast as it can, even if you are in the way. So don’t be in the way (it can run into your ankles or run over your foot)!

### Practicality

As I said earlier, Bullet is the primary way I navigate MIT and the surrounding Cambridge area. I often zoom past students, faculty, custodians, and tourists, with generally positive reactions from everyone. I’ve been told one can be fined for riding a scooter in the Infinite Corridor. Fortunately, Bullet ain’t no scooter.

However, Bullet is

*heavy*. I live on the fifth floor of a dorm with no elevator, so carrying Bullet up and down the stairs can be quite tiresome. If I were to start over, I would try to minimize weight with a wheel hub motor, NiCad or NiMH batteries, and an aluminum frame (instead of steel).Bullet’s small size (compared to a bicycle, scooter, moped, etc.) means I can take it on the bus or train with no problem, so I can take public transportation to the stop nearest my destination and ride Bullet the rest of the way. With over 5 miles of range on a single-charge, battery life is usually not a problem in my urban hometown.

### Related work

I am certainly not the first person to build an electric unicycle. Perhaps the most well-known self-balancing unicycle is Trevor Blackwell’s Eunicycle, which also uses the OSMC. His design is similar to mine, but uses a much more expensive battery pack ($218 for his vs $44 for mine). Also, the Eunicycle’s motor and gearbox cost a grand total of $644, whereas Bullet’s drive system (including the wheel itself) was $195. Finally, the IMU he uses is about $100 more than mine. Overall, Bullet is several hundred dollars cheaper than the Eunicycle, but this comes at a price (mostly weight).

Focus Designs (of whom Adam Savage of MythBusters fame is a customer) manufactures a commercial electric unicycle called the SBU (“Self-Balancing Unicycle”). At $1,795, the SBU is much more expensive than mine, but it has extra safety features and a hub motor, which results in a smoother ride. For comparison, Bullet uses a toothed belt drive, and the Eunicycle features a metal gearbox. Because of the belt drive, I can often feel Bullet’s motor engaging the wheel with a slight jerk.

### Video

Here is a video of me riding it and some of my friends trying it out (no audio, sorry):

### Press and aftermath

Bullet has been featured on a number of media outlets, including:

- MIT Admissions
- Engadget
- MAKE
- Hack A Day
- MIT CSAIL
- Slice of MIT
- The Huffington Post
- Coolest Gadgets
- EcoFriend
- TheTechJournal
- PhysOrg.com
- TechMog
- Softpedia
- The Green Optimistic
- New Rising Media
- TrendHunter
- All Best Reviews
- DailyMail
- GizmoWatch
- BostInno
- The Register
- Ubergizmo
- DVICE
- Geekosystem
- Neatorama
- Cambridge Bicycle
- The Discovery Channel