November 12, 2013.
My ideas often come on a grand scale, and many of the ambitious projects I want to tackle require a fundamental paradigm shift in some field/industry. I’ll list a few of them here. I didn’t invent most of them (even though I thought I did), and some of them are partially-solved (but none are in widespread use).
- In an earlier post I wrote about a future world where meetings are scheduled by a computer system that knows everyone’s time constraints. The system should be able to automatically shuffle around everyone’s schedules to make all kinds of crazy things possible. For example, suppose a musician gets sick right before a rehearsal. The system could reschedule the rehearsal for a later date, in a way that works for everyone in the band. If there is no time that works for the whole group, the system controls each member’s schedule and re-optimizes it to make a time. Or on a bigger scale: think about an entire university using this to schedule class times (dynamically). There’s a lot more to this idea, so check out that blog post for details.
- Unit tests are often used not to assert that a function is correct, but rather to detect when its behavior has been changed (possibly by a seemingly-unrelated change somewhere else in the codebase). It seems like these kinds of tests could be automatically generated. I could imagine a build system that automatically generates random inputs to every function and then records their outputs. Then whenever you build/deploy the application, the system runs the new code against these inputs and checks that the outputs match what was previously recorded. This doesn’t guarantee correctness, but it hopefully tells you when you’ve broken something.
- A type system that allows you to prove that function calls will not allocate memory and forget to release it. I’ve worked out a system where this is possible in another blog post. Similarly, a type system that lets you prove termination or get bounds on the asymptotic resource (CPU/memory) usage of a procedure. Or a type system that models latency in network requests (see my notes above about auto-generating AJAX calls). I have a lot of ideas for type systems. Some type system ideas that I thought I came up with (but didn’t): types that distinguish between unsafe vs. escaped strings (e.g. to prevent injection attacks), types that let you create ad-hoc subsets of types (e.g. even integers, sorted lists, etc.), and using monads for resource management.
- Software for life-critical systems, such as airplane autopilots, medical devices, etc. are often formally-verified—meaning that a computer program checks that the software matches the specification exactly (so it should be bug-free, so long as the specification is correct). Yet tons of life-critical software runs on unverified kernels, and so are fundamentally unsafe. Understandably, it would be near-impossible to verify a large kernel like Linux, but why can’t we have completely verified microkernels for use in these safety-critical applications? I know, easier said than done.
- Of course, it would be nice if every piece of software running on a machine was formally-verified. From the kernel, to the compiler, to the web browser. Pretty unrealistic, I know, but think for a second about what this means. Yes, your computer system would be bug-free—but there’s something else I want to draw attention to. If all software was formally-verified, it could all run in kernel mode. Similarly, all programs could share the same address space (no need for virtual memory). We could even eliminate the need for a task scheduler—all programs could implement cooperative multitasking, and we could be sure that no program would hang and bring the system down. Together, these facts would allow software to be vastly more performant than today’s computer systems. Modern hardware is very underutilized.
- Self-balancing electric unicycles. Unicycles very space efficient, but exhausting to ride (if you can even ride them at all). An electric one could be a very energy-efficient, compact mode of transportation, and the self-balancing system means that even non-unicyclers could ride it. Before you say it’s impossible, I built one and it worked really well before it was stolen from me. I could ride over 5 miles on a single charge, at about 15 mph. For longer trips I could easily take it on the subway with me. With only one wheel, they can be made cheaper than Segways.
- I have to remember a zillion different passwords for all my online accounts. If I could compute cryptographic hashes in my head, I could eliminate this need—I would simply have to remember a single master password, and my password for each website would be the hash of the master password concatenated with the name of the website. But no human is capable of this—but why should they be? That’s what computers are for. All web browsers should automatically implement this behavior for every password field—they should compute the hash of the password you enter concatenated with the website domain and send that to the server. The password you type in should NEVER make it to the server. Currently no browsers do this—why not? There are browser extensions that do this, but the fact that they are extensions means that most people aren’t benefitting from the idea.
- Algorithms theoreticians probably don’t spend a lot of time thinking about UTF-8 encodings for strings. We have a large theory of string algorithms, but they all assume indexing is constant-time. For a naive UTF-8 implementation of strings, indexing is linear-time. I bet we could come up with a clever string data structure for variable-width encodings. It should have worst-case logarithmic-time indexing (like a balanced tree), linear-time traversal, and its space usage should be better than just using fixed-width characters (that’s the whole point of variable-width encodings, after all). Or can we do even better, with average-case constant-time random access?
- Web applications depend on the browser to faithfully execute front-end code. Part of the reason that business logic typically resides on the server is because of security—front-end code can easily be modified by the user and shouldn’t be trusted. What if there was a way for the browser to keep track of the computation, and build a cryptographic proof that it was executed faithfully? Then the browser sends this “certificate of computation” along with the computed result back to the server, which checks that they match up (without actually re-executing the computation). Is this even possible? I recently discovered that I didn’t invent this idea—it’s called verifiable computing.