Red October: CloudFlare’s Open Source Implementation of the Two-Man Rule | 21-11-13
At CloudFlare, we are always looking for better ways to secure the data we’re entrusted with. This means hardening our system against outside threats such as hackers, but it also means protecting against insider threats. According to a recent Verizon report, insiders accounted for around 14% of data breaches in 2013. While we perform background checks and carefully screen team members, we also implement technical barriers to protect the data with which we are entrusted.
One good information security practice is known as the “two-man rule.” It comes from military history, where a nuclear missile couldn’t be launched unless two people agreed and turned their launch keys simultaneously. This requirement was introduced in order to prevent one individual from accidentally (or intentionally) starting World War III.
To reduce the risk of rogue employees misusing sensitive data, we built a service in Go to enforce the two-person rule. We call the service Red October after the famous scene from “The Hunt for Red October.” In line with our philosophy on security software, we are open sourcing the technology so you can use it in your own organization (here’s a link to the public GitHub repo). If you are interested in the nitty-gritty details, read on.
What it is
Red October is a cryptographically-secure implementation of the two-person rule to protect sensitive data. From a technical perspective, Red October is a software-based encryption and decryption server. The server can be used to encrypt a payload in such a way that no one individual can decrypt it. The encryption of the payload is cryptographically tied to the credentials of the authorized users.
Authorized persons can delegate their credentials to the server for a period of time. The server can decrypt any previously-encrypted payloads as long as the appropriate number of people have delegated their credentials to the server.
This architecture allows Red October to act as a convenient decryption service. Other systems, including CloudFlare’s build system, can use it for decryption and users can delegate their credentials to the server via a simple web interface. All communication with Red October is encrypted with TLS, ensuring that passwords are not sent in the clear.
How to use it
Setting up a Red October server is simple; all it requires is a locally-readable path and an SSL key pair. After that, all control is handled remotely through a set of JSON-based APIs.
Red October is backed by a database of accounts stored on disk in a portable password vault. The server never stores the account password there, only a salted hash of the password for each account. For each user, the server creates an RSA key pair and encrypts the private key with a key derived from the password and a randomly generated salt using a secure derivation function.
Any administrator can encrypt any piece of data with the encrypt API. This request takes a list of users and the minimum number of users needed to decrypt it. The server returns a somewhat larger piece of data that contains an encrypted version of this data. The encrypted data can then be stored elsewhere.
This data can later be decrypted with the decrypt API, but only if enough people have delegated their credentials to the server. The delegation API lets a user grant permission to a server to use their credentials for a limited amount of time and a limited number of uses.
Red October was designed from cryptographic first principles, combining trusted and understood algorithms in known ways. CloudFlare is also opening the source of the server to allow others to analyze its design.
Red October is based on combinatorial techniques and trusted cryptographic primitives. We investigated secret-sharing schemes like Shamir's Secret Sharing, but we found that a simpler combinatorial approach built on primitives from Go's standard library was preferable to implementing a mathematical algorithm from scratch. Red October uses 128-bit AES, 2048-bit RSA and scrypt as its cryptographic primitives.
Creating an account
Each user is assigned a unique, randomly-generated RSA key pair when creating an account on a Red October server. The private key is encrypted with a password key derived from the user’s password and salt using scrypt. The public key is stored unencrypted in the vault with the encrypted private key.
When asked to encrypt a piece of data, the server generates a random 128-bit AES key. This key is used to encrypt the data. For each user that is allowed to decrypt the data, a user-specific key encryption key is chosen. For each unique pair of users, the data key is doubly encrypted, once with the key encryption key of each user. The key encryption keys are then encrypted with the public RSA key associated with their account. The encrypted data, the set of doubly-encrypted data keys, and the RSA-encrypted key encryption keys are all bundled together and returned. The encrypted data is never stored on the server.
Delegating credentials to the server
When a user delegates their key to the server, they submit their username and password over TLS using the delegate JSON API. For each account, the password is verified against the salted hash. If the password is correct, a password key is derived from the password and used to decrypt the user’s RSA private key. This key is now “Live” for the length of time and number of decryptions chosen by the user.
To decrypt a file, the server validates that the requesting user is an administrator and has the correct password. If two users from the list of valid users have delegated their keys, decryption can occur: first each user's RSA private key is used to decrypt their key encryption key, then the two key encryption keys are used to decrypt the doubly-encrypted data key, which is finally used to decrypt the data.
Some other key points:
- Cryptographic security. The Red October server does not have the ability to decrypt user keys without their password. This prevents someone with access to the vault from decrypting data.
- Password flexibility. Passwords can be changed without changing the encryption of a given file. Key encryption keys ensure that password changes are decoupled from data encryption keys.
The version of Red October we are releasing on GitHub is in beta. It is licensed under the 3-clause BSD license. We plan to continue to release our improvements to the open source community. Here is the project on GitHub: Red October.
Writing the server in Go allowed us to design the different components of this server in a modular way. Our hope is this modularity will make it easy for anyone to build in support for different authentication methods that are not based on passwords (for example, TLS client certificates, time-based one-time-passwords) and new core cryptographic primitives (for example, elliptic curve cryptography).
CloudFlare is always looking to improve the state of security on the Internet. It is important to us to share our advances with the world and contribute back to the community. See the CloudFlare GitHub page for the list of our open source projects and initiatives.
The two reasons to be an engineer at CloudFlare | 12-11-13
If you're thinking about joining a startup as an engineer, we'd like you to think of CloudFlare. The two most important reasons are who your colleagues would be and who our customers are (and who their customers are).
It's the people, inside and outside the company, that make CloudFlare a fascinating and stimulating place to work.
Core to making CloudFlare work is rolling out features without eating up CPU or memory. That means optimizing everything so that we get the most out of our custom hardware. That requires engineers who like to get under the hood of the tools they are working with. We look for people with deep knowledge of their specialty and we hire them. If you know a lot about any of our technologies we'd like to talk to you.
Daily life at CloudFlare means working with people who know an enormous amount about different web and server technologies and who are happy to share their knowledge. We regularly speak at meetups, conferences and internal 'lunch and learn' meetings.
Hiring the smartest people means we have a wide range of ages and backgrounds: you'll meet people just a couple of years out of college and those for whom college is a distant memory; you'll meet people from all over the world. We also love to hire people who've worked on open source projects.
CloudFlare's customers are web site owners. And there are a lot of them. We currently handle about 5% of the Internet's web traffic and have major customers like imgur, Eurovision and MIT, and smaller customers like The Pumpkin Lady. Both the traffic we handle and our customer base are growing very rapidly.
Code you write will be used in one of the most dynamic environments possible and by a huge number of people. If there are bugs, they will be found! Roughly 750 million unique IP addresses touch our global network every month generating around 250 billion page views. We handle more traffic than Yahoo, Amazon, Instagram, eBay, AOL, Apple... combined. The code you write here makes a difference to our legion of web site owners and their customers.
Whether you are writing code to make DNS faster, change the way Internet compression is done, optimize the mobile experience, protect sites against hacker attacks, provide the fastest and most secure SSL service, roll out new protocols like SPDY or IPv6, or make our service the easiest to use both through the web or through an API, your code will have a huge impact.
The breadth of CloudFlare's customer base and visitor base means that we see all the world's networks, every web browser imaginable, all manner of attacks, and work with customers using every sort of web server. And with variety and reach come scale and scaling problems; as one small example of our scaling challenges: we generate around 60GB of log files per minute.
Talk to us
Interested in making the web faster, safer and easier for web site owners? Take a look at who we are looking for in San Francisco and London.
What we've been doing with Go | 11-11-13
Almost two years ago CloudFlare started working with Go. What started as an experiment on one network- and concurrency-heavy project has turned into full, production use of Go for multiple services. Today Go is at the heart of CloudFlare's services including handling compression for high-latency HTTP connections, our entire DNS infrastructure, SSL, load testing and more.
I first wrote about CloudFlare's use of Go back in July 2012. At that time there was only one CloudFlare project, Railgun, written in Go, but others were starting to germinate around the company. Today we have many major projects in Go. So, we celebrate Go's 4th birthday with a short list of interesting things we've written in Go.
RRDNS is a DNS proxy that was written to help scale CloudFlare's DNS infrastructure. Our DNS was already very fast and we wanted to make it faster, more reliable and resilient to attack.
RRDNS provides response rate limiting to stop DNS attacks, caching to lower the load on the database, load balancing to detect downed upstreams, seamless binary upgrades (with no service interruption), CNAME flattening and more. It is built on a modular framework that allows the implementation of each behaviour to exist in separately maintained modules.
This modularity was trivial to enforce with interfaces, allowing each module to remain strictly self-contained but retain the flexibility of implementation (some use cgo, others have background workers). A goroutine is dedicated to each request to force isolation where panics are recovered and logged instead of crashing the server.
The guarantees needed to avoid leaving the server in a bad state when handling panics would be impossible without the defer mechanism Go provides. Other language features reduced the complexity of writing new modules: garbage collection lets us avoid complex reference counting schemes and concentrate on the business logic. Managing concurrent requests' access to shared data was also easy, either by wrapping it in a goroutine or using read-write locks.
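The pattern can be sketched like this (the interface and toy module here are invented for illustration, not RRDNS's actual types):

```go
package main

import (
	"fmt"
	"log"
)

// Module is a sketch of the kind of interface that keeps each behaviour
// (rate limiting, caching, load balancing, ...) in its own self-contained
// module.
type Module interface {
	Handle(query string) (answer string, err error)
}

type echoModule struct{} // toy module for demonstration

func (echoModule) Handle(q string) (string, error) {
	if q == "" {
		panic("empty query") // a bug in a module...
	}
	return "answer for " + q, nil
}

// serve runs each request in its own goroutine; the deferred recover turns
// a panicking module into a logged error instead of a crashed server.
func serve(m Module, query string, results chan<- string) {
	go func() {
		defer func() {
			if r := recover(); r != nil {
				log.Printf("recovered from panic: %v", r)
				results <- "SERVFAIL"
			}
		}()
		ans, err := m.Handle(query)
		if err != nil {
			results <- "SERVFAIL"
			return
		}
		results <- ans
	}()
}

func main() {
	results := make(chan string, 2)
	serve(echoModule{}, "example.com", results)
	serve(echoModule{}, "", results) // triggers the panic path
	// The two results may arrive in either order.
	fmt.Println(<-results)
	fmt.Println(<-results)
}
```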
Railgun is CloudFlare's compression technology that's used to speed up connections between CloudFlare's data centers and origin web servers (especially when there is high latency between them). It's 100% written in Go and performs on the fly compression of web pages (and other textual assets) against recent versions of those pages to send the absolute minimum data possible.
Railgun also helps cache previously uncacheable assets (such as dynamically generated web pages) by spotting the often small changes between web pages over time, or the small changes between different users.
It is now widely used by CloudFlare customers and hosting partners, creating responsive web sites wherever the end user is in the world. It achieves impressive real-world speedups, including faster time to first byte and better page load times, plus bandwidth savings that translate into cost savings for people using services like AWS.
Red October is a cryptographically secure implementation of the two-man rule control mechanism. It is a software-based encryption and decryption server. The server allows authorized individuals to encrypt a payload in such a way that no one individual can decrypt it. To decrypt the payload, at least two authorized individuals must be logged into the server. The decryption of the payload is cryptographically tied to the login credentials of the authorized individuals and is only kept in memory for the duration of the decryption.
We'll be using Red October to secure very sensitive material (such as private encryption keys) so that no single CloudFlare employee (or single attacker) can get access to them. In coming months we'll write more about the cryptographic underpinnings of keeping CloudFlare secure.
The SSL Bundler allows us to take a customer's own SSL certificate and compute the fastest, shortest chain of intermediate certificates that can be used to verify the connection. When someone uploads a custom SSL certificate, we use our directory of intermediate certificates to build all the possible chains from the uploaded cert to a trusted browser root. We then rank these chains based on a number of factors including:
- The length of the certificate chain
- The ubiquity of the root certificate in browsers and other clients
- The security of each step in the chain (e.g., does its Extended Key Usage include Server Authentication)
- The length of the validity period of all the steps in the chain
The result is a server bundle that is small, fast and strong while having ubiquitous browser and client support. We then present that chain of certificates when an SSL connection is made so that the browser can securely verify the SSL connection as quickly as possible.
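A toy sketch of this kind of ranking (the fields and ordering rules here are illustrative guesses, not the bundler's real heuristics):

```go
package main

import (
	"fmt"
	"sort"
)

// chain is a hypothetical candidate intermediate chain.
type chain struct {
	certs     []string
	ubiquity  int  // how widely the root is trusted (higher is better)
	strongEKU bool // every step allows Server Authentication
}

// rank orders candidate chains: prefer chains where every step is strong,
// then more ubiquitous roots, then shorter chains.
func rank(chains []chain) {
	sort.SliceStable(chains, func(i, j int) bool {
		a, b := chains[i], chains[j]
		if a.strongEKU != b.strongEKU {
			return a.strongEKU
		}
		if a.ubiquity != b.ubiquity {
			return a.ubiquity > b.ubiquity
		}
		return len(a.certs) < len(b.certs)
	})
}

func main() {
	chains := []chain{
		{certs: []string{"leaf", "intA", "intB", "root1"}, ubiquity: 9, strongEKU: true},
		{certs: []string{"leaf", "intC", "root2"}, ubiquity: 9, strongEKU: true},
		{certs: []string{"leaf", "intD", "root3"}, ubiquity: 10, strongEKU: false},
	}
	rank(chains)
	fmt.Println(chains[0].certs) // the short, strong, ubiquitous chain wins
}
```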
And that's not all
We're also experimenting with Go-based software like CoreOS (we're working with their go-raft library) and Docker. We have internal tools for load testing written in Go for Kyoto Tycoon, HTTP servers, SPDY and memcached. And we've open-sourced our Go stream processing library, a Go wrapper for high-performance syslogging, our changes to Go bindings for Kyoto Cabinet, and a tool for 'curling' SPDY sites. And, of course, we've been contributing directly to Go itself.
All in all Go is one of the main languages in use at CloudFlare and from here it looks like it has a bright future.
Happy 4th Birthday Go!
(And, yes, we're hiring Go programmers in London and San Francisco)
Cloud-o-ween | 31-10-13
Over the weekend, I happened by a pumpkin carving contest in Monterey. One of the winning pumpkins was a low-relief scene of a pirate ship. The hull was created by carving the pumpkin in the traditional way. To make the sails, some of the pumpkin skin was cut away giving them a translucent effect. Looking at this, I thought to myself, “Hey, that would be really cool to do with the CloudFlare logo.” So I did.
As CloudFlare’s legal counsel, I spend a good deal of my time protecting the trademarks. Since our launch in 2010, the CloudFlare brand has come to stand for something more than just our products and services. The CloudFlare brand is an important symbol of our company’s efforts to build a better Internet. That mission is why I decided to carve my jack-o’-lantern with the CloudFlare logo.
CloudFlare has protected many websites from threats and specters. A quick glance at some of our customers reveals that we protect 215 websites with the word Halloween in their names from vampires, trolls, and other demons. We’ve also had 49 ‘scary’ websites, 18 ‘spooky’, 15 goblins, 10 ghouls, 577 ghosts, 25 haunted, and 2 trick-or-treats, but no jack-o’-lanterns.
If you’re the kind of person who can take pride in not just warding off bad spirits but also addressing the serious challenges of building a better Internet, join our team. CloudFlare is hiring.
Wishing you a safe and happy Halloween!
A (Relatively Easy To Understand) Primer on Elliptic Curve Cryptography | 24-10-13
Elliptic Curve Cryptography (ECC) is one of the most powerful but least understood types of cryptography in wide use today. At CloudFlare, we make extensive use of ECC to secure everything from our customers' HTTPS connections to how we pass data between our data centers.
Fundamentally, we believe it's important to be able to understand the technology behind any security system in order to trust it. To that end, we looked around to find a good, relatively easy-to-understand primer on ECC in order to share with our users. Finding none, we decided to write one ourselves. That is what follows.
Be warned: this is a complicated subject and it's not possible to boil down to a pithy blog post. In other words, settle in for a bit of an epic because there's a lot to cover. If you just want the gist, the TL;DR is: ECC is the next generation of public key cryptography and, based on currently understood mathematics, provides a significantly more secure foundation than first generation public key cryptography systems like RSA. If you're worried about ensuring the highest level of security while maintaining performance, ECC makes sense to adopt. If you're interested in the details, read on.
The dawn of public key cryptography
The history of cryptography can be split into two eras: the classical era and the modern era. The turning point between the two occurred in 1977, when both the RSA algorithm and the Diffie-Hellman key exchange algorithm were introduced. These new algorithms were revolutionary because they represented the first viable cryptographic schemes where security was based on the theory of numbers; they were the first to enable secure communication between two parties without a prior shared secret. Cryptography went from being about securely transporting secret codebooks around the world to being able to have provably secure communication between any two parties without worrying about someone listening in on the key exchange.
Whitfield Diffie and Martin Hellman
Modern cryptography is founded on the idea that the key that you use to encrypt your data can be made public while the key that is used to decrypt your data can be kept private. As such, these systems are known as public key cryptographic systems. The first, and still most widely used of these systems, is known as RSA — named after the initials of the three men who first publicly described the algorithm: Ron Rivest, Adi Shamir and Leonard Adleman.
What you need for a public key cryptographic system to work is a set of algorithms that is easy to process in one direction, but difficult to undo. In the case of RSA, the easy algorithm multiplies two prime numbers. If multiplication is the easy algorithm, its difficult pair algorithm is factoring the product of the multiplication into its two component primes. Algorithms that have this characteristic — easy in one direction, hard in the other — are known as Trapdoor Functions. Finding a good Trapdoor Function is critical to making a secure public key cryptographic system. Simplistically: the bigger the spread between the difficulty of going one direction in a Trapdoor Function and going the other, the more secure a cryptographic system based on it will be.
A toy RSA algorithm
The RSA algorithm is the most popular and best understood public key cryptography system. Its security relies on the fact that factoring is slow and multiplication is fast. What follows is a quick walk-through of what a small RSA system looks like and how it works.
In general, a public key encryption system has two components, a public key and a private key. Encryption works by taking a message and applying a mathematical operation to it to get a random-looking number. Decryption takes the random looking number and applies a different operation to get back to the original number. Encryption with the public key can only be undone by decrypting with the private key.
Computers don't do well with arbitrarily large numbers. We can make sure that the numbers we are dealing with do not get too large by choosing a maximum number and only dealing with numbers less than the maximum. We can treat the numbers like the numbers on an analog clock. Any calculation that results in a number larger than the maximum gets wrapped around to a number in the valid range.
In RSA, this maximum value (call it max) is obtained by multiplying two random prime numbers. The public and private keys are two specially chosen numbers that are greater than zero and less than the maximum value, call them pub and priv. To encrypt a number you multiply it by itself pub times, making sure to wrap around when you hit the maximum. To decrypt a message, you multiply it by itself priv times and you get back to the original number. It sounds surprising, but it actually works. This property was a big breakthrough when it was discovered.
To create an RSA key pair, first randomly pick the two prime numbers to obtain the maximum (max). Then pick a number to be the public key pub. As long as you know the two prime numbers, you can compute a corresponding private key priv from this public key. This is how factoring relates to breaking RSA — factoring the maximum number into its component primes allows you to compute someone's private key from the public key and decrypt their private messages.
Let's make this more concrete with an example. Take the prime numbers 13 and 7, their product gives us our maximum value of 91. Let's take our public encryption key to be the number 5. Then using the fact that we know 7 and 13 are the factors of 91 and applying an algorithm called the Extended Euclidean Algorithm, we get that the private key is the number 29.
These parameters (max: 91, pub: 5; priv: 29) define a fully functional RSA system. You can take a number and multiply it by itself 5 times to encrypt it, then take that number and multiply it by itself 29 times and you get the original number back.
Let's use these values to encrypt the message "CLOUD".
In order to represent a message mathematically we have to turn the letters into numbers. A common representation of the Latin alphabet is UTF-8. Each character corresponds to a number.
Under this encoding, CLOUD is 67, 76, 79, 85, 68. Each of these numbers is smaller than our maximum of 91, so we can encrypt them individually. Let's start with the first letter.
We have to multiply it by itself 5 times to get the encrypted value.
67×67 = 4489 = 30 *
*Since 4489 is larger than max, we have to wrap it around. We do that by dividing by 91 and taking the remainder.
4489 = 91×49 + 30
30×67 = 2010 = 8
8×67 = 536 = 81
81×67 = 5427 = 58
This means the encrypted version of 67 is 58.
Repeating the process for each of the letters we get that the encrypted message CLOUD becomes:
58, 20, 53, 50, 87
To decrypt this scrambled message, we take each number and multiply it by itself 29 times:
58×58 = 3364 = 88 (remember, we wrap around when the number is greater than max)
88×58 = 5104 = 8
…(and so on, for 29 multiplications in total)…
9×58 = 522 = 67
Voila, we're back to 67. This works with the rest of the digits, resulting in the original message.
The takeaway is that you can take a number, multiply it by itself a number of times to get a random-looking number, then multiply that number by itself a secret number of times to get back to the original number.
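The whole walkthrough above can be reproduced in a few lines of Go, multiplying by itself and wrapping around at the maximum, exactly as described:

```go
package main

import "fmt"

// modExp raises base to the power exp by repeated multiplication,
// wrapping around at max after every step — the toy RSA operation.
func modExp(base, exp, max int) int {
	result := 1
	for i := 0; i < exp; i++ {
		result = (result * base) % max
	}
	return result
}

func main() {
	const max, pub, priv = 91, 5, 29
	message := []int{67, 76, 79, 85, 68} // "CLOUD" in UTF-8

	var encrypted, decrypted []int
	for _, m := range message {
		encrypted = append(encrypted, modExp(m, pub, max))
	}
	for _, c := range encrypted {
		decrypted = append(decrypted, modExp(c, priv, max))
	}
	fmt.Println(encrypted) // [58 20 53 50 87]
	fmt.Println(decrypted) // [67 76 79 85 68]
}
```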
Not a perfect Trapdoor
RSA and Diffie-Hellman were so powerful because they came with rigorous security proofs. The authors proved that breaking the system is equivalent to solving a mathematical problem that is thought to be difficult to solve. Factoring is a very well known problem and has been studied since antiquity (see Sieve of Eratosthenes). Any breakthroughs would be big news and would net the discoverer a significant financial windfall.
"Find factors, get money" - Notorious T.K.G. (Reuters)
That said, factoring is not the hardest problem on a bit for bit basis. Specialized algorithms like the Quadratic Sieve and the General Number Field Sieve were created to tackle the problem of prime factorization and have been moderately successful. These algorithms are faster and less computationally intensive than the naive approach of simply trying every possible divisor.
These factoring algorithms get more efficient as the size of the numbers being factored gets larger. The gap between the difficulty of factoring large numbers and multiplying large numbers is shrinking as the number (i.e. the key's bit length) gets larger. As the resources available to decrypt numbers increase, the size of the keys needs to grow even faster. This is not a sustainable situation for mobile and low-powered devices that have limited computational power. The gap between factoring and multiplying is not sustainable in the long term.
All this means is that RSA is not the ideal system for the future of cryptography. In an ideal Trapdoor Function, the easy way and the hard way get harder at the same rate with respect to the size of the numbers in question. We need a public key system based on a better Trapdoor.
Elliptic curves: Building blocks of a better Trapdoor
After the introduction of RSA and Diffie-Hellman, researchers explored other mathematics-based cryptographic solutions looking for other algorithms beyond factoring that would serve as good Trapdoor Functions. In 1985, cryptographic algorithms were proposed based on an esoteric branch of mathematics called elliptic curves.
But what exactly is an elliptic curve and how does the underlying Trapdoor Function work? Unfortunately, unlike factoring — something we all had to do for the first time in middle school — most people aren't as familiar with the math around elliptic curves. The math isn't as simple, nor is explaining it, but I'm going to give it a go over the next few sections. (If your eyes start to glaze over, you can skip way down to the section: What does it all mean.)
An elliptic curve is the set of points that satisfy a specific mathematical equation. The equation for an elliptic curve looks something like this:
y² = x³ + ax + b
That graphs to something that looks a bit like the Lululemon logo tipped on its side:
There are other representations of elliptic curves, but technically an elliptic curve is the set of points satisfying an equation in two variables with degree two in one of the variables and degree three in the other. An elliptic curve is not just a pretty picture, it also has some properties that make it a good setting for cryptography.
Take a closer look at the elliptic curve plotted above. It has several interesting properties.
One of these is horizontal symmetry. Any point on the curve can be reflected over the x axis and remain the same curve. A more interesting property is that any non-vertical line will intersect the curve in at most three places.
Let's imagine this curve as the setting for a bizarre game of billiards. Take any two points on the curve and draw a line through them, it will intersect the curve at exactly one more place. In this game of billiards, you take a ball at point A, shoot it towards point B. When it hits the curve, the ball bounces either straight up (if it's below the x-axis) or straight down (if it's above the x-axis) to the other side of the curve.
We can call this billiards move on two points "dot." Any two points on a curve can be dotted together to get a new point.
A dot B = C
We can also string moves together to "dot" a point with itself over and over.
A dot A = B
A dot B = C
A dot C = D
It turns out that if you have two points, an initial point "dotted" with itself n times to arrive at a final point, finding out n when you only know the final point and the first point is hard. To continue our bizarro billiards metaphor, imagine one person plays our game alone in a room for a random period of time. It is easy for him to hit the ball over and over following the rules described above. If someone walks into the room later and sees where the ball has ended up, even if they know all the rules of the game and where the ball started, they cannot determine the number of times the ball was struck to get there without running through the whole game again until the ball gets to the same point. Easy to do, hard to undo: this is the basis for a very good Trapdoor Function.
Let's get weird
This simplified curve above is great to look at and explain the general concept of elliptic curves, but it doesn't represent what the curves used for cryptography look like.
For this, we have to restrict ourselves to numbers in a fixed range, like in RSA. Rather than allow any value for the points on the curve, we restrict ourselves to whole numbers in a fixed range. When computing the formula for the elliptic curve (y² = x³ + ax + b), we use the same trick of rolling over numbers when we hit the maximum. If we pick the maximum to be a prime number, the elliptic curve is called a prime curve and has excellent cryptographic properties.
Here's an example of a curve (y² = x³ − x + 1) plotted for all numbers:
Here's the plot of the same curve with only the whole number points represented with a maximum of 97:
This hardly looks like a curve in the traditional sense, but it is. It's like the original curve was wrapped around at the edges and only the parts of the curve that hit whole number coordinates are colored in. You can even still see the horizontal symmetry.
In fact, you can still play the billiards game on this curve and dot points together. The equation for a line on the curve still has the same properties. Moreover, the dot operation can be efficiently computed. You can visualize the line between two points as a line that wraps around at the borders until it hits a point. It's as if in our bizarro billiards game, when a ball hits the edge of the board (the max) then it is magically transported to the opposite side of the table and continues on its path until reaching a point, kind of like the game Asteroids.
With this new curve representation, you can take messages and represent them as points on the curve. You could imagine taking a message and setting it as the x coordinate, and solving for y to get a point on the curve. It is slightly more complicated than this in practice, but this is the general idea.
You get the points
(70,6), (76,48), -, (82,6), (69,22)
*There are no coordinates with 65 for the x value; this can be avoided in the real world
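You can check in Go that the listed points really do satisfy y² ≡ x³ − x + 1 (mod 97):

```go
package main

import "fmt"

// onCurve reports whether (x, y) satisfies y² ≡ x³ − x + 1 (mod 97),
// the toy prime curve used above.
func onCurve(x, y int) bool {
	const p = 97
	lhs := (y * y) % p
	rhs := ((x*x*x-x+1)%p + p) % p // the extra +p guards against negatives
	return lhs == rhs
}

func main() {
	points := [][2]int{{70, 6}, {76, 48}, {82, 6}, {69, 22}}
	for _, pt := range points {
		fmt.Println(pt, onCurve(pt[0], pt[1])) // each is a valid curve point
	}
}
```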
An elliptic curve cryptosystem can be defined by picking a prime number as a maximum, a curve equation and a public point on the curve. A private key is a number priv, and a public key is the public point dotted with itself priv times. Computing the private key from the public key in this kind of cryptosystem is called the elliptic curve discrete logarithm function. This turns out to be the Trapdoor Function we were looking for.
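Here is a sketch of the "dot" operation and of deriving a public key on the toy curve above. The vertical-line cases (the point at infinity) are ignored for simplicity, and real ECC uses standardized curves over very large primes, so treat this as an illustration of the idea rather than a usable implementation:

```go
package main

import "fmt"

// Toy curve y² = x³ − x + 1 over the integers mod 97, as plotted above.
const p = 97
const a = -1 // coefficient of x in the curve equation

func mod(n int) int { return ((n % p) + p) % p }

// inv computes a modular inverse via Fermat's little theorem (p is prime).
func inv(n int) int {
	result, base, e := 1, mod(n), p-2
	for e > 0 {
		if e&1 == 1 {
			result = result * base % p
		}
		base = base * base % p
		e >>= 1
	}
	return result
}

// dot adds two points (or doubles one) using the chord/tangent construction
// from the billiards analogy. Point-at-infinity cases are not handled.
func dot(x1, y1, x2, y2 int) (int, int) {
	var s int
	if x1 == x2 && y1 == y2 {
		s = mod((3*x1*x1 + a) * inv(2*y1)) // tangent slope (doubling)
	} else {
		s = mod((y2 - y1) * inv(x2-x1)) // chord slope
	}
	x3 := mod(s*s - x1 - x2)
	y3 := mod(s*(x1-x3) - y1)
	return x3, y3
}

func main() {
	// Base point G = (70, 6) — one of the points listed above.
	gx, gy := 70, 6
	priv := 5 // the private key: how many times we dot G with itself

	x, y := gx, gy
	for i := 1; i < priv; i++ {
		x, y = dot(x, y, gx, gy) // pub = G dotted with itself priv times
	}
	// The resulting point is the public key: easy to compute from priv,
	// hard to invert back to priv (the elliptic curve discrete logarithm).
	fmt.Println(x, y)
}
```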
What does it all mean?
The elliptic curve discrete logarithm is the hard problem underpinning elliptic curve cryptography. Despite almost three decades of research, mathematicians still haven't found an algorithm to solve this problem that improves upon the naive approach. In other words, unlike with factoring, based on currently understood mathematics there doesn't appear to be a shortcut that is narrowing the gap in a Trapdoor Function based around this problem. This means that for numbers of the same size, solving elliptic curve discrete logarithms is significantly harder than factoring. Since a more computationally intensive hard problem means a stronger cryptographic system, it follows that elliptic curve cryptosystems are harder to break than RSA and Diffie-Hellman.
To visualize how much harder it is to break, Lenstra recently introduced the concept of "Global Security." You can compute how much energy is needed to break a cryptographic algorithm, and compare that with how much water that energy could boil. This is a kind of cryptographic carbon footprint. By this measure, breaking a 228-bit RSA key requires less energy than it takes to boil a teaspoon of water. Comparatively, breaking a 228-bit elliptic curve key requires enough energy to boil all the water on earth. For this level of security with RSA, you'd need a key with 2,380 bits.
With ECC, you can use smaller keys to get the same levels of security. Small keys are important, especially in a world where more and more cryptography is done on less powerful devices like mobile phones. While multiplying two prime numbers together is easier than factoring the product into its component parts, when the prime numbers start to get very long, even the multiplication step can take some time on a low-powered device. While you could likely keep RSA secure by continuing to increase the key length, that comes at the cost of slower cryptographic performance on the client. ECC appears to offer a better tradeoff: high security with short, fast keys.
Elliptic curves in action
After a slow start, elliptic curve based algorithms are gaining popularity and the pace of adoption is accelerating. Elliptic curve cryptography is now used in a wide variety of applications: the U.S. government uses it to protect internal communications, the Tor project uses it to help assure anonymity, it is the mechanism used to prove ownership of bitcoins, it provides signatures in Apple's iMessage service, it is used to encrypt DNS information with DNSCurve, and it is the preferred method for authentication for secure web browsing over SSL/TLS. CloudFlare uses elliptic curve cryptography to provide perfect forward secrecy which is essential for online privacy. First generation cryptographic algorithms like RSA and Diffie-Hellman are still the norm in most arenas, but elliptic curve cryptography is quickly becoming the go-to solution for privacy and security online.
If you are accessing the HTTPS version of this blog (https://blog.cloudflare.com) from a recent enough version of Chrome or Firefox, your browser is using elliptic curve cryptography. You can check this yourself. In Chrome, you can click on the lock in the address bar and go to the connection tab to see which cryptographic algorithms were used in establishing the secure connection. Clicking on the lock in Chrome 30 should show the following image.
The portion of this text relevant to this discussion is ECDHE_RSA. ECDHE stands for Elliptic Curve Diffie-Hellman Ephemeral and is a key exchange mechanism based on elliptic curves. This algorithm is used by CloudFlare to provide perfect forward secrecy in SSL. The RSA component means that RSA is used to prove the identity of the server.
We use RSA because CloudFlare's SSL certificate is bound to an RSA key pair. Modern browsers also support certificates based on elliptic curves. If CloudFlare's SSL certificate was an elliptic curve certificate this part of the page would state ECDHE_ECDSA. The proof of the identity of the server would be done using ECDSA, the Elliptic Curve Digital Signature Algorithm.
CloudFlare's ECC curve for ECDHE (This is the same curve used by Google.com):
curve: y² = x³ + ax + b
a = 115792089210356248762697446949407573530086143415290314195533631308867097853948
b = 41058363725152142129326129780047268409114441015993725554835256314039467401291
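The a and b above appear to be the coefficients of the standardized NIST P-256 curve. Assuming that, the prime modulus p has a special binary form and a is simply p − 3, which is easy to check:

```python
# Assumption: the a and b shown above are the NIST P-256 coefficients.
# For P-256 the modulus has a special form and a = p - 3.
p = 2**256 - 2**224 + 2**192 + 2**96 - 1
a = 115792089210356248762697446949407573530086143415290314195533631308867097853948
b = 41058363725152142129326129780047268409114441015993725554835256314039467401291
assert a == p - 3                       # the familiar "a = -3" choice
assert (4 * a**3 + 27 * b**2) % p != 0  # the curve is non-singular
```

The a = p − 3 choice is popular because it speeds up some of the point arithmetic.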
The performance improvement of ECDSA over RSA is dramatic. Even with an older version of OpenSSL that does not have assembly-optimized elliptic curve code, an ECDSA signature with a 256-bit key is over 20x faster than an RSA signature with a 2,048-bit key.
On a MacBook Pro with OpenSSL 0.9.8, the "speed" benchmark returns:
Doing 256 bit sign ecdsa's for 10s: 42874 256 bit ECDSA signs in 9.99s
Doing 2048 bit private rsa's for 10s: 1864 2048 bit private RSA's in 9.99s
That's 23x as many signatures using ECDSA as RSA.
CloudFlare is constantly looking to improve SSL performance. Just this week, CloudFlare started using an assembly-optimized version of ECC that more than doubles the speed of ECDHE. Using elliptic curve cryptography saves time, power and computational resources for both the server and the browser helping us make the web both faster and more secure.
It is not all roses in the world of elliptic curves. There have been some questions and uncertainties that have held them back from being fully embraced by everyone in the industry.
One point that has been in the news recently is the Dual Elliptic Curve Deterministic Random Bit Generator (Dual_EC_DRBG). This is a random number generator standardized by the National Institute of Standards and Technology (NIST), and promoted by the NSA. Dual_EC_DRBG generates random-looking numbers using the mathematics of elliptic curves. The algorithm itself involves taking points on a curve and repeatedly performing an elliptic curve "dot" operation. After publication it was reported that it could have been designed with a backdoor, meaning that the sequence of numbers returned could be fully predicted by someone with the right secret number. Recently, the company RSA recalled several of their products because this random number generator was set as the default PRNG for their line of security products. Whether or not this random number generator was written with a backdoor does not change the strength of the elliptic curve technology itself, but it does raise questions about the standardization process for elliptic curves. As we've written about before, it's also part of the reason that attention should be paid to ensuring that your system is using adequately random numbers. In a future blog post, we will go into how a backdoor could be snuck into the specification of this algorithm.
Some of the more skeptical cryptographers in the world now have a general distrust for NIST itself and the standards it has published that were supported by the NSA. Almost all of the widely implemented elliptic curves fall into this category. There are no known attacks on these special curves, chosen for their efficient arithmetic; however, bad curves do exist and some feel it is better to be safe than sorry. There has been progress in developing curves with efficient arithmetic outside of NIST, including Curve25519, created by Daniel J. Bernstein (djb), and more recently computed curves by Paulo Barreto and collaborators, though widespread adoption of these curves is several years away. Until these non-traditional curves are implemented by browsers, they won't be able to be used for securing cryptographic transport on the web.
Another uncertainty about elliptic curve cryptography is related to patents. There are over 130 patents that cover specific uses of elliptic curves owned by BlackBerry (through their 2009 acquisition of Certicom). Many of these patents were licensed for use by private organizations and even the NSA. This has given some developers pause over whether their implementations of ECC infringe upon this patent portfolio. In 2007, Certicom filed suit against Sony for some uses of elliptic curves, however that lawsuit was dismissed in 2009. There are now many implementations of elliptic curve cryptography that are thought to not infringe upon these patents and are in wide use.
The ECDSA digital signature has a drawback compared to RSA in that it requires a good source of entropy. Without proper randomness, the private key could be revealed. A flaw in the random number generator on Android allowed hackers to find the ECDSA private key used to protect the bitcoin wallets of several people in early 2013. Sony's Playstation implementation of ECDSA had a similar vulnerability. A good source of random numbers is needed on the machine making the signatures. Dual_EC_DRBG is not recommended.
Even with the above cautions, the advantages of elliptic curve cryptography over traditional RSA are widely accepted. Many experts are concerned that the mathematical algorithms behind RSA and Diffie-Hellman could be broken within 5 years, leaving ECC as the only reasonable alternative.
Elliptic curves are supported by all modern browsers, and most certification authorities offer elliptic curve certificates. Every SSL connection for a CloudFlare protected site will default to ECC on a modern browser. Soon, CloudFlare will allow customers to upload their own elliptic curve certificates. This will allow ECC to be used for identity verification as well as securing the underlying message, speeding up HTTPS sessions across the board. More on this when the feature becomes available.
CloudFlare And Open Source Software: A Two-Way Street | 08-10-13
CloudFlare uses a great deal of open source and free software. Our core server platform is nginx (which is released using a two-clause BSD license) and our primary database of choice is PostgreSQL (which is released using their own BSD-like license). We've talked in the past about our use of Kyoto Tycoon (which is released under the GNU General Public License) and we've built many things on top of it.
And, of course, we make use of open source tools such as gcc, make, the Go programming language, Lua, Python, Perl, and PHP, and projects like nagios. And, naturally, we use Linux.
It would take a while to write down all the software that we use to build CloudFlare, but all that software has one thing in common: it's open source or free software. Our stack consists of either software we've built ourselves or an open source project (which we've sometimes modified).
Why Build On Open Source
It's probably obvious to most readers why we use open source software:
it's reliable, it's easy to modify and it's easy to maintain. But
there's another benefit that should not be overlooked: using and
working on open source software brings a great deal of job
satisfaction for programmers and it helps us hire the best.
We encourage our programmers to release changes they've made to open source software and to release projects through the CloudFlare GitHub account. At GitHub you'll find projects such as golog (a high-performance Go logging package), an implementation of MessagePack for Lua, a Python-based CNAME tool, and a macro language for systemtap, amongst others.
You'll also find the ngx_lua module, which embeds Lua in nginx. That's not something CloudFlare initially wrote, but we make such extensive use of it that we hired its author, Yichun Zhang. He continues to work full-time on it while at CloudFlare.
And, if you've ever delved into the internals of nginx, you'll know another CloudFlare employee, Piotr Sikora, who recently added the ability to set keys for TLS session tickets.
So, at CloudFlare, open source can get you a job, be your job or, at least, be a significant part of your job.
Where appropriate (i.e. where we think we make the biggest impact and get something we need) we've sponsored external open source projects and paid for improvements that all can use.
We make wide use of the excellent LuaJIT project, and after much profiling by our engineers we discovered areas where more JITing would improve our performance. Rather than do the work ourselves, we sponsored improvements to the LuaJIT project. These speedups will be appearing in LuaJIT 2.1 when it is released.
Two-Way Street
Of course, it would be easy for us to use open source software, make modifications and not release them. None of the licenses for the software we use force us to release our modifications. But we prefer to give back, and not just because of karma.
There are two big advantages to releasing modifications we've made to existing projects: the many eyeballs effect and reducing fork cost.
The first, many eyeballs, is common to any open source project: the more people look at code, the better it gets. And that applies equally to code written by a core team of developers and code written by outsiders. When we contribute changes we've made, others look at the changes and improve them.
For example, back in 2012 we contributed an improvement to one of Go's modules. This year that work was improved upon by others.
And the cost of maintaining a fork provides useful economic pressure that makes releasing our modifications the sensible choice. It's cheaper for us to release than to maintain a fork and merge as the core of a project evolves.
But, what about CloudFlare's secret sauce?
Open Sourcing CloudFlare Core
Our strong bias is to open source everything we've built. When we don't, it's usually because it's highly specific to us and/or because the support cost is high. Ultimately, we'd like to open source all our major components so that they can be used to build a faster, safer, better web.
Many of our smaller components are really glue code that doesn't make any sense to open source because it is so specific to our implementation of the overall system.
We don't believe that there is any chunk of code so clever that it gives us a long-term competitive advantage. Instead, our advantage comes from the network we've built, the data we collect on making the web faster and safer, and, most importantly, the people we're able to hire.
A commitment to open source builds trust in the community, which helps us continue to build our service and attract the best people.
In fact, the best way to get hired at CloudFlare is to make good contributions to open source projects we find interesting and useful. Contributions like that often speak more loudly than a resume.
How I created the viral sensation: isthegovernmentopen.com | 04-10-13
The following is a guest blog post by Michael Tomko, Production Director at The Able Few. Michael has been using CloudFlare for a number of his projects for the last few years.
A few years ago, amidst the final crunch of a project deadline, a friend and former colleague looked me directly in the eye and said, "It's like I don't even know how to build a website anymore." He was of course referring to the combination of configuration, tuning, testing, and other tedious tasks that are necessary to deploy an otherwise simple website in a world of CMS's, responsive images, and web fonts. That comment has stuck with me ever since because as a developer I often find it very difficult to stand up to the pressure of having to use the latest libraries and techniques, optimized to the hilt, just to do something very rudimentary.
Flash-forward to the night of September 30th...Having just finished watching The Voice — the TV was turned up way louder than usual — my wife and I found ourselves under siege by the roar of TV news personalities bickering about the imminent government shutdown. This obviously led to us discussing what the shutdown really meant and many comments about how ridiculous it all seemed.
I suddenly remembered the silly, single-serve websites like Is It Christmas or Is Mitt Romney The President and the idea popped into my head to see if there were any for the #shutdown. A few minutes later, I Instagrammed a screenshot of the Hover.com results for "isthegovernmentopen.com" showing that it was in fact available. This immediately led to a smattering of "Do It!" comments and the normal "You're Insane" glare from my wife, so of course I had to go for it.
Finding the perfect GIF
There was a big part of me though that wasn't going to do it. I really didn’t think that the typical giant "NO" was going to make it worth the time and if I was going to buy yet ANOTHER domain, it had to be worth it. That's when I re-stumbled upon this amazing, near-perfect loop of President Obama being locked out in the White House rose garden.
You see, I have a few other "giant GIF" websites — Bummersauce.com and HowMuchIsMoney.com — which were created to be the perfect one-liner response to someone complaining on social media, and this GIF of the President was the absolute best response to people posting comments about the shutdown. Plus, I think that it brought a fairly unique sense of bipartisan levity to the situation.
Quick and easy
But, this is where I had to make myself comfortable with hitting the "Easy Button", so-to-speak, and to not waste a bunch of time over-engineering the thing. I didn't need a CMS. I didn't need any special fonts. I just needed a single HTML file, a few lines of CSS, and the GIF.
Within something like 30 minutes from the time that my initial Instagram went up, I had acquired the domain, built the site, run 2 simple commands to publish it to GitHub Pages, and had it on CloudFlare's worldwide CDN. IsTheGovernmentOpen.com was built and out the door in a matter of minutes.
Originally, I thought that it would get a few hundred hits and be passed around by my friends here in St. Louis. But, by time that I went to bed that night, it already had 3,500 page views. I woke up to another 3,000 and something like 7,000 came in just during my morning commute. Immediately I knew that this was bigger than I could have expected and I was basically waiting for GitHub to call and tell me that they were shutting me/it down.
Well, at about 11:30am UTC, New York Magazine included the site in their government shutdown liveblog and traffic soared to something like 1,200 concurrent users — and stayed that way for over 6 hours. But GitHub never called and the site stayed up.
Throughout the first two days, IsTheGovernmentOpen.com survived being simultaneously linked to by NY Magazine, Mashable, The Verge, Pophangover, the Harvard Nieman Lab, Vice, and nearly every funny GIF site imaginable; two separate posts staying on the front page of Reddit for almost 24 hours; being shared on Facebook 5,177 times; and 3,969 tweets from many highly influential Twitter users, including Anonymous.
How CloudFlare helped
Up until this week, CloudFlare had been the simple DNS provider and no-assembly-required CDN that I would use to keep my annual event's site up on commodity servers, the go-to recommendation for our clients when setting up DNS for their projects, and the cool suggestion as a starting place for my peers if they were complaining about issues with their site's performance. But, throughout the onslaught that IsTheGovernmentOpen.com threw at my free CloudFlare and GitHub accounts this week, this is where I truly realized the power of CloudFlare.
It sort of felt like that scene from the X-Men 2 movie when Jean Grey turned into the Phoenix to save the rest of the X-Men from being swallowed by a giant tidal wave. CloudFlare quite literally had my back and was taking quite a punch on my behalf.
All in all, this site never would have happened if CloudFlare — and GitHub — didn't provide such simple-to-use services that were also worth using, even on the free tier. Not only did they provide a very snappy and robust experience for my users but they also kept the site up for the entire duration of giant traffic spikes.
At The Able Few, pretty much none of our projects are just simple single-serve, GIF-based websites. In fact, we don’t really even build websites at all. Working with many early-stage startups — like Fizziology, Click With Me Now, and Tunespeak — we build a lot of very complicated and seemingly never-been-done-before type of applications. It is NOT our desire to spend time setting up and maintaining the things that CloudFlare provides with just a few clicks. And, if time is money, then CloudFlare basically pays you to use their service. I like that.
If you haven't yet signed up for an account, you can be up and running in a matter of minutes and will immediately see a benefit, but if you already have an account, well then you probably know what I'm talking about.
Patching a WHMCS zero day on day zero | 03-10-13
A critical zero-day vulnerability was published today affecting any hosting provider using WHMCS. As part of building a safer web, CloudFlare has added a ruleset to our Web Application Firewall (WAF) to block the published attack vector. Hosting partners running their WHMCS behind CloudFlare's WAF can enable the WHMCS Ruleset and implement best practices to be fully protected from the attack.
Our friends at WHMCS quickly published a patch here: http://blog.whmcs.com/?t=79427
CloudFlare recommends applying the patch for your current version of WHMCS or updating WHMCS to version 5.2.8 to close this vulnerability.
Ensuring Randomness with Linux's Random Number Generator | 03-10-13
When building secure systems, having a source of random numbers is essential. Without them, most cryptographic systems break down and the privacy and authenticity of communications between two parties can be subverted. For example, if you’re reading this using a link to https://blog.cloudflare.com then the SSL connection you are using will have required random numbers to ensure its security (they were used as part of the establishment of the secure connection).
We’ve covered why secure systems require random numbers in a previous blog post, but getting random numbers from a computer is very hard. This blog post looks at Linux’s internal random number generator and how it overcomes the problem of generating random numbers on a machine that’s anything but random.
CloudFlare’s servers require a good source of random numbers for authentication and to assure perfect forward secrecy in SSL. But, internally, the computers we all use are deterministic machines that follow instructions and are required to do so in a predictable manner. Uncertainty and unpredictability are not built in: there is no easy way to tell a computer to go flip a coin or roll some dice. To get randomness in a computer it has to be looked for in the outside world.
Consumer computers and mobile devices have a number of sensors that provide unpredictable input. The timing of keystrokes and mouse movements of a user will have some degree of randomness if measured closely enough. Noise from microphones and cameras can also provide a lot of randomness. Mobile devices have even more sources including fluctuating wifi signals, motion sensors and GPS information.
Most of these sensors are not available on servers where random numbers are needed most. This is especially true for servers that run in virtualized environments that might not have access to a precise system clock. For CloudFlare’s servers, we currently rely on the random number generator built into the Linux operating system.
Linux is one of the most popular operating systems in the world. It serves as the operating system for everything from the web servers and data centers of many of the largest sites in the world (Google, Facebook, Amazon, Apple, etc.), to desktop computers (Ubuntu, Chrome OS, etc.), to embedded devices (smart TVs, Android, etc.). CloudFlare’s software is built on the solid foundation of the Linux operating system kernel.
Linux itself provides a random number service so that any program has access to random numbers at any time. Luckily for us, Linux is open source software, so we can learn how it works by reading the code and verify that it provides a suitable source of random numbers for our cryptographic purposes.
Entropy and Randomness
Not all randomness is created equally. There are two sorts of randomness to think about: uniformity and unpredictability. A random number generator provides ‘uniform’ output if all numbers will come up equally often if run long enough. That’s useful for modeling random processes, but not good enough for security.
For computer security, random numbers need to be hard to guess: they need to be unpredictable. The predictability of numbers is quantified in a measure called entropy.
If a fair coin is tossed, it provides one bit of entropy: the coin lands with equal probability on heads or tails (which can be thought of as 0 and 1), and because the probabilities are equal there’s no predictability in the coin’s ‘output’.
An unfair coin toss provides less than one bit, since it’s much easier to guess when you know the bias. Flipping a coin with heads on both sides provides no entropy, since the result of a coin toss can be guessed with absolute certainty.
Entropy is distinct from statistical randomness. Looking at the statistical properties of a stream of numbers does not guarantee that the stream contains any entropy. For example, the digits of pi look random by almost any statistical measure, but contain no entropy since there is a well known formula to calculate them and perfectly predict the next value. (As an aside, pi is believed to be a normal number: one where all digits appear in equal proportions, though this remains unproven.)
Also, large numbers do not always have high entropy. You can take a small random number and turn it into a large random number and the entropy remains the same. For example, take a random number from 1 to 16 and compute its cryptographic hash with an algorithm like SHA-1. The resulting 160 bit number looks very random, but it is one of only 16 possible such numbers. Guessing the number is just as easy as guessing a random number from 1 to 16. It’s the size of the pool from which random numbers are drawn that matters.
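A short sketch of that point: hashing a number from 1 to 16 yields an impressive-looking 160 bit value, but the space an attacker has to search is still only 16 possibilities:

```python
# Hashing a small random number produces a long, random-looking digest,
# but the entropy is unchanged: there are still only 16 possibilities.
import hashlib
import random

n = random.randint(1, 16)
digest = hashlib.sha1(str(n).encode()).hexdigest()  # a 160-bit value
print(digest)

# An attacker can enumerate the entire space trivially:
candidates = {hashlib.sha1(str(i).encode()).hexdigest() for i in range(1, 17)}
assert digest in candidates and len(candidates) == 16
```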
For cryptographic keys, the amount of entropy used to create them is tied to how hard they are to guess. A 128 bit key created from a source with 20 bits of entropy is no more secure than a 20 bit key. A good source of entropy is necessary to create secure keys.
Take a dip in the pool
On Linux, the root of all randomness is something called the kernel entropy pool. This is a large (4,096 bit) number kept privately in the kernel’s memory. There are 2⁴⁰⁹⁶ possibilities for this number so it can contain up to 4,096 bits of entropy. There is one caveat - the kernel needs to be able to fill that memory from a source with 4,096 bits of entropy. And that’s the hard part: finding that much randomness.
The entropy pool is used in two ways: random numbers are generated from it and it is replenished with entropy by the kernel. When random numbers are generated from the pool the entropy of the pool is diminished (because the person receiving the random number has some information about the pool itself). So as the pool’s entropy diminishes as random numbers are handed out, the pool must be replenished.
Replenishing the pool is called stirring: new sources of entropy are stirred into the mix of bits in the pool.
This is the key to how random number generation works on Linux. If randomness is needed, it’s derived from the entropy pool. When available, other sources of randomness are used to stir the entropy pool and make it less predictable. The details are a little mathematical, but it’s interesting to understand how the Linux random number generator works as the principles and techniques apply to random number generation in other software and systems.
The kernel keeps a rough estimate of the number of bits of entropy in the pool. You can check the value of this estimate through the following command:
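On a typical Linux system the estimate is exposed through procfs, so the check is a one-liner:

```shell
# Read the kernel's current entropy estimate, in bits.
cat /proc/sys/kernel/random/entropy_avail
```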
A healthy Linux system with a lot of entropy available will return a value close to the full 4,096 bits. If the value returned is less than 200, the system is running low on entropy.
The kernel is watching you
I mentioned that the system takes other sources of randomness and uses this to stir the entropy pool. This is achieved using something called a timestamp.
Most systems have precise internal clocks. Every time that a user interacts with a system, the value of the clock at that time is recorded as a timestamp. Even though the year, month, day and hour are generally guessable, the millisecond and microsecond are not, and therefore the timestamp contains some entropy. Timestamps obtained from the user’s mouse and keyboard along with timing information from the network and disk each have different amounts of entropy.
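A sketch of why timestamps help: the high-order parts of a clock reading are guessable, but the lowest bits of a nanosecond-resolution reading are much harder to predict:

```python
# The high bits of a timestamp (year, day, hour) are easy to guess;
# the lowest bits of a nanosecond-resolution reading are not.
import time

t = time.time_ns()
print("full timestamp:", t)
print("low 8 bits:    ", t & 0xFF)  # the hard-to-guess part
```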
How does the entropy found in a timestamp get transferred to the entropy pool? Simple, use math to mix it in. Well, simple if you like math.
Just mix it in
A fundamental property of entropy is that it mixes well. If you take two unrelated random streams and combine them, the new stream cannot have less entropy. Taking a number of low entropy sources and combining them results in a high entropy source.
All that’s needed is the right combination function: a function that can be used to combine two sources of entropy. One of the simplest such functions is the logical exclusive or (XOR). This truth table shows how bits x and y coming from different random streams are combined by the XOR function.
Even if one source of bits does not have much entropy, there is no harm in XORing it into another source. Entropy always increases. In the Linux kernel, a combination of XORs is used to mix timestamps into the main entropy pool.
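A sketch of the mixing property using Python’s secrets module: XORing a completely predictable stream into a random one loses nothing, because XOR is invertible:

```python
# Mixing a zero-entropy stream into a random one with XOR cannot reduce
# the randomness of the result.
import secrets

biased = bytes([0x41] * 16)           # a fully predictable "source"
random_bytes = secrets.token_bytes(16)
mixed = bytes(a ^ b for a, b in zip(biased, random_bytes))

# XOR is invertible, so mixed is exactly as unpredictable as random_bytes:
recovered = bytes(a ^ b for a, b in zip(mixed, biased))
assert recovered == random_bytes
```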
Generating random numbers
Cryptographic applications require very high entropy. If a 128 bit key is generated with only 64 bits of entropy then it can be guessed in 2⁶⁴ attempts instead of 2¹²⁸ attempts. That is the difference between needing a thousand computers running for a few years to brute force the key versus needing all the computers ever created running for longer than the history of the universe to do so.
Cryptographic applications require close to one bit of entropy per bit. If the system’s pool has fewer than 4,096 bits of entropy, how does the system return a fully random number? One way to do this is to use a cryptographic hash function.
A cryptographic hash function takes an input of any size and outputs a fixed size number. Changing one bit of the input will change the output completely. Hash functions are good at mixing things together. This mixing property spreads the entropy from the input evenly through the output. If the input has more bits of entropy than the size of the output, the output will be highly random. This is how highly entropic random numbers are derived from the entropy pool.
The hash function used by the Linux kernel is the standard SHA-1 cryptographic hash. By hashing the entire pool and performing some additional arithmetic, 160 random bits are created for use by the system. When this happens, the system lowers its estimate of the entropy in the pool accordingly.
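A toy version of the extraction step (the real kernel code also folds the hash back into the pool and performs further mixing):

```python
# Hash a large pool down to 160 bits of output. In this sketch the pool
# is all zeros; in the kernel it is a secret, constantly stirred buffer.
import hashlib

pool = bytes(512)                     # a 4,096-bit pool
output = hashlib.sha1(pool).digest()  # 160 bits handed to the caller
assert len(output) * 8 == 160
```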
As noted above, the output of a hash like SHA-1 is only as unpredictable as the entropy fed into it. That’s why it’s critical to keep an eye on the available system entropy: if it drops too low, the output of the random number generator could have less entropy than it appears to have.
Running out of entropy
One of the dangers of a system is running out of entropy. When the system’s entropy estimate drops to around the 160 bit level, the length of a SHA-1 hash, things get tricky, and how this affects programs and performance depends on which of the two Linux random number generators is used.
Linux exposes two interfaces for random data that behave differently when the entropy level is low. They are /dev/random and /dev/urandom. When the entropy pool becomes predictable, both interfaces for requesting random numbers become problematic.
When the entropy level is too low, /dev/random blocks and does not return until the level of entropy in the system is high enough. This guarantees high entropy random numbers. If /dev/random is used in a time-critical service and the system runs low on entropy, the delays could be detrimental to the quality of service.
On the other hand, /dev/urandom does not block. It continues to return the hashed value of its entropy pool even though there is little to no entropy in it. This low-entropy data is not suited for cryptographic use.
The solution to the problem is to simply add more entropy into the system.
Hardware random number generation to the rescue?
Intel’s Ivy Bridge family of processors has an interesting feature called “Secure Key.” These processors contain a special piece of hardware that generates random numbers. The single assembly instruction RDRAND returns allegedly high-entropy random data derived on the chip.
It has been suggested that Intel’s hardware number generator may not be fully random. Since it is baked into the silicon, that assertion is hard to audit and verify. As it turns out, even if the numbers generated have some bias, it can still help as long as this is not the only source of randomness in the system. Even if the random number generator itself had a back door, the mixing property of randomness means that it cannot lower the amount of entropy in the pool.
On Linux, if a hardware random number generator is present, the Linux kernel will use the XOR function to mix the output of RDRAND into the hash of the entropy pool. This happens here in the Linux source code (the XOR operator is ^ in C).
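A toy sketch of why this XOR step is safe: if either input is uniformly random, the XOR of the two is too, so even a worst-case hardware source (here, an imagined back-doored RNG returning all zeros) cannot reduce the entropy already in the pool. The byte values below are stand-ins, not real kernel data:

```go
package main

import "fmt"

// xorInto mixes src into dst in place, the way the kernel folds
// RDRAND output into the hash of its entropy pool.
func xorInto(dst, src []byte) {
	for i := range dst {
		dst[i] ^= src[i]
	}
}

func main() {
	pool := []byte{0x8f, 0x1c, 0xa2, 0x47}   // stand-in for hashed pool output
	hwrand := []byte{0x00, 0x00, 0x00, 0x00} // worst case: a degenerate hardware RNG

	out := append([]byte(nil), pool...)
	xorInto(out, hwrand)

	// Even against a degenerate second source, the result is no less
	// random than the pool bytes alone.
	fmt.Printf("pool %x xor hw %x = %x\n", pool, hwrand, out)
}
```

XOR with a constant is a bijection on byte values, so a biased or even adversarial second source can shift the output but never collapse its distribution.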
Third party entropy generators
Hardware random number generation is not available everywhere, and the sources of randomness polled by the Linux kernel itself are somewhat limited. For these situations, a number of third-party random number generation tools exist. Examples include haveged, which relies on processor cache timing, and audio-entropyd and video-entropyd, which work by sampling the noise from an external audio or video input device. By mixing these additional sources of locally collected entropy into the Linux entropy pool, the entropy can only go up.
A diversity of sources
The main thing to understand is that better randomness comes through diversity. Taking a variety of sources of random data and mixing them together results in better random numbers. For servers, this should include data local to the machine (hardware random number generator, network timing) along with sources derived externally in a safe location.
In addition to the sources described above, there are many sources of random numbers to be harvested. These include lava lamps, space noise and the quantum properties of light. CloudFlare is working on a system to ensure high quality random numbers to all of our servers by adding new sources into the system Linux currently provides. As these systems come online over the coming months, we will share the details with the community.
CloudFlare: Happy 3rd Birthday | 27-09-13
Today is CloudFlare's birthday. We opened to the public exactly three years ago today, September 27, 2010. In those three years we've grown to power more than 1.5 million websites and sit in front of more than 4% of all web requests. Three years ago, however, those first signups were terrifying.
Early on we realized that if we were going to bring CloudFlare's services to as broad an audience as possible we needed to focus on making the signup process ridiculously easy. Traditional CDNs typically require months of integration work and, often, thousands of dollars in professional consulting services to get up and running. We spent most of our engineering time designing a signup process that took less than 5 minutes to complete on average.
While we were convinced that would be a long term advantage, as our launch grew near we started to worry: what would happen if hundreds, or even thousands, of websites signed up? We only had a limited number of servers running in a handful of data centers around the world (Chicago, San Jose, Ashburn, Amsterdam, and Tokyo).
We debated internally how to deal with this problem. About a month before launch, we decided to create an invite system: people would sign up with their email and we'd let them in as we had capacity. Unfortunately, the Monday morning of our launch we realized no one on the team had gotten around to building the invite system.
As Michelle and I practiced our presentation backstage at the TechCrunch Disrupt conference, where we were scheduled to launch on stage, Lee sat in the audience and hastily coded together a system of "circuit breakers" that would limit new signups. Every hundred signups or so the circuit breakers would trip and cause the signup form to shut down. This would allow us time to check all the systems before opening up again for new customers.
The signups started within the first minute of our Disrupt presentation. About halfway through we'd already signed up more than 100 customers, the first circuit breaker tripped, and our engineering team was alerted to check on the health of the servers. Unexpectedly, signups continued. In the haste to get the circuit breakers in place, the piece of code to shut down the signup form never went live.
As the judges quizzed Michelle and me on stage about our business plan and technical systems, our engineering team sat in the audience frantically auditing our systems and trying to decide whether to shut down signups manually. By the time the judges finished questioning us, we'd signed up about 500 sites and there was no sign that the pace was slowing down. The TechCrunch launch article went live and there was another surge.
Full Steam Ahead
Sri, who was our first and, at the time, only member of our ops team, was checking all the servers to find any hot spots. Our traffic across the network had spiked from virtually nothing just a few minutes before to hundreds of requests per second. But the system appeared to be holding up. Lee made the call to allow people to keep signing up and didn't fix the circuit breakers.
By the end of the day, we'd signed up just shy of 2,000 new sites -- almost 10 times what we'd originally anticipated. The core system we'd spent the previous 9 months building held up. None of us slept much that first night as we were all logged in to our dashboards and watched the health of the systems. New signups slowed down a bit but traffic only compounded. Lee called me sometime Tuesday afternoon and said, "Ok, I've looked everything over and I'm confident what we've built will continue to scale for at least the next 2 to 3 years. After that, we'll have to redesign some things, but until then I don't think we'll ever have to turn away a new customer, so I killed the circuit breakers."
Hitting Our Stride
Today we sign up as many as 5,000 new customers a day. And, as Lee correctly predicted, the core system has held up very well. About a year ago we conducted a full survey of CloudFlare's core systems. Based on that we embarked on an effort to rearchitect all the systems where we were beginning to see bottlenecks. That effort is largely finished and is beginning to bear fruit with a new flexibility and extensibility for more features, faster performance, and a dramatic expansion of our network.
While we've had an amazing third year, I predict you'll be blown away by what we have in store over the next twelve months. But, today, at the CloudFlare office our team will be taking a short break to celebrate our birthday.