Upcoming Meetups at CloudFlare | 22-04-14
At CloudFlare, we love connecting with our communities, and so we are excited to announce two meetups to be hosted here at the CloudFlare headquarters in San Francisco next month.
All Things Crypto - 5/8/2014
On Thursday, May 8, Nick Sullivan from the security engineering team at CloudFlare will host a meetup with several cryptography industry experts. The meetup will consist of three speakers giving quick 15-20 minute presentations followed by Q&A. Speakers and topics include:
- Trevor Perrin, who will discuss "next-generation" protocols for end-to-end security in applications like chat, text messaging, and email
- Brian Warner, who will discuss Firefox Sync
- Michael Hamburg, who will discuss new developments in Elliptic Curve Cryptography
If you are interested in anything and everything about cryptography, then this meetup is for you! Doors open at 6:00pm PDT. RSVP on the CloudFlare Meetup page.
GoSF - 5/21/2014
GoSF is a meetup group that has been around since 2011 and caters to those who program in Go. On Wednesday, May 21, CloudFlare will be hosting GoSF’s Go Session 2 - Programming in Go (for Experienced Devs). Go Sessions are part of a program designed by GoSF members to provide a guided pair-programming experience to increase knowledge of using Go in production. Experienced Go developers will lead coding exercises that will run you through Go compiling, dependency management, program structure, concurrency, and more. Interested in learning more? RSVP on the GoSF meetup page.
In addition to the networking and learning these meetups offer, CloudFlare provides pizza and beverages for all attendees. We hope to see some of you at our next event!
Improving vulnerability disclosure for researchers | 21-04-14
Trust, transparency, and collaboration are values we hold dear at CloudFlare. As a web security and performance company, we are always interested in how we can make our service and our infrastructure more secure. We also know that the security researcher community can help us achieve results more quickly and more effectively than we could on our own; witness the results of our CloudFlare Heartbleed challenge. We appreciate the work that security researchers worldwide have done in helping us build a better Internet, and we want to make it even easier for them to collaborate with us. Today, we are announcing CloudFlare’s new vulnerability disclosure program.
The new vulnerability disclosure program, facilitated by HackerOne’s bug reporting platform, makes it easy to report a vulnerability you have discovered, track our progress in addressing it, and understand when it has been fixed. When we’ve fixed an eligible bug you have reported, we will recognize you publicly on our Hall of Fame page and reward you with a CloudFlare ‘Venator Errorum’ t-shirt.
Close-up of 'Venator Errorum' design on the limited edition t-shirt
This bug hunter T-shirt is a limited edition and will only be available to the exclusive group of vulnerability reporters who submit an accepted bug. It is such an exclusive shirt that not even CloudFlare employees will be given one without an eligible vulnerability submission. We will also provide you with 12 months of Pro or 1 month of Business service for free if you have a domain you would like CloudFlare to make safer and faster.
We spent a lot of time considering the best way for us to manage a vulnerability reporting program, including evaluating several crowd-sourced solutions. We chose to partner with HackerOne to power this program because not only have they streamlined the disclosure process, but we also agree with their vulnerability disclosure philosophy. They have also partnered with Nginx, PHP, Yahoo, OpenSSL and a range of security-minded companies.
Previously, we did not have a dedicated external reporting channel for vulnerabilities. We realized having a formal program would improve responsiveness to vulnerability reporters and provide more transparency to the researcher community.
If this program piques your interest, please read through our vulnerability disclosure policy. To report a vulnerability, please visit our program on HackerOne.
If you are a startup struggling with what is the best way to develop a vulnerability disclosure policy and program for your organization, feel free to reach out to us at email@example.com, and we will share our experiences and insight.
The Hidden Costs of Heartbleed | 17-04-14
A quick followup to our last blog post on our decision to reissue and revoke all of CloudFlare's customers' SSL certificates. One question we've received is why we didn't just reissue and revoke all SSL certificates as soon as we got word about the Heartbleed vulnerability? The answer is that the revocation process for SSL certificates is far from perfect and imposes a significant cost on the Internet's infrastructure.
Today, after having done a mass reissuance and revocation, we have a tangible sense of that cost. To understand it, you need to understand a bit about how your browser checks if an SSL certificate has been revoked.
OCSP & CRL
When most browsers visit a web page over HTTPS, they perform a check using one of two certificate revocation methods: Online Certificate Status Protocol (OCSP) or Certificate Revocation List (CRL). For OCSP, the browser queries the certificate authority (CA) and asks whether a particular site's certificate has been revoked. For CRL, the browser downloads from the CA a complete list of all the certificates that CA has revoked.
There are pluses and minuses to both systems. OCSP imposes a lighter bandwidth cost, but a higher number of requests and backend lookups. CRL doesn't generate as many requests, but, as the CRL becomes large, can impose a significant bandwidth burden. These costs are borne by visitors to websites, whose experience will be slower as a result, but even more so by the CAs who need significant resources in place to handle these requests.
Technical Costs of Revocation
Yesterday, CloudFlare completed the process of reissuing all the SSL certificates we manage for our customers. Once that was complete, we revoked all previously used certificates. You can see the spike in global CRL activity we generated:
What you can't see is the spike in bandwidth that imposed. Globalsign, who is CloudFlare's primary CA partner, saw their CRL grow from approximately 22KB on Monday to approximately 4.7MB. The activity of browsers downloading the Globalsign CRL generated around 40Gbps of net new traffic across the Internet. If you assume that the global average price for bandwidth is around $10 per Mbps per month, then that 40Gbps (40,000Mbps) of CRL traffic alone would have added roughly $400,000 USD to Globalsign's monthly bandwidth bill.
Lest you think that’s an overestimate, we also ran the numbers using AWS’s CloudFront price calculator, with a mix of traffic across regions that approximates what we see at CloudFlare. The total cost to Globalsign, if they were using AWS’s infrastructure, would be at least $952,992.40/month. Undoubtedly AWS would offer discounts beyond its publicly listed pricing, but any way you slice it the costs are significant.
Beyond the cost, many CAs are not set up to handle this increased load. Mass revocation of SSL certificates threatens to create a sort of denial of service attack on their own infrastructure. Thankfully, CloudFlare helps power Globalsign's OCSP and CRL infrastructure. We were able to bear the brunt of the load, allowing us to move forward with revocation as quickly as we did. And, no, we didn’t charge them anything extra.
So, if you're wondering why some people are dragging their feet on mass certificate revocation, now you know why — it imposes a real cost. And if you're a CA who's wondering what you're going to do when you inevitably have to revoke all the certs you've issued over the last year, we're happy to help.
The Heartbleed Aftermath: all CloudFlare certificates revoked and reissued | 17-04-14
Eleven days ago the Heartbleed vulnerability was publicly announced.
Last Friday, we issued the CloudFlare Challenge: Heartbleed and simultaneously started the process of revoking and reissuing all the SSL certificates that CloudFlare manages for our customers.
That process is now complete. We have revoked and reissued every single certificate we manage and all the certificates we use.
That mass revocation showed up dramatically on the ISC's CRL activity chart.
Customers who use custom certificates are encouraged to rekey, upload their new certificates, and revoke the previous custom certificates.
We announced last Monday that we had patched a bug in OpenSSL and our customers were safe. We did not know then that CloudFlare was among the few to whom the bug was disclosed before the public announcement. In fact, we did not even know the bug's name. At that time we had simply removed TLS heartbeat functionality completely from OpenSSL by recompiling with the OPENSSL_NO_HEARTBEATS flag.
After learning the full extent of the bug and that it had been live on the Internet for two years, we started an investigation to see whether our private keys and those of our customers were at risk.
We started our investigation by attempting to see what sort of information we could get through Heartbleed. We set up a test server on a local machine and bombarded it with Heartbleed attacks, saving the blocks of memory it returned. We scanned that memory for copies of the private key and, after extensive scanning, could not find a trace of it. Still, we were not fully convinced that it was impossible, so we decided to crowdsource the problem by setting up the CloudFlare Heartbleed Challenge.
Nine hours into the challenge the first instance of the key was revealed on Twitter. Over the five days of the challenge we saw over 75 million Heartbleed attacks against our server and 8,000 submissions.
The winners of the challenge used two different techniques in order to obtain the private key.
How the private key leaks
From looking at the code, we knew that some pieces of the key are duplicated and manipulated during the private key operation. Following the code down into bn_div.c, we found that it ends up making a copy of the divisor (in a modulus operation):
/* First we normalise the numbers */
if (!(BN_lshift(sdiv,divisor,norm_shift))) goto err;
The value stored in sdiv ends up remaining in memory until something overwrites it. The divisor is one of the two primes p and q needed for RSA. More on that below. But here's a diagram showing memory inside OpenSSL. The diagram shows the state of memory after 26 HTTPS requests (of varying sizes) have been sent through. Memory in black is unused, green is in use, red contains private keys and yellow contains the remnants of private keys.
The remnants are what enabled Heartbleed to leak private keys. A great deal of the green (in use) memory is being used by OpenSSL as buffers (memory where requests and responses are handled). If some of that green memory near the yellow key remnants is used for a Heartbleed request the yellow memory may be copied into the Heartbleed response and the key may leak.
Here's what that looks like in memory:
(gdb) x/128xb 0xbd1560
0xbd1560: 0xe0 0x09 0xbd 0x00 0x00 0x00 0x00 0x00
0xbd1568: 0xf0 0x34 0xbb 0x00 0x00 0x00 0x00 0x00
0xbd1570: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0xbd1578: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0xbd1580: 0x0f 0x47 0x67 0xe0 0x03 0x11 0x56 0x7d
0xbd1588: 0x1c 0xdf 0x92 0x9d 0xfe 0x41 0x59 0xfa
0xbd1590: 0x6e 0x2a 0x99 0x18 0x5f 0x1d 0x2a 0x48
0xbd1598: 0x99 0xaf 0x31 0x68 0x1e 0x36 0xc8 0x75
0xbd15a0: 0x3f 0xe2 0x29 0x36 0x53 0xb7 0x8e 0x8c
0xbd15a8: 0xa2 0x9a 0x9b 0x39 0x3c 0x18 0x94 0x1a
0xbd15b0: 0x0e 0x92 0xd5 0x1b 0x60 0xc7 0x13 0x60
0xbd15b8: 0x8a 0xb9 0x0b 0x31 0x95 0x6d 0x46 0x19
0xbd15c0: 0x40 0xa9 0x89 0x4e 0xc4 0x6d 0x26 0x74
0xbd15c8: 0x77 0xfe 0x5b 0x90 0xb1 0x5d 0xa9 0x79
0xbd15d0: 0xce 0xed 0x52 0x4d 0xbe 0x17 0xcc 0x37
0xbd15d8: 0x87 0xcd 0xfa 0xd9 0x4e 0x3d 0xb1 0x55
The data from one of the primes is in the middle of that block of memory.
A nagging question is why these chunks of keys are found in memory at all, when OpenSSL has functions to cleanse memory. We are continuing to investigate and, if a bug is found, will submit a patch for OpenSSL.
The more HTTPS traffic the server serves, the more likely some of these intermediate values end up on the heap where Heartbleed can read them. Unfortunately for us, our test server was on our local machine and did not have a large amount of HTTPS traffic. This made it a lot less likely that we would find private keys in our experimental setting. The CloudFlare Challenge site was serving a lot of traffic, making key extraction more likely. There were some other tricks we learned from the winners about efficiently finding keys in memory. To explain these, we will go a bit into how RSA works.
The RSA Cryptosystem
The RSA cryptosystem is based on picking two prime numbers (which are habitually called p and q) and then using various mathematical properties of prime numbers to create a secure system. First a number n (called the modulus) is calculated. It is just p x q.
The other two vital values in RSA are called exponents and typically called e and d. In common RSA cryptosystems the number e is 65537. The public key is (n, e) (i.e. the modulus and the exponent e), the private key is (n, d) (i.e. the modulus and the exponent d). Of course, the prime numbers p and q also have to be kept secret because everything depends on them.
In textbook RSA, you can perform the private key operation with only d and n, but it's slow.
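To make the arithmetic concrete, here is a toy textbook-RSA round trip in C with deliberately tiny primes (real keys use primes hundreds of digits long, so the numbers below are purely illustrative):

#include <stdio.h>
#include <stdint.h>

/* square-and-multiply modular exponentiation */
static uint64_t powmod(uint64_t b, uint64_t e, uint64_t m) {
    uint64_t r = 1;
    b %= m;
    while (e) {
        if (e & 1) r = r * b % m;
        b = b * b % m;
        e >>= 1;
    }
    return r;
}

int main(void) {
    uint64_t p = 61, q = 53, n = p * q;   /* n = 3233, the public modulus  */
    uint64_t e = 17;                      /* public exponent               */
    uint64_t d = 2753;                    /* d = e^-1 mod (p-1)(q-1)       */
    uint64_t m = 65;                      /* the "message"                 */

    uint64_t c = powmod(m, e, n);         /* encrypt with the public key   */
    uint64_t back = powmod(c, d, n);      /* decrypt with the private key  */
    printf("ciphertext=%llu decrypted=%llu\n",
           (unsigned long long)c, (unsigned long long)back);
    return 0;
}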
In OpenSSL, RSA’s private key operation also uses the primes p and q to speed up the computation using Montgomery arithmetic and other tricks. If either p or q is found, it can be used to derive the private exponent d. Searching for p and q was the approach taken by most of the successful contestants of the challenge.
The most common trick for identifying the prime factors in random data was to take all blocks of data that match the size of a prime factor (128 bytes in the case of our 2,048-bit RSA key) and convert them to a number. Some contestants ran a primality check on the number, and some tried dividing the public modulus n by the number to see whether it divided evenly. If the number turned out to be a prime factor, it could then be used to derive the private exponent d. This was the approach taken by Fedor Indutny, as published on his blog.
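As a rough sketch of that search (not any contestant's actual code; scan_dump_for_factor and its interface are invented for illustration), one can slide a 128-byte window across a memory dump and test each window against the public modulus using OpenSSL's own bignum routines. Once a factor p is found, q = n / p and the private exponent d follows from the public exponent e:

#include <openssl/bn.h>
#include <stdio.h>
#include <string.h>

/* Scan a memory dump for a 128-byte window that divides the 2,048-bit
 * public modulus n. Returns 1 and prints the offset on success. */
int scan_dump_for_factor(const unsigned char *dump, size_t len, const BIGNUM *n)
{
    int found = 0;
    BN_CTX *ctx = BN_CTX_new();
    BIGNUM *cand = BN_new(), *rem = BN_new();

    for (size_t off = 0; !found && off + 128 <= len; off++) {
        /* OpenSSL keeps bignum words little-endian in memory, so try the
         * window both raw and byte-reversed. */
        for (int flip = 0; !found && flip < 2; flip++) {
            unsigned char buf[128];
            if (flip == 0)
                memcpy(buf, dump + off, 128);
            else
                for (int i = 0; i < 128; i++)
                    buf[i] = dump[off + 127 - i];

            BN_bin2bn(buf, 128, cand);          /* interpret window as a number */
            if (BN_is_zero(cand) || BN_is_one(cand) || BN_cmp(cand, n) >= 0)
                continue;
            BN_div(NULL, rem, n, cand, ctx);    /* rem = n mod candidate        */
            if (BN_is_zero(rem)) {              /* candidate divides n: p or q  */
                printf("factor candidate at offset %zu\n", off);
                found = 1;
            }
        }
    }
    BN_free(cand);
    BN_free(rem);
    BN_CTX_free(ctx);
    return found;
}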
The second and slightly more elegant approach was taken by Rubin Xu. He started by attempting to find prime factors but quickly gave up on finding them. He then decided to use Coppersmith's Attack to derive the private exponent. In this attack, only pieces of the private key are needed and the private key is reconstructed mathematically. This attack is so efficient that he managed to extract the private key from only the first 50 dumps!
Here are the winners of the challenge in order of when they found the private key:
- Fedor Indutny (@indutny) Developer
- Ilkka Mattila, Information Security Adviser
- Rubin Xu (@xurubin), Security PhD Student
- Ben Murphy (@benmmurphy), Security Researcher
- Steve Hunter (@nonaxiomatic)
- Xavier Martin (@xav), Security Researcher
- no name given
- Jeremi Gosney (@jmgosney), CEO, Stricture Group
- Michele Guerini Rocco (@Rnhmjoj), Student
- David Gervais (@davidgervais), Software Engineer
- Christian Bürgi (@buergich)
- Daniel Burkard (@hiptomcat)
A big congratulations goes out to all of them!
Based on these findings, we believe that within two hours a dedicated attacker could retrieve a private key from a vulnerable server. Since the allocation of temporary key material is done by OpenSSL itself, and is not special to NGINX, we expect these attacks to work on different server software including non-web servers that use OpenSSL.
All administrators running servers using vulnerable versions of OpenSSL should patch OpenSSL with version 1.0.1g (or later) and also reissue and revoke all their private keys. As mentioned above, CloudFlare has now done this for all the customers where we manage SSL on their behalf.
As a follow up to the Heartbleed bug and the CloudFlare Challenge, I hosted a webinar with Ben Murphy from Fonix. Ben was one of the winners in the CloudFlare Heartbleed challenge. Ben and I answered many questions; a recording of the webinar is posted on CloudFlare's YouTube channel and can be seen below.
Certificate Revocation and Heartbleed | 12-04-14
As you may have noticed, the CloudFlare Heartbleed Challenge has been solved. The private key for the site cloudflarechallenge.com has been obtained by several authorized attackers via the Heartbleed exploit.
Any person who obtained the private key will be able to impersonate cloudflarechallenge.com, as Fedor Indutny demonstrated when proving he had the private key.
We have decided to revoke the certificate but leave the site active so people can test their browsers. As we mentioned in a previous blog post, revocation is not a foolproof process. Each browser behaves differently when it encounters a revoked certificate. If you are still able to visit the challenge site without a warning, you may need to change your browser settings to enable revocation checking.
Internet Explorer and Safari give warnings, but allow the user to bypass them.
Firefox fully denies access to sites using a revoked certificate.
Chrome allows the site to load with no warning. This is because online revocation checking is disabled by default. Instead, Chrome uses a proprietary method called CRLSets which relies on a pre-compiled list of revoked certificates. Scott Helme describes how to enable online verification in the Chrome advanced settings:
It is more important than ever to check certificates to see if they have been revoked. According to Netcraft, certificate revocation has gone up sharply since the Heartbleed vulnerability was announced.
We expect this trend to continue as more websites evaluate the risk that their private keys were stolen through Heartbleed. If your site was vulnerable to Heartbleed, we encourage you to talk to your CA to revoke your certificate and rekey.
I will be giving a webinar about this topic next week with updates. You can register for that here.
The Results of the CloudFlare Challenge | 12-04-14
Earlier today we announced the Heartbleed Challenge. We set up an NGINX server with a vulnerable version of OpenSSL and challenged the community to steal its private key. The world was up to the task: two people independently retrieved private keys using the Heartbleed exploit.
The first valid submission was received at 16:22:01PST by Software Engineer Fedor Indutny. He sent at least 2.5 million requests over the course of the day. The second was submitted at 17:12:19PST by Ilkka Mattila at NCSC-FI, who sent around a hundred thousand requests over the same period of time.
UPDATE: Two more confirmed winners: Rubin Xu, PhD student in the Security group of Cambridge University submitted at 04:11:09PST on 04/12; and Ben Murphy, Security Researcher submitted at 7:28:50PST on 04/12.
We confirmed that all individuals used only the Heartbleed exploit to obtain the private key. We rebooted the server at 3:08PST, which may have caused the key to be available in uninitialized heap memory, as theorized in our previous blog post. It is at the discretion of the researchers to share the specifics of the techniques used.
This result reminds us not to underestimate the power of the crowd and emphasizes the danger posed by this vulnerability.
Answering the Critical Question: Can You Get Private SSL Keys Using Heartbleed? | 11-04-14
Below is what we thought as of 12:27pm UTC. To verify our belief, we crowdsourced the investigation. It turns out we were wrong. While it takes effort, it is possible to extract private SSL keys. The challenge was solved by Software Engineer Fedor Indutny and Ilkka Mattila at NCSC-FI roughly 9 hours after the challenge was first published. Fedor sent 2.5 million requests over the course of the day and Ilkka sent around 100K requests. Our recommendation based on this finding is that everyone reissue and revoke their private keys. CloudFlare has accelerated this effort on behalf of the customers whose SSL keys we manage. You can read more here.
The widely-used open source library OpenSSL revealed on Monday that it had a major bug, now known as "Heartbleed". By sending a specially crafted packet to a vulnerable server running an unpatched version of OpenSSL, an attacker can get up to 64kB of the server’s working memory. This is the result of a classic implementation bug known as a buffer over-read.
There has been speculation that this vulnerability could expose server certificate private keys, making those sites vulnerable to impersonation. This would be the disaster scenario, requiring virtually every service to reissue and revoke its SSL certificates. Note that simply reissuing certificates is not enough; you must revoke the old ones as well.
Unfortunately, the certificate revocation process is far from perfect and was never built for revocation at mass scale. If every site revoked its certificates, it would impose a significant burden and performance penalty on the Internet. At CloudFlare's scale, the reissuance and revocation process could break the CA infrastructure. So, we’ve spent a significant amount of time talking to our CA partners in order to ensure that we can safely and successfully revoke and reissue our customers' certificates.
While the vulnerability seems likely to put private key data at risk, to date there have been no verified reports of actual private keys being exposed. At CloudFlare, we received early warning of the Heartbleed vulnerability and patched our systems 12 days ago. We’ve spent much of the time running extensive tests to figure out what can be exposed via Heartbleed and, specifically, to understand if private SSL key data was at risk.
Here’s the good news: after extensive testing on our software stack, we have been unable to successfully use Heartbleed on a vulnerable server to retrieve any private key data. Note that this is not the same as saying it is impossible to use Heartbleed to get private keys. We do not yet feel comfortable saying that. However, if it is possible, it is at a minimum very hard. And, based on the data structures used by OpenSSL and the modified version of NGINX that we use, we have reason to believe that it may in fact be impossible.
To get more eyes on the problem, we have created a site so the world can challenge this hypothesis:
CloudFlare Challenge: Heartbleed
This site was created by CloudFlare engineers to be intentionally vulnerable to heartbleed. It is not running behind CloudFlare’s network. We encourage everyone to attempt to get the private key from this website. If someone is able to steal the private key from this site using heartbleed, we will post the full details here.
While we believe it is unlikely that private key data was exposed, we are proceeding with an abundance of caution. We’ve begun the process of reissuing and revoking the keys CloudFlare manages on behalf of our customers. In order to ensure that we don’t overburden the certificate authority resources, we are staging this process. We expect that it will be complete by early next week.
In the meantime, we’re hopeful we can get more assurance that SSL keys are safe through our crowd-sourced effort to hack them. To get everyone started, we wanted to outline the process we’ve embarked on to date in order to attempt to hack them.
A heartbeat is a message that is sent to the server just so the server can send it back. This lets a client know that the server is still connected and listening. The heartbleed bug was a mistake in the implementation of the response to a heartbeat message.
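For reference, a heartbeat message (RFC 6520) is just a one-byte type, a two-byte payload length, the payload itself, and at least 16 bytes of random padding. A sketch of a well-formed request, with illustrative byte values:

unsigned char heartbeat_request[] = {
    0x01,                    /* type: heartbeat_request                 */
    0x00, 0x04,              /* payload_length: 4 bytes                 */
    'p', 'i', 'n', 'g',      /* payload the server should echo back     */
    /* ... followed by at least 16 bytes of random padding ...          */
};

The bug, described below, is in how the server handles that payload_length field.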
Here is the offending code (from OpenSSL's tls1_process_heartbeat, lightly annotated):

p = &s->s3->rrec.data[0];   /* start of the incoming heartbeat record      */
hbtype = *p++;              /* 1 byte: message type                        */
n2s(p, payload);            /* 2 bytes: attacker-supplied payload length   */
pl = p;                     /* start of the payload data                   */
buffer = OPENSSL_malloc(1 + 2 + payload + padding);
bp = buffer;
memcpy(bp, pl, payload);    /* copies 'payload' bytes, never checked
                               against the record's real length            */
The incoming message is stored in a structure called rrec, which contains the incoming request data. The code reads the type (finding out that it's a heartbeat) from the first byte, then reads the next two bytes, which indicate the length of the heartbeat payload. In a valid heartbeat request, this length matches the length of the payload sent in the heartbeat request.
The major problem (and cause of heartbleed) is that the code does not check that this length is the actual length sent in the heartbeat request, allowing the request to ask for more data than it should be able to retrieve. The code then copies the amount of data indicated by the length from the incoming message to the outgoing message. If the length is longer than the incoming message, the software just keeps copying data past the end of the message. Since the length variable is 16 bits, you can request up to 65,535 bytes from memory. The data that lives past the end of the incoming message is from a kind of no-man’s land that the program should not be accessing and may contain data left behind from other parts of OpenSSL.
When processing a request that contains a longer length than the request payload, some of this unknown data is copied into the response and sent back to the client. This extra data can contain sensitive information like session cookies and passwords, as we describe in the next section.
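Here is a self-contained toy model of that over-read in C. The arena, the buffer names, and the sizes are invented for illustration and have nothing to do with OpenSSL's real layout, but the missing bounds check is the same mistake:

#include <stdio.h>
#include <string.h>

int main(void) {
    char arena[256] = {0};
    char *request = arena;                    /* the "heartbeat" request     */
    char *secret  = arena + 64;               /* remnant of earlier work     */

    strcpy(secret, "db_password=hunter2");
    strcpy(request, "hello");                 /* the real 5-byte payload     */

    unsigned short claimed_len = 96;          /* attacker claims 96 bytes    */
    char response[128] = {0};
    memcpy(response, request, claimed_len);   /* no check, like Heartbleed   */

    fwrite(response, 1, claimed_len, stdout); /* the secret rides along      */
    putchar('\n');
    return 0;
}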
The fix for this bug is simple: check that the length of the message actually matches the length of the incoming request. If it is too long, return nothing. That’s exactly what the OpenSSL patch does.
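For completeness, the bounds check added in OpenSSL 1.0.1g looks roughly like this (paraphrased from the patch; 16 is the minimum padding length):

/* Read type and payload length only after checking the record is big
 * enough to contain them, then verify the claimed payload actually fits. */
if (1 + 2 + 16 > s->s3->rrec.length)
    return 0;                 /* silently discard a runt heartbeat          */
hbtype = *p++;
n2s(p, payload);
if (1 + 2 + payload + 16 > s->s3->rrec.length)
    return 0;                 /* claimed length exceeds the real record     */
pl = p;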
Malloc and the Heap
So what sort of data can live past the end of the request? The technical answer is “heap data,” but the more realistic answer is that it’s platform dependent.
On most computer systems, each process has its own set of working memory. Typically this is split into two data structures: the stack and the heap. This is the case on Linux, the operating system that CloudFlare runs on its servers.
The memory address with the highest value is where the stack data lives. This includes local working variables and non-persistent data storage for running a program. The lowest portion of the address space typically contains the program’s code, followed by static data needed by the program. Right above that is the heap, where all dynamically allocated data lives.
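You can see this layout on a typical Linux machine by printing the address of a static variable, a heap allocation, and a stack variable. Exact addresses vary with ASLR, but the ordering normally matches the description above; this snippet is only illustrative:

#include <stdio.h>
#include <stdlib.h>

static int static_data = 42;       /* lives in the low static-data region  */

int main(void) {
    void *heap_ptr = malloc(16);   /* lives in the heap, above static data */
    int   stack_var = 0;           /* lives on the stack, far above both   */

    printf("static: %p\nheap:   %p\nstack:  %p\n",
           (void *)&static_data, heap_ptr, (void *)&stack_var);

    free(heap_ptr);
    return 0;
}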
Managing data on the heap is done with the library calls malloc (used to get memory) and free (used to give it back when it is no longer needed). When you call malloc, the program picks some unused space in the heap area and returns the address of the first part of it to you. Your program is then able to store data at that location. When you call free, the memory space is marked as unused. In most cases, the data that was stored in that space is just left there unmodified.
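The following toy program illustrates that last point. It assumes glibc-style malloc behavior, and reading freed or uninitialized memory is undefined behavior, so treat it only as an illustration of why "freed" secrets linger on the heap:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    char *a = malloc(64);
    memset(a, 0, 64);
    strcpy(a + 32, "session=deadbeef");  /* keep the secret past the first
                                            bytes, which free() may reuse
                                            for its own bookkeeping        */
    free(a);                             /* marked unused, not erased      */

    char *b = malloc(64);                /* very likely the same block     */
    printf("%s\n", b + 32);              /* often prints the old secret    */
    free(b);
    return 0;
}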
Every new allocation needs some unused space from the heap. Typically this is chosen to be at the lowest possible address that has enough room for the new allocation. A heap typically grows upwards; later allocations get higher addresses. If a block of data is allocated early it gets a low address and later allocations will get higher addresses, unless a big early block is freed.
This is of direct relevance because both the incoming message request (s->s3->rrec.data) and the certificate private key are allocated on the heap with malloc. The exploit reads data from the address of the incoming message. For previous requests that were allocated and freed, their data (including passwords and cookies) may still be in memory. If they are stored less than 65,536 bytes higher in the address space than the current request, the details can be revealed to an attacker.
Requests come and go, recycling memory at around the top of the heap. This makes extracting previous request data via this attack very likely. This is important in understanding what you can and cannot get at using the vulnerability. Previous requests could contain password data, cookies or other exploitable data. Private keys are a different story, due to the way the heap is structured. The good news is that this makes it much less likely that private SSL keys would be exposed.
Read up, not down
In NGINX, the keys are loaded immediately when the process is started, which puts the keys very low in the memory space. This makes it unlikely that incoming requests will be allocated at a lower address. We tested this experimentally.
We modified our test version of NGINX to print out the location in memory of each request (s->s3->rrec.data) whenever there was an incoming heartbeat. We compared this to the location in memory where the private key is stored and found that we could never get a request to be at a lower address than our private keys, regardless of the number of requests we sent. Since the exploit only reads higher addresses, it could not be used to obtain private keys.
Here is a video of what searching for private keys looks like:
If NGINX is reloaded, it starts a new process and loads the keys right away, putting them at a low address. Getting a request to be allocated even lower in the memory space than the early-loaded keys is very unlikely.
We not only checked the location of the private keys, we also wrote a tool to repeatedly extract extra data and write the results to a file for analysis. We searched through gigabytes of these responses for private key information but did not find any. The most interesting things we found related to certificates were the occasional copy of the public certificate (from a previous output buffer) and some NGINX configuration data. However, the private keys were nowhere to be found.
To get an idea of what is happening inside the heap used by OpenSSL inside NGINX, we wrote another tool to create a graphic showing the location of private keys (red pixels), memory that has never been used (black), memory that has been used but is now sitting idle because of a call to free (blue), and memory that is in use (green).
This picture shows the state of the heap memory (from left to right) immediately after NGINX has loaded and has yet to serve a request, after a single request, after two requests and after millions of requests. As described above the critical thing to note is that when the first request is made new memory is allocated far beyond the place where the private key is stored. Each 2x2 pixel square represents a single byte of memory; each row is 256 bytes.
Eagle-eyed readers will have noticed a block of memory that was allocated at a lower memory location than the private key. That's true. We looked into it and it is not being used to store the heartbleed (or other) TLS packet data. And it is much more than 64k away from the private key.
What can you get?
We said above that it's possible to get sensitive data from HTTP and TLS requests that the server has handled, even if the private key looks inaccessible.
Here, for example, is a dump showing some HTTP headers from a previous request to a running NGINX server. These headers would have been transmitted securely over HTTPS but Heartbleed means that an attacker can read them. That’s a big problem because the headers might contain login credentials or a cookie.
And here’s a copy of the public part of a certificate (as would be sent as part of the TLS/SSL handshake) sitting in memory and readable. Since it’s public this is not in itself dangerous -- by design, you can get the public key of a website even without the vulnerability and doing so does not create risk due to the nature of public/private key cryptography.
We have not fully ruled out the possibility, albeit slim, that some early elements of the heap get reused when NGINX is restarted. In theory, the old memory of the previous process might be available to a newly restarted NGINX. However, after extensive testing, we have not been able to reproduce this situation with an NGINX server on Linux. If a private key is available, it is most likely only available on the first request after restart. After that the chance that the memory is still available is extremely low.
There have been reports of a private key being stolen from Apache servers, but only on the first request. This fits with our hypothesis that restarting a server may cause the key to be revealed briefly. Apache also creates some special data structures in order to load private keys that are encrypted with a passphrase, which may make it more likely for private keys to appear in the vulnerable portion of the heap.
At CloudFlare we do not restart our NGINX instances very often, so the likelihood that an attacker hit our server with this exploit on the first request after a restart is extremely low. Even if they did, the likelihood of seeing private key material on that request is very low. Moreover, NGINX, which is what CloudFlare’s system is based on, does not create the same special structures for HTTPS processing, making it less likely keys would ever appear in a vulnerable portion of the heap.
We think that stealing private keys from most NGINX servers is at least extremely hard and likely impossible. Even with Apache, which we think may be slightly more vulnerable (and which we do not use at CloudFlare), we believe the likelihood of private SSL keys being revealed via the Heartbleed vulnerability is very low. That’s about the only good news of the last week.
We want others to test our results so we created the Heartbleed Challenge. Aristotle struggled with the problem of disproving the existence of something that doesn’t exist. You can’t prove the negative, so through experimental results we will never be absolutely sure there’s not a condition we haven’t tested. However, the more eyes we get on the problem, the more confident we will be that, in spite of a number of other ways the Heartbleed vulnerability was extremely bad, we may have gotten lucky and been spared the worst of the potential consequences.
That said, we’re proceeding assuming the worst. With respect to private keys held by CloudFlare, we patched the vulnerability before the public had knowledge of the vulnerability, making it unlikely that attackers were able to obtain private keys. Still, to be safe, as outlined at the beginning of this post, we are executing on a plan to reissue and revoke potentially affected certificates, including the cloudflare.com certificate.
Vulnerabilities like this one are challenging because people have imperfect information about the risks they pose. It is important that the community works together to identify the real risks and work towards a safer Internet. We’ll monitor the results on the Heartbleed Challenge and immediately publicize results that challenge any of the above. I will be giving a webinar about this topic next week with updates.
You can register for that here.
Jetpack for WordPress: automatic protection | 10-04-14
As we've said before, lots of our users run WordPress on their websites and its popularity makes it a big target. So when a new vulnerability is discovered, acting quickly is prudent.
Jetpack is an extremely popular plugin that provides self-hosted blogs with the additional functionality that WordPress offers to sites hosted on its own platform at WordPress.com.
Very recently, a serious security flaw in Jetpack was discovered. It has the potential to allow an attacker to complete actions on a blog, such as posting, without having to log in. The WordPress team has written about the problem here.
This problem was assigned the CVE number CVE-2014-0173 and is fixed in Jetpack 2.9.3 released today. Everyone using Jetpack on their WordPress site should update immediately.
All CloudFlare customers who use WordPress are automatically protected against this bug. We rolled out a Web Application Firewall (WAF) rule that is automatically enabled for all customers (free or paid) to protect against this problem.
Customers using Jetpack should still upgrade immediately, but the WAF rule gives a little breathing space.
Staying ahead of OpenSSL vulnerabilities | 07-04-14
Today a new vulnerability was announced in OpenSSL 1.0.1 that allows an attacker to reveal up to 64kB of memory to a connected client or server (CVE-2014-0160). We fixed this vulnerability last week before it was made public. All sites that use CloudFlare for SSL have received this fix and are automatically protected.
OpenSSL is the core cryptographic library CloudFlare uses for SSL/TLS connections. If your site is on CloudFlare, every connection made to the HTTPS version of your site goes through this library. As one of the largest deployments of OpenSSL on the Internet today, CloudFlare has a responsibility to be vigilant about fixing these types of bugs before they go public and attackers start exploiting them and putting our customers at risk.
We encourage everyone else running a server that uses OpenSSL to upgrade to version 1.0.1g to be protected from this vulnerability. For previous versions of OpenSSL, re-compiling with the OPENSSL_NO_HEARTBEATS flag enabled will protect against this vulnerability. OpenSSL 1.0.2 will be fixed in 1.0.2-beta2.
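The reason the flag works is that OpenSSL wraps the heartbeat extension code in a compile-time guard, so building with -DOPENSSL_NO_HEARTBEATS leaves the vulnerable code out of the binary entirely. In outline (simplified from OpenSSL's ssl/t1_lib.c):

#ifndef OPENSSL_NO_HEARTBEATS
int tls1_process_heartbeat(SSL *s)
{
    /* ... heartbeat request/response handling, including the code
     * vulnerable to CVE-2014-0160, lives here ... */
}
#endif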
This bug fix is a successful example of what is called responsible disclosure. Instead of disclosing the vulnerability to the public right away, the people notified of the problem tracked down the appropriate stakeholders and gave them a chance to fix the vulnerability before it went public. This model helps keep the Internet safe. A big thank you goes out to our partners for disclosing this vulnerability to us in a safe, transparent, and responsible manner. We will announce more about our responsible disclosure policy shortly.
Just another friendly reminder that CloudFlare is on top of things and making sure your sites stay as safe as possible.
Introducing CNAME Flattening: RFC-Compliant CNAMEs at a Domain's Root | 03-04-14
This post is about a new feature we've been quietly rolling out over the last few months. Last week we began enabling it for everyone by default. It's called CNAME Flattening and it's a bit geeky, but very useful and important if you’re using cloud-based services and you hate having a “www” subdomain. The gist is: you can now safely use a CNAME record, as opposed to an A record that points to a fixed IP address, as your root record in CloudFlare DNS without triggering a number of edge case error conditions because you’re violating the DNS spec. Geeky, I know, so before asking you to wade through why this is cool, here are some of our customers singing the feature’s praises:
"CloudFlare's CNAME Flattening feature enabled us to easily front our Amazon Elastic Load Balancer (ELB) with CloudFlare's DDoS and WAF protection."
"CNAME Flattening allowed us to use a root domain while still maintaining DNS fault-tolerance across multiple IP addresses."
- Andrew Warner, Director of Engineering, RapGenius
"This was a hard problem that needed solving. Leave it to the team at CloudFlare to come up with a great solution that will act as a gateway to a virtually unlimited number of other cloud services."
Intrigued? Read on to understand the technical details of exactly what it is and why we needed to build it.
The Inflexibility of DNS
Traditionally, the root record of a domain needed to point to an IP address (known as an A -- for "address" -- record). While it may not seem like a big deal, tying a service to an IP address can be extremely limiting. Imagine that a new blogging platform, WordPlumblr, starts up. WordPlumblr allows its users to use custom domains that point to the WordPlumblr infrastructure. Foo.com signs up and WordPlumblr gives Foo.com an IP address. The supply of IP addresses is limited so, over time, as more sites sign up, IP addresses end up getting shared between multiple customers. No problem: with virtual hostnames, WordPlumblr can return different content for different domains even when they are hosted on the same IP address.
Everything is fine until Foo.com starts using too many of WordPlumblr's resources -- maybe because they're being attacked or they're featured on Oprah or who knows why. The other customers of WordPlumblr get poor performance because they're sharing the same resources as the overwhelmed Foo.com. WordPlumblr is put in a difficult position: having to reach out to Foo.com to get them to change their DNS settings or, even harder, having to reach out to all the other customers using the IP address to get them to change. Good luck.
CNAMEs For the Win
The solution is a CNAME. A CNAME is an alias. It allows one domain to point to another domain which, eventually if you follow the CNAME chain, will resolve to an A record and IP address. If WordPlumblr had handed out a unique CNAME for every customer then they wouldn't have had a problem. For example, WordPlumblr might have assigned the CNAME 6equj5.wordplumblr.com for Foo.com. Foo.com and the other customers may have all initially resolved, at the end of the CNAME chain, to the same IP address. However, when Foo.com started using too many resources WordPlumblr could have updated the CNAME and isolated Foo.com from the rest of the customers.
If you're familiar with programming, this is just like the idea of a pointer. When you're programming in C, in order to allow flexibility in memory management, you usually don't want to address memory directly but, instead, you set up a pointer to a block of memory where you're going to store something. If the operating system needs to move the memory around then it just updates the pointer to point to wherever the chunk of memory has been moved to. The program references the pointer and is none the wiser.
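For the programmers in the audience, here is the analogy in miniature (purely illustrative C, nothing DNS-specific):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    char *backend = strdup("192.0.2.10");   /* foo.com's current home        */
    char **cname  = &backend;               /* what everyone else references */

    printf("foo.com -> %s\n", *cname);

    free(backend);
    backend = strdup("198.51.100.7");       /* move the noisy customer       */
    printf("foo.com -> %s\n", *cname);      /* referrers see the new address */

    free(backend);
    return 0;
}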
CNAMEs work great for subdomains like www.foo.com or blog.foo.com. Unfortunately, they don't work for a naked domain like foo.com itself. And, for reasons that somewhat perplex me, a lot of people are obsessed with using their naked domain for their website. So why don't CNAMEs work at the root?
No Root For You
The problem stems from the fact that the DNS specification dates from 1987. At the time, no one had conceived of websites -- it was two years before Tim Berners-Lee’s seminal paper first laying out the idea for the world wide web -- let alone modern outsourced cloud services like Amazon Web Services (AWS) or Heroku. As a result, the DNS spec enshrined that the root record -- the naked domain without any subdomain -- could not be a CNAME. Technically, the root could be a CNAME, but the RFCs state that once a record has a CNAME it can't have any other entries associated with it. That's a problem for a root record like example.com because it will often have an MX record (so email gets delivered), an NS record (to find out which nameserver handles the zone) and an SOA record.
Because they follow this specification, most authoritative DNS servers won't allow you to include CNAME records at the root. At CloudFlare, we decided to let our users include a CNAME at the root even though we knew it violated the DNS specification. And that worked, most of the time. Unfortunately, there were a handful of edge cases that caused all sorts of problems.
Break the RFC at Your Own Peril
You'd never guess, but the biggest edge case had to do with email sent from Microsoft Exchange mail servers. Domains generally designate the servers that handle their email through what's known as an MX record. The problem was that Exchange servers, under a very specific set of circumstances, could pick up the CNAME at the root record and then not properly respect the CNAME set at the MX record. You can't really blame Exchange; it was operating under the assumptions laid out by the DNS specification. However, this and a handful of other corner cases caused us to support, but recommend against, using a CNAME at the root record. Until now.
Introducing CNAME Flattening
What we needed was a way to support a CNAME at the root, but still follow the RFC and return an IP address for any query for the root record. To accomplish this, we extended our authoritative DNS infrastructure to, in certain cases, act as a kind of DNS resolver. What happens is that, if there's a CNAME at the root, rather than returning that record directly we recurse through the CNAME chain ourselves until we find an A Record. At that point, we return the IP address associated with the A Record. This, effectively, "flattens" the CNAME chain.
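In outline, the flattening logic looks something like the sketch below. This is a simplified illustration, not CloudFlare's resolver code; resolve_one is a hypothetical helper standing in for a real DNS lookup (against our own zone data, or recursively against whichever server is authoritative for the target):

#include <string.h>

#define MAX_CHAIN 10               /* guard against CNAME loops */

struct rrset {
    int  is_cname;                 /* 1 if the answer is a CNAME       */
    char target[256];              /* CNAME target when is_cname == 1  */
    char addrs[8][16];             /* A records when is_cname == 0     */
    int  naddrs;
    int  ttl;
};

/* Hypothetical lookup helper (see above). */
struct rrset resolve_one(const char *name);

/* Chase the CNAME chain until A records are found, then answer the
 * original root query with those A records. */
struct rrset flatten(const char *qname)
{
    char name[256];
    strncpy(name, qname, sizeof(name) - 1);
    name[sizeof(name) - 1] = '\0';

    struct rrset ans = resolve_one(name);
    for (int hop = 0; ans.is_cname && hop < MAX_CHAIN; hop++) {
        strncpy(name, ans.target, sizeof(name) - 1);
        name[sizeof(name) - 1] = '\0';
        ans = resolve_one(name);   /* cache each answer, honoring its TTL */
    }
    return ans;                    /* A records served as the root answer */
}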
This follows the DNS specification and is invisible to any service that interacts with our DNS. We've tested it extensively over the last several months and it works great, completely resolving the Microsoft Exchange and other edge case problems we'd previously seen.
Flexible and Faster
The biggest benefit is that this allows the flexibility of having CNAMEs at the root without breaking the DNS specification. An ancillary benefit we've found is that it decreases the time for CNAME resolution by about 30% on average. We cache the CNAME responses -- respecting the DNS TTLs, just like a recursor should -- which means we often have the answer without having to traverse the chain. When we do need to traverse the CNAME chain, we often have a much faster, more direct connection to whatever server is authoritative than your visitor's ISP's recursive DNS service, which would otherwise have been doing the heavy lifting.
While we're defaulting this on just for root records, we've begun experimenting with turning CNAME Flattening on more broadly. We've begun to flatten all the CNAMEs that CloudFlare uses internally. For example, if you signed up via a hosting partner then you may have gotten a CNAME that included in the CNAME chain something like cf-ssl2463-protected-www.cloudflare.com.cdn.cloudflare.net. That is increasingly going away and will now resolve directly from your domain to an IP address. This makes DNS resolution a bit faster and also better obscures the fact that you're using CloudFlare.
As of last week, CNAME Flattening is on by default. To take advantage of it, just add a CNAME to your root record. The new feature is included for free for everyone using CloudFlare's DNS. If you've been looking for a way to make your root record work on a hosted service like AWS, Heroku, or that hot new blogging platform WordPlumblr, look no further.
And, if you still need some convincing, here are some more quotes from happy beta testers:
"CNAME flattening solved email resolution errors for us which was very key. It also enables us to point our app hosted on skyprepapp.com to Amazon ELB and resolve emails there too."
"We chose to drop the www in our domain name around 6 years ago. Whilst I understand there are plenty of technical reasons to not use these APEX domains in URLs, in reality I don't like the idea of technical issues preventing us doing what we think is right. Our customers are internet savvy, and the www legacy is quite simply not needed anymore. So we dropped it and have used Amazon's Route 53 to date to balance our econsultancy.com traffic to our load balancers. When we moved across to CloudFlare, we had no intention to lose the Apex domain, but without CNAME flattening I do not believe this would have been possible. So this feature is very welcome and eased the migration to CloudFlare."
"CloudFlare's CNAME flattening solved a specific issue we saw with email resolution on our domain. When we first configured our DNS records for Junction we took advantage of a CloudFlare feature to set a CNAME record for our zone apex. This helped us resolve our root -- jct.com -- to an AWS ELB which routed traffic to our webservers and handled SSL termination. This worked well for resolving our http traffic, but we saw some issues with sending domains on older infrastructure that couldn't pick up our MX records correctly (we specifically saw this from senders on older versions of Microsoft SMTP Server). We worked with CloudFlare to isolate the problem and transition our domain to use their CNAME flattening solution. This effectively translates our CNAME to A records and keeps the IP addresses in sync with what is associated with our load balancer. All previous email problems have been resolved since we made the switch three months ago. A huge thanks to the CloudFlare team for getting a solution in place so quickly and an incredible level of support throughout the process."