An in-depth introduction to the weird world of SSL
Today Google announced that HTTPS is to be used as a ranking factor, albeit initially a small one. So at the moment there are a lot of people asking a lot of questions about SSL, and overnight SSL has become more than just a sensible security measure: it is now a requirement from the marketing team.
In many ways, it’s a great day for the web, but a sad day for the web’s priorities.
What’s coming up:
- So what is SSL anyway?
- What is a Cipher Suite?
- Do I actually need to get a certificate?
- Which certificate should I buy?
- How is this going to be set up?
- Can I secure WordPress?
- What is SPDY?
- What was Heartbleed?
- What is revoking and OCSP Stapling?
- Perfect Forward what now?
- How does the HTTP Strict Transport Security header work?
- So I heard.. common myths debunked?
- So what has any of this got to do with rankings?
- Final thoughts
So what is SSL anyway?
SSL (Secure Sockets Layer) is a method for establishing a connection between two endpoints and then passing encrypted packets over it. Now here’s the bit that’ll get your head spinning – no one uses SSL, and we haven’t for many years. You see, SSL was succeeded by TLS (Transport Layer Security), so when someone talks about SSL they are probably talking about TLS.
Confused? Don’t worry too much about the names, it’s just one of those quirks, and while the underlying tech has changed we still refer to SSL certificates and SSL.
At its heart, TLS uses a type of cryptography known as Public Key Cryptography, which relies on two keys: a public key and a private key. These keys are used in a handshake to share a symmetric key for transmitting data. Once the handshake has been made, the server and client use the symmetric key to encode and decode the data.
So how does your browser know about the public key that allows it to receive the second key? Well, it’s in certificates.
SSL Certificates are a mechanism for distributing the public key, and each certificate contains at least:
- Who owns the certificate
- How long the certificate is valid and a unique identifier
- The owner’s public key
- A signature of the person certifying the certificate.
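To make those fields a little more concrete, here is a minimal Python sketch that summarises a certificate in the dictionary form returned by the standard library's ssl.SSLSocket.getpeercert(). The sample_cert values and the helper names are made up for illustration, and this dictionary form does not carry the public key or the signature themselves, so the sketch only covers the owner, the certifier, the identifier and the validity period.

```python
import ssl
from datetime import datetime, timezone

# Hypothetical certificate, shaped like the dict returned by
# ssl.SSLSocket.getpeercert(); every value here is invented.
sample_cert = {
    "subject": ((("commonName", "www.example.com"),),),
    "issuer": ((("organizationName", "Example CA"),),
               (("commonName", "Example CA Root"),)),
    "serialNumber": "0A1B2C3D",
    "notBefore": "Aug  7 00:00:00 2014 GMT",
    "notAfter": "Aug  7 23:59:59 2015 GMT",
}


def name_field(name, key):
    """Return the first value stored under `key` in a subject/issuer name."""
    for rdn in name:
        for k, value in rdn:
            if k == key:
                return value
    return None


def summarise(cert):
    """Pull out who owns it, who certified it, its identifier and its expiry."""
    expires = ssl.cert_time_to_seconds(cert["notAfter"])
    return {
        "owner": name_field(cert["subject"], "commonName"),
        "certified_by": name_field(cert["issuer"], "commonName"),
        "serial": cert["serialNumber"],
        "valid_until": datetime.fromtimestamp(expires, tz=timezone.utc),
    }


summary = summarise(sample_cert)
print(summary)
```

Against a live site the same summarise function could be fed the result of getpeercert() on a connected socket instead of the invented sample.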
Now in theory anyone can certify a certificate, including its owner. That’s called self certifying (or self-signing), but letting people give themselves a pat on the back isn’t always the greatest idea. So most systems, including browsers, don’t trust a self-certified certificate by default and produce suitably scary error warnings. Instead, the internet relies on Certificate Authorities (CAs) who act a bit like an underwriter, confirming the details of the certificate (nearly always for a fee, of course).
This system works well as long as everyone trusts the Certificate Authorities, but when they don’t or if there is an issue the system has a tendency to fall down. It turns out the web isn’t very trusting at all! If you do much reading on SSL you’ll find that it’s not SSL or certificates that are the weak points but the Certificate Authorities themselves. But as this is our default system on the web we have to rely on it until something better comes along.
What is a Cipher Suite?
So SSL isn’t really SSL but TLS, yet that’s just the “protocol” – the real work is done by the cipher suites. A cipher suite is a named combination of algorithms: one for exchanging keys, one for authenticating the server, one for encrypting the data and one for confirming the authenticity of each packet. There are lots of different suites because security is forever changing: people find holes in old ciphers, and new ciphers are released to replace them. The problem is that not every client supports the latest and greatest cipher, so the balancing act is offering enough secure ciphers to cover all your clients while not offering ciphers which can be easily compromised.
We do this on the server side by creating a list of preferred ciphers, so that when a client connects the server works through its preferred list until it finds a cipher the client also understands, and runs with it. A complete list of cipher suites is maintained by the Internet Assigned Numbers Authority (IANA). In addition to the ciphers you prefer you can also specify ones you won’t allow (the entries prefixed with ! in the strings below).
In Apache this will look something like:
SSLProtocol ALL -SSLv2
SSLHonorCipherOrder On
SSLCipherSuite ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:ECDH+3DES:DH+3DES:RSA+AESGCM:RSA+AES:RSA+3DES:!aNULL:!MD5:!DSS
While in NGINX it looks like:
ssl_prefer_server_ciphers on;
ssl_protocols SSLv3 TLSv1 TLSv1.1 TLSv1.2;
ssl_ciphers ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:ECDH+3DES:DH+3DES:RSA+AESGCM:RSA+AES:RSA+3DES:!aNULL:!MD5:!DSS;
Both suite templates were taken from hardening your website SSL ciphers post.
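If you are curious which actual suites a preference string like this selects, one way to check is to hand it to Python's ssl module, which passes it to the local OpenSSL build. This is only a sketch: the expand helper is a name invented here, the result depends on the OpenSSL version installed, and newer builds may also list TLS 1.3 suites that this string does not control.

```python
import ssl

# The preference string from the templates above, in OpenSSL's cipher-list syntax.
CIPHER_STRING = (
    "ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:"
    "ECDH+3DES:DH+3DES:RSA+AESGCM:RSA+AES:RSA+3DES:!aNULL:!MD5:!DSS"
)


def expand(cipher_string):
    """Return the suite names the local OpenSSL build selects, in preference order."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    context.set_ciphers(cipher_string)  # raises ssl.SSLError if nothing matches
    return [suite["name"] for suite in context.get_ciphers()]


suites = expand(CIPHER_STRING)
for name in suites:
    print(name)
```

Running it on your own server is a quick way to confirm that an excluded family, such as anything using MD5, really is absent from the list.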
Do I actually need to get a certificate?
The answer is probably yes. Ignoring the Google announcement, do you do any of the following:
- Collect any personal information, names, addresses, emails, phone numbers, social security etc
- Take payments of any sort (even if the payment itself is taken offsite)
- Log in, or have an admin area which you do not want others to have access to
- Have information on your site which might cause someone potential danger, for example many countries do not have free speech, or have laws against religious expression etc
If you do any of these things, then you probably should be running your site over SSL as a matter of course. Chances are you are not, and you are not alone.
The vast majority of sites that probably should be using SSL are not.
So why should you encrypt these details? Well, whenever you or the person submitting the information sends it across, they are doing so in the clear, which means that the content can be snooped on at the local network level (so anyone sitting in the coffee shop), at the ISP and anywhere in the hosting company’s infrastructure. With SSL, in theory, anyone intercepting this data will not be able to decrypt it. As with any security nothing is 100%, and it’s long been assumed that SSL has been broken by several interested parties. That said, SSL provides pretty good protection for transmitting data (though if you want to hide your data from the NSA you might want to think about a typewriter and dead drops).
Where do I buy a certificate?
So where do you buy these certificates? Chances are your hosting company will offer to sell you a certificate, but if possible I would avoid purchasing one from your hosting company (though in some cases you might have no choice). Instead I would look at one of the Domain Registrars who nearly always also offer a range of SSL certificates.
SSL Certificates range from < $10 to about $1k and they vary so much in price depending on what they do, how they are encrypted and how they have been validated. At some point, someone is going to point out that you can also find a couple of providers of Free certificates, but be warned, these providers (the best known is StartSSL) often provide these certificates with no support, and will charge to revoke them, often more than it would have cost to buy a cheap certificate with a free revoking option. Just something to bear in mind, spending a few dollars might save you a lot of time in the long run, and possibly work out cheaper. When you buy your SSL you will normally get the option to buy in 1, 3 or 10 year chunks. While there is always a financial advantage to buying long lasting SSL certificates if you do, make sure you will be able to revoke them for free as you don't want to pay for 10 years and then pay again if there is an issue.
What type of certificate should I get?
There are generally 3 types of certificates on offer:
- Domain validated
- Organisation validated
- Extended Validation
Domain validated certificates are usually entirely automated (requiring the purchaser to have access to very specific email addresses) and require little information except for details about the domain. They come in two forms: single domain (so a certificate for www.example.com wouldn’t cover mydomain.example.com) and wildcard, which covers *.example.com. Wildcards are normally quite costly, whereas single domains are quite cheap. At time of writing a single domain is $9.95 for a PositiveSSL certificate from NameCheap (I will use NameCheap when quoting prices, but SSL certificates are available from your favourite registrar), or $10.95 for a RapidSSL. For a wildcard you are looking at about $94.95 for a PositiveSSL wildcard and $148.88 for a RapidSSL wildcard.
For personal use domain validation is cheap and effective. The result, apart from a secure site, is that the HTTPS in the browser bar goes green along with a padlock.
Organisation Validated is more or less identical to domain validated and has pretty much been superseded by Extended Validation. However, many CAs still offer Organisation Validation, which puts extra information into the certificate about the organisation. During the purchase process extra information is asked about the organisation, which is then vetted (to varying levels of “vetting”). This information is designed to give a site visitor confidence that the CA is validating not only the domain but also that the domain owners are a legitimate organisation. Costs for Organisation Validated certificates vary wildly, from about $39.00 for an InstantSSL through to $675.88 for Symantec Secure Site Pro. Wildcard prices and batch prices also vary.
Given the existence of Extended Validation, Organisation Validated certificates don’t normally represent good value for money, because while the extra information is available on the certificate, in browsers it’s hidden quite deep so only the most curious visitor is going to see it.
Extended Validation, AKA the big green bar, is normally the most expensive type of certificate, and when purchasing the certificate you are required to give extra organisation information. The big feature over the other types of certificate is that it produces the green bar displaying the organisation’s name at the top. This is designed to give your visitors confidence that the domain and organisation have been confirmed through vetting. Prices for Extended Validation on NameCheap start at $145 for a Comodo EV certificate and go up to nearly $1k for a Symantec one. Like other certificates, multi-domain and multibuy deals are available, though wildcard certificates are not issued at the Extended Validation level.
Extended Validation is suitable for companies, charities and other organisations where a high level of trust is needed. In recent years a lot of emphasis has been put into training consumers to trust the green bar.
Regardless of which option you go for, try to make sure you get a certificate with a 2048-bit key. Some cheaper providers may offer 1024, which is below the industry standard. Other than that, the best piece of advice is to go with a well-known supplier, such as GeoTrust, Comodo, Symantec or Thawte. These are the most likely to have their certificates bundled with various clients and so have the greatest coverage. Finally, SSL prices vary a lot for no real reason, and just because a certificate is $400 more expensive does not make it “better” than a $10 certificate in terms of security. So do some research beforehand.
Who should set this up?
Ok so you purchased your certificate either through your host if you were locked in, or an SSL supplier. Now it comes to setting it all up. This is where instructions will vary wildly.
If you are on a shared host, chances are your host will either have an automated process for you to follow, or more likely they will ask you to email them with a support ticket. Beware on shared hosts as there may well be additional costs to hosting with an SSL certificate, as many hosts will charge for the additional IP address (by default most shared hosts host multiple sites on the same IP address and traditionally one certificate per IP is the norm).
For a VPS/Dedicated Box chances are it will be up to you or your sysadmin to follow the instructions provided by the certificate issuer which will depend on your stack setup. However, installing and configuring an SSL certificate is a fairly simple process, just make sure to keep your keys in a safe place.
OpenSSL, LibreSSL or BoringSSL
You may have heard a lot about OpenSSL recently in connection to a series of security issues, specifically Heartbleed (more about this later), but OpenSSL is a suite of programs designed to get SSL working on any server, or indeed client. It handles the creation of keys as well as the signing and the cryptographic functions needed for SSL. It’s not the only software to do this, for example Windows has its own suite and others exist, but it is probably the most common, as it’s found in most unix systems including virtually all Linux Distros.
OpenSSL is one of those ubiquitous projects used by everyone but which no one really cares about.
While it has a small team of volunteers, it is an old codebase that’s been built on over the years. The Heartbleed exploit exposed not only how fragile OpenSSL is, but how under-appreciated it was. Since Heartbleed a couple of alternatives have popped up: LibreSSL, a fork by the OpenBSD community, and BoringSSL by Google. Both projects are in their infancy, but along with renewed funding and drive they mean OpenSSL and its forks are getting a lot of attention and some love. When you are setting up your SSL certificate on your server, chances are you will still be using OpenSSL; just make sure it’s the latest patched version, and keep an eye out for updates. In the future you may well want to choose LibreSSL or BoringSSL as your default choice.
Fixed IP or SNI
Historically IT infrastructure could only allow one certificate per IPv4 address, so if you needed an SSL certificate you needed a fixed IP, and as only one certificate could be served from that IP every site wanting SSL needed to be on its own IP. As IPv4 addresses become rarer the cost of these IPs is increasing. This cost, with suitable markup (and more often than not with a huge markup), is being passed from the hosts to the web site owner (especially with shared hosting, where it’s not unusual to see a $10/month charge for an additional IP, while others offer one for free, or $2/month after that).
In such circumstances one of the options is SNI (Server Name Indication), which is an extension of TLS, the technology behind what we today call SSL (remember the history lesson back at the top). With SNI the client sends the hostname it wants as part of the handshake, so the server can pick the certificate that matches that name, in effect allowing more than one certificate per IP. There is of course a catch: SNI is a relatively late addition (a patch adding it to OpenSSL only appeared in 2004), which means some older clients don’t support it. The big one? Internet Explorer on Windows XP. There are others, such as very old builds of Android and some Java applications, but the biggie is IE on XP. If an IE user with Windows XP tries to access the site, they get a “this certificate is not valid, proceed anyway” message. Sadly, if you’re an IE user on Windows XP you’re probably used to the internet not working and click accept regardless, but it is worth considering that you may well be excluding these users. Also be aware that XP is still being used heavily in corporate environments, where the user may not have the option to click accept due to security controls at an organisation level.
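As a rough illustration of the name-to-certificate lookup that SNI makes possible, here is a minimal Python sketch using the standard ssl module. The hostnames and the context_for and choose_certificate helpers are invented for this example, and the contexts are left empty; a real server would load each site's certificate and private key into its own context.

```python
import ssl

# Hypothetical hostnames; a real server would call load_cert_chain() on each
# context with that site's own certificate and private key.
contexts = {
    "shop.example.com": ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER),
    "blog.example.com": ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER),
}
default_context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)


def context_for(server_name):
    """Pick the context (and so the certificate) for the name the client sent."""
    if server_name is None:  # the client sent no SNI, e.g. IE on Windows XP
        return default_context
    return contexts.get(server_name.lower(), default_context)


def choose_certificate(ssl_socket, server_name, initial_context):
    """Callback run during the handshake, once the client's SNI value is known."""
    ssl_socket.context = context_for(server_name)
    return None  # None tells the ssl module to carry on with the handshake


# All sites share one listening socket (one IP); the callback does the switching.
default_context.sni_callback = choose_certificate
```

Clients that send no name fall through to default_context, which is why non-SNI browsers end up seeing a certificate that does not match the site they asked for.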
Finally, your stack has to support SNI and not all hosts do! If you’re on shared hosting, then SSL is probably a money making scheme for your host, so they would rather sell you a monthly IP address and SSL certificate than go down the SNI route. Others may simply not have considered implementing SNI; if in doubt contact your host and ask. I reached out to a range of hosts asking if they supported SNI, and of those who got back to me only Dan Foster at 34SP was able to say yes (they had implemented it that day). 34SP offer SNI for their smaller packages, and for their more expensive offerings they supply a fixed IP. It’s a good example of how giving someone a bit of a nudge goes a long way, so it’s always worth asking your host.
Everything over SSL or mixed content
Once you have your certificate set up you will have to decide what you are going to do with your web content. Do you run your entire site over SSL or just specific areas such as checkouts, logins, admin area? This is going to be a business choice, but from personal experience, if you are going to be using SSL you might as well use it by default on your site unless there is a good reason not to. If everything is under SSL the amount of work you have to do is reduced significantly. It’s important to remember that while the content is the same you are running two separate sites, one over https and the other over http. There is nothing to stop you running different content on each and if you misconfigure your server that’s very easy to do. My preference therefore is to 301 redirect non https traffic to its https equivalent.
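In practice the redirect is usually a line or two of web server configuration, but the logic is simple enough to sketch in Python with the standard http.server module. Treat this as an illustration only: the handler and helper names, the example.com fallback and port 8080 are all assumptions made for this example, and the helper assumes the Host header holds a hostname rather than an IPv6 literal.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def https_equivalent(host, path):
    """Build the HTTPS URL for a request that arrived over plain HTTP."""
    hostname = host.split(":")[0]  # drop any explicit port such as :80
    return "https://" + hostname + path


class RedirectToHTTPS(BaseHTTPRequestHandler):
    """Answer every plain-HTTP GET or HEAD with a permanent redirect."""

    def do_GET(self):
        # "example.com" is a placeholder fallback for requests with no Host header.
        host = self.headers.get("Host", "example.com")
        self.send_response(301)
        self.send_header("Location", https_equivalent(host, self.path))
        self.end_headers()

    do_HEAD = do_GET


def serve(port=8080):
    """Run the redirector; port 8080 is an arbitrary choice for local testing."""
    HTTPServer(("", port), RedirectToHTTPS).serve_forever()
```

Because the path and query string are carried over unchanged, a visitor who asked for a deep link over HTTP lands on the same page over HTTPS rather than on the home page.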
Playing well with WordPress
WordPress already has a strong support for SSL with some more improvements coming in WordPress 4.0. At the moment you can make checks if content is being served over SSL with the is_ssl function, and you can also set a pair of defines to force your admin area and login area over HTTPS.
define('FORCE_SSL_ADMIN', true);
define('FORCE_SSL_LOGIN', true);
The only other issues you might have are with themes and content being served accidentally over HTTP. If you plan on working solely with HTTPS then you may wish to look at the SSL Insecure Content Fixer and the far more heavyweight WordPress-HTTPS. In themes you may wish to make sure you call files using // rather than http:// or https:// – this allows the client to request the file using whichever protocol the page itself was served over.
Finally, watch out for badly written plugins that might enqueue or, worse, hard code http links. If you come across one give the plugin author a nudge, as it will normally be simple to fix assuming an HTTPS version is available.
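If you want to hunt for hard-coded HTTP assets yourself, a small script can scan a page's HTML for them. The following is a minimal Python sketch using only the standard library; the class name, the list of tags checked and the sample markup are all choices made for this example rather than anything WordPress provides.

```python
from html.parser import HTMLParser


class InsecureAssetFinder(HTMLParser):
    """Collect src/href values on asset tags that are hard-coded to http://."""

    ASSET_TAGS = {"img", "script", "link", "iframe", "video", "audio", "source"}

    def __init__(self):
        super().__init__()
        self.insecure = []

    def handle_starttag(self, tag, attrs):
        if tag not in self.ASSET_TAGS:
            return  # ordinary <a href> links are not loaded as part of the page
        for name, value in attrs:
            if name in ("src", "href") and value and value.lower().startswith("http://"):
                self.insecure.append(value)


def find_insecure_assets(html):
    finder = InsecureAssetFinder()
    finder.feed(html)
    return finder.insecure


# Made-up theme output: one hard-coded HTTP script, one protocol-relative image.
page = """
<script src="http://cdn.example.com/slider.js"></script>
<img src="//cdn.example.com/logo.png">
<a href="http://example.org/">a plain link</a>
"""
print(find_insecure_assets(page))  # ['http://cdn.example.com/slider.js']
```

Only the script is reported: the image uses the // form, and the plain link is something the visitor clicks rather than something the page loads.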
What is SPDY?
Have you noticed a recurring theme with Google? Well, they strike again. SPDY is a Google project that doesn’t really have much to do with SSL but rather with HTTP and how data is transmitted to and from the client. Its goal is to reduce page load times, mainly through multiplexing: rather than opening one connection and handshake after another, it opens a single connection, does the handshake once and then pushes multiple pieces of content through it. SPDY only works over SSL, which is why it comes up in conversations about SSL, and it does make passing data over HTTPS faster for clients which support it.
What about Heartbleed, is this stuff even secure?
Heartbleed was a bug in the OpenSSL library. It was not an issue with TLS (what we mean when we refer to SSL) or indeed with the way we manage certificates through the Certificate Authorities. Rather it was a bug in a program. It’s important to emphasise this point, as it was the ubiquity of OpenSSL and the lack of scrutiny, funding and oversight of OpenSSL that caused the issue, not a flaw in the system of SSL. We don’t shut down every airport when a plane crashes, for example, though we do ground that type of plane until everything is given the all clear.
If you have read this article all the way through, you will get from the tone that the way we work with SSL today is an imperfect system and it does have flaws and weak spots.
However, it is far better to be using an imperfect system than no system at all.
It’s also important to remember that security and protection is never 100% and at best we can hope for an “ok” or “pretty good” system which is the state we are currently at.
Revoking a certificate and the day it all went wrong.
One of the areas that our current system needs a lot of work is in revoking certificates. When Heartbleed occurred, a lot of sites revoked and reissued their certificates, with a new public key. The idea being that if the old key has been compromised we trash the certificate and say “this is no longer valid, here is the new certificate”. The problem is, many clients were still running around with the old certificates and had no built in way to check if the certificate was still valid, until they tried to make a handshake and got told the public key wasn’t working. So how do we let the world know that a certificate has been revoked?
You probably don’t want to know this, but when you revoke a certificate, the issuer adds the certificate to a revocation list, which is basically a large dump of all revoked certificates, and then distributes this list to top level CAs and other interested parties, by email and FTP. There are two competing standards for checking revocation: CRL (the list itself, downloaded in full) and OCSP (a lookup protocol that asks the CA about a single certificate). There are also CRLSets, again from Google, which are used in Google Chrome and which CAs can push data into. These sets are available on the web, and I strongly encourage anyone interested to read “Revocation still doesn’t work”, then go have a cry.
So that’s how the list is produced, but what about your client? Well everyone handles things slightly differently but normally what happens is your client when presented with a certificate will communicate with their CA and get a dump of the latest list, which the client then parses. If the certificate isn’t on the list it carries on happily. If the certificate is on the list, then it will know not to trust the certificate being presented. This is one of the big reasons for people claiming SSL is slow, because even if the client already has an up to date list this process takes its sweet time.
To get round this slow process, there is an option for the server to bear the brunt of the processing in the form of OCSP Stapling. Instead of the client looking up the validity of the certificate with the CA, the answer is “stapled” into the handshake by the server. This stapled response is signed by the certificate authority and has a limited lifespan. So the client simply has to validate the certificate authority’s signature rather than making expensive extra requests to the CA.
Perfect Forward what now?
When we destroy or revoke a certificate, it’s important to remember to also destroy the keys associated with it. The certificate is just a mechanism to deliver the public key to the client, so if the private key still exists on the server it can still be used, and it can also be used to read previously captured traffic. When a handshake is made, the client creates a random string, encrypts it with the public key and sends it to the server, which uses the private key to decrypt it. Both sides then derive the symmetric key for the session from this random string. If at a later date the private key is compromised, it can be used to recover this random string from a recorded handshake.
If the client and server support forward secrecy then, rather than relying on the server’s public and private key, a shared key is created by both sides using random values. The result is that anyone eavesdropping in the middle will not be able to decrypt the traffic, and even if they subsequently gain the private key they won’t have access to historical data. The protocol for this is named Diffie-Hellman after its inventors, and is used in several ciphers supported by OpenSSL and others. It’s also the default when using OpenSSL with Nginx and can be set up for use with Apache and others. For anyone interested I strongly suggest reading the Diffie-Hellman Key Exchange article on Wikipedia for a better explanation.
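The arithmetic behind the exchange is surprisingly small. Below is a toy Python sketch of a Diffie-Hellman exchange; the numbers P and G and the keypair helper are chosen purely for illustration and are far too small to be secure, whereas real TLS uses much larger groups or elliptic curves.

```python
import secrets

# Toy public parameters, far too small for real use.
P = 0xFFFFFFFB  # a small prime modulus (2**32 - 5)
G = 5           # public base


def keypair():
    """Pick a fresh private exponent and derive the public value to send."""
    private = secrets.randbelow(P - 3) + 2  # in the range 2 .. P-2
    public = pow(G, private, P)
    return private, public


# Each side generates a throwaway ("ephemeral") pair for this one session.
client_private, client_public = keypair()
server_private, server_public = keypair()

# Only the public values cross the wire; each side combines the other's
# public value with its own private one.
client_secret = pow(server_public, client_private, P)
server_secret = pow(client_public, server_private, P)

print(client_secret == server_secret)  # True
```

Because the private exponents are generated per session and then thrown away, there is nothing long-lived on the server that could later be stolen to recover the shared secret, which is the "forward" part of forward secrecy.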
HTTP Strict Transport Security header
An HTTP Strict Transport Security (HSTS) header is a way to force a client to only request content over HTTPS. It works by sending an HTTP header in the HTTPS response stating which domains should be requested over HTTPS and how long it should be before the client rechecks if this is still the policy.
Strict-Transport-Security: max-age=31536000; includeSubDomains
So for example the above header would maintain the HSTS policy for a year, and expect the policy to be enforced across all subdomains. This is not an extension of SSL but rather a way for a client to determine what to do. It’s entirely up to the client to decide if it will obey the header. HSTS is useful even on sites that redirect traffic to HTTPS by default, as it helps prevent man in the middle style attacks, where a client requests the HTTP version of the site and, before the server redirects them to the HTTPS version, the request is diverted to another server that returns an HTTP payload. With HSTS, if the client has previously connected to the site it would not make the initial request to the HTTP endpoint but go directly to the HTTPS one, and so throw a warning if something is wrong.
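To see what a client has to pull out of that header, here is a minimal Python sketch that parses the value shown above. The parse_hsts function and the dictionary keys it returns are names made up for this example, and it only handles the two directives discussed here.

```python
def parse_hsts(header_value):
    """Split a Strict-Transport-Security value into its directives."""
    policy = {"max_age": None, "include_subdomains": False}
    for directive in header_value.split(";"):
        name, _, value = directive.strip().partition("=")
        name = name.strip().lower()  # directive names are case-insensitive
        if name == "max-age":
            policy["max_age"] = int(value.strip().strip('"'))
        elif name == "includesubdomains":
            policy["include_subdomains"] = True
    return policy


policy = parse_hsts("max-age=31536000; includeSubDomains")
print(policy)  # {'max_age': 31536000, 'include_subdomains': True}
print(policy["max_age"] / (24 * 60 * 60), "days")  # 365.0 days
```

The max-age value is in seconds, which is why 31536000 works out to exactly 365 days.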
Common HTTPS Myths Debunked
HTTPS is slow – Technically, using SSL requires every connection to make a handshake, which not only increases the amount of data being sent but also requires extra processing power on either end. So while HTTPS is slower than HTTP, it is by a very small factor that most clients will not notice. For clients and servers supporting SPDY you will actually see a page speed increase when SPDY is in use.
You can’t show Video/Images/Script from other sites – This is one of those myths that again is partly true. If you load an asset over HTTP on a page that is being served as HTTPS, then the user’s browser will produce a warning about Mixed Content, and turn your padlock from Green to Orange. Some browsers, depending on their security settings, won’t show HTTP content from a different domain on a page served over HTTPS. So if a page is served over HTTPS then all the assets on the page must also be served over HTTPS including external sites and things in iFrames.
It’s complicated – The math behind SSL is fairly complex but thankfully you don’t need to worry about it, implementing SSL on a site is fairly straight-forward as sysadmin tasks go. Just follow the instructions given to you by the issuer.
HTTPS content doesn’t rank and produces duplicate content – Well given Google just said it considers HTTPS to be a ranking factor the first part is clearly a myth, the second has a little bit of truth in it. When you set up SSL for a site, you normally create a new vhost (to use Apache terminology) which loads the same content as your HTTP version, but it doesn’t have to be. So in the purest sense, when you serve content over both HTTP and HTTPS you are creating duplicate content. Does Google see it that way? Well it certainly used to, but it’s getting far more clever about it. However, one way to prevent this as an issue is to force content served over HTTP to HTTPS via a 301 redirect, this way Google will only ever see your HTTPS traffic.
Google can’t index HTTPS content – Total rubbish, it regularly does, where I suspect this myth comes from is the fact that if you switch from HTTP to HTTPS only, it takes time for Google to reindex the site and you may see an initial drop in rankings, which then picks up again.
WordPress can’t be served over HTTPS – I don’t know where this came from but I keep hearing it, WordPress can be used over HTTPS and indeed it has built in features to force the admin area to be served over HTTPS, and version 4.0 has a dedicated “task force” looking at improvements to SSL across the board in WordPress.
HTTP is better than HTTPS – Seriously, I have seen that as a genuine statement. Read this post and work out for yourself what the answer is.
It’s expensive to run HTTPS – This one I guess is relative, but certificates are under $10 a year, or free if you are willing to put up with a lot of work and some issues. If you’re on cheaper hosting then there is probably also an additional ongoing cost, but this should be minimal. If you have lots of sites sharing a single host then these costs will start to ratchet up, but consider this an investment in your reliability and your site visitors’ security and trust.
No one cares about the green bar – There has been a lot of training and communication from browsers and site owners about SSL, so most consumers are at least aware that a green padlock is good even if they don’t know why. But even if you don’t do it for your site visitors, consider that each time you log in you are doing so with a plain text password.
So why did Google put HTTPS as a ranking factor anyway?
Ok, we are nearly there but we have one question left to ask, why did Google make a huge announcement that SSL is a “minor” ranking factor?
Well, put simply the stick hasn’t worked. Security, sysadmins and operation teams have been trying to terrify senior management into using SSL and it’s scary how many sites that should be using SSL don’t. For many, “what if” scenarios are purely that, hypotheticals, and security is a thing to do later. However SEO and Marketing is something that is being done now, so Google is dangling a carrot – improve security and we will consider you more trusted and rank your content accordingly.
SSL as a ranking factor makes sense when you think of it less from a security point of view but from a trust point of view. The validation system means there has been a level of validation at the domain or organisation level and that someone cares enough to put an SSL certificate in place.
When you think of it as a trust indicator with security as a byproduct, the move makes more sense, and somewhere there are a bunch of sysadmins who are quietly feeling smug that they can now offset the cost of SSL to the company’s marketing department budget and not their own. Everyone is a winner!
That’s it folks
That’s it! Below are a few resources for further reading. If you are interested I tend to talk a lot about WordPress, security as well as performance, so please do sign up to my newsletter and if you happen to live in the UK I am running a Scaling and Managing WordPress “Advanced” Workshop on the 23rd September. You can grab details and find out how to book your place here.
Finally I strongly encourage everyone to check out Adam Langley’s blog (he is one of the security team at Google) and SSL Labs, who provide not only a good set of recommendations but also a nice online testing suite. And as my friend Barry said…
SSL is Good!