Images and https/ssl content

Imagine you have a web site. It contains static pages, images, and perhaps even dynamic content. You serve it to the world over http. Unencrypted.

One day you decide that you'd like to offer all that content in "secure" mode. It seems that everybody else makes their sites (or at least a substantial chunk of their sites) available over https, so you figure you ought to too.

There are one or two things you should know before you do this...

First, there is no reason why all your content shouldn't be available equally over http and https. As long as all your links are to relative URLs (i.e. to /thing/a.html rather than to a full URL that names the protocol and host), you should be fine. However, so-called "client redirects" must use absolute URLs, so you must ensure that these are issued with the correct protocol identifier (http versus https).
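
As a rough illustration of the redirect point, here is a minimal Python sketch that builds the absolute URL for a client redirect from the scheme of the incoming request. The function name and the www.example.com host are assumptions made up for this example; how you find out the request's scheme depends on your own server or framework.

```python
from urllib.parse import urlunsplit


def absolute_redirect_url(request_scheme, host, path, query=""):
    """Build an absolute redirect URL that keeps the scheme of the request.

    request_scheme is whatever the current request arrived over
    ("http" or "https"); obtaining it is left to the caller.
    """
    if request_scheme not in ("http", "https"):
        raise ValueError("unexpected scheme: %r" % (request_scheme,))
    # urlunsplit takes (scheme, netloc, path, query, fragment).
    return urlunsplit((request_scheme, host, path, query, ""))


# A request that arrived over https is redirected to an https URL.
print(absolute_redirect_url("https", "www.example.com", "/thing/a.html"))
```

The point of passing the scheme in, rather than hard-coding it, is that the same code can then serve both the http and the https version of the site.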

Second, you should be aware that every https site uses up an IP address of its own. Large, mass-hosting operations can host thousands of sites very efficiently on one machine or IP address over http. However, because the encrypted session is negotiated before the server sees which host name the browser is asking for, only one https site can live at a given IP address. While it is always theoretically possible for your hosting provider to get enough IP addresses from RIPE, ARIN, or whoever, it is a hassle. Expect your hosting provider to charge you extra for ssl hosting.

Third, you will need an SSL certificate. You can make and sign one yourself, but (i) you lose some of the benefits of ssl/https, and (ii) practically everybody that visits your site will get a warning from their browser about your certificate being signed by a non-recognised issuing authority. A certificate from Verisign will cost you several hundred dollars, and one from Thawte will cost you about US$125 or thereabouts per year. (Note that you'll have to renew it every year!)

Fourth, because the encryption and decryption required by https/ssl take some computation at both ends, your site may appear slower to your clients. How much slower will depend to some extent on what kind of processor the client has, and to a much larger extent on what kind of processor is in your server, and how busy it is.

If you find that your web server is getting CPU bound serving up your web pages, you can either go buy a new, more powerful web server, or go buy yourself a hardware ssl accelerator card or a dedicated reverse-proxy ssl accelerator box. With modern multi-gigahertz processors, it is rare for machines involved in a little light web serving to be CPU-bound. Of course, it can happen: dynamic web pages, especially badly written ones, can be quite CPU-hungry. Also, even well-powered machines hosting thousands upon thousands of websites on a fast 'net connection can find themselves CPU-bound.

Fifth, because the connection between the browser and your web server is now encrypted, the content is invisible to proxy servers, so you lose any benefit you may have enjoyed from caching. While this isn't that big a deal, it shouldn't be underestimated: many large ISPs employ some form of fairly aggressive transparent http cache, which may have made your site appear much faster over http than it really was.

Sixth, a not uncommon apache configuration is to serve the same content over both http and https, with separate apache instances. Typically, though, the http apache is optimised to deal with many, many simultaneous clients, and the https apache is "optimised" not to hog too many resources!

Seventh, you may be tempted to think about offering your images over a plain http connection, and your html over an encrypted https connection. Forget about it. While it is technically possible for your html to refer to images with full URLs that specify the http (unencrypted) transport mechanism, almost all browsers will throw up a warning about a "secure" page containing "insecure" content.
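
If you want to check whether a page would trigger that warning, something along the following lines might help. It is only a sketch in Python: the class and function names are invented for this example, it looks at img tags only, and it uses the standard library's html.parser rather than anything browser-grade.

```python
from html.parser import HTMLParser


class InsecureImageFinder(HTMLParser):
    """Collect the src of every <img> that uses an absolute http:// URL."""

    def __init__(self):
        super().__init__()
        self.insecure = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser hands us lower-cased tag and attribute names.
        if tag != "img":
            return
        for name, value in attrs:
            if name == "src" and value and value.lower().startswith("http://"):
                self.insecure.append(value)


def find_insecure_images(html):
    """Return the image URLs in html that would be fetched unencrypted."""
    finder = InsecureImageFinder()
    finder.feed(html)
    finder.close()
    return finder.insecure


page = '<p><img src="/img/logo.png"><img src="http://www.example.com/img/photo.jpg"></p>'
print(find_insecure_images(page))
```

In the sample page, the relative /img/logo.png is left alone because it inherits whatever protocol the page itself was served over; only the image with the explicit http:// URL is reported.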

So, if you have started to offer your web site over https and you're finding it to be slower than over http, you know to wonder whether it's due to a CPU-bound server, an inability to take advantage of intermediate caching, or a poorly optimised https server.
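
Before chasing any of these, it may be worth measuring the difference rather than guessing at it. The following Python sketch times repeated fetches of a URL and reports the median; the function names, the number of attempts, and the ten-second timeout are all assumptions chosen for illustration, and a few fetches from one location are no substitute for a proper benchmark.

```python
import time
from urllib.request import urlopen


def fetch(url):
    """Download url and return the body (default fetcher, needs network)."""
    with urlopen(url, timeout=10) as response:
        return response.read()


def median_fetch_time(url, attempts=5, fetcher=fetch, clock=time.perf_counter):
    """Fetch url `attempts` times and return the median elapsed seconds.

    fetcher and clock can be swapped out, which makes it possible to
    try the function without touching the network.
    """
    timings = []
    for _ in range(attempts):
        start = clock()
        fetcher(url)
        timings.append(clock() - start)
    timings.sort()
    return timings[len(timings) // 2]
```

You would call median_fetch_time once with the http URL of a page and once with its https URL, and compare the two figures. The median is used rather than the mean so that one slow outlier does not skew the result.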

If you suspect your web server of being CPU bound, then you'll be able to analyse its process table ("top" under Linux/Unix, the Task Manager under Windows) to see if any processes are unduly hogging CPU resources.

If you try to access your web site over http from numerous different locations, you'll get a good feel for how much it benefits from intermediate caching. (Unless, of course, the cache is very close to your web server, and all your clients benefit from it.)

It is hard, though, to give general guidelines to help identify if your https server is just under-optimised, so I won't try here. (I should probably mention at this point that I have substantial experience with optimising Apache for both http and https content delivery, and am available for consultation at very reasonable rates. Drop me a line: wesley at yelsew dot com)

So, in summary, there are a number of issues related to serving your website over https: reasons not to do it (IP addresses, extra hosting cost, and certificate cost), and speed issues once you are already doing it (encryption overhead, lack of intermediate caching, difficulty of effectively configuring https servers). Hopefully this document has given you an introduction of sorts to these issues.

Finally, Swan Labs make an extremely neat Web Accelerator product that, among many other things, can be used as an ssl offloader. I am ... how shall I put this? ... not entirely unbiased in this matter.