
December 04 2013

14:29

Speed Up Your Mobile Website With Varnish


  

Imagine that you have just written a post on your blog, tweeted about it and watched it get retweeted by some popular Twitter users, sending hundreds of people to your blog at once. Your excitement at seeing so many visitors talk about your post turns to dismay as they start to tweet that your website is down — a database connection error is shown.

Or perhaps you have been working hard to generate interest in your startup. One day, out of the blue, a celebrity tweets about how much they love your product. The person’s followers all seem to click at once, and many of them find that the domain isn’t responding, or when they try to sign up for the trial, the page times out. Despite your apologies on Twitter, many of the visitors move on with their day, and you lose much of the momentum of that initial tweet.

These scenarios are fairly common, and I have noticed in my own work that when content becomes popular via social networks, the proportion of mobile devices that access that content is higher than usual, because many people use their mobile devices, rather than desktop applications, to access Twitter and other social networks. Many of these mobile users access the Web via slow data connections and crowded public Wi-Fi. So, anything you can do to ensure that your website loads quickly will benefit those users.

In this article, I’ll show you Varnish, a Web application accelerator — a free and simple tool that makes a world of difference when a lot of people land on your website all at once.

Introducing The Magic

For the majority of websites, even those whose content is updated daily, a large number of visitors are served exactly the same content. Images, CSS and JavaScript, which we expect not to change very much — but also content stored in a database using a blogging platform or content management system (CMS) — are often served to visitors in exactly the same way every time.

Visitors coming to a blog post from Twitter would likely all be served exactly the same content — not only the images, JavaScript and CSS, but also the content that is created with PHP and with queries to the database before being served as a page to the browser. Yet each request for that blog post would involve not only the Web server that serves the file (for example, Apache), but also PHP scripts, a connection to the database, and queries run against database tables.

The number of database connections that can be made and the number of Apache processes that can run are always limited. The greater the number of visitors, the less memory available and the slower each request becomes. Ultimately, users will start to see database connection errors, or the website will just seem to hang, with pages not loading as the server struggles to keep up with demand.

This is where an HTTP cache like Varnish comes in. Instead of requests from browsers directly hitting your Web server, making the server create and serve the pages requested, requests would first hit the cache. If the requested page is in the cache, then it is served directly from memory, never touching Apache or the database. If the page is not in the cache, then the request is handed over to Apache as usual, whereupon Apache will create and serve the page, which is then stored in the cache, ready for the next request.

Serving a page from memory is a lot faster than serving it from disk via Apache. In addition, the page never needs to touch PHP or the database, leaving those processes free to handle traffic that does require a database connection or some processing. For example, in our second scenario of a startup being mentioned by a celebrity, the majority of people clicking through would check out only a few pages of the website — all of those pages could be in the cache and served from memory. The few who go on to sign up would find that the registration form works well, because the server-side code and database connection are not bogged down by people pouring in from Twitter.

How Does It Work?

The diagram below shows how a blog post might be served when all requests go to the Apache Web server. This example shows five browsers all requesting the same page, which uses PHP and MySQL.

When all requests go to the Apache Web server.

Every HTTP request is served by Apache — images, CSS, JavaScript and HTML files. If a file is PHP, then it is parsed by PHP. And if content is required from the database, then a database connection is made, SQL queries are run, and the page is assembled from the returned data before being served to the browser via Apache.

If we place Varnish in front of Apache, we would instead see the following:

If we place Varnish in front of Apache.

If the page and assets requested are already cached, then Varnish serves them from memory — Apache, PHP and MySQL would never be touched. If a browser requests something that is not cached, then Varnish hands it over to Apache so that it can do the job detailed above. The key point is that Apache needs to do that job only once, because the result is then stored in memory, and when a second request is made, Varnish can serve it.

The tool has other benefits. In Varnish terminology, when you configure Apache as your Web server, you are configuring a “back end.” Varnish allows you to configure multiple back ends. So, you might want to run two Web servers — for example, using Apache for PHP pages while serving static assets (such as CSS files) from nginx. You can set this up in Varnish, which will pass the request through to the correct server. In this tutorial, we will look at the simplest use case.
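As a rough sketch, a multiple-back-end setup might look like the following in Varnish 3 VCL. The hosts, ports and file extensions here are assumptions for illustration only; the rest of this tutorial sticks to a single Apache back end.


backend apache {
  .host = "127.0.0.1";
  .port = "8080";
}

backend static {
  .host = "127.0.0.1";
  .port = "8081";
}

sub vcl_recv {
  # Send requests for static assets to the second server; everything else goes to Apache.
  if (req.url ~ "\.(css|js|png|jpg|gif)$") {
    set req.backend = static;
  } else {
    set req.backend = apache;
  }
}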

I’m Sold! How Do I Get Started?

Varnish is really easy to install and configure. You will need root, or sudo, access to your server to install things on it. Therefore, your website needs to be hosted on a virtual private server (VPS) or the like. You can get a VPS very inexpensively these days, and Varnish is a big reason to choose a VPS over shared hosting.

Some CMSes have plugins that work with Varnish or that integrate it into the control panel — usually to make clearing the cache easier. But you can put Varnish in front of any CMS or any static website, without any particular integration with other systems.

I’ll walk you through installing Varnish, assuming that you already run Apache as a Web server on your system. I run Debian Linux, but packages for other distributions are available. (The paths to files on the system will vary with the Linux distribution.)

Before starting, check that Apache is serving your website as expected. If the server is brand new or you are trying out Varnish on a local virtual machine, make sure to configure a virtual host and that you can view a test page on the server using a browser.

Install Varnish

Installation instructions for various platforms are in Varnish’s documentation. I am using Debian Wheezy; so, as root, I followed the instructions for Debian. Once Varnish is installed, you will see the following line in the terminal, telling you that it has started successfully.


[ ok ] Starting HTTP accelerator: varnishd.

By default, Apache listens for requests on port 80, which is where incoming HTTP requests arrive. Because we want Varnish to sit in front of Apache, we need to configure Varnish to listen on port 80 instead and move Apache to a different port — usually 8080. We then tell Varnish where to find Apache.

Reconfigure Apache

To change the port that Apache listens on, open the file /etc/apache2/ports.conf as root, and find the following lines:


NameVirtualHost *:80
Listen 80

Change these lines to this:


NameVirtualHost *:8080
Listen 8080

If you see the following lines, just change 80 to 8080 in the same way.


NameVirtualHost 127.0.0.1:80
Listen 80

Save this file and open your default virtual host file, which should be in /etc/apache2/sites-available. In this file, find the following line:


<VirtualHost *:80>

Change it to this:


<VirtualHost *:8080>

You will also need to make this change to any other virtual hosts you have set up.

Configure Varnish

Open the file /etc/default/varnish, and scroll down to the uncommented section that starts with DAEMON_OPTS. Edit it so that it looks like the following block, which will make Varnish listen on port 80 (-a sets the address and port that Varnish listens on, -T the management interface, -f the VCL configuration file, -S the secret file used by the management interface, and -s the storage back end — here, 256 MB of memory).


DAEMON_OPTS="-a :80 \
-T localhost:1234 \
-f /etc/varnish/default.vcl \
-S /etc/varnish/secret \
-s malloc,256m"

Open the file /etc/varnish/default.vcl, and check that the default back end is set to port 8080, because this is where Apache will be now.


backend default {
.host = "127.0.0.1";
.port = "8080";
}

Restart Apache and Varnish as root with the following commands:


service apache2 restart
service varnish restart

Check that your test website is still available. If it is, then you’ll probably be wondering how to test that it is being served from Varnish. There are a few ways to do this. The simplest is to use cURL. In the command line, type the following:


curl http://yoursite.com --head

The response headers should include a line something like Via: 1.1 varnish.
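The full set of headers will look roughly like the following (the values here are illustrative and will vary). An X-Varnish header with two IDs and a non-zero Age are further signs that the response came from the cache rather than from Apache.


HTTP/1.1 200 OK
Content-Type: text/html; charset=UTF-8
X-Varnish: 911661513 911661492
Age: 12
Via: 1.1 varnish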

You can also look at the statistics generated by Varnish. In the command line, type varnishstat, and watch the hit rate increase as you refresh your page in the browser. Varnish refers to something it can serve as a “hit” and something it passes to Apache or another back end as a “miss.”

Another useful tool is varnishtop. Type varnishtop -i txurl in the command line, and refresh your page in the browser. This tool shows you which URLs Varnish is requesting from the back end — in other words, the requests that are not being served from the cache.

Purging The Cache

Now that pages are being cached, if you change an HTML or CSS file, you won’t see the changes immediately. This trips me up all of the time. I know that a cache is in front of Apache, yet every so often I still have that baffled moment of “Where are my changes?!” Type varnishadm "ban.url ." in the command line to clear the entire cache.

You can also control Varnish over HTTP. Plugins are available, such as Varnish HTTP Purge for WordPress, that you can configure to purge the cache directly from the administration area.
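If you want to handle this yourself, Varnish can be configured in /etc/varnish/default.vcl to accept HTTP PURGE requests from trusted addresses. The following is a sketch of the standard Varnish 3 recipe — treat the ACL entries as placeholders for your own hosts:


acl purge {
  "localhost";
  "127.0.0.1";
}

sub vcl_recv {
  if (req.request == "PURGE") {
    if (!client.ip ~ purge) {
      error 405 "Not allowed.";
    }
    return (lookup);
  }
}

sub vcl_hit {
  if (req.request == "PURGE") {
    purge;
    error 200 "Purged.";
  }
}

sub vcl_miss {
  if (req.request == "PURGE") {
    purge;
    error 200 "Purged.";
  }
}

With this in place, a plugin — or a simple curl -X PURGE http://yoursite.com/some/page — can invalidate a single URL.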

Some Simple Customizations

You’ll probably want to know a few things about how Varnish works by default in order to tweak it. Configuring it as described above should cause most basic assets and pages to be served from the cache, once those assets have been cached in memory.

Varnish will only cache things that it considers safe to cache, and it might not cache some common things that you would expect it to. A good example is cookies.

In its default configuration, Varnish will not cache content if a cookie is set. So, if your website serves different content to logged-in users, such as personalized content, you wouldn’t want to serve everyone content that is meant for one user. However, you’d probably want to ignore some cookies, such as for analytics. If the website does not serve any personalized content, then the only cookies you would probably care about are those set for your admin area — it would be inconvenient if Varnish cached the admin area and you couldn’t see changes.

Let’s edit /etc/varnish/default.vcl. Assuming your admin area is at /admin, you would add the following:


sub vcl_recv {
  if (!(req.url ~ "^/admin/")) {
    unset req.http.Cookie;
  }
}

Some cookies might be important — for example, logged-in users should get uncached content. So, you don’t want to eliminate all cookies. A trip to the land of regular expressions is required to identify the cookies we’ll need. Many recipes for doing this can be found with a quick search online. For analytics cookies, you could add the following.


sub vcl_recv {
  // Remove has_js and Google Analytics __* cookies.
  set req.http.Cookie = regsuball(req.http.Cookie, "(^|;\s*)(_[_a-z]+|has_js)=[^;]*", "");
  // Remove a ";" prefix, if present.
  set req.http.Cookie = regsub(req.http.Cookie, "^;\s*", "");
}

Varnish has a section in its documentation on “Cookies.”

In most cases, configuring Varnish as described above and removing analytics cookies will dramatically speed up your website. Once Varnish is up and running and you are familiar with the logs, you can start to tweak the configuration and get more performance from the cache.

Next Steps

To learn more, go through Varnish’s documentation. You should understand enough of Varnish’s basics by now to try some of the examples. The section on “Achieving a High Hit Rate” is well worth a read for the simple tips on tweaking your configuration.

Keep calm and try Varnish to optimize mobile websites. (Image source)

(al, ea, il)


© Rachel Andrew for Smashing Magazine, 2013.

June 10 2013

12:01

Gone In 60 Frames Per Second: A Pinterest Paint Performance Case Study


  

Today we’ll discuss how to improve the paint performance of your websites and Web apps. This is an area that we Web developers have only recently started looking at more closely, and it’s important because it could have an impact on your user engagement and user experience.

Frame Rate Applies To The Web, Too

Frame rate is the rate at which a device produces consecutive images to the screen. A low frames per second (FPS) means that individual frames can be made out by the eye. A high FPS gives users a more responsive feel. You’re probably used to this concept from the world of gaming, but it applies to the Web, too.

Long image decoding, unnecessary image resizing, heavy animation and data processing can all lead to dropped frames, which reduces the frame rate, resulting in janky pages. We’ll explain what exactly we mean by “jank” shortly.

Why Care About Frame Rate?

Smooth, high frame rates drive user engagement and can affect how much users interact with your website or app.

At EdgeConf earlier this year, Facebook confirmed this when it mentioned that in an A/B test, it slowed down scrolling from 60 FPS to 30 FPS, causing engagement to collapse. That said, if you can’t do high frame rates and 60 FPS is out of reach, then you’d at least want something smooth. If you’re doing your own animation, this is one benefit of using requestAnimationFrame: the browser can dynamically adjust to keep the frame rate normal.

In cases where you’re concerned about scrolling, the browser can manage the frame rate for you. But if you introduce a large amount of jank, then it won’t be able to do as good a job. So, try to avoid big hitches, such as long paints, long JavaScript execution times, long anything.

Don’t Guess It, Test It!

Before getting started, we need to step back and look at our approach. We all want our websites and apps to run more quickly. In fact, we’re arguably paid to write code that runs not only correctly, but quickly. As busy developers with deadlines, we find it very easy to rely on snippets of advice that we’ve read or heard. Problems arise when we do that, though, because the internals of browsers change very rapidly, and something that’s slow today could be quick tomorrow.

Another point to remember is that your app or website is unique, and, therefore, the performance issues you face will depend heavily on what you’re building. Optimizing a game is a very different beast to optimizing an app that users will have open for 200+ hours. If it’s a game, then you’ll likely need to focus your attention on the main loop and heavily optimize the chunk of code that is going to run every frame. With a DOM-heavy application, the memory usage might be the biggest performance bottleneck.

Your best option is to learn how to measure your application and understand what the code is doing. That way, when browsers change, you will still be clear about what matters to you and your team and will be able to make informed decisions. So, no matter what, don’t guess it, test it!

We’re going to discuss how to measure frame rate and paint performance shortly, so hold onto your seats!

Note: Some of the tools mentioned in this article require Chrome Canary, with the “Developer Tools experiments” enabled in about:flags. (We — Addy Osmani and Paul Lewis — are engineers on the Developer Relations team at Chrome.)

Case Study: Pinterest

The other day we were on Pinterest, trying to find some ponies to add to our pony board (Addy loves ponies!). So, we went over to the Pinterest feed and started scrolling through, looking for some ponies to add.

Addy adding some ponies to his Pinterest board, as one does.

Jank Affects User Experience

The first thing we noticed as we scrolled was that scrolling on this page doesn’t perform very well — scrolling up and down takes effort, and the experience just feels sluggish. When they come up against this, users get frustrated, which means they’re more likely to leave. Of course, this is the last thing we want them to do!

Pinterest showing a performance bottleneck when a user scrolls.

This break in consistent frame rate is something the Chrome team calls “jank,” and we’re not sure what’s causing it here. You can actually notice some of the frames being drawn as we scroll. But let’s visualize it! We’re going to open up Frames mode and show what slow looks like there in just a moment.

Note: What we’re really looking for is a consistently high FPS, ideally matching the refresh rate of the screen. In many cases, this will be 60 FPS, but it’s not guaranteed, so check the devices you’re targeting.

Now, as JavaScript developers, our first instinct is to suspect a memory leak as being the cause. Perhaps some objects are being held around after a round of garbage collection. The reality, however, is that very often these days JavaScript is not a bottleneck. Our major performance problems come down to slow painting and rendering times. The DOM needs to be turned into pixels on the screen, and a lot of paint work when the user scrolls could result in a lot of slowing down.

Note: HTML5 Rocks specifically discusses some of the causes of slow scrolling. If you think you’re running into this problem, it’s worth a read.

Measuring Paint Performance

Frame Rate

We suspect that something on this page is affecting the frame rate. So, let’s open up Chrome’s Developer Tools, head to the “Timeline” panel, select “Frames” mode and record a new session. We’ll click the record button and start scrolling the page the way a normal user would. Now, to simulate a few minutes of usage, we’re going to scroll just a little faster.

Using Chrome’s Developer Tools to profile scrolling interactions.

Up, down, up, down. What you’ll notice now in the summary view up at the top is a lot of purple and green, corresponding to painting and rendering times. Let’s stop recording for now. As we flip through these various frames, we see some pretty hefty “Recalculate Styles” and a lot of “Layout.”

If you look at the legend to the very right, you’ll see that we’ve actually blown our budget of 60 FPS, and in many cases we’re not even hitting 30 FPS. It’s just performing quite poorly. Now, each of these bars in the summary view corresponds to one frame — i.e. all of the work that Chrome has to do in order to be able to draw an app to the screen.

Chrome’s Developer Tools showing a long paint time.

Frame Budget

If you’re targeting 60 FPS — generally the optimal frame rate to target these days, because it matches the refresh rate of the devices we commonly use — then you have a budget of 16.7 milliseconds per frame (1,000 ms ÷ 60 frames) in which to complete everything: JavaScript, layout, image decoding and resizing, painting, compositing — everything.

Note: A constant frame rate is our ideal here. If you can’t hit 60 FPS for whatever reason, then you’re likely better off targeting 30 FPS, rather than allowing a variable frame rate between 30 and 60 FPS. In practice, this can be challenging to code because when the JavaScript finishes executing, all of the layout, paint and compositing work still has to be done, and predicting that ahead of time is very difficult. In any case, whatever your frame rate, ensure that it is consistent and doesn’t fluctuate (which would appear as stuttering).

If you’re aiming for low-end devices, such as mobile phones, then that frame budget of 16 milliseconds is really more like 8 to 10 milliseconds. This could be true on desktop as well, where your frame budget might be lowered as a result of miscellaneous browser processes. If you blow this budget, you will miss frames and see jank on the page. So, you likely have somewhere nearer 8 to 10 milliseconds, but be sure to test the devices you’re supporting to get a realistic idea of your budget.
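If you want a rough in-page sanity check to complement the timeline, you can watch frame times yourself with requestAnimationFrame. This is only a sketch, and the 20-millisecond threshold is an arbitrary choice:


// Log a warning whenever a frame takes noticeably longer than the ~16.7ms budget.
var lastFrameTime = performance.now();

function monitorFrames(now) {
  var delta = now - lastFrameTime;
  lastFrameTime = now;
  if (delta > 20) {
    console.warn('Long frame: ' + delta.toFixed(1) + 'ms');
  }
  requestAnimationFrame(monitorFrames);
}

requestAnimationFrame(monitorFrames);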

An extremely costly layout operation of over 500 milliseconds.

Note: We’ve also got an article on how to use the Chrome Developer Tools to find and fix rendering performance issues that focuses more on the timeline.

Going back to scrolling, we have a sneaking suspicion that a number of unnecessary repaints are occurring on this page with onscroll.

One common mistake is to stuff just way too much JavaScript into the onscroll handlers of a page — making it difficult to meet the frame budget at all. Aligning the work to the rendering pipeline (for example, by placing it in requestAnimationFrame) gives you a little more headroom, but you still have only those few milliseconds in which to get everything done.

The best thing you can do is just capture values such as scrollTop in your scroll handlers, and then use the most recent value inside a requestAnimationFrame callback.
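In practice, that pattern looks something like the following sketch — the .header element and the transform applied to it are placeholders for whatever per-frame work you actually need to do:


var latestScrollY = 0;
var ticking = false;
var header = document.querySelector('.header'); // assumed element

window.addEventListener('scroll', function () {
  latestScrollY = window.scrollY; // only capture the value here
  if (!ticking) {
    ticking = true;
    requestAnimationFrame(update);
  }
});

function update() {
  ticking = false;
  // Do the visual work once per frame, using the most recent value.
  header.style.transform = 'translateY(' + (latestScrollY * 0.5) + 'px)';
}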

Paint Rectangles

Let’s go back to Developer Tools → Settings and enable “Show paint rectangles.” This visualizes the areas of the screen that are being painted with a nice red highlight. Now look at what happens as we scroll through Pinterest.

Enabling Chrome Developer Tools’ “Paint Rectangles” feature.

Every few milliseconds, we experience a big bright flash of red across the entire screen. There seems to be a paint of the whole screen every time we scroll, which is potentially very expensive. What we want to see is the browser just painting what is new to the page — so, typically just the bottom or top of the page as it gets scrolled into view. The cause of this issue seems to be the little “scroll to top” button in the lower-right corner. As the user scrolls, the fixed header at the top needs to be repainted, but so does the button. The way that Chrome deals with this is to create a union of the two areas that need to be repainted.

Chrome shows freshly painted areas with a red box.

In this case, there is a rectangle from the top left to top right, but not very tall, plus a rectangle in the lower-right corner. This leaves us with a rectangle from the top left to bottom right, which is essentially the whole screen! If you inspect the button element in Developer Tools and either hide it (using the H key) or delete it and then scroll again, you will see that only the header area is repainted. The way to solve this particular problem is to move the scroll button to its own layer so that it doesn’t get unioned with the header. This essentially isolates the button so that it can be composited on top of the rest of the page. But we’ll talk about layers and compositing in more detail in a little bit.

The next thing we notice has to do with hovering. When we hover over a pin, Pinterest paints a bar containing “Repin,” “Comment” and “Like” buttons — let’s call it the action bar. When we hover over a single pin, it paints not just the bar but also the elements underlying it. Painting should happen only on those elements that you expect to change visually.

A cause for concern: full-screen flashes of red indicate a lot of painting.

There’s another interesting thing about scrolling here. Let’s keep our cursor hovered over this pin and start scrolling the page again.

Every time we scroll through a new row of images, this action bar gets painted on yet another pin, even though we don’t mean to hover over it. This comes down more to UX than anything else, but scrolling performance in this case might be more important than the hover effect during scrolling. Hovering amplifies jank during scrolling because the browser essentially pauses to go off and paint the effect (the same is true when we roll out of the element!). One option here is to use a setTimeout with a delay to ensure that the bar is painted only when the user really intends to use it, an approach we covered in “Avoiding Unnecessary Paints.” A more aggressive approach would be to measure the mouseenter or the mouse’s trajectory before enabling hover behaviors. While this measure might seem rather extreme, remember that we are trying to avoid unnecessary paints at all costs, especially when the user is scrolling.
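A sketch of the delayed-hover idea looks like this — the .pin and show-action-bar class names are assumptions, and the 150-millisecond delay is just a starting point to tune:


var HOVER_DELAY = 150; // milliseconds

Array.prototype.forEach.call(document.querySelectorAll('.pin'), function (pin) {
  var timer;
  pin.addEventListener('mouseenter', function () {
    // Only show the action bar if the cursor actually rests on the pin.
    timer = setTimeout(function () {
      pin.classList.add('show-action-bar');
    }, HOVER_DELAY);
  });
  pin.addEventListener('mouseleave', function () {
    clearTimeout(timer);
    pin.classList.remove('show-action-bar');
  });
});

Quick passes of the cursor (and pins scrolling underneath it) then never trigger the extra paint.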

Overall Paint Cost

We now have a really great workflow for looking at the overall cost of painting on a page; go back into Developer Tools and “Enable continuous page repainting.” This feature will constantly paint to your screen so that you can find out what elements have costly paint times. You’ll get this really nice black box in the top corner that summarizes paint times, with the minimum and maximum also displayed.

Chrome’s “Continuous Page Repainting” mode helps you to assess the overall cost of a page.

Let’s head back to the “Elements” panel. Here, we can select a node and just use the keyboard to walk the DOM tree. If we suspect that an element has an expensive paint, we can use the H shortcut key (something recently added to Chrome) to toggle visibility on that element. Using the continuous paint box, we can instantly see whether this has a positive effect on our pages’ paint times. We should expect it to in many cases, because if we hide an element, we should expect a corresponding reduction in paint times. But by doing this, we might see one element that is especially expensive, which would bear further scrutiny!

The “Continuous Page Repainting” chart showing the time taken to paint the page.

For Pinterest’s website, we can do it to the categories bar or to the header, and, as you’d expect, because we don’t have to paint these elements at all, we see a drop in the time it takes to paint to the screen. If we want even more detailed insight, we can go right back to the timeline and record a new session to measure the impact. Isn’t that great? Now, while this workflow should work great for most pages, there might be times when it isn’t as useful. In Pinterest’s case, the pins are actually quite deeply nested in the page, making it hard for us to measure paint times in this workflow.

Luckily, we can still get some good mileage by selecting an element (such as a pin here), going to the “Styles” panel and looking at what CSS styles are being used. We can toggle properties on and off to see how they affect the paint times. This gives us much finer-grained insight into the paint profile of the page.

Here, we see that Pinterest is using box-shadow on these pins. We’ve optimized the performance of box-shadow in Chrome over the past two years, but in combination with other styles and when heavily used, it could cause a bottleneck, so it’s worth looking at.

Pinterest has reduced continuous paint mode times by 40% by moving box-shadow to a separate element that doesn’t have border-radius. The side effect is slightly fuzzy-looking corners; however, it is barely noticeable due to the color scheme and the low border-radius values.

Note: You can read more about this topic in “CSS Paint Times and Page Render Weight.”

Toggling styles to measure their effect on page-rendering weight.

Let’s disable box-shadow to see whether it makes a difference. As you can see, it’s no longer visible on any of the pins. So, let’s go back to the timeline and record a new session in which we scroll the same way as we did before (up and down, up and down, up and down). We’re getting closer to 60 FPS now, and that’s just from one change.

Public service announcement: We’re absolutely not saying don’t use box-shadow — by all means, do! Just make sure that if you have a performance problem, measure correctly to find out what your own bottlenecks are. Always measure! Your website or application is unique, as will any performance bottleneck be. Browser internals change almost daily, so measuring is the smartest way to stay up to date on the changes, and Chrome’s Developer Tools makes this really easy to do.

Using Chrome Developer Tools to profile is the best way to track browser performance changes.

Note: Eberhard Grather recently wrote a detailed post on “Profiling Long Paint Times With DevTools’ Continuous Painting Mode,” which you should spend some quality time with.

Another thing we noticed: if you click on the “Repin” button, you’ll see the animated effect and the lightbox being painted — and a big red flash of repaint in the background. It’s not clear from the tooling whether the paint is the white cover or some other affected area. Be sure to double-check that the paint rectangles correspond to the element or elements that you think are being repainted, and not just to what it looks like. In this case, it looks like the whole screen is being repainted, but it could well be just the white cover, which might not be all that expensive. It’s nuanced; the important thing is to understand what you’re seeing and why.

Hardware Compositing (GPU Acceleration)

The last thing we’re going to look at on Pinterest is GPU acceleration. In the past, Web browsers have relied pretty heavily on the CPU to render pages. This involved two things: firstly, painting elements into a bunch of textures, called layers; and secondly, compositing all of those layers together to the final picture seen on screen.

Over the past few years, however, we’ve found that getting the GPU involved in the compositing process can lead to some significant speeding up. The premise is that, while the textures are still painted on the CPU, they can be uploaded to the GPU for compositing. Assuming that all we do on future frames is move elements around (using CSS transitions or animations) or change their opacity, we simply provide these changes to the GPU and it takes care of the rest. We essentially avoid having to give the GPU any new graphics; rather, we just ask it to move existing ones around. This is something that the GPU is exceptionally quick at doing, thus improving performance overall.

There is no guarantee that this hardware compositing will be available and enabled on a given platform, but if it is available the first time you use, say, a 3D transform on an element, then it will be enabled in Chrome. Many developers use the translateZ hack to do just that. The other side effect of using this hack is that the element in question will get its own layer, which may or may not be what you want. It can be very useful to effectively isolate an element so that it doesn’t affect others as and when it gets repainted. It’s worth remembering that the uploading of these textures from system memory to the video memory is not necessarily very quick. The more layers you have, the more textures need to be uploaded and the more layers that will need to be managed, so it’s best not to overdo it.

Note: Tom Wiltzius has written about the layer model in Chrome, which is a relevant read if you are interested in understanding how compositing works behind the scenes. Paul has also written a post about the translateZ hack and how to make sure you’re using it in the right ways.

Another great setting in Developer Tools that can help here is “Show composited layer borders.” This feature will give you insight into those DOM elements that are being manipulated at the GPU level.

Switching on composited layer borders will indicate Chrome’s rendering layers.

If an element is taking advantage of the GPU acceleration, you’ll see an orange border around it with this on. Now as we scroll through, we don’t really see any use of composited layers on this page — not when we click “Scroll to top” or otherwise.

Chrome is getting better at automatically handling layer promotion in the background; but, as mentioned, developers sometimes use the translateZ hack to create a composited layer. Below is Pinterest’s feed with translateZ(0) applied to all pins. It’s not hitting 60 FPS, but it is getting closer to a consistent 30 FPS on desktop, which is actually not bad.
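Applying the hack is typically a one-line style change — something like the following, where .pin stands in for whatever class the pins actually use (and, as discussed below, you would normally be far more selective about which elements get this treatment):


.pin {
  -webkit-transform: translateZ(0);
  transform: translateZ(0);
}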

Using the translateZ(0) hack on all Pinterest pins. Note the orange borders.

Remember to test on both desktop and mobile, though; their performance characteristics vary wildly. Use the timeline in both, and watch your paint time chart in Continuous Paint mode to evaluate how fast you’re busting your budget.

Again, don’t use this hack on every element on the page — it might pass muster on desktop, but it won’t on mobile. The reason is that there is increased video memory usage and an increased layer management cost, both of which could have a negative impact on performance. Instead, use hardware compositing only to isolate elements where the paint cost is measurably high.

Note: In the WebKit nightlies, the Web Inspector now also gives you the reasons for layers being composited. To enable this, switch off the “Use WebKit Web Inspector” option and you’ll get the front end with this feature in there. Switch it on using the “Layers” button.

A Find-and-Fix Workflow

Now that we’ve concluded our Pinterest case study, what about the workflow for diagnosing and addressing your own paint problems?

Finding the Problem

  • Make sure you’re in “Incognito” mode. Extensions and apps can skew the figures that are reported when profiling performance.
  • Open the page and the Developer Tools.
  • In the timeline, record and interact with your page.
  • Check for frames that go over budget (i.e. frames that take longer than 16.7 milliseconds, dropping you below 60 FPS).
  • If you’re close to budget, then you’re likely way over the budget on mobile.
  • Check the cause of the jank. Long paint? CSS layout? JavaScript?

Spend some quality time with Frame mode in Chrome Developer Tools to understand your website’s runtime profile.

Fixing the Problem

  • Go to “Settings” and enable “Continuous Page Repainting.”
  • In the “Elements” panel, hide anything non-essential using the hide (H) shortcut.
  • Walk through the DOM tree, hiding elements and checking the FPS in the timeline.
  • See which element(s) are causing long paints.
  • Uncheck styles that could affect paint time, and track the FPS.
  • Continue until you’ve located the elements and styles responsible for the slow-down.

Switch on extra Developer Tools features for more insight.

What About Other Browsers?

Although at the time of writing, Chrome has the best tools to profile paint performance, we strongly recommend testing and measuring your pages in other browsers to get a feel for what your own users might experience (where feasible). Performance can vary massively between them, and a performance smell in one browser might not be present in another.

As we said earlier, don’t guess it, test it! Measure for yourself, understand the abstractions, know your browser’s internals. In time, we hope that the cross-browser tooling for this area improves so that developers can get an accurate picture of rendering performance, regardless of the browser being used.

Conclusion

Performance is important. Not all machines are created equal, and the fast machines that developers work on might not have the performance problems encountered on the devices of real users. Frame rate in particular can have a big impact on engagement and, consequently, on a project’s success. Luckily, a lot of great tools out there can help with that.

Be sure to measure paint performance on both desktop and mobile. If all goes well, your users will end up with snappier, more silky-smooth experiences, regardless of the device they’re using.


About the Authors

Addy Osmani and Paul Lewis are engineers on the Developer Relations team at Chrome, with a focus on tooling and rendering performance, respectively. When they’re not causing trouble, they have a passion for helping developers build snappy, fluid experiences on the Web.

(al)


© Addy Osmani for Smashing Magazine, 2013.


August 29 2012

23:08

Improve Your App’s Performance with Memcached

One of the easiest ways to improve your application’s performance is by putting a caching solution in front of your database. In this tutorial, I’ll show you how to use Memcached with Rails, Django, or Drupal.


Memcached is an excellent choice for this problem, given its solid history, simple installation and active community. It is used by companies big and small, including giants such as Facebook, YouTube and Twitter. The Memcached site itself does a good job of describing Memcached as a “Free & open source, high-performance, distributed memory object caching system, generic in nature, but intended for use in speeding up dynamic web applications by alleviating database load.”


In general, database calls are slow: the query takes CPU resources to process, and data is (usually) retrieved from disk. An in-memory cache like Memcached, on the other hand, takes very little CPU time, and data is retrieved from memory instead of disk. The lighter CPU load is a result of Memcached’s design: it is not queryable like an SQL database. Instead, it uses key-value pairs, and you cannot retrieve data from Memcached without first knowing its key.

Memcached stores the key-value pairs entirely in memory. This makes retrieval extremely fast, but also makes it so the data is ephemeral. In the event of a crash or reboot, memory is cleared and all key-value pairs need to be rebuilt. There are no built-in high-availability and/or fail-over systems within Memcached. However, it is a distributed system, so data is stored across multiple nodes. If one node is lost, the remaining nodes carry on serving requests and filling in for the missing node.


Installing Memcached

Installing Memcached is a fairly simple process. It can be done through a package manager or by compiling it from source. Depending on your distribution, you may want to compile from source, since the packages tend to fall a bit behind.

# Install on Debian and Ubuntu
apt-get install memcached

# Install on Redhat and Fedora
yum install memcached

# Install on Mac OS X (with Homebrew)
brew install memcached

# Install from Source
wget http://memcached.org/latest
tar -zxvf memcached-1.x.x.tar.gz
cd memcached-1.x.x
./configure
make && make test
sudo make install

You’ll want to configure Memcached for your specific needs, but, for this example, we’ll just get it running with some basic settings: 512 MB of memory (-m 512), a limit of 1,024 concurrent connections (-c 1024), the default port (-p 11211), and running as a daemon (-d).

memcached -m 512 -c 1024 -p 11211 -d

At this point, you should be up and running with Memcached. Next, we’ll look at how to use it with Rails, Django and Drupal. It should be noted that Memcached is not restricted to being used within a framework. You can use Memcached with many programming languages through one of the many clients available.


Using Memcached with Rails 3

Rails 3 has abstracted the caching system so that you can change the client to your heart’s desire. In Ruby, the preferred Memcached client is Dalli.

# Add Dalli to your Gemfile
gem 'dalli'

# Enable Dalli in config/environments/production.rb:
config.perform_caching = true
config.cache_store = :dalli_store, 'localhost:11211'

In development mode, you will not normally hit Memcached, so either start Rails in production mode with rails server -e production, or add the above lines to your config/environments/development.rb.

The simplest use of the cache is through write/read methods to retrieve data:

Rails.cache.write 'hello', 'world'      #=> true
Rails.cache.read 'hello'                #=> "world"

The most common pattern for Rails caching is using fetch. It will attempt to retrieve the key (in this case, expensive-query) and return the value. If the key does not exist, it will execute the passed block and store the result in the key.

results = Rails.cache.fetch 'expensive-query' do
  Transaction.
    joins(:payment_profile).
    joins(:order).
    where(':created > orders.created_at', :created => Time.now)
end
# ... more code working with results

In the example above, the problem is cache expiry. (One of the two hard problems in computer science.) An advanced, very robust solution is to use some part of the results in the cache key itself, so that if the results change, then the key is expired automatically.

users = User.active
users.each do |u|
  Rails.cache.fetch "profile/#{u.id}/#{u.updated_at.to_i}" do
    u.profile
  end
end

Here, we’re using the epoch of updated_at as part of the key, which gives us built-in cache expiration. So, if the user.updated_at time changes, we will get a cache miss on the pre-existing profile cache and write out a new one. In this case, we’ll need to update the user’s updated_at time when their profile is updated. That is as simple as adding:

class Profile < ActiveRecord::Base
  belongs_to :user, touch: true
end

Now, you have self-expiring profiles without any worry about retrieving old data when the user is updated. It’s almost like magic!


Using Memcached with Django

Once you have Memcached installed, it is fairly simple to access with Django. First, you’ll need to install a client library. We’ll use pylibmc.

# Install the pylibmc library
pip install pylibmc

# Configure cache servers and binding settings.py
CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.memcached.PyLibMCCache',
        'LOCATION': '127.0.0.1:11211',
    }
}

Your app should be up and running with Memcached now. Like other libraries, you’ll get basic getter and setter methods to access the cache:

from django.core.cache import cache

cache.set('hello', 'world')
cache.get('hello')             #=> 'world'

You can conditionally set a key if it does not already exist with add. If the key already exists, the new value will be ignored.

cache.set('hello', 'world')
cache.add('hello', 'mundus')
cache.get('hello')              #=> 'world'
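Beyond manual get and set calls, Django can also cache whole views against the Memcached back end configured above. This is a minimal sketch using the built-in cache_page decorator; the view itself is hypothetical:


from django.http import HttpResponse
from django.views.decorators.cache import cache_page

@cache_page(60 * 15)  # cache the rendered response for 15 minutes
def product_list(request):
    # Imagine expensive queries happening here; the whole response is cached.
    return HttpResponse('<ul><li>...</li></ul>')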

From the Python Decorator Library, you can create a memoized decorator to cache the results of a function call. (Note that this particular recipe stores results in a local dictionary rather than in Memcached, but the same pattern applies.)

import collections
import functools

class memoized(object):
    '''Decorator. Caches a function's return value each time it is called.
    If called later with the same arguments, the cached value is returned
    (not reevaluated).
    '''
    def __init__(self, func):
        self.func = func
        self.cache = {}
    def __call__(self, *args):
        if not isinstance(args, collections.Hashable):
            # uncacheable. a list, for instance.
            # better to not cache than blow up.
            return self.func(*args)
        if args in self.cache:
            return self.cache[args]
        else:
            value = self.func(*args)
            self.cache[args] = value
            return value
    def __repr__(self):
        '''Return the function's docstring.'''
        return self.func.__doc__
    def __get__(self, obj, objtype):
        '''Support instance methods.'''
        return functools.partial(self.__call__, obj)

@memoized
def fibonacci(n):
    "Return the nth fibonacci number."
    if n in (0, 1):
        return n
    return fibonacci(n-1) + fibonacci(n-2)

print fibonacci(12)

Decorators can give you the power to take most of the heavy lifting out of caching and cache expiration. Be sure to take a look at the caching examples in the Decorator Library while you are planning your caching system.


Using Memcached with Drupal

To get started with Memcached in Drupal, first install the PHP Memcache extension.

# Install the Memcached extension
pecl install memcache

; Configure Memcached in php.ini
[memcache]
memcache.hash_strategy = consistent
memcache.default_port = 11211

<?php
    // Tell Drupal about Memcached in settings.php
    $conf['cache_backends'][] = 'sites/all/modules/contrib/memcache/memcache.inc';
    $conf['cache_default_class'] = 'MemCacheDrupal';
    $conf['memcache_key_prefix'] = 'app_name';
    $conf['memcache_servers'] = array(
        '10.1.1.1:11211' => 'default',
        '10.1.1.2:11212' => 'default'
    );
?>

You’ll need to restart your application for all the changes to take effect.

As expected, you’ll get the standard getter and setter methods with the Memcached module. One caveat is that cache_get returns the cache row, so you’ll need to access the serialized data within it.

<?php
    cache_set('hello', 'world');
    $cache = cache_get('hello');
    $value = $cache->data;  #=> returns 'world'
?>

And just like that, you’ve got caching in place in Drupal. You can build custom functions to replicate functionality such as cache.fetch in Rails. With a little planning, you can have a robust caching solution that will bring your app’s responsiveness to a new level.
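As a sketch of what such a helper might look like (assuming PHP 5.3+ for the anonymous function; the module, function and cache-ID names are examples only):


<?php
    // Return the cached value if present; otherwise compute, cache and return it.
    function mymodule_cache_fetch($cid, $callback, $expire = CACHE_TEMPORARY) {
        $cache = cache_get($cid);
        if ($cache !== FALSE) {
            return $cache->data;
        }
        $value = $callback();
        cache_set($cid, $value, 'cache', $expire);
        return $value;
    }

    // Usage: cache the result of an expensive query.
    $rows = mymodule_cache_fetch('mymodule:recent_nodes', function () {
        return db_query_range('SELECT nid, title FROM {node} ORDER BY created DESC', 0, 10)->fetchAll();
    });
?>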


And You’re Done


Implementing a caching system can be fairly straightforward. With the right configuration, a caching solution can extend the life of your current architecture and make your app feel snappier than it ever has before. While a good caching strategy takes time to refine, it shouldn’t stop you from getting started.

As with any complex system, monitoring is critical. Understanding how your cache is being utilized and where the hotspots are in your data will help you improve your cache performance. Memcached has a quality stats system to help you monitor your cache cluster. You should also use a tool like New Relic to keep an eye on the balance between cache and database time. As an added bonus, you can get a free “Data Nerd” T-shirt when you sign up and deploy.


December 12 2011

15:55

Your jQuery: Now With 67% Less Suck

Fun fact: more websites are now using jQuery than Flash.

jQuery is an amazing tool that’s made JavaScript accessible to developers and designers of all levels of experience. However, as Spiderman taught us, “with great power comes great responsibility.” The unfortunate downside to jQuery is that while it makes it easy to write JavaScript, it makes it easy to write really really f*&#ing bad JavaScript. Scripts that slow down page load, unresponsive user interfaces, and spaghetti code knotted so deep that it should come with a bottle of whiskey for the next sucker developer that has to work on it.

This becomes more important for those of us who have yet to move into the magical fairy wonderland where none of our clients or users view our pages in Internet Explorer. The IE JavaScript engine moves at the speed of an advancing glacier compared to more modern browsers, so optimizing our code for performance takes on an even higher level of urgency.

Thankfully, there are a few very simple things anyone can add into their jQuery workflow that can clear up a lot of basic problems. When undertaking code reviews, three of the areas where I consistently see the biggest problems are: inefficient selectors; poor event delegation; and clunky DOM manipulation. We’ll tackle all three of these and hopefully you’ll walk away with some new jQuery batarangs to toss around in your next project.

Selector optimization

Selector speed: fast or slow?

Saying that the power behind jQuery comes from its ability to select DOM elements and act on them is like saying that Photoshop is a really good tool for selecting pixels on screen and making them change color – it’s a bit of a gross oversimplification, but the fact remains that jQuery gives us a ton of ways to choose which element or elements in a page we want to work with. However, a surprising number of web developers are unaware that all selectors are not created equal; in fact, it’s incredible just how drastic the performance difference can be between two selectors that, at first glance, appear nearly identical. For instance, consider these two ways of selecting all paragraph tags inside a <div> with an ID.

$("#id p");
$("#id").find("p");

Would it surprise you to learn that the second way can be more than twice as fast as the first? Knowing which selectors outperform others (and why) is a pretty key building block in making sure your code runs well and doesn’t frustrate your users waiting for things to happen.

There are many different ways to select elements using jQuery, but the most common ways can be basically broken down into five different methods. In order, roughly, from fastest to slowest, these are:

  • $("#id");
    This is without a doubt the fastest selector jQuery provides because it maps directly to the native document.getElementById() JavaScript method. If possible, the selectors listed below should be prefaced with an ID selector in conjunction with jQuery’s .find() method to limit the scope of the page that has to be searched (as in the $("#id").find("p") example shown above).
  • $("p");, $("input");, $("form"); and so on
    Selecting elements by tag name is also fast, since it maps directly to the native document.getElementsByTagName() method.
  • $(".class");
    Selecting by class name is a little trickier. While still performing very well in modern browsers, it can cause some pretty significant slowdowns in IE8 and below. Why? IE9 was the first IE version to support the native document.getElementsByClassName() JavaScript method. Older browsers have to resort to using much slower DOM-scraping methods that can really impact performance.
  • $("[attribute=value]");
    There is no native JavaScript method for this selector to use, so the only way that jQuery can perform the search is by crawling the entire DOM looking for matches. Modern browsers that support the querySelectorAll() method will perform better in certain cases (Opera, especially, runs these searches much faster than any other browser) but, generally speaking, this type of selector is Slowey McSlowersons.
  • $(":hidden");
    Like attribute selectors, there is no native JavaScript method for this one to use. Pseudo-selectors can be painfully slow since the selector has to be run against every element in your search space. Again, modern browsers with querySelectorAll() will perform slightly better here, but try to avoid these if at all possible. If you must use one, try to limit the search space to a specific portion of the page: $("#list").find(":hidden");

But, hey, proof is in the performance testing, right? It just so happens that said proof is sitting right here. Be sure to notice the class selector numbers beside IE7 and 8 compared to other browsers and then wonder how the people on the IE team at Microsoft manage to sleep at night. Yikes.

Chaining

Almost all jQuery methods return a jQuery object. This means that when a method is run, its results are returned and you can continue executing more methods on them. Rather than writing out the same selector multiple times over, just making a selection once allows multiple actions to be run on it.

// Without chaining
$("#object").addClass("active");
$("#object").css("color","#f0f");
$("#object").height(300);

// With chaining
$("#object").addClass("active").css("color", "#f0f").height(300);

This has the dual effect of making your code shorter and faster. Chained methods will be slightly faster than multiple methods made on a cached selector, and both ways will be much faster than multiple methods made on non-cached selectors. Wait… “cached selector”? What is this new devilry?

Caching

Another easy way to speed up your code that seems to be a mystery to developers is the idea of caching your selectors. Think of how many times you end up writing the same selector over and over again in any project. Every $(".element") selector has to search the entire DOM each time, regardless of whether or not that selector had been previously run. Running the selection once and then storing the results in a variable means that the DOM only has to be searched once. Once the results of a selector have been cached, you can do anything with them.

First, run your search (here we’re selecting all of the <li> elements inside <ul id="blocks">):

var blocks = $("#blocks").find("li");

Now, you can use the blocks variable wherever you want without having to search the DOM every time.

$("#hideBlocks").click(function() {
    blocks.fadeOut();
});
$("#showBlocks").click(function() {
    blocks.fadeIn();
});

My advice? Any selector that gets run more than once should be cached. This jsperf test shows just how much faster a cached selector runs compared to a non-cached one (and even throws some chaining love in to boot).

Event delegation

Event listeners cost memory. In complex websites and apps it’s not uncommon to have a lot of event listeners floating around, and thankfully jQuery provides some really easy methods for handling event listeners efficiently through delegation.

In a bit of an extreme example, imagine a situation where a 10×10 cell table needs to have an event listener on each cell; let’s say that clicking on a cell adds or removes a class that defines the cell’s background color. A typical way that this might be written (and something I’ve often seen during code reviews) is like so:

$('table').find('td').click(function() {
    $(this).toggleClass('active');
});

jQuery 1.7 has provided us with a new event listener method, .on(). It acts as a utility that wraps all of jQuery’s previous event listeners into one convenient method, and the way you write it determines how it behaves. To rewrite the above .click() example using .on(), we’d simply do the following:

$('table').find('td').on('click',function() {
    $(this).toggleClass('active');
});

Simple enough, right? Sure, but the problem here is that we’re still binding one hundred event listeners to our page, one to each individual table cell. A far better way to do things is to create one event listener on the table itself that listens for events inside it. Since the majority of events bubble up the DOM tree, we can bind a single event listener to one element (in this case, the <table>) and wait for events to bubble up from its children. The way to do this using the .on() method requires only one change from our code above:

$('table').on('click','td',function() {
    $(this).toggleClass('active');
});

All we’ve done is moved the td selector to an argument inside the .on() method. Providing a selector to .on() switches it into delegation mode, and the event is only fired for descendants of the bound element (table) that match the selector (td). With that one simple change, we’ve gone from having to bind one hundred event listeners to just one. You might think that the browser having to do one hundred times less work would be a good thing and you’d be completely right. The difference between the two examples above is staggering.

(Note that if your site is using a version of jQuery earlier than 1.7, you can accomplish the very same thing using the .delegate() method. The syntax of how you write the function differs slightly; if you’ve never used it before, it’s worth checking the API docs for that page to see how it works.)
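For reference, the equivalent of the delegated example above written with .delegate() would look something like this — note that the selector and the event name swap places:

$('table').delegate('td', 'click', function() {
    $(this).toggleClass('active');
});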

DOM manipulation

jQuery makes it very easy to manipulate the DOM. It’s trivial to create new nodes, insert them, remove other ones, move things around, and so on. While the code to do this is simple to write, every time the DOM is manipulated, the browser has to repaint and reflow content which can be extremely costly. This is no more evident than in a long loop, whether it be a standard for() loop, while() loop, or jQuery $.each() loop.

In this case, let’s say we’ve just received an array full of image URLs from a database or Ajax call or wherever, and we want to put all of those images in an unordered list. Commonly, you’ll see code like this to pull this off:

var arr = [reallyLongArrayOfImageURLs];
$.each(arr, function(count, item) {
    var newImg = '<li><img src="'+item+'"></li>';
    $('#imgList').append(newImg);
});

There are a couple of problems with this. For one (which you should have already noticed if you’ve read the earlier part of this article), we’re making the $("#imgList") selection once for each iteration of our loop. The other problem here is that each time the loop iterates, it’s adding a new <li> to the DOM. Each of those insertions is going to be costly, and if our array is quite large then this could lead to a massive slowdown or even the dreaded ‘A script is causing this page to run slowly’ warning.

var arr = [reallyLongArrayOfImageURLs],
    tmp = ''; 
$.each(arr, function(count, item) {
    tmp += '<li><img src="'+item+'"></li>';
});
$('#imgList').append(tmp);

All we’ve done here is create a tmp variable that each <li> is added to as it’s created. Once our loop has finished iterating, that tmp variable will contain all of our list items in memory, and can be appended to our <ul> all in one go. Browsers work much faster when working with objects in memory rather than on screen, so this is a much faster, more CPU-cycle-friendly method of building a list.
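
If the array is very large, a common variation on the same idea is to collect the markup in an array and join it once at the end, since repeated string concatenation can itself get slow in some older browsers. A sketch, using the same hypothetical array as above:

var arr = [reallyLongArrayOfImageURLs],
    parts = [];
$.each(arr, function(count, item) {
    parts.push('<li><img src="'+item+'"></li>');
});
$('#imgList').append(parts.join(''));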

Wrapping up

These are far from being the only ways to make your jQuery code run better, but they are among the simplest to implement. Though each individual change may make only a few milliseconds of difference, it doesn’t take long for those milliseconds to add up. Studies have shown that people can perceive delays as short as 100ms, so a few changes sprinkled throughout your code can easily have a noticeable effect on how well your website or app performs. Do you have other jQuery optimization tips to share? Leave them in the comments and help us all get better.

Now go forth and make awesome!

Tags: performance

April 08 2011

17:08

Best Practices for testing revisited

With Google apps like Search, Docs or Gmail, only a very small share of the time is actually spent in the initial page load, writes Andreas Grabner in a recent blog post. Most of the time is instead spent in JavaScript, XHR calls and DOM manipulations triggered by user actions. Grabner writes:

It is very important to speed up Page Load Time – don’t get me wrong. It is the initial perceived performance by a user who interacts with your site. But it is not all we need to focus on. Most of the time in modern web applications is spent in JavaScript, DOM Manipulations, XHR Calls and Rendering that happen after the initial page load. Automatic verification against Best Practices won’t work here anymore because we have to analyze individual user actions that do totally different things. The way this will work is to analyze the individual user actions, track performance metrics and automate regression detection based on these measured values.

13:22

PageSpeed: Suggestions how to speed up your site

Here’s a nifty tool by Google Labs that “analyzes the content of a web page, then generates suggestions to make that page faster. Reducing page load times can reduce bounce rates and increase conversion rates.”

March 23 2011

09:05

Speeding Up Your Website’s Database


Website speed has always been a big issue, and it has become even more important since April 2010, when Google started using it in search rankings. However, the discussion generally focuses on minimizing file sizes, improving server settings and optimizing CSS and JavaScript.

The discussion glosses over another important factor: the speed with which your pages are actually put together on your server. Most big modern websites store their information in a database and use a language such as PHP or ASP to extract it, turn it into HTML and send it to the Web browser.

So, even if you get your home page down to 1.5 seconds (Google’s threshold for being considered a “fast” website), you can still frustrate customers if your search page takes too much time to respond, or if the product pages load quickly but the “Customer reviews” section takes several seconds to appear.

Google’s threshold for a fast-loading website is about 1.5 seconds. This screenshot comes from Google Webmaster Tools (go to [domain name] → Diagnostics → Site Performance).

This article looks at these sorts of issues and describes some simple ways to speed up your website by optimizing your database. It starts with common knowledge but includes more complex techniques at the end, with links to further reading throughout. The article is intended for fearless database beginners and designers who have been thrown in at the deep end.

What Is A Database? What Is SQL?

A database is basically a collection of tables of information, such as a list of customers and their orders. It could be a filing cabinet, a bunch of spreadsheets, a Microsoft Access file or Amazon’s 40 terabytes of book and customer data.

A typical database for a blog has tables for users, categories, posts and comments. WordPress includes these and a few other starter tables. A typical database for an e-commerce website has tables for customers, products, categories, orders and order items (for the contents of shopping baskets). The open-source e-commerce software Magento includes these and many others. Databases have many other uses — such as for content management, customer relations, accounts and invoicing, and events — but these two common types (i.e. for a blog and an e-commerce website) will be referenced throughout this article.

Some tables in a database are connected to other tables. For example, a blog post can have many comments, and a customer can make multiple orders (these are one-to-many relationships). The most complicated type of database relationship is a many-to-many relationship. One relationship is at the core of all e-commerce databases: an order can contain many products, and a single product can be added to many different orders. This is where the “order items” table comes in: it sits between the products and the orders, and it records every time a product is added to an order. This will be relevant later on in the article, when we look at why some database queries are slow.

The word database also refers to the software that contains all this data, as in “My database crashed while I was having breakfast,” or “I really need to upgrade my database.” Popular database software include Microsoft Access 2010, Microsoft SQL Server, MySQL, PostgreSQL and Oracle Database 11g.

The acronym SQL comes up a lot when dealing with databases. It stands for “structured query language” and is pronounced “sequel” or “es-cue-el.” It’s the language used to ask and tell a database things — exciting things like SELECT lastname FROM customers WHERE city='Brighton'. This is called a database query because it queries the database for data. There are other types of database statements: INSERT for putting in new data, UPDATE for updating existing data, DELETE for deleting things, CREATE TABLE for creating tables, ALTER TABLE and many more.

How Can A Database Slow Down A Website?

A brand new empty website will run very fast, but as it grows and ages, you may notice some sluggishness on certain pages, particularly pages with complicated bits of functionality. Suppose you wanted to show “Customers who bought this product also bought…” at the bottom of a page of products. To extract this information from the database, you would need to do the following:

  1. Start with the current product,
  2. See how many times the product has recently been added to anyone’s shopping basket (the “order items” table from above),
  3. Look at the orders related to those shopping baskets (for completed orders only),
  4. Find the customers who made those orders,
  5. Look at other orders made by those customers,
  6. Look at the contents of those orders’ baskets (the “order items” again),
  7. Look up the details of those products,
  8. Identify the products that appear the most often and display them.

You could, in fact, do all of that in one massive database query, or you could split it up over several different queries. Either way, it might run very quickly when your database has 20 products, 12 customers, 18 orders and 67 order items (i.e. items in shopping baskets). But if it is not written and programmed efficiently, then it will be a lot slower with 500 products, 10,000 customers, 14,000 orders and 100,000 order items, and it will slow down the page.

This is a very complicated example, but it shows what kind of stuff goes on behind the scenes and why a seemingly innocuous bit of functionality can grind a website to a halt.

A website could slow down for many other reasons: the server running low on memory or disc space; another website on the same server consuming resources; the server sending out a lot of emails or churning away at some other task; a software, hardware or network fault; a misconfiguration. Or it may have suddenly become a popular website. The next two sections, therefore, will look at speed in more detail.

Is It My Database?

There are now several ways to analyze your website’s speed, including the Firebug plug-in for Firefox, the developer tools in Google Chrome (press Shift + Control + I, and then go to Resources → Enable Resource Tracking) and Yahoo YSlow. There are also websites such as WebPagetest, where you can enter a URL, and it will time it from your chosen location.

All of these tools will show you a diagram of all of the different resources (HTML, images, CSS and JavaScript files) used by your page, along with how long each took to load. They will also break down the time taken to perform a DNS lookup (i.e. to convert your domain name into an IP address), the time taken to connect to your server, the time spent waiting for your server to reply (aka “time to first byte”), and the time spent receiving (i.e. downloading) the data.

Many Web pages are constructed in their entirety by the Web server, including by PHP that accesses the database, and then sent to the browser all at once, so any database delays would lead to a long waiting time, and the receiving/downloading time would be proportional to the amount of data sent. So, if your 20 kB HTML page has a quick connection, a waiting time of 5 seconds and a download time of 0.05 seconds, then the delay would occur on the server, as the page is being built.

Not all Web pages are like this, though. The PHP flush function forces the browser to send the HTML that it has already built to the browser right away. Any further delays would then be in the receiving time, rather than the waiting time.

Either way, you can compare the waiting/receiving time for your suspected slow and complicated Web page to the waiting time for a similarly sized HTML page (or image or other static resource) on the same server at the same time. This would rule out the possibility of a slow Internet connection or an overloaded server (both of which would cause delays) and allow you to compare the times taken to construct the pages. This is not an exact science, but it should give you some indication of where things are being held up.

The screenshots below show the analysis provided by Google Chrome’s Developer Tools of a 20 kB Web page versus a 20 kB image. The Web page waited 130 milliseconds (ms) and downloaded for 22 ms. The image waited for 51 ms and downloaded for 11 ms. The download/receiving times are about the same, as expected, but the server is spending about 80 ms extra on processing and constructing the Web page, which entails executing the PHP and calling the database.

When performing these tests, analyze the static resource by itself and click “Refresh,” so that you are not getting a quick cached version. Also, run each a few times to ensure that you’re not looking at a statistical anomaly. The third screenshot below shows that WebPagetest indicates almost double the time of Google for the same page at the same time, demonstrating that using the same environment for all tests is important.

Resource analysis using Google Chrome’s Developer Tools, showing a 130-ms wait time for a Web page.

The same tool, showing a 51-ms wait time for an image of about the same size.

Resource analysis of the same page from WebPagetest, with a 296-ms wait time and a 417-ms total time.

How To Time A Database Query In PHP And MySQL

The approach above was general; we can now get very specific. If you suspect that your database might be slowing down your website, then you need to figure out where the delay is coming from. I will define a couple of timing functions, and then use them to time every single database query that is run by a page. The code below is specific to PHP and MySQL, but the method could be used on any database-driven website:

function StartTimer ($what='') {
 global $MYTIMER; $MYTIMER=0; //global variable to store time
 //if ($_SERVER['REMOTE_ADDR'] != '127.0.0.1') return; //only show for my IP address
 echo '<p style="border:1px solid black; color: black; background: yellow;">';
 echo "About to run <i>$what</i>. "; flush(); //output this to the browser
 //$MYTIMER = microtime (true); //in PHP5 you need only this line to get the time
 list ($usec, $sec) = explode (' ', microtime());
 $MYTIMER = ((float) $usec + (float) $sec); //set the timer
}
function StopTimer() {
 global $MYTIMER; if (!$MYTIMER) return; //no timer has been started
 list ($usec, $sec) = explode (' ', microtime()); //get the current time
 $MYTIMER = ((float) $usec + (float) $sec) - $MYTIMER; //the time taken in seconds
 echo 'Took ' . number_format ($MYTIMER, 4) . ' seconds.</p>'; flush();
}

StartTimer starts the timer and also prints whatever you are trying to time. The second line is a check of your IP address. This is very useful if you are doing this (temporarily) on a live website and don’t want everyone in the world to see the timing messages. Uncomment the line by removing the initial //, and replace the 127.0.0.1 with your IP address. StopTimer stops the timer and displays the time taken.

Most modern websites (especially well-programmed open-source ones) have a lot of PHP files but query the database in only a handful of places. Search through all of the PHP files for your website for mysql_db_query or mysql_query. Many software development packages such as BBEdit have functions to perform searches like this; or, if you are familiar with the Linux command line, try this:
grep mysql_query `find . -name \*php`

You may find something like this:

mysql_query ($sql);

For WordPress 3.0.4, this is on line 1112 of the file wp-includes/wp-db.php. You can copy and paste the functions above into the top of this file (or into any PHP file that is included by every page), and then add the timer before and after the mysql_query line. It will look like this:

StartTimer ($query);
$this->result = @mysql_query( $query, $dbh );
StopTimer();

Below is a partial screenshot of this being done on a brand new WordPress installation. It is running about 15 database queries in total, each taking about 0.0003 seconds (0.3 ms); so, less than 5 ms in total, which is to be expected for an empty database.

This shows and times all of the database queries that WordPress runs.

If you have found this line in other commonly used systems, please share this information by adding to the comments for this article.

You can also do other interesting things with it: you can see how fast your computer is compared to mine. Counting to 10 million takes my computer 2.9420 seconds. My Web server is a bit faster at 2.0726 seconds:

StartTimer ('counting to 10000000');
for ($i=0; $i<10000000; $i++); //count to a high number
StopTimer();

Notes on the Results

This technique gives you only comparative results. If your server was very busy at that moment, then all of the queries would be slower than normal. But you should have at least been able to determine how long a fast query takes on your server (maybe 1 to 5 ms), and therefore identify the slow-ish ones (200+ ms) and the really slow ones (1+ second). You can run the test a few times over the course of an hour or day (but not immediately after — see the section below about the database cache) to make sure you’re not getting a fluke.

This will also most likely severely mess up the graphical presentation of the page. It may also give you PHP warnings like “Cannot modify header information. Headers already sent by…” This is because the timing messages are interfering with cookie and session headers. As long as the page still displays below the warnings, you can ignore them. If the page does not display at all, then you may need to put the StartTimer and StopTimer around specific blocks of code, rather than around mysql_query.

This technique is essentially a quick hack to show some rough results. It should not be left on a live website.

What Else Could It Be?

If your database queries are not particularly slow, but the construction of your Web page is, then you might just have poorly written code. You can put the timer statements above around bigger and bigger blocks of code to see if and where the delay is occurring. It could be that you are looping through 10,000 full rows of product information, even if you are displaying only 20 product names.

Profiling

If you are still baffled and/or want more complete and accurate information about what’s happening in your code, you could try a debugging and profiling tool such as Xdebug, which analyzes a local copy of your website. It can even visually show where bottlenecks are occurring.

Indexing Database Tables

The experiment above may have surprised you by showing just how many database queries a page on your website is running, and hopefully, it has helped you identify particularly slow queries.

Let’s look now at some simple improvements to speed things up. To do this, you’ll need a way to run database queries on your database. Many server administration packages (like cPanel or Plesk) provide phpMyAdmin for this task. Alternatively, you could upload something like phpMiniAdmin to your website; this single PHP file enables you to look at your database and run queries. You’ll need to enter your database name, user name and password. If you don’t know these, you can usually find them in your website’s configuration file, if it has one (in WordPress, it’s wp-config.php).

Among the database queries that your page runs, you probably saw a few WHERE conditions. This is SQL’s way of filtering out results. For instance, if you are looking at an “Account history” type of page on your website, there is probably a query that looks up all of the orders a customer has placed, something like this:

SELECT * FROM orders WHERE customerid = 2;

This retrieves all orders placed by the customer with the database ID 2. On my computer, with 100,000 orders in the database, running this took 0.2158 seconds.

Columns like customerid, which appear in a lot of WHERE conditions with =, < or > and which have many possible values, should be indexed. This is like the index at the back of a book: it helps the database quickly retrieve indexed data. This is one of the quickest ways to speed up database queries.

What to Index

In order to know which columns to index, you need to understand a bit about how your database is being used. For example, if your website is often used to look up categories by name or events by date, then these columns should be indexed.

SELECT * FROM categories WHERE name = 'Books';
SELECT * FROM events WHERE startdate >= '2011-02-07';

Each of your database tables should already have an ID column (often called id, but sometimes ID or articleid or the like) that is listed as a PRIMARY KEY, as in the wp_posts screenshot below. These PRIMARY KEYs are automatically indexed. But you should also index any columns that refer to ID numbers in other tables, such as customerid in the example above. These are sometimes referred to as FOREIGN KEYs.

SELECT * FROM orders WHERE customerid = 2;
SELECT * FROM orderitems WHERE orderid = 231;

If a lot of text searches are being done, perhaps for descriptions of products or article content, then you can add another type of index called a FULL TEXT index. Queries using a FULL TEXT index can be done over multiple columns and are initially configured to work only with words of four or more letters. They also exclude certain common words like about and words that appear in more than 50% of the rows being searched. However, to use this type of index, you will need to change your SQL queries. Here is a typical text search, the first without and the second with a FULL TEXT index:

SELECT * FROM products WHERE name LIKE '%shoe%' OR description LIKE '%shoe%';
SELECT * FROM products WHERE MATCH(name,description) AGAINST ('shoe');

It may seem that you should go ahead and index everything. However, while indexing speeds up SELECTs, it slows down INSERTs, UPDATEs and DELETEs. So, if you have a products table that hardly ever changes, you can be more liberal with your indexing. But your orders and order items tables are probably being modified constantly, so you should be more sparing with them.

There are also cases where indexing may not help; for example, if most of the entries in a column have the same value. If you have a stock_status column that stores a value of 1 for “in stock,” and 95% of your products are in stock, then an index wouldn’t help someone search for in-stock products. Imagine if the word the was indexed at the back of a reference book: the index would list almost every page in the book.

SELECT * FROM products WHERE stock_status = 1;

How to Index

Using phpMyAdmin or phpMiniAdmin, you can look at the structure of each database table and see whether the relevant columns are already indexed. In phpMyAdmin, click the name of the table and browse to the bottom where it lists “Indexes.” In phpMiniAdmin, click “Show tables” at the top, and then “sct” for the table in question; this will show the database query needed to recreate the table, which will include any indices at the bottom — something like KEY 'orderidindex' ('orderid').

Using phpMiniAdmin to check for indices in the WordPress wp_posts table.

If the index does not exist, then you can add it. In phpMyAdmin, below the index, it says “Create an index on 1 columns”; click “Go” here, enter a useful name for the index (like customeridindex), choose the column on the next page, and press “Save,” as seen in this screenshot:

Indexing a column using phpMyAdmin.

In phpMiniAdmin, you’ll have to run the following database statement directly in the large SQL query box at the top:

ALTER TABLE orders ADD INDEX customeridindex (customerid);

Running the query again after indexing takes only 0.0019 seconds on my computer, 113 times faster.

Adding a FULL TEXT index is a similar process. When you run searches against this index, you must list the same columns:

ALTER TABLE articles ADD FULLTEXT(title,author,articletext);
SELECT * FROM articles WHERE MATCH(title,author,articletext) AGAINST ('mysql');

Back-Ups and Security

Before altering your database tables in any way, make a back-up of the whole database. You can do this using phpMyAdmin or phpMiniAdmin by clicking “Export.” Especially if your database contains customer information, keep the back-ups in a safe place. You can also use the command mysqldump to back up a database via SSH:

mysqldump --user=myuser --password=mypassword
--single-transaction --add-drop-table mydatabase
> backup`date +%Y%e%d`.sql

These scripts also represent a security risk, because they make it much easier for someone to steal all of your data. While phpMyAdmin is often provided securely through your server management software, phpMiniAdmin is a single file that is very easy to upload and forget about. So, you may want to password-protect it or remove it after use.

Optimizing Tables

MySQL and other kinds of database software have built-in tools for optimizing their data. If your tables get modified a lot, then you can run the tools regularly to make the database tables smaller and more efficient. But they take some time to run (from a few seconds to a few minutes or more, depending on the size of the tables), and they can block other queries from running on the table during optimization, so doing this at a non-busy time is best. There’s also some debate about how often to optimize, with opinions ranging from never to once in a while to weekly.

To optimize a table, run database statements such as the following in phpMyAdmin or phpMiniAdmin:

OPTIMIZE TABLE orders;

For example, before I optimized my orders table with 100,000 orders, it was 31.2 MB in size and took 0.2676 seconds to run SELECT * FROM orders. After its first ever optimization, it shrunk to 30.8 MB and took only 0.0595 seconds.

The PHP function below will optimize all of the tables in your database:

function OptimizeAllTables() {
 $tables = mysql_query ('SHOW TABLES'); //get all the tables
 while ($table = mysql_fetch_array ($tables))
 mysql_query ('OPTIMIZE TABLE ' . $table[0]); //optimize them
}

Before calling this function, you have to connect to your database. Most modern websites will connect for you, so you don’t need to worry about it, but the relevant MySQL calls are shown here for the sake of completeness:

mysql_connect (DB_HOST, DB_USER, DB_PASSWORD);
mysql_select_db (DB_NAME);
OptimizeAllTables();

Making Sure To Use The Cache

Just as a Web browser caches copies of pages you visit, database software caches popular queries. As above, the query below took 0.0019 seconds when I ran it the first time with an index:

SELECT * FROM orders WHERE customerid=2;

Running the same query again right away takes only 0.0004 seconds. This is because MySQL has remembered the results and can return them a second time without looking them up again.

However, many news websites and blogs might have queries like the following to ensure that articles are displayed only after their published date:

SELECT * FROM posts WHERE publisheddate > CURDATE();
SELECT * FROM articles WHERE publisheddate > NOW();

These queries cannot be cached because they depend on the current time or date. In a table with 100,000 rows, a query like the one above would take about 0.38 seconds every time I run it against an unindexed column on my computer.

If these queries are run on every page of your website, thousands of times per minute, it would speed things up considerably if they were cacheable. You can force queries to use the cache by replacing NOW or CURDATE with an actual time, like so:

SELECT * FROM articles WHERE publisheddate > '2011-01-17 17:00';

You can use PHP to make sure the time changes every five minutes or so:

$time = time();
$currenttime = date ('Y-m-d H:i', $time - ($time % 300));
mysql_query ("SELECT * FROM articles WHERE publisheddate > '$currenttime'");

The percentage sign is the modulus operator: % 300 rounds the time down to the previous multiple of 300 seconds, i.e. 5 minutes.

There are other uncacheable MySQL functions, too, like RAND.

Outgrowing Your Cache

Outgrowing your MySQL cache can also make your website appear to slow down. The more posts, pages, categories, products, articles and so on that you have on your website, the more related queries there will be. Take a look at this example:

SELECT * FROM articles WHERE publisheddate > '2011-01-17 17:00' AND categoryid=12

It could be that when your website had 500 categories, queries like this one all fit in the cache together and all returned in milliseconds. But with 1000 regularly visited categories, they keep knocking each other out of the cache and returning much slower. In this case, increasing the size of the cache might help. But giving more server RAM to your cache leaves less for other tasks, so consider this carefully. Plenty of advice is available about turning on and improving the efficiency of your cache by setting server variables.

When Caching Doesn’t Help

A cache is invalidated whenever a table changes. When a row is inserted, updated or deleted, all queries relying on that table are effectively cleared from the cache. So, if your articles table is updated every time someone views an article (perhaps to count the number of views), then the improvement suggested above might not help much.

In such cases, you may want to investigate an application-level cacher, such as Memcached, or read the next section for ideas on making your own ad-hoc cache. Both require much bigger programming changes than discussed up to now.

Making Your Own Cache

If a particularly vicious database query takes ages to run but the results don't change often, you can cache the results yourself.

Let’s say you want to show the 20 most popular articles on your website in the last week, using an advanced formula that takes into account searches, views, saves and “Send to a friend” hits. And you want to show these on your home page in an unordered (<ul>) HTML list.

It might be easiest to use PHP to run the database query once an hour or once a day and save the full list to a file somewhere, which you can then include on your home page.

Once you have written the PHP to create the include file, you could take one of a couple of approaches to scheduling it. You could use your server's scheduler (in Plesk 8, go to Server → Scheduled Tasks) to call a PHP page every hour, with a command like this:

wget -O /dev/null -q http://www.mywebsite.co.uk/runhourly.php

Alternatively, you could get PHP to check whether the file is at least an hour old before running the query — something like this, where 3600 is the number of seconds in an hour:

$filestat = stat ('includes/complicatedfile.html');
//look up information about the file
if ($filestat['mtime'] < time()-3600) RecreateComplicatedIncludeFile();
//over 1 hour
readfile ('includes/complicatedfile.html');
//include the file into the page

Returning to the involved example above for “Customers who bought this product also bought…,” you could also cache items in a new database column (or table). Once a week or so, you could run that long set of queries for each and every product, to figure out which other products customers are buying. You could then store the resulting product ID numbers in a new database column as a comma-separated list. Then, when you want to select the other products bought by customers who bought the product with the ID 12, you can run this query:

SELECT * FROM products WHERE FIND_IN_SET(12,otherproductids);

Reducing The Number Of Queries By Using JOINs

Somewhere in the management and control area of your e-commerce website is probably a list of your orders with the names of the customers who made them.

This page might have a query like the following to find all completed orders (with a status value indicating whether an order has been completed):

SELECT * FROM orders WHERE status>1;

And for each order it comes across, it might look up the customer’s details:

SELECT * FROM customers WHERE id=1;
SELECT * FROM customers WHERE id=2;
SELECT * FROM customers WHERE id=3;
etc

If this page shows 100 orders at a time, then it has to run 101 queries. And if each of those customers looks up their delivery address in a different table, or looks for the total charge for all of their orders, then the time delay will start to add up. You can make it much faster by combining the queries into one using a JOIN. Here’s what a JOIN looks like for the queries above:

SELECT * FROM orders INNER JOIN customers
ON orders.customerid = customers.id WHERE orders.status>1;

Here is another way to write this, without the word JOIN:

SELECT * FROM orders, customers
WHERE orders.customerid = customers.id AND orders.status>1;

Restructuring queries to use JOINs can get complicated because it involves changing the accompanying PHP code. But if your slow page runs thousands of database statements, then it may be worth a look. For further information, Wikipedia offers a good explanation of JOINs. The columns with which you use a JOIN (customerid in this case) are also prime candidates for being INDEXed.

You could also ask MySQL to EXPLAIN a database query. This tells you which tables it will use and provides an “execution plan.” Below is a screenshot showing the EXPLAIN statement being used on one of the more complex WordPress queries from above:

Using the EXPLAIN statement to explain how MySQL plans to deal with a complex query.

The screenshot shows which tables and indices are being used, the JOIN types, the number of rows analyzed, and a lot more information. A comprehensive page on the MySQL website explains what the EXPLAIN explains, and another much shorter page goes over how to use that information to optimize your queries (by adding indices, for instance).

…Or Just Cheat

Finally, returning again to the advanced example above for “Customers who bought this product also bought…,” you could also simply change the functionality to something less complicated for starters. You could call it “Recommended products” and just return a few other products from the same category, or return some hand-picked recommendations.

Conclusion

This article has shown a number of techniques for improving database performance, ranging from simple to quite complex. While all well-built websites should already incorporate most of these techniques (particularly the database indices and JOINs), the techniques do get overlooked.

There is also a lot of debate on forums around the Web about the effectiveness and reliability of some of these techniques (i.e. measuring speed, indexing, optimization, how best to use the cache, etc.), so the advice here is not definitive, but hopefully it gives you an overview of what’s available.

If your website starts to mysteriously slow down after a few months or years, you will at least have a starting point for figuring out what’s wrong.

(al)


© Paul Tero for Smashing Magazine, 2011.
Post tags: database, performance, SQL

October 11 2010

12:01

Local Storage And How To Use It On Websites


Storing information locally on a user’s computer is a powerful strategy for a developer who is creating something for the Web. In this article, we’ll look at how easy it is to store information on a computer to read later and explain what you can use that for.


Adding State To The Web: The “Why” Of Local Storage

The main problem with HTTP as the transport layer of the Web is that it is stateless. This means that when you use a Web application and then close it, its state will be reset the next time you open it. By contrast, if you close an application on your desktop and re-open it, its most recent state is restored.

This is why, as a developer, you need to store the state of your interface somewhere. Normally, this is done server-side, and you would check the user name to know which state to revert to. But what if you don’t want to force people to sign up?

This is where local storage comes in. You would keep a key on the user’s computer and read it out when the user returns.

C Is For Cookie. Is That Good Enough For Me?

The classic way to do this is by using a cookie. A cookie is a text file hosted on the user’s computer and connected to the domain that your website runs on. You can store information in them, read them out and delete them. Cookies have a few limitations though:

  • They add to the load of every document accessed on the domain.
  • They allow up to only 4 KB of data storage.
  • Because cookies have been used to spy on people’s surfing behavior, security-conscious people and companies turn them off or request to be asked every time whether a cookie should be set.

To work around the limitations of cookies, which are a rather dated solution to the problem of storing data locally, the WHATWG and W3C came up with a few local storage specs, which were originally part of HTML5 but were then split out because HTML5 was already big enough.

Using Local Storage In HTML5-Capable Browsers

Using local storage in modern browsers is ridiculously easy. All you have to do is modify the localStorage object in JavaScript. You can do that directly or (and this is probably cleaner) use the setItem() and getItem() methods:

localStorage.setItem('favoriteflavor','vanilla');

If you read out the favoriteflavor key, you will get back “vanilla”:

var taste = localStorage.getItem('favoriteflavor');
// -> "vanilla"

To remove the item, you can use — can you guess? — the removeItem() method:

localStorage.removeItem('favoriteflavor');
var taste = localStorage.getItem('favoriteflavor');
// -> null

That’s it! You can also use sessionStorage instead of localStorage if you want the data to be maintained only until the browser window closes.

Working Around The “Strings Only” Issue

One annoying shortcoming of local storage is that you can only store strings in the different keys. This means that when you have an object, it will not be stored the right way.

You can see this when you try the following code:

var car = {};
car.wheels = 4;
car.doors = 2;
car.sound = 'vroom';
car.name = 'Lightning McQueen';
console.log( car );
localStorage.setItem( 'car', car );
console.log( localStorage.getItem( 'car' ) );

Trying this out in the console shows that the data is stored as [object Object] and not the real object information:


You can work around this by using the native JSON.stringify() and JSON.parse() methods:

var car = {};
car.wheels = 4;
car.doors = 2;
car.sound = 'vroom';
car.name = 'Lightning McQueen';
console.log( car );
localStorage.setItem( 'car', JSON.stringify(car) );
console.log( JSON.parse( localStorage.getItem( 'car' ) ) );


Where To Find Local Storage Data And How To Remove It

During development, you might sometimes get stuck and wonder what is going on. Of course, you can always access the data using the right methods, but sometimes you just want to clear the plate. In Opera, you can do this by going to Preferences → Advanced → Storage, where you will see which domains have local data and how much:


Doing this in Chrome is a bit more problematic, which is why we made a screencast:

Mozilla has no menu access for this so far, but will in the future. For now, you can go to the Firebug console and delete local storage manually easily enough.

So, that’s how you use local storage. But what can you use it for?

Use Case #1: Local Storage Of Web Service Data

One of the first uses for local storage that I discovered was caching data from the Web when it takes a long time to get it. My World Info entry for An Event Apart's 10K challenge shows what I mean by that.

When you call the demo the first time, you have to wait up to 20 seconds to load the names and geographical locations of all the countries in the world from the Yahoo GeoPlanet Web service. If you call the demo a second time, there is no waiting whatsoever because — you guessed it — I’ve cached it on your computer using local storage.

The following code (which uses jQuery) provides the main functionality for this. If local storage is supported and there is a key called thewholefrigginworld, then call the render() method, which displays the information. Otherwise, show a loading message and make the call to the Geo API using getJSON(). Once the data has loaded, store it in thewholefrigginworld and call render() with the same data:

if(localStorage && localStorage.getItem('thewholefrigginworld')){
  render(JSON.parse(localStorage.getItem('thewholefrigginworld')));
} else {
  $('#list').html('<p class="loading">'+loading+'</p>'); // wrapper markup was stripped in the original; a paragraph is assumed
  var query = 'select centroid,woeid,name,boundingBox'+
    ' from geo.places.children(0)'+
    ' where parent_woeid=1 and placetype="country"'+
    ' | sort(field="name")';
  var YQL = 'http://query.yahooapis.com/v1/public/yql?q='+
    encodeURIComponent(query)+'&diagnostics=false&format=json';
  $.getJSON(YQL,function(data){
    if(localStorage){
      localStorage.setItem('thewholefrigginworld',JSON.stringify(data));
    }
    render(data);
  });
}

You can see the difference in loading times in the following screencast:

The code for the world info is available on GitHub.

This can be extremely powerful. If a Web service allows you only a certain number of calls per hour but the data doesn’t change all that often, you could store the information in local storage and thus keep users from using up your quota. A photo badge, for example, could pull new images every six hours, rather than every minute.
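
One way to implement that kind of time-based refresh is to store a timestamp alongside the cached payload and only call the service again once the cache has expired. A minimal sketch, assuming localStorage is available; the key name, the six-hour window and the loadFromService() and render() functions are placeholders:

var KEY = 'photobadge',
    MAXAGE = 6 * 60 * 60 * 1000; // six hours in milliseconds
var cached = localStorage.getItem(KEY);
if (cached) { cached = JSON.parse(cached); }

if (cached && (new Date().getTime() - cached.stored) < MAXAGE) {
  render(cached.data); // still fresh - no HTTP request at all
} else {
  loadFromService(function(data) { // placeholder for the actual API call
    localStorage.setItem(KEY, JSON.stringify({
      stored: new Date().getTime(),
      data: data
    }));
    render(data);
  });
}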

This is very common when using Web services server-side. Local caching keeps you from being banned from services, and it also means that when a call to the API fails for some reason, you will still have information to display.

getJSON() in jQuery is especially egregious in accessing services and breaking their cache, as explained in this blog post from the YQL team. Because the request to the service using getJSON() creates a unique URL every time, the service does not deliver its cached version but rather fully accesses the system and databases every time you read data from it. This is not efficient, which is why you should cache locally and use ajax() instead.
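
In practice that means swapping the getJSON() shorthand for an explicit ajax() call with caching switched on and a fixed callback name, so the request URL stays the same between calls. A sketch built on the snippet above; the callback name is arbitrary and the exact options are an assumption, not the article's original code:

$.ajax({
  url: YQL,                  // same request URL as in the snippet above
  dataType: 'jsonp',
  cache: true,               // don't append a cache-busting timestamp parameter
  jsonpCallback: 'worldcb',  // fixed callback name keeps the URL cacheable
  success: function(data) {
    if (localStorage) {
      localStorage.setItem('thewholefrigginworld', JSON.stringify(data));
    }
    render(data);
  }
});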

Use Case #2: Maintaining The State Of An Interface The Simple Way

Another use case is to store the state of interfaces. This could be as crude as storing the entire HTML or as clever as maintaining an object with the state of all of your widgets. One instance where I am using local storage to cache the HTML of an interface is the Yahoo Firehose research interface (source on GitHub):

The code is very simple — using YUI3 and a test for local storage around the local storage call:

YUI().use('node', function(Y) {
  if(('localStorage' in window) && window['localStorage'] !== null){
    var key = 'lastyahoofirehose';
  
    localStorage.setItem(key,Y.one('form').get('innerHTML'));
  
    if(key in localStorage){
      Y.one('#mainform').set('innerHTML',localStorage.getItem(key));
      // the wrapper markup around this notice was stripped in the original; a paragraph is assumed
      Y.one('#hd').append('<p class="hint">Notice: We restored your last search for you - not live data</p>');
    }
  }
});

You don’t need YUI at all; it only makes it easier. The logic to generically cache interfaces in local storage is always the same: check if a “Submit” button has been activated (in PHP, Python, Ruby or whatever) and, if so, store the innerHTML of the entire form; otherwise, just read from local storage and override the innerHTML of the form.
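
Stripped of the YUI convenience methods, the same pattern in plain JavaScript looks roughly like this; the form ID, the storage key and the formWasJustSubmitted flag are placeholders for whatever your page actually uses:

var form = document.getElementById('mainform'),
    key = 'lastsearchform';

if (('localStorage' in window) && window['localStorage'] !== null) {
  if (formWasJustSubmitted) {                    // hypothetical flag set when the form was sent
    localStorage.setItem(key, form.innerHTML);   // remember the interface state
  } else if (localStorage.getItem(key)) {
    form.innerHTML = localStorage.getItem(key);  // restore the last state
  }
}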

The Dark Side Of Local Storage

Of course, any powerful technology comes with the danger of people abusing it for darker purposes. Samy, the man behind the “Samy is my hero” MySpace worm, recently released a rather scary demo called Evercookie, which shows how to exploit all kinds of techniques, including local storage, to store information about a user on their computer even when cookies are turned off. This code could be used in all kinds of ways, and to date there is no way around it.

Research like this shows that we need to look at HTML5's features and add-ons from a security perspective very soon to make sure that people can’t record user actions and information without the user’s knowledge. An opt-in for local storage, much like you have to opt in to share your geographic location, might be in order; but from a UX perspective this is considered clunky and intrusive. Got any good ideas?

(al)


© Christian Heilmann for Smashing Magazine, 2010.
Post tags: Coding, localstorage, performance, programming, security

September 22 2010

11:00

HTML5 Link Prefetching

From David Walsh comes a good writeup on the HTML5 link prefetch tag:

HTML:
<!-- full page (placeholder URL) -->
<link rel="prefetch" href="http://example.com/next-page.html">

<!-- just an image (placeholder URL) -->
<link rel="prefetch" href="http://example.com/images/big-photo.png">

You use the link tag to do prefetching, setting the rel to "prefetch" and giving the URL to the resource to prefetch. When should you use link prefetching?

Whether prefetching is right for your website is up to you.  Here are a few ideas:

  • When a series of pages is much like a slideshow, load the next 1-3 pages, previous 1-3 pages (assuming they aren't massive).
  • Loading images to be used on most pages throughout the website.
  • Loading the next page of the search results on your website.
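
If you would rather add these hints from JavaScript (say, once you can guess which resource the visitor is likely to need next), the same link element can be created on the fly. A small sketch with a placeholder URL:

JAVASCRIPT:
var hint = document.createElement('link');
hint.rel = 'prefetch';
hint.href = '/search-results-page-2.html'; // placeholder: whatever you expect to be needed next
document.getElementsByTagName('head')[0].appendChild(hint);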

Some things to know about link prefetching, though:

  • Prefetching does work across domains, including pulling cookies from those sites.
  • Prefetching can throw off website statistics as the user doesn't technically visit a given page.
  • Mozilla Firefox, currently the only browser to support prefetching, has actually supported prefetching since 2003.

September 07 2010

17:47

WebPagetest and PageSpeed join up via PageSpeed SDK

Steve Souders just pointed me to the good news that two great open-source performance projects are working well together:

Pat Meenan just blogged about Page Speed results now available in Webpagetest. This is a great step toward greater consistency in the world of web performance, something that benefits developers and ultimately benefits web users.

The Page Speed SDK gives a path for folks to unify behind standard performance metrics and results. Great work!

September 01 2010

10:00

Performance Optimization: How to Load your javascript faster!

JavaScript is now extremely important. Some sites use JavaScript only for tiny enhancements, many of today's web apps depend on it, and some of them are written entirely in it. In this article I'll point out some important rules about how to use your JavaScript, which tools to use and what benefits you'll gain from them.

Keep your code to a minimum


Don't rely on JavaScript, and don't duplicate your scripts. Treat it like a garnish that makes things prettier. Don't bloat your site with piles of JavaScript; use it only when necessary, and only when it really improves the user experience.

Minimize DOM access

Accessing DOM elements with JavaScript is easy and makes code more readable, but it is slow. A few tips: limit layout fixes done in JavaScript, and cache references to elements you access repeatedly. If your site depends heavily on DOM modifications, you should also consider trimming your markup; that is a good reason to switch to HTML5 and leave old XHTML and HTML4 behind. You can check the number of DOM elements on a page by typing document.getElementsByTagName('*').length into Firebug's console.
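
Caching a reference is as simple as looking an element up once and reusing the variable. A quick sketch; the element ID is a placeholder:

// slow: walks the DOM on every iteration
for (var i = 0; i < 100; i++) {
    document.getElementById('status').innerHTML = 'item ' + i;
}

// faster: look the element up once and reuse the reference
var statusEl = document.getElementById('status');
for (var i = 0; i < 100; i++) {
    statusEl.innerHTML = 'item ' + i;
}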

Compress your code

The most efficient way to serve compressed JavaScript is to first run your code through a JavaScript compressor that shrinks variable and argument names, and then serve the resulting code using gzip compression.

I don't compress my main.js during development, but check whether any of your jQuery plugins are uncompressed and, if so, compress them (remember to keep the author's notice). Below are some options for compression.

GZip compression: The idea is to reduce the time spent transferring data between the browser and the server. The browser tells the server it can handle compressed responses via the Accept-Encoding: gzip,deflate request header, and the server sends the file back compressed. The disadvantages: it costs CPU on both the server side (to compress) and the client side (to decompress), as well as some disk space.

Avoid eval(): While it may occasionally save a little time, it is definitely bad practice. It makes your code look dirtier and it trips up most compressors.

A tool to speed up JavaScript loading: LAB.js

There are many awesome tools that can speed up your JavaScript loading time. One worth mentioning is LAB.js.

With LAB.js (Loading And Blocking JavaScript) you can load your JavaScript files in parallel, speeding up the total loading process. What's more, you can also set up a certain order for the scripts to be executed in, so no dependencies are broken. The author also reports a 2x speed improvement on his own site.
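
A typical LAB.js chain looks something like the sketch below: the three files download in parallel, but wait() guarantees that the framework has executed before the plugin and your own code run. The file names and the initApp() call are placeholders:

$LAB
  .script('/js/jquery.min.js').wait()   // must execute before anything that depends on it
  .script('/js/jquery.someplugin.js')   // downloads in parallel, executes after jQuery
  .script('/js/site.js')
  .wait(function() {
    initApp(); // everything above has loaded and executed at this point
  });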

Using proper CDN

Many web pages now load common libraries from a CDN (content delivery network). This improves caching, because lots of sites share the same copy of a file, and it can also save you some bandwidth. You can ping the servers, or check them in Firebug, to see which one delivers data to you faster, and choose a CDN that matches your readers' location. Remember to use the public repositories whenever possible.

Some CDN options for jQuery:

  • http://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js – Google Ajax, information about more libraries
  • http://ajax.microsoft.com/ajax/jquery/jquery-1.4.2.min.js – Microsoft’s CDN
  • http://code.jquery.com/jquery-1.4.2.min.js – Edgecast (mt)

Load your JavaScript at the end of the page


This is a very good practice if you care about users not leaving your page because of a slow Internet connection: usability and the user come first, JavaScript comes last. This may be painful, but you should also be prepared for users with JavaScript disabled. You may put some JavaScript in the head section, but only if it loads asynchronously.

Load tracking asynchronously

This one is very important. Most of us use Google Analytics for statistics, which is fine, but look at where you put your tracking code. Is it in the head section? Does it use document.write? If so, you have only yourself to blame for not using the asynchronous tracking code for Google Analytics.

Below is what the asynchronous tracking code for Google Analytics looks like. Note that it uses the DOM instead of document.write, which may suit you better, and that it captures events that happen before the page has finished loading, which matters when you think of all the users who close your page before it has even loaded. It is the cure for missing page views.


	var _gaq = _gaq || [];
	_gaq.push(['_setAccount', 'UA-XXXXXXX-XX']);
	_gaq.push(['_trackPageview']);

	(function() {
		var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
		ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
		var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
	})();

Not using GA? That's not a problem; most of today's analytics providers allow you to use asynchronous tracking.

Ajax Optimization


Ajax requests have a big impact on your site's performance. Below are some tips on Ajax optimization.

Cache your ajax

Look at your code: is your Ajax cacheable? It depends on the data, of course, but most of your Ajax requests should be cacheable. In jQuery, requests are cached by default, except for the script and jsonp data types.
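
If you need to control it explicitly, the cache flag on an ajax() call is the place to do it. For example, forcing a request to be cacheable (the URL is a placeholder):

$.ajax({
  url: '/data/products.json', // placeholder URL
  dataType: 'json',
  cache: true, // reuse the cached response instead of appending a timestamp to the URL
  success: function(data) {
    // ...use the data...
  }
});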

Use GET for Ajax Requests

A POST request takes two TCP packets to send (the headers are sent first, the data next), while a GET request takes only one packet (which may depend on how many cookies you have). So, as long as your URL is shorter than about 2 KB and you only want to retrieve data, use GET.

Use ySlow


It's both simple and extremely powerful when it comes to performance: it grades your website and shows you what needs to be corrected and what should be taken care of.

Bonus: Pack your javascript into PNG File

jQuery and Prototype Packed into one image

Imagine adding your JS and CSS to the end of an image and cropping it in CSS to have all the info you need in an app in a single HTTP request.

I found this recently. What it basically does is pack your JavaScript or CSS data into a PNG file. After that, you can unpack it using the canvas API's getImageData(). It's very efficient, too: you can gain about 35% compression without even minifying your data, and the compression is lossless. I must point out that for larger scripts you can feel "some" load time while the image is painted to the canvas and the pixels are read.

For more information about this one check out this article from 2008.

Final Thoughts

Hope you liked this article. If so, remember to share it and to say hello to me on Twitter. Stay tuned for further posts about serious performance optimization.

August 22 2010

19:30

Want to pack JS and CSS really well? Convert it to a PNG and unpack it via Canvas

Jacob Seidelin of nihilogic fame (remember his Super Mario in JavaScript solution) is one of my unsung heroes of JavaScript. His solutions have that Dean Edwards "genius bordering on the bat-sh*t-crazy" touch: they make you shake your head in disbelief when they come out but later on become very interesting.

One of his posts from 2008, entitled "Compression using Canvas and PNG-embedded data", had a good idea: if you want to compress JavaScript and CSS, you could either build a packing algorithm in JavaScript or use a lossless packing format that is already in use and supported in browsers. In this case the packed format is PNG, and the way to unpack it is the canvas API's getImageData() method:

JAVASCRIPT:
var x = function(z, m, ix ) { // image, callback, chunk index
  var o = new Image();
  o.onload = function() {
    var s = "",
        c = d.createElement("canvas"),
        t = c.getContext("2d"),
        w = o.width,
        h = o.height;
    c.width = c.style.width = w;
    c.height = c.style.height = h;
    t.drawImage(o, 0, 0);
    var b = t.getImageData( 0, 0, w, h ).data; //b : bucket of data
    for(var i= 0; i <b.length; i += 4) {
      if( b[i]> 0 )
        s += String.fromCharCode(b[i]);
    }
    m(s, ix);
  }
  o.src = z;
}

As there are quite a few interesting competitions going on that need really small JavaScript solutions, Alex Le took up Jacob's work for the 10K competition and wrapped it in a build script that concatenates, packs and converts the code to a PNG, and then unpacks it with JavaScript. In the process Alex also found a bug in Internet Explorer 9's canvas implementation: it only reads the first 8192 bytes of a PNG and returns 0 for the rest :(.

It is pretty amazing how efficient this way of packing is. What we need to test now is when and whether it is worthwhile to have the unpacking done on the client. Imagine adding your JS and CSS to the end of an image and cropping it in CSS to have all the info you need in an app in a single HTTP request. Let the games begin.
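
A usage sketch, assuming the function above is available as x() and that d in the snippet is an alias for document; the file name is a placeholder:

JAVASCRIPT:
x('packed-scripts.png', function(source, chunkIndex) {
  // 'source' is the original JavaScript recovered from the image's pixel data
  (new Function(source))(); // execute it; eval(source) would do the same
}, 0);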

August 18 2010

11:26

When does JavaScript trigger reflows and rendering?

Thomas Fuchs has some good performance things to say about reflows and rendering. A video of Wikipedia being rendered gives you an idea of how much happens when a basic page is drawn:

The advice?

The important thing is to always remember that reflowing and rendering HTML is the single most expensive operation browsers do. If your page feels sluggish it’s most likely a problem with rendering. While the easiest way to optimize is to get rid of as many nodes as you can, and trying to have simpler CSS rules, sometimes JavaScript is the culprit.

Following changes to the page, a Javascript query like someElement.offsetHeight will block execution - to give you the right answer, any pending reflow has to be executed first. So code like this:

JAVASCRIPT:
someElement.style.fontSize = "14px";
if(someElement.offsetHeight>100){ /* ... */ }
someElement.style.paddingLeft = "20px";
if(someElement.offsetWidth>100){ /* ... */ }
 

could be twice as fast if you wrote it like this:

JAVASCRIPT:
someElement.style.fontSize = "14px";
someElement.style.paddingLeft = "20px";
if(someElement.offsetHeight>100){ /* ... */ }
if(someElement.offsetWidth>100){ /* ... */ }
 

because there are two reflows in the first example, and only one in the second.

July 27 2010

00:21

Canvas Color Cycling

Interest in Canvas, as well as mobile apps, has led to a renaissance of old-school 8-bit graphics. Joe Huckaby of Effect Games has been playing around with color cycling, leading to some stunning effects.

Anyone remember Color cycling from the 90s? This was a technology often used in 8-bit video games of the era, to achieve interesting visual effects by cycling (shifting) the color palette. Back then video cards could only render 256 colors at a time, so a palette of selected colors was used. But the programmer could change this palette at will, and all the onscreen colors would instantly change to match. It was fast, and took virtually no memory.

There’s a neat optimization going on here too: instead of clearing and redrawing the entire scene with each frame, he only updates the pixels that change:

In order to achieve fast frame rates in the browser, I had to get a little crazy in the engine implementation. Rendering a 640×480 indexed image on a 32-bit RGB canvas means walking through and drawing 307,200 pixels per frame, in JavaScript. That’s a very big array to traverse, and some browsers just couldn’t keep up. To overcome this, I pre-process the images when they are first loaded, and grab the pixels that reference colors which are animated (i.e. are part of cycling sets in the palette). Those pixel X/Y offsets are stored in a separate, smaller array, and thus only the pixels that change are refreshed onscreen. This optimization trick works so well, that the thing actually runs at a pretty decent speed on my iPhone 3GS and iPad!
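
The idea translates roughly into the sketch below: scan the indexed image once to record the offsets of pixels whose palette entries cycle, then on each frame recompute only those offsets in a reusable ImageData buffer. This is only an illustration of the approach described above, not Joe's actual engine; the pixels, palette and cyclingIndexes inputs are assumed:

JAVASCRIPT:
function makeCycler(ctx, width, height, pixels, palette, cyclingIndexes) {
  // pixels: one palette index per pixel; palette: array of [r, g, b] entries
  // cyclingIndexes: object whose keys are the palette indexes that animate
  var frame = ctx.createImageData(width, height),
      animated = [], i, rgb, o;

  // draw the full scene once and remember which pixels will ever change
  for (i = 0; i < pixels.length; i++) {
    rgb = palette[pixels[i]];
    o = i * 4;
    frame.data[o] = rgb[0];
    frame.data[o + 1] = rgb[1];
    frame.data[o + 2] = rgb[2];
    frame.data[o + 3] = 255;
    if (cyclingIndexes[pixels[i]]) { animated.push(i); }
  }
  ctx.putImageData(frame, 0, 0);

  // per frame (after the palette has been rotated elsewhere), only touch the animated pixels
  return function drawFrame() {
    for (var j = 0; j < animated.length; j++) {
      var p = animated[j];
      rgb = palette[pixels[p]];
      o = p * 4;
      frame.data[o] = rgb[0];
      frame.data[o + 1] = rgb[1];
      frame.data[o + 2] = rgb[2];
    }
    ctx.putImageData(frame, 0, 0);
  };
}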

July 26 2010

16:36

Quick Tip: Improve Site Performance in 3 Easy Steps


We all know we should do it, but how many of us actually do? I’m talking about minifying JavaScript and CSS, and optimizing images, to reduce load times. Today, I’ll show you three quick and easy methods that all of us should implement to improve our site’s performance.


So what other techniques do you use to improve load times?

June 30 2010

08:15

IE9 gets a Web Timing API to measure performance

Website performance is a very important topic. We should not make our end users wait for our sites, and optimizing them for load time and rendering can save us thousands of dollars in traffic. There is a lot of great content out there on performance (spearheaded by Yahoo a few years back). When it comes to measuring performance after the page has loaded, though, there is a lot you can do wrong: you need to test with timers and hope that nothing else happening on your test machine interferes with your results.

The IE9 team wants to make this easier for developers and has added a new Web Timing API to the browser. Web Timing is a W3C working draft; IE9 implements the NavigationTiming part of the spec in window.msPerformance.timing and offers you a few sets of information without your having to hack together your own solution:

JAVASCRIPT:
interface MSPerformanceTiming{
     readonly attribute unsigned long long navigationStart;
     readonly attribute unsigned long long fetchStart;
     readonly attribute unsigned long long unloadStart;
     readonly attribute unsigned long long unloadEnd;
     readonly attribute unsigned long long domainLookupStart;
     readonly attribute unsigned long long domainLookupEnd;
     readonly attribute unsigned long long connectStart;
     readonly attribute unsigned long long connectEnd;
     readonly attribute unsigned long long requestStart;
     readonly attribute unsigned long long requestEnd;
     readonly attribute unsigned long long responseStart;
     readonly attribute unsigned long long responseEnd;
     readonly attribute unsigned long long domLoading;
     readonly attribute unsigned long long domInteractive;
     readonly attribute unsigned long long domContentLoaded;
     readonly attribute unsigned long long domComplete;
     readonly attribute unsigned long long loadStart;
     readonly attribute unsigned long long loadEnd;
     readonly attribute unsigned long long firstPaint;
     readonly attribute unsigned long long fullyLoaded;
}
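
As a quick sketch of how these values could be used once the page has finished loading (the attribute names come from the interface above; the console.log reporting is just an illustration):

JAVASCRIPT:
window.onload = function () {
  var t = window.msPerformance && window.msPerformance.timing;
  if (!t) { return; } // browser does not expose the API
  // All attributes are millisecond timestamps, so differences give durations.
  var network = t.responseEnd - t.fetchStart;   // fetch plus response time
  var dom     = t.domComplete - t.domLoading;   // DOM construction time
  var total   = t.loadEnd - t.navigationStart;  // full page load time
  console.log('network:', network, 'dom:', dom, 'total:', total);
};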

You have even more granular control in timingMeasures:

JAVASCRIPT:
interface MSPerformanceTimingMeasures{
     readonly attribute unsigned long long navigation;
     readonly attribute unsigned long long fetch;
     readonly attribute unsigned long long unload;
     readonly attribute unsigned long long domainLookup;
     readonly attribute unsigned long long connect;
     readonly attribute unsigned long long request;
     readonly attribute unsigned long long response;
     readonly attribute unsigned long long domLoading;
     readonly attribute unsigned long long domInteractive;
     readonly attribute unsigned long long domContentLoaded;
     readonly attribute unsigned long long domComplete;
     readonly attribute unsigned long long load;
     readonly attribute unsigned long long firstPaint;
     readonly attribute unsigned long long fullyLoaded;
}

Read the original post on MSDN and check out the demo on IE Test Drive.

June 16 2010

13:16

JSKB: JavaScript Knowledge Base. Shrinking your code via BrowserScope and Caja

We have a screwed-up tension on the Web: the size of your source code really matters for performance. The larger your .js, the longer it takes to get down the pipe. This creates a perverse incentive to write terse, uncommented code. Add to this the problem of having to work cross-browser, and having to do so at runtime, and you end up shipping a ton of code to browsers that will never touch it.

This is where Mike Samuel of Caja, and Lindsey Simon of Browserscope come in. They have a plan to help reverse code bloat with JavaScript:

Lots of compilers (incl. JSMin, Dojo, YUI, Closure, Caja) remove unnecessary code from JavaScript to make the code you ship smaller. They seem like a natural place to address this problem. Optimization is just taking into account the context that code is going to run in to improve it; giving compilers information about browsers will help them avoid shipping code that supports marginal browsers to modern browsers.

The JavaScript Knowledge Base (JSKB) on browserscope.org seeks to systematically capture this information in a way that compilers can use.
It collects facts about browsers using JavaScript snippets. The JavaScript code (!!window.JSON && typeof window.JSON.stringify === 'function') is true if JSON is defined. JSKB knows that this is true for Firefox 3.5 but not for Netscape 2.0.

Caja Web Tools includes a code optimizer that uses these facts. If it sees code like if (typeof JSON.stringify !== 'function') { /* lots of code */ } it knows that the body will never be executed on Firefox 3.5, and can optimize it out. The key here is that the developer writes feature tests, not version tests, and as browsers roll out new features, JSKB captures that information, letting compilers produce smaller code for that browser.

The Caja team just released Caja Web Tools, which already uses JSKB to optimize code. We hope that other JavaScript compilers will adopt these techniques. If you're working on a JavaScript optimizer, take a look at our JSON APIs to get an idea of what the JSKB contains.

You can see graphically how this works and learn more about how browser detection info is packaged:

JAVASCRIPT:
{
  "!!this.window !== 'undefined' && this === window": true,
  "typeof addEventListener": "function",
  "typeof attachEvent": "undefined"
  "typeof document.body.outerHTML": "undefined",
}
 

Definitely feels like there is a lot of room to do more with a compilation step that only sends down the right JS for the given browser.
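
To make that concrete, here is a hypothetical before/after of the kind of transformation a JSKB-aware optimizer could perform (the serialize example is made up for illustration and is not from Caja Web Tools):

JAVASCRIPT:
// What the developer writes: a feature test, not a version test.
var serialize;
if (typeof JSON !== 'undefined' && typeof JSON.stringify === 'function') {
  serialize = function (obj) { return JSON.stringify(obj); };
} else {
  serialize = function (obj) {
    // ...large fallback serializer for browsers without native JSON...
    return '';
  };
}

// What could be shipped to a browser where JSKB records
// "typeof JSON.stringify" as "function": the else branch is provably dead
// for that browser, so the optimizer strips it, leaving roughly:
var serialize = function (obj) { return JSON.stringify(obj); };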

June 15 2010

11:13

Y.preload: load before execution

Caridy Patino has posted about a new YUI3 module for preloading content, implementing Stoyan's ideas.

You can now strap on some preloading goodness to your YUI application:

JAVASCRIPT:
YUI({
    //Last Gallery Build of this module
    gallery: 'gallery-2010.05.05-19-39'
}).use('gallery-preload', function(Y) {
  Y.preload ([
    'http://tools.w3clubs.com/pagr2/1.sleep.expires.png',
    'http://tools.w3clubs.com/pagr2/1.sleep.expires.js',
    'http://tools.w3clubs.com/pagr2/1.sleep.expires.css'
  ]);
});
 

As well as just loading right away, you can also wait for the user to focus on something on the page before preloading:

JAVASCRIPT:
Y.on("focus", function() {
    Y.preload ([
      'http://tools.w3clubs.com/pagr2/2.sleep.expires.png',
      'http://tools.w3clubs.com/pagr2/2.sleep.expires.js',
      'http://tools.w3clubs.com/pagr2/2.sleep.expires.css'
    ]);
}, ".myform input.query");
 

Or, to take it even further, only preload if the user is probably idle:
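
The original snippet isn't reproduced here, but a minimal sketch of the idea (reset a short timer on user activity and preload only once it fires) could look like the following, reusing the URLs from the example above; the two-second threshold and the events being watched are illustrative:

JAVASCRIPT:
YUI({
    //Last Gallery Build of this module
    gallery: 'gallery-2010.05.05-19-39'
}).use('gallery-preload', function(Y) {
  var idleTimer;
  function resetIdleTimer() {
    if (idleTimer) { idleTimer.cancel(); }
    idleTimer = Y.later(2000, null, function() { // 2s with no activity
      Y.preload ([
        'http://tools.w3clubs.com/pagr2/2.sleep.expires.png',
        'http://tools.w3clubs.com/pagr2/2.sleep.expires.js',
        'http://tools.w3clubs.com/pagr2/2.sleep.expires.css'
      ]);
    });
  }
  Y.on("mousemove", resetIdleTimer, "body");
  Y.on("keydown", resetIdleTimer, "body");
  resetIdleTimer();
});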

Nicely packaged.

June 11 2010

17:30

How fast does FIFA.com score a goal?

It’s the World Cup again. Being a Brit, I am on tenterhooks with the first England game, against the USA, coming up tomorrow. A family feud for me. We start to see great microsites such as the Twitter @WorldCup site, and as we wonder what the fastest goal will be… what about the fastest website?

Dynatrace had some fun and did a performance test on FIFA.com:

  • Time to First Impression/Drawing: 3.74s
    • Analysis: so it takes almost 4s until the user sees a visual indication of the page load – that is definitely too long and should be improved
    • Recommendation: < 1s is great. <2.5s is acceptable
  • Time to onLoad: 8.25s
    • Analysis: it takes the browser 8.25s to download the initial document plus all referenced objects before it triggers the onLoad event that allows JavaScript to modify the page after it has been loaded – again, much too slow, as nobody likes to wait 8s until the content is loaded
    • Recommendation: < 2s is great. <4s is acceptable
  • Time to Fully Loaded: 8.6s
    • Analysis: the page loads additional resources triggered by JavaScript onLoad handlers. I consider the page fully loaded when all these additional requests are downloaded. I guess I don’t need to mention that 8.6s is not fast :(
    • Recommendations: < 2s is great. <5s is acceptable
  • Number of HTTP Requests: 201
    • Analysis: 201 – that’s a lot of elements for a single page. We have seen many images that are the main contributors to this load. My first thought on this -> let’s see how we can reduce this number, e.g. by merging files (more details later)
    • Recommendations: < 20 is great. < 100 is acceptable (This one is a hard recommendation as it really depends on the type of website – but – it is a good start to measure this KPI)
  • Number and Impact of HTTP Redirects: 1/1.44s
  • Number and Impact of HTTP 400’s: 1/0.71s
    • Analysis: there seems to be a JavaScript file that results in an HTTP 403 Forbidden response and takes a total of 0.71s.
    • Recommendations: 0. Avoid any 400’s and 500’s
  • Size of JavaScript/CSS/Images: ~370kb/220kb/890kb
    • Analysis: the size of individual MIME types is always a good indicator and helps to compare against other sites and other builds. 370kb of JavaScript and 220kb of CSS can probably be reduced by using certain minification techniques or by getting rid of unused code or styles
    • Recommendations: it is hard to give a definite threshold value. Keep in mind that these files need to be downloaded and parsed by the browser; the more content there is, the more work for the browser. The goal must be to remove all information that is not needed for the current page. I often see developers packing everything into one huge global .js file. That might be a good practice, but too often only a fraction of this code is actually used by the end user. It is better to load what needs to be loaded at the beginning and delay-load additional content when it is really needed
  • Max/Average Wait Time: 4.31s/1.9s
    • Analysis: this means that resources have to wait up to 4.3s to be downloaded, and that they wait 1.9s on average. This is way too much and can be reduced either by reducing the number of resources or by spreading them across multiple domains (domain sharding) in order to allow the browser to use more physical connections.
    • Recommendations: < 20ms is good. < 50ms is acceptable (as you can see – we are FAR OFF these numbers in this example)
  • Single Resource Domains: 1
    • Analysis: from the timeline we can also see that there is one domain that only serves a single resource. In this particular case it seems to be serving an ad. We can assume that this might not be changeable, but this KPI is a good indicator of whether it is worth paying the cost of a DNS lookup and connect if we only download a single resource from a domain
    • Recommendations: 0. Try to avoid single resource domains. It is not always possible – but do it if you can

The KPIs tell me that the page is way too slow – especially the Full Page Load Time of 8.6s needs to be optimized. With the KPIs we can already think about certain areas to focus on, e.g. reducing the network roundtrips or minimizing content size. But there is much more. Let’s have a closer look into 4 different areas.

Then they get into analysis of the network, caching and JavaScript execution, and it all adds up to an F :/

  • Browser Caching: F – 175 images have a short expires header, 4 have a header in the past
  • Network: F – 201 Requests in total, 1 Redirect, 1 HTTP 400, duplicated image requests on different domains
  • Server-Side: C – 10 App-Server Requests with a total of 3.6s -> analyze server-side processing
  • JavaScript: D – Use CSS lookups by ID instead of by class name (see the sketch below)
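
The JavaScript grade above is about how DOM elements are looked up; a tiny illustrative comparison (the selectors are invented, not FIFA.com's actual markup):

JAVASCRIPT:
// A class selector forces the selector engine to scan large parts of the DOM:
var byClass = document.querySelectorAll('.player-name'); // walks many nodes

// An ID lookup is a direct lookup in the document and stays fast
// regardless of page size:
var byId = document.getElementById('match-scoreboard');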

June 10 2010

11:10

Facebook has a BigPipe to smoke competitors on performance

Remember a time when you would make fun of Facebook for having such poor performance? You would see 400 scripts loading, some of which contained code for no reason. That is now in the distant past.

Makinde Adeagbo gave that great talk at JSConf about the copious amount of code they were able to delete while speeding up the site. With folks like him and Tom Occhino on the case, you know good things are happening.

If you do a view source on the Facebook home page these days, you see a lot of this:

HTML:
<script>big_pipe = new BigPipe(null, 4, null, true);</script>
<script>big_pipe.onPageletArrive({"id":"pagelet_intentional_stream","phase":1,"is_last":false,"append":false,"bootloadable":{"ufi-tracking-js":["F+B8D","CDYbm","A5j5z","3NVRu"],"UIIntentionalStreamRefresh":["F+B8D","CDYbm","EMOa3","zwScZ","fWhta","EzjZW"]},"css":["jFmkz","z9ULo","lShFv","bh3tE","1AZL5","OxGjK"],"js":["F+B8D","CDYbm","A5j5z","fWhta","uUXWA"],"resource_map":{"fWhta":{"name":"js\/a62kak05d08cgw8o.pkg.js","type":"js","permanent":false,"src":"http:\/\/static.ak.fbcdn.net\/rsrc.php\/z1AQ7\/hash\/qkma6pho.js"},"lShFv":{"name":"css\/sprite\/autogen\/e6h3iy.css","type":"css","permanent":false,"src":"http:\/\/static.ak.fbcdn.net\/rsrc.php\/zALI5\/hash\/cngu73tz.css"},"bh3tE":{"name":"css\/sprite\/autogen\/3jkv60.css","type":"css","permanent":false,"src":"http:\/\/static.ak.fbcdn.net\/rsrc.php\/z4M49\/hash\/7wet04gi.css"},"OxGjK":{"name":"css\/1b9p1ur0qpog8cgw.pkg.css","type":"css","permanent":true,"src":"http:\/\/static.ak.fbcdn.net\/rsrc.php\/zC6TL\/hash\/1quse983.css"},"3NVRu":{"name":"js\/ufi\/tracking.js","type":"js","permanent":false,"src":"http:\/\/static.ak.fbcdn.net\/rsrc.php\/z8CIM\/hash\/7c5lvnd6.js"},"EMOa3":{"name":"js\/lib\/util\/user_activity.js","type":"js","permanent":false,"src":"http:\/\/static.ak.fbcdn.net\/rsrc.php\/z2MJ2\/hash\/7q88hxyg.js"},"EzjZW":{"name":"js\/stream\/UIIntentionalStreamRefresh.js","type":"js","permanent":false,"src":"http:\/\/static.ak.fbcdn.net\/rsrc.php\/z7LZY\/hash\/5vjds43u.js"}},"requires":[],"provides":["pagelet_controller::home_intentional_stream"],"onload":["window.__UIControllerRegistry[\"c4c0ebcac26d1c478579b3\"] = new UIPagelet(\"c4c0ebcac26d1c478579b3\", \"\\\/pagelet\\\/home\\\/intentional_stream.php\", {\"is_multi_stream\":true,\"is_prefetch\":false,\"first_load\":null}, {});; ;","share_data={max_recipients:20}","window.__UIControllerRegistry[\"c4c0ebcac36a540af71b6d\"] = new UIIntentionalStream($(\"c4c0ebcac36a540af71b6d\"), \"nile\", 1276034077, 1276032692, 5, \"lf\", 10, 0, \"[]\", \"[]\", false, 300);;
//..
</script>
 

This is BigPipe, and it is explained by this Facebook Note:

To exploit the parallelism between web server and browser, BigPipe first breaks web pages into multiple chunks called pagelets. Just as a pipelining microprocessor divides an instruction’s life cycle into multiple stages (such as “instruction fetch”, “instruction decode”, “execution”, “register write back” etc.), BigPipe breaks the page generation process into several stages:

  1. Request parsing: web server parses and sanity checks the HTTP request.
  2. Data fetching: web server fetches data from storage tier.
  3. Markup generation: web server generates HTML markup for the response.
  4. Network transport: the response is transferred from web server to browser.
  5. CSS downloading: browser downloads CSS required by the page.
  6. DOM tree construction and CSS styling: browser constructs DOM tree of the document, and then applies CSS rules on it.
  7. JavaScript downloading: browser downloads JavaScript resources referenced by the page.
  8. JavaScript execution: browser executes JavaScript code of the page.

The first three stages are executed by the web server, and the last four stages are executed by the browser. Each pagelet must go through all these stages sequentially, but BigPipe enables several pagelets to be executed simultaneously in different stages.

The picture above uses Facebook’s home page as an example to demonstrate how web pages are decomposed into pagelets. The home page consists of several pagelets: “composer pagelet”, “navigation pagelet”, “news feed pagelet”, “request box pagelet”, “ads pagelet”, “friend suggestion box” and “connection box”, etc. Each of them is independent of the others. When the “navigation pagelet” is displayed to the user, the “news feed pagelet” can still be being generated at the server.

In BigPipe, the life cycle of a user request is the following: the browser sends an HTTP request to the web server. After receiving the HTTP request and performing some sanity checks on it, the web server immediately sends back an unclosed HTML document that includes an HTML <head> tag and the first part of the <body> tag. The <head> tag includes BigPipe’s JavaScript library to interpret pagelet responses to be received later. In the <body> tag, there is a template that specifies the logical structure of the page and the placeholders for pagelets.
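
To make the pagelet flow more concrete, here is a heavily simplified sketch of what the client side of such a pipeline could look like (this is illustrative only, not Facebook’s actual BigPipe library; compare it with the onPageletArrive calls in the view-source dump above):

JAVASCRIPT:
// The server flushes one <script>arrive({...})</script> per pagelet as soon
// as that pagelet's markup is ready, without waiting for the rest of the page.
function arrive(pagelet) {
  var head = document.getElementsByTagName('head')[0];

  // 1. Inject the pagelet's markup into the placeholder the template set up.
  document.getElementById(pagelet.id).innerHTML = pagelet.html;

  // 2. Load its CSS so the pagelet can be styled as soon as it is displayed.
  for (var i = 0; i < (pagelet.css || []).length; i++) {
    var link = document.createElement('link');
    link.rel = 'stylesheet';
    link.href = pagelet.css[i];
    head.appendChild(link);
  }

  // 3. Load its JavaScript last, so scripts never hold up rendering.
  for (var j = 0; j < (pagelet.js || []).length; j++) {
    var script = document.createElement('script');
    script.src = pagelet.js[j];
    head.appendChild(script);
  }
}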

Performance results

The graph below shows performance data comparing the 75th-percentile user-perceived latency for seeing the most important content on a page (e.g. the news feed is considered the most important content on the Facebook home page) for the traditional model and for BigPipe. The data was collected by loading the Facebook home page 50 times using browsers with a cold browser cache. The graph shows that BigPipe reduces user-perceived latency by half in most browsers.

