Tuesday, April 7, 2009

Speed Up Your Web Site

Best Practices for Speeding Up Your Web Site

The Exceptional Performance team has identified a number of best practices for making web pages fast. The list includes 34 best practices divided into 7 categories.

Minimize HTTP Requests

80% of the end-user response time is spent on the front-end. Most of this time is tied up in downloading all the components in the page: images, style-sheets, scripts, Flash, etc. Reducing the number of components in turn reduces the number of HTTP requests required to render the page. This is the key to faster pages.

One way to reduce the number of components in the page is to simplify the page's design. But is there a way to build pages with richer content while also achieving fast response times? Here are some techniques for reducing the number of HTTP requests, while still supporting rich page designs.

Combined files are a way to reduce the number of HTTP requests by combining all scripts into a single script, and similarly combining all CSS into a single stylesheet. Combining files is more challenging when the scripts and stylesheets vary from page to page, but making this part of your release process improves response times.

CSS Sprites are the preferred method for reducing the number of image requests. Combine your background images into a single image and use the CSS background-image and background-position properties to display the desired image segment.

Image maps combine multiple images into a single image. The overall size is about the same, but reducing the number of HTTP requests speeds up the page. Image maps only work if the images are contiguous in the page, such as a navigation bar. Defining the coordinates of image maps can be tedious and error prone. Using image maps for navigation is not accessible too, so it's not recommended.

Inline images use the data: URL scheme to embed the image data in the actual page. This can increase the size of your HTML document. Combining inline images into your (cached) stylesheets is a way to reduce HTTP requests and avoid increasing the size of your pages. Inline images are not yet supported across all major browsers.

Reducing the number of HTTP requests in your page is the place to start. This is the most important guideline for improving performance for first time visitors. As described in Tenni Theurer's blog post Browser Cache Usage - Exposed!, 40-60% of daily visitors to your site come in with an empty cache. Making your page fast for these first time visitors is key to a better user experience.

Use a Content Delivery Network

The user's proximity to your web server has an impact on response times. Deploying your content across multiple, geographically dispersed servers will make your pages load faster from the user's perspective. But where should you start?

As a first step to implementing geographically dispersed content, don't attempt to redesign your web application to work in a distributed architecture. Depending on the application, changing the architecture could include daunting tasks such as synchronizing session state and replicating database transactions across server locations. Attempts to reduce the distance between users and your content could be delayed by, or never pass, this application architecture step.

Remember that 80-90% of the end-user response time is spent downloading all the components in the page: images, stylesheets, scripts, Flash, etc. This is the Performance Golden Rule. Rather than starting with the difficult task of redesigning your application architecture, it's better to first disperse your static content. This not only achieves a bigger reduction in response times, but it's easier thanks to content delivery networks.

A content delivery network (CDN) is a collection of web servers distributed across multiple locations to deliver content more efficiently to users. The server selected for delivering content to a specific user is typically based on a measure of network proximity. For example, the server with the fewest network hops or the server with the quickest response time is chosen.

Some large Internet companies own their own CDN, but it's cost-effective to use a CDN service provider, such as Akamai Technologies, Mirror Image Internet, or Limelight Networks. For start-up companies and private web sites, the cost of a CDN service can be prohibitive, but as your target audience grows larger and becomes more global, a CDN is necessary to achieve fast response times. At Yahoo!, properties that moved static content off their application web servers to a CDN improved end-user response times by 20% or more. Switching to a CDN is a relatively easy code change that will dramatically improve the speed of your web site.

Add an Expires or a Cache-Control Header

There are two things in this rule:

  • For static components: implement "Never expire" policy by setting far future Expires header
  • For dynamic components: use an appropriate Cache-Control header to help the browser with conditional requests

Web page designs are getting richer and richer, which means more scripts, stylesheets, images, and Flash in the page. A first-time visitor to your page may have to make several HTTP requests, but by using the Expires header you make those components cacheable. This avoids unnecessary HTTP requests on subsequent page views. Expires headers are most often used with images, but they should be used on all components including scripts, stylesheets, and Flash components.

Browsers (and proxies) use a cache to reduce the number and size of HTTP requests, making web pages load faster. A web server uses the Expires header in the HTTP response to tell the client how long a component can be cached. This is a far future Expires header, telling the browser that this response won't be stale until April 15, 2010.

      Expires: Thu, 15 Apr 2010 20:00:00 GMT

If your server is Apache, use the ExpiresDefault directive to set an expiration date relative to the current date. This example of the ExpiresDefault directive sets the Expires date 10 years out from the time of the request.

      ExpiresDefault "access plus 10 years"

Keep in mind, if you use a far future Expires header you have to change the component's filename whenever the component changes. At Yahoo! we often make this step part of the build process: a version number is embedded in the component's filename, for example, yahoo_2.0.6.js.

Using a far future Expires header affects page views only after a user has already visited your site. It has no effect on the number of HTTP requests when a user visits your site for the first time and the browser's cache is empty. Therefore the impact of this performance improvement depends on how often users hit your pages with a primed cache. (A "primed cache" already contains all of the components in the page.) We measured this at Yahoo! and found the number of page views with a primed cache is 75-85%. By using a far future Expires header, you increase the number of components that are cached by the browser and re-used on subsequent page views without sending a single byte over the user's Internet connection.

Gzip Components

The time it takes to transfer an HTTP request and response across the network can be significantly reduced by decisions made by front-end engineers. It's true that the end-user's bandwidth speed, Internet service provider, proximity to peering exchange points, etc. are beyond the control of the development team. But there are other variables that affect response times. Compression reduces response times by reducing the size of the HTTP response.

Starting with HTTP/1.1, web clients indicate support for compression with the Accept-Encoding header in the HTTP request.

      Accept-Encoding: gzip, deflate

If the web server sees this header in the request, it may compress the response using one of the methods listed by the client. The web server notifies the web client of this via the Content-Encoding header in the response.

      Content-Encoding: gzip

Gzip is the most popular and effective compression method at this time. It was developed by the GNU project and standardized by RFC 1952. The only other compression format you're likely to see is deflate, but it's less effective and less popular.

Gzipping generally reduces the response size by about 70%. Approximately 90% of today's Internet traffic travels through browsers that claim to support gzip. If you use Apache, the module configuring gzip depends on your version: Apache 1.3 uses mod_gzip while Apache 2.x uses mod_deflate.

There are known issues with browsers and proxies that may cause a mismatch in what the browser expects and what it receives with regard to compressed content. Fortunately, these edge cases are dwindling as the use of older browsers drops off. The Apache modules help out by adding appropriate Vary response headers automatically.

Servers choose what to gzip based on file type, but are typically too limited in what they decide to compress. Most web sites gzip their HTML documents. It's also worthwhile to gzip your scripts and stylesheets, but many web sites miss this opportunity. In fact, it's worthwhile to compress any text response including XML and JSON. Image and PDF files should not be gzipped because they are already compressed. Trying to gzip them not only wastes CPU but can potentially increase file sizes.

Gzipping as many file types as possible is an easy way to reduce page weight and accelerate the user experience.

Put Stylesheets at the Top

While researching performance at Yahoo!, we discovered that moving stylesheets to the document HEAD makes pages appear to be loading faster. This is because putting stylesheets in the HEAD allows the page to render progressively.

Front-end engineers that care about performance want a page to load progressively; that is, we want the browser to display whatever content it has as soon as possible. This is especially important for pages with a lot of content and for users on slower Internet connections. The importance of giving users visual feedback, such as progress indicators, has been well researched and documented. In our case the HTML page is the progress indicator! When the browser loads the page progressively the header, the navigation bar, the logo at the top, etc. all serve as visual feedback for the user who is waiting for the page. This improves the overall user experience.

The problem with putting stylesheets near the bottom of the document is that it prohibits progressive rendering in many browsers, including Internet Explorer. These browsers block rendering to avoid having to redraw elements of the page if their styles change. The user is stuck viewing a blank white page.

The HTML specification clearly states that stylesheets are to be included in the HEAD of the page: "Unlike A, [LINK] may only appear in the HEAD section of a document, although it may appear any number of times." Neither of the alternatives, the blank white screen or flash of unstyled content, are worth the risk. The optimal solution is to follow the HTML specification and load your stylesheets in the document HEAD.

Put Scripts at the Bottom

The problem caused by scripts is that they block parallel downloads. The HTTP/1.1 specification suggests that browsers download no more than two components in parallel per hostname. If you serve your images from multiple hostnames, you can get more than two downloads to occur in parallel. While a script is downloading, however, the browser won't start any other downloads, even on different hostnames.

In some situations it's not easy to move scripts to the bottom. If, for example, the script uses document.write to insert part of the page's content, it can't be moved lower in the page. There might also be scoping issues. In many cases, there are ways to workaround these situations.

An alternative suggestion that often comes up is to use deferred scripts. The DEFER attribute indicates that the script does not contain document.write, and is a clue to browsers that they can continue rendering. Unfortunately, Firefox doesn't support the DEFER attribute. In Internet Explorer, the script may be deferred, but not as much as desired. If a script can be deferred, it can also be moved to the bottom of the page. That will make your web pages load faster.

Avoid CSS Expressions

CSS expressions are a powerful (and dangerous) way to set CSS properties dynamically. They're supported in Internet Explorer, starting with version 5. As an example, the background color could be set to alternate every hour using CSS expressions.

      background-color: expression( (new Date()).getHours()%2 ? "#B8D4FF" : "#F08A00" );

As shown here, the expression method accepts a JavaScript expression. The CSS property is set to the result of evaluating the JavaScript expression. The expression method is ignored by other browsers, so it is useful for setting properties in Internet Explorer needed to create a consistent experience across browsers.

The problem with expressions is that they are evaluated more frequently than most people expect. Not only are they evaluated when the page is rendered and resized, but also when the page is scrolled and even when the user moves the mouse over the page. Adding a counter to the CSS expression allows us to keep track of when and how often a CSS expression is evaluated. Moving the mouse around the page can easily generate more than 10,000 evaluations.

One way to reduce the number of times your CSS expression is evaluated is to use one-time expressions, where the first time the expression is evaluated it sets the style property to an explicit value, which replaces the CSS expression. If the style property must be set dynamically throughout the life of the page, using event handlers instead of CSS expressions is an alternative approach. If you must use CSS expressions, remember that they may be evaluated thousands of times and could affect the performance of your page.

Make JavaScript and CSS External

Many of these performance rules deal with how external components are managed. However, before these considerations arise you should ask a more basic question: Should JavaScript and CSS be contained in external files, or inlined in the page itself?

Using external files in the real world generally produces faster pages because the JavaScript and CSS files are cached by the browser. JavaScript and CSS that are inlined in HTML documents get downloaded every time the HTML document is requested. This reduces the number of HTTP requests that are needed, but increases the size of the HTML document. On the other hand, if the JavaScript and CSS are in external files cached by the browser, the size of the HTML document is reduced without increasing the number of HTTP requests.

The key factor, then, is the frequency with which external JavaScript and CSS components are cached relative to the number of HTML documents requested. This factor, although difficult to quantify, can be gauged using various metrics. If users on your site have multiple page views per session and many of your pages re-use the same scripts and stylesheets, there is a greater potential benefit from cached external files.

Many web sites fall in the middle of these metrics. For these sites, the best solution generally is to deploy the JavaScript and CSS as external files. The only exception where inlining is preferable is with home pages, such as Yahoo!'s front page and My Yahoo!. Home pages that have few (perhaps only one) page view per session may find that inlining JavaScript and CSS results in faster end-user response times.

For front pages that are typically the first of many page views, there are techniques that leverage the reduction of HTTP requests that inlining provides, as well as the caching benefits achieved through using external files. One such technique is to inline JavaScript and CSS in the front page, but dynamically download the external files after the page has finished loading. Subsequent pages would reference the external files that should already be in the browser's cache.

Reduce DNS Lookups

The Domain Name System (DNS) maps hostnames to IP addresses, just as phonebooks map people's names to their phone numbers. When you type www.yahoo.com into your browser, a DNS resolver contacted by the browser returns that server's IP address. DNS has a cost. It typically takes 20-120 milliseconds for DNS to lookup the IP address for a given hostname. The browser can't download anything from this hostname until the DNS lookup is completed.

DNS lookups are cached for better performance. This caching can occur on a special caching server, maintained by the user's ISP or local area network, but there is also caching that occurs on the individual user's computer. The DNS information remains in the operating system's DNS cache (the "DNS Client service" on Microsoft Windows). Most browsers have their own caches, separate from the operating system's cache. As long as the browser keeps a DNS record in its own cache, it doesn't bother the operating system with a request for the record.

Internet Explorer caches DNS lookups for 30 minutes by default, as specified by the DnsCacheTimeout registry setting. Firefox caches DNS lookups for 1 minute, controlled by the network.dnsCacheExpiration configuration setting. (Fasterfox changes this to 1 hour.)

When the client's DNS cache is empty (for both the browser and the operating system), the number of DNS lookups is equal to the number of unique hostnames in the web page. This includes the hostnames used in the page's URL, images, script files, stylesheets, Flash objects, etc. Reducing the number of unique hostnames reduces the number of DNS lookups.

Reducing the number of unique hostnames has the potential to reduce the amount of parallel downloading that takes place in the page. Avoiding DNS lookups cuts response times, but reducing parallel downloads may increase response times. My guideline is to split these components across at least two but no more than four hostnames. This results in a good compromise between reducing DNS lookups and allowing a high degree of parallel downloads.

Minify JavaScript and CSS

Minification is the practice of removing unnecessary characters from code to reduce its size thereby improving load times. When code is minified all comments are removed, as well as unneeded white space characters (space, newline, and tab). In the case of JavaScript, this improves response time performance because the size of the downloaded file is reduced. Two popular tools for minifying JavaScript code are JSMin and YUI Compressor. The YUI compressor can also minify CSS.

Obfuscation is an alternative optimization that can be applied to source code. It's more complex than minification and thus more likely to generate bugs as a result of the obfuscation step itself. In a survey of ten top U.S. web sites, minification achieved a 21% size reduction versus 25% for obfuscation. Although obfuscation has a higher size reduction, minifying JavaScript is less risky.

In addition to minifying external scripts and styles, inlined <script> and <style> blocks can and should also be minified. Even if you gzip your scripts and styles, minifying them will still reduce the size by 5% or more. As the use and size of JavaScript and CSS increases, so will the savings gained by minifying your code.

Avoid Redirects

tag: content

Redirects are accomplished using the 301 and 302 status codes. Here's an example of the HTTP headers in a 301 response:

      HTTP/1.1 301 Moved Permanently

      Location: http://example.com/newuri

      Content-Type: text/html

The browser automatically takes the user to the URL specified in the Location field. All the information necessary for a redirect is in the headers. The body of the response is typically empty. Despite their names, neither a 301 nor a 302 response is cached in practice unless additional headers, such as Expires or Cache-Control, indicate it should be. The meta refresh tag and JavaScript are other ways to direct users to a different URL, but if you must do a redirect, the preferred technique is to use the standard 3xx HTTP status codes, primarily to ensure the back button works correctly.

The main thing to remember is that redirects slow down the user experience. Inserting a redirect between the user and the HTML document delays everything in the page since nothing in the page can be rendered and no components can start being downloaded until the HTML document has arrived.

One of the most wasteful redirects happens frequently and web developers are generally not aware of it. It occurs when a trailing slash (/) is missing from a URL that should otherwise have one. For example, going to http://astrology.yahoo.com/astrology results in a 301 response containing a redirect to http://astrology.yahoo.com/astrology/ (notice the added trailing slash). This is fixed in Apache by using Alias or mod_rewrite, or the DirectorySlash directive if you're using Apache handlers.

Connecting an old web site to a new one is another common use for redirects. Others include connecting different parts of a website and directing the user based on certain conditions (type of browser, type of user account, etc.). Using a redirect to connect two web sites is simple and requires little additional coding. Although using redirects in these situations reduces the complexity for developers, it degrades the user experience. Alternatives for this use of redirects include using Alias and mod_rewrite if the two code paths are hosted on the same server. If a domain name change is the cause of using redirects, an alternative is to create a CNAME (a DNS record that creates an alias pointing from one domain name to another) in combination with Alias or mod_rewrite.

Remove Duplicate Scripts

It hurts performance to include the same JavaScript file twice in one page. This isn't as unusual as you might think. A review of the ten top U.S. web sites shows that two of them contain a duplicated script. Two main factors increase the odds of a script being duplicated in a single web page: team size and number of scripts. When it does happen, duplicate scripts hurt performance by creating unnecessary HTTP requests and wasted JavaScript execution.

Unnecessary HTTP requests happen in Internet Explorer, but not in Firefox. In Internet Explorer, if an external script is included twice and is not cacheable, it generates two HTTP requests during page loading. Even if the script is cacheable, extra HTTP requests occur when the user reloads the page.

In addition to generating wasteful HTTP requests, time is wasted evaluating the script multiple times. This redundant JavaScript execution happens in both Firefox and Internet Explorer, regardless of whether the script is cacheable.

One way to avoid accidentally including the same script twice is to implement a script management module in your templating system. The typical way to include a script is to use the SCRIPT tag in your HTML page.

      <script type="text/javascript" src="menu_1.0.17.js"></script>

An alternative in PHP would be to create a function called insertScript.

      <?php insertScript("menu.js") ?>

In addition to preventing the same script from being inserted multiple times, this function could handle other issues with scripts, such as dependency checking and adding version numbers to script filenames to support far future Expires headers.

Configure ETags

Entity tags (ETags) are a mechanism that web servers and browsers use to determine whether the component in the browser's cache matches the one on the origin server. (An "entity" is another word a "component": images, scripts, stylesheets, etc.) ETags were added to provide a mechanism for validating entities that is more flexible than the last-modified date. An ETag is a string that uniquely identifies a specific version of a component. The only format constraints are that the string be quoted. The origin server specifies the component's ETag using the ETag response header.

      HTTP/1.1 200 OK

      Last-Modified: Tue, 12 Dec 2006 03:03:59 GMT

      ETag: "10c24bc-4ab-457e1c1f"

      Content-Length: 12195

Later, if the browser has to validate a component, it uses the If-None-Match header to pass the ETag back to the origin server. If the ETags match, a 304 status code is returned reducing the response by 12195 bytes for this example.

      GET /i/yahoo.gif HTTP/1.1

      Host: us.yimg.com

      If-Modified-Since: Tue, 12 Dec 2006 03:03:59 GMT

      If-None-Match: "10c24bc-4ab-457e1c1f"

      HTTP/1.1 304 Not Modified

The problem with ETags is that they typically are constructed using attributes that make them unique to a specific server hosting a site. ETags won't match when a browser gets the original component from one server and later tries to validate that component on a different server, a situation that is all too common on Web sites that use a cluster of servers to handle requests. By default, both Apache and IIS embed data in the ETag that dramatically reduces the odds of the validity test succeeding on web sites with multiple servers.

The ETag format for Apache 1.3 and 2.x is inode-size-timestamp. Although a given file may reside in the same directory across multiple servers, and have the same file size, permissions, timestamp, etc., its inode is different from one server to the next.

IIS 5.0 and 6.0 have a similar issue with ETags. The format for ETags on IIS is Filetimestamp:ChangeNumber. A ChangeNumber is a counter used to track configuration changes to IIS. It's unlikely that the ChangeNumber is the same across all IIS servers behind a web site.

The end result is ETags generated by Apache and IIS for the exact same component won't match from one server to another. If the ETags don't match, the user doesn't receive the small, fast 304 response that ETags were designed for; instead, they'll get a normal 200 response along with all the data for the component. If you host your web site on just one server, this isn't a problem. But if you have multiple servers hosting your web site, and you're using Apache or IIS with the default ETag configuration, your users are getting slower pages, your servers have a higher load, you're consuming greater bandwidth, and proxies aren't caching your content efficiently. Even if your components have a far future Expires header, a conditional GET request is still made whenever the user hits Reload or Refresh.

If you're not taking advantage of the flexible validation model that ETags provide, it's better to just remove the ETag altogether. The Last-Modified header validates based on the component's timestamp. And removing the ETag reduces the size of the HTTP headers in both the response and subsequent requests. This Microsoft Support article describes how to remove ETags. In Apache, this is done by simply adding the following line to your Apache configuration file:

      FileETag none

Make Ajax Cacheable

tag: content

One of the cited benefits of Ajax is that it provides instantaneous feedback to the user because it requests information asynchronously from the backend web server. However, using Ajax is no guarantee that the user won't be twiddling his thumbs waiting for those asynchronous JavaScript and XML responses to return. In many applications, whether or not the user is kept waiting depends on how Ajax is used. For example, in a web-based email client the user will be kept waiting for the results of an Ajax request to find all the email messages that match their search criteria. It's important to remember that "asynchronous" does not imply "instantaneous".

To improve performance, it's important to optimize these Ajax responses. The most important way to improve the performance of Ajax is to make the responses cacheable, as discussed in Add an Expires or a Cache-Control Header. Some of the other rules also apply to Ajax:

Let's look at an example. A Web 2.0 email client might use Ajax to download the user's address book for autocompletion. If the user hasn't modified her address book since the last time she used the email web app, the previous address book response could be read from cache if that Ajax response was made cacheable with a future Expires or Cache-Control header. The browser must be informed when to use a previously cached address book response versus requesting a new one. This could be done by adding a timestamp to the address book Ajax URL indicating the last time the user modified her address book, for example, &t=1190241612. If the address book hasn't been modified since the last download, the timestamp will be the same and the address book will be read from the browser's cache eliminating an extra HTTP roundtrip. If the user has modified her address book, the timestamp ensures the new URL doesn't match the cached response, and the browser will request the updated address book entries.

Even though your Ajax responses are created dynamically, and might only be applicable to a single user, they can still be cached. Doing so will make your Web 2.0 apps faster.

Flush the Buffer Early

When users request a page, it can take anywhere from 200 to 500ms for the backend server to stitch together the HTML page. During this time, the browser is idle as it waits for the data to arrive. In PHP you have the function flush(). It allows you to send your partially ready HTML response to the browser so that the browser can start fetching components while your backend is busy with the rest of the HTML page. The benefit is mainly seen on busy backends or light frontends.

A good place to consider flushing is right after the HEAD because the HTML for the head is usually easier to produce and it allows you to include any CSS and JavaScript files for the browser to start fetching in parallel while the backend is still processing.

Example:

      ... <!-- css, js -->

    </head>

    <?php flush(); ?>

    <body>

      ... <!-- content -->

Yahoo! search pioneered research and real user testing to prove the benefits of using this technique.

Use GET for AJAX Requests

The Yahoo! Mail team found that when using XMLHttpRequest, POST is implemented in the browsers as a two-step process: sending the headers first, then sending data. So it's best to use GET, which only takes one TCP packet to send (unless you have a lot of cookies). The maximum URL length in IE is 2K, so if you send more than 2K data you might not be able to use GET.

An interesting side affect is that POST without actually posting any data behaves like GET. Based on the HTTP specs, GET is meant for retrieving information, so it makes sense (semantically) to use GET when you're only requesting data, as opposed to sending data to be stored server-side.

Post-load Components

You can take a closer look at your page and ask yourself: "What's absolutely required in order to render the page initially?". The rest of the content and components can wait.

JavaScript is an ideal candidate for splitting before and after the onload event. For example if you have JavaScript code and libraries that do drag and drop and animations, those can wait, because dragging elements on the page comes after the initial rendering. Other places to look for candidates for post-loading include hidden content (content that appears after a user action) and images below the fold.

Tools to help you out in your effort: YUI Image Loader allows you to delay images below the fold and the YUI Get utility is an easy way to include JS and CSS on the fly. For an example in the wild take a look at Yahoo! Home Page with Firebug's Net Panel turned on.

It's good when the performance goals are inline with other web development best practices. In this case, the idea of progressive enhancement tells us that JavaScript, when supported, can improve the user experience but you have to make sure the page works even without JavaScript. So after you've made sure the page works fine, you can enhance it with some post-loaded scripts that give you more bells and whistles such as drag and drop and animations.

Preload Components

Preload may look like the opposite of post-load, but it actually has a different goal. By preloading components you can take advantage of the time the browser is idle and request components (like images, styles and scripts) you'll need in the future. This way when the user visits the next page, you could have most of the components already in the cache and your page will load much faster for the user.

There are actually several types of preloading:

  • Unconditional preload - as soon as onload fires, you go ahead and fetch some extra components. Check google.com for an example of how a sprite image is requested onload. This sprite image is not needed on the google.com homepage, but it is needed on the consecutive search result page.
  • Conditional preload - based on a user action you make an educated guess where the user is headed next and preload accordingly. On search.yahoo.com you can see how some extra components are requested after you start typing in the input box.
  • Anticipated preload - preload in advance before launching a redesign. It often happens after a redesign that you hear: "The new site is cool, but it's slower than before". Part of the problem could be that the users were visiting your old site with a full cache, but the new one is always an empty cache experience. You can mitigate this side effect by preloading some components before you even launched the redesign. Your old site can use the time the browser is idle and request images and scripts that will be used by the new site

Reduce the Number of DOM Elements

A complex page means more bytes to download and it also means slower DOM access in JavaScript. It makes a difference if you loop through 500 or 5000 DOM elements on the page when you want to add an event handler for example.

A high number of DOM elements can be a symptom that there's something that should be improved with the markup of the page without necessarily removing content. Are you using nested tables for layout purposes? Are you throwing in more <div>s only to fix layout issues? Maybe there's a better and more semantically correct way to do your markup.

A great help with layouts are the YUI CSS utilities: grids.css can help you with the overall layout, fonts.css and reset.css can help you strip away the browser's defaults formatting. This is a chance to start fresh and think about your markup, for example use <div>s only when it makes sense semantically, and not because it renders a new line.

The number of DOM elements is easy to test, just type in Firebug's console:
document.getElementsByTagName('*').length

And how many DOM elements are too many? Check other similar pages that have good markup. For example the Yahoo! Home Page is a pretty busy page and still under 700 elements (HTML tags).

Split Components Across Domains

Splitting components allows you to maximize parallel downloads. Make sure you're using not more than 2-4 domains because of the DNS lookup penalty. For example, you can host your HTML and dynamic content on www.example.org and split static components between static1.example.org and static2.example.org

For more information check "Maximizing Parallel Downloads in the Carpool Lane" by Tenni Theurer and Patty Chi.

Minimize the Number of iframes

Iframes allow an HTML document to be inserted in the parent document. It's important to understand how iframes work so they can be used effectively.

<iframe> pros:

  • Helps with slow third-party content like badges and ads
  • Security sandbox
  • Download scripts in parallel

<iframe> cons:

  • Costly even if blank
  • Blocks page onload
  • Non-semantic


 


 

No 404s

HTTP requests are expensive so making an HTTP request and getting a useless response (i.e. 404 Not Found) is totally unnecessary and will slow down the user experience without any benefit.

Some sites have helpful 404s "Did you mean X?", which is great for the user experience but also wastes server resources (like database, etc). Particularly bad is when the link to an external JavaScript is wrong and the result is a 404. First, this download will block parallel downloads. Next the browser may try to parse the 404 response body as if it were JavaScript code, trying to find something usable in it.

Reduce Cookie Size

HTTP cookies are used for a variety of reasons such as authentication and personalization. Information about cookies is exchanged in the HTTP headers between web servers and browsers. It's important to keep the size of cookies as low as possible to minimize the impact on the user's response time.

For more information check "When the Cookie Crumbles" by Tenni Theurer and Patty Chi. The take-home of this research:

  • Eliminate unnecessary cookies
  • Keep cookie sizes as low as possible to minimize the impact on the user response time
  • Be mindful of setting cookies at the appropriate domain level so other sub-domains are not affected
  • Set an Expires date appropriately. An earlier Expires date or none removes the cookie sooner, improving the user response time

Use Cookie-free Domains for Components

When the browser makes a request for a static image and sends cookies together with the request, the server doesn't have any use for those cookies. So they only create network traffic for no good reason. You should make sure static components are requested with cookie-free requests. Create a subdomain and host all your static components there.

If your domain is www.example.org, you can host your static components on static.example.org. However, if you've already set cookies on the top-level domain example.org as opposed to www.example.org, then all the requests to static.example.org will include those cookies. In this case, you can buy a whole new domain, host your static components there, and keep this domain cookie-free. Yahoo! uses yimg.com, YouTube uses ytimg.com, Amazon uses images-amazon.com and so on.

Another benefit of hosting static components on a cookie-free domain is that some proxies might refuse to cache the components that are requested with cookies. On a related note, if you wonder if you should use example.org or www.example.org for your home page, consider the cookie impact. Omitting www leaves you no choice but to write cookies to *.example.org, so for performance reasons it's best to use the www subdomain and write the cookies to that subdomain.

Minimize DOM Access

Accessing DOM elements with JavaScript is slow so in order to have a more responsive page, you should:

  • Cache references to accessed elements
  • Update nodes "offline" and then add them to the tree
  • Avoid fixing layout with JavaScript

For more information check the YUI theatre's "High Performance Ajax Applications" by Julien Lecomte.

Develop Smart Event Handlers

Sometimes pages feel less responsive because of too many event handlers attached to different elements of the DOM tree which are then executed too often. That's why using event delegation is a good approach. If you have 10 buttons inside a div, attach only one event handler to the div wrapper, instead of one handler for each button. Events bubble up so you'll be able to catch the event and figure out which button it originated from.

You also don't need to wait for the onload event in order to start doing something with the DOM tree. Often all you need is the element you want to access to be available in the tree. You don't have to wait for all images to be downloaded. DOMContentLoaded is the event you might consider using instead of onload, but until it's available in all browsers, you can use the YUI Event utility, which has an onAvailable method.

For more information check the YUI theatre's "High Performance Ajax Applications" by Julien Lecomte.

Choose <link> over @import

One of the previous best practices states that CSS should be at the top in order to allow for progressive rendering.

In IE @import behaves the same as using <link> at the bottom of the page, so it's best not to use it.

Avoid Filters

The IE-proprietary AlphaImageLoader filter aims to fix a problem with semi-transparent true color PNGs in IE versions < 7. The problem with this filter is that it blocks rendering and freezes the browser while the image is being downloaded. It also increases memory consumption and is applied per element, not per image, so the problem is multiplied.

The best approach is to avoid AlphaImageLoader completely and use gracefully degrading PNG8 instead, which are fine in IE. If you absolutely need AlphaImageLoader, use the underscore hack _filter as to not penalize your IE7+ users.

Optimize Images

After a designer is done with creating the images for your web page, there are still some things you can try before you FTP those images to your web server.

  • You can check the GIFs and see if they are using a palette size corresponding to the number of colors in the image. Using imagemagick it's easy to check using
    identify -verbose image.gif
    When you see an image useing 4 colors and a 256 color "slots" in the palette, there is room for improvement.
  • Try converting GIFs to PNGs and see if there is a saving. More often than not, there is. Developers often hesitate to use PNGs due to the limited support in browsers, but this is now a thing of the past. The only real problem is alpha-transparency in true color PNGs, but then again, GIFs are not true color and don't support variable transparency either. So anything a GIF can do, a palette PNG (PNG8) can do too (except for animations). This simple imagemagick command results in totally safe-to-use PNGs:
    convert image.gif image.png
    "All we are saying is: Give PiNG a Chance!"
  • Run pngcrush (or any other PNG optimizer tool) on all your PNGs. Example:
    pngcrush image.png -rem alla -reduce -brute result.png
  • Run jpegtran on all your JPEGs. This tool does lossless JPEG operations such as rotation and can also be used to optimize and remove comments and other useless information (such as EXIF information) from your images.
    jpegtran -copy none -optimize -perfect src.jpg dest.jpg

Optimize CSS Sprites

  • Arranging the images in the sprite horizontally as opposed to vertically usually results in a smaller file size.
  • Combining similar colors in a sprite helps you keep the color count low, ideally under 256 colors so to fit in a PNG8.
  • "Be mobile-friendly" and don't leave big gaps between the images in a sprite. This doesn't affect the file size as much but requires less memory for the user agent to decompress the image into a pixel map. 100x100 image is 10 thousand pixels, where 1000x1000 is 1 million pixels

Don't Scale Images in HTML

Don't use a bigger image than you need just because you can set the width and height in HTML. If you need
<img width="100" height="100" src="mycat.jpg" alt="My Cat" />
then your image (mycat.jpg) should be 100x100px rather than a scaled down 500x500px image.

Make favicon.ico Small and Cacheable

The favicon.ico is an image that stays in the root of your server. It's a necessary evil because even if you don't care about it the browser will still request it, so it's better not to respond with a 404 Not Found. Also since it's on the same server, cookies are sent every time it's requested. This image also interferes with the download sequence, for example in IE when you request extra components in the onload, the favicon will be downloaded before these extra components.

So to mitigate the drawbacks of having a favicon.ico make sure:

  • It's small, preferably under 1K.
  • Set Expires header with what you feel comfortable (since you cannot rename it if you decide to change it). You can probably safely set the Expires header a few months in the future. You can check the last modified date of your current favicon.ico to make an informed decision.

Imagemagick can help you create small favicons

Keep Components under 25K

This restriction is related to the fact that iPhone won't cache components bigger than 25K. Note that this is the uncompressed size. This is where minification is important because gzip alone may not be sufficient.

For more information check "Performance Research, Part 5: iPhone Cacheability - Making it Stick" by Wayne Shea and Tenni Theurer.

Pack Components into a Multipart Document

Packing components into a multipart document is like an email with attachments, it helps you fetch several components with one HTTP request (remember: HTTP requests are expensive). When you use this technique, first check if the user agent supports it (iPhone does not).

Reducing Page Load Time
Avoid excessively large images, redundant tags, and nested tables to facilitate faster page loads. Always avoid unnecessary roundtrips to the web server. Use client side scripts to dramatically reduce server roundtrips, thereby boosting the perceived, if not the actual application performance. Take advantage of the Page.IsPostback property to avoid unnecessary server processing on a roundtrip, reducing network traffic. I suggest you leave page buffering on (it is turned on by default) for a page unless you have a specific reason to turn it off.

You can pre-compile web pages in your application to reduce the working set size and boost application performance. Set AutoEventWireup attribute to false in the section of the server's Machine.config file to improve performance further, e.g.:

   <configuration> 

     <system.web> 

       <pages autoEventWireup="true|false" /> 

     </system.web> 

   </configuration>


The AutoEventWireup attribute accepts a Boolean value that indicates whether the ASP.NET pages events are auto-wired. If the AutoEventWireup is set to false, the runtime does not have to look for each of the page event handlers. This MSDN article about the AutoEventWireup Event concludes with, "When you explicitly set AutoEventWireup to true, Visual Studio .NET or Visual Studio 2005, by default, generates code to bind events to their event-handler methods. At the same time, the ASP.NET page framework automatically calls the event-handler methods based on their predefined names. This can lead to the same event-handler method being called two times when the page runs." The article therefore recommends that you always set AutoEventWireup to false while working in Visual Studio .NET.


 

Efficient ASP.NET State Management Practices
ViewState is great for storing control state but can degrade performance—especially on web sites with large page sizes. If you've ever looked at a page that contains a large DataSet, you know the amount of data stored in ViewState can be overwhelming. Every byte added to a web page by enabling its ViewState causes two bytes of network traffic, one in each direction. Evaluate whether each web page you write requires ViewState and avoid it when possible to speed up the page-load cycle in your applications. You should typically use ViewState only for controls that need to persist state. You can turn ViewState on or off at four levels: machine, application, page, and control. Limiting the size of the ViewState and eliminating its unnecessary usage would boost the application performance to a large extent due to the reduced size of the rendered pages and hence the network traffic. It should be noted that each byte of the view state causes two bytes of network traffic for each request, one from the server to the client and the other from the client to the server. For more information on ViewState and how to turn it off at control, page, application, or machine levels, see this article .

You can remove the runat="server" form tag completely to reduce page size by 20 bytes. If you don't remove this tag, the page itself passes on about 20 bytes of information to ViewState—even when the page's ViewState property is set to false.

Caching is one of the best strategies for storing relatively static application data. Caching reads data from memory to avoid repeatedly retrieving data from a database, file, or any other repository, and it can provide huge application performance gains. Use Page Output, Page Fragment or Data Caching directives depending on your requirements. Cache application-wide data that multiple users of the application need to share and access, but avoid storing user-specific data in the cache.

Use Session State only for storing a single user's session data. Avoid storing too many objects in the Session and turn Session State off for pages that do not need to access session data. You can turn Session State on or off at the same four levels as ViewState: machine, application, page, and control.

Note that there are three Session State storage modes. The right type of storage mode to choose depends on factors such as, speed, security, scalability, and reliability. Even though the InProc mode of Session State storage is the fastest, it is not at all well suited to a production environment and not scalable for large sites. The OutProc storage mode is well-suited for web sites with heavy traffic, while the SqlServer mode is great for securing session data. No matter which mode you choose though, Session State has one major disadvantage: the resource will be increasingly strained as you scale up. There are always tradeoffs. The best mode for security, scalability, and reliability is not always the best mode for performance, and vice versa.


 


 

Tips for Efficient Memory and Resource Management
I've listed a few tips in this section that can help you avoid problems. I don't have room to explain each tip in-depth here; instead, I've provided a basic description and added links where appropriate so you can explore further on your own.


 

Try never to refer to a short-lived object from a long lived one to avoid promoting short-lived objects to higher generations. Note that the .NET Garbage Collector (GC) works more frequently against lower generations than against higher ones. An in-depth discussion of the .NET GC is beyond the scope of this article, but this two-part MSDN article on garbage collection (Part 1, Part 2) provides more background

Avoid any code that can promote an object to a higher generation unnecessarily as shown in the code snippet below:

   class Employee

   {

    EmployeeRegister empRegister;

     void Create (int empCode, int deptCode, double basic)

      {

       EmployeeRegister empRegister = new EmployeeRegister();

       empRegister.Create(empCode, deptCode, basic);

       this.empRegister = empRegister;

      }

   }


 

Use Dispose and Finalize appropriately to handle unmanaged resources (for more information, see my article When and How to Use Dispose and Finalize in C#. In fact, you should avoid implementing the Finalize method as much as possible.

Set any objects no longer required to null prior to making any long-running calls. Developers often set locals to null after they're done using them, but in .NET, that's not required, as the GC can safely determine when objects are no longer needed or are not reachable, thus making them eligible for garbage collection. However, if you use machine resources such as files, database connections, or unmanaged objects, remember to release them in a finally block—never in a catch block.

Author's Note: In C#, it's best to wrap resource-handling code within a using block to ensure that the resources are disposed of properly when they're no longer needed. When you use the using statement, the .NET framework implicitly creates finally blocks for those objects that implement IDisposable.


Acquire resources and locks late and dispose of them as soon as possible. Ensure that you free or dispose unneeded resources and locks properly. In fact, I recommend that you avoid locking and synchronization altogether unless it's absolutely necessary.  Do not lock on the this object—it's better to lock on a private object.  If you lock on the current object instance (this), all the members of the object are also locked irrespective of whether a lock is required on them. Do not lock on object types, only on objects themselves.

Efficient Exception Management
Exceptions are errors that occur at run time. Exceptions are generally well understood but quite often used improperly. An understanding of proper exception handling is one of the more important aspects of efficient programming.

Handle only those exceptions that you know how to handle and avoid handling those that you don't know how to handle. As Eric Gunnerson, a former member of Microsoft's C# team, says in his blog, "the essence of exception handling is to be able to respond to a specific exception in a specific situation."

When possible, reduce the number of try...catch blocks in your code; exceptions take longer to be processed and can reduce application performance drastically. Do not use exceptions to control an application's logic flow. Use proper validation techniques instead of exceptions to avoid throwing unnecessary exceptions.

Efficient Data Access Strategies
Prefer DataReaders to DataSets. DataReaders are much faster than DataSets for simple sequential data access. Use a DataReader for fast data-rendering, but not to pass data between layers of the applications or through different application domains—unlike DataSets which can work in disconnected mode and can be cached, DataReaders always require an open connection. When you do have to use a DataSet, you can often set its EnableConstraints property to false and use the BeginLoadData and EndLoadData methods for faster data rendering. If you are using transactions keep them short to minimize the lock durations and to improve data concurrency.

Open database connections as late as possible and close them immediately when done. Effective use of connection pooling can increase application performance by reducing the number of roundtrips to the server. Note that each database request in a distributed environment creates network traffic that can cause performance bottlenecks and degrade the application's performance. Try to minimize the number of database connections in the application. Never hold a connection open, because that decreases the number of available connections in the connection pool and hence degrades performance if the demand for connections exceeds the pool count. See this article for more information on connection pooling.


 

Efficient Coding Practices
Avoid late binding whenever possible. Late-binding adds flexibility but is always slow compared to early binding. Note that using virtual methods in your code requires late binding, because virtual methods must be mapped. Any class that contains a virtual method has its own virtual table. The virtual table in turn contains entries that correspond to the virtual methods that the class contains. Note that the virtual method is class specific; there can be only one virtual table per class regardless of how many virtual methods the class contains. The runtime uses the virtual table to map a virtual method to the object on which it is called. Hence there's additional overhead involved (more resource usage—both processor and memory) in binding a virtual method to the object on which it is called to satisfy a virtual method call.

You should seal final classes (those that cannot be inherited) for performance gains.

Avoid recursive method calls and try to replace them with loops instead. Inline frequently called code inside loops. Avoid calling methods or properties repetitively inside a loop. Especially, avoid situations that will require boxing and unboxing of value types, because that carries performance overhead—use generics instead when possible, building strongly typed collections to avoid boxing and unboxing issues. This link provides more information.

Avoid inefficient string operations (see this article for more information) and use collections (if needed) efficiently.

Choosing between Server.Transfer and Response.Redirect
Use the Server.Transfer method to redirect between pages in the same application; Server.Transfer avoids an unnecessary client-side redirection. However, you cannot always just replace Response.Redirect calls with Server.Transfer. If you need authentication and authorization checks during redirection, use Response.Redirect instead. The two mechanisms are not equivalent. When you use Response.Redirect, make sure you use the overloaded method that accepts a Boolean second parameter, and pass a value of false to ensure an internal exception is not raised. Also note that you can only use Server.Transfer to transfer control to pages within the same application. To transfer to pages in other applications, you must use Response.Redirect.

Use Best Practices—and Common Sense
Even though there is no specific methodology that can fit in each and every environment, using best practices such as those discussed in this article can yield better application performance. You should plan and set some well-defined performance objectives, create performance test plans, performance checklists, and perform periodic tests based on the test plans. Keep these performance factors in mind when designing applications. This process should be iterative and repeated until we meet the predefined performance goals.


 

In the Application Web.Config File

1. Set debug=false under compilation as follows:-

<compilation defaultLanguage="c#" debug="false">

When you create the application, by default this attribute is set to "true" which is very useful while developing. However, when you are deploying your application, always set it to "false"

Setting it to "true" requires the pdb information to be inserted into the file and this results in a comparatively larger file and hence processing will be slow.

Therefore, always set debug="false" before deployment.

2. Turn off Tracing unless until required.

Tracing is one of the wonderful features which enable us to track the application's trace and the sequences. However, again it is useful only for developers and you can set this to "false" unless you require to monitor the trace logging. You can turn off tracing as follows:-

<trace enabled="false" requestLimit="10" pageOutput="false" traceMode="SortByTime" localOnly="true"/>

3. Turn off Session State, if not required.

ASP.NET Manages session state automatically. However, in case you dont require Sessions, disabling it will help in improving the performance.

You may not require seesion state when ur pages r static or whn u dont need to store infor captured in the page.

You can turn off session state as follows:-

<sessionstate timeout="20" cookieless="false" mode="Off" stateconnectionstring="tcpip=127.0.0.1:42424" sqlconnectionstring="data source=127.0.0.1;Trusted_Connection=no">

While developing using Visual Studio.NET

1. Select the Release mode before making the final Build for your application. This option is available in the Top Frame just under the Window Menu option. By default, the Mode is Debug

There are several things happening when you Build/Re-Build applications in the Debug Mode. First of all, it creates an additional PDB File under your BIN directory. This holds all the Debug information.

Secondly, the Timeout is very high since you require higher time out frequency while debugging such that your process hangs on until you get to the exact break point where error is there.

So, selecting Release Mode will greatly improve the performance of the application when u deploy.

General Methods to improve the Performance

1. Disable ViewState as and when not required.
ViewState is a wonderful technique which preserves the state of your form and controls. However, its a overhead since all the information needs to be stored in the viewstate and particularly if you are building applications which target Dial Up Internet Connection Users, ViewState can make your application very slow. In case you dont require viewstate, disable it.

You can disable it at different levels ex., for Page, Control etc., by setting
EnableViewState="false"

2. Avoid Frequent round trips to the Database.
Calls made to Database can be quite expensive in terms of response time as well as resources and it can be avoided by using Batch Processing.

Make calls to Database as mininal as possible and make them last even lesser time. Use of DataAdapter wherever applicable is very useful since, it automatically opens and closes Connection whenever required and doesnt require user to explicitly open the connection.

A number of connections opened and not closed adequately can directly influence in performance slow down.

3. Avoid Throwing Exceptions.
Exceptions are a greate way to handle errors that occur in your application logic. However, throwing exceptions is a costly resource and must be avoided. Use specific exceptions and use as minimal as possible to avoid resource overhead.

For example, catching a SQLException is better when you expect only those kind of exceptions instead of a generic Exception.

4. Use Caching to improve the performance of your application.
OutputCaching enables your page to be cached for specific duration and can be made invalid based on various paramters that can be specified. The Cache exists for the duration you specify and until that time, the requests do not go to the server and are served from the Cache.

Do not assign cached items a short expiration. Items that expire quickly cause unnecessary turnover in the cache and frequently cause more work for cleanup code and the garbage collector.

In case you have static as well as dynamic sections of your page, try to use Partial Caching (Fragment Caching) by breaking up your page into user controls and specify Caching for only those Controls which are more-or-less static.

For more details regarding caching look into ASP.NET Caching Features

5. Use appropriate Authentication Mechanism.
The Authentication Mechanism you choose determines the cost associated with it and hence select the appropriate mechanism. An informal but useful order is as follows:-

Authentication Modes

1. None
2. Windows
3. Forms
4. Passport

6. Validate all Input received from the Users.
User Input is Evil and it must be thoroughly validated before processing to avoid overhead and possible injections to your applications. Use Client Side Validations as much as possible. However, do a check at the Server side too to avoid the infamous Javascript disabled scenarios.

7. Use Finally Method to kill resources.
In your Try..Catch.. Block, always use the Finally method to close Open connections, Open DataReaders, Files and other resources such that they get executed independent of whether the code worked in Try or went to Catch.

The Finally method gets executed independent of the outcome of the Block.

8. The String and Stringbuilder Magic.
Perhaps the most ignored type in .NET is the stringbuilder. I am sure many of us are not even aware of Stringbuilder and its advantage over string (atleast I didnt know for 1 year :))

String is Evil when you want to append and concatenate text to your string. In other words, if you are initially creating a string say s = "Hello". Then you are appending to it as s = s + " World"; You are actually creating two instances of string in memory. Both the original as well as the new string will be stored in the memory. For that matter, all the activities you do to the string are stored in the memory as separate references and it must be avoided as much as possible.

Use StringBuilder which is very useful in these kind of scenarios. For the example above, using a StringBuilder as s.Append(" World"); which only stores the value in the original string and no additional reference is created.

9. Avoid Recursive Functions / Nested Loops
These are general things to adopt in any programming language, which consumes lot of memory. Always avoid Nested Loops, Recursive functions, to improve performance.

Having said that, proper functions and call backs do increase the performance instead of having a huge chunk of lines of code in single method.

The above are just pointers to improve the performance of your application and are just illustrative. There are many more ways in which you can improve the performance of your applications. I havent dealt with IIS and SQL Server side tips to improve performance which I would explain in my forthcoming articles.

SQL Server Optimization
The first thing to check is your disk using PerfMon (Windows Performance Monitor).  Use this to determine if your physical disk is keeping up.  Using PerfMon, select the PhysicalDisk object, Avg. Disk Queue Length counter, then the instance for the drive your SQL database is installed on.  This indicator count will vary based on if the drive you are monitoring is part of an array but it's best to keep the average below 1 if possible.  See Microsoft's documentation on this for more information especially when calculating for drive arrays.  http://support.microsoft.com/kb/146005

Next you want to check SQL memory usage.  By right clicking the server in Enterprise Manager and selecting properties you can view the Memory tab.  Use this tab to give as much memory as you can afford to SQL.  The more memory you give the SQL the more data it can cache, thus returning results faster.  Does your server have more than 2GB memory?  If so use this guide to configure your server to see all that available memory http://support.microsoft.com/kb/283037 then this guide http://support.microsoft.com/kb/274750 to configure SQL to see more available memory.  To make sure SQL is using available memory don't use task manager as it may report incorrectly.  Instead use PerfMon Process Object, Working Set Counter, sqlservr Instance.  Give your server time to build up its cache after restarting before checking this counter.

You want to make sure your CPU can keep up.  Caution this counter can be misleading.  Although you may be recording sustained high CPU utilization this may not mean you need a faster CPU, the lack of adequate disk speed and memory may be artificially increasing CPU utilization.  If CPU utilization is averaging over 80% for an extended period of time and you have already ruled out memory and disk as an issue you may need to upgrade your CPU(s).  Use PerfMon to track CPU utilization using the Processor Object, % Processor Time Counter, and applicable instance if you have more than one CPU.  If your server has more than one CPU and you can afford to let SQL use more than one, right click on the server in Enterprise Manager, select Properties, and then the Processor tab.  This will allow you to select multiple or all CPUs for SQL utilization.

Bandwidth is a little more obvious but still needs to be mentioned.  If you have sustained high bandwidth utilization this will become a bottleneck causing slowdowns.  Bandwidth utilization can be a bit harder to check.  If you are using a hosting company they should be able to tell you (typically with some type of real time monitor that you can log into) what you are averaging with bandwidth as well as spikes in bandwidth.  You will want to make sure you have plenty of available bandwidth from your hosting company especially during the busy part of your day.  If your hosting company cannot provide these reports you can utilize a hardware or software sniffer to determine bandwidth usage.  The key is to make sure your connection can handle much more than you typically use.  If you are averaging 80% utilization you should consider upgrading your connection.

Now that you have checked the major hardware bottlenecks put your DBA hat on and make sure your indexes are up to date.  If you're not an experienced DBA that knows the best way to manage your indexes, use SQL Profiler (Tools, SQL Profiler from Enterprise Manager).  Create a new trace and select the SQLProfilerTuning template.  Make sure the time you record is long enough to collect data on regular activity.  Additionally if you can generate additional traffic (especially traffic that causes slowdowns) you will want to do as much of this as possible while collecting data.  Once you have finished collecting data, click Tools, Index Tuning Wizard.  Use the profile you just created and make sure to select the database in question.  When finished running apply any changes suggested.  You may need to run this multiple times until there are not recommended changes.  Additionally as you make changes to your database such as adding new fields you will want to run this entire process again to find and apply new recommendations.  I like to run this process on the regular basis just make sure everything is running smooth.  As a side note, make sure every table in your database has a primary key. 

Once you have optimized your indexes run SQL Profiler again to check what kind of activity you are having on your SQL server. Do you have long running queries that could be changed to run more efficiently?  Do you have queries that are being run over and over that you could cache inside your website application?  As you make upgrades and additions to your web pages you many times will add extra queries to your database.  Go through your pages and find out if you can combine queries to return information with one query instead of two.  If you have to make multiple queries, try to make one connection to the database, and then use that same connection for all the queries.  This way you are not making repetitive connections to the database one after the other.  Making a database connection can be time consuming and resource intensive.

The way you partition your database across drives on your SQL server can help if you are suffering from high disk utilization.  The following configuration has worked well for me note that each partition is on a separate physical drive:
C: Operating System and SQL executables
D: Database File - RAID 5
E: SQL Log Files
F: Temp Database
G: Backups

Using RAID 5 on the D drive allows very fast reads because multiple drives can respond to the request.  Additionally requests do not have to wait for writes to the Log Files or Temp database.  Because the D drive is for read data, SQL can more effectively use memory to cache frequent requests.  There are a lot of write requests to the SQL log files and temp database so giving them they're own separate drive reduces contention.  It goes without saying that using drives with higher RPM, more throughput, and a larger cache will assist your SQL server to run faster.

Web Server Optimization
Since almost every page the typical Internet retailer website has will make some type of database connection, your bottleneck will probably be retrieving data from SQL.  If this is the case you probably won't need to do much optimization for your web server.  Generally speaking you will want to make sure the server has plenty of CPU, memory, and disk resources available to your web server.    Increasing memory to allow more web page caching (see caching section of this article) is one area you can really boost page load time.  Additionally if you find your disk having problems keeping up try some of the following:
1 Add memory and increase caching
2 If you have logging enabled, move the log file locations to another drive.
3 Make sure your web pages are being loaded from a RAID 5 partition.

Database Backups
I won't go into all the specifics of database backups as that requires an article of its own.  From a question of performance you will want to schedule your backup to run when you have the most available resources.  If you are backing up to another server, consider backing up to a local drive first, and then backing up across the LAN.  This can help complete the backup faster because it's done locally, then when transferring the backup to another server you won't need to be concerned with how long it takes to transfer to another server.  Another option is to setup a separate server that you mirror data to, then you can simply backup the mirrored server instead of the primary.  This mirrored server can also be utilized as a fail over server in the case the primary server fails.

Load Balancing/Multiple Servers
If you are still having problems after you have optimized your web server, database server, web pages, and server hardware you can consider adding servers and load balancing these servers.  Since every situation is different it's impossible to list every possible configuration.  Below are some examples but it's important for you to do in depth research to determine  exactly what resources are lacking and what is utilizing those resources so you can make the best decisions on what type of servers to put up and how to load balance.

If you currently run a single server, determine what resources are lacking, probably disk or memory, maybe CPU and what is using these resources.  Typically when moving from a single server to multiple servers you will put in a new server and only put your database on that new server.

Next I'll show you a configuration with three servers that has worked out exceedingly well for me:
Each server has SQL and IIS installed.  We have one master server and two slave servers.  The master server has our entire database on it, the websites can write orders to this database and our staff can process orders using this database and we keep vendor inventory records up to date on this server.  The master server is the only server that gets backed up.  The two slave servers are setup so that only the parts of the websites and database that are needed for the public to use our websites are replicated.  Therefore any change that is made to the master server is replicated to the slave servers within 15 min. of the change.  Because the public websites write back to the master server the slave servers can run in a "read only" mode which allows them to process an enormous number of requests very fast.  Because our master server only receives writes from the public websites when orders are being taken, it only has to handle a small amount of our public traffic.  As our traffic grows we simply add more read only servers.  In our case we have enough websites that we simply split them up between servers.  If however you have a single website, you can still load balance using a DNS round robin or in more extreme situations an appliance like F5 Big IP http://www.f5.com/products/big-ip/ which can perform smart load balancing based on current server load.

Another popular method is to setup multiple SQL servers (assuming this is the main bottleneck) with separate data.  For instance if you keep your images in SQL, you may want to move them to they're own server then make a connection to one server for some of your data and to the other server for retrieving images.

Web Page Design

Java Script and CSS Files
Putting Java Script and CSS in files instead of inline allows your browser to cache them instead of reading them each time.  Since your browser has to download each file separately, do your best to keep all your Java Script in a single file and do the same with CSS

Image Optimization
Optimizing images is very important in page load time.  Convert your images to PNG.  Don't use HTML to resize your images, resize and optimize them prior to uploading.  Try to keep your larger images to a file size under 30K if possible.  There are some advanced optimizations that can be done with inline images, sprites, and image maps but I am not experienced with them for the most part and in many instances management of this can be a nightmare.  Additionally there may be browser support issues. 

Caching
Caching is one way to drastically reduce page load time, especially for popular pages.  Even if your page is dynamic, you may be able to cache it for 4 hours or at least 15 min. 

You can enable browser caching by using the HTML Expires heading for any pages that can cache.  This type of caching tells the visitors browser that if it returns to this exact page it's doesn't need to go back to the web server and retrieve the page again.  This will help when people are browsing your site and come back to the same page multiple times.  If the page is more static, like an article page for instance, you may be able to set it to cache for 30 days.  This will allow the same person to go to that page different days without having to hit your server.  The downfall of this caching is that it only helps if people visit your page more than once within the caching time.

Another type of caching is server side page caching.  This allows you to cache a page on the server instead of the browser, therefore when defining the caching time, anyone that requests the page inside of the cache time will receive the cached page.  Server side caching will allow your web server to cache the page in memory (given there are enough memory resources available) which means it does not have to go to the disk to retrieve the page or query the database.  Let's say for example the front page of your website has a tree view menu that makes multiple database queries to build the page.  If this page is requested many times an hour it will generate a lot of resource strain on your servers.  If you set this page to cache for 15 min. you now only create that resource strain once every 15 min. allowing your server to respond better to other requests and your page will display much faster.  Of all the things you can do to make your site faster, this may have the biggest impact. 

Similar to server side page caching is partial page caching.  This is a little more complicated but allows you to cache parts of your page.  Let's use the example above again but this time the criteria mandates that we can't cache everything in the page.  In this case we may want to just cache the menu as the menu is relatively static and is a large resource drain on the servers and slows down page delivery.

Separate Domains
Some browsers will open multiple connections to download files simultaneously but will have a limit to how many per domain it will concurrently download.  To get around this you could move files like Images, CSS, and Java Script to separate domains thus allowing the browser to download more files simultaneously.

Progressive Rendering
Progressive rendering is when you visit a page and you start seeing parts of the page display right away instead of displaying the entire page at once.  This process gives the user feedback right away and they know more is coming.  If your page takes a second or so before it renders to the browser, the user may thing there is a problem and instead of waiting will move to a different page.  Progressive rendering can also give the illusion that the page is loading faster even though it takes the same amount of time to completely render the entire page.  To progressive render your page you may need to tell your back end programming language to pass parts of your page as they render.  Additionally with components like an ASP.NET Gridview you cannot progressively render that component, instead it will render than entire object all at once.  From an HTML perspective you will want to make sure style sheets are loaded in the page head and Java Script is loaded at the end of the page.  Not doing this can prevent your entire page from progressively rendering.

Large Classes
One of the benefits ASP.NET is using extremely rich classes like the grid view.  You could dedicate a complete book (and I'm sure someone has) to explore the features and inner workings of the grid view.  It's extremely flexible and easy to use.  With a couple simple queries you can define a simple editable grid view.  One cost of using the grid view is that because of all that's built in it carries a heavy payload.  Many times this is not an issue but you will notice a speed increase for simple display grid views if you generate the HTML and render it in a table or some other rudimentary display manner.

Compression
Remove excessive white space in your rendered code will reduce the size of the data being transferred, thus speeding up the page load time.  Additionally configurations can be made to most web servers to support GZIP (a GNU Compression Utility) http://www.gzip.org/ this can easily cut the size of what you are transferring in half or more.  This involves extra CPU cycles on the server to compress and on the browser to decompress but unless your server is running sustained high CPU utilization you shouldn't see much of a CPU hit or slowdown because of this.  On a side note, utilizing this compression could cut the bandwidth portion of your
hosting bill in half or at least free up extra bandwidth.

Website Errors
Errors will slow everything down, not to mention annoying your customers.  Every time your back end code or web server has to engage error processing it takes extra resources away from your server.  Capture coding errors using error control is the first step.  As a common practice I have all errors emailed to me so I am forced to see them and mentally acknowledge them.  This also helps me judge the frequency of the errors.  Because I am always seeing them it's usually on the top of my list to get them fixed.  Over time coding in fixes for all the possible errors will prevent error control from having to engage as often thus increasing overall performance.  The other errors you will want to keep track of are the web server errors such as 404 (Page Not Found).  You can track the occurrence of these errors in your server web logs.  If for instance, there is an old link to a page that no longer exists, instead of letting the web server generate a 404 error, place a file (same name and location of the missing one) with a friendly "does not exist" message or a script to redirect to another page on your website.

There are certainly many additional things you can do that don't fit the scope of this article.  This is meant as a practical guide for the small to mid size Internet retailer.  I hope you have found this informative and at lease consider each one of these techniques before exploring more in-depth or expensive solutions. Here is a free Web Page Load Time Tracker that you can use to monitor the speed at which your pages load over time. This resource is unique in that it uses an actual web browser to render the page including downloading and displaying CSS, Images, and JavaScript.  Set this up now on all your different pages so you have data to chart against when you make changes to your site.