ASP.NET (64 bit) Performance Tuning with Sitecore

There is not a lot of information on performance tuning ASP.NET 3.5, or any version of ASP.NET for that matter, especially for 64 bit servers. Recently I had the task of making a large scale Sitecore CMS website production ready, with a view to handling around 2,500 concurrent users. There's not much Sitecore documentation out there either, and practically none about performance tuning in a production environment.

Anyway, I like a challenge! So, during a two day load testing session with only scant documentation to hand, I scoured the internet, ran some experiments, and had quite a lot of success at squeezing every last drop of performance out of ASP.NET 3.5 on a 64 bit server using a mixture of web.config and machine.config tweaks.

Although tuning is going to be different for every application, the information in this post should be useful for many other ASP.NET 2.0/3.5 production server deployments.

Application

The application was an ASP.NET 3.5 website built using Sitecore 6.0 CMS, populated with approximately 20,000 content pages that were shredded and held in a SQL Server 2005 database. The majority of pages in the website displayed data direct from the database via the Sitecore API; a few peripheral pages called off to web services; there were only light updates to the database. The only quirk of this solution was that Sitecore has its own caching mechanism, independent of the caching in ASP.NET, which had to be factored into the performance tuning.

Hardware

The hardware consisted of one database server with four dual core Opteron processors and 8GB of memory, and three web servers, each with one dual core Opteron processor and 4GB of memory. The database server had 3 x SCSI RAID 5 and 2 x SCSI RAID 1 arrays; the three web servers each had 2 x SCSI RAID 1.

Platform

Both the database server and the three web servers were installed with 64 bit Windows Server 2003. IIS was configured on the web servers to run 64 bit ASP.NET. SQL Server Standard Edition 64 bit was installed on the database server.

Testing Approach

Three load injectors were set up to simulate user load, and Windows Performance Monitor ("perfmon") was used to gather information about each test run. Each test run lasted 45 minutes and ramped up to a specific number of concurrent users. For ease of configuration, testing was done using only one of the web servers, on the assumption that load balancing over three web servers would give three times the throughput (an assumption that would have been invalid had the database been the bottleneck).

Each concurrent user had a journey consisting of ten page "transactions" that were randomised to exhibit real world usage. Each transaction included the page and any associated media such as stylesheets and images (the simulation respected the client cache control headers sent from the server for these content types).

Testing Baseline

As a baseline, I just used the default installation of 64 bit Windows Server 2003, IIS and the ASP.NET 2.0 64 bit ISAPI (installed via the aspnet_regiis command line tool in the "Windows\Microsoft.NET\Framework64\v2.x" folder). The necessary databases were backed up and restored to the server.
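For anyone setting up the same baseline, registering the 64 bit ASP.NET 2.0 ISAPI with IIS looks roughly like the following command (the exact v2.x folder name depends on the framework build installed on the server):

%windir%\Microsoft.NET\Framework64\v2.0.50727\aspnet_regiis.exe -i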

The first test run used 200 concurrent users and produced an average transaction response time of 24 seconds. In throughput terms this was roughly 100 times short of what was needed: each web server had to handle its share of the 2,500 concurrent users (around 830) at the 1 second response times required for peak website usage, which felt pretty daunting.

Performance Tweak #1

The perfmon results from the baseline showed the % CPU topping out at 100, # Induced GC rising from 0 to 8000 (this should remain close to zero) and % Time in GC averaging around 20 (this should be as close to zero as possible, though it will inevitably increase with memory contention).
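For anyone repeating these measurements, the counters referred to above live under the following perfmon paths (the w3wp instance name is an assumption based on a default IIS 6 worker process):

Processor(_Total)\% Processor Time
.NET CLR Memory(w3wp)\# Induced GC
.NET CLR Memory(w3wp)\% Time in GC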

Changing the web.config from <compilation debug="true"> to <compilation debug="false"> doubled performance. The second test run, again using 200 concurrent users, produced an average transaction response time of 12 seconds (down from 24).
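The change itself is a one-attribute edit in web.config. As an optional safeguard (not something used in this test run), the machine-wide <deployment retail="true"/> switch in machine.config forces debug="false" and disables tracing for every application on the server, so a debug build can never slip into production:

Web.config

<configuration>
   <system.web>
      <compilation debug="false" />
   </system.web>
</configuration>

Machine.config (optional, server-wide safeguard)

<configuration>
   <system.web>
      <deployment retail="true" />
   </system.web>
</configuration>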

Performance Tweak #2

The perfmon results still showed the same resource bottlenecks as the previous run.

Changing the <processModel> section in machine.config was the next step. On my default installation of Windows Server 2003 64 bit, the machine.config didn't have a <processModel> section so I had to add one.

The following settings brought the average transaction response time to 3 seconds (down from 12).

Web.config

<httpRuntime
   maxRequestLength="16384"
   executionTimeout="600"
   requestLengthDiskThreshold="80"
   minFreeThreads="88"
   minLocalRequestFreeThreads="76"
   appRequestQueueLimit="5000"
   enableKernelOutputCache="true"
   enableVersionHeader="true"
   enable="true"
   shutdownTimeout="90"
   delayNotificationTimeout="5"
   waitChangeNotification="0"
   maxWaitChangeNotification="0"
   enableHeaderChecking="true"
   sendCacheControlHeader="true"
   apartmentThreading="false"
/>

Machine.config

<processModel
   enable="true"
   maxWorkerThreads="100"
   maxIoThreads="100"
   minWorkerThreads="1"
   minIoThreads="1"
   timeout="Infinite"
   idleTimeout="Infinite"
   requestLimit="Infinite"
   requestQueueLimit="5000"
   restartQueueLimit="10"
   memoryLimit="60"
   webGarden="false"
   userName="machine"
   password="AutoGenerate"
   logLevel="Errors"
   responseDeadlockInterval="00:03:00"
   responseRestartDeadlockInterval="00:03:00"
   serverErrorMessageFile=""
   pingFrequency="Infinite"
   pingTimeout="Infinite"
   maxAppDomains="2000"
/>

Sitecore config

<setting name="Caching.DefaultDataCacheSize" value="20MB"/>
<setting name="Caching.DefaultHtmlCacheSize" value="20MB"/>

Performance Tweak #3

The following changes brought the average transaction response time down to 1 second.

Web.config

<httpRuntime
   minFreeThreads="88"
   minLocalRequestFreeThreads="76"
/>

<cache
   disableMemoryCollection = "false"
   disableExpiration = "false"
   privateBytesLimit = "2576980377"
   percentagePhysicalMemoryUsedLimit="0"
   privateBytesPollTime="00:00:30"
/>
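For context, the privateBytesLimit value above works out as follows (my arithmetic, not from the original test notes):

2,576,980,377 bytes / 1,073,741,824 bytes per GB = approx 2.4 GB

In other words, the ASP.NET cache is allowed to grow until the worker process reaches roughly 2.4 GB of private bytes, leaving headroom below the 4 GB of physical memory on each web server.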

Machine.config

<processModel
   enableKernelOutputCache="false"
   memoryLimit="70"
/>

Sitecore config

<setting name="Caching.DefaultDataCacheSize" value="400MB"/>
<setting name="Caching.DefaultHtmlCacheSize" value="200MB"/>
<setting name="Caching.DefaultPathCacheSize" value="5MB"/>
<setting name="Caching.DefaultRegistryCacheSize" value="5MB"/>
<setting name="Caching.DefaultViewStateCacheSize" value="10MB"/>
<setting name="Caching.DefaultXslCacheSize" value="10MB"/>
<setting name="Caching.FastMediaCacheMapSize" value="10MB"/>

New Testing Baseline

Using the settings obtained so far, a new baseline was measured at 400 concurrent users (instead of 200). Doubling the number of users actually quadrupled the average transaction response time, from 1 second to 4 seconds for 400 users.

Performance Tweak #4

The following enhancements are specific to Sitecore, and brought the average transaction response time from 4 seconds down to 1 second for 400 users: double the load that performance tweak #3 sustained at the same 1 second response time. This really goes to show how much Sitecore relies on caching for its speed, as the changes are focused on allocating larger cache sizes.

Sitecore config

<setting name="Caching.AccessResultCacheSize" value="10MB"/>
<setting name="Caching.StandardValues.DefaultCacheSize" value="10MB"/>

<database id="web" >
   <cacheSizes hint="setting">
   <data>400MB</data>
   <items>100MB</items>
   <paths>10MB</paths>
   <standardValues>10MB</standardValues>
   </cacheSizes>
</database>

<hooks>
   <param desc="Threshold">900MB</param>
</hooks>

<sites>
   <site name="website" htmlCacheSize="100MB"
      filteredItemsCacheSize="10MB" xslCacheSize="10MB"
   />
</sites>

<cacheSizes>
   <sites>
      <website>
         <html>100MB</html>
         <registry>0</registry>
         <viewState>0</viewState>
         <xsl>10MB</xsl>
     </website>
   </sites>
</cacheSizes>

<fastCaches>
   <memoryCacheSize>5MB</memoryCacheSize>
</fastCaches>

Conclusion

Simply applying a good runtime configuration, together with some careful tuning, can yield massive performance improvements; the application I was tuning ended up around 100 times faster with no code changes. It was a rewarding, if time consuming, effort of tweaking and running simulated loads over a period of several days, and hopefully anyone who reads this can enjoy the benefits.

References

I found the following MSDN article helpful when performance tuning (although its title may not seem directly relevant, the formulas it gives for calculating web.config and machine.config settings from the number of CPUs and amount of RAM proved invaluable):

Contention, poor performance, and deadlocks when you make calls to Web services from an ASP.NET application
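For reference, applying that article's formulas to one of the dual core (two CPU) web servers here would give the values below. Note that the thread attributes on <processModel> are per-CPU, while minFreeThreads and minLocalRequestFreeThreads are absolute totals, hence the multiplication; the settings earlier in this post correspond to the single-CPU figures:

<!-- recommended values for N CPUs; here N = 2 -->
<processModel maxWorkerThreads="100" maxIoThreads="100" />   <!-- per CPU -->
<httpRuntime minFreeThreads="176" minLocalRequestFreeThreads="152" />   <!-- 88 x N and 76 x N -->
<connectionManagement>
   <add address="*" maxconnection="24" />   <!-- 12 x N -->
</connectionManagement>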