SharePoint Notes

Bleeding on the cutting edge …

SharePoint and very large databases

Posted by Christian Dam on October 27, 2009


I got a very interesting call from a colleague the other day. He had a customer with a content database which was apparently approaching 1TB! My colleague of course told the customer that this was not aligned with best practices and that it could cause problems down the road, to which the customer replied with a smile: “Well, if that happens we just throw additional resources at the SQL Server. Thanks for playing, now leave me alone.”

So, he called me: “Does it sound plausible? SQL Server 2005/2008 can support exabytes of data, so what’s the problem?”

Well, I went to work and started digging up some information. My main bibles, SharePoint 2007 Best Practices and Inside SharePoint 2007 Administration, both referred to Microsoft guidelines which recommend limiting content databases to 100GB:

Limit content database size to enhance manageability

Plan for database sizing that will enhance manageability and performance of your environment.

  • In most circumstances, to enhance the performance of Office SharePoint Server 2007, we discourage the use of content databases larger than 100 GB. If your design requires a database larger than 100 GB, follow the guidance below: 
    • Use a single site collection for the data. 
    • Use a differential backup solution, such as SQL Server 2005 or Microsoft System Center Data Protection Manager, rather than the built-in backup and recovery tools. 
    • Test the server that is running SQL Server 2005 and the I/O subsystem before moving to a solution that depends on a 100-GB content database. 
  • Whenever possible, we strongly advise that you split content from a site collection that is approaching 100 GB into a new site collection in a separate content database to avoid performance or manageability issues. 
  • Limit content databases that contain multiple site collections to approximately 100 GB.
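
The guidance above recommends a differential backup solution rather than the built-in tools once a content database grows large. Purely as an illustration (this is not part of the Microsoft guidance), here is a minimal sketch of taking a differential backup of a content database with plain T-SQL from a script; the server name SQL01, the database name WSS_Content_Portal and the backup path are made-up placeholders, and it assumes the pyodbc driver, an existing full backup and sufficient backup permissions.

    # Minimal sketch: differential backup of a (hypothetical) SharePoint content
    # database via T-SQL. Server, database name and backup path are placeholders.
    import pyodbc

    conn = pyodbc.connect(
        "DRIVER={SQL Server};SERVER=SQL01;Trusted_Connection=yes",
        autocommit=True,  # BACKUP DATABASE cannot run inside a transaction
    )

    # A differential backup only captures changes since the last full backup,
    # so a regular full backup still has to exist.
    conn.execute(
        "BACKUP DATABASE [WSS_Content_Portal] "
        "TO DISK = N'E:\\Backups\\WSS_Content_Portal.dif' "
        "WITH DIFFERENTIAL, INIT"
    )
    conn.close()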

However, the recommendations didn’t say anything about why 100GB is the magic number. MVP Ben Curry at one point even said, “100GB doesn’t seem very big anymore 🙂 If you need larger site collections, optimize SQL and rock on… ” Some 7-8 months later, Ben had thought it over again and come to a different conclusion: “So, use your head when architecting databases/site collections and don’t smoke crack. I’ve even said you can have monster content databases in the past – I was wrong. The only way I would now architect large content databases would be for fairly static data that did not have a large collaborative user population.”

OK, so why is Ben advising us not to smoke crack? In other words, why are large content databases evil?

Obviously, large databases are difficult to administer, and the sheer amount of data makes it hard to maintain a realistic backup/recovery plan:

  • backup and restore take a long time (as a rule of thumb, 150GB can normally be backed up in a 4-hour window; see the quick estimate below)
  • the SLA may be in jeopardy
  • it is difficult to periodically test and verify the backup/restore procedure
  • database attach is slow 
  • applying WSS/MOSS service packs will be slow
  • I am sure some custom code starts to suffer as well

Note: the backup/restore timing issues normally don’t apply to customers using SQL mirroring or SQL log shipping.
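
To put the backup numbers in perspective, here is the quick estimate referred to above. It simply extrapolates from the 150GB-in-4-hours rule of thumb; the throughput figure is that rule of thumb, not a measured benchmark, and real numbers depend entirely on the hardware and backup solution.

    # Back-of-the-envelope backup window estimate based on the rule of thumb
    # above (~150 GB in a 4-hour window). Not a benchmark.
    RULE_OF_THUMB_GB = 150.0
    RULE_OF_THUMB_HOURS = 4.0
    throughput = RULE_OF_THUMB_GB / RULE_OF_THUMB_HOURS  # ~37.5 GB/hour

    for size_gb in (100, 300, 1000):  # 1000 GB is roughly the customer's 1TB database
        hours = size_gb / throughput
        print(f"{size_gb:>5} GB -> roughly {hours:.1f} hours to back up")

    # At ~37.5 GB/hour, 1 TB works out to more than a full day, which no longer
    # fits in any reasonable nightly backup window.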

Additionally, and probably at least as important: more data leads to more SQL I/O, which in turn leads to more locking and blocking on the SQL Server backend. All data in a site collection shares the same table, and if that table is blocked, access to all sites in the site collection is blocked! This is normal SQL Server behavior, so it does not even show up in the event or trace logs.

Splitting up the content database can help overcome most of these problems, and it is fortunately a relatively simple thing to do. Here’s how to do it on WSS and MOSS; a rough sketch of the stsadm approach is included below.
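
As a pointer, here is a minimal sketch of one common way to do the split (assuming WSS 3.0 SP1 / MOSS 2007 SP1 or later, where stsadm supports the mergecontentdbs operation): attach a new, empty content database to the web application, then move selected site collections into it. The web application URL, database names and the sites.xml path are placeholders, and this is only an outline of the idea, not a complete, tested procedure.

    # Sketch: split site collections out of a large content database by attaching
    # a new content database and moving site collections into it with stsadm.
    # The URL, database names and the sites.xml path below are placeholders.
    import subprocess

    STSADM = (r"C:\Program Files\Common Files\Microsoft Shared"
              r"\web server extensions\12\BIN\stsadm.exe")

    def stsadm(*args):
        subprocess.check_call([STSADM, *args])

    # 1. Attach a new, empty content database to the web application.
    stsadm("-o", "addcontentdb",
           "-url", "http://portal",
           "-databasename", "WSS_Content_Portal_2")

    # 2. Move the site collections listed in sites.xml from the old database to
    #    the new one (operation 3 = read the list of site collections from a file).
    stsadm("-o", "mergecontentdbs",
           "-url", "http://portal",
           "-sourcedatabasename", "WSS_Content_Portal",
           "-destinationdatabasename", "WSS_Content_Portal_2",
           "-operation", "3",
           "-filename", r"C:\Temp\sites.xml")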

One Response to “SharePoint and very large databases”

  1. nice bullet points to remind us WHY stick with 100Gb max.
    thanks for the sharing!


 