OSU manages a lot of Drupal sites:
- Production: 554 sites
- Development: 344 sites
- Training: 244 sites
To manage this, they have created a custom solution, WebMange (RoR), that can spin up new Drupal sites.
They don't use standard multisite. Instead, they symlink core files and root directories into a new root for each site. This allows sub-directory sites, but it is hard to manage. For new versions of Drupal, they do a new "build" and then change the symlinks to push it out to the individual sites.
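The symlink approach can be sketched roughly as follows. This is only an illustration of the idea, not OSU's actual code: the function name, the `skip` parameter, and the assumption that each site keeps its own `sites` directory are all hypothetical.

```python
import os


def link_site_root(build_dir, site_root, skip=("sites",)):
    """Make site_root a directory of symlinks pointing into build_dir.

    Entries named in `skip` are left alone (assumption: per-site data
    such as a `sites` directory is not shared with the build).
    """
    os.makedirs(site_root, exist_ok=True)
    for name in sorted(os.listdir(build_dir)):
        if name in skip:
            continue
        link = os.path.join(site_root, name)
        # Remove a symlink left over from an older build before relinking.
        if os.path.islink(link):
            os.remove(link)
        os.symlink(os.path.join(build_dir, name), link)
```

Calling the function again with a newer build directory repoints the same site root, which is roughly what "change the symlinks" means above.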
Python script to install Drupal - builds the server dirs and DBs, then fires Drush to do the install.
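A minimal sketch of the "fire Drush" step, assuming the directories and database already exist. The function names, the `dry_run` flag, and all argument values are hypothetical; only the Drush `site-install` command and its `--root`, `--db-url`, `--site-name` and `--yes` options are real.

```python
import subprocess


def build_install_command(site_root, db_url, profile, site_name):
    """Return the drush command line for installing one site."""
    return [
        "drush",
        "--root=" + site_root,
        "site-install",
        profile,
        "--db-url=" + db_url,
        "--site-name=" + site_name,
        "--yes",  # answer prompts automatically so the script can run unattended
    ]


def install_site(site_root, db_url, profile, site_name, dry_run=True):
    """Build the command and, unless dry_run is set, execute it."""
    cmd = build_install_command(site_root, db_url, profile, site_name)
    if not dry_run:
        subprocess.check_call(cmd)  # raises CalledProcessError if drush fails
    return cmd
```

With `dry_run=True` the function only returns the command, which is handy for checking what would be run before touching a real server.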
PHP script to add users - calls to Drush to do the actual work.
They have a custom distribution, with their own install profile. A single install profile for all sites. Worked to make it modular and flexible. They use the Features module to export roles, WYSIWYG configs, input filters, etc. to modules that they add to the install profile.
Updates run in batches of 20 to 60 sites; they have a backend interface to select and launch the site updates. They back up the site and DB, break and rebuild the symlinks, then use Drush to do DB updates.
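The batching itself is simple to sketch. The helper names below are hypothetical and the backup and relink steps are left out; `drush updatedb` is the real Drush command for running pending database updates.

```python
def batches(sites, size):
    """Yield consecutive slices of `sites`, each at most `size` long."""
    for start in range(0, len(sites), size):
        yield sites[start:start + size]


def updatedb_command(site_root):
    """Return the drush command that applies pending DB updates for one site."""
    return ["drush", "--root=" + site_root, "updatedb", "--yes"]


def plan_updates(site_roots, batch_size=20):
    """Group sites into batches and list the update command for each site."""
    return [
        [updatedb_command(root) for root in batch]
        for batch in batches(site_roots, batch_size)
    ]
```

A batch size anywhere in the 20 to 60 range mentioned above would fit; 20 is just the default chosen for this sketch.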
Scaling and Caching
Cluster of web servers. Citrix NetScaler for load balancing and front-end cache. APC on the web servers. Dedicated memcache servers. Search engine requests from all systems go to a single web server. NFS for the shared file system between the web heads.
Managing Distribution
They used Subversion for Drupal 6 and checked the entire site into the repository. For Drupal 7, they use Drush Make and Git to pull core, contrib modules and libraries on build, and they only put their own custom code into the repository. Their Drush make scripts always pin specific versions of core and contrib modules. Each release has a new make file, and version control on that file easily shows what changed.
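Because every version is pinned, comparing two make files shows exactly what a release changes. A rough sketch, assuming the make files use the `projects[name][version] = x` form; the function names and the sample project versions are illustrative and not taken from the talk.

```python
import re

# Matches lines such as: projects[views][version] = "3.7"
PINNED = re.compile(r'^projects\[(\w+)\]\[version\]\s*=\s*"?([^"\s]+)"?$')


def pinned_versions(make_text):
    """Return {project: version} for every pinned project in a make file."""
    versions = {}
    for line in make_text.splitlines():
        match = PINNED.match(line.strip())
        if match:
            versions[match.group(1)] = match.group(2)
    return versions


def changed_versions(old_text, new_text):
    """Return {project: (old, new)} for projects whose pin differs."""
    old, new = pinned_versions(old_text), pinned_versions(new_text)
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(set(old) | set(new))
        if old.get(name) != new.get(name)
    }


# Illustrative make file contents (hypothetical versions).
OLD_MAKE = """core = 7.x
api = 2
projects[drupal][version] = 7.22
projects[views][version] = 3.7
"""
NEW_MAKE = """core = 7.x
api = 2
projects[drupal][version] = 7.23
projects[views][version] = 3.7
projects[ctools][version] = 1.3
"""
```

In practice `git diff` on the make file gives the same information; the sketch only shows why pinned versions make that diff meaningful.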
Weaknesses and Upcoming Changes
Shared Infrastructure: They run everything on one cluster: Drupal, WordPress, other PHP sites, etc. That means they have to configure the servers to the lowest common denominator. One misbehaving app can cause problems for everyone. In the future they will have Drupal-specific
Site Consolidation: Too many sites, and navigating between them is hard when everything lives on a different site. Moving to Organic Groups to segment Drupal 7 into "sections" that groups can manage. Different departments are just different organic groups. Theme per group if departments want a custom theme.
Moving to standard tools: Moving to Aegir and Varnish instead of the custom code that they are using now.
Increased Automation: They would like to build more automation into the process. Managing staging between dev and production sites is a manual process right now. No automated site removal. In their new infrastructure they will be using Fabric to control Aegir, Drush and Provision to build platforms and sites. They use Puppet to install and configure Aegir servers.
Control Panel for Staging
Allows site admins to push their sites to dev with a custom control panel built into the site itself. Can clone a site from production to dev, work on it, then bring back changes.