Two articles on scaling MySpace and eBay

http://www.addsimplicity.com.nyud.net:8080/downloads/eBaySDForum2006-11-29.pdf

http://glinden.blogspot.com/2006/12/talk-on-ebay-architecture.html “The parallels with Amazon are remarkable. Like Amazon, eBay started with a two-tiered architecture. Like Amazon, they split the website into a cluster in the late 1990's, followed soon after by partitioning the databases.”

This "Inside MySpace" article gives a simple roadmap of the steps myspace.com took to scale its website. I have summarized this roadmap as a table and diagram below.

The roadmap seems to be fairly standard, so any website lucky enough to have this problem can define its scalability roadmap up front and see whether it can skip some steps along the way, i.e. go straight to an architecture that consists of:

  • system partitioned by user ID
  • physical middle-tier servers which wrap the database
  • middle tier supports caching using something like memcached
  • a user login database (contains the hashed password and the database server the account is stored on)
  • lots of simple frontend servers
  • user session is implemented using cookies and the contents of the URL, i.e. no server-side session database
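
To make the login database and cookie-based session items above concrete, here is a minimal sketch in Python. It is not how MySpace or eBay implemented it; the names (`LOGIN_DB`, `SECRET_KEY`, `userdb-00`), the sample user, and the cookie layout are all assumptions for illustration. The idea is that the login database returns which database server holds the account, and that answer travels in a signed cookie so no server-side session store is needed.

```python
import hashlib
import hmac

# Hypothetical signing key; a real deployment would load this from configuration.
SECRET_KEY = b"demo-secret"


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a password hash with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


# Stand-in for the central login database (an in-memory dict, purely illustrative).
# Each record holds the hashed password and the database server for that account.
LOGIN_DB = {
    "alice": {
        "user_id": 42,
        "salt": b"s1",
        "pw_hash": hash_password("wonderland", b"s1"),
        "db_server": "userdb-00",
    },
}


def sign(payload: str) -> str:
    """Return a hex HMAC of the payload so the cookie cannot be altered unnoticed."""
    return hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()


def login(username: str, password: str):
    """Check credentials; return a signed cookie value, or None on failure."""
    record = LOGIN_DB.get(username)
    if record is None:
        return None
    candidate = hash_password(password, record["salt"])
    if not hmac.compare_digest(record["pw_hash"], candidate):
        return None
    payload = f"{record['user_id']}:{record['db_server']}"
    return f"{payload}:{sign(payload)}"


def read_session(cookie: str):
    """Recover (user_id, db_server) from the cookie, or None if the signature is bad."""
    payload, _, signature = cookie.rpartition(":")
    if not hmac.compare_digest(signature.encode(), sign(payload).encode()):
        return None
    user_id, db_server = payload.split(":")
    return int(user_id), db_server
```

Any frontend server holding the key can call `read_session` on an incoming cookie and route the request to the right database server without a shared session database, which is what lets the frontends stay simple and interchangeable.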

Step  Scaling option
1     Simple: 2 frontends, 1 database
2     Add frontends
3     Add database read slaves (1 master writer)
4     Vertical partitioning of database (1 database per feature)
5     Continue vertical partitioning of database (1 database per feature)
6     Add SAN
7     Scale up vs scale out decision - bigger machines vs lots of small machines - choose small machines
8     Split database based on users: 1 database per 1 million accounts, one central login server (UDB)
9     Language change
10    SAN bottleneck – move user accounts from disk to disk to distribute disk load
11    Change to 3PAR SAN, which acts like a multidisk RAID. Similar to Google File System (GFS)
12    Add middle tier cache
13    Move to 64-bit, add 64 GB of RAM per machine
14    Replicate 3PAR SAN in 3 colos
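
Steps 8 and 12 are the two that are easiest to show in code. The sketch below is only an illustration in Python: it assumes accounts are assigned to databases in contiguous ranges of 1 million account IDs (the article only says "1 database per 1 million accounts"), and it uses a plain dict as a stand-in for something like memcached. `db_index`, `MiddleTier` and `load_from_db` are hypothetical names.

```python
ACCOUNTS_PER_DB = 1_000_000  # from step 8: one database per 1 million accounts


def db_index(account_id: int) -> int:
    """Map an account ID to a database number, assuming contiguous ID ranges."""
    if account_id < 0:
        raise ValueError("account_id must be non-negative")
    return account_id // ACCOUNTS_PER_DB


class MiddleTier:
    """Wraps the databases and caches reads (step 12), cache-aside style."""

    def __init__(self, load_from_db):
        # load_from_db(db_number, account_id) is a hypothetical callable
        # standing in for a real query against the chosen database.
        self._load = load_from_db
        self._cache = {}   # stand-in for an external cache such as memcached
        self.db_reads = 0  # counts how often the database was actually hit

    def get_profile(self, account_id: int):
        # Serve from the cache when possible.
        if account_id in self._cache:
            return self._cache[account_id]
        # Cache miss: read from the database that owns this account, then remember it.
        self.db_reads += 1
        profile = self._load(db_index(account_id), account_id)
        self._cache[account_id] = profile
        return profile
```

For example, with `tier = MiddleTier(lambda db, aid: {"id": aid, "db": db})`, calling `tier.get_profile(2_500_000)` twice reads from database 2 once and serves the second call from the cache. A real cache would also need invalidation when a profile is updated, which this sketch leaves out.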

A diagram