Google: Gmail outage a “big deal”
By Laura Isensee
A majority of Gmail users from California to Taiwan found themselves without access to Google’s popular email service on Tuesday.
The company outlined what went wrong on its blog.
“We took a small fraction of Gmail’s servers offline to perform routine upgrades,” Ben Treynor, vice president of engineering and site reliability czar, wrote.
But the company “slightly underestimated” the load this placed on other servers, called request routers, which direct web queries to the right Gmail server for a response.
Some of those request routers became overloaded, pushing their load onto the remaining request routers and causing more of them to become overloaded. And “within minutes nearly all of the request routers were overloaded,” Treynor said.
“As a result, people couldn’t access Gmail via the web interface because their requests couldn’t be routed to a Gmail server,” Treynor said.
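For readers curious how a small overload can snowball this way, here is a minimal Python sketch of the cascade Treynor describes. It is an illustration only, not Google's actual system: the function name, the capacity figures, and the rule that an overloaded router drops out entirely and hands its traffic to the rest are all assumptions made for the example.

```python
def simulate_cascade(capacities, total_load):
    """Toy model (assumed, not Google's code): load is split evenly across
    active routers; any router whose share exceeds its capacity drops out,
    and the same total load is re-split among the routers that remain.

    Returns (surviving_capacities, active_count_per_round).
    """
    active = sorted(capacities)
    active_per_round = []
    while active:
        share = total_load / len(active)
        active_per_round.append(len(active))
        survivors = [c for c in active if c >= share]
        if len(survivors) == len(active):
            break  # nobody is overloaded, so the system is stable
        active = survivors  # overloaded routers drop out; load shifts over
    return active, active_per_round


if __name__ == "__main__":
    routers = [100, 110, 120, 130, 140]  # hypothetical capacities
    print(simulate_cascade(routers, 450))  # load fits: all routers stay up
    print(simulate_cascade(routers, 520))  # slightly more load: all go down
```

In this toy model a load of 450 is handled comfortably, while a load of 520 knocks out the weakest router first and then, round by round, every other router as well.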
To fix the immediate problem, Google put more request routers online.
To ensure it doesn’t happen again, the company said it is increasing request router capacity well beyond peak demand. It is also making sure that its request routers slow down rather than refuse traffic if many of them become overloaded in the future.
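The difference between refusing traffic and slowing down can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a description of how Google implemented the change: the `route` function, the policy names, and the idea that slowdown scales with load divided by capacity are all hypothetical.

```python
def route(load, capacity, policy):
    """Toy model of one request router under a given overload policy.

    policy "refuse":    an overloaded router turns all traffic away.
    policy "slow_down": an overloaded router keeps serving, but responses
                        take proportionally longer (assumed: load / capacity).
    """
    if load <= capacity:
        return {"served": load, "refused": 0, "slowdown": 1.0}
    if policy == "refuse":
        return {"served": 0, "refused": load, "slowdown": 1.0}
    if policy == "slow_down":
        return {"served": load, "refused": 0, "slowdown": load / capacity}
    raise ValueError(f"unknown policy: {policy}")


if __name__ == "__main__":
    print(route(150, 100, "refuse"))     # all 150 requests are turned away
    print(route(150, 100, "slow_down"))  # all served, about 1.5x slower
```

Under the "slow_down" policy in this sketch, users wait longer but their requests still go through, and no traffic is pushed onto neighboring routers.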
We were wondering what Gmail users think of the outage. Are people considering changing email services? Did the outage spark any doubts about adopting “cloud computing” for business?