Amazon cloud outage staggers into Day 2

Many sites recovering, but Web 2.0 sites Reddit and Quora still affected

Amazon.com is well into the second day of trying to fix a cloud outage that has partially disabled or knocked out popular Web sites like Quora, Foursquare and Reddit.

The trouble started a little after 5 a.m. ET Thursday when the company's Service Health Dashboard reported connectivity issues that were affecting its Relational Database Service, which is used to manage a relational database in the cloud, across multiple zones in the eastern U.S.

Because of server problems at Amazon's data center, which handles the company's EC2 Web hosting services, Web sites, including popular Web 2.0 sites, were left staggering or disabled.

As of noon ET Friday, these sites have been affected for about 30 hours.

Earlier today, at 5:41 a.m. ET, Amazon reported that its engineers were making progress. At 9:18 a.m. it noted, "We're starting to see more meaningful progress in restoring volumes (many have been restored in the last few hours) and expect this progress to continue over the next few hours."

That comes about 19 hours after Amazon reported Thursday afternoon that it was only a few hours away from having the problem solved.

Amazon updated users again at 11:49 a.m., saying that "many" customers have confirmed that their sites are recovering. "Our current estimate is that the majority of volumes will be recovered over the next five to six hours," the company reported.

Reddit reported at 10:30 a.m. that it was still running in emergency mode. Foursquare appeared to be up and running, while Quora was bouncing between read-only mode and failing to load at all, showing an "internal server error" message instead.

HootSuite was also having problems, reporting at one point that it was "back up" and then coming back with "again offline."

Ezra Gottheil, an analyst with Technology Business Research, said the outage is a big problem for the disabled Web sites, but it's an even bigger problem for Amazon.

"It's a pretty big hit. It's big and it's public," Gottheil added. "When you're doing business on the Web, you don't want to have your doors closed -- ever... It's tough for the sites. Most users will check again later but they'll lose a few."

For Amazon, though, regardless of its reputation as the top player in the cloud sector, the disruption will be hard for current customers to brush off and could hurt its ability to attract future customers.

"This will give the other cloud vendors, especially the higher-end ones, a talking point that won't go away for years," Gottheil said.

Sharon Gaudin covers the Internet and Web 2.0, emerging technologies, and desktop and laptop chips for Computerworld. Follow Sharon on Twitter at @sgaudin or subscribe to Sharon's RSS feed. Her e-mail address is sgaudin@computerworld.com.

