Tag Archives: Architecture

MySpace's Architecture, and Lessons in Large Website Architecture

As large Chinese IT enterprises speed up their informatization, the data volumes and traffic of most applications are growing sharply. Large corporate websites face mounting pressure on performance and data access, along with higher demands on storage, security, information retrieval, and more...
In this article I want to work through several success stories of large IT companies and websites abroad and explore, from a web engineer's perspective, how to meet head-on the scaling pressures (mainly on the technical side; management and marketing are largely left out) that large Chinese websites are about to face.
1. How Large IT Websites Abroad Succeeded

(1) MySpace

Today MySpace is acclaimed everywhere as the king of community websites. First-rate marketing and management are of course the primary ingredients of any IT company's success, but in this section we set them aside and look at how MySpace responded technically at each critical moment when the system had to expand.

First-generation architecture: add more web servers

MySpace started small, with two web servers (sharing the work of handling user requests) and one database server (all data stored in that one place), running on dual-CPU Dell machines with 4 GB of memory. In the early period MySpace coped with the explosion in users mainly by adding more web servers. But by early 2004, once MySpace had grown to half a million users, the database server was already running itself ragged.
Second-generation architecture: add database servers

Unlike adding web servers, adding databases is not so simple. If a site is backed by multiple databases, its designers must work out how to spread the load across them while keeping the data consistent.
MySpace ran on three SQL Server database servers: one primary, to which all new data was committed and which replicated it to the other two; the other two gave their all to serving data to users, for display on blog and profile pages. For a while this worked well: to keep up with growth in users and traffic, just add database servers and bigger disks.
This round of database architecture followed a vertical partitioning design, with separate databases serving different site functions such as login, user profiles, and blogs. Vertical partitioning helped multiple databases share the access load, and when users asked for new features, MySpace only had to bring new databases online to support them. Once accounts reached two million, MySpace also switched from storage directly attached to the database servers to a SAN (storage area network), in which a high-bandwidth, purpose-built network ties a mass of disk storage devices together and the databases connect to the SAN. This measure greatly improved performance, uptime, and reliability. When users climbed past three million, however, the vertical partitioning strategy became hard to sustain.
Third-generation architecture: move to distributed computing

After much back and forth, MySpace finally set its sights on a distributed computing architecture: many physically distributed servers that together must behave, logically, as a single machine. For the database this meant the application could no longer be split up and backed by separate databases as before; the whole site had to be treated as one application. Now the database model has a single user table, and the data behind blogs, profiles, and the other core features all lives in the same database.
With all the core data logically organized into one database, MySpace had to find a new way to share the load; a single database server on commodity hardware was plainly not up to it. This time, instead of splitting databases by site function or application, MySpace began partitioning its users into groups of one million and storing each group's complete data in its own SQL Server instance. Today each of MySpace's database servers actually runs two SQL Server instances, so each server handles roughly two million users. MySpace's engineers say the same pattern can later be applied at a finer granularity to balance the load even better.
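As a rough illustration of this partitioning scheme (a sketch only; the bucket size and connection strings are assumptions, not MySpace's actual code), routing a request to the right SQL Server instance reduces to dividing the user ID by the partition size:

```csharp
using System;

// Sketch of range-based user partitioning, assuming 1,000,000 users
// per SQL Server instance as described above.
public static class UserShardRouter
{
    private const int UsersPerShard = 1_000_000;

    // Hypothetical connection strings, one per SQL Server instance
    // (two instances per physical server, as in the article).
    private static readonly string[] Shards =
    {
        "Server=db01\\A;Database=Users0;Integrated Security=true",
        "Server=db01\\B;Database=Users1;Integrated Security=true",
        "Server=db02\\A;Database=Users2;Integrated Security=true",
    };

    public static string ConnectionStringFor(long userId)
    {
        int shard = (int)(userId / UsersPerShard);  // users 0..999,999 -> shard 0, etc.
        if (shard >= Shards.Length)
            throw new InvalidOperationException("No shard provisioned for this user range.");
        return Shards[shard];
    }
}
```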
Fourth-generation architecture: turn to Microsoft's platform

In early 2005, with nine million accounts, MySpace began writing ASP.NET programs in Microsoft's C#. Once the first results proved out, MySpace migrated to ASP.NET on a large scale. At ten million accounts MySpace ran into a storage bottleneck again. The SAN had fixed some of the earlier performance problems, but the site's demands had begun to exceed the SAN's I/O capacity periodically, that is, the top speed at which it could read and write data to its disk storage.
Fifth-generation architecture: add a data caching layer and move to SQL Server 2005 on 64-bit processors

In the spring of 2005, with seventeen million accounts, MySpace adopted a new strategy to ease the pressure on its storage systems: a data caching layer sitting between the web servers and the database servers. Its sole job is to keep copies of frequently requested data objects in memory, so it can feed data to the web application without touching the database.
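A minimal cache-aside sketch of such a layer (illustrative only; the type and the database loader are stand-ins, not MySpace's code): the web tier asks the in-memory cache first and touches the database only on a miss.

```csharp
using System;
using System.Collections.Concurrent;

// Minimal in-memory cache-aside layer; loadFromDatabase stands in for
// the real data-access call.
public class DataCache<TKey, TValue> where TKey : notnull
{
    private readonly ConcurrentDictionary<TKey, (TValue Value, DateTime Expires)> _cache = new();
    private readonly TimeSpan _ttl;

    public DataCache(TimeSpan ttl) => _ttl = ttl;

    public TValue Get(TKey key, Func<TKey, TValue> loadFromDatabase)
    {
        if (_cache.TryGetValue(key, out var entry) && entry.Expires > DateTime.UtcNow)
            return entry.Value;                       // served from memory, no DB hit

        TValue value = loadFromDatabase(key);         // cache miss: go to the database
        _cache[key] = (value, DateTime.UtcNow + _ttl);
        return value;
    }
}

// Usage: var profiles = new DataCache<long, string>(TimeSpan.FromMinutes(5));
//        string html = profiles.Get(userId, LoadProfileFromDatabase);
```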
In mid-2005, when the service hit twenty-six million accounts, its hunger for memory pushed MySpace onto SQL Server 2005, then still in beta, for its 64-bit processor support. After upgrading to SQL Server 2005 and 64-bit Windows Server 2003, MySpace fitted each server with 32 GB of memory, and in 2006 raised the standard configuration again, to 64 GB.
Even so, MySpace's web servers and databases still get overloaded regularly; users keep running into notices such as "unexpected error" and "site down for maintenance," and they complain endlessly on the forums...
It is exactly through this continual reworking of its site software, databases, and storage that MySpace got to where it is today. MySpace has in fact solved many scalability problems along the way, and its experience holds plenty worth borrowing. Its architecture has stayed relatively stable so far, but its engineers keep chipping away at fronts such as the number of simultaneous connections SQL Server can support, doing everything as well as it can be done.
(2) Amazon

The Amazon bookstore is without question a milestone of e-commerce. From 2000 to now the online world has been through bloody turmoil, and Amazon was once the number-one emblem of the Internet bubble. Today that "biggest bubble" has used its steadily reworked numbers to turn itself into a solid IT giant.
Looking across Amazon's history, its success lies in having creatively explored every link of the e-commerce chain: building the platform, writing the software, setting up the site, the fulfillment system, and more. In the words of Amazon's boss Jeff Bezos, "For a store in the physical world the most powerful weapon is location, location, location; for us the three most important things are technology, technology, technology."
(3) eBay

eBay is the world-famous auction site. Kevin Pursglove, head of communications at eBay, holds that "the most important reason for eBay's success is the company's management and service." The secrets of that success can be listed as follows: ① Dare to go first: eBay entered online auctions back when the Internet was not yet widespread. ② The "zero inventory" peculiar to a virtual marketplace is another big reason eBay succeeded. Its core business carries no inventory risk at all: every item is supplied by customers, and eBay only has to provide the virtual auction platform, the network and the software, so "inventory costs" and "warehousing costs" never show up in its books. ③ From its founding, eBay has stuck to two "golden rules": build a virtual community that makes users feel at home, and keep the site running stably and securely.
2. Some Suggestions for Building Large Chinese Websites

From this section on, we combine the painful lessons and the successes of large IT sites at home and abroad in scaling their technology, discuss how, in the Web 2.0 era now just beginning, to handle the growth (even explosive growth) in data access that Chinese websites are about to face, and offer some strategies and suggestions for reference.
(4) Build a sound system architecture

A large commercial website can never be thrown together the way an ordinary small site can; it needs careful planning from a rigorous software-engineering standpoint and methodical, step-by-step development. The technology a large site touches is extremely broad, with high demands everywhere from hardware to software, programming languages, databases, web servers, and firewalls; it is nothing like the simple static HTML sites of the past. Take the famous Yahoo!: every one of its large site projects involves many specialists in the relevant fields.
(5) Generate static pages

Do not underestimate plain static HTML pages! In many situations HTML means "highest efficiency, lowest cost," so wherever possible the site's pages should be served statically. A site with a lot of frequently updated content cannot do this all by hand, so the answer is to build automated publishing tools, that is, the familiar CMS (content management system). The news channels of the portals we visit every day, and often their other channels too, are managed and produced through such publishing systems. At its simplest, a publishing system turns entered content into static pages automatically; it can also offer channel management, permission management, automatic content capture, and more. For a large website, an efficient, manageable CMS is indispensable.
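As a tiny sketch of the idea (the template and the output path are invented for illustration), "publishing" an article renders it once into a static .html file that the web server then serves with no per-request work:

```csharp
using System;
using System.IO;

// Minimal static-publishing sketch: render an article into a static
// HTML file once, at publish time, instead of on every request.
public static class StaticPublisher
{
    public static void Publish(int articleId, string title, string body)
    {
        string html = $"<html><head><title>{title}</title></head>" +
                      $"<body><h1>{title}</h1><div>{body}</div></body></html>";
        // Hypothetical web root; the web server serves this file directly.
        File.WriteAllText(Path.Combine("/var/www/news", $"{articleId}.html"), html);
    }
}
```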
(6) Storage

Storage is another big problem, and it comes in two kinds: storing small files, such as images, and storing large files, such as a search engine's indexes. As everyone knows, whichever web server you run, Apache, IIS, or another container, images are the most resource-hungry content, so it pays to separate images from pages. Essentially every large site follows this strategy: they run dedicated image servers, often many of them. Such an architecture lowers the load on the servers answering page requests and keeps an image problem from bringing the system down; the application servers and image servers can also be tuned differently, for lower overhead and higher throughput on each.
(7) Database technology: clustering, and database/table hashing

For a large website a serious database server is a given. But under heavy traffic the database bottleneck still shows itself, and one database soon cannot satisfy the application; then we must lean on database clustering or on database/table hashing.
On the clustering side, most database vendors ship their own solutions; Oracle, Sybase, and SQL Server all have good ones, and the Master/Slave replication offered by the ubiquitous MySQL is a scheme of the same kind. Whatever database you use, just follow the corresponding vendor's solution.
Because the database clusters above are constrained in architecture, cost, and scalability by the database product chosen, we should also improve the architecture from the application side, where database/table hashing is the most common and most effective approach. We separate the database along business, application, or functional-module lines, mapping different modules to different databases or tables, and then hash a given page or feature into still smaller pieces by some rule; the user table, for example, can be split into tables hashed by user ID. This raises performance at low cost and scales well. A ready-made example is Sohu: its forum separates users, settings, posts, and so on into different databases, then hashes posts and users across databases and tables by board and ID; in the end a simple entry in a configuration file lets the system take on another cheap database server at any time to add capacity.
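A minimal sketch of this kind of ID-based table hashing (the table-naming scheme and shard count are assumptions for illustration):

```csharp
using System;

// Minimal table-hashing sketch: map a user ID onto one of N user tables,
// e.g. user_00 .. user_07. Adding capacity means raising TableCount and
// migrating data, or switching to consistent hashing instead of plain modulo.
public static class TableHash
{
    private const int TableCount = 8;   // assumed shard count

    public static string UserTableFor(long userId)
        => $"user_{userId % TableCount:D2}";
}

// Usage: var table = TableHash.UserTableFor(123456);  // "user_00"
```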
(8) Caching strategy

This emphatically does not just mean low-level cache programming. Start from the architecture as a whole and dig into the caching strategies of every layer of the web servers and database servers; only then come down to programming against the low-level caching APIs. Each web server, database server, and web language has its own caching story. On the database side there are SQL Server 2005's proactive caching, Oracle's Cache Group technology, and Hibernate's caching, which covers both the Session cache and the SessionFactory cache. On the web-server side, Apache provides its own cache module and can also be fronted by the external Squid for caching, and either approach effectively improves Apache's responsiveness; IIS has its caching facilities too. Web languages differ even more in the caching they offer: ASP.NET 2.0, for instance, proposes two strategies, caching application data and caching rendered page output, which are independent of each other but not mutually exclusive; PHP has PEAR's Cache module; and so on.
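To make the two ASP.NET strategies concrete, here is a hedged sketch using the classic System.Web APIs (the key names and durations are arbitrary illustrations): output caching is declared on the page or action, while application-data caching stores individual objects with their own expirations.

```csharp
// Strategy 1: page output caching -- the whole rendered page is cached.
// In a Web Forms .aspx page:
//   <%@ OutputCache Duration="60" VaryByParam="None" %>
// In ASP.NET MVC, the equivalent attribute on an action:
//   [OutputCache(Duration = 60, VaryByParam = "none")]

// Strategy 2: application data caching -- individual objects are cached.
using System;
using System.Web;
using System.Web.Caching;

public static class SidebarData
{
    public static string[] TopTags(Func<string[]> loadFromDb)
    {
        var cached = (string[])HttpRuntime.Cache["top-tags"];
        if (cached != null) return cached;

        var fresh = loadFromDb();
        HttpRuntime.Cache.Insert("top-tags", fresh, null,
            DateTime.UtcNow.AddMinutes(10), Cache.NoSlidingExpiration);
        return fresh;
    }
}
```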
(9) Mirroring

Mirroring is a way large sites commonly improve performance and data safety; it smooths out the differences in access speed caused by different network carriers and regions. We will not go deep into mirroring technique here; there are many professional, ready-made architectures and products to choose from, and there are also cheap software routes, such as rsync and similar tools on Linux.
(10) Load balancing

Load balancing will be the ultimate means by which a large site handles high load and masses of concurrent requests. Load-balancing technology has been developing for years, and there are many professional service providers and products to pick from; on a LAMP stack, Lighttpd plus Squid is a rather good way to build out load balancing and acceleration.
(11) Hardware layer-4 switching

Layer-4 switching uses header information from layer-3 and layer-4 packets to recognize traffic flows by application session, and it dispatches a whole session's traffic to an appropriate application server. The layer-4 switch acts like a virtual IP pointing at the physical servers. The traffic it carries can follow many protocols: HTTP, FTP, NFS, Telnet, and others. Balancing that traffic across the physical servers takes sophisticated load-balancing algorithms. In the IP world the service type is decided by the destination TCP or UDP port, while in layer-4 switching the application session is identified by the combination of source and destination IP addresses and TCP/UDP ports.
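The core dispatch idea can be sketched in a few lines (an illustration only; a real layer-4 switch adds connection tracking, health checks, and weighting, and HashCode.Combine assumes modern .NET):

```csharp
using System;
using System.Net;

// Minimal layer-4 flow-hashing sketch: the (src IP, src port, dst IP, dst port)
// tuple identifies a session, and hashing it keeps every packet of that
// session pinned to the same backend server.
public static class L4Balancer
{
    private static readonly IPAddress[] Backends =
    {
        IPAddress.Parse("10.0.0.11"),
        IPAddress.Parse("10.0.0.12"),
        IPAddress.Parse("10.0.0.13"),
    };

    public static IPAddress PickBackend(IPAddress srcIp, int srcPort,
                                        IPAddress dstIp, int dstPort)
    {
        int hash = HashCode.Combine(srcIp.ToString(), srcPort, dstIp.ToString(), dstPort);
        return Backends[(int)((uint)hash % (uint)Backends.Length)];
    }
}
```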
Among hardware layer-4 switching products there are some well-known choices, such as Alteon and F5. They are expensive, but you get what you pay for: excellent performance and very flexible management. When Yahoo China ran close to 2,000 servers, three or four Alteons were enough to handle them.
(12) Software layer-4 switching

Once you know how a hardware layer-4 switch works, software layer-4 switching built against the OSI model follows naturally. The principle of the solution is the same, though its performance is somewhat lower; still, it handles a fair amount of load with ease.
A typical load-balancing setup is to build a Squid cluster on top of software or hardware layer-4 switching. Many large sites, search engines included, take this approach; it is low-cost, high-performance, and highly extensible, and adding or removing nodes in the architecture at any time is very easy.
(13) Paying for software

Reportedly, apart from some listed companies and the especially large and famous ones, few Chinese companies budget for licensed software in their costs. That mindset may well bring a nightmare down on the Chinese Internet. A company that genuinely has trouble paying for software can perfectly well adopt the open-source LAMP stack (Linux + Apache + MySQL + Perl, PHP, or Python as the web programming language); otherwise, as China's WTO commitments widen, crackdowns on piracy are certain to tighten, and "muddling through" is bound to end badly.
Moreover, as network bandwidth keeps growing, Web 2.0 technology is bound to reach nearly every corner of the online world. How to gather technical staff for hard engineering problems, and to further strengthen security, is therefore an ever more serious question, and it should be put on the company's agenda early.

3. Summary

A mark of truly rational growth in Chinese e-commerce is large numbers of traditional enterprises genuinely starting to do business over the Internet, and that wave has now begun. The online virtual bookstore launched by the Beijing Distribution Group together with SINA, 6688.com, and others is exactly such a sign.
As bandwidth keeps growing, and as networked thinking and Web 2.0 technology sink deeper in, B2B, B2C, C2C, and other e-commerce models will very likely be woven together into large commercial websites. So for a company's engineers, the "white knights" called in when things go wrong, working out how to handle massive storage, massive traffic, massive information retrieval, and ever more serious security problems brooks no delay.

Upgrading and Evolving a Database Architecture

SQL Server 2008's technology for high data security, high performance, and high availability is by now fairly mature. These techniques and schemes were upgraded and evolved step by step as many companies' business and data-access pressure grew, and they have withstood tests from every direction, proving themselves mature and reliable. Below is some analysis of these technical schemes and of how they evolve.

Stage 1:

Running bare:

Pros: the biggest virtues of running bare are simplicity and low cost.

Cons: if the server has a problem, recovery is fairly painful; and if access pressure grows, the server may not bear the load.

 

Stage 2:

Single database + Mirror + Backup:

Notes: mirroring comes in two modes, synchronous and asynchronous. Synchronous mode guarantees that the principal and the mirror stay consistent, and it does not require Enterprise Edition, but its performance impact on the principal is fairly large. Asynchronous mode is supported only in Enterprise Edition; it keeps the data consistent the vast majority of the time, though a small amount of data can be lost, and its impact on the principal is small.

 

Pros: this scheme gives the principal's data reliable protection. If the principal has a problem, the mirror can come up in fairly short order, which matters most when the database is large (restoring from backup would take very long); the business is back in use quickly. The mirror can also produce database snapshots, which can serve workloads that do not need fully up-to-date data.

 

Cons: mirroring costs the principal some performance (less in asynchronous mode). After the principal fails, the front end has to switch the address it accesses (or the mirror server must take over the principal's IP address), and logins, permissions, jobs, and similar settings still have to be migrated across.

 

Single database + Replication + Backup:

Pros: the replication subscriber can be exposed to the front end, so reads can be moved to the subscriber to take part of the load off the principal; it can also serve as a form of database backup, though a backup of this kind may well lose some data.

 

Cons: it does not provide safe protection of the data, and it has some performance impact on the principal.
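A minimal sketch of the read/write split this scheme enables (the connection strings and routing rule are assumptions; production code would also allow for replication lag):

```csharp
using System;

// Minimal read/write-splitting sketch: writes go to the principal,
// reads go to the replication subscriber. Because replication is
// asynchronous, reads may briefly see stale data.
public static class ConnectionRouter
{
    private const string Principal  = "Server=sql-primary;Database=App;Integrated Security=true";
    private const string Subscriber = "Server=sql-replica;Database=App;Integrated Security=true";

    public static string For(bool isWrite) => isWrite ? Principal : Subscriber;
}

// Usage: open writes against ConnectionRouter.For(isWrite: true),
// report queries against ConnectionRouter.For(isWrite: false).
```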

 

Stage 3:

Single database + Replication + Mirror + Backup:

Pros: this scheme combines the previous two, solving data protection while also providing read/write separation.

Cons: with both Mirror and Replication on the principal, the impact on it is fairly heavy; and practice has shown that when Mirror and Replication are deployed on the same machine, a problem in one tends to affect the other.

 

Stage 4:

Cluster (active/active) + Backup:

Notes: the rectangles in the diagram represent storage; the two servers form an active/active cluster.

 

Pros: when one server in the cluster has a problem, the cluster moves all its data and services over to the other machine; the switchover is quick, so business access is restored promptly.

Cons: an active/active cluster generally demands well-specified, and therefore expensive, hardware. Because the data all lives on the shared storage, the cluster does not protect the data itself: once the data or the storage has a problem, it must be restored from backup. And SQL Server clustering provides no load balancing.

 

Stage 5:

Cluster (active/active) + Mirror + Backup:

 

Notes: the active/active cluster, plus mirror protection of the databases on two further servers.

 

Pros: this scheme protects the data reliably. Whether a server fails or the storage fails, the data stays safe, and recovery time is fairly short.

Cons: mirroring consumes part of the principal servers' performance; the two extra mirror machines add cost; and if the storage fails, the quick-recovery path is to bring the mirror machines into service, after which the cluster may have to be rebuilt.

 

Stage 6:

Cluster (active/active) + Mirror + Backup + Replication + single distributor:

 

Notes: active/active cluster, mirror protection, a single distribution server, and read/write separation.

 

Pros: the cluster and the mirror protect the data thoroughly, and read/write separation raises the performance of the system as a whole.

Cons: the cost is higher, and the single distributor is a single point of failure. If the distribution machine fails it has to be rebuilt, and in the meantime reads and writes both fall back on the principal, which comes under heavy pressure.

 

Cluster (active/active) + Mirror + Backup + Replication + dual distributors:

 

 

Pros: compared with a single distributor there is no single point of failure; even if one distribution machine fails, the read/write-separation machinery keeps running.

Cons: higher cost, and more complicated maintenance.

 

Stage 7:

Cluster (active/active) + dual storage + Backup + Replication + dual distributors:

 

Pros: the dual-storage scheme protects the data effectively while avoiding the hit the principal takes when Mirror and Replication run on it at the same time, sparing the principal server's resources; recovery is also fairly convenient.

Cons: higher cost.

 

Stage 8:

Cluster (active/active) + dual storage + Backup + Replication + dual distributors + asynchronous SSB:

 

The main strength of this scheme is handling the data flow asynchronously (SSB is SQL Server Service Broker), cushioning the principal against momentary traffic spikes. Because the scheme is rather complex it is not described here; refer to the database architecture documentation.
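Service Broker itself is configured in T-SQL, but the idea it implements, absorbing bursts by queueing writes and draining them asynchronously, can be sketched with a toy in-process queue (a stand-in for illustration, not Service Broker's actual API):

```csharp
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Toy write-behind queue illustrating the asynchronous-data-flow idea:
// bursts are absorbed by the queue instead of hitting the database directly.
public class WriteBehindQueue
{
    private readonly BlockingCollection<string> _pending = new();

    public void Enqueue(string command) => _pending.Add(command); // returns immediately

    public Task StartConsumer(CancellationToken token) => Task.Run(() =>
    {
        foreach (var command in _pending.GetConsumingEnumerable(token))
        {
            // Hypothetical stand-in for the real database write.
            ExecuteAgainstDatabase(command);
        }
    }, token);

    private static void ExecuteAgainstDatabase(string command) { /* ... */ }
}
```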

 

Stage 9:

Split the business and its data apart, adopt distributed databases, use databases with load-balanced clustering, and so on.

 

This document has roughly traced the stages a database architecture passes through as a company grows and server pressure mounts. Naturally, each company should apply these techniques selectively, according to its own situation; it may even skip stages and go straight to a more capable scheme (if the cost is acceptable). The choice of technique and scheme should fit the facts and stay flexible.

From: http://www.cnblogs.com/fygh/archive/2012/03/23/2413164.html

Stack Overflow Architecture Update – Now at 95 Million Page Views a Month

A lot has happened since my first article on the Stack Overflow Architecture. Contrary to the theme of that last article, which lavished attention on Stack Overflow’s dedication to a scale-up strategy, Stack Overflow has both grown up and out in the last few years.

Stack Overflow has grown up by more than doubling in size to over 16 million users and multiplying its number of page views nearly 6 times to 95 million page views a month.

Stack Overflow has grown out by expanding into the Stack Exchange Network, which includes Stack Overflow, Server Fault, and Super User for a grand total of 43 different sites. That’s a lot of fruitful multiplying going on.

What hasn’t changed is Stack Overflow’s openness about what they are doing. And that’s what prompted this update. A recent series of posts talks a lot about how they’ve been handling their growth: Stack Exchange’s Architecture in Bullet Points, Stack Overflow’s New York Data Center, Designing For Scalability of Management and Fault Tolerance, Stack Overflow Search — Now 81% Less, Stack Overflow Network Configuration, Does StackOverflow use caching and if so, how?, Which tools and technologies build the Stack Exchange Network?.

Some of the more obvious differences across time are:

  • Just More. More users, more page views, more datacenters, more sites, more developers, more operating systems, more databases, more machines. Just a lot more of more.
  • Linux. Stack Overflow was known for their Windows stack, now they are using a lot more Linux machines for HAProxy, Redis, Bacula, Nagios, logs, and routers. All support functions seem to be handled by Linux, which has required the development of parallel release processes.
  • Fault Tolerance. Stack Overflow is now being served by two different switches on two different internet connections, they’ve added redundant machines, and some functions have moved to a second datacenter.
  • NoSQL. Redis is now used as a caching layer for the entire network. There wasn't a separate caching tier before, so this is a big change, as is using a NoSQL database on Linux.

Unfortunately, I couldn't find any coverage on some of the open questions I had last time, like how they were going to deal with multi-tenancy across so many different properties, but there's still plenty to learn from. Here's a roll-up of a few different sources:

The Stats

  • 95 Million Page Views a Month
  • 800 HTTP requests a second
  • 180 DNS requests a second
  • 55 Megabits per second
  • 16 Million Users – Traffic to Stack Overflow grew 131% in 2010, to 16.6 million global monthly uniques.

Data Centers

  • 1 Rack with Peak Internet in OR (Hosts our chat and Data Explorer)
  • 2 Racks with Peer 1 in NY (Hosts the rest of the Stack Exchange Network)

Hardware

  • 10 Dell R610 IIS web servers (3 dedicated to Stack Overflow):
    • 1x Intel Xeon Processor E5640 @ 2.66 GHz Quad Core with 8 threads
    • 16 GB RAM
    • Windows Server 2008 R2
  • 2 Dell R710 database servers:
    • 2x Intel Xeon Processor X5680 @ 3.33 GHz
    • 64 GB RAM
    • 8 spindles
    • SQL Server 2008 R2
  • 2 Dell R610  HAProxy servers:
    • 1x Intel Xeon Processor E5640 @ 2.66 GHz
    • 4 GB RAM
    • Ubuntu Server
  • 2 Dell R610 Redis servers:
    • 2x Intel Xeon Processor E5640 @ 2.66 GHz
    • 16 GB RAM
    • CentOS
  • 1 Dell R610 Linux backup server running Bacula:
    • 1x Intel Xeon Processor E5640 @ 2.66 GHz
    • 32 GB RAM
  • 1 Dell R610 Linux management server for Nagios and logs:
    • 1x Intel Xeon Processor E5640 @ 2.66 GHz
    • 32 GB RAM
  • 2 Dell R610 VMWare ESXi domain controllers:
    • 1x Intel Xeon Processor E5640 @ 2.66 GHz
    • 16 GB RAM
  • 2 Linux routers
  • 5 Dell Power Connect switches

Dev Tools

  • C#: Language
  • Visual Studio 2010 Team Suite: IDE
  • Microsoft ASP.NET (version 4.0): Framework
  • ASP.NET MVC 3: Web Framework
  • Razor: View Engine
  • jQuery 1.4.2: Browser Framework
  • LINQ to SQL, some raw SQL: Data Access Layer
  • Mercurial and Kiln: Source Control
  • Beyond Compare 3: Compare Tool

Software and Technologies Used

  • Stack Overflow uses a WISC stack via BizSpark
  • Windows Server 2008 R2 x64: Operating System
  • SQL Server 2008 R2 running Microsoft Windows Server 2008 Enterprise Edition x64: Database
  • Ubuntu Server
  • CentOS
  • IIS 7.0: Web Server
  • HAProxy: for load balancing
  • Redis: used as the distributed caching layer.
  • CruiseControl.NET: for builds and automated deployment
  • Lucene.NET:  for search
  • Bacula: for backups
  • Nagios: (with n2rrd and drraw plugins) for monitoring
  • Splunk: for logs
  • SQL Monitor: from Red Gate – for SQL Server monitoring
  • Bind: for DNS
  • Rovio:  a little robot (a real robot) allowing remote developers to visit the office “virtually.”
  • Pingdom:  an external monitor and alert service.

External Bits

Code that is not included as part of the development tools:

  • reCAPTCHA
  • DotNetOpenId
  • WMD – Now developed as open source. See github network graph
  • Prettify
  • Google Analytics
  • Cruise Control .NET
  • HAProxy
  • Cacti
  • MarkdownSharp
  • Flot
  • Nginx
  • Kiln
  • CDN: none, all static content is served off sstatic.net, a fast, cookieless domain for static content delivered to the Stack Exchange family of websites.

Developers and System Administrators

  • 14 Developers
  • 2 System Administrators

Content

  • License: Creative Commons Attribution-Share Alike 2.5 Generic
  • Standards: OpenSearch, Atom
  • Host: PEAK Internet

More Architecture and Lessons Learned

  • HAProxy is used instead of Windows NLB because HAProxy is cheap, easy, free, and works great as a 512MB VM “device” on a network via Hyper-V. It also sits in front of the boxes, so it's completely transparent to them and easier to troubleshoot as a separate networking layer rather than being intermixed with all your Windows configuration.
  • A CDN is not used because even “cheap” CDNs like Amazon's are very expensive relative to the bandwidth bundled into their existing host's plan. The least they could pay is $1k/month based on Amazon's CDN rates and their bandwidth usage.
  • Backup is to disk for fast retrieval and to tape for historical archiving.
  • Full Text Search in SQL Server is very badly integrated, buggy, deeply incompetent, so they went to Lucene.
  • Mostly interested in peak HTTP request figures as this is what they need to make sure they can handle.
  • All properties now run on the same Stack Exchange platform. That means Stack Overflow, Super User, Server Fault, Meta, WebApps, and Meta Web Apps are all running on the same software.
  • There are separate StackExchange sites because people have different sets of expertise that shouldn’t cross over to different topic sites. You can be the greatest chef in the world, but that doesn’t qualify you for fixing a server.
  • They aggressively cache everything.
  • All pages accessed by (and subsequently served to) anonymous users are cached via Output Caching.
  • Each site has 3 distinct caches: local, site, global.
  • local cache: can only be accessed from 1 server/site pair
    • To limit network latency they use a local “L1” cache, basically HttpRuntime.Cache, of recently set/read values on a server. This would reduce the cache lookup overhead to 0 bytes on the network.
    • Contains things like user sessions, and pending view count updates.
    • This resides purely in memory, no network or DB access.
  • site cache:  can be accessed by any instance (on any server) of a single site
    • Most cached values go here, things like hot question id lists and user acceptance rates are good examples
    • This resides in Redis (in a distinct DB, purely for easier debugging)
    • Redis is so fast that the slowest part of a cache lookup is the time spent reading and writing bytes to the network.
    • Values are compressed before sending them to Redis. They have plenty of CPU and most of their data are strings so they get a great compression ratio.
    • The CPU usage on their Redis machines is 0%.
  • global cache: which is shared amongst all sites and servers
    • Inboxes, API usage quotas, and a few other truly global things live here
    • This resides in Redis (in DB 0, likewise for easier debugging)
  • Most items in the cache expire after a timeout period (a few minutes usually) and are never explicitly removed. When a specific cache invalidation is required they use Redis messaging to publish removal notices to the “L1” caches (see the sketch just after this list).
  • Joel Spolsky is not a Microsoft loyalist, he doesn't make the technical decisions for Stack Overflow, and he considers Microsoft licensing a rounding error. Consider yourself corrected, Hacker News commenter.
  • For their IO system they selected a RAID 10 array of Intel X25 solid state drives. The RAID array eased any concerns about reliability, and the SSDs performed really well in comparison to FusionIO at a much cheaper price.
  • The full boat cost for their Microsoft licenses would be approximately $242K. Since Stack Overflow is using BizSpark they are not paying anywhere near the full sticker price, but that's the most they could pay.
  • Intel NICs are replacing Broadcom NICs on their primary production servers. This solved problems they were having with connectivity loss, packet loss, and corrupted ARP tables.
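As a sketch of the local-plus-Redis caching and pub/sub invalidation described in the list above (using the StackExchange.Redis client for illustration; the key and channel names are invented, and this is not Stack Overflow's actual code):

```csharp
using System;
using System.Collections.Concurrent;
using StackExchange.Redis;

// Sketch of an L1 (in-process) cache over a Redis site cache, with Redis
// pub/sub pushing invalidations to every web server's L1 copy.
public class TwoLevelCache
{
    private readonly ConcurrentDictionary<string, string> _l1 = new();
    private readonly IDatabase _redis;
    private readonly ISubscriber _sub;

    public TwoLevelCache(ConnectionMultiplexer mux)
    {
        _redis = mux.GetDatabase();
        _sub = mux.GetSubscriber();
        // Every server subscribes; a published key name evicts it from that server's L1.
        _sub.Subscribe("cache-invalidate", (_, key) => _l1.TryRemove(key.ToString(), out _));
    }

    public string Get(string key)
    {
        if (_l1.TryGetValue(key, out var local)) return local;  // L1 hit: zero network bytes
        string value = _redis.StringGet(key);                   // L2 hit: Redis site cache
        if (value != null) _l1[key] = value;
        return value;
    }

    public void Set(string key, string value, TimeSpan ttl)
    {
        _redis.StringSet(key, value, ttl);
        _sub.Publish("cache-invalidate", key);  // tell all L1 caches to drop stale copies
    }
}
```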

Related Articles

Stack Overflow Architecture

Update 2: Stack Overflow Architecture Update – Now At 95 Million Page Views A Month

Update: Startup – ASP.NET MVC, Cloud Scale & Deployment shows an interesting alternative approach for a Windows stack, using ServerPath/GoGrid for a dedicated database machine, elastic VMs for the front end, and a free load balancer.

Stack Overflow is a much loved programmer question and answer site written by two guys nobody has ever heard of before. Well, not exactly. The site was created by top programmer and blog stars Jeff Atwood and Joel Spolsky. In that sense Stack Overflow is like a celebrity owned restaurant, only it should be around for a while. Joel estimates 1/3 of all the programmers in the world have used the site, so they must be serving up something good.

I fell in deep like with Stack Overflow for purely selfish reasons, it helped me solve a few difficult problems that were jabbing my eyes out with pain. I also appreciate their no-apologies anthropologically based design philosophy. Use design to engineer in the behaviours you want to encourage and minimize the responses you want to discourage. It’s the conscious awareness of the mechanisms that creates such a satisfying synergy.
What is key about the Stack Overflow story for me is the strong case they make for scale-up as a viable solution for a certain, potentially large, class of problems. The publicity these days all goes to scaling out with NoSQL databases.
If you need to Google scale then you really have no choice but to go the NoSQL direction. But Stack Overflow is not Google and neither are most sites. When thinking about your design options keep Stack Overflow in mind. In this era of multi-core, large RAM machines and advances in parallel programming techniques, scale up is still a viable strategy and shouldn’t be tossed aside just because it’s not cool anymore. Maybe someday we’ll have the best of both worlds, but for now there’s a big painful choice to be made and that choice decides your fate.
Joel boasts that for 1/10 the hardware they have performance comparable to similarly sized sites. He wonders if these other sites have good programmers. Let's see how they did it and you be the judge.
Site: http://stackoverflow.com

The Stats

  • 16 million page views a month
  • 3 million unique visitors a month (Facebook reaches 77 million unique visitors a month)
  • 6 million visits a month
  • 86% of traffic comes from Google
  • 9 million active programmers in the world and 30% have used Stack Overflow.
  • Cheaper licensing was attained through Microsoft's BizSpark program. My impression is they pay about $11K for OS and SQL licensing.
  • Monetization strategy: unobtrusive ads, job placement ads, DevDays conferences, extending the software to target other related niches (Server Fault, Super User), developing StackExchange as a white-label and self-hosted version of Stack Overflow, and perhaps some sort of programmer rating system.

    Platform

  • Microsoft ASP.NET MVC
  • SQL Server 2008
  • C#
  • Visual Studio 2008 Team Suite
  • JQuery
  • LINQ to SQL
  • Subversion
  • Beyond Compare 3
  • VisualSVN 1.5
  • Web Tier – 2 x Lenovo ThinkServer RS110 1U – 4 cores, 2.83 Ghz, 12 MB L2 cache – 500 GB datacenter hard drives, mirrored – 8 GB RAM – 500 GB RAID 1 mirror array
  • Database Tier – 1 x Lenovo ThinkServer RD120 2U – 8 cores, 2.5 Ghz, 24 MB L2 cache – 48 GB RAM
  • A fourth server was added to run superuser.com. All together the servers also run Stack Overflow, Server Fault, and Super User.
  • QNAP TS-409U NAS for backups. Decided not to use a cloud solution because the bandwidth costs of transferring 5 GB of data per day becomes prohibitive.
  • Hosting at http://www.peakinternet.com/. Impressed with their detailed technical responses and reasonable hosting rates.
  • SQL Server’s full text search is used extensively for the site search and detecting if a question has already been asked. Lucene.net is considered an attractive alternative.

    Lessons Learned

    This is a mix of lessons taken from Jeff and Joel and comments from their posts.

  • If you’re comfortable managing servers then buy them. The two biggest problems with renting costs were: 1) the insane cost of memory and disk upgrades 2) the fact that they [hosting providers] really couldn’t manage anything.
  • Make larger one time up front investments to avoid recurring monthly costs which are more expensive in the long term.
  • Update all network drivers. Performance went from 2x slower to 2x faster.
  • Upgrading to 48GB RAM required upgrading to MS Enterprise edition.
  • Memory is incredibly cheap. Max it out for almost free performance. At Dell, for example, upgrading from 4G memory to 128G is $4378.
  • Stack Overflow copied a key part of the Wikipedia database design. This turned out to be a mistake which will need massive and painful database refactoring to fix. The refactorings will be to avoid excessive joins in a lot of key queries. This is the key lesson from giant multi-terabyte table schemas (like Google’s BigTable) which are completely join-free. This is significant because Stack Overflow’s database is almost completely in RAM and the joins still exact too high a cost.
  • CPU speed is surprisingly important to the database server. Going from 1.86 GHz, to 2.5 GHz, to 3.5 GHz CPUs causes an almost linear improvement in typical query times. The exception is queries which don’t fit in memory.
  • When renting hardware nobody pays list price for RAM upgrades unless you are on a month-to-month contract.
  • The bottleneck is the database 90% of the time.
  • At low server volume, the key cost driver is not rackspace, power, bandwidth, servers, or software; it is NETWORKING EQUIPMENT. You need a gigabit network between your DB and Web tiers. Between the cloud and your web server, you need firewall, routing, and VPN devices. The moment you add a second web server, you also need a load balancing appliance. The upfront cost of these devices can easily be 2x the cost of a handful of servers.
  • EC2 is for scaling horizontally, that is you can split up your work across many machines (a good idea if you want to be able to scale). It makes even more sense if you need to be able to scale on demand (add and remove machines as load increases / decreases).
  • Scaling out is only frictionless when you use open source software. Otherwise scaling up means paying less for licenses and a lot more for hardware, while scaling out means paying less for the hardware, and a whole lot more for licenses.
  • RAID-10 is awesome in a heavy read/write database workload.
  • Separate application and database duties so each can scale independently of the other. Databases scale up and the applications scale out.
  • Applications should keep state in the database so they scale horizontally by adding more servers.
  • The problem with a scale up strategy is a lack of redundancy. A cluster adds more reliability, but is very expensive when the individual machines are expensive.
  • Few applications can scale linearly with the number of processors. Locks will be taken, which serializes processing and ends up reducing the effectiveness of your Big Iron.
  • With larger form factors like 7U power and cooling become critical issues. Using something between 1U and 7U might be easier to make work in your data center.
  • As you add more and more database servers the SQL Server license costs can be outrageous. So by starting scale up and gradually going scale out with non-open source software you can be in a world of financial hurt.
    It’s true there’s not much about their architecture here. We know about their machines, their tool chain, and that they use a two-tier architecture where they access the database directly from the web server code. We don’t know how they implement tags, etc. If interested you’ll be able to glean some of this information from an explanation of their schema.

    Discussion

    As an architecture profile candidate Stack Overflow has earned two important HighScalability badges: the Microsoft Stack Badge and the Scale Up Badge. Both are controversial and interesting topics of discussion.

    Microsoft Stack Badge

    The Microsoft Stack Badge was earned because Stack Overflow uses the entire Microsoft Stack: OS, database, C#, Visual Studio, and ASP .NET. People are always interested in how MS compares to LAMP, but I don’t have many case studies to show them.
    Markus Frind of Plenty of Fish fame is often used as a Microsoft stack poster child, but since he explicitly uses as little of the stack as possible he’s not really a good example. Stack Overflow on the other hand is brash in proclaiming their love for MS, even when that love is occasionally spurned.
    It’s hard to separate out the Microsoft stack and the scale up approach because for licensing reasons they tend to go together. If you find yourself in the position of transitioning from scale up to scale out by adding dozens of cores, MS licensing will bite you.
    Licensing aside I personally find C#, Visual Studio, and .Net a very productive environment. C#/.Net is at least as good as Java/JVM. ASP .NET has always been a confusing mess to me. The knock against SQL Server is you have to pay for it and if that doesn’t bother you then it’s a solid choice. The Windows OS may not be as solid as other alternatives but it works well enough.
    So for a scale up solution a Microsoft stack works, especially if you are already Windows centric.

    Scale Up Badge

    This won’t be a reenactment of the scale out vs scale up vs rent vs buy wars. For a thorough discussion of these issues please take a look at  Scaling Up vs. Scaling Out and Server Hosting — Rent vs. Buy?. If you aren’t confused and if your head doesn’t hurt after reading all that then you haven’t properly understood the material 🙂
    The Scale Up Badge was awarded because Stack Overflow uses a scale up strategy to meet their scaling requirements. When they reach a limit they scale vertically by buying a bigger machine and adding more memory.
    Stack Overflow is in the sweet spot for scale up. It’s not too large, but with an Alexa ranking of 1,666 and 16 million page views a month it’s still a substantial site. Not Google scale, and probably will never have to be, but those are numbers many sites would be thrilled to have. Yet they aren’t uploading large amounts of media. They aren’t dealing with billions of tweets across complex social networks with millions of users. Their number of users is self limiting. And there are still directions they can take if they need to scale (caching, more web servers, faster disks, more denormalization, more memory, some partitioning, etc). All-in-all it’s a well done and very useful two-tier CRUD application.

    NoSQL is Hard

    So should Stack Overflow have scaled out instead of up, just in case?
    What some don’t realize is NoSQL is hard. Relational databases have many many faults, but they make a lot of common tasks simple while hiding both the cost and complexity. If you want to know how many black Prius cars are in inventory, for example, then that’s pretty easy to do.
    Not so with most NoSQL databases (I'll speak generally here; some NoSQL databases have more features than others). You would have to program a counter of black Prius cars yourself, up front, in code. There are no aggregate operators. You must maintain secondary indexes. There's no searching. There are no distributed queries across partitions. There's no Group By or Order By. There are no cursors for easy paging through result sets. Returning even 100 large records at a time may time out. There may be quotas that are very restrictive because they must limit the amount of IO for any one operation. Query languages may lack expressive power.
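    To make that concrete, here is a hedged sketch of what "program the counter yourself" means against a bare key-value store (the IKeyValueStore interface is a hypothetical stand-in, not any particular NoSQL product's API):

```csharp
// With SQL: SELECT COUNT(*) FROM cars WHERE color='black' AND model='Prius';
// With a bare key-value store you must maintain the count yourself on
// every write path, and cope with partial failure between the two writes.
public class CarInventory
{
    private readonly IKeyValueStore _store;   // hypothetical minimal KV interface

    public CarInventory(IKeyValueStore store) => _store = store;

    public void AddCar(string id, string color, string model)
    {
        _store.Put($"car:{id}", $"{color}:{model}");
        // Manual secondary "index": no aggregate operators, so keep a counter.
        // If we crash between these two writes, the count is silently wrong.
        _store.Increment($"count:{color}:{model}");
    }

    public long CountOf(string color, string model)
        => _store.GetCounter($"count:{color}:{model}");
}

public interface IKeyValueStore
{
    void Put(string key, string value);
    void Increment(string key);
    long GetCounter(string key);
}
```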
    The biggest problem of all is that transactions can not span arbitrary boundaries. There are no ACID guarantees beyond a single record or small entity group. Once you wrap your head around what this means for the programmer it’s not a pleasant prospect at all. References must be manually maintained. Relationships must be manually maintained. There are no cascading deletes that act correctly during a failure. Every copy of denormalized data must be manually tracked and updated taking into account the possibility of partial failures and externally visible inconsistency.
    All this functionality must be written manually by you in your code. While flexibility to write your own code is great in an OLAP/map-reduce situation, declarative approaches still cover a lot of ground and make for much less brittle code.
    What you gain is the ability to write huge quantities of data. What you lose is complacency. The programmer must be aware at all times that they are dealing with a system where it costs a lot to perform distributed operations and failure can occur at any time.
    All this may be the price of building a truly scalable and distributed system, but is this really the price you want to pay?

    The Multitenancy Problem

    With StackExchange Stack Overflow has gone into the multi-tenancy business. They are offering StackExchange either self-hosted or as a hosted white label application.
    It will be interesting to see if their architecture can scale to handle a large number of sites. Salesforce is the king of multitenancy, and although it's true they use Oracle as their database, they basically use very little of Oracle and have written their own table structure, indexing and query processor on top of Oracle, all in order to support multitenancy.
    Salesforce went extreme because supporting a lot of different customers is way more difficult than it seems, especially once you allow customization and support versioning.
    Clearly all customers can’t run in one server for security, customization, and scaling reasons.
    You may think just create a database for each customer, share a server for a certain number of customers, and then add more servers as needed. As long as a customer doesn’t need more than one server you are golden.
    This doesn’t seem to work well in practice. Oddly database managers aren’t optimized for adding or updating databases. Creating databases is a heavyweight operation and can degrade performance for existing customers as system locks are taken. Upgrade issues are also problematic. Adding columns locks tables which causes problems in high traffic situations. Adding new indexes can also take a very long time and degrade performance. Plus each customer will likely have specializations that makes upgrading even more complicated.
    To get around these problems Salesforce’s Craig Weissman, Chief Architect, created an innovative approach where tables are not created for each customer. All data from all customers is mapped into the same data table, including indexes. The schema for that table looks something like orgid, oid, value0, value1…value500. “orgid” is the organization ID and is how data is never mixed up. It’s a very wide and sparse table, which Oracle seems to handle well. Hundreds and hundreds of “tables” and custom fields are mapped into the data table.
    With this approach Salesforce has no option other than to build their own infrastructure to interpret what's in that table. Oracle is left to handle transactions, concurrency, and deadlock detection. The advantage is that, because there's an interpreting layer, handling versions and upgrades is relatively simple: the handling logic can be baked in. Strange but true.

    Related Articles

    This list includes a number of posts by Jeff as he chronicles their journey with Stack Overflow. Jeff is wonderful about being open about what they are doing and why. The comment threads are often tremendous. There’s a lot to learn.

  • Learning from StackOverflow.com by Joel Spolsky
  • Scaling Up vs. Scaling Out: Hidden Costs by Jeff Atwood
  • What Was Stack Overflow Built With?
  • New Stack Overflow Server Glamour Shots
  • New Stack Overflow Servers Ready
  • Server Hosting — Rent vs. Buy? – this is a very informative discussion of the pros and cons of renting vs buying.
  • Rent vs. Buy (or EC2 vs. building your own iron) by  Michael Friis
  • Oh, You Wanted “Awesome” Edition – We recently upgraded our database server to 48 GB of memory — because hardware is cheap, and programmers are expensive.
  • Our Backup Strategy – Inexpensive NAS
  • The Economics of Bandwidth
  • Understanding the StackOverflow Database Schema by  Brent Ozar
  • Server Speed Tests – new hardware 2x slower – it was the network.
  • ASP.NET MVC: A New Framework for Building Web Applications
  • Three key things to know about moving MySQL into the cloud by morgan
  • NoSQL Conference
  • Decline of the Enterprise Data Warehouse by Bradford Stephens
  • Webinar: Multitenant Magic – Under the Covers of the Force.com Data Architecture by Craig Weissman, Chief Architect, salesforce.com.
