栖息谷 - The Online Home for Managers

[Book Relay] [Original] The Datacenter as a Computer - Translation

Posted by trybestying on 2012-6-20 18:07:19 (last edited 2012-6-25 16:36)

Preface: For work, I recently needed to study material on cloud computing and datacenter construction, and found that most of it is in English with no Chinese translation available. So I decided to translate this book as I read it, for colleagues in the field to consult later; I will translate and post the chapters in sequence. I am not a professional translator and my English is limited, so please point out anything I have rendered poorly. Thank you.

The Datacenter as a Computer
An Introduction to the Design of
Warehouse-Scale Machines

Chapter 1
Introduction
The ARPANET is about to turn forty, and the World Wide Web is approaching its 20th anniversary. Yet the Internet technologies that were largely sparked by these two remarkable milestones continue to transform industries and our culture today and show no signs of slowing down. More recently the emergence of such popular Internet services as Web-based email, search and social networks plus the increased worldwide availability of high-speed connectivity have accelerated a trend toward server-side or "cloud" computing.
Increasingly, computing and storage are moving from PC-like clients to large Internet services. While early Internet services were mostly informational, today many Web applications offer services that previously resided in the client, including email, photo and video storage and office applications. The shift toward server-side computing is driven primarily not only by the need for user experience improvements, such as ease of management (no configuration or backups needed) and ubiquity of access (a browser is all you need), but also by the advantages it offers to vendors. Software as a service allows faster application development because it is simpler for software vendors to make changes and improvements. Instead of updating many millions of clients (with a myriad of peculiar hardware and software configurations), vendors need only coordinate improvements and fixes inside their datacenters and can restrict their hardware deployment to a few well-tested configurations. Moreover, datacenter economics allow many application services to run at a low cost per user. For example, servers may be shared among thousands of active users (and many more inactive ones), resulting in better utilization. Similarly, the computation itself may become cheaper in a shared service (e.g., an email attachment received by multiple users can be stored once rather than many times). Finally, servers and storage in a datacenter can be easier to manage than the desktop or laptop equivalent because they are under control of a single, knowledgeable entity.
Some workloads require so much computing capability that they are a more natural fit for a massive computing infrastructure than for client-side computing. Search services (Web, images, etc.) are a prime example of this class of workloads, but applications such as language translation can also run more effectively on large shared computing installations because of their reliance on massive-scale language models.
The trend toward server-side computing and the exploding popularity of Internet services has created a new class of computing systems that we have named warehouse-scale computers, or WSCs. The name is meant to call attention to the most distinguishing feature of these machines: the massive scale of their software infrastructure, data repositories, and hardware platform. This perspective is a departure from a view of the computing problem that implicitly assumes a model where one program runs in a single machine. In warehouse-scale computing, the program is an Internet service, which may consist of tens or more individual programs that interact to implement complex end-user services such as email, search, or maps. These programs might be implemented and maintained by different teams of engineers, perhaps even across organizational, geographic, and company boundaries (e.g., as is the case with mashups).
The computing platform required to run such large-scale services bears little resemblance to a pizza-box server or even the refrigerator-sized high-end multiprocessors that reigned in the last decade. The hardware for such a platform consists of thousands of individual computing nodes with their corresponding networking and storage subsystems, power distribution and conditioning equipment, and extensive cooling systems. The enclosure for these systems is in fact a building structure and often indistinguishable from a large warehouse.



OP (trybestying), posted 2012-6-26 08:40:12, replying to 班玛康乐 (2012-6-25 22:09: "Thanks for your hard work, OP"):
Haha, thanks for the support! I hope the translation turns out to be useful to everyone!
OP, posted 2012-6-21 13:00:34:

1.1 WAREHOUSE-SCALE COMPUTERS
Had scale been the only distinguishing feature of these systems, we might simply refer to them as datacenters. Datacenters are buildings where multiple servers and communication gear are co-located because of their common environmental requirements and physical security needs, and for ease of maintenance. In that sense, a WSC could be considered a type of datacenter. Traditional datacenters, however, typically host a large number of relatively small- or medium-sized applications, each running on a dedicated hardware infrastructure that is de-coupled and protected from other systems in the same facility. Those datacenters host hardware and software for multiple organizational units or even different companies. Different computing systems within such a datacenter often have little in common in terms of hardware, software, or maintenance infrastructure, and tend not to communicate with each other at all.
WSCs currently power the services offered by companies such as Google, Amazon, Yahoo, and Microsoft's online services division. They differ significantly from traditional datacenters: they belong to a single organization, use a relatively homogeneous hardware and system software platform, and share a common systems management layer. Often much of the application, middleware, and system software is built in-house compared to the predominance of third-party software running in conventional datacenters. Most importantly, WSCs run a smaller number of very large applications (or Internet services), and the common resource management infrastructure allows significant deployment flexibility. The requirements of homogeneity, single-organization control, and enhanced focus on cost efficiency motivate designers to take new approaches in constructing and operating these systems.
Internet services must achieve high availability, typically aiming for at least 99.99% uptime (about an hour of downtime per year). Achieving fault-free operation on a large collection of hardware and system software is hard and is made more difficult by the large number of servers involved. Although it might be theoretically possible to prevent hardware failures in a collection of 10,000 servers, it would surely be extremely expensive. Consequently, WSC workloads must be designed to gracefully tolerate large numbers of component faults with little or no impact on service level performance and availability.
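As a back-of-the-envelope check, the availability targets quoted above convert directly into annual downtime budgets; a minimal Python sketch:

```python
# Convert an availability target (uptime fraction) into its downtime budget.
MINUTES_PER_YEAR = 365 * 24 * 60  # ignoring leap years

def annual_downtime_minutes(availability: float) -> float:
    """Minutes of downtime per year permitted at a given uptime fraction."""
    return (1.0 - availability) * MINUTES_PER_YEAR

# 99.99% ("four nines") allows roughly 53 minutes per year,
# i.e., about an hour, matching the figure in the text.
for target in (0.99, 0.999, 0.9999):
    print(f"{target:.2%} uptime -> {annual_downtime_minutes(target):8.1f} min/year")
```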

1.2 EMPHASIS ON COST EFFICIENCY
Building and operating a large computing platform is expensive, and the quality of a service may depend on the aggregate processing and storage capacity available, further driving costs up and requiring a focus on cost efficiency. For example, in information retrieval systems such as Web search, the growth of computing needs is driven by three main factors:
• Increased service popularity that translates into higher request loads.
• The size of the problem keeps growing—the Web is growing by millions of pages per day, which increases the cost of building and serving a Web index.
• Even if the throughput and data repository could be held constant, the competitive nature of this market continuously drives innovations to improve the quality of results retrieved and the frequency with which the index is updated. Although some quality improvements can be achieved by smarter algorithms alone, most substantial improvements demand additional computing resources for every request. For example, in a search system that also considers synonyms of the search terms in a query, retrieving results is substantially more expensive—either the search needs to retrieve documents that match a more complex query that includes the synonyms or the synonyms of a term need to be replicated in the index data structure for each term.
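The cost of synonym handling can be illustrated with a toy query expander. The synonym table below is hypothetical and tiny; real search systems rely on massive language models, but even here the number of query variants to match grows multiplicatively with the terms:

```python
from itertools import product

# Hypothetical synonym table (illustrative only).
SYNONYMS = {
    "car": ["car", "auto", "automobile"],
    "fix": ["fix", "repair"],
}

def expand_query(terms):
    """Return every variant of the query, each term replaced by its synonyms."""
    alternatives = [SYNONYMS.get(t, [t]) for t in terms]
    return [" ".join(combo) for combo in product(*alternatives)]

variants = expand_query(["car", "fix"])
print(variants)  # 3 x 2 = 6 variants to evaluate instead of 1
```

Each extra synonym multiplies the work per request, which is why such quality improvements demand additional computing resources.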
The relentless demand for more computing capabilities makes cost efficiency a primary metric of interest in the design of WSCs. Cost efficiency must be defined broadly to account for all the significant components of cost, including hosting-facility capital and operational expenses (which include power provisioning and energy costs), hardware, software, management personnel, and repairs.
1.3 NOT JUST A COLLECTION OF SERVERS
Our central point is that the datacenters powering many of today’s successful Internet services are no longer simply a miscellaneous collection of machines co-located in a facility and wired up together. The software running on these systems, such as Gmail or Web search services, execute at a scale far beyond a single machine or a single rack: they run on no smaller a unit than clusters of hundreds to thousands of individual servers. Therefore, the machine, the computer, is this large cluster or aggregation of servers itself and needs to be considered as a single computing unit.
The technical challenges of designing WSCs are no less worthy of the expertise of computer systems architects than any other class of machines. First, they are a new class of large-scale machines driven by a new and rapidly evolving set of workloads. Their size alone makes them difficult to experiment with or simulate efficiently; therefore, system designers must develop new techniques to guide design decisions. Fault behavior and power and energy considerations have a more significant impact in the design of WSCs, perhaps more so than in other smaller scale computing platforms. Finally, WSCs have an additional layer of complexity beyond systems consisting of individual servers or small groups of servers; WSCs introduce a significant new challenge to programmer productivity, a challenge perhaps greater than programming multicore systems. This additional complexity arises indirectly from the larger scale of the application domain and manifests itself as a deeper and less homogeneous storage hierarchy (discussed later in this chapter), higher fault rates (Chapter 7), and possibly higher performance variability (Chapter 2).
The objectives of this book are to introduce readers to this new design space, describe some of the requirements and characteristics of WSCs, highlight some of the important challenges unique to this space, and share some of our experience designing, programming, and operating them within Google. We have been in the fortunate position of being both designers of WSCs, as well as customers and programmers of the platform, which has provided us an unusual opportunity to evaluate design decisions throughout the lifetime of a product. We hope that we will succeed in relaying our enthusiasm for this area as an exciting new target worthy of the attention of the general research and technical communities.
OP, posted 2012-6-25 12:05:12:
1.4 ONE DATACENTER VS. SEVERAL DATACENTERS
In this book, we define the computer to be architected as a datacenter despite the fact that Internet services may involve multiple datacenters located far apart. Multiple datacenters are sometimes used as complete replicas of the same service, with replication being used mostly for reducing user latency and improving serving throughput (a typical example is a Web search service). In those cases, a given user query tends to be fully processed within one datacenter, and our machine definition seems appropriate.
However, in cases where a user query may involve computation across multiple datacenters, our single-datacenter focus is a less obvious fit. Typical examples are services that deal with nonvolatile user data updates, and therefore, require multiple copies for disaster tolerance reasons. For such computations, a set of datacenters might be the more appropriate system. But we have chosen to think of the multi-datacenter scenario as more analogous to a network of computers. This is in part to limit the scope of this lecture, but is mainly because the huge gap in connectivity quality between intra- and inter-datacenter communications causes programmers to view such systems as separate computational resources. As the software development environment for this class of applications evolves, or if the connectivity gap narrows significantly in the future, we may need to adjust our choice of machine boundaries.
1.5 WHY WSCs MIGHT MATTER TO YOU
As described so far, WSCs might be considered a niche area because their sheer size and cost render them unaffordable by all but a few large Internet companies. Unsurprisingly, we do not believe this to be true. We believe the problems that today's large Internet services face will soon be meaningful to a much larger constituency because many organizations will soon be able to afford similarly sized computers at a much lower cost. Even today, the attractive economics of low-end server class computing platforms puts clusters of hundreds of nodes within the reach of a relatively broad range of corporations and research institutions. When combined with the trends toward large numbers of processor cores on a single die, a single rack of servers may soon have as many or more hardware threads than many of today's datacenters. For example, a rack with 40 servers, each with four 8-core dual-threaded CPUs, would contain more than two thousand hardware threads. Such systems will arguably be affordable to a very large number of organizations within just a few years, while exhibiting some of the scale, architectural organization, and fault behavior of today's WSCs. Therefore, we believe that our experience building these unique systems will be useful in understanding the design issues and programming challenges for those potentially ubiquitous next-generation machines.
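The rack arithmetic above is easy to verify with the figures quoted in the text (40 servers, four sockets each, 8 cores per socket, two threads per core):

```python
servers_per_rack = 40
sockets_per_server = 4   # four CPUs per server
cores_per_socket = 8
threads_per_core = 2     # dual-threaded (SMT)

threads_per_server = sockets_per_server * cores_per_socket * threads_per_core
rack_threads = servers_per_rack * threads_per_server
print(rack_threads)  # 2560 hardware threads, "more than two thousand"
```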
eMe, posted 2012-6-25 15:03:09: Watching with interest.

OP, posted 2012-6-25 15:54:06:

1.6 ARCHITECTURAL OVERVIEW OF WSCs
The hardware implementation of a WSC will differ significantly from one installation to the next. Even within a single organization such as Google, systems deployed in different years use different basic elements, reflecting the hardware improvements provided by the industry. However, the architectural organization of these systems has been relatively stable over the last few years. Therefore, it is useful to describe this general architecture at a high level as it sets the background for subsequent discussions.

FIGURE 1.1: Typical elements in warehouse-scale systems: 1U server (left), 7′ rack with Ethernet switch (middle), and diagram of a small cluster with a cluster-level Ethernet switch/router (right).

Figure 1.1 depicts some of the more popular building blocks for WSCs. A set of low-end servers, typically in a 1U or blade enclosure format, are mounted within a rack and interconnected using a local Ethernet switch. These rack-level switches, which can use 1- or 10-Gbps links, have a number of uplink connections to one or more cluster-level (or datacenter-level) Ethernet switches. This second-level switching domain can potentially span more than ten thousand individual servers.
1.6.1 Storage
Disk drives are connected directly to each individual server and managed by a global distributed file system (such as Google’s GFS [31]) or they can be part of Network Attached Storage (NAS) devices that are directly connected to the cluster-level switching fabric. A NAS tends to be a simpler solution to deploy initially because it pushes the responsibility for data management and integrity to a NAS appliance vendor. In contrast, using the collection of disks directly attached to server nodes requires a fault-tolerant file system at the cluster level. This is difficult to implement but can lower hardware costs (the disks leverage the existing server enclosure) and networking fabric utilization (each server network port is effectively dynamically shared between the computing tasks and the file system). The replication model between these two approaches is also fundamentally different. A NAS provides extra reliability through replication or error correction capabilities within each appliance, whereas systems like GFS implement replication across different machines and consequently will use more networking bandwidth to complete write operations. However, GFS-like systems are able to keep data available even after the loss of an entire server enclosure or rack and may allow higher aggregate read bandwidth because the same data can be sourced from multiple replicas. Trading off higher write overheads for lower cost, higher availability, and increased read bandwidth was the right solution for many of Google’s workloads. An additional advantage of having disks collocated with compute servers is that it enables distributed system software to exploit data locality. For the remainder of this book, we will therefore implicitly assume a model with distributed disks directly connected to all servers.
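The rack-level fault tolerance of GFS-like systems follows from their replica placement policy. A minimal sketch of such a policy (illustrative only, not Google's actual placement algorithm) spreads each chunk's replicas across distinct racks:

```python
import random

def place_replicas(servers_by_rack, n_replicas=3):
    """Pick n_replicas servers, each in a different rack, so that losing a
    whole rack or server enclosure still leaves at least one copy reachable."""
    if len(servers_by_rack) < n_replicas:
        raise ValueError("need at least as many racks as replicas")
    racks = random.sample(list(servers_by_rack), n_replicas)
    return [(rack, random.choice(servers_by_rack[rack])) for rack in racks]

# A toy cluster: 4 racks of 40 servers each (names are hypothetical).
cluster = {f"rack{r}": [f"rack{r}-srv{s}" for s in range(40)] for r in range(4)}
placement = place_replicas(cluster)
print(placement)  # three (rack, server) pairs, all on different racks
```

Cross-rack placement is also why writes consume more networking bandwidth than in a NAS: every replica must traverse the fabric.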
Some WSCs, including Google's, deploy desktop-class disk drives instead of enterprise-grade disks because of the substantial cost differential between the two. Because that data are nearly always replicated in some distributed fashion (as in GFS), this mitigates the possibly higher fault rates of desktop disks. Moreover, because field reliability of disk drives tends to deviate significantly from the manufacturer's specifications, the reliability edge of enterprise drives is not clearly established. For example, Elerath and Shah [24] point out that several factors can affect disk reliability more substantially than manufacturing process and design.
1.6.2 Networking Fabric
Choosing a networking fabric for WSCs involves a trade-off between speed, scale, and cost. As of this writing, 1-Gbps Ethernet switches with up to 48 ports are essentially a commodity component, costing less than $30/Gbps per server to connect a single rack. As a result, bandwidth within a rack of servers tends to have a homogeneous profile. However, network switches with high port counts, which are needed to tie together WSC clusters, have a much different price structure and are more than ten times more expensive (per 1-Gbps port) than commodity switches. In other words, a switch that has 10 times the bi-section bandwidth costs about 100 times as much. As a result of this cost discontinuity, the networking fabric of WSCs is often organized as the two-level hierarchy depicted in Figure 1.1. Commodity switches in each rack provide a fraction of their bi-section bandwidth for interrack communication through a handful of uplinks to the more costly cluster-level switches. For example, a rack with 40 servers, each with a 1-Gbps port, might have between four and eight 1-Gbps uplinks to the cluster-level switch, corresponding to an oversubscription factor between 5 and 10 for communication across racks. In such a network, programmers must be aware of the relatively scarce cluster-level bandwidth resources and try to exploit rack-level networking locality, complicating software development and possibly impacting resource utilization.
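The oversubscription arithmetic can be sketched directly from the figures in the paragraph above (40 servers with 1-Gbps ports, four to eight 1-Gbps uplinks):

```python
def oversubscription_factor(servers, server_port_gbps, uplinks, uplink_gbps):
    """Ratio of aggregate server bandwidth to uplink bandwidth leaving the rack."""
    return (servers * server_port_gbps) / (uplinks * uplink_gbps)

print(oversubscription_factor(40, 1, 8, 1))  # 5.0  (eight uplinks)
print(oversubscription_factor(40, 1, 4, 1))  # 10.0 (four uplinks)
```

A factor of 5 means that if every server in the rack tried to talk off-rack at full rate, each would see only one fifth of its port bandwidth, which is why software tries to exploit rack-level locality.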
Alternatively, one can remove some of the cluster-level networking bottlenecks by spending more money on the interconnect fabric. For example, Infiniband interconnects typically scale to a few thousand ports but can cost $500–$2,000 per port. Similarly, some networking vendors are starting to provide larger-scale Ethernet fabrics, but again at a cost of at least hundreds of dollars per server. Alternatively, lower-cost fabrics can be formed from commodity Ethernet switches by building “fat tree” Clos networks [1]. How much to spend on networking vs. spending the equivalent amount on buying more servers or storage is an application-specific question that has no single correct answer. However, for now, we will assume that intra rack connectivity is often cheaper than inter rack connectivity.
1.6.3 Storage Hierarchy
Figure 1.2 shows a programmer’s view of storage hierarchy of a typical WSC. A server consists of a number of processor sockets, each with a multicore CPU and its internal cache hierarchy, local shared and coherent DRAM, and a number of directly attached disk drives. The DRAM and disk resources within the rack are accessible through the first-level rack switches (assuming some sort of remote procedure call API to them), and all resources in all racks are accessible via the cluster-level switch.

FIGURE 1.2: Storage hierarchy of a WSC.


FIGURE 1.3: Latency, bandwidth, and capacity of a WSC.
1.6.4 Quantifying Latency, Bandwidth, and Capacity
Figure 1.3 attempts to quantify the latency, bandwidth, and capacity characteristics of a WSC. For illustration we assume a system with 2,000 servers, each with 8 GB of DRAM and four 1-TB disk drives. Each group of 40 servers is connected through a 1-Gbps link to a rack-level switch that has an additional eight 1-Gbps ports used for connecting the rack to the cluster-level switch (an oversubscription factor of 5). Network latency numbers assume a socket-based TCP-IP transport, and networking bandwidth values assume that each server behind an oversubscribed set of uplinks is using its fair share of the available cluster-level bandwidth. We assume the rack- and cluster-level switches themselves are not internally oversubscribed. For disks, we show typical commodity disk drive (SATA) latencies and transfer rates. The graph shows the relative latency, bandwidth, and capacity of each resource pool. For example, the bandwidth available from local disks is 200 MB/s, whereas the bandwidth from off-rack disks is just 25 MB/s via the shared rack uplinks. On the other hand, total disk storage in the cluster is almost ten million times larger than local DRAM. A large application that requires many more servers than can fit on a single rack must deal effectively with these large discrepancies in latency, bandwidth, and capacity. These discrepancies are much larger than those seen on a single machine, making it more difficult to program a WSC.
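The 25-MB/s off-rack figure follows from fair-sharing the rack uplinks; a quick sketch using the assumptions above (40 servers per rack, eight shared 1-Gbps uplinks):

```python
servers_per_rack = 40
uplinks = 8
uplink_gbps = 1.0

# Each server's fair share of the uplink bandwidth, converted to MB/s.
fair_share_gbps = uplinks * uplink_gbps / servers_per_rack   # 0.2 Gbps
fair_share_mb_s = fair_share_gbps * 1e9 / 8 / 1e6            # bits -> megabytes

local_disk_mb_s = 200.0
print(fair_share_mb_s)                      # 25.0 MB/s off-rack
print(local_disk_mb_s / fair_share_mb_s)    # 8x gap vs. a local disk
```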
A key challenge for architects of WSCs is to smooth out these discrepancies in a cost-efficient manner. Conversely, a key challenge for software architects is to build cluster infrastructure and services that hide most of this complexity from application developers.
1.6.5 Power Usage
Energy and power usage are also important concerns in the design of WSCs because, as discussed in more detail in Chapter 5, energy-related costs have become an important component of the total cost of ownership of this class of systems. Figure 1.4 provides some insight into how energy is used in modern IT equipment by breaking down the peak power usage of one generation of WSCs deployed at Google in 2007 categorized by main component group.
Although this breakdown can vary significantly depending on how systems are configured for a given workload domain, the graph indicates that CPUs can no longer be the sole focus of energy efficiency improvements because no one subsystem dominates the overall energy usage profile. Chapter 5 also discusses how overheads in power delivery and cooling can significantly increase the actual energy usage in WSCs.

FIGURE 1.4: Approximate distribution of peak power usage by hardware subsystem in one of Google's datacenters (circa 2007).
1.6.6 Handling Failures
The sheer scale of WSCs requires that Internet services software tolerate relatively high component fault rates. Disk drives, for example, can exhibit annualized failure rates higher than 4% [65,76]. Different deployments have reported between 1.2 and 16 average server-level restarts per year. With such high component failure rates, an application running across thousands of machines may need to react to failure conditions on an hourly basis. We expand on this topic further on Chapter 2, which describes the application domain, and Chapter 7, which deals with fault statistics.
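A rough calculation shows why failures become an hourly event at this scale. The sketch below assumes the 2,000-server cluster used earlier in the chapter and the reported range of server restart rates:

```python
servers = 2000
restarts_low, restarts_high = 1.2, 16.0  # reported server-level restarts/year
HOURS_PER_YEAR = 365 * 24

events_per_hour_low = servers * restarts_low / HOURS_PER_YEAR
events_per_hour_high = servers * restarts_high / HOURS_PER_YEAR
# Roughly 0.3 to 3.7 server restarts per hour across the whole application.
print(events_per_hour_low, events_per_hour_high)
```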

OP, posted 2012-6-25 16:09:40, replying to eMe:
Thanks! The original text is rather dense, so translating it takes real effort. Please point out anything I got wrong.
班玛康乐, posted 2012-6-25 22:09:03: Thanks for your hard work, OP.
OP, posted 2012-6-26 12:41:05:

Chapter 2
Workloads and Software Infrastructure
The applications that run on warehouse-scale computers (WSCs) dominate many system design trade-off decisions. This chapter outlines some of the distinguishing characteristics of software that runs in large Internet services and the system software and tools needed for a complete computing platform. Here is some terminology that defines the different software layers in a typical WSC deployment:
.Platform-level software——the common firmware, kernel, operating system distribution, andlibraries expected to be present in all individual servers to abstract thehardware of a single machine and provide basic server-level services.
.平台级软件——预期存在于所有单独服务器中的通用固件、内核、操作系统发行版和函数库，用以抽象单台机器的硬件并提供基本的服务器级服务。
.Cluster-level infrastructure—the collection of distributed systems software that manages resources and provides services at the cluster level; ultimately, we consider these services as an operating system for a datacenter. Examples are distributed file systems, schedulers, remote procedure call (RPC) layers, as well as programming models that simplify the usage of resources at the scale of datacenters, such as MapReduce [19], Dryad [47], Hadoop [42], Sawzall [64], BigTable [13], Dynamo [20], and Chubby [7].
.集群级基础设施——在集群层面管理资源并提供服务的分布式系统软件集合；最终，我们可以把这些服务看作数据中心的操作系统。例如分布式文件系统、调度器、远程过程调用(RPC)层，以及在数据中心规模下简化资源使用的编程模型，如 MapReduce [19]、Dryad [47]、Hadoop [42]、Sawzall [64]、BigTable [13]、Dynamo [20] 和 Chubby [7]。
.Application-level software—software that implements a specific service. It is often useful to further divide application-level software into online services and offline computations because those tend to have different requirements. Examples of online services are Google search, Gmail, and Google Maps. Offline computations are typically used in large-scale data analysis or as part of the pipeline that generates the data used in online services; for example, building an index of the Web or processing satellite images to create map tiles for the online service.
.应用级软件——实现特定服务的软件。把应用级软件进一步划分为在线服务和离线计算通常是有用的，因为两者往往有不同的要求。在线服务的例子有谷歌搜索、Gmail和谷歌地图。离线计算通常用于大规模数据分析，或作为生成在线服务所用数据的流水线的一部分；例如，构建Web索引，或处理卫星图像以生成在线服务使用的地图图块。
2.1 DATACENTER VS. DESKTOP  数据中心 VS. 桌面
Software development in Internet services differs from the traditional desktop/server model in many ways:
互联网服务方面的软件开发在许多方面有别于传统的桌面/服务器模式:
.Ample parallelism—Typical Internet services exhibit a large amount of parallelism stemming from both data- and request-level parallelism. Usually, the problem is not to find parallelism but to manage and efficiently harness the explicit parallelism that is inherent in the application. Data parallelism arises from the large data sets of relatively independent records that need processing, such as collections of billions of Web pages or billions of log lines. These very large data sets often require significant computation for each parallel (sub) task, which in turn helps hide or tolerate communication and synchronization overheads. Similarly, request-level parallelism stems from the hundreds or thousands of requests per second that popular Internet services receive. These requests rarely involve read-write sharing of data or synchronization across requests. For example, search requests are essentially independent and deal with a mostly read-only database; therefore, the computation can be easily partitioned both within a request and across different requests. Similarly, whereas Web email transactions do modify user data, requests from different users are essentially independent from each other, creating natural units of data partitioning and concurrency.
.充足的并行性——典型的互联网服务表现出大量的并行性，源于数据级和请求级两方面的并行。通常，问题不在于发现并行性，而在于管理并有效利用应用程序内在的显式并行性。数据并行来自需要处理的、由相对独立的记录组成的大型数据集，比如数十亿网页或数十亿行日志的集合。在这些非常大的数据集中，每个并行(子)任务通常都需要大量计算，这反过来有助于隐藏或容忍通信和同步开销。同样，请求级并行源于流行的互联网服务每秒接收到的成百上千个请求。这些请求很少涉及跨请求的数据读写共享或同步。例如，搜索请求本质上相互独立，且主要访问只读数据库，因此计算很容易在单个请求内部和不同请求之间进行划分。类似地，虽然Web邮件事务确实会修改用户数据，但不同用户的请求彼此基本独立，形成了数据划分和并发的天然单元。
.Workload churn—Users of Internet services are isolated from the service's implementation details by relatively well-defined and stable high-level APIs (e.g., simple URLs), making it much easier to deploy new software quickly. Key pieces of Google's services have release cycles on the order of a couple of weeks compared to months or years for desktop software products. Google's front-end Web server binaries, for example, are released on a weekly cycle, with nearly a thousand independent code changes checked in by hundreds of developers—the core of Google's search services has been reimplemented nearly from scratch every 2 to 3 years. This environment creates significant incentives for rapid product innovation but makes it hard for a system designer to extract useful benchmarks even from established applications. Moreover, because Internet services are still a relatively new field, new products and services frequently emerge, and their success with users directly affects the resulting workload mix in the datacenter. For example, video services such as YouTube have flourished in relatively short periods and may present a very different set of requirements from the existing large customers of computing cycles in the datacenter, potentially affecting the optimal design point of WSCs in unexpected ways. A beneficial side effect of this aggressive software deployment environment is that hardware architects are not necessarily burdened with having to provide good performance for immutable pieces of code. Instead, architects can consider the possibility of significant software rewrites to take advantage of new hardware capabilities or devices.
.工作负载搅动——互联网服务的用户通过相对良好定义且稳定的高级API(如简单的URL)与服务的实现细节相隔离，这使得快速部署新软件容易得多。谷歌服务关键部分的发布周期约为几周，而桌面软件产品则需要几个月或几年。例如，谷歌前端Web服务器的二进制文件以每周为周期发布，其中包含由数百名开发人员检入的近一千个独立代码变更；谷歌核心搜索服务每2到3年几乎从零开始重新实现一次。这种环境极大地激励了产品快速创新，但使系统设计师即使从成熟的应用程序中也很难提取有用的基准。此外，由于互联网服务仍是一个相对较新的领域，新产品和服务频繁出现，它们在用户中的成功直接影响数据中心最终的工作负载组合。例如，YouTube等视频服务在相对较短的时间内蓬勃发展，与数据中心现有的计算周期大客户相比，可能呈现出非常不同的需求集，从而以意想不到的方式影响WSC的最优设计点。这种激进的软件部署环境的一个有益副作用是，硬件架构师不必再为不可变的代码片段提供良好性能而背负负担；相反，架构师可以考虑通过大量软件重写来利用新的硬件功能或设备。
.Platform homogeneity—The datacenter is generally a more homogeneous environment than the desktop as a target platform for software development. Large Internet services operations typically deploy a small number of hardware and system software configurations at any given time. Significant heterogeneity arises primarily from the incentives to deploy more cost-efficient components that become available over time. Homogeneity within a platform generation simplifies cluster-level scheduling and load balancing and reduces the maintenance burden for platforms software (kernels, drivers, etc.). Similarly, homogeneity can allow more efficient supply chains and more efficient repair processes because automatic and manual repairs benefit from having more experience with fewer types of systems. In contrast, software for desktop systems can make few assumptions about the hardware or software platform they are deployed on, and their complexity and performance characteristics may suffer from the need to support thousands or even millions of hardware and system software configurations.
.平台同质性——作为软件开发的目标平台，数据中心通常是比桌面更同质的环境。在任何给定时间，大型互联网服务的运营通常只部署少量的硬件和系统软件配置。显著的异质性主要来自随时间推移而出现的、部署更具成本效益的组件的动机。同一平台世代内的同质性简化了集群级调度和负载均衡，并减轻了平台软件(内核、驱动等)的维护负担。类似地，同质性可以带来更高效的供应链和更高效的维修过程，因为自动和手动维修都受益于在更少类型的系统上积累的更多经验。相比之下，桌面系统的软件几乎无法对其部署的硬件或软件平台做出任何假设，其复杂性和性能特征可能因需要支持数千甚至数百万种硬件和系统软件配置而受损。
.Fault-free operation—Because Internet service applications run on clusters of thousands of machines—each of them not dramatically more reliable than PC-class hardware—the multiplicative effect of individual failure rates means that some type of fault is expected every few hours or less (more details are provided in Chapter 6). As a result, although it may be reasonable for desktop-class software to assume a fault-free hardware operation for months or years, this is not true for datacenter-level services—Internet services need to work in an environment where faults are part of daily life. Ideally, the cluster-level system software should provide a layer that hides most of that complexity from application-level software, although that goal may be difficult to accomplish for all types of applications.
.无故障运行——由于互联网服务应用程序运行在由成千上万台机器组成的集群上——其中每台机器都并不比PC级硬件可靠多少——个体故障率的乘法效应意味着预期每隔几小时甚至更短时间就会出现某种类型的故障(第6章提供更多细节)。因此，尽管桌面级软件假设硬件能无故障运行几个月或几年也许是合理的，但这对数据中心级服务并不成立——互联网服务需要在故障是日常生活一部分的环境中工作。理想情况下，集群级系统软件应提供一个层，向应用级软件隐藏大部分这类复杂性，尽管对所有类型的应用程序而言这一目标可能难以实现。
Although the plentiful thread-level parallelism and a more homogeneous computing platform help reduce software development complexity in Internet services compared to desktop systems, the scale, the need to operate under hardware failures, and the speed of workload churn have the opposite effect.
相比桌面系统,尽管互联网服务中大量线程级别的并行性和更同质化的计算平台有助于减少软件开发的复杂性,但其规模、在硬件故障下运行的需求以及工作负载搅动速度都将会产生相反的效果。
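上文所说的请求级并行可以用下面的Python草图示意：各请求只读共享索引、彼此独立，因此无需任何跨请求同步即可并发处理。其中的索引内容和查询词均为虚构示例，并非原文数据：

```python
# 请求级并行示意：相互独立、基本只读的请求可以无锁并发处理。
from concurrent.futures import ThreadPoolExecutor

# 只读的共享倒排索引(虚构示例数据)。
INDEX = {"wsc": ["doc1", "doc3"], "mapreduce": ["doc2"]}

def handle_query(term: str):
    # 请求只读取共享索引，不修改它，因此请求之间无需同步。
    return INDEX.get(term, [])

queries = ["wsc", "mapreduce", "missing"]
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(handle_query, queries))
print(results)  # → [['doc1', 'doc3'], ['doc2'], []]
```

这对应正文中搜索的例子：计算既可以在不同请求之间划分(如上)，也可以在单个请求内部按索引分片进一步划分。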
10
 楼主| 发表于 2012-6-28 17:24:53 | 只看该作者
本帖最后由 trybestying 于 2012-6-28 17:29 编辑

2.2 PERFORMANCE AND AVAILABILITY TOOLBOX  性能和可用性工具箱
Some basic programming concepts tend to occur often in both infrastructure and application levels because of their wide applicability in achieving high performance or high availability in large-scale deployments. The following table describes some of the most prevalent concepts.
一些基本的编程概念往往同时出现在基础设施和应用程序两个层面，因为它们在大规模部署中对实现高性能或高可用性具有广泛的适用性。下表描述了其中一些最普遍的概念。




Replication (Performance: yes; Availability: yes)
Data replication is a powerful technique because it can improve both performance and availability. It is particularly powerful when the replicated data are not often modified, because replication makes updates more complex.

Sharding (partitioning) (Performance: yes; Availability: yes)
Splitting a data set into smaller fragments (shards) and distributing them across a large number of machines. Operations on the data set are dispatched to some or all of the machines hosting shards, and results are coalesced by the client. The sharding policy can vary depending on space constraints and performance considerations. Sharding also helps availability because recovery of small data fragments can be done faster than larger ones.

Load balancing (Performance: yes)
In large-scale services, service-level performance often depends on the slowest responder out of hundreds or thousands of servers. Reducing response-time variance is therefore critical.
In a sharded service, load balancing can be achieved by biasing the sharding policy to equalize the amount of work per server. That policy may need to be informed by the expected mix of requests or by the computing capabilities of different servers. Note that even homogeneous machines can offer different performance characteristics to a load-balancing client if multiple applications are sharing a subset of the load-balanced servers.
In a replicated service, the load-balancing agent can dynamically adjust the load by selecting which servers to dispatch a new request to. It may still be difficult to approach perfect load balancing because the amount of work required by different types of requests is not always constant or predictable.

Health checking and watchdog timers (Availability: yes)
In a large-scale system, failures often are manifested as slow or unresponsive behavior from a given server. In this environment, no operation can rely on a given server to respond to make forward progress. Moreover, it is critical to quickly determine that a server is too slow or unreachable and steer new requests away from it. Remote procedure calls must set well-informed time-out values to abort long-running requests, and infrastructure-level software may need to continually check connection-level responsiveness of communicating servers and take appropriate action when needed.

Integrity checks (Availability: yes)
In some cases, besides unresponsiveness, faults are manifested as data corruption. Although those may be rarer, they do occur, and often in ways that underlying hardware or software checks do not catch (e.g., there are known issues with the error coverage of some networking CRC checks). Extra software checks can mitigate these problems by changing the underlying encoding or adding more powerful redundant integrity checks.

Application-specific compression (Performance: yes)
Often a large portion of the equipment costs in modern datacenters is in the various storage layers. For services with very high throughput requirements, it is critical to fit as much of the working set as possible in DRAM; this makes compression techniques very important because the extra CPU overhead of decompressing is still orders of magnitude lower than the penalties involved in going to disks. Although generic compression algorithms can do quite well on the average, application-level compression schemes that are aware of the data encoding and distribution of values can achieve significantly superior compression factors or better decompression speeds.

Eventual consistency (Performance: yes; Availability: yes)
Often, keeping multiple replicas up to date using the traditional guarantees offered by a database management system significantly increases complexity, hurts performance, and reduces availability of distributed applications [90]. Fortunately, large classes of applications have more relaxed requirements and can tolerate inconsistent views for limited periods, provided that the system eventually returns to a stable consistent state.







复制 (性能：是；可用性：是)
数据复制是一项强大的技术，因为它可以同时提高性能和可用性。当被复制的数据不经常修改时尤其有效，因为复制会使更新变得更复杂。

分片(分区) (性能：是；可用性：是)
将数据集分割为更小的片段(分片)，并把它们分布到大量机器上。对数据集的操作被分派到托管分片的部分或全部机器上，结果由客户端合并。分片策略可以根据空间限制和性能考虑而变化。分片还有助于可用性，因为恢复小的数据片段要比恢复大的片段快得多。

负载均衡 (性能：是)
在大规模服务中，服务级性能往往取决于成百上千台服务器中最慢的响应者，因此减少响应时间的差异至关重要。
在分片服务中，可以通过调整分片策略来平衡每台服务器的工作量，从而实现负载均衡。这样的策略可能需要参考预期的请求组合或不同服务器的计算能力。注意，如果多个应用程序共享负载均衡服务器的一个子集，即使是同质的机器，对负载均衡客户端而言也会表现出不同的性能特征。
在复制服务中，负载均衡代理可以通过选择把新请求分派到哪些服务器来动态调整负载。即便如此，接近完美的负载均衡可能仍然很困难，因为不同类型的请求所需的工作量并不总是恒定或可预测的。

健康检查和看门狗计时器 (可用性：是)
在大型系统中，故障往往表现为给定服务器的响应缓慢或毫无响应。在这种环境下，任何操作都不能依赖某台给定服务器的响应来推进。此外，关键是要能迅速判定某台服务器过慢或不可达，并引导新请求避开它。远程过程调用必须设置合理的超时值来中止长时间运行的请求，基础设施级软件可能需要持续检查通信服务器的连接级响应性，并在需要时采取适当行动。

完整性检查 (可用性：是)
在某些情况下，除了无响应之外，故障还会表现为数据损坏。虽然这类故障可能更少见，但它们确实会发生，而且往往以底层硬件或软件检查无法捕获的方式出现(例如，一些网络CRC校验的错误覆盖率存在已知问题)。额外的软件检查可以通过改变底层编码或添加更强大的冗余完整性校验来缓解这些问题。

应用特定压缩 (性能：是)
现代数据中心设备成本的很大一部分往往在各种存储层上。对于吞吐量要求非常高的服务，关键是把尽可能多的工作集放入DRAM；这使得压缩技术非常重要，因为解压缩带来的额外CPU开销仍然比访问磁盘的代价低几个数量级。尽管通用压缩算法平均而言可以做得相当好，但了解数据编码和取值分布的应用级压缩方案可以实现明显更优的压缩比或更快的解压速度。

最终一致性 (性能：是；可用性：是)
通常，使用传统数据库管理系统提供的保证来保持多个副本的同步会显著增加复杂性、损害性能并降低分布式应用的可用性[90]。幸运的是，许多类型的应用程序的要求更为宽松，只要系统最终回到稳定一致的状态，它们就可以在有限时间内容忍不一致的视图。

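上表中的分片(sharding)策略可以用下面的Python草图示意：用稳定哈希把记录键映射到固定数目的分片，客户端据此把操作分派到托管相应分片的机器。分片数和键名均为示意性假设，真实系统的分片策略(如表中所述)还要考虑空间限制和负载：

```python
# 分片示意：把记录键哈希到N个分片之一，客户端按分片归组后分派操作。
import hashlib

NUM_SHARDS = 8  # 示意性假设的分片数

def shard_for(key: str) -> int:
    # 使用稳定哈希(而非Python内置的随机化hash())，保证所有客户端结果一致。
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_SHARDS

def dispatch(keys):
    # 按分片归组键；真实客户端会对每个分片所在机器各发一次RPC，再合并结果。
    plan = {}
    for key in keys:
        plan.setdefault(shard_for(key), []).append(key)
    return plan

plan = dispatch(["user:1", "user:2", "user:3"])
print(sorted(sum(plan.values(), [])))  # → ['user:1', 'user:2', 'user:3']，所有键都被覆盖
```

表中提到的"偏置分片策略以均衡负载"，对应的就是在 shard_for 中引入权重或映射表，而不是简单取模。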
