s5 Notes CC
Gordon Bell, Jim Gray, and Alex Szalay [5] have advocated: “Computational science is changing to
be data-intensive. Supercomputers must be balanced systems, not just CPU farms but also petascale
I/O and networking arrays.” In the future, working with large data sets will typically mean sending
the computations (programs) to the data, rather than copying the data to the workstations. This
reflects the trend in IT of moving computing and data from desktops to large data centers, where
there is on-demand provision of software, hardware, and data as a service. This data explosion has
Cloud computing has been defined differently by many users and designers. For example, IBM, a
major player in cloud computing, has defined it as follows: “A cloud is a pool of virtualized
computer resources. A cloud can host a variety of different workloads, including batch-style
backend jobs and interactive and user-facing applications.” Based on this definition, a cloud allows
workloads to be deployed and scaled out quickly through rapid provisioning of virtual or physical
machines. The cloud supports redundant, self-recovering, highly scalable programming models that
allow workloads to recover from many unavoidable hardware/software failures. Finally, the cloud
system should be able to monitor resource use in real time to enable rebalancing of allocations when
needed.
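The monitoring-and-rebalancing idea in the definition above can be illustrated with a small sketch. It is not taken from IBM or from any real cloud scheduler; the `Host` class, the `rebalance` function, and the 0.8 utilization threshold are all hypothetical choices made for illustration. The sketch measures the utilization of each host and moves the smallest workload off an overloaded host onto the least-utilized one, as long as the move does not overload the destination.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Hypothetical model of one physical or virtual machine in the pool."""
    name: str
    capacity: float                                  # total CPU units available
    workloads: dict = field(default_factory=dict)    # workload name -> CPU demand

    @property
    def load(self) -> float:
        return sum(self.workloads.values())

    @property
    def utilization(self) -> float:
        return self.load / self.capacity


def rebalance(hosts, high=0.8):
    """Move workloads off hosts whose utilization exceeds `high` (assumed threshold).

    Returns a list of (workload, source host, destination host) moves.
    """
    moves = []
    for src in hosts:
        while src.utilization > high and src.workloads:
            others = [h for h in hosts if h is not src]
            if not others:
                break
            dst = min(others, key=lambda h: h.utilization)
            # Pick the smallest workload so each move disturbs as little as possible.
            name, demand = min(src.workloads.items(), key=lambda kv: kv[1])
            if (dst.load + demand) / dst.capacity > high:
                break  # no host can absorb the workload without overloading
            del src.workloads[name]
            dst.workloads[name] = demand
            moves.append((name, src.name, dst.name))
    return moves


if __name__ == "__main__":
    pool = [
        Host("a", 10, {"w1": 5, "w2": 4}),
        Host("b", 10, {"w3": 2}),
    ]
    print(rebalance(pool))
```

A production system would also account for memory, network traffic, and migration cost; the sketch considers CPU demand only.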
Internet Clouds
Cloud computing applies a virtualized platform with elastic resources on demand by provisioning
hardware, software, and data sets dynamically (see Figure 1.18). The idea is to move desktop
computing to a service-oriented platform using server clusters and huge databases at data centers.
Cloud computing leverages its low cost and simplicity to benefit both users and providers. Machine
virtualization has enabled such cost-effectiveness. Cloud computing intends to satisfy many user
applications simultaneously. The cloud ecosystem must be designed to be secure, trustworthy, and
dependable. Some computer users think of the cloud as a centralized resource pool. Others consider
the cloud to be a server cluster that practices distributed computing across all the servers used.
The Cloud Landscape
administrative domain (e.g., a research laboratory or company) for on-premises computing needs.
However, these traditional systems have encountered several performance bottlenecks: constant
system maintenance, poor utilization, and increasing costs associated with hardware/software
upgrades.
problems.
users, namely servers, storage, networks, and the data center fabric. The user can deploy and
run specific applications on multiple VMs running guest OSes. The user does not manage or
control the underlying cloud infrastructure, but can specify when to request and release the
needed resources.
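The request-and-release pattern described above can be sketched with a toy provider. This is an in-memory stand-in, not the API of any real IaaS vendor; `ToyIaaSProvider`, `request_vm`, and `release_vm` are hypothetical names. The point is that the user decides when VMs are acquired and returned, while the pool behind the interface stays hidden.

```python
import itertools


class ToyIaaSProvider:
    """Hypothetical in-memory stand-in for an IaaS provider (not a real API)."""

    def __init__(self, total_vms):
        self._free = total_vms            # VMs still available in the pool
        self._ids = itertools.count(1)    # generator for VM identifiers
        self._leases = {}                 # vm_id -> guest OS image name

    def request_vm(self, guest_os):
        """Lease one VM running the given guest OS and return its identifier."""
        if self._free == 0:
            raise RuntimeError("no capacity left in the pool")
        self._free -= 1
        vm_id = f"vm-{next(self._ids)}"
        self._leases[vm_id] = guest_os
        return vm_id

    def release_vm(self, vm_id):
        """Return a leased VM to the pool."""
        del self._leases[vm_id]
        self._free += 1

    @property
    def free(self):
        return self._free


if __name__ == "__main__":
    provider = ToyIaaSProvider(total_vms=2)
    vm = provider.request_vm("linux")     # user asks for a resource when needed
    print(vm, provider.free)
    provider.release_vm(vm)               # and hands it back when done
    print(provider.free)
```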
Platform as a Service (PaaS): This model enables the user to deploy user-built applications
onto a virtualized cloud platform. PaaS includes middleware, databases, development tools, and
some runtime support such as Web 2.0 and Java. The platform includes both hardware and
software integrated with specific programming interfaces. The provider supplies the API and
software tools (e.g., Java, Python, Web 2.0, .NET). The user is freed from managing the cloud
infrastructure.
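To make the division of labor concrete, here is a minimal sketch of what a user-built application might look like when the platform supplies the runtime. The source names Python as one supported language but does not specify an interface; WSGI is assumed here purely for illustration, and the `call_locally` helper is hypothetical. The user writes only `application`; servers, scaling, and routing would be the platform's concern.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """User-built application: the only code the PaaS user has to write."""
    body = b"Hello from a user-built application\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_locally(app, path="/"):
    """Hypothetical helper: invoke the app without a server, as a platform would."""
    environ = {}
    setup_testing_defaults(environ)       # fill in a minimal valid WSGI environ
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


if __name__ == "__main__":
    print(call_locally(application))
```

The same `application` callable could be handed unchanged to any WSGI-compatible host, which is the sense in which the user is freed from managing the infrastructure.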
thousands of paid cloud customers. The SaaS model applies to business processes, industry
human resources (HR), and collaborative applications. On the customer side, there is no upfront
investment in servers or software licensing. On the provider side, costs are rather low, compared