On the Clouds: A New Way of Computing

This article introduces cloud computing and discusses the author’s experience “on the clouds.” The author reviews cloud computing services and providers, then presents his experience of running multiple systems (e.g., integrated library systems, content management systems, and repository software). He evaluates costs, discusses advantages, and addresses some issues about cloud computing. Cloud computing fundamentally changes the ways institutions and companies manage their computing needs. Libraries can take advantage of cloud computing to start an IT project with low cost, to manage computing resources cost-effectively, and to explore new computing possibilities.

Tutorial
Yan Han
Yan Han (hany@u.library.arizona.edu) is Associate Librarian, University of Arizona Libraries, Tucson.

Scholarly communication and new ways of teaching provide an opportunity for academic institutions to collaborate on providing access to scholarly materials and research data. There is a growing need to handle large amounts of data using computer algorithms, which presents challenges to libraries with limited experience in handling nontextual materials. Because of the current economic crisis, academic institutions need to find ways to acquire and manage computing resources in a cost-effective manner.
One of the hottest topics in IT is cloud computing. Cloud computing is not new to many of us because we have been using some of its services, such as Google Docs, for years. In his latest book, The Big Switch: Rewiring the World, from Edison to Google, Carr argues that computing will go the way of electricity: purchase when needed, which he calls "utility computing." His examples include Amazon's EC2 (Elastic Compute Cloud) and S3 (Simple Storage Service) services.1 Amazon's chief technology officer proposed the following factors related to cloud computing: infinite computing resources available on demand, removing the need to plan ahead; the removal of an up-front costly investment, allowing companies to start small and increase resources when needed; and a system that is pay-for-use on a short-term basis and releases customers when needed (e.g., CPU by hour, storage by day).2 The National Institute of Standards and Technology (NIST) currently defines cloud computing as "a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., network, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction."3 As there are several definitions for "utility computing" and "cloud computing," the author does not intend to suggest a better definition, but rather to list the characteristics of cloud computing. The term "cloud computing" means that
This article discusses using cloud computing on an IT-infrastructure level, including building virtual server nodes and running a library's essential computer systems in remote data centers by paying a fee instead of running them on-site. The article reviews current cloud computing services, presents the author's experience, and discusses advantages and disadvantages of using the new approach.

All kinds of clouds
Major IT companies have spent billions of dollars since the 1990s to shape cloud computing. For example, Sun's well-known slogan "the network is the computer" was established in the late 1980s. Salesforce.com has been providing on-demand Software as a Service (SaaS) for customers since 1999. IBM and Microsoft started to deliver Web services in the early 2000s. Microsoft's Azure service provides an operating system and a set of developer tools and services. Google's popular Google Docs software provides Web-based word-processing, spreadsheet, and presentation applications. Google App Engine allows system developers to run their Python/Java applications on Google's infrastructure. Sun offers computing at $1 per CPU hour. Amazon is well-known for providing Web services such as EC2 and S3. Yahoo! announced that it would use the Apache Hadoop framework to allow users to work with thousands of nodes and petabytes (1 million gigabytes) of data. These examples demonstrate that cloud computing providers are offering services on every level, from hardware (e.g., Amazon and Sun), to operating systems (e.g., Google and Microsoft), to software and service (e.g., Google, Microsoft, and Yahoo!). Cloud computing providers target a variety of end users, from software developers to the general public. For additional information regarding cloud computing models, the University of California (UC) Berkeley's report provides a good comparison of these models by Amazon, Microsoft, and Google.4
As cloud computing providers lower prices and IT advancements remove technology barriers, such as virtualization and network bandwidth, cloud computing has moved into the mainstream.5 Gartner stated, "Organizations are switching from company-owned hardware and software to per-use service-based models."6 For example, the U.S. government website (http://www.usa.gov/) will soon begin using cloud computing.7 The New York Times used Amazon's EC2 and S3 services as well as a Hadoop application to provide open access to public domain articles from 1851 to 1922. The Times loaded 4 TB of raw TIFF images and their derivative 11 million PDFs into Amazon's S3 in twenty-four hours at very reasonable cost.8 This project is very similar to digital library projects run by academic libraries. OCLC announced its movement of library management services to the Web.9 It is clear that OCLC is going to deliver a Web-based integrated library system (ILS) to provide a new way of running an ILS. DuraSpace, a joint organization by Fedora Commons and DSpace Foundation, announced that they would be taking advantage of cloud storage and cloud computing.10

On the clouds
Computing needs in academic libraries can be placed into two categories: user computing needs and library goals.

User computing needs
Academic libraries usually run hundreds of PCs for students and staff to fulfill their individual needs (e.g., Microsoft Office, browsers, and image-, audio-, and video-processing applications).

Library goals
A variety of library systems are used to achieve libraries' goals to support research, learning, and teaching. These systems include the following:
■ Library website: The website may be built on simple HTML webpages or a content management system such as Drupal, Joomla, or any home-grown PHP, Perl, ASP, or JSP system.
Due to differences in end users and functionality, most systems do not use computing resources equally. For example, the ILS is input and output intensive and database query intensive, while repository systems require storage ranging from a few gigabytes to dozens of terabytes and substantial network bandwidth.
Cloud computing brings a fundamental shift in computing. It changes the way organizations acquire, configure, manage, and maintain computing resources to achieve their business goals. The availability of cloud computing providers allows organizations to focus on their business and leave general computing maintenance to the major IT companies. In the fall of 2008, the author started to research cloud computing providers and how he could implement cloud computing for some library systems to save staff and equipment costs. In January 2009, the author started his plan to build library systems "on the clouds."
The University of Arizona Libraries (UAL) has been a key player in the process of rebuilding higher education in Afghanistan. The author has also developed a Japanese ILL system (http://gifproject.libraryfinder.org) for the North American Coordinating Council on Japanese Library Resources. These systems had been running on UAL's internal technical infrastructure. They run in a complex computing environment, require different modules, and do not use computing resources equally. For example, the Afghan ILS runs on Linux, Apache, MySQL, and Perl. Its OPAC and staff interface run on two different ports. The Afghanistan Digital Libraries website requires Linux, Apache, MySQL, and PHP. The Japanese ILL system was written in Java and runs on Tomcat.
There are several reasons why the author moved these systems to the new cloud computing infrastructure:
■ These systems need to be accessed in a system mode by people who are not UAL employees.
■ System rebooting time can be substantial in this infrastructure because of server setup and IT policy.
■ The current on-site server has reached its life expectancy and requires a replacement.
By analyzing the complex needs of different systems and considering how to use resources more effectively, the author decided to run all the systems through one cloud computing provider. By comparing the features and the costs, Linode (http://www.linode.com/) was chosen because it provides full SSH and root access using virtualization, four data centers in geographically diverse areas, high availability and clustering support, and an option for month-to-month contracts. In addition, other customers have provided reviews. In January 2009, the author purchased one node located in Fremont, California, for $19.95 per month. An implementation plan (see appendix) was drafted to complete the project in phases. The author owns a virtual server and has access to everything that a physical server provides. In addition, the provider and the user community provided timely help and technical support.
The migration of systems was straightforward: A Linux distribution (Debian 4.0) was installed within an hour, domain registration was complete and the domains went active in twenty-four hours, the Afghanistan Digital Libraries' website (based on Joomla) migration was complete within a week, and all supporting tools and libraries (e.g., MySQL, Tomcat, and Java SDK) were installed and configured within a few days. A month later, the Afghanistan ILS (based on Koha) migration was completed. The ILL system was also migrated without problem. Tests have been performed in all these systems to verify their usability. In summary, the migration of systems was very successful and did not encounter any barriers. It also addresses the reasons for moving given above: After the migration, SSH log-ins for users who are not university employees were set up quickly; systems maintenance is managed by the author's team, and rebooting now only takes about one minute; and there is no need to buy a new server and put it in a temperature- and security-controlled environment. The hardware is maintained by the provider.
The administrative GUI for the Linux Nodes is shown in figure 1.
Since migration, no downtime because of hardware or other failures caused by the provider has been observed. After migrating all the systems successfully and running them in a reliable mode for a few months, the second phase was implemented (see appendix). Another Linux node (located in Atlanta, Georgia) was purchased for backup and monitoring (see figure 2). Nagios, an open-source monitoring system, was tested and configured to identify and report problems for the above library systems. Nagios provides the following functions: (1) monitoring critical computing components, such as the network, systems, services, and servers; (2) timely alerts delivered via e-mail or cell phone; and (3) report and record logs of outages, events, and alerts. A backup script is also run as a prescheduled job to back up the systems on a regular basis.
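As a rough illustration of the kind of service check that a monitoring system performs, the following Python sketch probes a list of TCP services and builds an alert message for any that fail. It is not Nagios code and does not reflect the author's actual configuration: the `Check` type, the function names, and the host names and ports in the usage note are hypothetical, and only the "OK" and "CRITICAL" labels are borrowed from Nagios terminology.

```python
import socket
from dataclasses import dataclass


@dataclass
class Check:
    """One monitored service (all values here are illustrative)."""
    name: str
    host: str
    port: int


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def run_checks(checks, probe=is_port_open):
    """Probe every service and return a list of (name, status) pairs."""
    results = []
    for check in checks:
        status = "OK" if probe(check.host, check.port) else "CRITICAL"
        results.append((check.name, status))
    return results


def format_alert(results):
    """Return a one-line alert naming failed services, or None if all are OK."""
    failed = [name for name, status in results if status != "OK"]
    if not failed:
        return None
    return "CRITICAL: " + ", ".join(failed)
```

A scheduled job could call `run_checks([Check("website", "www.example.org", 80), Check("tomcat", "www.example.org", 8080)])` and send the string returned by `format_alert` by e-mail whenever it is not `None`. The `probe` parameter exists only so that the logic can be exercised without network access.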

Findings and discussions
Since January 2009, all the systems have been migrated and have been running without any issues caused by the provider. The author is very satisfied with the outcomes and cost. The annual cost of running two nodes is $480 per year, compared to at least $4,000 if the hardware had been run in the library.12
From the author's experience, cloud computing provides the following advantages over the traditional way of computing in academic institutions:

This brings concerns to both providers and end users, and it was suggested that privacy issues will be very challenging.16
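The annual cost comparison given earlier in this section can be reproduced with simple arithmetic. The Python sketch below assumes that both nodes are billed at the $19.95 per month quoted for the first node (the article does not state the price of the second node) and uses $4,000 as the lower bound for on-site hardware; the variable and function names are illustrative.

```python
from decimal import Decimal

MONTHLY_NODE_PRICE = Decimal("19.95")  # price quoted for the first node
NODE_COUNT = 2                         # assumption: second node costs the same
ONSITE_ANNUAL_COST = Decimal("4000")   # "at least $4,000" lower bound


def annual_cloud_cost(monthly_price, nodes, months=12):
    """Total yearly cost of renting `nodes` nodes at a flat monthly price."""
    return monthly_price * nodes * months


cloud_cost = annual_cloud_cost(MONTHLY_NODE_PRICE, NODE_COUNT)
savings = ONSITE_ANNUAL_COST - cloud_cost

print(f"Cloud cost per year:  ${cloud_cost}")
print(f"Savings vs. on-site:  ${savings}")
```

Under these assumptions the cloud cost comes to $478.80 per year, which is consistent with the rounded figure of $480, and the difference from the on-site lower bound is $3,521.20.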

Summary
The author introduces cloud computing services and providers and presents his experience of running multiple systems, such as an ILS, content management systems, repository software, and other systems, "on the clouds" since January 2009. Using cloud computing brings significant cost savings and flexibility. However, readers should be aware of technical and business issues.
The author is very satisfied with his experience of moving library systems to cloud computing. His experience demonstrates a new way of managing critical computing resources in an academic library setting. The next steps include using cloud computing to meet digital collections' storage needs. Cloud computing brings fundamental changes to the way organizations manage their computing needs. As major organizations in the library field, such as OCLC, have started to take advantage of cloud computing, the author believes that cloud computing will play an important role in library IT.