Data and Application Management in an Open Cloud Platform

Cloud computing has had tremendous uptake in the global market and is expected to grow well into the future. The commoditization of computational and storage resources allows individuals and companies to acquire such resources on demand, and to relinquish them when no longer required, without the need to budget for additional hardware and management.
Platform-as-a-Service (PaaS) architectures have arisen in recent years to alleviate the burdens of resource management for developers, who may now focus strictly on application development. This faster time-to-value has increased productivity for both developers and their respective organizations. Developers no longer have to worry about lower-level details such as CPU consumption, bandwidth limitations, memory consumption, and disk usage, as was common in the past. The scaling of applications is now the responsibility of the platform. PaaS systems have become the operating systems of the datacenter.
Our research has focused on developing a PaaS system that provides these benefits in an open and pluggable way. We emulate the Google App Engine PaaS system, as it was one of the first to come to market and offered the promise of infinite scalability at both the front end of application servers and the back end of large data storage, all powered by Google's robust infrastructure. We call our PaaS solution AppScale. AppScale is an open cloud platform capable of transparently executing Google App Engine applications at scale and without modification. It is a cloud-based web framework offering multiple services for cloud infrastructure control, data persistence, caching, and a number of other common application technologies. AppScale also simplifies and facilitates the benchmarking of scalable cloud technologies using real applications. This Ph.D. dissertation discusses the design, implementation, and evaluation of AppScale. It considers the many components of AppScale, with a focus on the data management layer for scalable storage, transaction semantics, scalable queries, analysis of Big Data, and live migration support.
The Ph.D. dissertation can be downloaded here.