Scalable analytics for IaaS cloud availability

Title: Scalable analytics for IaaS cloud availability
Publication Type: Journal Article
Year of Publication: 2014
Authors: Ghosh, R., F. Longo, F. Frattini, S. Russo, and K. S. Trivedi
Journal: IEEE Transactions on Cloud Computing - IEEE Computer Society
Keywords: Availability, availability analysis, cloud computing, Downtime, Existence of a solutions, Infrastructure as a service (IaaS), Iterative methods, Lakes, Maintenance, Markov processes, Model driven approach, Numeric solutions, Service level agreement (SLAs), simulation, Stochastic models, Stochastic reward nets, Stochastic systems

In a large Infrastructure-as-a-Service (IaaS) cloud, component failures are quite common. Such failures may lead to occasional system downtime and eventual violation of Service Level Agreements (SLAs) on cloud service availability. Availability analysis of the underlying infrastructure helps the service provider design a system capable of meeting a defined SLA and evaluate the capabilities of an existing one. This paper presents a scalable, stochastic model-driven approach to quantify the availability of a large-scale IaaS cloud, where failures are typically dealt with through migration of physical machines among three pools: hot (running), warm (turned on, but not ready), and cold (turned off). Since monolithic models do not scale to large systems, we use an approach based on interacting Markov chains to demonstrate the reduction in analysis complexity and solution time. The three pools are modeled by interacting sub-models. Dependencies among them are resolved using fixed-point iteration, for which the existence of a solution is proved. The analytic-numeric solutions obtained from the proposed approach and from the monolithic model are compared. We show that the errors introduced by interacting sub-models are insignificant and that our approach can handle very large IaaS clouds. A simulation-based solution of the proposed model is also considered, and the solution times of the methods are compared. © 2014 IEEE.
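
The idea of resolving dependencies among interacting sub-models by fixed-point iteration can be illustrated with a toy example. The sketch below is not the paper's model: it assumes only two pools (hot and warm), models each as a small birth-death Markov chain, and uses hypothetical rates and pool sizes chosen for illustration. The hot sub-model needs the probability that a warm machine is available; the warm sub-model needs the rate at which the hot pool requests machines. Each sub-model is solved in turn with the other's latest output until the exchanged value stops changing.

```python
def birth_death_steady_state(birth, death):
    """Steady-state distribution of a birth-death CTMC.

    birth[k] is the rate from state k to k+1,
    death[k] is the rate from state k+1 to k (must be positive).
    """
    weights = [1.0]
    for b, d in zip(birth, death):
        weights.append(weights[-1] * b / d)
    total = sum(weights)
    return [w / total for w in weights]


def solve_hot(n_hot, fail_rate, migrate_rate, p_warm):
    """Hot sub-model: state i = number of running hot machines.

    A replacement arrives at migrate_rate, thinned by the probability
    p_warm that the warm pool has a machine ready (input from warm).
    """
    birth = [migrate_rate * p_warm for _ in range(n_hot)]
    death = [(i + 1) * fail_rate for i in range(n_hot)]
    return birth_death_steady_state(birth, death)


def solve_warm(n_warm, drain_rate, repair_rate):
    """Warm sub-model: state j = number of ready warm machines.

    Machines leave at drain_rate (input from hot) and return
    through repair at (n_warm - j) * repair_rate.
    """
    birth = [(n_warm - j) * repair_rate for j in range(n_warm)]
    death = [drain_rate for _ in range(n_warm)]
    return birth_death_steady_state(birth, death)


def fixed_point_availability(n_hot=10, n_warm=5, min_hot=8,
                             fail_rate=0.01, migrate_rate=1.0,
                             repair_rate=0.1, tol=1e-12, max_iter=1000):
    """Iterate between the two sub-models until p_warm converges.

    All default values are illustrative assumptions.
    """
    p_warm = 1.0  # initial guess: a warm machine is always available
    for iteration in range(1, max_iter + 1):
        pi_hot = solve_hot(n_hot, fail_rate, migrate_rate, p_warm)
        # The hot pool requests machines whenever it is not full.
        drain_rate = migrate_rate * (1.0 - pi_hot[n_hot])
        pi_warm = solve_warm(n_warm, drain_rate, repair_rate)
        new_p_warm = 1.0 - pi_warm[0]
        if abs(new_p_warm - p_warm) < tol:
            # Recompute the hot pool with the converged input.
            pi_hot = solve_hot(n_hot, fail_rate, migrate_rate, new_p_warm)
            availability = sum(pi_hot[min_hot:])
            return {"p_warm": new_p_warm,
                    "availability": availability,
                    "iterations": iteration}
        p_warm = new_p_warm
    raise RuntimeError("fixed-point iteration did not converge")


if __name__ == "__main__":
    result = fixed_point_availability()
    print(result)
```

Here availability is taken, as an assumption, to mean that at least `min_hot` hot machines are running. Each sub-model has only `n_hot + 1` or `n_warm + 1` states, whereas a single monolithic chain over both pools would need their product, which is the scaling advantage the abstract refers to.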