Conference Topic

AOCMS: An Adaptive and Scalable Monitoring System for Large-Scale Clusters

In this paper, we present the design and implementation of AOCMS, an adaptive, scalable, and efficient monitoring system for large-scale clusters. We describe the adaptive architecture of AOCMS in detail and focus on the techniques that enhance its adaptability, scalability, and efficiency. These techniques include: a solution for monitoring a heterogeneous cluster; a universal applet-servlet communication controller that handles communication between the clients and the web server; adaptive pools that provide threads or database connections to monitoring tasks on demand; and an AOP-based alarm mechanism that decouples the alarm logic from the monitoring logic. We also measured the performance of AOCMS. The results show that AOCMS runs with low overhead and responds to clients quickly.

Zhenghua Xue, Xiaoshe Dong, Weiguo Wu

Department of Computer Science and Technology, Xi'an Jiaotong University, Xi'an, China

International conference

2006 Asia-Pacific Services Computing Conference (IEEE Asia-Pacific Services Computing Conference)

Guangzhou

English

466-472

2006-12-12 (date first made available online on the Wanfang platform; not the paper's publication date)