Authors: Fahad A. Arshad, Ignacio Laguna, Saurabh Bagchi
DOI:
Keywords: Error detection and correction, Software bug, Distributed computing, Process (computing), State (computer science), Throughput (business), Real-time computing, Runtime error detection, Computer science, Hidden Markov model, Stateful firewall
Abstract: Today's distributed systems need runtime error detection to catch errors arising from software bugs, hardware errors, or unexpected operating conditions. A prominent class of error detection techniques operates in a stateful manner, i.e., it keeps track of the state of the application being monitored and then matches state-based rules. Large-scale distributed applications generate a high volume of messages that can overwhelm the capacity of a stateful detection system. An existing approach to handle this is to randomly sample the messages and process the subset. However, this approach leads to non-determinism with respect to the detection system's view of what state the application is in. This in turn leads to degradation in the quality of detection. We present an intelligent sampling algorithm and a Hidden Markov Model (HMM)-based algorithm to select the messages that the detection system processes and to determine the application states such that the non-determinism is minimized. We also present a mechanism for selectively triggering computationally intensive rules based on a light-weight mechanism to determine if the rule is likely to be flagged. We demonstrate the techniques in a detection system called Monitor applied to a J2EE multi-tier application. We empirically evaluate the performance of Monitor under different load conditions and error scenarios and compare it to a previous system called Pinpoint.
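The idea of estimating application state from a sampled message stream can be illustrated with a generic HMM forward filter. This is only a minimal sketch under stated assumptions, not the algorithm used in Monitor: the three states, the three message types, and every probability below are hypothetical values chosen for illustration. A dropped (unsampled) message is represented as `None`; in that step the filter can only apply the transition model, so its belief about the current state becomes less certain.

```python
# Hypothetical application model; states, message types, and probabilities
# are illustrative assumptions, not taken from the paper.
STATES = ["S0", "S1", "S2"]
MESSAGES = ["m0", "m1", "m2"]

# TRANSITION[i][j]: probability of moving from state i to state j per message.
TRANSITION = [
    [0.1, 0.9, 0.0],
    [0.0, 0.2, 0.8],
    [0.7, 0.0, 0.3],
]

# EMISSION[i][m]: probability that a message of type m is seen in state i.
EMISSION = [
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
]


def predict(belief):
    """Advance the state distribution by one step of the transition model."""
    n = len(belief)
    return [sum(belief[i] * TRANSITION[i][j] for i in range(n)) for j in range(n)]


def update(belief, message):
    """Reweight the distribution by the likelihood of the observed message."""
    m = MESSAGES.index(message)
    weighted = [b * EMISSION[i][m] for i, b in enumerate(belief)]
    total = sum(weighted)
    if total == 0.0:
        raise ValueError("message is impossible under the model")
    return [w / total for w in weighted]


def estimate_states(observations, initial):
    """Return (most likely state, its probability) after each message slot.

    A slot holding None stands for a message that was dropped by sampling:
    only the prediction step runs, so the estimate is less certain.
    """
    belief = list(initial)
    path = []
    for obs in observations:
        belief = predict(belief)
        if obs is not None:
            belief = update(belief, obs)
        best = max(range(len(belief)), key=belief.__getitem__)
        path.append((STATES[best], belief[best]))
    return path


if __name__ == "__main__":
    # The second message was not sampled.
    for state, prob in estimate_states(["m1", None, "m0"], [1.0, 0.0, 0.0]):
        print(f"{state}: {prob:.3f}")
```

In this toy run the probability attached to the most likely state drops in the slot where the message was skipped and recovers once a message is observed again, which is the kind of uncertainty the abstract attributes to random sampling.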