Problem description
Java-based monitoring application
Imagine the following architecture:
- We have a Java application built on the Spring framework (let's call it Manager).
- It accepts requests and can launch other applications to handle them (let's call them Containers). These Containers exist for a long time. They have a rather complex structure and consist of several different applications, but they have a control part written in C++ and bash scripts. The expected number is about 2000 on 8 CPU cores.
- We have a monitoring Heartbeat application (also Java on Spring). Data should somehow be gathered from those Containers (CPU, memory, and bandwidth usage, versions) and from the Manager, and then aggregated. The aggregated data is sent onward by Heartbeat on a regular basis.
What is your advice on implementing such a thing? Please provide pointers to frameworks or open-source projects addressing a similar problem, or some general considerations from your experience.
[UPDATE]
- Target OS: Solaris
- Mentioned 2000 processes are native applications. Each has its own chroot and runs logged in as a separate user.
Reference solutions
Method 1:
My initial concern is not the managing/monitoring aspect, but using the Java manager to launch 2,000 different processes on your one machine. How much memory do these consume? Are they running simultaneously? If they're different implementations, can the machine make use of shared libraries etc. to reduce memory consumption?
Then the Java process has to read/handle each process's stdout/stderr (usually you do this in a thread per stream, to prevent blocking; if you have 2,000 processes using this mechanism, then you're looking at 4,000 threads).
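One way to sidestep the thread-per-stream cost, if the Manager does not need to parse Container output live, is to let the OS write the child's output straight to a file. The sketch below assumes that is acceptable; the class name `ContainerLauncher`, the log file, and the `sh` command are hypothetical placeholders, not part of the original setup.

```java
import java.io.File;
import java.io.IOException;
import java.util.List;

public class ContainerLauncher {

    // Starts a child process whose stdout and stderr are appended to logFile
    // by the OS, so the JVM needs no reader threads to keep the pipes drained.
    static Process launch(List<String> command, File logFile) throws IOException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true); // merge stderr into stdout
        builder.redirectOutput(ProcessBuilder.Redirect.appendTo(logFile));
        return builder.start();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical stand-in for a real Container command line.
        File log = File.createTempFile("container-", ".log");
        Process p = launch(List.of("sh", "-c", "echo started; echo oops 1>&2"), log);
        int exit = p.waitFor();
        System.out.println("exit=" + exit + ", log=" + log);
    }
}
```

With this approach the 2,000 processes cost no extra JVM threads for I/O, at the price of having to read the log files if the output is ever needed.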
So I would think first about the architecture, and about scaling this across multiple machines and VMs, before looking at manageability.
To get your CPU/memory etc., I would have a look at JMX and the provided beans. JMX provides a means to expose this data over RMI etc. But getting this info per sub-process (container in your parlance) will require OS-specific info. What OS are you on? For Windows, check out WMI. On Unix/Linux, the various /proc filesystem nodes may help you.
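As a rough illustration of the JMX side, the standard platform beans can be read in-process like this. It is only a sketch: the class name `JvmSnapshot` and the map keys are made up for the example, and it covers only the JVM it runs in (the Manager or Heartbeat), not the native Containers, which still need an OS-specific source.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.OperatingSystemMXBean;
import java.util.LinkedHashMap;
import java.util.Map;

public class JvmSnapshot {

    // Collects a few values from the platform MXBeans of the current JVM.
    static Map<String, Object> collect() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();

        Map<String, Object> snapshot = new LinkedHashMap<>();
        snapshot.put("availableProcessors", os.getAvailableProcessors());
        // getSystemLoadAverage() returns a negative value when unavailable.
        snapshot.put("systemLoadAverage", os.getSystemLoadAverage());
        snapshot.put("heapUsedBytes", memory.getHeapMemoryUsage().getUsed());
        snapshot.put("threadCount", ManagementFactory.getThreadMXBean().getThreadCount());
        return snapshot;
    }

    public static void main(String[] args) {
        collect().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

The same beans can be exposed to a remote collector through a JMX connector, so Heartbeat could poll the Manager instead of running inside it.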
Method 2:
Maybe you can find something helpful on the Nagios site (http://www.nagios.org/).
(by Rorick, Brian Agnew, lothar)