Add nagios monitoring to mailman01's REST interface.
It listens on port 8001 on mailman01's localhost interface only.
So you will need to use nrpe and check_http.
Hello, may I work on this?
You may if you like :)
{{{
cd /srv/web/infra/ && patch -p0 < patch_nagios_server.txt
cd /srv/web/infra/ && patch -p0 < patch_nagios_client.txt
cd /srv/web/infra/ && patch -p0 < patch_nagios_client_templates.txt
}}}
check_http is only a basic check; if it should be more extensive, that's not a problem.
attachment patch_nagios_server.txt
attachment patch_nagios_client.txt
attachment patch_nagios_client_templates.txt
That looks pretty good, just a few comments:
Instead of just adding nagios-plugins-http to be installed on all clients, can you add a task that just installs it specifically on mailman01? Look in that same ansible/roles/nagios_client/tasks/main.yml role where we install just a few things on proxies for example.
Can you make it just one patch with all the changes?
Thanks for working on this!
attachment patch_nagcheck_rest_inteface.txt
Done.
ok, that looks pretty good, except one thing I just noticed.
This is a REST interface, not http, so I don't think check_http is going to work for us here. We may need a mailman-specific script. :(
http://wiki.list.org/DEV/REST%20Interface
Adding abompard here to comment on what we might check to confirm that the REST API is up and processing normally.
Wouldn't it be enough to detect a keyword from a REST URL using check_http?
The problem is that the whole API is protected by HTTP basic auth. I'll see if I can add a 'heartbeat' endpoint that would not be protected; in the meantime check_http should work, I think.
check_http can also use HTTP basic auth. Could you add a user and password to the HTTP password file?
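For reference, check_http's `-a`/`--authorization` option takes a `user:password` pair and sends it as HTTP basic auth. A minimal sketch of what that header looks like, in case someone wants to test the REST interface by hand with curl first. The function name `basic_auth_header` and the credentials are placeholders, not anything from the actual setup:

```shell
# Build the HTTP basic auth header for a user/password pair.
# (Hypothetical helper; basic auth is base64 of "user:password".)
basic_auth_header() {
    local user="$1" pass="$2"
    printf 'Authorization: Basic %s\n' \
        "$(printf '%s:%s' "$user" "$pass" | base64)"
}

# Usage on mailman01 (placeholder credentials, assumes curl is installed):
#   curl --silent --header "$(basic_auth_header user pass)" http://localhost:8001/
```

Note that the header is only encoded, not encrypted, which is less of a concern here since the interface listens on localhost only.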
Where did we leave this?
Can we use check_http with basic auth? Or just use check_http and see that it's up?
I'll pick up where this was left off.
I'm not sure where we ended up here. We should set up a check, but I am not sure what nagios check makes most sense.
@abompard Is there a non-authed endpoint we could check?
Failing that I guess we could just see that it's asking for auth and call that "up"?
Metadata Update from @kevin: - Issue tagged with: easyfix
Unfortunately everything is authed, so I guess checking for a 401 is the best we can do.
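A minimal sketch of the "treat a 401 as up" idea, written as a small script rather than the check that actually went into ansible. It assumes curl is available on mailman01 and probes the root URL; the function names are hypothetical, and the exit codes follow the usual Nagios plugin convention (0 = OK, 2 = CRITICAL):

```shell
#!/bin/bash
# Port 8001 on localhost comes from the ticket description.
REST_URL="http://localhost:8001/"

# Map an HTTP status code to a Nagios plugin result.
# Prints a status line and returns the matching exit code
# (0 = OK, 2 = CRITICAL).
status_to_nagios() {
    local code="$1"
    if [ "$code" = "401" ]; then
        echo "OK: REST interface asked for auth (HTTP 401)"
        return 0
    fi
    echo "CRITICAL: expected HTTP 401, got '${code}'"
    return 2
}

# Fetch only the status code; curl prints 000 when it gets no response.
fetch_status() {
    curl --silent --output /dev/null --max-time 10 \
         --write-out '%{http_code}' "$1"
}

# Usage on mailman01, e.g. wrapped in an nrpe command:
#   status_to_nagios "$(fetch_status "$REST_URL")"; exit $?
```

The stock plugin can probably do the same without a custom script: check_http's `-e`/`--expect` option matches against the status line of the response, so something like `check_http -H localhost -p 8001 -e 401` run through nrpe should behave similarly.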
ok. done in current ansible.
:panda_face:
Metadata Update from @kevin: - Issue close_status updated to: Fixed - Issue status updated to: Closed (was: Open)