forked from GithubBackups/healthchecks
Fix spelling, grammar, style mistakes
parent 8fa0d04830
commit 70519fcd89
@@ -52,7 +52,7 @@ in your account.</p>
 <p>Your requests to SITE_NAME Management API must authenticate using an
 API key. All API keys are project-specific. There are no account-wide API keys.
 By default, a project on SITE_NAME doesn't have an API key. You can create read-write
-and read-only API keys in the <strong>Project Settings</strong> page.</p>
+and read-only API keys on the <strong>Project Settings</strong> page.</p>
 <dl>
 <dt>read-write key</dt>
 <dd>Has full access to all documented API endpoints.</dd>
@@ -88,7 +88,7 @@ one or more tags.</p>
 <dl>
 <dt>tag=<value></dt>
 <dd>
-<p>Filters the checks, and returns only the checks that are tagged with the
+<p>Filters the checks and returns only the checks that are tagged with the
 specified value.</p>
 <p>This parameter can be repeated multiple times.</p>
 <p>Example:</p>
@@ -196,7 +196,7 @@ and <a href="#list-flips">Get a list of check's status changes</a> API calls.</p
 <p><code>GET SITE_ROOT/api/v1/checks/<uuid></code><br>
 <code>GET SITE_ROOT/api/v1/checks/<unique_key></code></p>
 <p>Returns a JSON representation of a single check. Accepts either check's UUID or
-the <code>unique_key</code> (a field derived from UUID, and returned by API responses when
+the <code>unique_key</code> (a field derived from UUID and returned by API responses when
 using the read-only API key) as an identifier.</p>
 <h3>Response Codes</h3>
 <dl>
@@ -295,7 +295,7 @@ Example:</p>
 <p>number, optional, default value: {{ default_timeout }}.</p>
 <p>A number of seconds, the expected period of this check.</p>
 <p>Minimum: 60 (one minute), maximum: 2592000 (30 days).</p>
-<p>Example for 5 minute timeout:</p>
+<p>Example for a 5-minute timeout:</p>
 <p><pre>{"kind": "simple", "timeout": 300}</pre></p>
 </dd>
 <dt>grace</dt>
@@ -357,8 +357,8 @@ UUIDs. You can look up integration UUIDs using the
 "4ec5a071-2d08-4baa-898a-eb4eb3cd6941,746a083e-f542-4554-be1a-707ce16d3acc"}</pre></p>
 <p>Alternatively, if you have named your integrations in SITE_NAME dashboard,
 you can specify integrations by their names. For this to work, your integrations
-need non-empty and unique names, and they must not contain commas. The names
-must match exactly, whitespace is significant.</p>
+need non-empty unique names, and they must not contain commas.
+The names must match exactly, whitespace is significant.</p>
 <p>Example:</p>
 <p><pre>{"channels": "Email to Alice,SMS to Alice"}</pre></p>
 </dd>
@@ -370,9 +370,9 @@ existing checks, filtered by fields listed in <code>unique</code>.</p>
 <p>If SITE_NAME does not find a matching check, it creates a new check and returns it
 with the HTTP status code 201.</p>
 <p>If SITE_NAME finds a matching check, it updates the existing check and
-and returns it with HTTP status code 200.</p>
+returns it with HTTP status code 200.</p>
 <p>The accepted values for the <code>unique</code> field are
-<code>name</code>, <code>tags</code>, <code>timeout</code> and <code>grace</code>.</p>
+<code>name</code>, <code>tags</code>, <code>timeout</code>, and <code>grace</code>.</p>
 <p>Example:</p>
 <p><pre>{"name": "Backups", unique: ["name"]}</pre></p>
 <p>In this example, if a check named "Backups" exists, it will be returned.
@@ -456,7 +456,7 @@ parameter, SITE_NAME will leave its value unchanged.</p>
 <p>number, optional.</p>
 <p>A number of seconds, the expected period of this check.</p>
 <p>Minimum: 60 (one minute), maximum: 2592000 (30 days).</p>
-<p>Example for 5 minute timeout:</p>
+<p>Example for a 5-minute timeout:</p>
 <p><pre>{"kind": "simple", "timeout": 300}</pre></p>
 </dd>
 <dt>grace</dt>
@@ -575,7 +575,7 @@ field values.</dd>

 <h2 class="rule" id="pause-check">Pause Monitoring of a Check</h2>
 <p><code>POST SITE_ROOT/api/v1/checks/<uuid>/pause</code></p>
-<p>Disables monitoring for a check, without removing it. The check goes into a "paused"
+<p>Disables monitoring for a check without removing it. The check goes into a "paused"
 state. You can resume monitoring of the check by pinging it.</p>
 <p>This API call has no request parameters.</p>
 <h3>Response Codes</h3>
@@ -621,7 +621,7 @@ header is sometimes required by some network proxies and web servers.</p>

 <h2 class="rule" id="delete-check">Delete Check</h2>
 <p><code>DELETE SITE_ROOT/api/v1/checks/<uuid></code></p>
-<p>Permanently deletes the check from user's account. Returns JSON representation of the
+<p>Permanently deletes the check from the user's account. Returns JSON representation of the
 check that was just deleted.</p>
 <p>This API call has no request parameters.</p>
 <h3>Response Codes</h3>
@@ -666,7 +666,7 @@ check that was just deleted.</p>
 <p><code>GET SITE_ROOT/api/v1/checks/<uuid>/pings/</code></p>
 <p>Returns a list of pings this check has received.</p>
 <p>This endpoint returns pings in reverse order (most recent first), and the total
-number of returned pings depends on account's billing plan: 100 for free accounts,
+number of returned pings depends on the account's billing plan: 100 for free accounts,
 1000 for paid accounts.</p>
 <h3>Response Codes</h3>
 <dl>
@@ -735,7 +735,7 @@ number of returned pings depends on account's billing plan: 100 for free account
 <p><code>GET SITE_ROOT/api/v1/checks/<uuid>/flips/</code><br>
 <code>GET SITE_ROOT/api/v1/checks/<unique_key>/flips/</code></p>
 <p>Returns a list of "flips" this check has experienced. A flip is a change of status
-(from "down" to "up", or from "up" to "down").</p>
+(from "down" to "up," or from "up" to "down").</p>
 <h3>Query String Parameters</h3>
 <dl>
 <dt>seconds=<value></dt>
@@ -22,7 +22,7 @@ Endpoint Name | Endpoint Address
 Your requests to SITE_NAME Management API must authenticate using an
 API key. All API keys are project-specific. There are no account-wide API keys.
 By default, a project on SITE_NAME doesn't have an API key. You can create read-write
-and read-only API keys in the **Project Settings** page.
+and read-only API keys on the **Project Settings** page.

 read-write key
 : Has full access to all documented API endpoints.
@@ -65,7 +65,7 @@ one or more tags.
 ### Query String Parameters

 tag=<value>
-: Filters the checks, and returns only the checks that are tagged with the
+: Filters the checks and returns only the checks that are tagged with the
 specified value.

 This parameter can be repeated multiple times.
@@ -181,7 +181,7 @@ Example:
 `GET SITE_ROOT/api/v1/checks/<unique_key>`

 Returns a JSON representation of a single check. Accepts either check's UUID or
-the `unique_key` (a field derived from UUID, and returned by API responses when
+the `unique_key` (a field derived from UUID and returned by API responses when
 using the read-only API key) as an identifier.

 ### Response Codes
@@ -296,7 +296,7 @@ timeout

 Minimum: 60 (one minute), maximum: 2592000 (30 days).

-Example for 5 minute timeout:
+Example for a 5-minute timeout:

 <pre>{"kind": "simple", "timeout": 300}</pre>

@@ -375,8 +375,8 @@ channels

 Alternatively, if you have named your integrations in SITE_NAME dashboard,
 you can specify integrations by their names. For this to work, your integrations
-need non-empty and unique names, and they must not contain commas. The names
-must match exactly, whitespace is significant.
+need non-empty unique names, and they must not contain commas.
+The names must match exactly, whitespace is significant.

 Example:

@@ -392,10 +392,10 @@ unique
 with the HTTP status code 201.

 If SITE_NAME finds a matching check, it updates the existing check and
-and returns it with HTTP status code 200.
+returns it with HTTP status code 200.

 The accepted values for the `unique` field are
-`name`, `tags`, `timeout` and `grace`.
+`name`, `tags`, `timeout`, and `grace`.

 Example:

@@ -495,7 +495,7 @@ timeout

 Minimum: 60 (one minute), maximum: 2592000 (30 days).

-Example for 5 minute timeout:
+Example for a 5-minute timeout:

 <pre>{"kind": "simple", "timeout": 300}</pre>

@@ -644,7 +644,7 @@ curl SITE_ROOT/api/v1/checks/f618072a-7bde-4eee-af63-71a77c5723bc \

 `POST SITE_ROOT/api/v1/checks/<uuid>/pause`

-Disables monitoring for a check, without removing it. The check goes into a "paused"
+Disables monitoring for a check without removing it. The check goes into a "paused"
 state. You can resume monitoring of the check by pinging it.

 This API call has no request parameters.
@@ -700,7 +700,7 @@ header is sometimes required by some network proxies and web servers.

 `DELETE SITE_ROOT/api/v1/checks/<uuid>`

-Permanently deletes the check from user's account. Returns JSON representation of the
+Permanently deletes the check from the user's account. Returns JSON representation of the
 check that was just deleted.

 This API call has no request parameters.
@@ -755,7 +755,7 @@ curl SITE_ROOT/api/v1/checks/f618072a-7bde-4eee-af63-71a77c5723bc \
 Returns a list of pings this check has received.

 This endpoint returns pings in reverse order (most recent first), and the total
-number of returned pings depends on account's billing plan: 100 for free accounts,
+number of returned pings depends on the account's billing plan: 100 for free accounts,
 1000 for paid accounts.

 ### Response Codes
@@ -833,7 +833,7 @@ curl SITE_ROOT/api/v1/checks/f618072a-7bde-4eee-af63-71a77c5723bc/pings/ \
 `GET SITE_ROOT/api/v1/checks/<unique_key>/flips/`

 Returns a list of "flips" this check has experienced. A flip is a change of status
-(from "down" to "up", or from "up" to "down").
+(from "down" to "up," or from "up" to "down").

 ### Query String Parameters

@@ -1,8 +1,9 @@
 <h1>Attaching Logs</h1>
 <p>SITE_NAME ping endpoints accept HTTP HEAD, GET and POST request methods.</p>
-<p>When using HTTP POST, <strong>you can include arbitrary payload in the request body</strong>.
+<p>When using HTTP POST, <strong>you can include an arbitrary payload in the request body</strong>.
 If the request body looks like a UTF-8 string, SITE_NAME will log the
-first 10 kilobytes (10 000 bytes) of the request body, so you can inspect it later.</p>
+first 10 kilobytes (10 000 bytes) of the request body, so that you can inspect
+it later.</p>
 <h2>Logging Command Output</h2>
 <p>In this example, we run <code>certbot renew</code>, capture its output (both the stdout
 and stderr streams), and submit the captured output to SITE_NAME:</p>
@@ -25,8 +26,8 @@ curl -fsS -m <span class="m">10</span> --retry <span class="m">5</span> --data-r

 <h2>Using Runitor</h2>
 <p><a href="https://github.com/bdd/runitor">Runitor</a> is a third party utility that runs the
-supplied command, captures its output and and reports to SITE_NAME.
-It also measures the execution time, and retries HTTP requests on transient errors.
+supplied command, captures its output and reports to SITE_NAME.
+It also measures the execution time and retries HTTP requests on transient errors.
 Best of all, the syntax is simple and clean:</p>
 <div class="bash highlight"><pre><span></span><code>runitor -uuid your-uuid-here -- /usr/bin/certbot renew
 </code></pre></div>
@@ -38,8 +39,8 @@ designed for that. If you run into the issue of logs getting cut off, consider
 the following options:</p>
 <ul>
 <li>See if the logs can be made less verbose. For example, if you have a batch job
-that outputs a line of text per item processed, perhaps it can output a short
-summary with the totals instead.</li>
+that outputs a line of text per item processed, perhaps it can output a summary with
+the totals instead.</li>
 <li>If the important content is usually at the end, submit the <strong>last 10KB</strong> instead
 of the first. Here is an example that submits the last 10KB of <code>dmesg</code> output:</li>
 </ul>
@@ -51,6 +52,6 @@ curl -fsS -m <span class="m">10</span> --retry <span class="m">5</span> --data-r


 <ul>
-<li>Finally, if for your use case it is critical to capture the entire log output,
+<li>Finally, if it is critical to capture the entire log output,
 consider using a dedicated log aggregation service for capturing the logs.</li>
 </ul>
@@ -2,9 +2,10 @@

 SITE_NAME ping endpoints accept HTTP HEAD, GET and POST request methods.

-When using HTTP POST, **you can include arbitrary payload in the request body**.
+When using HTTP POST, **you can include an arbitrary payload in the request body**.
 If the request body looks like a UTF-8 string, SITE_NAME will log the
-first 10 kilobytes (10 000 bytes) of the request body, so you can inspect it later.
+first 10 kilobytes (10 000 bytes) of the request body, so that you can inspect
+it later.

 ## Logging Command Output

@@ -33,8 +34,8 @@ curl -fsS -m 10 --retry 5 --data-raw "$m" PING_URL/$?
 ## Using Runitor

 [Runitor](https://github.com/bdd/runitor) is a third party utility that runs the
-supplied command, captures its output and and reports to SITE_NAME.
-It also measures the execution time, and retries HTTP requests on transient errors.
+supplied command, captures its output and reports to SITE_NAME.
+It also measures the execution time and retries HTTP requests on transient errors.
 Best of all, the syntax is simple and clean:

 ```bash
@@ -48,8 +49,8 @@ designed for that. If you run into the issue of logs getting cut off, consider
 the following options:

 * See if the logs can be made less verbose. For example, if you have a batch job
-that outputs a line of text per item processed, perhaps it can output a short
-summary with the totals instead.
+that outputs a line of text per item processed, perhaps it can output a summary with
+the totals instead.
 * If the important content is usually at the end, submit the **last 10KB** instead
 of the first. Here is an example that submits the last 10KB of `dmesg` output:

@@ -60,5 +61,5 @@ m=$(dmesg | tail --bytes=10000)
 curl -fsS -m 10 --retry 5 --data-raw "$m" PING_URL
 ```

-* Finally, if for your use case it is critical to capture the entire log output,
+* Finally, if it is critical to capture the entire log output,
 consider using a dedicated log aggregation service for capturing the logs.
@@ -1,10 +1,10 @@
 <h1>Shell Scripts</h1>
 <p>You can easily add SITE_NAME monitoring to a shell script. All you
-have to do is make a HTTP request at an appropriate place in the script.
+have to do is make an HTTP request at an appropriate place in the script.
 <a href="https://curl.haxx.se/docs/manpage.html">curl</a> and
 <a href="https://www.gnu.org/software/wget/manual/wget.html">wget</a>
-are two common command line HTTP clients you can use.</p>
-<div class="bash highlight"><pre><span></span><code><span class="c1"># Sends a HTTP GET request with curl:</span>
+are two common command-line HTTP clients you can use.</p>
+<div class="bash highlight"><pre><span></span><code><span class="c1"># Sends an HTTP GET request with curl:</span>
 curl -m <span class="m">10</span> --retry <span class="m">5</span> PING_URL

 <span class="c1"># Silent version (no stdout/stderr output unless curl hits an error):</span>
@@ -17,7 +17,7 @@ curl -fsS -m <span class="m">10</span> --retry <span class="m">5</span> -o /dev/
 <dt><strong>-m <seconds></strong></dt>
 <dd>Maximum time in seconds that you allow the whole operation to take.</dd>
 <dt><strong>--retry <num></strong></dt>
-<dd>If a HTTP request fails, retry up to this many times. By default, curl
+<dd>If an HTTP request fails, retry up to this many times. By default, curl
 uses an increasing delay between each retry (1s, 2s, 4s, 8s, ...).
 See also <a href="https://curl.haxx.se/docs/manpage.html#--retry-delay">--retry-delay</a>.</dd>
 <dt><strong>-f, --fail</strong></dt>
@@ -30,10 +30,10 @@ hides error messages.</dd>
 <dt><strong>-o /dev/null</strong></dt>
 <dd>Redirect curl's stdout to /dev/null (error messages still go to stderr).</dd>
 </dl>
-<h2>Signalling Failure from Shell Scripts</h2>
+<h2>Signaling Failure from Shell Scripts</h2>
 <p>You can append <code>/fail</code> or <code>/{exit-status}</code> to any ping URL and use the resulting URL
 to actively signal a failure. The exit status should be a 0-255 integer.
-SITE_NAME will interpret exit status 0 as success, and all non-zero values as failures.</p>
+SITE_NAME will interpret exit status 0 as success and all non-zero values as failures.</p>
 <p>The following example runs <code>/usr/bin/certbot renew</code>, and uses the <code>$?</code> variable to
 look up its exit status:</p>
 <div class="bash highlight"><pre><span></span><code><span class="ch">#!/bin/sh</span>
@@ -46,9 +46,9 @@ curl -m <span class="m">10</span> --retry <span class="m">5</span> PING_URL/<spa


 <h2>Logging Command Output</h2>
-<p>When pinging with HTTP POST, you can put extra diagnostic information in request
+<p>When pinging with HTTP POST, you can put extra diagnostic information in the request
 body. If the request body looks like a valid UTF-8 string, SITE_NAME
-will accept and store first 10KB of the request body.</p>
+will accept and store the first 10KB of the request body.</p>
 <p>In the below example, certbot's output is captured and submitted via HTTP POST:</p>
 <div class="bash highlight"><pre><span></span><code><span class="ch">#!/bin/sh</span>

@@ -59,7 +59,7 @@ curl -fsS -m <span class="m">10</span> --retry <span class="m">5</span> --data-r

 <h2>Auto-provisioning New Checks</h2>
 <p>This example uses SITE_NAME <a href="../api/">Management API</a> to create a check "on the fly"
-(if it does not already exist) and to retrieve its ping URL.
+(if it does not already exist) and retrieve its ping URL.
 Using this technique, you can write services that automatically
 register with SITE_NAME the first time they run.</p>
 <div class="bash highlight"><pre><span></span><code><span class="ch">#!/bin/bash</span>
@@ -1,13 +1,13 @@
 # Shell Scripts

 You can easily add SITE_NAME monitoring to a shell script. All you
-have to do is make a HTTP request at an appropriate place in the script.
+have to do is make an HTTP request at an appropriate place in the script.
 [curl](https://curl.haxx.se/docs/manpage.html) and
 [wget](https://www.gnu.org/software/wget/manual/wget.html)
-are two common command line HTTP clients you can use.
+are two common command-line HTTP clients you can use.

 ```bash
-# Sends a HTTP GET request with curl:
+# Sends an HTTP GET request with curl:
 curl -m 10 --retry 5 PING_URL

 # Silent version (no stdout/stderr output unless curl hits an error):
@@ -21,7 +21,7 @@ Here's what each curl parameter does:
 : Maximum time in seconds that you allow the whole operation to take.

 **--retry <num>**
-: If a HTTP request fails, retry up to this many times. By default, curl
+: If an HTTP request fails, retry up to this many times. By default, curl
 uses an increasing delay between each retry (1s, 2s, 4s, 8s, ...).
 See also [--retry-delay](https://curl.haxx.se/docs/manpage.html#--retry-delay).

@@ -38,11 +38,11 @@ Here's what each curl parameter does:
 **-o /dev/null**
 : Redirect curl's stdout to /dev/null (error messages still go to stderr).

-## Signalling Failure from Shell Scripts
+## Signaling Failure from Shell Scripts

 You can append `/fail` or `/{exit-status}` to any ping URL and use the resulting URL
 to actively signal a failure. The exit status should be a 0-255 integer.
-SITE_NAME will interpret exit status 0 as success, and all non-zero values as failures.
+SITE_NAME will interpret exit status 0 as success and all non-zero values as failures.

 The following example runs `/usr/bin/certbot renew`, and uses the `$?` variable to
 look up its exit status:
@@ -58,9 +58,9 @@ curl -m 10 --retry 5 PING_URL/$?

 ## Logging Command Output

-When pinging with HTTP POST, you can put extra diagnostic information in request
+When pinging with HTTP POST, you can put extra diagnostic information in the request
 body. If the request body looks like a valid UTF-8 string, SITE_NAME
-will accept and store first 10KB of the request body.
+will accept and store the first 10KB of the request body.

 In the below example, certbot's output is captured and submitted via HTTP POST:

@@ -74,7 +74,7 @@ curl -fsS -m 10 --retry 5 --data-raw "$m" PING_URL
 ## Auto-provisioning New Checks

 This example uses SITE_NAME [Management API](../api/) to create a check "on the fly"
-(if it does not already exist) and to retrieve its ping URL.
+(if it does not already exist) and retrieve its ping URL.
 Using this technique, you can write services that automatically
 register with SITE_NAME the first time they run.

@@ -2,20 +2,20 @@
 <p>You can clone individual checks from the "Check Details"
 page:</p>
 <p><img alt="The "Create a Copy" button" src="IMG_URL/create_copy.png" /></p>
-<p>The "Create a Copy..." function creates a new check in the same project, and copies
+<p>The "Create a Copy..." function creates a new check in the same project and copies
 over the following:</p>
 <ul>
 <li>Name, tags, description</li>
 <li>Schedule</li>
 <li>Assigned notification methods</li>
 </ul>
-<p>The newly created check has a different ping URL and it starts with an empty log.</p>
+<p>The newly created check has a different ping URL, and it starts with an empty log.</p>
 <h2>Cloning All Checks Into a New Project</h2>
 <p>It is sometimes useful to clone an entire project. For example, when recreating
 an existing deployment in a new region. The SITE_NAME web interface does
-not have a function to clone an entire project, but this can be done relatively
-easily using the <a href="../api/">Management API</a> calls. Below is an example using Python
-and the requests library:</p>
+not have a function to clone an entire project, but you can clone all checks in the
+project relatively easily using the <a href="../api/">Management API</a> calls.
+Below is an example using Python and the <a href="https://requests.readthedocs.io/en/master/">requests</a> library:</p>
 <div class="python highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">requests</span>

 <span class="n">API_URL</span> <span class="o">=</span> <span class="s2">"SITE_ROOT/api/v1/checks/"</span>
@@ -5,22 +5,22 @@ page:

 

-The "Create a Copy..." function creates a new check in the same project, and copies
+The "Create a Copy..." function creates a new check in the same project and copies
 over the following:

 * Name, tags, description
 * Schedule
 * Assigned notification methods

-The newly created check has a different ping URL and it starts with an empty log.
+The newly created check has a different ping URL, and it starts with an empty log.

 ## Cloning All Checks Into a New Project

 It is sometimes useful to clone an entire project. For example, when recreating
 an existing deployment in a new region. The SITE_NAME web interface does
-not have a function to clone an entire project, but this can be done relatively
-easily using the [Management API](../api/) calls. Below is an example using Python
-and the requests library:
+not have a function to clone an entire project, but you can clone all checks in the
+project relatively easily using the [Management API](../api/) calls.
+Below is an example using Python and the [requests](https://requests.readthedocs.io/en/master/) library:

 ```python
 import requests
@@ -9,9 +9,9 @@ either in SITE_NAME web interface or by calling <a href="../api/">Management API
 <p><img alt="Editing name, tags and description" src="IMG_URL/edit_name.png" /></p>
 <ul>
 <li><strong>Name</strong>: names are optional, but it is a good idea to set them.
-Good naming becomes especially important as you add more checks in the
+Good naming becomes especially important as you add more checks to the
 account. SITE_NAME will display check names in the web interface, in email reports,
-and in notifications.</li>
+and notifications.</li>
 <li><strong>Tags</strong>: a space-separated list of optional labels. Use tags to organize and group
 checks within a project. You can tag checks by the environment
 (<code>prod</code>, <code>staging</code>, <code>dev</code>, etc.) or by role (<code>www</code>, <code>db</code>, <code>worker</code>, etc.) or using
@@ -13,9 +13,9 @@ Describe each check using an optional name, tags, and description fields.
 

 * **Name**: names are optional, but it is a good idea to set them.
-Good naming becomes especially important as you add more checks in the
+Good naming becomes especially important as you add more checks to the
 account. SITE_NAME will display check names in the web interface, in email reports,
-and in notifications.
+and notifications.
 * **Tags**: a space-separated list of optional labels. Use tags to organize and group
 checks within a project. You can tag checks by the environment
 (`prod`, `staging`, `dev`, etc.) or by role (`www`, `db`, `worker`, etc.) or using
@@ -1,6 +1,6 @@
 <h1>Configuring Notifications</h1>
 <p>You can set up multiple ways to receive notifications when checks in your account
-change state. This is useful for several reasons:</p>
+change state. Doing so is useful for several reasons:</p>
 <ul>
 <li><strong>Redundancy in case of notification failures.</strong> Set up notifications using two
 different notification channels (for example, email and Slack). If one transport
@@ -8,7 +8,7 @@ fails (e.g., an email message goes to spam), you still receive a notification ov
 other channel.</li>
 <li><strong>Use different notification methods depending on urgency</strong>. For example, if a
 low-priority housekeeping script fails, post a message in chat. If a vital service fails,
-post in chat, send an email and send SMS.</li>
+post in chat, send an email, and send SMS.</li>
 <li>Route notifications to the right people.</li>
 </ul>
 <p>Each notification method ("integration") belongs to a project:
@@ -27,8 +27,8 @@ account's billing plan. The quota automatically resets at the start of each mont
 The "unused" sends from one month do not carry over to the next month.</p>
 <p>When an account exceeds its monthly limit, SITE_NAME will:</p>
 <ul>
-<li>Send a warning email to account's primary email address</li>
-<li>Show a warning message in the <strong>Integrations</strong> page</li>
+<li>Send a warning email to the account's primary email address</li>
+<li>Show a warning message on the <strong>Integrations</strong> page</li>
 </ul>
 <h2>Repeated Notifications</h2>
 <p>If you want to receive repeated notifications for as long as a particular check is
@@ -46,10 +46,10 @@ Set them up in <a href="../../accounts/profile/notifications/">Account Settings
 <p><img alt="Email reminder options" src="IMG_URL/email_reports.png" /></p>
 <h2>Monthly Reports</h2>
 <p>SITE_NAME sends monthly email reports at the start of each month. Use them
-to make sure all checks have their expected state and nothing has
+to make sure all checks have their expected state, and nothing has
 "fallen through the cracks."</p>
 <p>A monthly report shows checks from all your projects.
-For each check it lists:</p>
+For each check, it lists:</p>
 <ul>
 <li>check's current status</li>
 <li>the number of downtimes by month, for the last two months</li>
@@ -1,7 +1,7 @@
 # Configuring Notifications

 You can set up multiple ways to receive notifications when checks in your account
-change state. This is useful for several reasons:
+change state. Doing so is useful for several reasons:

 * **Redundancy in case of notification failures.** Set up notifications using two
 different notification channels (for example, email and Slack). If one transport
@@ -9,7 +9,7 @@ fails (e.g., an email message goes to spam), you still receive a notification ov
 other channel.
 * **Use different notification methods depending on urgency**. For example, if a
 low-priority housekeeping script fails, post a message in chat. If a vital service fails,
-post in chat, send an email and send SMS.
+post in chat, send an email, and send SMS.
 * Route notifications to the right people.

 Each notification method ("integration") belongs to a project:
@@ -35,8 +35,8 @@ The "unused" sends from one month do not carry over to the next month.

 When an account exceeds its monthly limit, SITE_NAME will:

-* Send a warning email to account's primary email address
-* Show a warning message in the **Integrations** page
+* Send a warning email to the account's primary email address
+* Show a warning message on the **Integrations** page


 ## Repeated Notifications
@ -58,11 +58,11 @@ Set them up in [Account Settings › Email Reports](../../accounts/profile/notif
|
||||
## Monthly Reports
|
||||
|
||||
SITE_NAME sends monthly email reports at the start of each month. Use them
|
||||
to make sure all checks have their expected state and nothing has
|
||||
to make sure all checks have their expected state, and nothing has
|
||||
"fallen through the cracks."
|
||||
|
||||
A monthly report shows checks from all your projects.
|
||||
For each check it lists:
|
||||
For each check, it lists:
|
||||
|
||||
* the check's current status
|
||||
* the number of downtimes by month, for the last two months
|
||||
|
@ -2,7 +2,7 @@
|
||||
<p>Healthchecks.io supports exporting metrics and check statuses to
|
||||
<a href="https://prometheus.io/">Prometheus</a>, for use with <a href="https://grafana.com/">Grafana</a>.</p>
|
||||
<p>You can generate the metrics export endpoint by going to your project settings
|
||||
and clicking "Create API Keys". You will then see the link to
|
||||
and clicking "Create API Keys." You will then see the link to
|
||||
the Prometheus endpoint:</p>
|
||||
<p><img alt="Project's API Keys" src="IMG_URL/prometheus_endpoint.png" /></p>
|
||||
<h2>Update the prometheus.yml</h2>
|
||||
@ -17,4 +17,4 @@ the Prometheus endpoint:</p>
|
||||
|
||||
|
||||
<p>Notice how we split up the URL and paste in the scheme, domain, and path separately.</p>
|
||||
<p>Reload Prometheus and your changes should be live, coming in under the <code>hc_</code> prefix.</p>
|
||||
<p>Reload Prometheus, and your changes should be live, coming in under the <code>hc_</code> prefix.</p>
|
@ -4,7 +4,7 @@ Healthchecks.io supports exporting metrics and check statuses to
|
||||
[Prometheus](https://prometheus.io/), for use with [Grafana](https://grafana.com/).
|
||||
|
||||
You can generate the metrics export endpoint by going to your project settings
|
||||
and clicking "Create API Keys". You will then see the link to
|
||||
and clicking "Create API Keys." You will then see the link to
|
||||
the Prometheus endpoint:
|
||||
|
||||

|
||||
@ -24,4 +24,4 @@ You can copy the Prometheus endpoint URL and add it to the Prometheus configurat
|
||||
|
||||
Notice how we split up the URL and paste in the scheme, domain, and path separately.
|
||||
|
||||
Reload Prometheus and your changes should be live, coming in under the `hc_` prefix.
|
||||
Reload Prometheus, and your changes should be live, coming in under the `hc_` prefix.
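The URL split mentioned above can be sketched in Python with the standard library's `urllib.parse.urlsplit`. This is only an illustration: the endpoint URL below is a made-up placeholder, and `split_endpoint` is a hypothetical helper name. Substitute the link shown in your own project settings.

```python
from urllib.parse import urlsplit


def split_endpoint(url):
    """Return the (scheme, domain, path) parts of a metrics endpoint URL."""
    parts = urlsplit(url)
    return parts.scheme, parts.netloc, parts.path


# Placeholder URL, for illustration only.
example = "https://example.com/projects/PROJECT_ID/metrics/API_KEY"
scheme, domain, path = split_endpoint(example)
print("scheme:", scheme)
print("domain:", domain)
print("path:", path)
```

The three printed values are the pieces that go into the Prometheus configuration separately.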
|
||||
|
@ -1,5 +1,5 @@
|
||||
<h1>C#</h1>
|
||||
<p>Below is an example of making a HTTP request to SITE_NAME from C#.</p>
|
||||
<p>Below is an example of making an HTTP request to SITE_NAME from C#.</p>
|
||||
<div class="csharp highlight"><pre><span></span><code><span class="k">using</span> <span class="p">(</span><span class="kt">var</span> <span class="n">client</span> <span class="p">=</span> <span class="k">new</span> <span class="n">System</span><span class="p">.</span><span class="n">Net</span><span class="p">.</span><span class="n">WebClient</span><span class="p">())</span>
|
||||
<span class="p">{</span>
|
||||
<span class="n">client</span><span class="p">.</span><span class="n">DownloadString</span><span class="p">(</span><span class="s">"PING_URL"</span><span class="p">);</span>
|
||||
|
@ -1,6 +1,6 @@
|
||||
# C\#
|
||||
|
||||
Below is an example of making a HTTP request to SITE_NAME from C#.
|
||||
Below is an example of making an HTTP request to SITE_NAME from C#.
|
||||
|
||||
```csharp
|
||||
using (var client = new System.Net.WebClient())
|
||||
|
@ -1,13 +1,13 @@
|
||||
<h1>Email</h1>
|
||||
<p>As an alternative to HTTP/HTTPS requests, you can "ping" checks by
|
||||
sending an emails to special email addresses.</p>
|
||||
sending email messages to special email addresses.</p>
|
||||
<p><img alt="Email address for pinging via email" src="IMG_URL/emails.png" /></p>
|
||||
<h2>Use Case: Newsletter Delivery Monitoring</h2>
|
||||
<p>Consider a cron job which runs weekly and sends weekly newsletters
|
||||
to a list of e-mail addresses. You have already set up a check to get alerted
|
||||
<p>Consider a cron job that runs weekly and sends weekly newsletters
|
||||
to a list of email addresses. You have already set up a check to get alerted
|
||||
when your cron job fails to run. But what you ultimately want to check is if
|
||||
<strong>your emails are getting sent and delivered</strong>.</p>
|
||||
<p>The solution: set up another check, and add its email address to your list of
|
||||
recipient email addresses. Set its Period to 1 week. As long as your weekly email
|
||||
script runs correctly, and there are no email delivery issues,
|
||||
SITE_NAME will regularly receive an email, and the check and will stay up.</p>
|
||||
SITE_NAME will regularly receive an email, and the check will stay up.</p>
|
@ -1,18 +1,18 @@
|
||||
# Email
|
||||
|
||||
As an alternative to HTTP/HTTPS requests, you can "ping" checks by
|
||||
sending an emails to special email addresses.
|
||||
sending email messages to special email addresses.
|
||||
|
||||

|
||||
|
||||
## Use Case: Newsletter Delivery Monitoring
|
||||
|
||||
Consider a cron job which runs weekly and sends weekly newsletters
|
||||
to a list of e-mail addresses. You have already set up a check to get alerted
|
||||
Consider a cron job that runs weekly and sends weekly newsletters
|
||||
to a list of email addresses. You have already set up a check to get alerted
|
||||
when your cron job fails to run. But what you ultimately want to check is if
|
||||
**your emails are getting sent and delivered**.
|
||||
|
||||
The solution: set up another check, and add its email address to your list of
|
||||
recipient email addresses. Set its Period to 1 week. As long as your weekly email
|
||||
script runs correctly, and there are no email delivery issues,
|
||||
SITE_NAME will regularly receive an email, and the check and will stay up.
|
||||
SITE_NAME will regularly receive an email, and the check will stay up.
|
||||
|
@ -1,5 +1,5 @@
|
||||
<h1>Go</h1>
|
||||
<p>Below is an example of making a HTTP request to SITE_NAME from Go.</p>
|
||||
<p>Below is an example of making an HTTP request to SITE_NAME from Go.</p>
|
||||
<div class="go highlight"><pre><span></span><code><span class="kn">package</span> <span class="nx">main</span>
|
||||
|
||||
<span class="kn">import</span> <span class="s">"fmt"</span>
|
||||
|
@ -1,6 +1,6 @@
|
||||
# Go
|
||||
|
||||
Below is an example of making a HTTP request to SITE_NAME from Go.
|
||||
Below is an example of making an HTTP request to SITE_NAME from Go.
|
||||
|
||||
```go
|
||||
package main
|
||||
|
@ -70,8 +70,8 @@ OK
|
||||
</code></pre></div>
|
||||
|
||||
|
||||
<p>Sends a "job has started!" message to SITE_NAME. This is
|
||||
optional but enables a few extra features:</p>
|
||||
<p>Sends a "job has started!" message to SITE_NAME. Sending a "start" signal is
|
||||
optional, but it enables a few extra features:</p>
|
||||
<ul>
|
||||
<li>SITE_NAME will measure and display job execution times</li>
|
||||
<li>SITE_NAME will detect if the job runs longer than its configured grace time</li>
|
||||
@ -101,7 +101,7 @@ OK
|
||||
|
||||
<p>Sends a success or failure signal depending on the exit status
|
||||
included in the URL. The exit status is a 0-255 integer. SITE_NAME
|
||||
interprets 0 as success, and all other values as failure.</p>
|
||||
interprets 0 as success and all other values as failure.</p>
|
||||
<p><strong>Example</strong></p>
|
||||
<div class="http highlight"><pre><span></span><code><span class="nf">GET</span> <span class="nn">/5bf66975-d4c7-4bf5-bcc8-b8d8a82ea278/1</span> <span class="kr">HTTP</span><span class="o">/</span><span class="m">1.0</span>
|
||||
<span class="na">Host</span><span class="o">:</span> <span class="l">hc-ping.com</span>
|
||||
|
@ -81,8 +81,8 @@ OK
|
||||
HEAD|GET|POST PING_ENDPOINT{uuid}/start
|
||||
```
|
||||
|
||||
Sends a "job has started!" message to SITE_NAME. This is
|
||||
optional but enables a few extra features:
|
||||
Sends a "job has started!" message to SITE_NAME. Sending a "start" signal is
|
||||
optional, but it enables a few extra features:
|
||||
|
||||
* SITE_NAME will measure and display job execution times
|
||||
* SITE_NAME will detect if the job runs longer than its configured grace time
|
||||
@ -114,7 +114,7 @@ HEAD|GET|POST PING_ENDPOINT{uuid}/{exit-status}
|
||||
|
||||
Sends a success or failure signal depending on the exit status
|
||||
included in the URL. The exit status is a 0-255 integer. SITE_NAME
|
||||
interprets 0 as success, and all other values as failure.
|
||||
interprets 0 as success and all other values as failure.
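As a rough sketch of the rule above, the following Python builds an exit-status ping URL and applies the "0 is success, anything else is failure" interpretation. The helper names are hypothetical, and the ping URL is a placeholder reusing the sample UUID from this section.

```python
def exit_status_url(ping_url, exit_status):
    """Build the ping URL that reports `exit_status` (an integer from 0 to 255)."""
    if not 0 <= exit_status <= 255:
        raise ValueError("exit status must be in the range 0-255")
    return "%s/%d" % (ping_url.rstrip("/"), exit_status)


def is_success(exit_status):
    """Only 0 counts as success; every other value is a failure."""
    return exit_status == 0


# Placeholder ping URL, for illustration only.
PING_URL = "https://hc-ping.com/5bf66975-d4c7-4bf5-bcc8-b8d8a82ea278"
print(exit_status_url(PING_URL, 1))
print(is_success(1))
```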
|
||||
|
||||
**Example**
|
||||
|
||||
|
@ -1,5 +1,5 @@
|
||||
<h1>JavaScript</h1>
|
||||
<p>Below is an example of making a HTTP request to SITE_NAME from Node.js.</p>
|
||||
<p>Below is an example of making an HTTP request to SITE_NAME from Node.js.</p>
|
||||
<div class="js highlight"><pre><span></span><code><span class="kd">var</span> <span class="nx">https</span> <span class="o">=</span> <span class="nx">require</span><span class="p">(</span><span class="s1">'https'</span><span class="p">);</span>
|
||||
<span class="nx">https</span><span class="p">.</span><span class="nx">get</span><span class="p">(</span><span class="s1">'PING_URL'</span><span class="p">).</span><span class="nx">on</span><span class="p">(</span><span class="s1">'error'</span><span class="p">,</span> <span class="p">(</span><span class="nx">err</span><span class="p">)</span> <span class="p">=></span> <span class="p">{</span>
|
||||
<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="s1">'Ping failed: '</span> <span class="o">+</span> <span class="nx">err</span><span class="p">)</span>
|
||||
|
@ -1,6 +1,6 @@
|
||||
# JavaScript
|
||||
|
||||
Below is an example of making a HTTP request to SITE_NAME from Node.js.
|
||||
Below is an example of making an HTTP request to SITE_NAME from Node.js.
|
||||
|
||||
```js
|
||||
var https = require('https');
|
||||
|
@ -1,11 +1,11 @@
|
||||
<h1>Measuring Script Run Time</h1>
|
||||
<p>Append <code>/start</code> to a ping URL and use it to signal when a job starts.
|
||||
After receiving a start signal, Healthchecks.io will show the check as "Started".
|
||||
It will store the "start" events and display the job execution times. The job
|
||||
execution times are calculated as the time gaps between adjacent "start" and
|
||||
After receiving a start signal, Healthchecks.io will show the check as "Started."
|
||||
It will store the "start" events and display the job execution times. SITE_NAME
|
||||
calculates the job execution times as the time gaps between adjacent "start" and
|
||||
"complete" events.</p>
|
||||
<p>Signalling a start kicks off a separate timer: the job now <strong>must</strong> signal a
|
||||
success within its configured "Grace Time", or it will get marked as "down".</p>
|
||||
<p>Signaling a start kicks off a separate timer: the job now <strong>must</strong> signal a
|
||||
success within its configured "Grace Time," or it will get marked as "down."</p>
|
||||
<p>Below is a code example in Python:</p>
|
||||
<div class="python highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">requests</span>
|
||||
<span class="n">URL</span> <span class="o">=</span> <span class="s2">"PING_URL"</span>
|
||||
|
@ -1,13 +1,13 @@
|
||||
# Measuring Script Run Time
|
||||
|
||||
Append `/start` to a ping URL and use it to signal when a job starts.
|
||||
After receiving a start signal, Healthchecks.io will show the check as "Started".
|
||||
It will store the "start" events and display the job execution times. The job
|
||||
execution times are calculated as the time gaps between adjacent "start" and
|
||||
After receiving a start signal, Healthchecks.io will show the check as "Started."
|
||||
It will store the "start" events and display the job execution times. SITE_NAME
|
||||
calculates the job execution times as the time gaps between adjacent "start" and
|
||||
"complete" events.
|
||||
|
||||
Signalling a start kicks off a separate timer: the job now **must** signal a
|
||||
success within its configured "Grace Time", or it will get marked as "down".
|
||||
Signaling a start kicks off a separate timer: the job now **must** signal a
|
||||
success within its configured "Grace Time," or it will get marked as "down."
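To make the "time gaps between adjacent start and complete events" idea concrete, here is a small sketch of how such durations could be computed. It is not the service's actual implementation: the event format, the `execution_times` name, and the choice to let a later "start" replace an earlier unmatched one are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def execution_times(events):
    """Pair each "start" event with the next "complete" event.

    `events` is a chronologically ordered list of (kind, timestamp) tuples.
    Returns a list of timedelta durations.
    """
    durations = []
    started_at = None
    for kind, timestamp in events:
        if kind == "start":
            # Assumption: a repeated "start" replaces the earlier one.
            started_at = timestamp
        elif kind == "complete" and started_at is not None:
            durations.append(timestamp - started_at)
            started_at = None
    return durations


events = [
    ("start", datetime(2020, 1, 1, 6, 8, 0)),
    ("complete", datetime(2020, 1, 1, 6, 10, 30)),
    ("start", datetime(2020, 1, 2, 6, 8, 0)),
    ("complete", datetime(2020, 1, 2, 6, 9, 15)),
]
for duration in execution_times(events):
    print(duration)
```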
|
||||
|
||||
Below is a code example in Python:
|
||||
|
||||
|
@ -1,7 +1,7 @@
|
||||
<h1>Monitoring Cron Jobs</h1>
|
||||
<p>SITE_NAME is perfectly suited for monitoring cron jobs. All you have to do is
|
||||
update your cron job command to send a HTTP request to SITE_NAME
|
||||
after a job completes.</p>
|
||||
update your cron job command to send an HTTP request to SITE_NAME
|
||||
after completing the job.</p>
|
||||
<p>Let's look at an example:</p>
|
||||
<div class="bash highlight"><pre><span></span><code>$ crontab -l
|
||||
<span class="c1"># m h dom mon dow command</span>
|
||||
@ -11,10 +11,10 @@ after a job completes.</p>
|
||||
|
||||
<p>The above job runs <code>/home/user/backup.sh</code> every day at 6:08. The backup
|
||||
script is presumably a headless, background process. Even if it works
|
||||
correctly currently, it can start silently failing in future, without
|
||||
correctly currently, it can start silently failing in the future without
|
||||
anyone noticing.</p>
|
||||
<p>You can set up SITE_NAME to notify you whenever the backup script does not
|
||||
run on time or does not complete successfully. Here are the steps to do that.</p>
|
||||
run on time, or it does not complete successfully. Here are the steps to do that.</p>
|
||||
<ol>
|
||||
<li>
|
||||
<p>If you have not already, sign up for a free SITE_NAME account.</p>
|
||||
@ -46,20 +46,20 @@ after the command:</p>
|
||||
</code></pre></div>
|
||||
|
||||
|
||||
<p>Now, each time your cron job runs, it will send a HTTP request to the ping URL.
|
||||
Since SITE_NAME knows the schedule of your cron job, it can calculate
|
||||
<p>Now, each time your cron job runs, it will send an HTTP request to the ping URL.
|
||||
Since SITE_NAME knows your cron job's schedule, it can calculate
|
||||
the dates and times when the job should run. As soon as your cron job doesn't
|
||||
report at an expected time, SITE_NAME will send you a notification.</p>
|
||||
<p>This monitoring technique takes care of various failure scenarios that could
|
||||
potentially go unnoticed otherwise:</p>
|
||||
<ul>
|
||||
<li>The whole machine goes down (power outage, janitor stumbles on wires, VPS provider problems, etc.)</li>
|
||||
<li>cron daemon is not running, or has invalid configuration</li>
|
||||
<li>cron does start your task, but the task exits with non-zero exit code</li>
|
||||
<li>the cron daemon is not running or has an invalid configuration</li>
|
||||
<li>cron does start your task, but the task exits with a non-zero exit code</li>
|
||||
</ul>
|
||||
<h2>Curl Options</h2>
|
||||
<p>The extra options in the above example tells curl to retry failed HTTP requests, and
|
||||
to silence output unless there is an error. Feel free to adjust the curl options to
|
||||
<p>The extra options in the above example tell curl to retry failed HTTP requests, and
|
||||
silence output unless there is an error. Feel free to adjust the curl options to
|
||||
suit your needs.</p>
|
||||
<dl>
|
||||
<dt><strong>&&</strong></dt>
|
||||
@ -80,8 +80,8 @@ Transient error is a timeout or an HTTP 5xx response code.</dd>
|
||||
</dl>
|
||||
<h2>Looking up Your Machine's Time Zone</h2>
|
||||
<p>If your cron job consistently pings SITE_NAME an hour early or an hour late,
|
||||
the likely cause is a timezone mismatch: your machine may be using a different timezone
|
||||
than what is configured on SITE_NAME.</p>
|
||||
the likely cause is a timezone mismatch: your machine may be using a timezone
|
||||
different from what you have configured on SITE_NAME.</p>
|
||||
<p>On modern GNU/Linux systems, you can look up the time zone using the
|
||||
<code>timedatectl status</code> command and looking for "Time zone" in its output:</p>
|
||||
<div class="text highlight"><pre><span></span><code>$ timedatectl status
|
||||
@ -104,6 +104,6 @@ including logs from the cron daemon.</p>
|
||||
</code></pre></div>
|
||||
|
||||
|
||||
<p>To see logs from e.g. the last hour, and only from the cron daemon:</p>
|
||||
<p>To see the logs from e.g. the last hour, and only from the cron daemon:</p>
|
||||
<div class="bash highlight"><pre><span></span><code>journalctl --since <span class="s2">"1 hour ago"</span> -t CRON
|
||||
</code></pre></div>
|
@ -1,8 +1,8 @@
|
||||
# Monitoring Cron Jobs
|
||||
|
||||
SITE_NAME is perfectly suited for monitoring cron jobs. All you have to do is
|
||||
update your cron job command to send a HTTP request to SITE_NAME
|
||||
after a job completes.
|
||||
update your cron job command to send an HTTP request to SITE_NAME
|
||||
after completing the job.
|
||||
|
||||
Let's look at an example:
|
||||
|
||||
@ -14,11 +14,11 @@ $ crontab -l
|
||||
|
||||
The above job runs `/home/user/backup.sh` every day at 6:08. The backup
|
||||
script is presumably a headless, background process. Even if it works
|
||||
correctly currently, it can start silently failing in future, without
|
||||
correctly currently, it can start silently failing in the future without
|
||||
anyone noticing.
|
||||
|
||||
You can set up SITE_NAME to notify you whenever the backup script does not
|
||||
run on time or does not complete successfully. Here are the steps to do that.
|
||||
run on time, or it does not complete successfully. Here are the steps to do that.
|
||||
|
||||
1. If you have not already, sign up for a free SITE_NAME account.
|
||||
|
||||
@ -44,8 +44,8 @@ $ crontab -e
|
||||
8 6 * * * /home/user/backup.sh && curl -fsS --retry 5 -o /dev/null PING_URL
|
||||
```
|
||||
|
||||
Now, each time your cron job runs, it will send a HTTP request to the ping URL.
|
||||
Since SITE_NAME knows the schedule of your cron job, it can calculate
|
||||
Now, each time your cron job runs, it will send an HTTP request to the ping URL.
|
||||
Since SITE_NAME knows your cron job's schedule, it can calculate
|
||||
the dates and times when the job should run. As soon as your cron job doesn't
|
||||
report at an expected time, SITE_NAME will send you a notification.
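The `command && curl ...` pattern used in the crontab entry can also be expressed in Python, which may help if the job is already wrapped in a script. The sketch below assumes a hypothetical `run_then_ping` helper and takes the ping as a callable, so the actual HTTP client is left to you.

```python
import subprocess
import sys


def run_then_ping(command, ping):
    """Run `command`; call `ping()` only if it exits with status 0.

    This mirrors the shell idiom `command && curl ...`.
    """
    result = subprocess.run(command)
    if result.returncode == 0:
        ping()
    return result.returncode


def fake_ping():
    # Stand-in for a real HTTP request to the ping URL.
    print("ping sent")


# A trivial job that succeeds; replace it with your real command.
job = [sys.executable, "-c", "raise SystemExit(0)"]
run_then_ping(job, fake_ping)
```

If the job exits with a non-zero status, no ping is sent, and the check eventually goes down because the expected signal never arrives.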
|
||||
|
||||
@ -53,13 +53,13 @@ This monitoring technique takes care of various failure scenarios that could
|
||||
potentially go unnoticed otherwise:
|
||||
|
||||
* The whole machine goes down (power outage, janitor stumbles on wires, VPS provider problems, etc.)
|
||||
* cron daemon is not running, or has invalid configuration
|
||||
* cron does start your task, but the task exits with non-zero exit code
|
||||
* the cron daemon is not running or has an invalid configuration
|
||||
* cron does start your task, but the task exits with a non-zero exit code
|
||||
|
||||
## Curl Options
|
||||
|
||||
The extra options in the above example tells curl to retry failed HTTP requests, and
|
||||
to silence output unless there is an error. Feel free to adjust the curl options to
|
||||
The extra options in the above example tell curl to retry failed HTTP requests, and
|
||||
silence output unless there is an error. Feel free to adjust the curl options to
|
||||
suit your needs.
|
||||
|
||||
**&&**
|
||||
@ -86,8 +86,8 @@ suit your needs.
|
||||
## Looking up Your Machine's Time Zone
|
||||
|
||||
If your cron job consistently pings SITE_NAME an hour early or an hour late,
|
||||
the likely cause is a timezone mismatch: your machine may be using a different timezone
|
||||
than what is configured on SITE_NAME.
|
||||
the likely cause is a timezone mismatch: your machine may be using a timezone
|
||||
different from what you have configured on SITE_NAME.
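The following sketch shows why a mismatch produces a steady one-hour offset. It compares the same wall-clock time (6:08, as in the crontab example) in two fixed-offset time zones. The offsets and the `ping_offset` helper are illustrative assumptions; real time zones also involve daylight saving rules that this sketch ignores.

```python
from datetime import datetime, timedelta, timezone


def ping_offset(hour, minute, machine_tz, configured_tz):
    """Return how much later the machine's run is than the configured schedule."""
    actual = datetime(2020, 1, 1, hour, minute, tzinfo=machine_tz)
    expected = datetime(2020, 1, 1, hour, minute, tzinfo=configured_tz)
    return actual - expected


# The machine runs cron in UTC, but the check is configured for UTC+1.
offset = ping_offset(6, 8, timezone.utc, timezone(timedelta(hours=1)))
print(offset)
```

A positive result means the pings arrive late relative to the configured schedule; a negative one means they arrive early.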
|
||||
|
||||
On modern GNU/Linux systems, you can look up the time zone using the
|
||||
`timedatectl status` command and looking for "Time zone" in its output:
|
||||
@ -115,7 +115,7 @@ To see live logs:
|
||||
journalctl -f
|
||||
```
|
||||
|
||||
To see logs from e.g. the last hour, and only from the cron daemon:
|
||||
To see the logs from e.g. the last hour, and only from the cron daemon:
|
||||
|
||||
```bash
|
||||
journalctl --since "1 hour ago" -t CRON
|
||||
|
@ -1,5 +1,5 @@
|
||||
<h1>PHP</h1>
|
||||
<p>Below is an example of making a HTTP request to SITE_NAME from PHP.</p>
|
||||
<p>Below is an example of making an HTTP request to SITE_NAME from PHP.</p>
|
||||
<div class="php highlight"><pre><span></span><code><span class="nb">file_get_contents</span><span class="p">(</span><span class="s1">'PING_URL'</span><span class="p">);</span>
|
||||
</code></pre></div>
|
||||
|
||||
|
@ -1,6 +1,6 @@
|
||||
# PHP
|
||||
|
||||
Below is an example of making a HTTP request to SITE_NAME from PHP.
|
||||
Below is an example of making an HTTP request to SITE_NAME from PHP.
|
||||
|
||||
```php
|
||||
file_get_contents('PING_URL');
|
||||
|
@ -1,16 +1,16 @@
|
||||
<h1>PowerShell</h1>
|
||||
<p>You can use <a href="https://msdn.microsoft.com/en-us/powershell/mt173057.aspx">PowerShell</a>
|
||||
and Windows Task Scheduler to automate various tasks on a Windows system.
|
||||
From within a PowerShell script it is also easy to ping SITE_NAME.</p>
|
||||
From within a PowerShell script, it is also easy to ping SITE_NAME.</p>
|
||||
<p>Here is a simple PowerShell script that pings SITE_NAME. When scheduled to
|
||||
run with Task Scheduler, it will essentially just send regular "I'm alive" messages.
|
||||
You can of course extend it to do more things.</p>
|
||||
Of course, you can extend it to do more things.</p>
|
||||
<div class="powershell highlight"><pre><span></span><code><span class="c"># inside a PowerShell script:</span>
|
||||
<span class="nb">Invoke-RestMethod</span> <span class="n">PING_URL</span>
|
||||
</code></pre></div>
|
||||
|
||||
|
||||
<p>Save the above to e.g. <code>C:\Scripts\healthchecks.ps1</code>.
|
||||
<p>Save the above to e.g., <code>C:\Scripts\healthchecks.ps1</code>.
|
||||
Then use the following command in a Scheduled Task to run the script:</p>
|
||||
<div class="bat highlight"><pre><span></span><code>powershell.exe -ExecutionPolicy bypass -File C:\Scripts\healthchecks.ps1
|
||||
</code></pre></div>
|
||||
|
@ -2,18 +2,18 @@
|
||||
|
||||
You can use [PowerShell](https://msdn.microsoft.com/en-us/powershell/mt173057.aspx)
|
||||
and Windows Task Scheduler to automate various tasks on a Windows system.
|
||||
From within a PowerShell script it is also easy to ping SITE_NAME.
|
||||
From within a PowerShell script, it is also easy to ping SITE_NAME.
|
||||
|
||||
Here is a simple PowerShell script that pings SITE_NAME. When scheduled to
|
||||
run with Task Scheduler, it will essentially just send regular "I'm alive" messages.
|
||||
You can of course extend it to do more things.
|
||||
Of course, you can extend it to do more things.
|
||||
|
||||
```powershell
|
||||
# inside a PowerShell script:
|
||||
Invoke-RestMethod PING_URL
|
||||
```
|
||||
|
||||
Save the above to e.g. `C:\Scripts\healthchecks.ps1`.
|
||||
Save the above to e.g., `C:\Scripts\healthchecks.ps1`.
|
||||
Then use the following command in a Scheduled Task to run the script:
|
||||
|
||||
```bat
|
||||
|
@ -5,7 +5,7 @@ your checks between them as your usage grows.</p>
|
||||
<p><img alt="An overview of projects" src="IMG_URL/projects.png" /></p>
|
||||
<p>Checks and integrations are project-scoped: each check and each configured
|
||||
integration always belongs to a particular project. Checks can be transferred
|
||||
from one project to another, preserving check's ping address:</p>
|
||||
between projects. The transfer operation preserves the check's ping address:</p>
|
||||
<p><img alt="The transfer dialog" src="IMG_URL/transfer_check.png" /></p>
|
||||
<h2>Team Access</h2>
|
||||
<p>You can grant your colleagues access to a project by inviting them into
|
||||
@ -47,17 +47,17 @@ limit depends on the account's billing plan. When you reach the Check Limit
|
||||
you will not be able to create new checks.</p>
|
||||
<p>All projects owned by your account share your account's Check Limit.
|
||||
For example, consider a Business account with two projects,
|
||||
"Project A" and "Project B". If A has 70 checks, then B cannot have more than
|
||||
30 checks, in order to not exceed the Business account's total limit of 100.</p>
|
||||
"Project A" and "Project B." If A has 70 checks, then B cannot have more than
|
||||
30 checks in order to not exceed the Business account's total limit of 100.</p>
|
||||
<p>However, only checks from your own projects count towards your account's
|
||||
quota. If you get invited to somebody else's project, that does not change
|
||||
the number of checks you can create in your projects.</p>
|
||||
<h2>Projects and Team Size Limits</h2>
|
||||
<p><strong>Team Size</strong> is the number of <em>unique</em> users you can invite in your projects.
|
||||
<p><strong>Team Size</strong> is the number of <em>unique</em> users you can invite to your projects.
|
||||
Same as with Check Limit, all projects share your account's Team Size limit.
|
||||
However, if you invite the same user (using the same email address) into multiple
|
||||
projects, it only takes up a single seat.</p>
|
||||
<h2>Projects and Monthly Reports</h2>
|
||||
<p>SITE_NAME sends monthly email reports at the start of each month. The monthly reports
|
||||
list a summary of checks from <strong>all your projects</strong>. They contain status summaries for
|
||||
both the projects you own, and the projects you are a member of.</p>
|
||||
both the projects you own and the projects you are a member of.</p>
|
@ -8,7 +8,7 @@ your checks between them as your usage grows.
|
||||
|
||||
Checks and integrations are project-scoped: each check and each configured
|
||||
integration always belongs to a particular project. Checks can be transferred
|
||||
from one project to another, preserving check's ping address:
|
||||
between projects. The transfer operation preserves the check's ping address:
|
||||
|
||||

|
||||
|
||||
@ -61,8 +61,8 @@ you will not be able to create new checks.
|
||||
|
||||
All projects owned by your account share your account's Check Limit.
|
||||
For example, consider a Business account with two projects,
|
||||
"Project A" and "Project B". If A has 70 checks, then B cannot have more than
|
||||
30 checks, in order to not exceed the Business account's total limit of 100.
|
||||
"Project A" and "Project B." If A has 70 checks, then B cannot have more than
|
||||
30 checks in order to not exceed the Business account's total limit of 100.
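The arithmetic in this example can be written out as a tiny helper. The function name and the dictionary layout are hypothetical; the numbers are the ones from the paragraph above.

```python
def remaining_checks(check_limit, checks_per_project):
    """Return how many more checks the account can create across all its projects."""
    used = sum(checks_per_project.values())
    return max(check_limit - used, 0)


# Account limit of 100, with 70 checks already in Project A.
print(remaining_checks(100, {"Project A": 70}))
```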
|
||||
|
||||
However, only checks from your own projects count towards your account's
|
||||
quota. If you get invited to somebody else's project, that does not change
|
||||
@ -70,7 +70,7 @@ the number of checks you can create in your projects.
|
||||
|
||||
## Projects and Team Size Limits
|
||||
|
||||
**Team Size** is the number of *unique* users you can invite in your projects.
|
||||
**Team Size** is the number of *unique* users you can invite to your projects.
|
||||
Same as with Check Limit, all projects share your account's Team Size limit.
|
||||
However, if you invite the same user (using the same email address) into multiple
|
||||
projects, it only takes up a single seat.
|
||||
@ -79,4 +79,4 @@ projects, it only takes up a single seat.
|
||||
|
||||
SITE_NAME sends monthly email reports at the start of each month. The monthly reports
|
||||
list a summary of checks from **all your projects**. They contain status summaries for
|
||||
both the projects you own, and the projects you are a member of.
|
||||
both the projects you own and the projects you are a member of.
|
||||
|
@ -1,5 +1,6 @@
|
||||
<h1>Python</h1>
|
||||
<p>If you are already using the requests library, it is convenient to also use it here:</p>
|
||||
<p>If you are already using the <a href="https://requests.readthedocs.io/en/master/">requests</a>
|
||||
library, it is convenient to also use it here:</p>
|
||||
<div class="python highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">requests</span>
|
||||
|
||||
<span class="k">try</span><span class="p">:</span>
|
||||
@ -10,7 +11,8 @@
|
||||
</code></pre></div>
|
||||
|
||||
|
||||
<p>Otherwise, you can use the urllib module from Python 3 standard library:</p>
|
||||
<p>Otherwise, you can use the <a href="https://docs.python.org/3/library/urllib.request.html">urllib.request</a>
|
||||
module from the Python 3 standard library:</p>
|
||||
<div class="python highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">socket</span>
|
||||
<span class="kn">import</span> <span class="nn">urllib.request</span>
|
||||
|
||||
|
@ -1,6 +1,7 @@
|
||||
# Python
|
||||
|
||||
If you are already using the requests library, it is convenient to also use it here:
|
||||
If you are already using the [requests](https://requests.readthedocs.io/en/master/)
|
||||
library, it is convenient to also use it here:
|
||||
|
||||
```python
|
||||
import requests
|
||||
@ -12,7 +13,8 @@ except requests.RequestException as e:
|
||||
print("Ping failed: %s" % e)
|
||||
```
|
||||
|
||||
Otherwise, you can use the urllib module from Python 3 standard library:
|
||||
Otherwise, you can use the [urllib.request](https://docs.python.org/3/library/urllib.request.html)
|
||||
module from the Python 3 standard library:
|
||||
|
||||
```python
|
||||
import socket
|
||||
|
@ -1,17 +1,18 @@
|
||||
<h1>Pinging Reliability Tips</h1>
|
||||
<p>Sending monitoring signals over public internet is inherently unreliable.
|
||||
<p>Sending monitoring signals over the public internet is inherently unreliable.
|
||||
HTTP requests can sometimes take excessively long or fail completely
|
||||
for a variety of reasons. Here are some general tips to make your monitoring
|
||||
code more robust.</p>
|
||||
<h2>Specify HTTP Request Timeout</h2>
|
||||
<p>Put a time limit on how long each ping is allowed to take. This is especially
|
||||
important when sending a "start" signal at the start of a job: you don't want
|
||||
a stuck ping prevent the actual job from running. Another case is a continuously
|
||||
running worker process which pings SITE_NAME after each completed item. A stuck
|
||||
request would block the whole process, so it is important to guard against.</p>
|
||||
a stuck ping to prevent the actual job from running. Another case is a continuously
|
||||
running worker process that pings SITE_NAME after each completed item. A stuck
|
||||
request could block the whole process. An explicit per-request time limit mitigates
|
||||
this problem.</p>
|
||||
<p>Specifying the timeout depends on the tool you use. curl, for example, has the
|
||||
<code>--max-time</code> (shorthand: <code>-m</code>) parameter:</p>
|
||||
<div class="bash highlight"><pre><span></span><code><span class="c1"># Send a HTTP, 10 second timeout:</span>
|
||||
<div class="bash highlight"><pre><span></span><code><span class="c1"># Send an HTTP request, 10 second timeout:</span>
|
||||
curl -m <span class="m">10</span> PING_URL
|
||||
</code></pre></div>
|
||||
|
||||
@ -28,5 +29,5 @@ curl --retry <span class="m">5</span> PING_URL
|
||||
|
||||
<h2>Handle Exceptions</h2>
|
||||
<p>Make sure you know how your HTTP client handles failed requests. For example,
|
||||
if you use a HTTP library which raises exceptions, decide if you want to
|
||||
catch the exceptions, or let them bubble up.</p>
|
||||
if you use an HTTP library that raises exceptions, decide if you want to
|
||||
catch the exceptions or let them bubble up.</p>
|
@@ -1,6 +1,6 @@
 # Pinging Reliability Tips

-Sending monitoring signals over public internet is inherently unreliable.
+Sending monitoring signals over the public internet is inherently unreliable.
 HTTP requests can sometimes take excessively long or fail completely
 for a variety of reasons. Here are some general tips to make your monitoring
 code more robust.
@@ -9,15 +9,16 @@ code more robust.

 Put a time limit on how long each ping is allowed to take. This is especially
 important when sending a "start" signal at the start of a job: you don't want
-a stuck ping prevent the actual job from running. Another case is a continuously
-running worker process which pings SITE_NAME after each completed item. A stuck
-request would block the whole process, so it is important to guard against.
+a stuck ping to prevent the actual job from running. Another case is a continuously
+running worker process that pings SITE_NAME after each completed item. A stuck
+request could block the whole process. An explicit per-request time limit mitigates
+this problem.

 Specifying the timeout depends on the tool you use. curl, for example, has the
 `--max-time` (shorthand: `-m`) parameter:

 ```bash
-# Send a HTTP, 10 second timeout:
+# Send an HTTP request, 10 second timeout:
 curl -m 10 PING_URL
 ```

@@ -37,5 +38,5 @@ curl --retry 5 PING_URL
 ## Handle Exceptions

 Make sure you know how your HTTP client handles failed requests. For example,
-if you use a HTTP library which raises exceptions, decide if you want to
-catch the exceptions, or let them bubble up.
+if you use an HTTP library that raises exceptions, decide if you want to
+catch the exceptions or let them bubble up.
|
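As an illustration of the timeout and exception-handling tips in the file above, here is a minimal Python sketch that uses only the standard library. It is not part of this commit. The `ping` helper, its `opener` parameter, and the example URL are hypothetical names introduced for illustration. Note that the `timeout` argument of `urllib.request.urlopen` bounds individual blocking socket operations; unlike curl's `--max-time`, it is not a limit on the total duration of the request.

```python
import urllib.request


def ping(url, timeout=10, opener=urllib.request.urlopen):
    """Send one ping; return True on a 2xx response, False on a failed request.

    `opener` is a hypothetical injection point so that the function can be
    exercised without network access; by default it is urllib's urlopen.
    """
    try:
        # `timeout` is in seconds and applies to blocking socket operations.
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # URLError, HTTPError, and socket timeouts all derive from OSError.
        # Swallowing them here means a failed ping never stops the real job.
        return False


# Hypothetical usage (placeholder URL, not a real endpoint):
# ping("https://hc.example.test/ping/your-uuid-here", timeout=10)
```

Returning `False` instead of raising is one possible choice; a job that must not run unmonitored may prefer to let the exception bubble up.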
@@ -1,5 +1,5 @@
 <h1>Ruby</h1>
-<p>Below is an example of making a HTTP request to SITE_NAME from Ruby.</p>
+<p>Below is an example of making an HTTP request to SITE_NAME from Ruby.</p>
 <div class="ruby highlight"><pre><span></span><code><span class="nb">require</span> <span class="s1">'net/http'</span>
 <span class="nb">require</span> <span class="s1">'uri'</span>

@@ -1,6 +1,6 @@
 # Ruby

-Below is an example of making a HTTP request to SITE_NAME from Ruby.
+Below is an example of making an HTTP request to SITE_NAME from Ruby.

 ```ruby
 require 'net/http'
@@ -1,8 +1,8 @@
-<h1>Signalling failures</h1>
+<h1>Signaling failures</h1>
 <p>You can actively signal a failure to SITE_NAME by slightly changing the
 ping URL: append either <code>/fail</code> or <code>/{exit-status}</code> to your normal ping URL.
 The exit status should be a 0-255 integer. SITE_NAME will interpret
-exit status 0 as success, and all non-zero values as failures.</p>
+exit status 0 as success and all non-zero values as failures.</p>
 <p>Examples:</p>
 <div class="bash highlight"><pre><span></span><code><span class="c1"># Reports failure by appending the /fail suffix:</span>
 curl --retry <span class="m">3</span> PING_URL/fail
@@ -12,7 +12,7 @@ curl --retry <span class="m">3</span> PING_URL/1
 </code></pre></div>


-<p>By actively signalling failures to SITE_NAME, you can minimize the delay from your
+<p>By actively signaling failures to SITE_NAME, you can minimize the delay from your
 monitored service encountering a problem to you getting notified about it.</p>
 <h2>Shell Scripts</h2>
 <p>The below shell script appends <code>$?</code> (a special variable which contains the
@@ -1,9 +1,9 @@
-# Signalling failures
+# Signaling failures

 You can actively signal a failure to SITE_NAME by slightly changing the
 ping URL: append either `/fail` or `/{exit-status}` to your normal ping URL.
 The exit status should be a 0-255 integer. SITE_NAME will interpret
-exit status 0 as success, and all non-zero values as failures.
+exit status 0 as success and all non-zero values as failures.

 Examples:

@@ -16,7 +16,7 @@ curl --retry 3 PING_URL/fail
 curl --retry 3 PING_URL/1
 ```

-By actively signalling failures to SITE_NAME, you can minimize the delay from your
+By actively signaling failures to SITE_NAME, you can minimize the delay from your
 monitored service encountering a problem to you getting notified about it.

 ## Shell Scripts
|
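The exit-status rule described in the file above (append `/{exit-status}` to the ping URL; 0 means success, any other value means failure) can be sketched in Python as follows. This is an illustration only and not part of this commit; the function names and the example URL are hypothetical.

```python
def ping_url_for_exit_status(ping_url, exit_status):
    """Append an exit status (0-255) to a ping URL, as in PING_URL/1."""
    if not isinstance(exit_status, int) or not 0 <= exit_status <= 255:
        raise ValueError("exit status must be an integer in the range 0-255")
    # Strip a trailing slash so the result never contains "//".
    return f"{ping_url.rstrip('/')}/{exit_status}"


def is_failure(exit_status):
    """Mirror the documented rule: 0 is success, everything else is a failure."""
    return exit_status != 0


# Hypothetical usage (placeholder URL, not a real endpoint):
# ping_url_for_exit_status("https://hc.example.test/ping/your-uuid-here", 1)
```

Validating the range up front keeps a malformed status from being sent as part of the URL path.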
@@ -204,7 +204,7 @@
 class="icon" alt="Webhook icon" />

 <h2>Webhook</h2>
-<p>Receive a HTTP callback when a check goes down.</p>
+<p>Receive an HTTP callback when a check goes down.</p>
 <a href="{% url 'hc-add-webhook' project.code %}" class="btn btn-primary">Add Integration</a>
 </li>

@@ -37,7 +37,7 @@
 </div>
 <div class="panel-body">
 <p>A cron expression has five fields, separated by spaces.
-Asterisk is a wild card character and means "any value".
+Asterisk is a wild card character and means "any value."
 </p>

 <div class="desc">"Run every minute"</div>
@@ -98,7 +98,7 @@
 </div>
 <div class="panel-body">
 <p>Use numeric values instead of asterisks to match specific
-minutes, hours, days and months.
+minutes, hours, days, and months.
 </p>

 <div class="desc">"Run at 6PM on Fridays"</div>
@@ -160,7 +160,7 @@
 <div class="panel-body">
 <p>Use <code>{v1},{v2},...,{vn}</code> to list multiple values.</p>

-<div class="desc">"Run at 9AM, 12PM and 6PM every day"</div>
+<div class="desc">"Run at 9AM, 12PM, and 6PM every day"</div>

 <table class="cron-example">
 <tr>
@@ -245,7 +245,7 @@
 <td>
 <div class="guide"></div>
 </td>
-<td>Run on Monday to Friday</td>
+<td>Run from Monday to Friday</td>
 </tr>
 <tr>
 <td colspan="2" class="minor">
@@ -336,7 +336,7 @@
 <h3 class="panel-title">Combine numeric values and ranges in lists</h3>
 </div>
 <div class="panel-body">
-<p>In the comma-separated lists you can combine not only
+<p>In the comma-separated lists, you can combine not only
 numeric values but also ranges.</p>

 <div class="desc">"Run every round hour outside office hours"</div>
@@ -459,13 +459,13 @@
 </div>
 <div class="panel-body">
 <p>
-Cron daemon uses server's <strong>local time</strong>.
+The cron daemon uses the server's <strong>local time</strong>.
 If your server's timezone is other than UTC, make sure
 to set a matching timezone for your check
 on {{ site_name }} as well.
 </p>

-<p>On Ubuntu systems you can check the server's timezone
+<p>On Ubuntu systems, you can check the server's timezone
 with:</p>
 <pre>cat /etc/timezone</pre>
 </div>
|
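The cron field forms touched by the template above (asterisk, single numeric values, comma-separated lists, and ranges) can be illustrated with a small Python sketch. It is not part of this commit and is not how SITE_NAME or any cron daemon is implemented; `field_matches` is a hypothetical helper, and it deliberately ignores other cron syntax such as step values and month or weekday names.

```python
def field_matches(field, value):
    """Return True if one cron field matches an integer value.

    Supports only "*", a number, a range "a-b", and comma-separated
    lists that mix numbers and ranges.
    """
    for part in field.split(","):
        if part == "*":
            # Asterisk is a wildcard: any value matches.
            return True
        if "-" in part:
            low, high = (int(x) for x in part.split("-", 1))
            if low <= value <= high:
                return True
        elif int(part) == value:
            return True
    return False


# Hypothetical hour field combining two ranges in one list:
# field_matches("18-23,0-8", 20) is True, field_matches("18-23,0-8", 12) is False
```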