Hosted Monitoring: Evaluating InfluxDB Cloud and Grafana Cloud

We’ve been self-hosting Grafana and an InfluxDB database for a very long time, using them to monitor our custom application metrics. However, we started to push our patience with maintenance and the limits of performance on a $5/mo DigitalOcean droplet.

In order to simplify things we started exploring what hosted options are available. In reality, we just want something simple and cheap. The two main options we considered were InfluxDB Cloud and Grafana Cloud, both offering fairly generous free tiers.

InfluxDB Cloud

We started with InfluxDB Cloud, mainly because we were already using InfluxDB as our data store. Thus, we didn’t have to rewrite any of our code that pushes metrics. Overall our first impressions were good: onboarding was simple and the UI looked nice. We were quickly able to get our data into InfluxDB Cloud as well, and could start setting up our panels and alerts. This is where we started to run into problems.

InfluxDB Cloud panels are very limited (and buggy)

Coming from Grafana, we were used to a certain level of customization. InfluxDB Cloud, on the other hand, offers very limited customization. For example, there are only 3 options available for Y-axis units, and legend options are very limited.

Visually, one of the most frustrating things was that the graph series colors kept changing every time you refreshed the data, making it very difficult to keep track of what you’re looking at.

Another very frustrating feature (or lack thereof) is that the hover tooltip doesn’t give any indication of the specific series your mouse cursor is currently hovering over. A simple solution here would be to make the actively hovered series bold or emphasized.

We couldn’t get InfluxDB Cloud alerts to work at all

While we could live with the visualization and customization problems, alerts were where we completely gave up on InfluxDB Cloud. No matter how much we tried, we just weren’t able to get a simple basic alert to work.

The process of setting up alerts was not very intuitive. For example, the “Checks” UI lets you set up your desired thresholds. The graph includes options to toggle the threshold markers. But these don’t actually do anything, and the inputted thresholds aren’t shown in the UI. This is where I got stuck first, thinking my thresholds weren’t set, when in fact they were.

It’s also easy to accidentally introduce “gaps” in your threshold ranges. At first you might think to set up an alert like this (OK below 10 and WARN above 10). Well, then you’re missing the value of 10 (which becomes an “unknown” state).

So in this case you should set up something like this instead (WARN above 9 and OK below 10), which is fine but just feels wrong. I’m sure they could have come up with a better solution for this.
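To make the gap concrete, here is a minimal sketch of the threshold logic. It assumes that “above” and “below” are strict comparisons, which matches what we observed. The classify function and the rule format are made up for illustration and are not InfluxDB’s actual implementation.

```javascript
// Hypothetical model of threshold checks; not InfluxDB's actual code.
// Assumption: "above" means > and "below" means < (both strict).
function classify(value, rules) {
  for (const rule of rules) {
    if (rule.type === 'above' && value > rule.threshold) return rule.level;
    if (rule.type === 'below' && value < rule.threshold) return rule.level;
  }
  return 'UNKNOWN'; // no rule matched the value
}

// OK below 10, WARN above 10: the value 10 itself matches neither rule
const gappy = [
  { level: 'OK', type: 'below', threshold: 10 },
  { level: 'WARN', type: 'above', threshold: 10 },
];

// WARN above 9, OK below 10: every integer value matches some rule
const overlapping = [
  { level: 'WARN', type: 'above', threshold: 9 },
  { level: 'OK', type: 'below', threshold: 10 },
];

console.log(classify(10, gappy));       // UNKNOWN
console.log(classify(10, overlapping)); // WARN
```

Note that the second setup only covers the gap cleanly for integer values; in this sketch a value like 9.5 matches both rules and the first match wins.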

However, after adding a “Check” and setting up my “Notification Endpoint” I still wasn’t getting any alerts. After a while I realized there was another UI I had missed, namely “Notification Rules”.

I got quite excited at this point, since there seemed to be a lot of flexibility. I started by testing an alert that would remind us every 24 hours if an alert was still critical. While this did in fact work, it also resulted in our Slack getting completely spammed with 100+ notifications. I’m not sure why… but we quickly removed this alert.

After this I tried setting up a simple alert for when the status changes from OK to CRIT. This, however, just didn’t work. No alerts were sent. At this point we gave up, after having spent more than 1 week trying to get things to work.

One thing to take into consideration is that InfluxDB Cloud’s free tier only includes 2 alerts (so we would definitely have needed to upgrade to the paid tier if we had decided to stay with InfluxDB Cloud).

Another bug we ran into was that at one point a team member tried resetting their password, which instead created a brand new organization. They were then unable to re-join the existing organization without creating a new account using a new email address…

Let’s try Grafana Cloud instead.

Grafana Cloud

Even though we were previously using Grafana for visualizations, we were a little bit hesitant to take a look at Grafana Cloud. The main reason was that we’d need to switch from InfluxDB for our storage to either Graphite or Prometheus, which are offered on Grafana Cloud.

Since we’re mainly monitoring basic application metrics, we ideally just wanted a simple endpoint that we could push our data to (no need for fancy buffering, redundancy or error checking). Also, since our application is running in a PaaS, we didn’t want to have to install any additional local dependencies in order to push data to Grafana. At first I thought it was a requirement to have a local agent running, but we did eventually figure out how to push metrics to Grafana using the Graphite HTTP API.

Pushing data to Grafana

It was a little bit tricky to find, but it’s possible to push data directly to Grafana Cloud using the Graphite HTTP API. The documentation can be found here. If the link doesn’t work, try Googling for it since their docs links seem to change and break quite often.

They also provide a small example GitHub repo, which I used as a starting point.

Here’s a minimal example written for node.js:

const got = require('got'); // assumes got v11 or earlier (CommonJS)

// replace with your own credentials ("id:key" form)
const API_KEY = '<instance id>:<api key>';
const ENDPOINT = 'https://graphite-us-central1.grafana.net/metrics';

// get current unix timestamp
const time = Math.floor(Date.now() / 1000);

// we're reporting values every 10 seconds
const interval = 10;

const data = {
  time: Math.floor(time / interval) * interval, // align timestamp to interval
  name: 'app.some_metric.some_field',
  interval: interval,
  value: 1.234,
  tags: ['type=foo', 'kind=bar'],
};

const headers = { Authorization: `Bearer ${API_KEY}` };

// the endpoint expects a JSON array of metrics
got.post(ENDPOINT, {
  headers,
  responseType: 'json',
  resolveBodyOnly: true,
  json: [data],
});

One important thing to keep in mind with Graphite is that metrics must be aligned to the given interval. For example, an interval of 10 means that there will be one value for the given series every 10 seconds. It’s best to push your metrics a little bit more often than the interval to make sure you don’t miss any time slots, which could lead to gaps (null values) in your data.
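As a rough illustration of why the push frequency matters, here is a small simulation. It is a simplified sketch: it assumes pushes happen at perfectly regular times and ignores timer jitter and network delays. The function names are made up for this example.

```javascript
// Round a unix timestamp (in seconds) down to the start of its interval slot.
function alignToInterval(timestamp, interval) {
  return Math.floor(timestamp / interval) * interval;
}

// Simulate pushing one value every `pushEvery` seconds for `duration` seconds
// and return the interval slots that never received a value.
function findMissedSlots(start, duration, interval, pushEvery) {
  const end = start + duration;

  const filled = new Set();
  for (let t = start; t < end; t += pushEvery) {
    filled.add(alignToInterval(t, interval));
  }

  const missed = [];
  for (let slot = alignToInterval(start, interval); slot < end; slot += interval) {
    if (!filled.has(slot)) missed.push(slot);
  }
  return missed;
}

// Pushing every 9 seconds with a 10 second interval fills every slot
console.log(findMissedSlots(1000, 120, 10, 9));  // []

// Pushing every 11 seconds eventually skips a slot
console.log(findMissedSlots(1000, 120, 10, 11)); // [ 1100 ]
```

When two pushes land in the same slot, one of the values is simply redundant, which is a small price to pay for not having gaps.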

Querying Graphite with tags

Since we’ve only been using InfluxDB as a data source in Grafana previously, there was a bit of a learning curve with Graphite. In particular, I was having a hard time figuring out how to use tags. However, once you know the syntax it’s fairly easy to use.

Series without tags are fairly easy: simply select the desired metric in the dropdown.

However, for series with tags, you first have to select Tag: name in order to pick your metric (I didn’t realize Graphite automatically created the “name” tag, so I was confused about this at first).

Then select the desired metric as the tag value.

After that, you can add additional tags to filter the data by tag value, and you can also add functions such as aliasByTags to name each series by their tag value (resulting in better looking legends).
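For reference, the selections in the query editor correspond to a Graphite target expression built from the seriesByTag and aliasByTags functions. The helper below is a hypothetical sketch (buildTagQuery is not part of Grafana or Graphite) that produces such an expression for the metric and tags from the push example.

```javascript
// Hypothetical helper: build a Graphite target expression for a tagged series.
// seriesByTag and aliasByTags are Graphite functions; the helper itself is
// only for illustration.
function buildTagQuery(metric, filters = {}, aliasTags = []) {
  // The metric name is matched through the automatically created "name" tag
  const expressions = [
    `name=${metric}`,
    ...Object.entries(filters).map(([tag, value]) => `${tag}=${value}`),
  ];

  let target = `seriesByTag(${expressions.map((e) => `'${e}'`).join(', ')})`;

  // Optionally name each series after the values of the given tags
  if (aliasTags.length > 0) {
    target = `aliasByTags(${target}, ${aliasTags.map((t) => `'${t}'`).join(', ')})`;
  }
  return target;
}

console.log(buildTagQuery('app.some_metric.some_field', { type: 'foo' }, ['kind']));
// aliasByTags(seriesByTag('name=app.some_metric.some_field', 'type=foo'), 'kind')
```

With this target, all series with type=foo are returned and each one is labeled by the value of its kind tag.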

Synthetic Monitoring

Another bonus feature we found is the Synthetic Monitoring feature, also available in the free tier. It offers some basic application monitoring, with downtime monitoring and global latency checks. Nice to have!

Free-Tier Comparison

Overall both options have very generous free tiers, great for anyone looking to get started with basic monitoring. The biggest issue with InfluxDB Cloud is that they only offer 2 alerts on the free tier, which feels very low in my view.

InfluxDB Cloud Free-Tier:

  • Series: 10,000
  • Retention: 30 days
  • Alerts: 2
  • No storage limit
  • Writes: 5MB/5min
  • Queries: 300MB/5min

Grafana Cloud Free-Tier:

  • Series: 10,000
  • Retention: 14 days
  • Alerts: 100
  • No storage limit

Conclusion

The biggest difference between the two is clearly the maturity and rich feature set offered by Grafana, while the solution InfluxDB Cloud offers feels lacking and buggy. We’ll stick with Grafana Cloud for now!
