[02:59:46] (HAProxyRestarted) firing: HAProxy server restarted on cloudcontrol1005:9100 - https://wikitech.wikimedia.org/wiki/HAProxy#HAProxy_for_edge_caching - https://grafana.wikimedia.org/d/gQblbjtnk/haproxy-drilldown?orgId=1&var-site=eqiad%20prometheus/ops&var-instance=cloudcontrol1005&viewPanel=10 - https://alerts.wikimedia.org/?q=alertname%3DHAProxyRestarted
[06:59:46] (HAProxyRestarted) firing: HAProxy server restarted on cloudcontrol1005:9100 - https://wikitech.wikimedia.org/wiki/HAProxy#HAProxy_for_edge_caching - https://grafana.wikimedia.org/d/gQblbjtnk/haproxy-drilldown?orgId=1&var-site=eqiad%20prometheus/ops&var-instance=cloudcontrol1005&viewPanel=10 - https://alerts.wikimedia.org/?q=alertname%3DHAProxyRestarted
[07:48:23] godog: hmm are you around?
[07:48:58] HAProxyRestarted expression looks like this: expr: 'sum(node_systemd_service_restart_total{name="haproxy.service", instance=~"(cp|dns).*"}) by (instance) >= 1'
[07:49:16] not sure why it's getting triggered by cloudcontrol1005
[07:49:32] vgutierrez: ack, looking, let's see
[07:50:01] lol
[07:50:04] https://gerrit.wikimedia.org/r/c/operations/alerts/+/918471
[07:50:13] merging the CR could have been useful
[07:50:34] haha! ok that explains
[07:50:45] I was about to say that I don't see that expression
[07:51:06] ok to merge it?
[07:51:24] or do we have any better way of limiting the scope?
[07:51:35] thx :)
[07:51:42] yeah! instance or cluster really
[07:51:50] that's fine IMHO
[07:53:12] FWIW to get the current alert expression what I do is click on the alert's "xx hours ago" dropdown menu then "alert source links" will point to thanos or prometheus UI with the expression
[08:19:46] (HAProxyRestarted) resolved: HAProxy server restarted on cloudcontrol1005:9100 - https://wikitech.wikimedia.org/wiki/HAProxy#HAProxy_for_edge_caching - https://grafana.wikimedia.org/d/gQblbjtnk/haproxy-drilldown?orgId=1&var-site=eqiad%20prometheus/ops&var-instance=cloudcontrol1005&viewPanel=10 - https://alerts.wikimedia.org/?q=alertname%3DHAProxyRestarted
[08:45:15] Traffic, SRE, ops-eqiad: Relocate lvs1013-lvs1016 to rows E & F - https://phabricator.wikimedia.org/T341992 (Fabfur)
[09:52:13] Traffic, SRE, ops-eqiad: Relocate lvs1013-lvs1016 to rows E & F - https://phabricator.wikimedia.org/T341992 (Fabfur)
[10:24:06] Traffic, SRE, ops-eqiad: Relocate lvs1013-lvs1016 to rows E & F - https://phabricator.wikimedia.org/T341992 (Fabfur)
[21:23:00] Traffic, ops-eqiad: Relocate lvs1013-lvs1016 to rows E & F - https://phabricator.wikimedia.org/T341992 (KOfori)
[21:56:38] Traffic, netops, Commons, Infrastructure-Foundations: $wgUseInstantCommons throws an SSL error - https://phabricator.wikimedia.org/T342473 (bd808)
[22:37:41] Traffic, netops, Commons, Infrastructure-Foundations, SRE: $wgUseInstantCommons throws an SSL error - https://phabricator.wikimedia.org/T342473 (Tgr) Usually this means some kind of man-in-the-middle scenario (or, less likely, misconfiguration at the target server) - you are getting a certifi...
[23:01:18] Traffic, netops, Commons, Infrastructure-Foundations, SRE: $wgUseInstantCommons throws an SSL error - https://phabricator.wikimedia.org/T342473 (Platonides) No problem connecting to commons.wikimedia.org from Germany. Note: connection from Germany = DigiCert wildcard signed by DigiCert TLS H...
[23:15:30] Traffic, netops, Commons, Infrastructure-Foundations, SRE: $wgUseInstantCommons throws an SSL error - https://phabricator.wikimedia.org/T342473 (taavi) This looks like https://bugs.launchpad.net/ubuntu/+source/curl/+bug/2028170.
[23:34:46] Traffic, netops, Commons, Infrastructure-Foundations, SRE: $wgUseInstantCommons throws an SSL error - https://phabricator.wikimedia.org/T342473 (OriginalAuthority) Looks like shutting down the server and booting it back up again fixed the issue. But yes, it probably is the issue above, taavi.
[23:49:47] Traffic, netops, Commons, Infrastructure-Foundations, SRE: $wgUseInstantCommons throws an SSL error - https://phabricator.wikimedia.org/T342473 (Platonides) Open→Resolved a:Platonides
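
Note on the HAProxyRestarted expression quoted at 07:48:58: a minimal sketch of why the scoped rule would not fire for cloudcontrol1005 once deployed. Prometheus regex matchers such as instance=~"(cp|dns).*" are fully anchored, so the pattern has to match the whole label value. Python's re module stands in for Prometheus's RE2 engine here (an assumption that holds for this simple pattern), and cp1001:9100 and dns1001:9100 are hypothetical instance labels used only for illustration.

```python
import re

# Matcher copied from the expression in the log: instance=~"(cp|dns).*"
INSTANCE_MATCHER = r"(cp|dns).*"


def in_scope(instance: str) -> bool:
    """Return True if the instance label would be selected by the matcher.

    Prometheus anchors regex matchers at both ends, which re.fullmatch mimics.
    """
    return re.fullmatch(INSTANCE_MATCHER, instance) is not None


# cloudcontrol1005:9100 is from the log; the other two labels are hypothetical.
for instance in ["cloudcontrol1005:9100", "cp1001:9100", "dns1001:9100"]:
    print(instance, in_scope(instance))
```

With the instance filter in place, only the cp and dns labels are selected, which is consistent with the alert resolving at 08:19:46 after the change was merged.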