[01:01:16] FIRING: ProbeDown: Service idp2004:443 has failed probes (http_idp_wikimedia_org_ip6) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[01:06:16] RESOLVED: ProbeDown: Service idp2004:443 has failed probes (http_idp_wikimedia_org_ip6) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[04:49:25] FIRING: MirrorHighLag: Mirrors - /srv/mirrors/debian synchronization lag - https://wikitech.wikimedia.org/wiki/Mirrors - https://grafana.wikimedia.org/d/dbd8a904-eab2-48d1-a3b9-fa1851ef3ed2/mirrors?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DMirrorHighLag
[05:19:25] FIRING: [2x] MirrorHighLag: Mirrors - /srv/mirrors/debian synchronization lag - https://wikitech.wikimedia.org/wiki/Mirrors - https://grafana.wikimedia.org/d/dbd8a904-eab2-48d1-a3b9-fa1851ef3ed2/mirrors?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DMirrorHighLag
[05:44:25] FIRING: [2x] MirrorHighLag: Mirrors - /srv/mirrors/debian synchronization lag - https://wikitech.wikimedia.org/wiki/Mirrors - https://grafana.wikimedia.org/d/dbd8a904-eab2-48d1-a3b9-fa1851ef3ed2/mirrors?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DMirrorHighLag
[05:54:16] FIRING: ProbeDown: Service idp2004:443 has failed probes (http_idp_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[05:59:16] RESOLVED: [2x] ProbeDown: Service idp2004:443 has failed probes (http_idp_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[07:27:48] topranks: good morning! For the "certificate about to expire" alerts for a bunch of network devices, is that something on your radar?
[07:27:51] https://alerts.wikimedia.org/?q=%40state%3Dactive&q=%40cluster%3Dwikimedia.org&q=alertname%3DCertAlmostExpired
[07:48:01] moritzm: there is a probe down alert for squid on install3004 that might be related to routed Ganeti (the timing seems to align)
[07:48:09] related logs https://logstash.wikimedia.org/app/dashboards#/view/f3e709c0-a5f8-11ec-bf8e-43f1807d5bc2?_g=(filters:!((query:(match_phrase:(service.name:http_squid_ip6)))))
[07:51:15] I see that the logs show that it's trying to use the scope link address (and indeed on the host there is no scope global address)
[07:51:53] I'll have a look in a bit
[07:51:59] thanks
[08:54:25] RESOLVED: MirrorHighLag: Mirrors - /srv/mirrors/debian synchronization lag - https://wikitech.wikimedia.org/wiki/Mirrors - https://grafana.wikimedia.org/d/dbd8a904-eab2-48d1-a3b9-fa1851ef3ed2/mirrors?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DMirrorHighLag
[09:39:16] FIRING: ProbeDown: Service idp2004:443 has failed probes (http_idp_wikimedia_org_ip6) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[09:44:16] RESOLVED: ProbeDown: Service idp2004:443 has failed probes (http_idp_wikimedia_org_ip6) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[10:40:16] FIRING: ProbeDown: Service idp2004:443 has failed probes (http_idp_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[10:45:16] RESOLVED: ProbeDown: Service idp2004:443 has failed probes (http_idp_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[10:51:46] ^ I bounced Tomcat on idp2004 (it's the passive node)
[11:28:20] I'd still like to understand why it breaks.
[14:07:56] FIRING: [2x] ProbeDown: Service mirror1001:443 has failed probes (http_mirrors_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#mirror1001:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[14:12:57] RESOLVED: [2x] ProbeDown: Service mirror1001:443 has failed probes (http_mirrors_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#mirror1001:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[14:19:06] FIRING: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting
[14:23:38] Mail, FR-donorrelations, Infrastructure-Foundations, SRE: Donations@ doesn't forward to donate@ - https://phabricator.wikimedia.org/T403986#11181050 (jhathaway) p:Triage→Medium
[14:31:16] SRE-tools, Infrastructure-Foundations: secure-cookbook doesn't allow for --dry-run - https://phabricator.wikimedia.org/T404355#11181082 (LSobanski) p:Triage→Medium
[14:38:26] netbox, Infrastructure-Foundations, Regression: after logging into Netbox, NDAs see an empty dashboard - https://phabricator.wikimedia.org/T404494#11181117 (SLyngshede-WMF) p:Triage→Medium a:SLyngshede-WMF
[15:19:05] RESOLVED: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting
[16:17:07] netops, DC-Ops, Infrastructure-Foundations, ops-eqiad: eqiad: rows C/D Upgrade Tracking - https://phabricator.wikimedia.org/T404609 (RobH) NEW p:Triage→Medium
[16:18:46] netops, DC-Ops, Infrastructure-Foundations, ops-eqiad: eqiad: rows C/D Upgrade Tracking - https://phabricator.wikimedia.org/T404609#11181650 (RobH) @cmooney: What do you think is the best way to go about migrating these connections on upcoming C/D updates? The new switch will be online in the ra...
[17:54:44] hello I/F friends - it seems there's a regression in mypy 1.18.1 (released last Thursday), that's now raising a spurious finding on a line in the sre.hosts.provision cookbook.
[17:54:44] wanted to flag here (1) in case anyone else is running into this and (2) to mention that jasmine_ is drafting a patch to add a narrowly scoped `# type: ignore` for it
[17:54:44] (IMO, that's a better solution than either complexifying the code for mypy's benefit or pinning to 1.17.1)
[17:55:58] out of curiosity swfrench-wmf is there an issue filed upstream? I have an old friend who I could bug about mypy
[17:56:44] cdanis: I just pulled up their issues and was about to check. I'll keep you posted :)
[17:59:21] 04:33:29 cookbooks/sre/hosts/provision.py: note: In member "__init__" of class "SupermicroProvisionRunner":
[17:59:23] 04:33:41 cookbooks/sre/hosts/provision.py:302: error: Argument 1 to "next" has incompatible type "Union[Iterator[IPv4Address], list[IPv4Address], Iterator[IPv6Address], list[IPv6Address]]"; expected "SupportsNext[_BaseAddress]" [arg-type]
[17:59:34] yup, that's it
[18:00:05] thanks swfrench-wmf! and much thanks for investigating and identifying so quickly - here's the patch: https://gerrit.wikimedia.org/r/c/operations/cookbooks/+/1187997 in case anyone is interested
[18:03:18] I am somewhat suspicious of https://github.com/python/typeshed/issues/14718 but am pretty out of my depth here
[18:03:41] jasmine_: it might make sense to split the annotation off into another patch, or at least add additional context to the commit description about why it's there :)
[18:05:10] sorry, I meant https://github.com/python/mypy/issues/19852
[18:06:33] cdanis: ah, that's an interesting one. I would say I'm pretty out of my depth as well, heh. if I have time, I might experiment a bit to see if there's a critical number of union variants that trips it, just to have a minimal repro.
[18:13:31] hmm maybe https://github.com/python/mypy/issues/19855
[18:14:25] hm but there aren't obviously any callables involved there, so ParamSpec shouldn't be involved at all? ... so probably not
[18:29:52] yeah, this is going to be rather hard to repro ... I can't come up with a good way to trigger this behavior where mypy will somehow infer that `list` is one of the possible return values of `IPv{4,6}Network.hosts()`
[18:33:39] as in, _if_ that were possible, this would be a correct finding, but I have no idea how it's inferring that :)
[18:43:37] thanks swfrench-wmf, ChrisDobbins901_ ran into this as well ^
[18:43:53] swfrench-wmf: so I was able to repro, but this is pretty silly
[18:44:51] if the network has only 1 ip apparently it returns a list (despite the documentation saying iterator)
[18:44:55] >>> a = ipaddress.ip_interface('10.0.0.1/32')
[18:44:56] >>> a.network.hosts()
[18:44:56] [IPv4Address('10.0.0.1')]
[18:50:29] and it's not part of the source code in https://github.com/python/cpython/blob/3.11/Lib/ipaddress.py but it looks like an optimization
[18:50:36] https://github.com/python/cpython/blob/3.9/Lib/ipaddress.py#L1537
[18:50:40] ^ this is wild
[18:50:43] >>> print(inspect.getsource(a.network.hosts)) self.hosts = lambda: [IPv4Address(addr)]
[18:50:49] volans: yeah, that :)
[18:50:53] yep
[18:51:31] I even skimmed the source file earlier, and completely missed the special case for singletons
[18:51:35] * swfrench-wmf shakes head
[18:51:54] yeah that's breaking all the typing and protocol assumptions of that method
[18:52:02] it's clearly not only an iterator
[18:52:40] it has 3 implementations for self.hosts(), that's crazy
[18:53:18] https://github.com/python/cpython/pull/18757
[18:53:32] so yeah, I guess this is a real finding after all! and has surfaced some pretty wild choices in the implementation of ipaddress
[18:54:14] yes, just change it into next(iter(...))
[18:54:41] and should work in all cases
[18:54:47] +1 exactly, yeah
[18:55:23] ^ jasmine_: since you're already working on a patch, would you mind changing course and trying the above instead
[18:56:18] you can refer to this GH task upstream: https://github.com/python/cpython/issues/109305
[18:57:50] thanks volans <3
[18:58:16] thanks again for highlighting this, volans - I'm still floored that this is a thing :)
[18:58:17] this was a very javascript move for python
[18:58:26] I'm really sad
[19:01:23] * taavi is tempted to !bash that
[19:01:36] feel free :)
[19:01:41] !bash this was a very javascript move for python
[19:01:42] taavi: Stored quip at https://bash.toolforge.org/quip/IZHBTpkBvg159pQrGeGw
[19:01:49] lol
[19:04:28] for the record at the end of the doc it says 'Networks with a mask of 32 will return a list containing the single host address.', but it's buried and far from the first 3 words ('Returns an iterator')
[19:04:42] (same for 128 for v6)
[19:31:50] oof v interesting, ty for the finds volans
[19:32:00] swfrench-wmf: {{done}} https://gerrit.wikimedia.org/r/c/operations/cookbooks/+/1188417 and jenkins seems happy
[19:32:19] thanks, jasmine_! looking now
[19:37:11] tyty!
[21:08:56] FIRING: MaxConntrack: Max conntrack at 82.53% on krb1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[21:13:55] RESOLVED: MaxConntrack: Max conntrack at 85.21% on krb1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
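For anyone reading back later, here is a minimal sketch of the `next(iter(...))` suggestion from the 18:54 exchange. The helper name `first_host` and the type aliases are hypothetical and only for illustration; this is not the code from the cookbook patch. It assumes a Python version where `hosts()` returns a list for single-address networks, as shown in the 18:44 REPL paste.

```python
import ipaddress
from typing import Union

# Hypothetical aliases, introduced only for this sketch.
Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]
Address = Union[ipaddress.IPv4Address, ipaddress.IPv6Address]


def first_host(network: Network) -> Address:
    """Return the first host address of a network (hypothetical helper).

    hosts() can hand back an iterator or, for single-address networks
    (/32 and /128), a plain list. A list is iterable but is not itself an
    iterator, so calling next() on it directly raises TypeError.
    Wrapping the result in iter() works for both return types.
    """
    return next(iter(network.hosts()))


if __name__ == "__main__":
    # A regular network and the single-address case from the discussion.
    print(first_host(ipaddress.ip_network("10.0.0.0/24")))
    print(first_host(ipaddress.ip_interface("10.0.0.1/32").network))
```

Since `iter()` on an existing iterator just returns that iterator, the wrapper costs nothing in the common case and should also satisfy mypy, because `iter()` accepts both members of the union.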