[02:32:27] 10netbox, 06DC-Ops, 06Infrastructure-Foundations, 10ops-codfw, 06SRE: codfw:cr* router power not balance on all 4 PEM's - https://phabricator.wikimedia.org/T401937#11183419 (10Papaul) @cmooney we have the spare PEM on site. I need to get on a call with Juniper to troubleshoot this. Do you think Thursd...
[05:32:25] FIRING: MirrorHighLag: Mirrors - /srv/mirrors/ubuntu synchronization lag - https://wikitech.wikimedia.org/wiki/Mirrors - https://grafana.wikimedia.org/d/dbd8a904-eab2-48d1-a3b9-fa1851ef3ed2/mirrors?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DMirrorHighLag
[06:07:25] RESOLVED: MirrorHighLag: Mirrors - /srv/mirrors/ubuntu synchronization lag - https://wikitech.wikimedia.org/wiki/Mirrors - https://grafana.wikimedia.org/d/dbd8a904-eab2-48d1-a3b9-fa1851ef3ed2/mirrors?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DMirrorHighLag
[07:03:56] FIRING: MaxConntrack: Max conntrack at 84.42% on krb1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[07:08:55] RESOLVED: MaxConntrack: Max conntrack at 84.72% on krb1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[07:19:25] FIRING: MaxConntrack: Max conntrack at 83.86% on krb1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[07:24:26] RESOLVED: MaxConntrack: Max conntrack at 82.09% on krb1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[10:12:25] FIRING: SystemdUnitFailed: nginx.service on urldownloader1003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[10:17:25] RESOLVED: SystemdUnitFailed: nginx.service on urldownloader1003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[10:50:24] 10netbox, 06DC-Ops, 06Infrastructure-Foundations, 10ops-codfw, 06SRE: codfw:cr* router power not balance on all 4 PEM's - https://phabricator.wikimedia.org/T401937#11184382 (10cmooney) Hey @papaul, yeah Thursday will be fine, thanks.
[11:13:08] Hi there, I'm working on Poliloom (https://loom.everypolitician.org/ | https://github.com/opensanctions/poliloom/), a tool to update Wikidata with information from Wikipedia and the wider web, and I'm running into a problem that I suspect could be caused by blocking on the Wikidata side. I found this IRC chat here:
[11:13:08] https://wikitech.wikimedia.org/wiki/SRE/Infrastructure_Foundations/Contact and was wondering if this is the correct channel to ask questions about this? Thank you in advance for any help you can give me.
[11:16:03] The problem I'm running into is that on my development machine, with my development credentials, inserts into Wikidata work perfectly.
[11:16:03] However, the same service, using our production credentials and running on Hetzner infrastructure, gets these JSON responses from the Wikidata REST API:
[11:16:04] {"error":"rest-write-denied","httpCode":403,"httpReason":"Forbidden"} when I try to push statements.
[11:16:04] In both environments, logging in with the Wikimedia login system works great, so the credentials should be correct.
[11:16:05] As far as I can see, our OAuth2 app configurations are identical, and our services should behave the same. I've double-checked all the configuration and tried deploying the same Docker-based setup as production locally; that worked.
[11:16:05] I've exhausted my options and now I wonder: is there some form of IP-range blocking going on here? Does Wikidata disallow writes originating from Hetzner IP ranges? Or is this an elusive problem on my end?
[11:16:52] Our OAuth credentials:
[11:16:52] development: https://meta.wikimedia.org/w/index.php?title=Special:OAuthListConsumers/view/cc0ecfe052f413eb5ea8f700cf206863&name=PoliLoom&publisher=&stage=1
[11:16:53] production: https://meta.wikimedia.org/w/index.php?title=Special:OAuthListConsumers/view/560f4dac847feb3b7f71b79fc3a1e9b5&name=Poliloom&publisher=&stage=1
[12:18:07] 10netops, 06DC-Ops, 06Infrastructure-Foundations, 10ops-eqiad, 06SRE: eqiad: rows C/D Upgrade Tracking - https://phabricator.wikimedia.org/T404609#11184704 (10cmooney) >>! In T404609#11181649, @RobH wrote: > @cmooney: What do you think is the best way to go about migrating these connections on upcoming C...
[12:22:03] 10netops, 06DC-Ops, 06Infrastructure-Foundations, 10ops-eqiad, 06SRE: eqiad: rows C/D Upgrade Tracking - https://phabricator.wikimedia.org/T404609#11184711 (10cmooney) @RobH @Jclark-ctr there is also another way we could try to approach this, so may as well mention it now before we start planning. Rack-b...
[12:40:45] 10netops, 06DC-Ops, 06Infrastructure-Foundations, 10ops-eqiad, 06SRE: eqiad: rows C/D Upgrade Tracking - https://phabricator.wikimedia.org/T404609#11184793 (10Jclark-ctr) @cmooney I’m flexible to try either way. Maybe a mix could work? We could start with roles that aren’t single points of failure and ar...
[13:09:36] 10Mail, 06FR-donorrelations, 06Infrastructure-Foundations, 06SRE: Donations@ doesn't forward to donate@ - https://phabricator.wikimedia.org/T403986#11184909 (10AMJohnson) 05Open→03Resolved a:03AMJohnson @DSeyfert_WMF was able to fix this for us. Thank you, Dustin! Going ahead and closing out this...
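[ed. note] The failing write above can be reproduced outside the Poliloom service. This is a minimal sketch, assuming the Wikibase REST API v1 endpoint layout and an OAuth2 bearer token; the item ID, statement payload, and token value are hypothetical placeholders, not taken from the report:

```python
# Minimal sketch of the statement write that returns rest-write-denied
# in production. Stdlib only; item ID, payload, and token are placeholders.
import json
import urllib.request

API = "https://www.wikidata.org/w/rest.php/wikibase/v1"

def build_statement_request(item_id: str, statement: dict, token: str) -> urllib.request.Request:
    """Build (without sending) the POST that adds one statement to an item."""
    body = json.dumps({"statement": statement}).encode("utf-8")
    return urllib.request.Request(
        f"{API}/entities/items/{item_id}/statements",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # placeholder OAuth2 token
        },
    )

# Hypothetical payload: a "no value" claim for P31, just to exercise the call.
req = build_statement_request(
    "Q42",
    {"property": {"id": "P31"}, "value": {"type": "novalue"}},
    "ACCESS_TOKEN",
)
# urllib.request.urlopen(req) would return the new statement on success,
# or raise HTTPError 403 with the rest-write-denied body quoted above.
```

Capturing and diffing this exact request from both environments (URL, headers, token) is one way to rule out a client-side difference before suspecting IP-range blocking.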
[14:10:37] 10Mail, 06FR-donorrelations, 06Infrastructure-Foundations, 06SRE: Donations@ doesn't forward to donate@ - https://phabricator.wikimedia.org/T403986#11185266 (10Aklapper) a:05AMJohnson→03DSeyfert_WMF
[16:12:23] 10netops, 06DC-Ops, 06Infrastructure-Foundations, 10ops-eqiad, 06SRE: eqiad: rows C/D Upgrade Tracking - https://phabricator.wikimedia.org/T404609#11185881 (10RobH) I overthought this, we should just move them with an SFP-T to the new port and worry about reimage and migration to full 10G later.
[16:13:29] 10netops, 06DC-Ops, 06Infrastructure-Foundations, 10ops-eqiad, 06SRE: eqiad: rows C/D Upgrade Tracking - https://phabricator.wikimedia.org/T404609#11185897 (10RobH)
[16:24:03] 10netops, 06DC-Ops, 06Infrastructure-Foundations, 10ops-eqiad, 06SRE: eqiad: rows C/D Upgrade Tracking - https://phabricator.wikimedia.org/T404609#11185965 (10RobH)
[19:18:56] FIRING: MaxConntrack: Max conntrack at 82.19% on krb1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[19:23:55] RESOLVED: MaxConntrack: Max conntrack at 84.55% on krb1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack