[08:24:02] netops, Infrastructure-Foundations, SRE: Netbox info missing on some WMCS elements - https://phabricator.wikimedia.org/T292097 (aborrero) The `cloud-gw-transport-eqiad` range is actually `185.15.56.236/30` not 185.15.56.**238**/30. It is registered on netbox: https://netbox.wikimedia.org/ipam/prefixe...
[09:08:18] netops, Infrastructure-Foundations, SRE: Netbox info missing on some WMCS elements - https://phabricator.wikimedia.org/T292097 (cmooney) @aborrero Yeah the range is there alright, I just mean the second IP in the linknet, 185.15.56.238, is not associated with anything. Netbox always uses the CIDR ma...
[09:15:55] SRE-tools, DBA, Infrastructure-Foundations, Sustainability (Incident Followup): Create or modify an existing tool that quickly shows the db replication status in case of master failure - https://phabricator.wikimedia.org/T281249 (Marostegui) With orchestrator we can sort of do that (note clouddb1...
[09:35:42] Hey, not sure if anyone knows how to deal with this one, just a small thing.
[09:36:38] Icinga is alerting for the RIPE Atlas in codfw - it's getting a 403 back when it tries to run a check against it.
[09:37:28] I think this is a slight change from previous - until yesterday the unit was hard down - now it is back online, but not yet accepted by RIPE as a valid probe.
[09:37:45] So it's no problem, I've made the required request to RIPE to bring it back.
[09:38:06] Should I just ACK the Icinga alerts? Leave them as they are? Like I say no big thing at all just wondering best practice here.
[09:58:08] I can tell you what *I* do- if there is a chance of an alert getting worse without noticing, but currently it is a known issue, I ack it, but give it a limited time for expiration, that way I don't forget about it forever
[10:10:12] jynus: Sounds reasonable, thanks for the advice :)
[10:12:37] it depends on the check, really- I think that works for me because weekly backup checks, if you think this is going to take a very long time, maybe disabling it on puppet is another option
[10:13:41] The RIPE docs say they normally process the things within a week so I think it's probably ok to leave it active, just wanted to signal it's expected.
[10:13:51] the important thing is have a safe way to not forget to reenable stuff, expiration time, a ticket or a calendar thingy
[10:14:37] yeah exactly, serves as a reminder to me to chase up if it doesn't resolve itself.
[10:14:42] topranks, then 1 week expiration would work for you- and linking on the comment on how to learn more in case you are out
[10:15:03] cool sounds good :)
[10:36:59] netops, Infrastructure-Foundations, SRE: Netbox info missing on some WMCS elements - https://phabricator.wikimedia.org/T292097 (aborrero) >>! In T292097#7390420, @cmooney wrote: > @aborrero Yeah the range is there alright, I just mean the second IP in the linknet, 185.15.56.238, is not associated wit...
[10:53:59] SRE-tools, DBA, Infrastructure-Foundations, Sustainability (Incident Followup): Create or modify an existing tool that quickly shows the db replication status in case of master failure - https://phabricator.wikimedia.org/T281249 (jcrespo) For backups (but I think DBAs may have an equivalent need)...
[11:12:45] SRE-tools, DBA, Infrastructure-Foundations, Sustainability (Incident Followup): Create or modify an existing tool that quickly shows the db replication status in case of master failure - https://phabricator.wikimedia.org/T281249 (Marostegui) Orchestrator has tags, which can be useful - I filed th...
[11:18:46] SRE-tools, DBA, Infrastructure-Foundations, Sustainability (Incident Followup): Create or modify an existing tool that quickly shows the db replication status in case of master failure - https://phabricator.wikimedia.org/T281249 (Marostegui) ` root@dborch1001:~# /usr/bin/orchestrator-client -c...
[12:22:01] Mail, Infrastructure-Foundations, SRE, Patch-For-Review: Upgrade MXes to Bullseye - https://phabricator.wikimedia.org/T286911 (MoritzMuehlenhoff) In progress→Resolved a:MoritzMuehlenhoff mx1001/mx2001 have been reimaged to Bullseye (reusing the VM/IP for potential IP reputation issues...
[13:58:48] netops, Infrastructure-Foundations, SRE: Netbox info missing on some WMCS elements - https://phabricator.wikimedia.org/T292097 (cmooney) Cool thanks for that. Regarding 185.15.56.238, I didn't realise it was already in DNS. That makes it easy, I've gone ahead and added an object for it to Netbox:...
[15:01:17] Puppet, Infrastructure-Foundations, GitLab (Infrastructure), Patch-For-Review, and 3 others: Puppetise gitlab-ansible playbook - https://phabricator.wikimedia.org/T283076 (Jelto) The preparation of GitLab puppet code is mostly done. I would like to deploy https://gerrit.wikimedia.org/r/724430 to...
[16:04:11] netops, Infrastructure-Foundations, SRE: ripe-atlas-codfw is down - https://phabricator.wikimedia.org/T267714 (cmooney)
[16:04:25] netops, DC-Ops, Infrastructure-Foundations, SRE: (Need By: TBD) rack/setup/install atlas-codfw.wikimedia.org - https://phabricator.wikimedia.org/T273114 (cmooney) Open→Resolved And we are live :) ` From: RIPE Atlas [mailto:atlas@ripe.net] Sent: Thursday, September 30, 2021, 2:31 PM To: Ca...
[18:54:30] netops, Infrastructure-Foundations, SRE: Netbox info missing on some WMCS elements - https://phabricator.wikimedia.org/T292097 (cmooney) Just an update, I removed the DNS entry / IP object for the .238 address just to be safe. When the sre.dns.netbox cookbook ran the change in Netbox made it want to...
[19:54:23] Puppet, Infrastructure-Foundations, Release-Engineering-Team, User-brennen: logspam-watch: UTF-8 errors for some input - https://phabricator.wikimedia.org/T292246 (brennen)
[19:54:31] Puppet, Infrastructure-Foundations, Release-Engineering-Team, User-brennen: logspam-watch: UTF-8 errors for some input - https://phabricator.wikimedia.org/T292246 (brennen)
[22:22:47] netops, DC-Ops, Infrastructure-Foundations, SRE, and 2 others: Q1:(Need By: TBD) rack/setup/install cloudswift100[12] - https://phabricator.wikimedia.org/T289882 (Jclark-ctr)
[22:22:57] netops, DC-Ops, Infrastructure-Foundations, SRE, and 2 others: Q1:(Need By: TBD) rack/setup/install cloudswift100[12] - https://phabricator.wikimedia.org/T289882 (Jclark-ctr) @cmooney These host have come in and racked unless something has changed and these racks are correct please assign to @C...