[06:22:39] SRE-tools, DBA, Infrastructure-Foundations, Sustainability (Incident Followup), User-Ladsgroup: Create or modify an existing tool that quickly shows the db replication status in case of master failure - https://phabricator.wikimedia.org/T281249 (Marostegui) Thanks for working on this @Ladsgro...
[09:36:32] SRE-tools, DBA, Infrastructure-Foundations, Sustainability (Incident Followup), User-Ladsgroup: Create or modify an existing tool that quickly shows the db replication status in case of master failure - https://phabricator.wikimedia.org/T281249 (jcrespo) > @jcrespo Did you test the POC I ment...
[09:49:25] SRE-tools, DBA, Infrastructure-Foundations, Sustainability (Incident Followup), User-Ladsgroup: Create or modify an existing tool that quickly shows the db replication status in case of master failure - https://phabricator.wikimedia.org/T281249 (Marostegui) >>! In T281249#7780918, @jcrespo wr...
[09:59:26] SRE-tools, DBA, Infrastructure-Foundations, Sustainability (Incident Followup), User-Ladsgroup: Create or modify an existing tool that quickly shows the db replication status in case of master failure - https://phabricator.wikimedia.org/T281249 (jcrespo) > When we migrated to dbctl, we lost t...
[09:59:50] SRE-tools, DBA, Infrastructure-Foundations, Sustainability (Incident Followup), User-Ladsgroup: Create or modify an existing tool that quickly shows the db replication status in case of master failure - https://phabricator.wikimedia.org/T281249 (Marostegui) Fixed dbctl notes for s4. Checked a...
[11:22:44] Puppet, Infrastructure-Foundations, SRE, Patch-For-Review, User-jbond: Work required to prepare for puppet 6 - https://phabricator.wikimedia.org/T265138 (jbond)
[13:02:18] CFSSL-PKI, Infrastructure-Foundations, observability, serviceops: CertAlmostExpired firing regularly for cert-manager certificates - https://phabricator.wikimedia.org/T303932 (JMeybohm) p:Triage→Medium
[13:32:33] CFSSL-PKI, Infrastructure-Foundations, observability, serviceops: CertAlmostExpired firing regularly for cert-manager certificates - https://phabricator.wikimedia.org/T303932 (fgiunchedi) Thanks Janis for kickstarting the discussion. I more or less guessed the thresholds for critical/warning, def...
[13:32:44] CFSSL-PKI, Infrastructure-Foundations, observability, serviceops: CertAlmostExpired firing regularly for cert-manager certificates - https://phabricator.wikimedia.org/T303932 (jbond) > I don't know where the 96h come from (maybe that's the cfssl default if nothing is configured on the profile lev...
[13:40:50] CFSSL-PKI, Infrastructure-Foundations, observability, serviceops: CertAlmostExpired firing regularly for cert-manager certificates - https://phabricator.wikimedia.org/T303932 (JMeybohm) >>! In T303932#7781820, @jbond wrote: >> I don't know where the 96h come from (maybe that's the cfssl default i...
[13:47:05] CFSSL-PKI, Infrastructure-Foundations, observability, serviceops: CertAlmostExpired firing regularly for cert-manager certificates - https://phabricator.wikimedia.org/T303932 (jbond) > 264h I don't see in hiera - what is that used for? ok so i told a white lie its actually [[ https://github.com/w...
[14:02:13] CFSSL-PKI, Infrastructure-Foundations, observability, serviceops: CertAlmostExpired firing regularly for cert-manager certificates - https://phabricator.wikimedia.org/T303932 (JMeybohm) Ah, I see :-) Would we be fine with icinga/alertmanager set to warn at 9 days and critical at 7?
[14:07:54] CFSSL-PKI, Infrastructure-Foundations, observability, serviceops, Patch-For-Review: CertAlmostExpired firing regularly for cert-manager certificates - https://phabricator.wikimedia.org/T303932 (jbond) >>! In T303932#7781970, @JMeybohm wrote: > Would we be fine with icinga/alertmanager set to...
[14:13:04] SRE-tools, DBA, Infrastructure-Foundations, Sustainability (Incident Followup), User-Ladsgroup: Create or modify an existing tool that quickly shows the db replication status in case of master failure - https://phabricator.wikimedia.org/T281249 (Ladsgroup) I updated that script to completely...
[14:15:58] SRE-tools, DBA, Infrastructure-Foundations, Sustainability (Incident Followup), User-Ladsgroup: Create or modify an existing tool that quickly shows the db replication status in case of master failure - https://phabricator.wikimedia.org/T281249 (Volans) >>! In T281249#7781991, @Ladsgroup wrot...
[14:16:58] SRE-tools, DBA, Infrastructure-Foundations, Sustainability (Incident Followup), User-Ladsgroup: Create or modify an existing tool that quickly shows the db replication status in case of master failure - https://phabricator.wikimedia.org/T281249 (Marostegui) >>! In T281249#7781991, @Ladsgroup...
[14:59:43] netops, DC-Ops, Infrastructure-Foundations, SRE, ops-eqiad: Q2:(Need By: TBD) Rows E/F network racking task - https://phabricator.wikimedia.org/T292095 (cmooney)
[15:00:52] netops, DC-Ops, Infrastructure-Foundations, SRE, ops-eqiad: Q2:(Need By: TBD) Rows E/F network racking task - https://phabricator.wikimedia.org/T292095 (cmooney)
[15:13:54] SRE-tools, DBA, Infrastructure-Foundations, Sustainability (Incident Followup), User-Ladsgroup: Create or modify an existing tool that quickly shows the db replication status in case of master failure - https://phabricator.wikimedia.org/T281249 (Ladsgroup) >>! In T281249#7782008, @Volans wrot...
[16:52:59] Puppet, Infrastructure-Foundations, SRE, Patch-For-Review, User-jbond: Work required to prepare for puppet 6 - https://phabricator.wikimedia.org/T265138 (jhathaway)
[16:53:43] Puppet, Infrastructure-Foundations, Patch-For-Review: Where to Put Community Modules? - https://phabricator.wikimedia.org/T302423 (jhathaway) Open→Resolved Community modules have now been moved to vendor_modules, thanks everyone for the discussion & feedback.
[19:24:03] netops, Data-Engineering, Data-Engineering-Kanban, Infrastructure-Foundations: Allow access to prometheus-pushgateway.discovery.wmnet port 80 from within Analytics VLAN - https://phabricator.wikimedia.org/T304001 (Ottomata)
[19:26:06] netops, Data-Engineering, Data-Engineering-Kanban, Infrastructure-Foundations: Allow access to prometheus-pushgateway.discovery.wmnet port 80 from within Analytics VLAN - https://phabricator.wikimedia.org/T304001 (Ottomata)
[20:14:38] netops, Data-Engineering, Data-Engineering-Kanban, Infrastructure-Foundations, and 2 others: Allow access to prometheus-pushgateway.discovery.wmnet port 80 from within Analytics VLAN - https://phabricator.wikimedia.org/T304001 (Ottomata)
[22:24:17] netops, Data-Engineering, Data-Engineering-Kanban, Infrastructure-Foundations, and 2 others: Allow access to prometheus-pushgateway.discovery.wmnet port 80 from within Analytics VLAN - https://phabricator.wikimedia.org/T304001 (cmooney) Worth noting that we are planning in the short term to adjus...