[02:09:24] FIRING: SystemdUnitFailed: generate_os_reports.service on puppetdb2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[06:09:24] FIRING: SystemdUnitFailed: generate_os_reports.service on puppetdb2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[08:54:32] netops, Infrastructure-Foundations: Apply egress Source Address Validation on the Wikimedia core routers - https://phabricator.wikimedia.org/T372158#10056948 (ayounsi) > However, in reality, it should be possible to reject all IP packets where the source IP is not part of the IP prefixes that the Foundat...
[09:08:36] netops, Infrastructure-Foundations: Publish, and maintain ASPA records for valid AS14907 upstreams - https://phabricator.wikimedia.org/T372161#10056965 (ayounsi) > Nevertheless, it should be possible to publish ASPA records in RPKI through the ARIN portal I looked a bit around Arin's RPKI's portal but co...
[09:10:01] netops, Infrastructure-Foundations, sre-alert-triage: Alert in need of triage: BGP status (instance cr1-esams) - https://phabricator.wikimedia.org/T372248 (LSobanski) NEW
[09:42:05] o/ puppetmaster100{1,3} seem unhappy, see failed probe alerts and intermittent failures as discussed in -operations, I'm going to see if a restart helps unless you peeps want to have a look first
[09:51:44] netops, Infrastructure-Foundations, sre-alert-triage: Alert in need of triage: BGP status (instance cr1-esams) - https://phabricator.wikimedia.org/T372248#10057060 (ayounsi) a:ayounsi Emailed AS54994 and cleared the errors for the others.
[10:09:24] FIRING: SystemdUnitFailed: generate_os_reports.service on puppetdb2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[11:58:42] o/
[13:25:18] Mail, Infrastructure-Foundations: Alert email sent from backupmon1001 didn't reach engineer's google inbox (was: check-dbbackup-time sometimes doesn't send email alerts) - https://phabricator.wikimedia.org/T369253#10057656 (jhathaway) >>! In T369253#9961121, @MatthewVernon wrote: > [it might be worth mak...
[14:09:24] FIRING: SystemdUnitFailed: generate_os_reports.service on puppetdb2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[15:44:15] netops, Infrastructure-Foundations: Publish, and maintain ASPA records for valid AS14907 upstreams - https://phabricator.wikimedia.org/T372161#10058298 (Southparkfan) Follow-up from IRC: Wikimedia uses the [[ https://www.arin.net/resources/manage/rpki/hosted/ | Hosted RPKI ]], but we assume the ARIN port...
[15:53:35] netops, DBA, DC-Ops, Infrastructure-Foundations, and 2 others: Migrate codfw row C & D database hosts to new Leaf switches - https://phabricator.wikimedia.org/T370852#10058326 (ABran-WMF) >>! In T370852#10028352, @Marostegui wrote: > @ABran-WMF please coordinate with @cmooney for this. ack, will...
[16:25:08] netbox, Infrastructure-Foundations, Patch-For-Review: Netbox: use Custom Model Validation - https://phabricator.wikimedia.org/T310590#10058423 (ayounsi)
[16:38:36] jhathaway: slyngs: I'm trying to update an entry in pwstore and I'm seeing an expired key with ID FA1E9F9A41E7F43502CA5D6352FC8E7BEDB7FCA2 - what should I do?
[16:45:18] btullis: find out which user it is with gpg --search-keys FA1E9F9A41E7F43502CA5D6352FC8E7BEDB7FCA2 and ask them to update their key
[16:45:39] though until that is done. I would say remove the key
[16:46:21] in this case it's lego
[16:46:40] though there is basically always at least one expired one when you try to make an edit, in my experience
[16:47:21] mutante: Thanks. Yes, I'm just not sure what Kunal's current employment/nda status is, so I thought I'd check here.
[16:47:29] so it needs to be removed from the .users file and then that file needs to be signed again
[18:09:24] FIRING: SystemdUnitFailed: generate_os_reports.service on puppetdb2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[18:31:02] Mail, Infrastructure-Foundations, Trust-and-Safety: Mail from Bishzilla to emergency@wikimedia.org is possibly getting lost - https://phabricator.wikimedia.org/T338032#10058980 (jhathaway) @Nahid would you be able to take a look into @RoySmith most recent message?
[18:31:11] Mail, Infrastructure-Foundations, Trust-and-Safety: Mail from Bishzilla to emergency@wikimedia.org is possibly getting lost - https://phabricator.wikimedia.org/T338032#10058985 (jhathaway) a:jhathaway→None
[21:37:14] netops, DC-Ops, Infrastructure-Foundations, ops-codfw, SRE: Request additional mgmt IP range for frack servers - https://phabricator.wikimedia.org/T370164#10060002 (Dwisehaupt) a:Dwisehaupt→None Sorry for the delay, I was out last week. This should be fixed. The connections to the mg...
[21:38:56] netops, DC-Ops, fundraising-tech-ops, Infrastructure-Foundations, and 2 others: Request additional mgmt IP range for frack servers - https://phabricator.wikimedia.org/T370164#10060015 (Dwisehaupt)
[22:09:24] FIRING: SystemdUnitFailed: generate_os_reports.service on puppetdb2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed