[00:38:14] serviceops, Data-Persistence, Phabricator, serviceops-collab, Release-Engineering-Team (Bonus Level 🕹️): sort out mysql privileges for phab1004/phab2002 - https://phabricator.wikimedia.org/T315713 (Dzahn) >>! In T315713#8190804, @Marostegui wrote: > Is there a way in puppet that any new host...
[00:38:33] serviceops, Data-Persistence, Phabricator, serviceops-collab, Release-Engineering-Team (Bonus Level 🕹️): sort out mysql privileges for phab1004/phab2002 - https://phabricator.wikimedia.org/T315713 (Dzahn) a:Dzahn
[00:38:51] serviceops, Phabricator, serviceops-collab, Release-Engineering-Team (Bonus Level 🕹️): sort out mysql privileges for phab1004/phab2002 - https://phabricator.wikimedia.org/T315713 (Dzahn)
[00:39:26] serviceops, Phabricator, serviceops-collab, Release-Engineering-Team (Bonus Level 🕹️): sort out mysql privileges for phab1004/phab2002 - https://phabricator.wikimedia.org/T315713 (Dzahn) removing data-persistence again. You did answer the question and we know what to do. Thanks!
[08:00:20] serviceops, Parsoid, Patch-For-Review, Performance-Team (Radar): Parsoid migration to php 7.4 - https://phabricator.wikimedia.org/T312638 (Volans) >>! In T312638#8204469, @Dzahn wrote: > P.S. the cookbook was supposed to set the downtime for that but that failed for some reason, so nothing you wer...
[08:07:08] serviceops, Parsoid, Patch-For-Review, Performance-Team (Radar): Parsoid migration to php 7.4 - https://phabricator.wikimedia.org/T312638 (Joe) For the record, that failure was expected and Clement was well aware of >>! In T312638#8204449, @Dzahn wrote: > @Clement_Goubert Just wanted to share...
[08:17:52] serviceops, Parsoid, Patch-For-Review, Performance-Team (Radar): Parsoid migration to php 7.4 - https://phabricator.wikimedia.org/T312638 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=e0342c4d-b0d1-490b-b740-0d2962a32ac0) set by cgoubert@cumin1001 for 7 days, 0:00:00 on 1 host...
[08:19:57] serviceops, Parsoid, Patch-For-Review, Performance-Team (Radar): Parsoid migration to php 7.4 - https://phabricator.wikimedia.org/T312638 (Clement_Goubert) What happened is, I downtimed all parse1* for a week while we worked on them with @Joe Reimage of parse1002 reset that to 2h for that host,...
[09:13:04] serviceops, DBA, Patch-For-Review, Performance-Team (Radar): Update wgLBFactoryConf for x2 to register only the local primary - https://phabricator.wikimedia.org/T316482 (Marostegui) Excellent @CDanis - thank you. Let me know indeed and I can add it to x2.
[09:27:48] serviceops: Put parse parse100[01-24] in production - https://phabricator.wikimedia.org/T307219 (Clement_Goubert)
[09:27:57] serviceops, Parsoid, Patch-For-Review, Performance-Team (Radar): Parsoid migration to php 7.4 - https://phabricator.wikimedia.org/T312638 (Clement_Goubert)
[09:28:40] serviceops, Dumps-Generation, Patch-For-Review, Performance-Team (Radar): Migrate WMF production from PHP 7.2 to PHP 7.4 - https://phabricator.wikimedia.org/T271736 (Clement_Goubert)
[09:28:48] serviceops, Parsoid, Patch-For-Review, Performance-Team (Radar): Parsoid migration to php 7.4 - https://phabricator.wikimedia.org/T312638 (Clement_Goubert) Open→In progress
[09:28:56] serviceops: Put parse parse100[01-24] in production - https://phabricator.wikimedia.org/T307219 (Clement_Goubert) Open→In progress
[09:29:04] serviceops, Parsoid, Patch-For-Review, Performance-Team (Radar): Parsoid migration to php 7.4 - https://phabricator.wikimedia.org/T312638 (Clement_Goubert)
[10:14:32] serviceops: Put parse parse10[01-24] in production - https://phabricator.wikimedia.org/T307219 (Clement_Goubert)
[10:34:36] serviceops, Performance-Team, SRE, SRE-swift-storage, and 3 others: Progressive Multi-DC roll out - https://phabricator.wikimedia.org/T279664 (tstarling) I'm aiming to do stage 3 and 4 on September 6.
[11:12:36] serviceops, Parsoid, Patch-For-Review, Performance-Team (Radar): Parsoid migration to php 7.4 - https://phabricator.wikimedia.org/T312638 (Clement_Goubert) I will be moving production operations comments to https://phabricator.wikimedia.org/T307219 as not to clutter this task further
[11:13:10] serviceops: Put parse parse10[01-24] in production - https://phabricator.wikimedia.org/T307219 (Clement_Goubert) As mentioned in https://phabricator.wikimedia.org/T312638 I'll be logging here the progress of putting the new parse servers in production. parse1001 got put in the pool replacing wtp1034 parse10...
[11:13:45] serviceops, Parsoid, Patch-For-Review, Performance-Team (Radar): Parsoid migration to php 7.4 - https://phabricator.wikimedia.org/T312638 (Clement_Goubert) 8% of parse traffic now served in php7.4 only
[11:41:02] serviceops, API Platform, Growth-Structured-Tasks, Image-Suggestions, and 5 others: Exception: Invalid JSON response for page: Espejo - https://phabricator.wikimedia.org/T313973 (kostajh) Thanks for the explanation. If I understand correctly, that means that we're more likely to see errors for sp...
[12:14:16] serviceops, DBA, Patch-For-Review, Performance-Team (Radar): Update wgLBFactoryConf for x2 to register only the local primary - https://phabricator.wikimedia.org/T316482 (CDanis) This is ready, was tested by hand on cumin2002, and is now deployed to both cumin hosts. `sh ✔️ cdanis@cumin2002.codf...
[12:35:50] serviceops: Put parse parse10[01-24] in production - https://phabricator.wikimedia.org/T307219 (Clement_Goubert) p:Triage→High
[13:06:32] serviceops: Cleanup profile::docker::engine::packagename - https://phabricator.wikimedia.org/T316639 (Clement_Goubert) Hosts `cloudweb2002-dev.wikimedia.org,cloudweb[1003-1004].wikimedia.org` still use docker-ce, will carve an exception for them in hiera.
[13:20:26] serviceops, Data-Engineering, SRE, Event-Platform Value Stream (Sprint 01): eventstreams chart should use latest common_templates - https://phabricator.wikimedia.org/T310721 (JArguello-WMF)
[13:21:06] serviceops, Data Engineering Planning, SRE, Event-Platform Value Stream (Sprint 01), Patch-For-Review: eventgate chart should use common_templates - https://phabricator.wikimedia.org/T303543 (JArguello-WMF)
[13:25:58] serviceops, Data-Engineering, SRE, Event-Platform Value Stream (Sprint 01): eventstreams chart should use latest common_templates - https://phabricator.wikimedia.org/T310721 (lbowmaker) a:gmodena
[13:26:32] serviceops, Data Engineering Planning, SRE, Event-Platform Value Stream (Sprint 01), Patch-For-Review: eventgate chart should use common_templates - https://phabricator.wikimedia.org/T303543 (lbowmaker) a:Jelto→gmodena
[13:44:48] akosiaris/jayme: thank you/apologies for taking on the thumbor chart review <3
[13:45:28] hnowlan: thanks for working on it. While doing the first pass, it dawned on me how much of a work it is to get that chart working.
[14:27:37] serviceops, Patch-For-Review: Cleanup profile::docker::engine::packagename - https://phabricator.wikimedia.org/T316639 (Clement_Goubert) Open→Resolved
[14:42:21] serviceops, SRE, serviceops-collab, Patch-For-Review, Technical-Debt: Sunset search.wikimedia.org service - https://phabricator.wikimedia.org/T316296 (Jelto) I don't want to over complicate the decommission of the service. But I was thinking about depooling the service first from confctl. Dep...
[15:32:21] serviceops, API Platform, Growth-Structured-Tasks, Image-Suggestions, and 5 others: Exception: Invalid JSON response for page: Espejo - https://phabricator.wikimedia.org/T313973 (Eevans) >>! In T313973#8205672, @kostajh wrote: > Thanks for the explanation. If I understand correctly, that means th...
[15:50:21] serviceops: Make gVisor packages available via apt.wikimedia.org - https://phabricator.wikimedia.org/T316879 (ori)
[15:50:33] serviceops: Make gVisor packages available via apt.wikimedia.org - https://phabricator.wikimedia.org/T316879 (ori)
[16:39:04] serviceops, Prod-Kubernetes, Wikidata, Wikidata-Query-Service, and 2 others: Write and adapt Runbooks and cookbooks related to the WDQS Streaming Updater and kubernetes - https://phabricator.wikimedia.org/T293063 (dcausse)
[16:39:15] serviceops, Prod-Kubernetes, Wikidata, Wikidata-Query-Service, and 2 others: Write and adapt Runbooks and cookbooks related to the WDQS Streaming Updater and kubernetes - https://phabricator.wikimedia.org/T293063 (dcausse) @JMeybohm thanks for the write-up! I added few more notes.
[19:12:25] serviceops, Fundraising Tech - Chaos Crew, Release-Engineering-Team, serviceops-collab, GitLab (Project Migration): Create new GitLab project group: Fundraising-Tech - https://phabricator.wikimedia.org/T316695 (Dzahn) Open→In progress a:Dzahn
[19:12:46] serviceops, Fundraising Tech - Chaos Crew, Release-Engineering-Team, serviceops-collab, GitLab (Project Migration): Create new GitLab project group: Fundraising-Tech - https://phabricator.wikimedia.org/T316695 (Dzahn)
[19:17:29] serviceops, Fundraising Tech - Chaos Crew, Release-Engineering-Team, serviceops-collab, GitLab (Project Migration): Create new GitLab project group: Fundraising-Tech - https://phabricator.wikimedia.org/T316695 (jgleeson) thanks @Dzahn !
[19:20:01] serviceops, Parsoid, Patch-For-Review, Performance-Team (Radar): Parsoid migration to php 7.4 - https://phabricator.wikimedia.org/T312638 (Dzahn) >>! In T312638#8205255, @Volans wrote: > That's not correct. The cookbooks worked as expected and the downtime was properly set: Sorry, @Volans. The t...
[19:21:23] serviceops, Fundraising Tech - Chaos Crew, Release-Engineering-Team, serviceops-collab, GitLab (Project Migration): Create new GitLab project group: Fundraising-Tech - https://phabricator.wikimedia.org/T316695 (Dzahn) In progress→Resolved Yep. cheers. Feel free to reopen if there are...
[19:23:41] serviceops, Infrastructure-Foundations: Make gVisor packages available via apt.wikimedia.org - https://phabricator.wikimedia.org/T316879 (Dzahn)
[19:50:14] serviceops, serviceops-collab, vrts: vrts - spamassassin icinga alerts - https://phabricator.wikimedia.org/T316903 (Dzahn)
[19:53:16] serviceops, serviceops-collab, vrts: vrts - spamassassin icinga alerts - https://phabricator.wikimedia.org/T316903 (Dzahn) ` [otrs1001:~] $ sudo systemctl status spamassassin_updates ● spamassassin_updates.service - Spamassassin definitions update Loaded: loaded (/lib/systemd/system/spamassassin_u...
[19:57:44] serviceops, serviceops-collab, vrts: vrts - spamassassin icinga alerts - https://phabricator.wikimedia.org/T316903 (Dzahn) @Arnoldokoth If you feel like taking a look.. maybe we can run the updates manually a couple times and catch some logs to figure out what is failing here every once in a while.
[19:58:14] serviceops, serviceops-collab, vrts: vrts - spamassassin icinga alerts - https://phabricator.wikimedia.org/T316903 (Dzahn) p:Triage→Low
[20:02:14] serviceops, serviceops-collab, vrts: vrts - spamassassin icinga alerts - https://phabricator.wikimedia.org/T316903 (Arnoldokoth) Hey @Dzahn Wasn't this patch reverted?
[20:12:38] serviceops, serviceops-collab, vrts: vrts - spamassassin icinga alerts - https://phabricator.wikimedia.org/T316903 (Dzahn) @Arnoldokoth It was like this: original change: https://gerrit.wikimedia.org/r/c/operations/puppet/+/819553 revert: https://gerrit.wikimedia.org/r/c/operations/puppet/+/8263...
[20:21:45] serviceops, serviceops-collab, vrts: vrts - spamassassin icinga alerts - https://phabricator.wikimedia.org/T316903 (Arnoldokoth) `Sep 01 20:19:06 otrs1001 systemd[1]: spamassassin_updates.service: Succeeded` This is all I got from the logs after manually starting it.
[20:38:42] serviceops, serviceops-collab, vrts: vrts - spamassassin icinga alerts - https://phabricator.wikimedia.org/T316903 (Dzahn) ACK. It seems to be transitory, just happens every once in a while. Let's keep an eye open for the Icinga alert (maybe make it send email?) to catch it next time before logs are...