[00:00:05] <jouncebot>	 RoanKattouw and Urbanecm: #bothumor My software never has bugs. It just develops random features. Rise for UTC late backport window. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20220111T0000).
[00:00:05] <jouncebot>	 nray: A patch you scheduled for UTC late backport window is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker.
[00:00:24] <nray>	 Hello o/
[00:00:35] <urbanecm>	 Hey nray 
[00:00:39] <urbanecm>	 I can deploy today
[00:00:51] <nray>	 hey urbanecm . Thank you!
[00:01:10] <wikibugs>	 (CR) Urbanecm: [C: +2] Fix TypeError: document.querySelectorAll(...).forEach is not a function [skins/Vector] (wmf/1.38.0-wmf.16) - https://gerrit.wikimedia.org/r/752766 (https://phabricator.wikimedia.org/T298910) (owner: Nray)
[00:17:03] <wikibugs>	 (CR) Cwhite: [C: +1] "LGTM thanks!" [puppet] - https://gerrit.wikimedia.org/r/752211 (https://phabricator.wikimedia.org/T297239) (owner: Herron)
[00:17:36] <wikibugs>	 (CR) Cwhite: [C: +1] "LGTM" [puppet] - https://gerrit.wikimedia.org/r/752631 (owner: Muehlenhoff)
[00:18:42] <wikibugs>	 (Merged) jenkins-bot: Fix TypeError: document.querySelectorAll(...).forEach is not a function [skins/Vector] (wmf/1.38.0-wmf.16) - https://gerrit.wikimedia.org/r/752766 (https://phabricator.wikimedia.org/T298910) (owner: Nray)
[00:18:48] <wikibugs>	 (CR) Cwhite: [C: +1] kafka-logging: move to fixed UID/GID for kafka user [puppet] - https://gerrit.wikimedia.org/r/752677 (https://phabricator.wikimedia.org/T298883) (owner: Herron)
[00:20:04] <urbanecm>	 nray: can you test at mwdebug1001 please?
[00:20:14] <nray>	 yes testing now, thank you
[00:20:24] <urbanecm>	 thanks
[00:21:44] <wikibugs>	 (CR) Cwhite: [C: +1] "My promtool executable is also located elsewhere in PATH.  I tested this locally and it worked great. Thanks!" [alerts] - https://gerrit.wikimedia.org/r/752651 (owner: JMeybohm)
[00:22:19] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
[00:22:21] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[00:22:37] <nray>	 @urbanecm things look good. You can proceed
[00:22:42] <urbanecm>	 syncing
[00:23:26] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
[00:23:27] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[00:23:27] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
[00:23:28] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[00:24:29] <logmsgbot>	 !log urbanecm@deploy1002 Synchronized php-1.38.0-wmf.16/skins/Vector/resources/skins.vector.js/dropdownMenus.js: 79b33f2: Fix TypeError: document.querySelectorAll(...).forEach is not a function (T298910) (duration: 00m 59s)
[00:24:31] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[00:24:31] <stashbot>	 T298910: TypeError: document.querySelectorAll(...).forEach is not a function  - https://phabricator.wikimedia.org/T298910
[00:24:33] <urbanecm>	 nray: and live
[00:24:35] <urbanecm>	 anything else?
[00:24:41] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
[00:24:42] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[00:24:43] <nray>	 that's it. thanks so much for your help!
[00:24:54] <urbanecm>	 any time
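
Note on the backport above: the error "document.querySelectorAll(...).forEach is not a function" typically appears in older browsers whose NodeList objects have no forEach method. The log does not show what the Vector patch changed, so the following is only a minimal sketch of one common workaround, not the deployed fix. The helper name `forEachItem` is hypothetical.

```typescript
// Hypothetical helper, not taken from the Vector patch referenced in the log.
// It iterates any array-like value (an object with a numeric length and
// indexed entries), which includes the NodeList returned by querySelectorAll.
function forEachItem<T>(
  items: ArrayLike<T>,
  callback: (item: T, index: number) => void
): void {
  // Borrow Array.prototype.forEach instead of relying on items.forEach,
  // which may be missing on NodeList in older browsers.
  Array.prototype.forEach.call(items, callback);
}

// In a browser this could be called as (assumed selector, for illustration):
//   forEachItem(document.querySelectorAll(".some-selector"), (el) => { /* ... */ });
```

Borrowing the array method avoids changing `NodeList.prototype`, so it has no effect on other scripts running on the page.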
[00:25:39] <wikibugs>	 (CR) Cwhite: "This change also includes a role reassignment from kibana7_ecs to logging::opensearch::collector.  I propose we recreate role::kibana7_ecs" [puppet] - https://gerrit.wikimedia.org/r/752756 (https://phabricator.wikimedia.org/T288621) (owner: Cwhite)
[00:28:20] <wikibugs>	 SRE, MediaWiki-General, Performance-Team, serviceops-radar, and 2 others: Move MainStash out of Redis to a simpler multi-dc aware solution - https://phabricator.wikimedia.org/T212129 (aaron) I like "mainstash". If there is ever vertical sharding by extension, then "<group>stash" could be used as...
[00:50:53] <icinga-wm>	 PROBLEM - Prometheus jobs reduced availability on alert1001 is CRITICAL: job=sidekiq site=codfw https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[00:53:03] <icinga-wm>	 RECOVERY - Prometheus jobs reduced availability on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[01:16:35] <icinga-wm>	 RECOVERY - Check systemd state on an-launcher1002 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[01:35:41] <icinga-wm>	 RECOVERY - SSH on mw2252.mgmt is OK: SSH OK - OpenSSH_7.0 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[01:45:01] <icinga-wm>	 PROBLEM - Citoid LVS eqiad on citoid.svc.eqiad.wmnet is CRITICAL: /robots.txt (Untitled test) is CRITICAL: Test Untitled test returned the unexpected status 503 (expecting: 200): /api (Zotero and citoid alive) is CRITICAL: Test Zotero and citoid alive returned the unexpected status 503 (expecting: 200) https://wikitech.wikimedia.org/wiki/Citoid
[01:47:17] <icinga-wm>	 RECOVERY - Citoid LVS eqiad on citoid.svc.eqiad.wmnet is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Citoid
[02:00:04] <jouncebot>	 Deploy window Automatic branching of MediaWiki, extensions, skins, and vendor – see Heterogeneous_deployment/Train_deploys (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20220111T0200)
[02:00:25] <icinga-wm>	 RECOVERY - Check systemd state on deneb is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[02:05:44] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
[02:05:45] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[02:06:39] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
[02:06:40] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[02:06:40] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
[02:06:41] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[02:06:53] <icinga-wm>	 PROBLEM - Check systemd state on deneb is CRITICAL: CRITICAL - degraded: The following units failed: package_builder_Clean_up_build_directory.service https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[02:06:57] <wikibugs>	 (PS1) TrainBranchBot: Branch commit for wmf/1.38.0-wmf.17 [core] (wmf/1.38.0-wmf.17) - https://gerrit.wikimedia.org/r/752802
[02:06:59] <wikibugs>	 (CR) TrainBranchBot: [C: +2] Branch commit for wmf/1.38.0-wmf.17 [core] (wmf/1.38.0-wmf.17) - https://gerrit.wikimedia.org/r/752802 (owner: TrainBranchBot)
[02:07:44] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
[02:07:45] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[02:25:13] <wikibugs>	 (Merged) jenkins-bot: Branch commit for wmf/1.38.0-wmf.17 [core] (wmf/1.38.0-wmf.17) - https://gerrit.wikimedia.org/r/752802 (owner: TrainBranchBot)
[02:32:58] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
[02:33:00] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[02:33:57] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
[02:33:58] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
[02:33:58] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[02:33:59] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[02:35:06] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
[02:35:07] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[02:41:58] <wikibugs>	 (PS1) Andrew Bogott: All nfs-exportd to make public mounts actually public [puppet] - https://gerrit.wikimedia.org/r/752805 (https://phabricator.wikimedia.org/T293800)
[02:42:35] <wikibugs>	 (PS1) Aaron Schulz: Add "db-mainstash" entry to $wgObjectCaches [mediawiki-config] - https://gerrit.wikimedia.org/r/752807 (https://phabricator.wikimedia.org/T212129)
[02:45:53] <wikibugs>	 (PS2) Andrew Bogott: cloudnfs: allow nfs-exportd to make public mounts actually public [puppet] - https://gerrit.wikimedia.org/r/752805 (https://phabricator.wikimedia.org/T293800)
[02:46:28] <wikibugs>	 (CR) jerkins-bot: [V: -1] cloudnfs: allow nfs-exportd to make public mounts actually public [puppet] - https://gerrit.wikimedia.org/r/752805 (https://phabricator.wikimedia.org/T293800) (owner: Andrew Bogott)
[02:58:58] <wikibugs>	 (PS3) Andrew Bogott: cloudnfs: allow nfs-exportd to make public mounts actually public [puppet] - https://gerrit.wikimedia.org/r/752805 (https://phabricator.wikimedia.org/T293800)
[03:10:22] <wikibugs>	 (PS4) Andrew Bogott: cloudnfs: allow nfs-exportd to make public mounts actually public [puppet] - https://gerrit.wikimedia.org/r/752805 (https://phabricator.wikimedia.org/T293800)
[03:13:17] <wikibugs>	 (PS5) Andrew Bogott: cloudnfs: allow nfs-exportd to make public mounts actually public [puppet] - https://gerrit.wikimedia.org/r/752805 (https://phabricator.wikimedia.org/T293800)
[03:14:18] <wikibugs>	 (CR) Andrew Bogott: [C: +2] cloudnfs: allow nfs-exportd to make public mounts actually public [puppet] - https://gerrit.wikimedia.org/r/752805 (https://phabricator.wikimedia.org/T293800) (owner: Andrew Bogott)
[03:15:31] <wikibugs>	 (PS1) Andrew Bogott: nfs/add_server.py: one last puppet run after everthing is configured [cookbooks] (wmcs) - https://gerrit.wikimedia.org/r/752810 (https://phabricator.wikimedia.org/T293800)
[03:39:38] <wikibugs>	 (CR) Andrew Bogott: [C: +2] nfs/add_server.py: one last puppet run after everthing is configured [cookbooks] (wmcs) - https://gerrit.wikimedia.org/r/752810 (https://phabricator.wikimedia.org/T293800) (owner: Andrew Bogott)
[03:42:35] <wikibugs>	 (Merged) jenkins-bot: nfs/add_server.py: one last puppet run after everthing is configured [cookbooks] (wmcs) - https://gerrit.wikimedia.org/r/752810 (https://phabricator.wikimedia.org/T293800) (owner: Andrew Bogott)
[04:44:53] <icinga-wm>	 PROBLEM - SSH on contint1001.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[05:44:03] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
[05:44:05] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
[05:44:05] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[05:44:07] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[05:44:07] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1116.eqiad.wmnet with reason: Maintenance
[05:44:08] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[05:44:09] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1116.eqiad.wmnet with reason: Maintenance
[05:44:10] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[05:44:11] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1104.eqiad.wmnet with reason: Maintenance
[05:44:12] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[05:44:13] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1104.eqiad.wmnet with reason: Maintenance
[05:44:14] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[05:44:17] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Depooling db1104 (T297191)', diff saved to https://phabricator.wikimedia.org/P18503 and previous config saved to /var/cache/conftool/dbconfig/20220111-054417-marostegui.json
[05:44:19] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[05:44:20] <stashbot>	 T297191: Schema change for dropping page_restrictions.pr_user field on wmf sites - https://phabricator.wikimedia.org/T297191
[05:44:54] <wikibugs>	 (PS1) Marostegui: Revert "es2032: Disable notifications" [puppet] - https://gerrit.wikimedia.org/r/752767
[05:46:01] <icinga-wm>	 RECOVERY - SSH on contint1001.mgmt is OK: SSH OK - OpenSSH_6.6 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[05:46:12] <wikibugs>	 (CR) Marostegui: [C: +2] Revert "es2032: Disable notifications" [puppet] - https://gerrit.wikimedia.org/r/752767 (owner: Marostegui)
[05:49:57] <wikibugs>	 (PS1) Marostegui: Revert "dbproxy1013: Disable notifications" [puppet] - https://gerrit.wikimedia.org/r/752768
[05:51:22] <wikibugs>	 (CR) Marostegui: [C: +2] Revert "dbproxy1013: Disable notifications" [puppet] - https://gerrit.wikimedia.org/r/752768 (owner: Marostegui)
[05:55:32] <wikibugs>	 (PS1) Marostegui: dbproxy1012: Disable notifications [puppet] - https://gerrit.wikimedia.org/r/752934 (https://phabricator.wikimedia.org/T298586)
[05:56:17] <wikibugs>	 (CR) Marostegui: [C: +2] dbproxy1012: Disable notifications [puppet] - https://gerrit.wikimedia.org/r/752934 (https://phabricator.wikimedia.org/T298586) (owner: Marostegui)
[06:00:25] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.reimage for host dbproxy1012.eqiad.wmnet with OS bullseye
[06:00:26] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:12:16] <wikibugs>	 (PS1) Marostegui: drop_rev_page_id_T285149.py: Schema change [software/schema-changes] - https://gerrit.wikimedia.org/r/752935 (https://phabricator.wikimedia.org/T285149)
[06:15:36] <wikibugs>	 (CR) Ladsgroup: [C: +1] "I reviewed this before." [software/schema-changes] - https://gerrit.wikimedia.org/r/752935 (https://phabricator.wikimedia.org/T285149) (owner: Marostegui)
[06:17:01] <wikibugs>	 (CR) Marostegui: [V: +2 C: +2] drop_rev_page_id_T285149.py: Schema change [software/schema-changes] - https://gerrit.wikimedia.org/r/752935 (https://phabricator.wikimedia.org/T285149) (owner: Marostegui)
[06:18:33] <taavi>	 Amir1: btw can I start the centralauth hidden_level migration script?
[06:18:53] <Amir1>	 taavi: good morning, sure
[06:19:09] <taavi>	 just a screen session on mwmaint1002 is fine I guess?
[06:19:20] <wikibugs>	 (PS1) Gergő Tisza: SECURITY: Fix several i18n XSS issues in suggested edits [extensions/GrowthExperiments] (wmf/1.38.0-wmf.17) - https://gerrit.wikimedia.org/r/752769 (https://phabricator.wikimedia.org/T298504)
[06:19:32] <Amir1>	 depends on how long it would take but screen is better
[06:21:03] <taavi>	 I honestly have no clue on how long it will take
[06:21:30] <taavi>	 !log starting extensions/CentralAuth/maintenance/migrateHiddenLevel.php on a mwmaint1002 screen session - T289068
[06:21:32] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:21:33] <stashbot>	 T289068: Normalise centralauth.gu_hidden - https://phabricator.wikimedia.org/T289068
[06:23:42] <taavi>	 it's going really fast, but not telling me how many rows it is affecting
[06:24:03] <taavi>	 at least I don't see any replag on grafana
[06:26:21] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repool es2032 after Bullseye reimage T295965', diff saved to https://phabricator.wikimedia.org/P18504 and previous config saved to /var/cache/conftool/dbconfig/20220111-062620-marostegui.json
[06:26:23] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:26:24] <stashbot>	 T295965: Test MariaDB 10.4 with Bullseye - https://phabricator.wikimedia.org/T295965
[06:27:44] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'db1144:3315 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P18505 and previous config saved to /var/cache/conftool/dbconfig/20220111-062743-root.json
[06:27:45] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:29:55] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbproxy1012.eqiad.wmnet with OS bullseye
[06:29:56] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:30:20] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
[06:30:21] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:30:22] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
[06:30:23] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:30:33] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
[06:30:34] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:30:35] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
[06:30:36] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:30:46] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
[06:30:47] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:30:48] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
[06:30:49] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:30:53] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Depooling db1101:3317 (T297191)', diff saved to https://phabricator.wikimedia.org/P18506 and previous config saved to /var/cache/conftool/dbconfig/20220111-063052-marostegui.json
[06:30:55] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:30:55] <stashbot>	 T297191: Schema change for dropping page_restrictions.pr_user field on wmf sites - https://phabricator.wikimedia.org/T297191
[06:32:08] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T297191)', diff saved to https://phabricator.wikimedia.org/P18507 and previous config saved to /var/cache/conftool/dbconfig/20220111-063207-marostegui.json
[06:32:10] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:32:59] <wikibugs>	 (PS4) ArielGlenn: Add siteinfo data in formatversion=2 too [dumps] - https://gerrit.wikimedia.org/r/747987 (owner: Legoktm)
[06:33:45] <wikibugs>	 (PS1) Gergő Tisza: Strip comments from indicators [extensions/PageImages] (wmf/1.38.0-wmf.17) - https://gerrit.wikimedia.org/r/752770 (https://phabricator.wikimedia.org/T298930)
[06:34:23] <wikibugs>	 (CR) Legoktm: [C: +1] "PS4 changes LGTM!" [dumps] - https://gerrit.wikimedia.org/r/747987 (owner: Legoktm)
[06:34:39] <wikibugs>	 (CR) ArielGlenn: "Sorry about that, forgot to actually git add the file with the small changes.Done now." [dumps] - https://gerrit.wikimedia.org/r/747987 (owner: Legoktm)
[06:37:10] <tgr>	 will do some backports
[06:41:46] <icinga-wm>	 PROBLEM - SSH on mw2252.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[06:42:47] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'db1144:3315 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P18508 and previous config saved to /var/cache/conftool/dbconfig/20220111-064247-root.json
[06:42:49] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:45:00] <wikibugs>	 (CR) jerkins-bot: [V: -1] SECURITY: Fix several i18n XSS issues in suggested edits [extensions/GrowthExperiments] (wmf/1.38.0-wmf.17) - https://gerrit.wikimedia.org/r/752769 (https://phabricator.wikimedia.org/T298504) (owner: Gergő Tisza)
[06:47:12] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P18509 and previous config saved to /var/cache/conftool/dbconfig/20220111-064712-marostegui.json
[06:47:13] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:48:25] <wikibugs>	 (PS1) Marostegui: Revert "dbproxy1012: Disable notifications" [puppet] - https://gerrit.wikimedia.org/r/752771
[06:50:25] <wikibugs>	 (CR) Marostegui: [C: +2] Revert "dbproxy1012: Disable notifications" [puppet] - https://gerrit.wikimedia.org/r/752771 (owner: Marostegui)
[06:50:47] <Amir1>	 !log upgrading mysql on ['db2114', 'db2117', 'db2124']
[06:50:48] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:51:12] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2114.codfw.wmnet with reason: Maintenance
[06:51:13] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:51:14] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2114.codfw.wmnet with reason: Maintenance
[06:51:15] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:51:18] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db2114 (T296143)', diff saved to https://phabricator.wikimedia.org/P18510 and previous config saved to /var/cache/conftool/dbconfig/20220111-065118-ladsgroup.json
[06:51:21] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:51:21] <stashbot>	 T296143: Optimize commonswiki image table - https://phabricator.wikimedia.org/T296143
[06:51:22] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.mysql.upgrade for db2114.codfw.wmnet
[06:51:23] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:52:07] <Amir1>	 I put the wrong ticket
[06:55:35] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2114.codfw.wmnet
[06:55:36] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:56:40] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2114 (T296143)', diff saved to https://phabricator.wikimedia.org/P18511 and previous config saved to /var/cache/conftool/dbconfig/20220111-065640-ladsgroup.json
[06:56:42] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:56:43] <stashbot>	 T296143: Optimize commonswiki image table - https://phabricator.wikimedia.org/T296143
[06:57:51] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'db1144:3315 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P18512 and previous config saved to /var/cache/conftool/dbconfig/20220111-065750-root.json
[06:57:52] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:02:17] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P18513 and previous config saved to /var/cache/conftool/dbconfig/20220111-070216-marostegui.json
[07:02:18] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:06:17] <wikibugs>	 (PS1) Marostegui: wmnet: Failover m2 master to dbproxy1013 [dns] - https://gerrit.wikimedia.org/r/752936 (https://phabricator.wikimedia.org/T298586)
[07:07:37] <marostegui>	 !log Failover m2 proxy from dbproxy1015 to dbproxy1013 T298586
[07:07:39] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:07:41] <stashbot>	 T298586: Upgrade all dbproxy hosts to Bullseye - https://phabricator.wikimedia.org/T298586
[07:07:49] <wikibugs>	 (CR) Marostegui: [C: +2] wmnet: Failover m2 master to dbproxy1013 [dns] - https://gerrit.wikimedia.org/r/752936 (https://phabricator.wikimedia.org/T298586) (owner: Marostegui)
[07:11:45] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2114', diff saved to https://phabricator.wikimedia.org/P18514 and previous config saved to /var/cache/conftool/dbconfig/20220111-071144-ladsgroup.json
[07:11:46] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:12:33] <taavi>	 !log extensions/CentralAuth/maintenance/migrateHiddenLevel.php finished - T289068
[07:12:35] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:12:35] <stashbot>	 T289068: Normalise centralauth.gu_hidden - https://phabricator.wikimedia.org/T289068
[07:12:54] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'db1144:3315 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P18515 and previous config saved to /var/cache/conftool/dbconfig/20220111-071254-root.json
[07:12:56] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
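
Note on the staged repool above: db1144:3315 returns to service in four logged steps (25%, 50%, 75%, 100%) at 06:27, 06:42, 06:57 and 07:12, so roughly 15 minutes apart. The following is a minimal sketch of computing such a schedule, assuming evenly sized and evenly spaced steps. `repoolSchedule` and `RepoolStep` are hypothetical names, and this is not the tooling that produced the log entries.

```typescript
// Hypothetical illustration of an evenly spaced, stepwise repool schedule.
interface RepoolStep {
  percent: number; // share of normal traffic weight at this step
  at: string;      // wall-clock time of the step, formatted HH:MM
}

function twoDigits(n: number): string {
  return (n < 10 ? "0" : "") + n;
}

function repoolSchedule(start: string, steps: number, intervalMinutes: number): RepoolStep[] {
  const parts = start.split(":").map(Number);
  const startMinutes = parts[0] * 60 + parts[1];
  const schedule: RepoolStep[] = [];
  for (let i = 1; i <= steps; i++) {
    // Wrap around midnight so the result is always a valid time of day.
    const total = (startMinutes + (i - 1) * intervalMinutes) % (24 * 60);
    schedule.push({
      percent: Math.round((100 * i) / steps),
      at: twoDigits(Math.floor(total / 60)) + ":" + twoDigits(total % 60),
    });
  }
  return schedule;
}
```

Calling `repoolSchedule("06:27", 4, 15)` yields steps at 06:27, 06:42, 06:57 and 07:12, matching the minute stamps of the four dbctl commits for db1144:3315.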
[07:17:22] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T297191)', diff saved to https://phabricator.wikimedia.org/P18516 and previous config saved to /var/cache/conftool/dbconfig/20220111-071721-marostegui.json
[07:17:23] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
[07:17:24] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:17:25] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
[07:17:25] <stashbot>	 T297191: Schema change for dropping page_restrictions.pr_user field on wmf sites - https://phabricator.wikimedia.org/T297191
[07:17:26] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:17:27] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:17:29] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Depooling db1127 (T297191)', diff saved to https://phabricator.wikimedia.org/P18517 and previous config saved to /var/cache/conftool/dbconfig/20220111-071729-marostegui.json
[07:17:31] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:26:50] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2114', diff saved to https://phabricator.wikimedia.org/P18518 and previous config saved to /var/cache/conftool/dbconfig/20220111-072649-ladsgroup.json
[07:26:50] <wikibugs>	 (CR) Gergő Tisza: [C: +2] Strip comments from indicators [extensions/PageImages] (wmf/1.38.0-wmf.17) - https://gerrit.wikimedia.org/r/752770 (https://phabricator.wikimedia.org/T298930) (owner: Gergő Tisza)
[07:26:51] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:28:47] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1127 (T297191)', diff saved to https://phabricator.wikimedia.org/P18519 and previous config saved to /var/cache/conftool/dbconfig/20220111-072847-marostegui.json
[07:28:49] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:28:50] <stashbot>	 T297191: Schema change for dropping page_restrictions.pr_user field on wmf sites - https://phabricator.wikimedia.org/T297191
[07:41:54] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2114 (T296143)', diff saved to https://phabricator.wikimedia.org/P18520 and previous config saved to /var/cache/conftool/dbconfig/20220111-074154-ladsgroup.json
[07:41:56] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2117.codfw.wmnet with reason: Maintenance
[07:41:56] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:41:57] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2117.codfw.wmnet with reason: Maintenance
[07:41:58] <stashbot>	 T296143: Optimize commonswiki image table - https://phabricator.wikimedia.org/T296143
[07:41:59] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:42:00] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:42:02] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db2117 (T296143)', diff saved to https://phabricator.wikimedia.org/P18521 and previous config saved to /var/cache/conftool/dbconfig/20220111-074202-ladsgroup.json
[07:42:04] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:42:05] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.mysql.upgrade for db2117.codfw.wmnet
[07:42:06] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:43:52] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P18522 and previous config saved to /var/cache/conftool/dbconfig/20220111-074351-marostegui.json
[07:43:53] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:46:55] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2117.codfw.wmnet
[07:46:56] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:47:33] <wikibugs>	 (Merged) jenkins-bot: Strip comments from indicators [extensions/PageImages] (wmf/1.38.0-wmf.17) - https://gerrit.wikimedia.org/r/752770 (https://phabricator.wikimedia.org/T298930) (owner: Gergő Tisza)
[07:48:00] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2117 (T296143)', diff saved to https://phabricator.wikimedia.org/P18523 and previous config saved to /var/cache/conftool/dbconfig/20220111-074800-ladsgroup.json
[07:48:02] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:48:03] <stashbot>	 T296143: Optimize commonswiki image table - https://phabricator.wikimedia.org/T296143
[07:53:20] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
[07:53:21] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:53:29] <wikibugs>	 (CR) Gergő Tisza: [C: +2] "recheck" [extensions/GrowthExperiments] (wmf/1.38.0-wmf.17) - https://gerrit.wikimedia.org/r/752769 (https://phabricator.wikimedia.org/T298504) (owner: Gergő Tisza)
[07:53:48] <wikibugs>	 (PS1) Marostegui: dbproxy1020: Disable notifications [puppet] - https://gerrit.wikimedia.org/r/752989 (https://phabricator.wikimedia.org/T298586)
[07:54:19] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
[07:54:20] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
[07:54:20] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:54:21] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:54:43] <wikibugs>	 (CR) Marostegui: [C: +2] dbproxy1020: Disable notifications [puppet] - https://gerrit.wikimedia.org/r/752989 (https://phabricator.wikimedia.org/T298586) (owner: Marostegui)
[07:55:20] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
[07:55:21] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:55:26] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.reimage for host dbproxy1020.eqiad.wmnet with OS bullseye
[07:55:27] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:58:56] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P18524 and previous config saved to /var/cache/conftool/dbconfig/20220111-075856-marostegui.json
[07:58:58] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:02:34] <icinga-wm>	 PROBLEM - HTTPS-wmfusercontent on phab.wmfusercontent.org is CRITICAL: SSL CRITICAL - Certificate *.wikipedia.org valid until 2022-02-10 08:02:21 +0000 (expires in 29 days) https://phabricator.wikimedia.org/tag/phabricator/
[08:03:05] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P18525 and previous config saved to /var/cache/conftool/dbconfig/20220111-080305-ladsgroup.json
[08:03:06] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:14:01] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1127 (T297191)', diff saved to https://phabricator.wikimedia.org/P18526 and previous config saved to /var/cache/conftool/dbconfig/20220111-081400-marostegui.json
[08:14:03] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:14:04] <stashbot>	 T297191: Schema change for dropping page_restrictions.pr_user field on wmf sites - https://phabricator.wikimedia.org/T297191
[08:14:06] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
[08:14:07] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:14:08] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
[08:14:09] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 12:00:00 on 10 hosts with reason: Maintenance
[08:14:09] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:14:10] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:14:17] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 10 hosts with reason: Maintenance
[08:14:18] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:14:36] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
[08:14:37] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:14:38] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
[08:14:39] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:14:43] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Depooling db1170:3317 (T297191)', diff saved to https://phabricator.wikimedia.org/P18527 and previous config saved to /var/cache/conftool/dbconfig/20220111-081442-marostegui.json
[08:14:44] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:15:58] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T297191)', diff saved to https://phabricator.wikimedia.org/P18528 and previous config saved to /var/cache/conftool/dbconfig/20220111-081557-marostegui.json
[08:16:00] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:16:46] <wikibugs>	 (CR) Ema: [C: +1] "Congrats on cluster id 100 \o/" [puppet] - https://gerrit.wikimedia.org/r/752146 (owner: Ssingh)
[08:18:10] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P18529 and previous config saved to /var/cache/conftool/dbconfig/20220111-081809-ladsgroup.json
[08:18:11] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:20:13] <wikibugs>	 (Merged) jenkins-bot: SECURITY: Fix several i18n XSS issues in suggested edits [extensions/GrowthExperiments] (wmf/1.38.0-wmf.17) - https://gerrit.wikimedia.org/r/752769 (https://phabricator.wikimedia.org/T298504) (owner: Gergő Tisza)
[08:21:24] <wikibugs>	 SRE, Analytics-Radar, Event-Platform, Patch-For-Review: Allow kafka clients to verify brokers hostnames when using SSL - https://phabricator.wikimedia.org/T291905 (elukey)
[08:22:26] <wikibugs>	 (PS4) Elukey: varnishkafka: use new ca bundle instead of the Puppet one [puppet] - https://gerrit.wikimedia.org/r/742747 (https://phabricator.wikimedia.org/T296064)
[08:24:06] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbproxy1020.eqiad.wmnet with OS bullseye
[08:24:07] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:24:19] <wikibugs>	 (CR) Elukey: [V: +1] "PCC SUCCESS (DIFF 2): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/33181/console" [puppet] - https://gerrit.wikimedia.org/r/742747 (https://phabricator.wikimedia.org/T296064) (owner: Elukey)
[08:25:35] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
[08:25:36] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:26:27] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
[08:26:28] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:26:28] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
[08:26:29] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:27:28] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
[08:27:29] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:31:02] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P18530 and previous config saved to /var/cache/conftool/dbconfig/20220111-083102-marostegui.json
[08:31:04] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:32:14] <wikibugs>	 (CR) Marostegui: [C: +1] auto_schema: Force depool in codfw for mysql upgrades [software] - https://gerrit.wikimedia.org/r/752700 (https://phabricator.wikimedia.org/T239814) (owner: Ladsgroup)
[08:33:14] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2117 (T296143)', diff saved to https://phabricator.wikimedia.org/P18531 and previous config saved to /var/cache/conftool/dbconfig/20220111-083314-ladsgroup.json
[08:33:16] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2124.codfw.wmnet with reason: Maintenance
[08:33:17] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:33:18] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2124.codfw.wmnet with reason: Maintenance
[08:33:18] <stashbot>	 T296143: Optimize commonswiki image table - https://phabricator.wikimedia.org/T296143
[08:33:19] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:33:20] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:33:22] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db2124 (T296143)', diff saved to https://phabricator.wikimedia.org/P18532 and previous config saved to /var/cache/conftool/dbconfig/20220111-083322-ladsgroup.json
[08:33:24] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:33:26] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.mysql.upgrade for db2124.codfw.wmnet
[08:33:27] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:34:15] <wikibugs>	 (CR) Ladsgroup: [C: +2] auto_schema: Force depool in codfw for mysql upgrades [software] - https://gerrit.wikimedia.org/r/752700 (https://phabricator.wikimedia.org/T239814) (owner: Ladsgroup)
[08:34:47] <wikibugs>	 (Merged) jenkins-bot: auto_schema: Force depool in codfw for mysql upgrades [software] - https://gerrit.wikimedia.org/r/752700 (https://phabricator.wikimedia.org/T239814) (owner: Ladsgroup)
[08:39:46] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2124.codfw.wmnet
[08:39:47] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:40:46] <logmsgbot>	 !log jmm@cumin2002 START - Cookbook sre.hosts.reimage for host ganeti2023.codfw.wmnet with OS buster
[08:40:47] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:41:52] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2124 (T296143)', diff saved to https://phabricator.wikimedia.org/P18533 and previous config saved to /var/cache/conftool/dbconfig/20220111-084151-ladsgroup.json
[08:41:54] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:41:54] <stashbot>	 T296143: Optimize commonswiki image table - https://phabricator.wikimedia.org/T296143
[08:42:12] <icinga-wm>	 PROBLEM - SSH on bast3005 is CRITICAL: Server answer: https://wikitech.wikimedia.org/wiki/SSH/monitoring
[08:44:14] <icinga-wm>	 RECOVERY - SSH on bast3005 is OK: SSH OK - OpenSSH_7.9p1 Debian-10+deb10u2 (protocol 2.0) https://wikitech.wikimedia.org/wiki/SSH/monitoring
[08:45:09] <wikibugs>	 (PS1) Marostegui: db2078: Disable notifications [puppet] - https://gerrit.wikimedia.org/r/752993 (https://phabricator.wikimedia.org/T295965)
[08:46:07] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P18534 and previous config saved to /var/cache/conftool/dbconfig/20220111-084606-marostegui.json
[08:46:08] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:48:18] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.reimage for host db2078.codfw.wmnet with OS bullseye
[08:48:19] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:48:33] <wikibugs>	 (CR) Marostegui: [C: +2] db2078: Disable notifications [puppet] - https://gerrit.wikimedia.org/r/752993 (https://phabricator.wikimedia.org/T295965) (owner: Marostegui)
[08:51:51] <icinga-wm>	 PROBLEM - haproxy failover on dbproxy2001 is CRITICAL: CRITICAL check_failover servers up 1 down 1: https://wikitech.wikimedia.org/wiki/HAProxy
[08:52:05] <icinga-wm>	 PROBLEM - haproxy failover on dbproxy2004 is CRITICAL: CRITICAL check_failover servers up 1 down 1: https://wikitech.wikimedia.org/wiki/HAProxy
[08:52:43] <marostegui>	 ^ me
[08:53:18] <icinga-wm>	 ACKNOWLEDGEMENT - haproxy failover on dbproxy2001 is CRITICAL: CRITICAL check_failover servers up 1 down 1: Marostegui known https://wikitech.wikimedia.org/wiki/HAProxy
[08:53:18] <icinga-wm>	 ACKNOWLEDGEMENT - haproxy failover on dbproxy2004 is CRITICAL: CRITICAL check_failover servers up 1 down 1: Marostegui known https://wikitech.wikimedia.org/wiki/HAProxy
[08:56:57] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P18535 and previous config saved to /var/cache/conftool/dbconfig/20220111-085656-ladsgroup.json
[08:56:58] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:01:12] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T297191)', diff saved to https://phabricator.wikimedia.org/P18536 and previous config saved to /var/cache/conftool/dbconfig/20220111-090111-marostegui.json
[09:01:13] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
[09:01:14] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:01:15] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
[09:01:15] <stashbot>	 T297191: Schema change for dropping page_restrictions.pr_user field on wmf sites - https://phabricator.wikimedia.org/T297191
[09:01:16] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:01:18] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:01:19] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Depooling db1174 (T297191)', diff saved to https://phabricator.wikimedia.org/P18537 and previous config saved to /var/cache/conftool/dbconfig/20220111-090119-marostegui.json
[09:01:21] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:03:08] <icinga-wm>	 PROBLEM - Prometheus jobs reduced availability on alert1001 is CRITICAL: job=mysql-misc site=codfw https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[09:07:32] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1174 (T297191)', diff saved to https://phabricator.wikimedia.org/P18538 and previous config saved to /var/cache/conftool/dbconfig/20220111-090732-marostegui.json
[09:07:34] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:07:35] <stashbot>	 T297191: Schema change for dropping page_restrictions.pr_user field on wmf sites - https://phabricator.wikimedia.org/T297191
[09:09:38] <icinga-wm>	 PROBLEM - haproxy failover on dbproxy2002 is CRITICAL: CRITICAL check_failover servers up 1 down 1: https://wikitech.wikimedia.org/wiki/HAProxy
[09:11:52] <logmsgbot>	 !log jmm@cumin2002 END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2023.codfw.wmnet with OS buster
[09:11:53] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:12:01] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P18539 and previous config saved to /var/cache/conftool/dbconfig/20220111-091201-ladsgroup.json
[09:12:03] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:12:10] <icinga-wm>	 RECOVERY - haproxy failover on dbproxy2002 is OK: OK check_failover servers up 2 down 0: https://wikitech.wikimedia.org/wiki/HAProxy
[09:12:22] <icinga-wm>	 RECOVERY - haproxy failover on dbproxy2004 is OK: OK check_failover servers up 2 down 0: https://wikitech.wikimedia.org/wiki/HAProxy
[09:13:10] <icinga-wm>	 RECOVERY - haproxy failover on dbproxy2001 is OK: OK check_failover servers up 2 down 0: https://wikitech.wikimedia.org/wiki/HAProxy
[09:13:43] <logmsgbot>	 !log jayme@cumin1001 START - Cookbook sre.ganeti.reboot-vm for VM kubemaster1002.eqiad.wmnet
[09:13:44] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:14:46] <icinga-wm>	 RECOVERY - Prometheus jobs reduced availability on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[09:15:55] <logmsgbot>	 !log jayme@cumin1001 END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kubemaster1002.eqiad.wmnet
[09:15:56] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:17:29] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2078.codfw.wmnet with OS bullseye
[09:17:31] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:22:37] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P18540 and previous config saved to /var/cache/conftool/dbconfig/20220111-092236-marostegui.json
[09:22:38] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:23:22] <hashar>	 !log Upgrading Jenkins and Apache on releases1002 & releases2002
[09:23:24] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:23:50] <ema>	 !log cp4021 (upload), cp4027 (text): upgrade varnish to 6.0.9-1wm1 T298758
[09:23:52] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:23:52] <stashbot>	 T298758: Package and deploy Varnish 6.0.9 - https://phabricator.wikimedia.org/T298758
[09:25:35] <logmsgbot>	 !log jmm@cumin2002 START - Cookbook sre.hosts.reimage for host ganeti2019.codfw.wmnet with OS buster
[09:25:36] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:27:06] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2124 (T296143)', diff saved to https://phabricator.wikimedia.org/P18541 and previous config saved to /var/cache/conftool/dbconfig/20220111-092706-ladsgroup.json
[09:27:08] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:27:09] <stashbot>	 T296143: Optimize commonswiki image table - https://phabricator.wikimedia.org/T296143
[09:29:10] <logmsgbot>	 !log jayme@cumin1001 START - Cookbook sre.ganeti.reboot-vm for VM kubemaster1001.eqiad.wmnet
[09:29:12] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:33:14] <logmsgbot>	 !log jayme@cumin1001 START - Cookbook sre.ganeti.reboot-vm for VM kubestagemaster1001.eqiad.wmnet
[09:33:16] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:33:46] <wikibugs>	 SRE, Infrastructure-Foundations: Migrate eqiad Ganeti cluster to KVM machine type pc-i440fx-2.8 - https://phabricator.wikimedia.org/T294120 (JMeybohm)
[09:33:57] <wikibugs>	 (PS2) Cparle: Updated maint script to use fewer queries [extensions/MediaSearch] (wmf/1.38.0-wmf.16) - https://gerrit.wikimedia.org/r/752701 (https://phabricator.wikimedia.org/T297484)
[09:35:31] <logmsgbot>	 !log jayme@cumin1001 END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kubemaster1001.eqiad.wmnet
[09:35:32] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:37:42] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P18542 and previous config saved to /var/cache/conftool/dbconfig/20220111-093741-marostegui.json
[09:37:43] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:40:13] <logmsgbot>	 !log jayme@cumin1001 END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kubestagemaster1001.eqiad.wmnet
[09:40:15] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:42:58] <wikibugs>	 (PS2) JMeybohm: Use promtool in PATH rather than /usr/bin/promtool [alerts] - https://gerrit.wikimedia.org/r/752651
[09:45:42] <wikibugs>	 (PS1) Jcrespo: mediabackup: Backup testcommonswiki on codfw [puppet] - https://gerrit.wikimedia.org/r/752996 (https://phabricator.wikimedia.org/T262668)
[09:46:00] <wikibugs>	 (PS2) Arturo Borrero Gonzalez: wmcs: GridConfigurator: run puppet agent in the master node when reconfiguring [cookbooks] (wmcs) - https://gerrit.wikimedia.org/r/749739
[09:50:24] <wikibugs>	 (CR) JMeybohm: [C: +2] Use promtool in PATH rather than /usr/bin/promtool [alerts] - https://gerrit.wikimedia.org/r/752651 (owner: JMeybohm)
[09:50:29] <wikibugs>	 Puppet, SRE, Infrastructure-Foundations, Patch-For-Review: Unused puppet resources audit, 2021 - https://phabricator.wikimedia.org/T272559 (dcaro)
[09:51:13] <logmsgbot>	 !log jayme@cumin1001 conftool action : set/pooled=false; selector: dnsdisc=helm-charts,name=eqiad
[09:51:14] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:52:26] <wikibugs>	 (Merged) jenkins-bot: Use promtool in PATH rather than /usr/bin/promtool [alerts] - https://gerrit.wikimedia.org/r/752651 (owner: JMeybohm)
[09:52:47] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1174 (T297191)', diff saved to https://phabricator.wikimedia.org/P18543 and previous config saved to /var/cache/conftool/dbconfig/20220111-095246-marostegui.json
[09:52:48] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
[09:52:49] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:52:49] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
[09:52:50] <stashbot>	 T297191: Schema change for dropping page_restrictions.pr_user field on wmf sites - https://phabricator.wikimedia.org/T297191
[09:52:51] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:52:52] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:52:54] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Depooling db1098:3317 (T297191)', diff saved to https://phabricator.wikimedia.org/P18544 and previous config saved to /var/cache/conftool/dbconfig/20220111-095254-marostegui.json
[09:52:56] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:53:33] <wikibugs>	 (PS2) Cparle: Enable support for references [mediawiki-config] - https://gerrit.wikimedia.org/r/752599 (https://phabricator.wikimedia.org/T230315) (owner: Matthias Mullie)
[09:54:02] <wikibugs>	 (CR) Arturo Borrero Gonzalez: [C: +2] wmcs: GridConfigurator: run puppet agent in the master node when reconfiguring [cookbooks] (wmcs) - https://gerrit.wikimedia.org/r/749739 (owner: Arturo Borrero Gonzalez)
[09:54:12] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T297191)', diff saved to https://phabricator.wikimedia.org/P18545 and previous config saved to /var/cache/conftool/dbconfig/20220111-095408-marostegui.json
[09:54:14] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:54:28] <logmsgbot>	 !log jmm@cumin2002 END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2019.codfw.wmnet with OS buster
[09:54:29] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:55:14] <wikibugs>	 (PS2) Ideophagous: arywiki NS [mediawiki-config] - https://gerrit.wikimedia.org/r/747973 (https://phabricator.wikimedia.org/T291737)
[09:55:57] <wikibugs>	 (CR) Arturo Borrero Gonzalez: [C: +1] role::mariadb: remove unused role [puppet] - https://gerrit.wikimedia.org/r/751725 (https://phabricator.wikimedia.org/T272559) (owner: David Caro)
[09:56:58] <wikibugs>	 (Merged) jenkins-bot: wmcs: GridConfigurator: run puppet agent in the master node when reconfiguring [cookbooks] (wmcs) - https://gerrit.wikimedia.org/r/749739 (owner: Arturo Borrero Gonzalez)
[09:58:33] <wikibugs>	 (PS1) Ayounsi: Add msw2-eqiad to monitoring [puppet] - https://gerrit.wikimedia.org/r/753000
[09:58:43] <logmsgbot>	 !log jayme@cumin1001 conftool action : set/pooled=true; selector: dnsdisc=helm-charts,name=eqiad
[09:58:44] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[10:00:55] <wikibugs>	 (PS1) Elukey: bigtop: move our internal APT repo config to Buster [puppet] - https://gerrit.wikimedia.org/r/753002
[10:02:51] <wikibugs>	 (CR) Muehlenhoff: bigtop: move our internal APT repo config to Buster (1 comment) [puppet] - https://gerrit.wikimedia.org/r/753002 (owner: Elukey)
[10:09:17] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P18546 and previous config saved to /var/cache/conftool/dbconfig/20220111-100917-marostegui.json
[10:09:18] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[10:09:58] <wikibugs>	 (PS1) Muehlenhoff: Update repo config for Bigtop to buster [puppet] - https://gerrit.wikimedia.org/r/753004
[10:14:25] <wikibugs>	 (CR) Matthias Mullie: [C: +2] Updated maint script to use fewer queries [extensions/MediaSearch] (wmf/1.38.0-wmf.16) - https://gerrit.wikimedia.org/r/752701 (https://phabricator.wikimedia.org/T297484) (owner: Cparle)
[10:15:02] <wikibugs>	 (CR) Elukey: [C: +1] Update repo config for Bigtop to buster [puppet] - https://gerrit.wikimedia.org/r/753004 (owner: Muehlenhoff)
[10:15:22] <wikibugs>	 (Abandoned) Elukey: bigtop: move our internal APT repo config to Buster [puppet] - https://gerrit.wikimedia.org/r/753002 (owner: Elukey)
[10:16:38] <wikibugs>	 (CR) Jcrespo: "I heard this class or something similar was used or used to be used on cloud (VPS, not production) instances. This doesn't affect me, but " [puppet] - https://gerrit.wikimedia.org/r/751725 (https://phabricator.wikimedia.org/T272559) (owner: David Caro)
[10:20:22] <wikibugs>	 (PS1) Arturo Borrero Gonzalez: wmcs: toolforge: grid: add cookbook to create an exec node [cookbooks] (wmcs) - https://gerrit.wikimedia.org/r/753006 (https://phabricator.wikimedia.org/T298948)
[10:23:42] <wikibugs>	 (CR) Jcrespo: [C: +2] mediabackup: Backup testcommonswiki on codfw [puppet] - https://gerrit.wikimedia.org/r/752996 (https://phabricator.wikimedia.org/T262668) (owner: Jcrespo)
[10:24:23] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P18547 and previous config saved to /var/cache/conftool/dbconfig/20220111-102421-marostegui.json
[10:24:25] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[10:25:12] <wikibugs>	 (PS3) Btullis: Exclude log4j_extras from the classpath for coordinators [puppet] - https://gerrit.wikimedia.org/r/752673 (https://phabricator.wikimedia.org/T297468)
[10:26:59] <wikibugs>	 (CR) Kormat: [C: +1] role::mariadb::proxy: remove unused role [puppet] - https://gerrit.wikimedia.org/r/751726 (https://phabricator.wikimedia.org/T272559) (owner: David Caro)
[10:28:05] <wikibugs>	 (CR) Kormat: [C: +1] role::mariadb: remove unused role [puppet] - https://gerrit.wikimedia.org/r/751725 (https://phabricator.wikimedia.org/T272559) (owner: David Caro)
[10:28:19] <wikibugs>	 (PS1) Btullis: Mark reedy as kerberos enabled [puppet] - https://gerrit.wikimedia.org/r/753007 (https://phabricator.wikimedia.org/T298951)
[10:29:14] <wikibugs>	 (CR) David Caro: [C: +2] role::mariadb: remove unused role [puppet] - https://gerrit.wikimedia.org/r/751725 (https://phabricator.wikimedia.org/T272559) (owner: David Caro)
[10:29:30] <wikibugs>	 (CR) David Caro: [C: +2] role::mariadb::proxy: remove unused role [puppet] - https://gerrit.wikimedia.org/r/751726 (https://phabricator.wikimedia.org/T272559) (owner: David Caro)
[10:30:12] <wikibugs>	 (CR) Reedy: [C: +1] Mark reedy as kerberos enabled [puppet] - https://gerrit.wikimedia.org/r/753007 (https://phabricator.wikimedia.org/T298951) (owner: Btullis)
[10:30:20] <wikibugs>	 (CR) Muehlenhoff: [C: +2] Update repo config for Bigtop to buster [puppet] - https://gerrit.wikimedia.org/r/753004 (owner: Muehlenhoff)
[10:30:44] <wikibugs>	 Puppet, SRE, Infrastructure-Foundations, Patch-For-Review: Unused puppet resources audit, 2021 - https://phabricator.wikimedia.org/T272559 (dcaro)
[10:30:44] <moritzm>	 dcaro: merging your patches along, ok?
[10:31:06] <dcaro>	 moritzm: sure
[10:31:08] <dcaro>	 just went to log in
[10:31:10] <dcaro>	 thanks
[10:31:19] <moritzm>	 ack, done
[10:32:59] <wikibugs>	 (CR) David Caro: role::mariadb: remove unused role (1 comment) [puppet] - https://gerrit.wikimedia.org/r/751725 (https://phabricator.wikimedia.org/T272559) (owner: David Caro)
[10:39:28] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T297191)', diff saved to https://phabricator.wikimedia.org/P18548 and previous config saved to /var/cache/conftool/dbconfig/20220111-103927-marostegui.json
[10:39:30] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
[10:39:30] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[10:39:31] <stashbot>	 T297191: Schema change for dropping page_restrictions.pr_user field on wmf sites - https://phabricator.wikimedia.org/T297191
[10:39:32] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
[10:39:32] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[10:39:33] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
[10:39:34] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[10:39:35] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[10:39:36] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
[10:39:38] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[10:39:41] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Depooling db1158 (T297191)', diff saved to https://phabricator.wikimedia.org/P18549 and previous config saved to /var/cache/conftool/dbconfig/20220111-103941-marostegui.json
[10:39:43] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[10:40:10] <wikibugs>	 (03PS1) 10Muehlenhoff: Fixup bigtop sync and repository section [puppet] - 10https://gerrit.wikimedia.org/r/753008
[10:40:34] <wikibugs>	 (03CR) 10Btullis: "Looks OK to me." [puppet] - 10https://gerrit.wikimedia.org/r/751100 (https://phabricator.wikimedia.org/T292389) (owner: 10Majavah)
[10:41:16] <wikibugs>	 (03CR) 10Btullis: [C: 03+2] Mark reedy as kerberos enabled [puppet] - 10https://gerrit.wikimedia.org/r/753007 (https://phabricator.wikimedia.org/T298951) (owner: 10Btullis)
[10:42:07] <wikibugs>	 (03CR) 10Muehlenhoff: [C: 03+2] Fixup bigtop sync and repository section [puppet] - 10https://gerrit.wikimedia.org/r/753008 (owner: 10Muehlenhoff)
[10:44:02] <wikibugs>	 (03CR) 10Btullis: [V: 03+1] "PCC SUCCESS (DIFF 2 NOOP 1): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/33182/console" [puppet] - 10https://gerrit.wikimedia.org/r/752673 (https://phabricator.wikimedia.org/T297468) (owner: 10Btullis)
[10:46:31] <wikibugs>	 (03CR) 10Majavah: kerberos: manage users with custom puppet type (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/751100 (https://phabricator.wikimedia.org/T292389) (owner: 10Majavah)
[10:46:55] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1158 (T297191)', diff saved to https://phabricator.wikimedia.org/P18550 and previous config saved to /var/cache/conftool/dbconfig/20220111-104654-marostegui.json
[10:46:57] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[10:46:58] <stashbot>	 T297191: Schema change for dropping page_restrictions.pr_user field on wmf sites - https://phabricator.wikimedia.org/T297191
[10:48:19] <wikibugs>	 (03PS1) 10David Caro: wmcs::db: remove used roles and profiles [puppet] - 10https://gerrit.wikimedia.org/r/753010 (https://phabricator.wikimedia.org/T272559)
[10:51:00] <wikibugs>	 10SRE, 10ops-eqiad, 10DC-Ops, 10Infrastructure-Foundations, 10netops: Q2:(Need By: TBD) Rows E/F network racking task - https://phabricator.wikimedia.org/T292095 (10cmooney) >  Only exception is the CR links which will be LC-SC to land on the patch panel.  I should clarify that if we pre-cable the patch...
[10:53:18] <wikibugs>	 (03CR) 10David Caro: "Somehow I forgot to send this patch before xd" [puppet] - 10https://gerrit.wikimedia.org/r/753010 (https://phabricator.wikimedia.org/T272559) (owner: 10David Caro)
[11:00:28] <wikibugs>	 10SRE, 10DC-Ops: Confirm support of PERC 750 raid controller - https://phabricator.wikimedia.org/T297913 (10MoritzMuehlenhoff) With the amount of information provided by Dell, we can't reliably tell. PERC controllers are rebranded Broadcom controllers, but there's no statement to which Broadcom controller PERC...
[11:01:59] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P18551 and previous config saved to /var/cache/conftool/dbconfig/20220111-110159-marostegui.json
[11:02:01] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:04:28] <wikibugs>	 10SRE, 10DC-Ops: Confirm support of PERC 750 raid controller - https://phabricator.wikimedia.org/T297913 (10Marostegui) Thanks @MoritzMuehlenhoff - if this is only available from Bullseye, I think that's fine from the DB point of view. We are almost finishing our Bullseye testing and I nothing changes dramatic...
[11:14:27] <icinga-wm>	 PROBLEM - SSH on restbase2010.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[11:16:00] <wikibugs>	 (03CR) 10Muehlenhoff: [C: 03+2] Extend logstash Cumin alias with new Opensearch roles [puppet] - 10https://gerrit.wikimedia.org/r/752631 (owner: 10Muehlenhoff)
[11:16:26] <wikibugs>	 (03PS2) 10Hnowlan: maps: Install s3 client cli/lib [puppet] - 10https://gerrit.wikimedia.org/r/746929 (owner: 10Jgiannelos)
[11:17:04] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P18553 and previous config saved to /var/cache/conftool/dbconfig/20220111-111704-marostegui.json
[11:17:06] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:19:35] <icinga-wm>	 PROBLEM - restbase endpoints health on restbase2017 is CRITICAL: /en.wikipedia.org/v1/page/talk/{title} (Get structured talk page for enwiki Salt article) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase
[11:19:39] <icinga-wm>	 PROBLEM - Restbase edge eqsin on text-lb.eqsin.wikimedia.org is CRITICAL: /api/rest_v1/page/talk/{title} (Get structured talk page for enwiki Salt article) timed out before a response was received https://wikitech.wikimedia.org/wiki/RESTBase
[11:20:15] <icinga-wm>	 PROBLEM - restbase endpoints health on restbase2024 is CRITICAL: /en.wikipedia.org/v1/page/talk/{title} (Get structured talk page for enwiki Salt article) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase
[11:20:17] <icinga-wm>	 PROBLEM - restbase endpoints health on restbase2012 is CRITICAL: /en.wikipedia.org/v1/page/talk/{title} (Get structured talk page for enwiki Salt article) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase
[11:20:34] <logmsgbot>	 !log jmm@cumin2002 START - Cookbook sre.hosts.reboot-single for host ganeti2023.codfw.wmnet
[11:20:35] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:21:37] <icinga-wm>	 RECOVERY - restbase endpoints health on restbase2012 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase
[11:21:37] <icinga-wm>	 RECOVERY - restbase endpoints health on restbase2024 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase
[11:22:41] <wikibugs>	 (03CR) 10Ssingh: [C: 03+2] hieradata: add durum cluster [puppet] - 10https://gerrit.wikimedia.org/r/752146 (owner: 10Ssingh)
[11:23:11] <wikibugs>	 10SRE-swift-storage, 10MW-on-K8s, 10Shellbox, 10serviceops: Support large files in Shellbox - https://phabricator.wikimedia.org/T292322 (10Joe) For the record, I'm taking care of this release, and given I am annoyed at how we manage image versions for shellbox, I'm also slightly modifying the procedure. I'...
[11:23:45] <icinga-wm>	 PROBLEM - restbase endpoints health on restbase2011 is CRITICAL: /en.wikipedia.org/v1/page/talk/{title} (Get structured talk page for enwiki Salt article) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase
[11:25:15] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'db1104 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P18554 and previous config saved to /var/cache/conftool/dbconfig/20220111-112514-root.json
[11:25:16] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:25:26] <logmsgbot>	 !log jmm@cumin2002 END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2023.codfw.wmnet
[11:25:27] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:26:13] <icinga-wm>	 RECOVERY - restbase endpoints health on restbase2017 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase
[11:26:57] <icinga-wm>	 RECOVERY - restbase endpoints health on restbase2011 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/restbase
[11:27:22] <wikibugs>	 (03PS2) 10Cathal Mooney: admins: add jvargas to ldap_only_admins, added to wmf group [puppet] - 10https://gerrit.wikimedia.org/r/752725 (https://phabricator.wikimedia.org/T298719) (owner: 10Dzahn)
[11:28:13] <wikibugs>	 (03CR) 10Cathal Mooney: [C: 03+2] "Thanks Daniel!  Looks good to me." [puppet] - 10https://gerrit.wikimedia.org/r/752725 (https://phabricator.wikimedia.org/T298719) (owner: 10Dzahn)
[11:30:02] <wikibugs>	 10SRE, 10Infrastructure-Foundations: Migrate codfw Ganeti cluster to Buster - https://phabricator.wikimedia.org/T296622 (10MoritzMuehlenhoff)
[11:32:00] <wikibugs>	 (03CR) 10Hnowlan: [C: 03+2] maps: Install s3 client cli/lib [puppet] - 10https://gerrit.wikimedia.org/r/746929 (owner: 10Jgiannelos)
[11:32:09] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1158 (T297191)', diff saved to https://phabricator.wikimedia.org/P18555 and previous config saved to /var/cache/conftool/dbconfig/20220111-113208-marostegui.json
[11:32:10] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
[11:32:11] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:32:12] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
[11:32:12] <stashbot>	 T297191: Schema change for dropping page_restrictions.pr_user field on wmf sites - https://phabricator.wikimedia.org/T297191
[11:32:13] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:32:14] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:32:16] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Depooling db1181 (T297191)', diff saved to https://phabricator.wikimedia.org/P18556 and previous config saved to /var/cache/conftool/dbconfig/20220111-113216-marostegui.json
[11:32:18] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:33:08] <wikibugs>	 10SRE, 10LDAP-Access-Requests, 10Patch-For-Review: Grant Access to <Idap/wmf> for <JVargas> - https://phabricator.wikimedia.org/T298719 (10cmooney) a:03cmooney Thanks Daniel for all the work on this, patch is now merged.  @JVargas is all good from your point of view?  Otherwise I will proceed to close this...
[11:35:13] <icinga-wm>	 RECOVERY - Restbase edge eqsin on text-lb.eqsin.wikimedia.org is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/RESTBase
[11:35:35] <Kizule>	 Hello, I have an issue with Gerrit, again.
[11:35:46] <logmsgbot>	 !log jmm@cumin2002 START - Cookbook sre.hosts.reboot-single for host ganeti2019.codfw.wmnet
[11:35:47] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:35:57] <Kizule>	 On git fetch for mediawiki/core.
[11:35:59] <Kizule>	 fetch-pack: unexpected disconnect while reading sideband packet
[11:35:59] <Kizule>	 fatal: early EOF
[11:35:59] <Kizule>	 Connection to gerrit.wikimedia.org closed by remote host.
[11:35:59] <Kizule>	 fatal: fetch-pack: invalid index-pack output
[11:36:29] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1181 (T297191)', diff saved to https://phabricator.wikimedia.org/P18557 and previous config saved to /var/cache/conftool/dbconfig/20220111-113628-marostegui.json
[11:36:31] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:40:19] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'db1104 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P18558 and previous config saved to /var/cache/conftool/dbconfig/20220111-114018-root.json
[11:40:20] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:41:22] <logmsgbot>	 !log jmm@cumin2002 END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2019.codfw.wmnet
[11:41:24] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:46:03] <icinga-wm>	 RECOVERY - SSH on mw2252.mgmt is OK: SSH OK - OpenSSH_7.0 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[11:47:30] <wikibugs>	 10SRE, 10Infrastructure-Foundations: Migrate codfw Ganeti cluster to Buster - https://phabricator.wikimedia.org/T296622 (10MoritzMuehlenhoff)
[11:51:34] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P18559 and previous config saved to /var/cache/conftool/dbconfig/20220111-115133-marostegui.json
[11:51:35] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:53:26] <wikibugs>	 (03PS1) 10Awight: Allow aliases to be integers in addition to strings [extensions/TemplateData] (wmf/1.38.0-wmf.17) - 10https://gerrit.wikimedia.org/r/752775 (https://phabricator.wikimedia.org/T298795)
[11:55:22] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'db1104 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P18560 and previous config saved to /var/cache/conftool/dbconfig/20220111-115522-root.json
[11:55:23] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:56:02] <wikibugs>	 (03PS1) 10Jbond: profile::installserver::proxy: update suiqd template [puppet] - 10https://gerrit.wikimedia.org/r/753016 (https://phabricator.wikimedia.org/T298087)
[11:56:04] <moritzm>	 !log rebalance ganeti row A (all nodes reimaged to Buster)
[11:56:05] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:56:38] <wikibugs>	 (03CR) 10jerkins-bot: [V: 04-1] profile::installserver::proxy: update suiqd template [puppet] - 10https://gerrit.wikimedia.org/r/753016 (https://phabricator.wikimedia.org/T298087) (owner: 10Jbond)
[11:58:21] <wikibugs>	 (03PS2) 10Jbond: profile::installserver::proxy: update suiqd template [puppet] - 10https://gerrit.wikimedia.org/r/753016 (https://phabricator.wikimedia.org/T298087)
[11:59:01] <wikibugs>	 (03CR) 10Jbond: [V: 03+1] "PCC SUCCESS (DIFF 1): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/33184/console" [puppet] - 10https://gerrit.wikimedia.org/r/753016 (https://phabricator.wikimedia.org/T298087) (owner: 10Jbond)
[11:59:24] <wikibugs>	 10SRE, 10LDAP-Access-Requests, 10Patch-For-Review: Grant Access to <Idap/wmf> for <JVargas> - https://phabricator.wikimedia.org/T298719 (10Aklapper) IIUC this isn't complete yet per items in https://wikitech.wikimedia.org/wiki/SRE/Clinic_Duty/Access_requests#LDAP_access
[12:00:04] <jouncebot>	 Amir1, Lucas_WMDE, awight, and Urbanecm: I seem to be stuck in Groundhog week. Sigh. Time for (yet another) UTC morning backport window deploy. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20220111T1200).
[12:00:04] <jouncebot>	 cormacparle and matthiasmullie: A patch you scheduled for UTC morning backport window is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker.
[12:00:12] * cormacparle waves
[12:00:12] <wikibugs>	 (03PS2) 10Arturo Borrero Gonzalez: wmcs: toolforge: grid: add cookbooks to create each node type [cookbooks] (wmcs) - 10https://gerrit.wikimedia.org/r/753006 (https://phabricator.wikimedia.org/T298948)
[12:00:14] <wikibugs>	 (03PS1) 10Arturo Borrero Gonzalez: wmcs: toolforge: grid: factorized node creation cookbook [cookbooks] (wmcs) - 10https://gerrit.wikimedia.org/r/753017 (https://phabricator.wikimedia.org/T298948)
[12:00:16] <Lucas_WMDE>	 o/
[12:00:16] <wikibugs>	 (03PS1) 10Arturo Borrero Gonzalez: wmcs: relocate start_instance_with_prefix cookbook [cookbooks] (wmcs) - 10https://gerrit.wikimedia.org/r/753018 (https://phabricator.wikimedia.org/T298948)
[12:00:20] * urbanecm waves
[12:00:24] <Lucas_WMDE>	 the first backport has i18n changes, I’m not sure how to deploy that
[12:00:30] <wikibugs>	 (03PS3) 10Jbond: profile::installserver::proxy: update suiqd template [puppet] - 10https://gerrit.wikimedia.org/r/753016 (https://phabricator.wikimedia.org/T298087)
[12:00:32] <moritzm>	 !log reverting kubetcd2004.codfw.wmnet back to "plain" storage
[12:00:34] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[12:00:46] <urbanecm>	 Lucas_WMDE: you'd need scap sync-world, which we normally try to avoid
[12:00:55] <Lucas_WMDE>	 that’s what I feared
[12:01:07] <wikibugs>	 (03CR) 10Jbond: [V: 03+1] "PCC SUCCESS (DIFF 1): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/33185/console" [puppet] - 10https://gerrit.wikimedia.org/r/753016 (https://phabricator.wikimedia.org/T298087) (owner: 10Jbond)
[12:01:09] <urbanecm>	 cormacparle: matthiasmullie: are the i18n backports really necessary? They take significant time to be done, so that's why I'm asking
[12:01:18] <urbanecm>	 (we can still do it if urgent, but I'd like to know the answer to "why")
[12:01:50] <urbanecm>	 and to avoid confusion...
[12:01:53] <urbanecm>	 I can deploy today
[12:02:01] <cormacparle>	 we're migrating a preference, and what used to be a checkbox is now a dropdown
[12:02:12] <cormacparle>	 so yeah we kinda do need the i18n change
[12:02:27] <cormacparle>	 and it needs to be done in a backport so we can run the maint script to do the migration
[12:02:37] <urbanecm>	 sounds like a good enough reason to me
[12:02:47] <wikibugs>	 (03CR) 10Urbanecm: [C: 03+2] Update the way the search interface is set [extensions/MediaSearch] (wmf/1.38.0-wmf.16) - 10https://gerrit.wikimedia.org/r/751836 (https://phabricator.wikimedia.org/T297484) (owner: 10Cparle)
[12:03:00] <wikibugs>	 (03CR) 10Urbanecm: [C: 03+2] Updated maint script to use fewer queries [extensions/MediaSearch] (wmf/1.38.0-wmf.16) - 10https://gerrit.wikimedia.org/r/752701 (https://phabricator.wikimedia.org/T297484) (owner: 10Cparle)
[12:03:06] <wikibugs>	 (03CR) 10Awight: [V: 03+1 C: 03+1] "Works locally." [extensions/TemplateData] (wmf/1.38.0-wmf.17) - 10https://gerrit.wikimedia.org/r/752775 (https://phabricator.wikimedia.org/T298795) (owner: 10Awight)
[12:03:12] <icinga-wm>	 PROBLEM - SSH on restbase2011.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[12:03:20] <icinga-wm>	 PROBLEM - Host kubetcd2004 is DOWN: PING CRITICAL - Packet loss = 100%
[12:03:35] <urbanecm>	 cormacparle: does the config depend on the backports in some way?
[12:03:48] <cormacparle>	 nope, different thing
[12:04:29] <cormacparle>	 I mean - the config change is for a completely different issue
[12:04:35] <urbanecm>	 cormacparle: i see. I also see you're a deployer -- want to do the config yourself?
[12:04:44] <cormacparle>	 sure
[12:05:00] <urbanecm>	 go ahead then :)
[12:05:04] <cormacparle>	 the 2 backport changes are in a chain btw, so there's only one sync required
[12:05:13] <cormacparle>	 cool, doing that now
[12:05:41] <urbanecm>	 cormacparle: yeah. Those will need special treatment because of the i18n changes, so we'd need to sync everything via the sync-world command
[12:05:56] <wikibugs>	 (03PS1) 10Elukey: aptrepo: change settings for the Bigtop repository [puppet] - 10https://gerrit.wikimedia.org/r/753019
[12:06:06] <cormacparle>	 sorry :(
[12:06:30] <urbanecm>	 not your fault :)
[12:06:38] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P18561 and previous config saved to /var/cache/conftool/dbconfig/20220111-120638-marostegui.json
[12:06:40] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[12:07:41] <wikibugs>	 (03CR) 10Cparle: [C: 03+2] Enable support for references [mediawiki-config] - 10https://gerrit.wikimedia.org/r/752599 (https://phabricator.wikimedia.org/T230315) (owner: 10Matthias Mullie)
[12:07:45] <urbanecm>	 cormacparle: ping me if you need any help with the config patch deployment.
[12:08:00] <cormacparle>	 will do thanks urbanecm 
[12:08:42] <wikibugs>	 (03Merged) 10jenkins-bot: Enable support for references [mediawiki-config] - 10https://gerrit.wikimedia.org/r/752599 (https://phabricator.wikimedia.org/T230315) (owner: 10Matthias Mullie)
[12:09:51] <wikibugs>	 (03CR) 10Muehlenhoff: [C: 03+1] "Looks good!" [puppet] - 10https://gerrit.wikimedia.org/r/753019 (owner: 10Elukey)
[12:10:00] <wikibugs>	 (03CR) 10Elukey: [C: 03+2] aptrepo: change settings for the Bigtop repository [puppet] - 10https://gerrit.wikimedia.org/r/753019 (owner: 10Elukey)
[12:10:26] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'db1104 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P18562 and previous config saved to /var/cache/conftool/dbconfig/20220111-121025-root.json
[12:10:27] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
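The db1104 entries above show a host being repooled in four steps (25%, 50%, 75%, 100%) roughly 15 minutes apart. The following is a minimal sketch of how such a stepped schedule could be computed. The function name, its parameters and the even spacing are assumptions for illustration only; this is not the tool that produced the log lines.

```python
from datetime import datetime, timedelta


def repool_schedule(start, steps=4, interval_minutes=15):
    """Return (time, percentage) pairs for a repool done in equal steps.

    Hypothetical helper: the step count and interval are taken from the
    pattern visible in the log, not from any real tool's configuration.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    schedule = []
    for i in range(1, steps + 1):
        when = start + timedelta(minutes=interval_minutes * (i - 1))
        # Integer percentage of traffic the host receives after step i.
        schedule.append((when, 100 * i // steps))
    return schedule


if __name__ == "__main__":
    # Start time rounded to the minute of the first db1104 entry.
    for when, pct in repool_schedule(datetime(2022, 1, 11, 11, 25)):
        print(f"{when:%H:%M} pool at {pct}%")
```

Spacing the steps out gives the host time to warm up and leaves a chance to stop between steps if something looks wrong.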
[12:10:42] <wikibugs>	 (03PS4) 10Jbond: profile::installserver::proxy: update suiqd template [puppet] - 10https://gerrit.wikimedia.org/r/753016 (https://phabricator.wikimedia.org/T298087)
[12:11:31] <wikibugs>	 (03CR) 10Jbond: [V: 03+1] "PCC SUCCESS (DIFF 1): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/33186/console" [puppet] - 10https://gerrit.wikimedia.org/r/753016 (https://phabricator.wikimedia.org/T298087) (owner: 10Jbond)
[12:12:51] <wikibugs>	 (03CR) 10Jbond: [V: 03+1] profile::installserver::proxy: update suiqd template (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/753016 (https://phabricator.wikimedia.org/T298087) (owner: 10Jbond)
[12:13:12] <wikibugs>	 (03PS5) 10Jbond: profile::installserver::proxy: update suiqd template [puppet] - 10https://gerrit.wikimedia.org/r/753016 (https://phabricator.wikimedia.org/T298087)
[12:14:41] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
[12:14:42] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[12:14:45] <logmsgbot>	 !log jmm@cumin2002 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on kubetcd2004.codfw.wmnet with reason: switch to plain disk storage
[12:14:46] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[12:14:48] <logmsgbot>	 !log jmm@cumin2002 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on kubetcd2004.codfw.wmnet with reason: switch to plain disk storage
[12:14:49] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[12:15:26] <icinga-wm>	 RECOVERY - SSH on restbase2010.mgmt is OK: SSH OK - OpenSSH_7.0 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[12:15:49] <cormacparle>	 config change fine on debug, syncing now
[12:15:51] <logmsgbot>	 !log cparle@deploy1002 Synchronized wmf-config: Config: [[gerrit:752599|Enable support for references (T230315)]] (duration: 01m 00s)
[12:15:53] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[12:15:53] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
[12:15:54] <stashbot>	 T230315: [XL] Create a way to see and add references to structured data on Commons (MediaInfo) statements  - https://phabricator.wikimedia.org/T230315
[12:15:54] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
[12:15:55] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[12:15:56] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[12:16:12] <wikibugs>	 (03PS1) 10Giuseppe Lavagetto: shellbox: rationalize version handling, promote to 1.0 [deployment-charts] - 10https://gerrit.wikimedia.org/r/753020
[12:16:14] <wikibugs>	 (03PS1) 10Giuseppe Lavagetto: shellbox-*: promote to new build [deployment-charts] - 10https://gerrit.wikimedia.org/r/753021 (https://phabricator.wikimedia.org/T292322)
[12:16:42] <urbanecm>	 cormacparle: i take it that we're waiting at the CI now
[12:16:57] <urbanecm>	 will you want to try the backports too (via the sync-world command)?
[12:17:02] <urbanecm>	 (I'm happy to do it for you, just asking)
[12:17:04] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
[12:17:05] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[12:17:33] <cormacparle>	 happy to do it, but never used sync-world before
[12:17:45] <urbanecm>	 cormacparle: then i'll guide you :)
[12:17:53] <urbanecm>	 the start is very similar to normal deployments
[12:18:00] <urbanecm>	 (fetch to deployment, scap pull on a debug server)
[12:18:05] <wikibugs>	 10SRE, 10wikitech.wikimedia.org, 10Sustainability (Incident Followup), 10User-LSobanski: Incident response tools operational readiness review - https://phabricator.wikimedia.org/T290130 (10LSobanski) a:05LSobanski→03None
[12:18:22] <urbanecm>	 but, at the debug server, i18n changes probably will not work (you'll either see <message code> or the outdated message)
[12:18:42] <cormacparle>	 ok cool, will do that for a start anyway (config patch is now synced and seems fine)
[12:19:14] <urbanecm>	 once it merges and you tested it, ping me, and I'll share the rest :))
[12:19:35] <cormacparle>	 will do!
[12:20:54] <wikibugs>	 (03CR) 10Thiemo Kreuz (WMDE): [C: 03+1] Allow aliases to be integers in addition to strings [extensions/TemplateData] (wmf/1.38.0-wmf.17) - 10https://gerrit.wikimedia.org/r/752775 (https://phabricator.wikimedia.org/T298795) (owner: 10Awight)
[12:21:09] <wikibugs>	 (03Merged) 10jenkins-bot: Update the way the search interface is set [extensions/MediaSearch] (wmf/1.38.0-wmf.16) - 10https://gerrit.wikimedia.org/r/751836 (https://phabricator.wikimedia.org/T297484) (owner: 10Cparle)
[12:21:43] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1181 (T297191)', diff saved to https://phabricator.wikimedia.org/P18563 and previous config saved to /var/cache/conftool/dbconfig/20220111-122143-marostegui.json
[12:21:45] <wikibugs>	 (03Merged) 10jenkins-bot: Updated maint script to use fewer queries [extensions/MediaSearch] (wmf/1.38.0-wmf.16) - 10https://gerrit.wikimedia.org/r/752701 (https://phabricator.wikimedia.org/T297484) (owner: 10Cparle)
[12:21:46] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[12:21:47] <stashbot>	 T297191: Schema change for dropping page_restrictions.pr_user field on wmf sites - https://phabricator.wikimedia.org/T297191
[12:21:53] <taavi>	 urbanecm: cormacparle: hi, can you please ping me when done deploying? I have a few patches of my own
[12:22:02] <cormacparle>	 sure
[12:22:09] <urbanecm>	 taavi: sure :)
[12:22:14] <urbanecm>	 (also, hi)
[12:26:41] <urbanecm>	 cormacparle: how are the tests going? 🙂
[12:26:53] <matthiasmullie>	 cormacparle: since there's a maint script that needs to be run, I guess you may want to do that before sync-world as well?
[12:27:19] <cormacparle>	 oh 
[12:27:21] <cormacparle>	 yes indeed
[12:27:52] <cormacparle>	 erm ...
[12:27:54] <urbanecm>	 do note that sync-world can take up to 40 minutes to complete (normally it finishes within 20)
[12:28:01] <cormacparle>	 https://www.irccloud.com/pastebin/LBdeJADJ/
[12:28:11] <wikibugs>	 (03PS1) 10Jelto: deployment_server,::helm: remove helm2 support [puppet] - 10https://gerrit.wikimedia.org/r/753026 (https://phabricator.wikimedia.org/T251305)
[12:28:14] <cormacparle>	 there are new commits in other extensions ...
[12:28:19] <cormacparle>	 that's not what I expected
[12:28:20] <urbanecm>	 cormacparle: those are security patches
[12:28:23] <urbanecm>	 ignore it 
[12:28:26] <cormacparle>	 kk cool
[12:28:47] <wikibugs>	 10SRE, 10serviceops, 10Kubernetes, 10Patch-For-Review: Migrate to helm v3 - https://phabricator.wikimedia.org/T251305 (10Jelto)
[12:29:15] <urbanecm>	 the runtime of sync-world means there might be nearly an hour during which the code needs to support both versions (old and new)
[12:29:24] <urbanecm>	 is that...fine? cormacparle 
[12:29:31] <cormacparle>	 yep
[12:29:50] <urbanecm>	 okay, good
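The exchange above notes that while sync-world runs, servers on the old and the new code coexist, so the code has to cope with both forms of the migrated preference (formerly a checkbox, now a dropdown). A minimal sketch of a reader that tolerates both forms could look like this. The option names and the stored values are hypothetical and are not taken from the actual extension.

```python
# Hypothetical dropdown options; the real option names are not in the log.
NEW_OPTIONS = {"new", "legacy"}


def read_interface_preference(raw_value, default="new"):
    """Interpret a preference stored in either the old or the new form.

    Assumed old form: a checkbox stored as a flag (True/False, "1"/"0").
    Assumed new form: a dropdown stored as one of NEW_OPTIONS.
    """
    if raw_value is None:
        return default
    # New form: already a valid dropdown option.
    if isinstance(raw_value, str) and raw_value in NEW_OPTIONS:
        return raw_value
    # Old form: a ticked checkbox maps to the hypothetical "legacy" option.
    if raw_value in (True, "1"):
        return "legacy"
    if raw_value in (False, "0", ""):
        return "new"
    # Unknown value: fall back rather than fail during the rollout.
    return default


if __name__ == "__main__":
    for value in (None, "1", "0", "legacy", "unexpected"):
        print(repr(value), "->", read_interface_preference(value))
```

Once the maintenance script has migrated every stored value and the sync has finished, the old-form branches can be removed.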
[12:32:29] <wikibugs>	 10SRE, 10SRE-Access-Requests: Requesting access to Data Engineering team resources for Sandra Ebele Nwachukwu - https://phabricator.wikimedia.org/T298786 (10cmooney) p:05Triage→03Medium a:03cmooney
[12:35:35] <wikibugs>	 (03PS2) 10Arturo Borrero Gonzalez: wmcs: toolforge: grid: factorized node creation cookbook [cookbooks] (wmcs) - 10https://gerrit.wikimedia.org/r/753017 (https://phabricator.wikimedia.org/T298948)
[12:35:37] <wikibugs>	 (03PS3) 10Arturo Borrero Gonzalez: wmcs: toolforge: grid: add cookbooks to create each node type [cookbooks] (wmcs) - 10https://gerrit.wikimedia.org/r/753006 (https://phabricator.wikimedia.org/T298948)
[12:35:39] <wikibugs>	 (03PS1) 10Arturo Borrero Gonzalez: wmcs: toolforge: relocate some node-specific cookbooks [cookbooks] (wmcs) - 10https://gerrit.wikimedia.org/r/753027 (https://phabricator.wikimedia.org/T298948)
[12:36:57] <wikibugs>	 (03CR) 10Ayounsi: "Thanks!" [puppet] - 10https://gerrit.wikimedia.org/r/753016 (https://phabricator.wikimedia.org/T298087) (owner: 10Jbond)
[12:37:12] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
[12:37:14] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[12:37:49] <wikibugs>	 (03PS1) 10Jbond: P:installserver::proxy: Add domain whitelist to proxy [puppet] - 10https://gerrit.wikimedia.org/r/753029 (https://phabricator.wikimedia.org/T298087)
[12:38:22] <wikibugs>	 (03PS2) 10Jbond: P:installserver::proxy: Add domain whitelist to proxy [puppet] - 10https://gerrit.wikimedia.org/r/753029 (https://phabricator.wikimedia.org/T298087)
[12:38:44] <wikibugs>	 (03PS2) 10Jelto: deployment_server,::helm: remove helm2 support [puppet] - 10https://gerrit.wikimedia.org/r/753026 (https://phabricator.wikimedia.org/T251305)
[12:39:25] <wikibugs>	 (03PS3) 10Jbond: P:installserver::proxy: Add domain whitelist to proxy [puppet] - 10https://gerrit.wikimedia.org/r/753029 (https://phabricator.wikimedia.org/T298087)
[12:40:10] <wikibugs>	 (03CR) 10Jbond: [V: 03+1] "PCC SUCCESS (DIFF 1): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/33189/console" [puppet] - 10https://gerrit.wikimedia.org/r/753029 (https://phabricator.wikimedia.org/T298087) (owner: 10Jbond)
[12:41:05] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
[12:41:06] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
[12:41:06] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[12:41:07] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[12:41:15] <icinga-wm>	 PROBLEM - k8s API server requests latencies on ml-serve-ctrl1001 is CRITICAL: instance=10.64.16.202 verb=PATCH https://wikitech.wikimedia.org/wiki/Kubernetes https://grafana.wikimedia.org/dashboard/db/kubernetes-api?viewPanel=27
[12:41:32] <wikibugs>	 (03CR) 10Jbond: [C: 04-1] "This is an example change applying a whitelist to the proxy, going to -1 this for now just to make sure it doesn't get accidentally merged " [puppet] - 10https://gerrit.wikimedia.org/r/753029 (https://phabricator.wikimedia.org/T298087) (owner: 10Jbond)
[12:42:08] <wikibugs>	 (03CR) 10Jelto: [V: 03+1] "PCC SUCCESS (DIFF 1): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/33190/console" [puppet] - 10https://gerrit.wikimedia.org/r/753026 (https://phabricator.wikimedia.org/T251305) (owner: 10Jelto)
[12:44:36] <wikibugs>	 (03PS6) 10Jbond: profile::installserver::proxy: update suiqd template [puppet] - 10https://gerrit.wikimedia.org/r/753016 (https://phabricator.wikimedia.org/T298087)
[12:44:55] <wikibugs>	 (03CR) 10Jbond: profile::installserver::proxy: update suiqd template (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/753016 (https://phabricator.wikimedia.org/T298087) (owner: 10Jbond)
[12:44:57] <icinga-wm>	 RECOVERY - k8s API server requests latencies on ml-serve-ctrl1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Kubernetes https://grafana.wikimedia.org/dashboard/db/kubernetes-api?viewPanel=27
[12:45:08] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
[12:45:09] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[12:46:13] <wikibugs>	 (03PS4) 10Jbond: P:installserver::proxy: Add domain whitelist to proxy [puppet] - 10https://gerrit.wikimedia.org/r/753029 (https://phabricator.wikimedia.org/T298087)
[12:47:03] <wikibugs>	 (03CR) 10EllenR: [C: 03+1] "LGTM" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/752708 (https://phabricator.wikimedia.org/T297623) (owner: 10Eigyan)
[12:49:55] <urbanecm>	 cormacparle: how's the testing going? :))
[12:50:23] <cormacparle>	 problem with the maint script :(
[12:51:03] <urbanecm>	 cormacparle: which kind of a problem?
[12:51:25] <cormacparle>	 can't write the data - there are already duplicates in the db we didn't know about
[12:51:30] <cormacparle>	 I think we'll have to revert both of those patches, because without the maint script they break the interface
[12:51:34] <urbanecm>	 then we need to revert
[12:51:46] <cormacparle>	 yeah
[12:51:54] <wikibugs>	 (03PS1) 10Urbanecm: Revert "Updated maint script to use fewer queries" [extensions/MediaSearch] (wmf/1.38.0-wmf.16) - 10https://gerrit.wikimedia.org/r/752776 (https://phabricator.wikimedia.org/T297484)
[12:52:01] <wikibugs>	 (03CR) 10Urbanecm: [V: 03+2 C: 03+2] "revert" [extensions/MediaSearch] (wmf/1.38.0-wmf.16) - 10https://gerrit.wikimedia.org/r/752776 (https://phabricator.wikimedia.org/T297484) (owner: 10Urbanecm)
[12:52:08] <wikibugs>	 (03PS1) 10Urbanecm: Revert "Update the way the search interface is set" [extensions/MediaSearch] (wmf/1.38.0-wmf.16) - 10https://gerrit.wikimedia.org/r/752777 (https://phabricator.wikimedia.org/T297484)
[12:52:16] <wikibugs>	 (03CR) 10Urbanecm: [V: 03+2 C: 03+2] "revert" [extensions/MediaSearch] (wmf/1.38.0-wmf.16) - 10https://gerrit.wikimedia.org/r/752777 (https://phabricator.wikimedia.org/T297484) (owner: 10Urbanecm)
[12:52:29] <urbanecm>	 cormacparle: done
[12:52:35] <urbanecm>	 (config left live, as you said it's a different thing)
[12:52:44] <cormacparle>	 yes config is fine
[12:52:59] <cormacparle>	 excellent thanks very much! we'll try again tomorrow
[12:53:17] <icinga-wm>	 PROBLEM - SSH on contint1001.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[12:53:20] <urbanecm>	 good luck in resolving the data inconsistency issue :)
[12:53:39] <urbanecm>	 DB migrations are one of the things that no one notices if done correctly and everyone notices if an error happens :/
[12:54:00] <cormacparle>	 haha indeed!
[12:58:57] <wikibugs>	 10SRE-OnFire (FY2021/2022-Q3), 10SRE Observability (FY2021/2022-Q3): incidents occurring during Q2 and Q3 have been scored with the scorecard - https://phabricator.wikimedia.org/T292254 (10lmata)
[12:59:56] <logmsgbot>	 !log jmm@cumin2002 START - Cookbook sre.ganeti.reboot-vm for VM planet1002.eqiad.wmnet
[12:59:57] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:04:07] <logmsgbot>	 !log jmm@cumin2002 END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM planet1002.eqiad.wmnet
[13:04:08] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:04:23] <icinga-wm>	 RECOVERY - SSH on restbase2011.mgmt is OK: SSH OK - OpenSSH_7.0 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[13:05:24] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
[13:05:25] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:07:58] <logmsgbot>	 !log jmm@cumin2002 START - Cookbook sre.ganeti.reboot-vm for VM people1003.eqiad.wmnet
[13:07:59] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:09:06] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
[13:09:07] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:09:08] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
[13:09:09] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:11:44] <logmsgbot>	 !log jmm@cumin2002 END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM people1003.eqiad.wmnet
[13:11:46] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:12:49] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
[13:12:50] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:17:49] <wikibugs>	 (03CR) 10Hnowlan: [C: 03+2] Disable tilerator in all envs maps are deployed [puppet] - 10https://gerrit.wikimedia.org/r/752145 (https://phabricator.wikimedia.org/T298246) (owner: 10Jgiannelos)
[13:24:01] <wikibugs>	 (03PS3) 10Arturo Borrero Gonzalez: wmcs: toolforge: grid: factorized node creation cookbook [cookbooks] (wmcs) - 10https://gerrit.wikimedia.org/r/753017 (https://phabricator.wikimedia.org/T298948)
[13:24:03] <wikibugs>	 (03PS4) 10Arturo Borrero Gonzalez: wmcs: toolforge: grid: add cookbooks to create each node type [cookbooks] (wmcs) - 10https://gerrit.wikimedia.org/r/753006 (https://phabricator.wikimedia.org/T298948)
[13:24:05] <wikibugs>	 (03PS2) 10Arturo Borrero Gonzalez: wmcs: toolforge: relocate some node-specific cookbooks [cookbooks] (wmcs) - 10https://gerrit.wikimedia.org/r/753027 (https://phabricator.wikimedia.org/T298948)
[13:24:07] <wikibugs>	 (03PS2) 10Arturo Borrero Gonzalez: wmcs: relocate start_instance_with_prefix cookbook [cookbooks] (wmcs) - 10https://gerrit.wikimedia.org/r/753018 (https://phabricator.wikimedia.org/T298948)
[13:24:17] <wikibugs>	 (03CR) 10Btullis: [V: 03+1] Exclude log4j_extras from the classpath for coordinators (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/752673 (https://phabricator.wikimedia.org/T297468) (owner: 10Btullis)
[13:24:27] <icinga-wm>	 ACKNOWLEDGEMENT - Check systemd state on maps1006 is CRITICAL: CRITICAL - degraded: The following units failed: tilerator.service Hnowlan tilerator disabled intentionally https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[13:24:27] <icinga-wm>	 ACKNOWLEDGEMENT - tilerator on maps1006 is CRITICAL: connect to address 10.64.0.18 and port 6534: Connection refused Hnowlan tilerator disabled intentionally https://wikitech.wikimedia.org/wiki/Services/Monitoring/tilerator
[13:24:27] <icinga-wm>	 ACKNOWLEDGEMENT - Check systemd state on maps1008 is CRITICAL: CRITICAL - degraded: The following units failed: tilerator.service Hnowlan tilerator disabled intentionally https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[13:24:27] <icinga-wm>	 ACKNOWLEDGEMENT - tilerator on maps1008 is CRITICAL: connect to address 10.64.16.27 and port 6534: Connection refused Hnowlan tilerator disabled intentionally https://wikitech.wikimedia.org/wiki/Services/Monitoring/tilerator
[13:24:28] <icinga-wm>	 ACKNOWLEDGEMENT - Check systemd state on maps2009 is CRITICAL: CRITICAL - degraded: The following units failed: tilerator.service Hnowlan tilerator disabled intentionally https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[13:24:28] <icinga-wm>	 ACKNOWLEDGEMENT - tilerator on maps2009 is CRITICAL: connect to address 10.192.16.107 and port 6534: Connection refused Hnowlan tilerator disabled intentionally https://wikitech.wikimedia.org/wiki/Services/Monitoring/tilerator
[13:26:08] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
[13:26:09] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:26:10] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
[13:26:11] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:26:13] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
[13:26:14] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:26:14] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
[13:26:15] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:26:17] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
[13:26:18] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:26:19] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
[13:26:20] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:26:22] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
[13:26:22] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:26:23] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
[13:26:25] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:26:28] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Depooling db1147 (T297191)', diff saved to https://phabricator.wikimedia.org/P18564 and previous config saved to /var/cache/conftool/dbconfig/20220111-132627-marostegui.json
[13:26:30] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:26:30] <stashbot>	 T297191: Schema change for dropping page_restrictions.pr_user field on wmf sites - https://phabricator.wikimedia.org/T297191
[13:27:29] <icinga-wm>	 ACKNOWLEDGEMENT - Check systemd state on maps1005 is CRITICAL: CRITICAL - degraded: The following units failed: tilerator.service Hnowlan tilerator disabled intentionally https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[13:27:29] <icinga-wm>	 ACKNOWLEDGEMENT - tilerator on maps1005 is CRITICAL: connect to address 10.64.0.12 and port 6534: Connection refused Hnowlan tilerator disabled intentionally https://wikitech.wikimedia.org/wiki/Services/Monitoring/tilerator
[13:27:30] <icinga-wm>	 ACKNOWLEDGEMENT - Check systemd state on maps1007 is CRITICAL: CRITICAL - degraded: The following units failed: tilerator.service Hnowlan tilerator disabled intentionally https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[13:27:30] <icinga-wm>	 ACKNOWLEDGEMENT - tilerator on maps1007 is CRITICAL: connect to address 10.64.16.6 and port 6534: Connection refused Hnowlan tilerator disabled intentionally https://wikitech.wikimedia.org/wiki/Services/Monitoring/tilerator
[13:27:35] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1147 (T297191)', diff saved to https://phabricator.wikimedia.org/P18565 and previous config saved to /var/cache/conftool/dbconfig/20220111-132734-marostegui.json
[13:27:37] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:27:51] <wikibugs>	 (03PS4) 10Arturo Borrero Gonzalez: wmcs: toolforge: grid: factorized node creation cookbook [cookbooks] (wmcs) - 10https://gerrit.wikimedia.org/r/753017 (https://phabricator.wikimedia.org/T298948)
[13:27:53] <wikibugs>	 (03PS5) 10Arturo Borrero Gonzalez: wmcs: toolforge: grid: add cookbooks to create each node type [cookbooks] (wmcs) - 10https://gerrit.wikimedia.org/r/753006 (https://phabricator.wikimedia.org/T298948)
[13:27:55] <wikibugs>	 (03PS3) 10Arturo Borrero Gonzalez: wmcs: toolforge: relocate some node-specific cookbooks [cookbooks] (wmcs) - 10https://gerrit.wikimedia.org/r/753027 (https://phabricator.wikimedia.org/T298948)
[13:27:59] <wikibugs>	 (03PS3) 10Arturo Borrero Gonzalez: wmcs: relocate start_instance_with_prefix cookbook [cookbooks] (wmcs) - 10https://gerrit.wikimedia.org/r/753018 (https://phabricator.wikimedia.org/T298948)
[13:29:19] <icinga-wm>	 ACKNOWLEDGEMENT - Check systemd state on maps1010 is CRITICAL: CRITICAL - degraded: The following units failed: tilerator.service Hnowlan Tilerator disabled - T298246 https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[13:29:19] <icinga-wm>	 ACKNOWLEDGEMENT - tilerator on maps1010 is CRITICAL: connect to address 10.64.48.6 and port 6534: Connection refused Hnowlan Tilerator disabled - T298246 https://wikitech.wikimedia.org/wiki/Services/Monitoring/tilerator
[13:29:56] <logmsgbot>	 !log btullis@cumin1001 START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-analytics cluster: Roll restart of jvm daemons.
[13:29:58] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:32:02] <wikibugs>	 (03CR) 10Kormat: [C: 03+1] wmcs::db: remove used roles and profiles [puppet] - 10https://gerrit.wikimedia.org/r/753010 (https://phabricator.wikimedia.org/T272559) (owner: 10David Caro)
[13:33:55] <moritzm>	 !log installing 4.9.290 kernels on stretch systems (no reboots yet)
[13:33:56] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:36:04] <logmsgbot>	 !log btullis@cumin1001 END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-analytics cluster: Roll restart of jvm daemons.
[13:36:05] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:36:56] <logmsgbot>	 !log btullis@cumin1001 START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-druid-analytics cluster: Roll restart of jvm daemons.
[13:36:57] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:42:39] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P18567 and previous config saved to /var/cache/conftool/dbconfig/20220111-134239-marostegui.json
[13:42:41] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:43:04] <logmsgbot>	 !log btullis@cumin1001 END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-druid-analytics cluster: Roll restart of jvm daemons.
[13:43:05] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:48:51] <wikibugs>	 (03PS1) 10Marostegui: dbproxy1021: Disable notifications [puppet] - 10https://gerrit.wikimedia.org/r/753041 (https://phabricator.wikimedia.org/T298586)
[13:49:39] <wikibugs>	 (03PS2) 10Muehlenhoff: Make build2001 a build host [puppet] - 10https://gerrit.wikimedia.org/r/751146
[13:50:02] <wikibugs>	 (03CR) 10Marostegui: [C: 03+2] dbproxy1021: Disable notifications [puppet] - 10https://gerrit.wikimedia.org/r/753041 (https://phabricator.wikimedia.org/T298586) (owner: 10Marostegui)
[13:50:19] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.reimage for host dbproxy1021.eqiad.wmnet with OS bullseye
[13:50:20] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:53:21] <wikibugs>	 (03PS2) 10Mvolz: citoid: pipeline bot promote [deployment-charts] - 10https://gerrit.wikimedia.org/r/723658 (owner: 10PipelineBot)
[13:55:07] <wikibugs>	 10SRE, 10LDAP-Access-Requests: Grant Access to <Idap/wmf> for <JVargas> - https://phabricator.wikimedia.org/T298719 (10JVargas) Thank you so much, @Dzahn and @cmooney! Appreciate the quick support for access.
[13:57:13] <wikibugs>	 (03PS1) 10Kormat: wmfdb/db: Add module for querying databases. [software/wmfdb] - 10https://gerrit.wikimedia.org/r/753045 (https://phabricator.wikimedia.org/T298236)
[13:57:18] <wikibugs>	 (03PS1) 10Jbond: C:mw_rc_irc::ircserver: Refresh ircd services on config changes [puppet] - 10https://gerrit.wikimedia.org/r/753046 (https://phabricator.wikimedia.org/T284052)
[13:57:44] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P18568 and previous config saved to /var/cache/conftool/dbconfig/20220111-135744-marostegui.json
[13:57:46] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:58:16] <wikibugs>	 (03CR) 10Jbond: [V: 03+1] "PCC SUCCESS (DIFF 1): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/33191/console" [puppet] - 10https://gerrit.wikimedia.org/r/753046 (https://phabricator.wikimedia.org/T284052) (owner: 10Jbond)
[13:59:28] <wikibugs>	 (03PS1) 10Cathal Mooney: Add Sandra Ebele Nwachukwu to production access [puppet] - 10https://gerrit.wikimedia.org/r/753049 (https://phabricator.wikimedia.org/T298786)
[14:00:02] <wikibugs>	 (03CR) 10Jbond: [V: 03+1] "ready" [puppet] - 10https://gerrit.wikimedia.org/r/753046 (https://phabricator.wikimedia.org/T284052) (owner: 10Jbond)
[14:04:10] <wikibugs>	 (03CR) 10Klausman: [C: 03+1] "Just one nit, other than that, LGTM" [software/wmfdb] - 10https://gerrit.wikimedia.org/r/753045 (https://phabricator.wikimedia.org/T298236) (owner: 10Kormat)
[14:10:27] <wikibugs>	 (03PS1) 10Mforns: analytics:refinery:job:data_purge: Add deletion for anomaly detection [puppet] - 10https://gerrit.wikimedia.org/r/753052 (https://phabricator.wikimedia.org/T298972)
[14:12:10] <wikibugs>	 (03PS2) 10Mforns: analytics:refinery:job:data_purge: Add deletion for anomaly detection [puppet] - 10https://gerrit.wikimedia.org/r/753052 (https://phabricator.wikimedia.org/T298972)
[14:12:12] <wikibugs>	 (03CR) 10Cathal Mooney: [C: 03+2] Add Sandra Ebele Nwachukwu to production access [puppet] - 10https://gerrit.wikimedia.org/r/753049 (https://phabricator.wikimedia.org/T298786) (owner: 10Cathal Mooney)
[14:12:49] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1147 (T297191)', diff saved to https://phabricator.wikimedia.org/P18569 and previous config saved to /var/cache/conftool/dbconfig/20220111-141249-marostegui.json
[14:12:52] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:12:53] <stashbot>	 T297191: Schema change for dropping page_restrictions.pr_user field on wmf sites - https://phabricator.wikimedia.org/T297191
[14:12:55] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
[14:12:56] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:12:57] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
[14:12:58] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 12:00:00 on 12 hosts with reason: Maintenance
[14:12:58] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:12:59] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:13:07] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 12 hosts with reason: Maintenance
[14:13:09] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:13:12] <wikibugs>	 (03PS1) 10Jbond: O:puppetmaster::standalone: add type validation [puppet] - 10https://gerrit.wikimedia.org/r/753053 (https://phabricator.wikimedia.org/T284082)
[14:13:12] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
[14:13:13] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:13:14] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
[14:13:15] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:13:19] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Depooling db1142 (T297191)', diff saved to https://phabricator.wikimedia.org/P18570 and previous config saved to /var/cache/conftool/dbconfig/20220111-141318-marostegui.json
[14:13:21] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:14:08] <wikibugs>	 (03CR) 10Jbond: [C: 03+2] O:puppetmaster::standalone: add type validation [puppet] - 10https://gerrit.wikimedia.org/r/753053 (https://phabricator.wikimedia.org/T284082) (owner: 10Jbond)
[14:14:26] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1142 (T297191)', diff saved to https://phabricator.wikimedia.org/P18571 and previous config saved to /var/cache/conftool/dbconfig/20220111-141425-marostegui.json
[14:14:28] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:14:54] <jbond>	 topranks: have merged your access request change
[14:15:15] <topranks>	 ok sry got pulled away for a moment.
[14:15:20] <jbond>	 no probs
[14:15:21] <wikibugs>	 10Puppet, 10Infrastructure-Foundations, 10User-jbond: Puppet Improvements 2021/2022 - https://phabricator.wikimedia.org/T294906 (10jbond)
[14:15:46] <topranks>	 thanks :)
[14:15:51] <wikibugs>	 10Puppet, 10Infrastructure-Foundations, 10Patch-For-Review, 10User-jbond: Add type validation to puppetmaster::standalone - https://phabricator.wikimedia.org/T284082 (10jbond) 05Open→03Resolved a:03jbond
[14:18:19] <wikibugs>	 10SRE, 10LDAP-Access-Requests: Grant Access to <Idap/wmf> for <JVargas> - https://phabricator.wikimedia.org/T298719 (10Dzahn) @cmooney Cool, thanks for closing this!  What Andre meant here above is that we are also supposed to add users to the Phabricator group called WMF-NDA when we add people into the LDAP g...
[14:19:13] <wikibugs>	 (03CR) 10Cathal Mooney: "Great work John.  Overall I am fully supportive of this change, it adds a very valuable layer of security and I don't expect it will be hu" [puppet] - 10https://gerrit.wikimedia.org/r/753029 (https://phabricator.wikimedia.org/T298087) (owner: 10Jbond)
[14:19:29] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbproxy1021.eqiad.wmnet with OS bullseye
[14:19:30] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:21:06] <wikibugs>	 (03PS2) 10Kormat: wmfdb/db: Add module for querying databases. [software/wmfdb] - 10https://gerrit.wikimedia.org/r/753045 (https://phabricator.wikimedia.org/T298236)
[14:21:21] <wikibugs>	 (03CR) 10Kormat: wmfdb/db: Add module for querying databases. (031 comment) [software/wmfdb] - 10https://gerrit.wikimedia.org/r/753045 (https://phabricator.wikimedia.org/T298236) (owner: 10Kormat)
[14:21:24] <taavi>	 jouncebot: nowandnext
[14:21:24] <jouncebot>	 No deployments scheduled for the next 2 hour(s) and 38 minute(s)
[14:21:24] <jouncebot>	 In 2 hour(s) and 38 minute(s): Puppet request window (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20220111T1700)
[14:22:05] <wikibugs>	 (03CR) 10Majavah: [C: 03+2] reverse-proxy: add drmrs ranges [mediawiki-config] - 10https://gerrit.wikimedia.org/r/751952 (https://phabricator.wikimedia.org/T282787) (owner: 10Majavah)
[14:22:16] <wikibugs>	 (03CR) 10Klausman: [C: 03+1] wmfdb/db: Add module for querying databases. [software/wmfdb] - 10https://gerrit.wikimedia.org/r/753045 (https://phabricator.wikimedia.org/T298236) (owner: 10Kormat)
[14:22:49] <wikibugs>	 (03Merged) 10jenkins-bot: reverse-proxy: add drmrs ranges [mediawiki-config] - 10https://gerrit.wikimedia.org/r/751952 (https://phabricator.wikimedia.org/T282787) (owner: 10Majavah)
[14:23:20] <wikibugs>	 (03CR) 10Kormat: [C: 03+2] wmfdb/db: Add module for querying databases. [software/wmfdb] - 10https://gerrit.wikimedia.org/r/753045 (https://phabricator.wikimedia.org/T298236) (owner: 10Kormat)
[14:25:00] <wikibugs>	 (03Merged) 10jenkins-bot: wmfdb/db: Add module for querying databases. [software/wmfdb] - 10https://gerrit.wikimedia.org/r/753045 (https://phabricator.wikimedia.org/T298236) (owner: 10Kormat)
[14:25:36] <logmsgbot>	 !log taavi@deploy1002 Synchronized wmf-config/reverse-proxy.php: Config: [[gerrit:751952|reverse-proxy: add drmrs ranges (T282787)]] (duration: 01m 36s)
[14:25:37] <wikibugs>	 (03PS2) 10Majavah: Clean up nova-network remains [mediawiki-config] - 10https://gerrit.wikimedia.org/r/751949
[14:25:38] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:25:40] <stashbot>	 T282787: Configure dns and puppet repositories for new drmrs datacenter - https://phabricator.wikimedia.org/T282787
[14:26:17] <wikibugs>	 (03CR) 10Majavah: [C: 03+2] Clean up nova-network remains [mediawiki-config] - 10https://gerrit.wikimedia.org/r/751949 (owner: 10Majavah)
[14:26:55] <wikibugs>	 10SRE, 10Infrastructure-Foundations: Migrate eqiad Ganeti cluster to KVM machine type pc-i440fx-2.8 - https://phabricator.wikimedia.org/T294120 (10MoritzMuehlenhoff)
[14:27:01] <wikibugs>	 (03Merged) 10jenkins-bot: Clean up nova-network remains [mediawiki-config] - 10https://gerrit.wikimedia.org/r/751949 (owner: 10Majavah)
[14:28:41] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
[14:28:42] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:29:30] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P18572 and previous config saved to /var/cache/conftool/dbconfig/20220111-142930-marostegui.json
[14:29:31] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:31:04] <logmsgbot>	 !log taavi@deploy1002 Synchronized wmf-config/CommonSettings.php: Config: [[gerrit:751949|Clean up nova-network remains]] (1/2) (duration: 02m 49s)
[14:31:05] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:32:49] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
[14:32:50] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
[14:32:50] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:32:52] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:32:59] <wikibugs>	 10SRE, 10LDAP-Access-Requests: Grant Access to <Idap/wmf> for <JVargas> - https://phabricator.wikimedia.org/T298719 (10cmooney) 05In progress→03Resolved @aklapper / @dzahn many thanks for spotting the omission and kindly correcting.  Duly noted for future similar requests.
[14:33:05] <icinga-wm>	 RECOVERY - Host kubetcd2004 is UP: PING OK - Packet loss = 0%, RTA = 31.79 ms
[14:33:55] <logmsgbot>	 !log taavi@deploy1002 Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:751949|Clean up nova-network remains]] (2/2) (duration: 02m 40s)
[14:33:56] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:34:01] * taavi done
[14:35:21] <marostegui>	 !log Upgrade pc1014 mysql
[14:35:22] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:36:50] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
[14:36:51] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:37:12] <wikibugs>	 (03CR) 10Ayounsi: "Thanks!" [puppet] - 10https://gerrit.wikimedia.org/r/753029 (https://phabricator.wikimedia.org/T298087) (owner: 10Jbond)
[14:38:23] <wikibugs>	 (03CR) 10Elukey: [C: 03+1] Exclude log4j_extras from the classpath for coordinators [puppet] - 10https://gerrit.wikimedia.org/r/752673 (https://phabricator.wikimedia.org/T297468) (owner: 10Btullis)
[14:38:36] <XioNoX>	 !log disable ping-offload in eqiad
[14:38:38] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:44:35] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P18573 and previous config saved to /var/cache/conftool/dbconfig/20220111-144435-marostegui.json
[14:44:36] <logmsgbot>	 !log jmm@cumin2002 START - Cookbook sre.ganeti.reboot-vm for VM ping1002.eqiad.wmnet
[14:44:37] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:44:39] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:47:40] <wikibugs>	 (03PS3) 10Eigyan: wmf-config: Update coverage to 0.5 in gdi-survey on cawiki beta [mediawiki-config] - 10https://gerrit.wikimedia.org/r/752708 (https://phabricator.wikimedia.org/T297623)
[14:48:00] <logmsgbot>	 !log jmm@cumin2002 END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ping1002.eqiad.wmnet
[14:48:02] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:48:46] <logmsgbot>	 !log jmm@cumin2002 START - Cookbook sre.ganeti.reboot-vm for VM zookeeper-test1002.eqiad.wmnet
[14:48:47] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:52:33] <icinga-wm>	 PROBLEM - PHP7 jobrunner on mw1303 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Jobrunner
[14:54:25] <icinga-wm>	 RECOVERY - SSH on contint1001.mgmt is OK: SSH OK - OpenSSH_6.6 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[14:54:39] <icinga-wm>	 RECOVERY - PHP7 jobrunner on mw1303 is OK: HTTP OK: HTTP/1.1 200 OK - 324 bytes in 0.025 second response time https://wikitech.wikimedia.org/wiki/Jobrunner
[14:55:17] <wikibugs>	 10SRE, 10SRE-OnFire (FY2021/2022-Q3), 10SRE Observability (FY2021/2022-Q3): Ensure SRE team has a good understanding of how & when to declare an outage on the status page; & it is easy to do so - https://phabricator.wikimedia.org/T285769 (10lmata)
[14:55:43] <icinga-wm>	 PROBLEM - Disk space on pybal-test2002 is CRITICAL: DISK CRITICAL - free space: / 170 MB (1% inode=83%): /tmp 170 MB (1% inode=83%): /var/tmp 170 MB (1% inode=83%): https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space https://grafana.wikimedia.org/dashboard/db/host-overview?var-server=pybal-test2002&var-datasource=codfw+prometheus/ops
[14:56:25] <logmsgbot>	 !log aokoth@cumin1001 START - Cookbook sre.ganeti.reboot-vm for VM etherpad1002.eqiad.wmnet
[14:56:27] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:56:31] <wikibugs>	 10SRE, 10Infrastructure-Foundations: Migrate eqiad Ganeti cluster to KVM machine type pc-i440fx-2.8 - https://phabricator.wikimedia.org/T294120 (10ops-monitoring-bot) VM etherpad1002.eqiad.wmnet rebooted by aokoth@cumin1001 with reason: Ganeti Migration
[14:57:36] <wikibugs>	 10SRE, 10SRE-Access-Requests, 10Patch-For-Review: Requesting access to Data Engineering team resources for Sandra Ebele Nwachukwu - https://phabricator.wikimedia.org/T298786 (10cmooney) Hi Sandra,  I have now:  - Added you to LDAP group 'wmf' - Added you as a member of the 'WMF-NDA' group in Phabricator - Ad...
[14:58:02] <logmsgbot>	 !log jmm@cumin2002 END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM zookeeper-test1002.eqiad.wmnet
[14:58:03] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:58:11] <icinga-wm>	 PROBLEM - Check systemd state on zookeeper-test1002 is CRITICAL: CRITICAL - degraded: The following units failed: ifup@ens5.service https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[14:58:43] <icinga-wm>	 PROBLEM - Disk space on prometheus1004 is CRITICAL: DISK CRITICAL - free space: /boot 7 MB (3% inode=99%): https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space https://grafana.wikimedia.org/dashboard/db/host-overview?var-server=prometheus1004&var-datasource=eqiad+prometheus/ops
[14:59:07] <icinga-wm>	 PROBLEM - PHP7 rendering on mw1303 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Application_servers/Runbook%23PHP7_rendering
[14:59:19] <icinga-wm>	 PROBLEM - PHP7 jobrunner on mw1303 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Jobrunner
[14:59:40] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1142 (T297191)', diff saved to https://phabricator.wikimedia.org/P18574 and previous config saved to /var/cache/conftool/dbconfig/20220111-145939-marostegui.json
[14:59:41] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
[14:59:42] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:59:43] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
[14:59:43] <stashbot>	 T297191: Schema change for dropping page_restrictions.pr_user field on wmf sites - https://phabricator.wikimedia.org/T297191
[14:59:44] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:59:45] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:59:47] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Depooling db1141 (T297191)', diff saved to https://phabricator.wikimedia.org/P18575 and previous config saved to /var/cache/conftool/dbconfig/20220111-145947-marostegui.json
[14:59:49] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:00:09] <logmsgbot>	 !log aokoth@cumin1001 END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM etherpad1002.eqiad.wmnet
[15:00:10] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:00:54] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1141 (T297191)', diff saved to https://phabricator.wikimedia.org/P18576 and previous config saved to /var/cache/conftool/dbconfig/20220111-150054-marostegui.json
[15:00:56] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:01:41] <icinga-wm>	 RECOVERY - PHP7 jobrunner on mw1303 is OK: HTTP OK: HTTP/1.1 200 OK - 326 bytes in 7.307 second response time https://wikitech.wikimedia.org/wiki/Jobrunner
[15:02:04] <logmsgbot>	 !log aokoth@cumin1001 START - Cookbook sre.ganeti.reboot-vm for VM otrs1001.eqiad.wmnet
[15:02:05] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:02:11] <wikibugs>	 10SRE, 10Infrastructure-Foundations: Migrate eqiad Ganeti cluster to KVM machine type pc-i440fx-2.8 - https://phabricator.wikimedia.org/T294120 (10ops-monitoring-bot) VM otrs1001.eqiad.wmnet rebooted by aokoth@cumin1001 with reason: Ganeti Migration
[15:04:42] <logmsgbot>	 !log jmm@cumin2002 START - Cookbook sre.ganeti.reboot-vm for VM rpki1001.eqiad.wmnet
[15:04:43] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:05:35] <icinga-wm>	 PROBLEM - Disk space on prometheus2004 is CRITICAL: DISK CRITICAL - free space: /boot 7 MB (3% inode=99%): https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space https://grafana.wikimedia.org/dashboard/db/host-overview?var-server=prometheus2004&var-datasource=codfw+prometheus/ops
[15:05:53] <wikibugs>	 (03PS1) 10Hnowlan: maps: add cassandra toggle, disable cassandra on maps hosts [puppet] - 10https://gerrit.wikimedia.org/r/753057 (https://phabricator.wikimedia.org/T298246)
[15:06:17] <icinga-wm>	 PROBLEM - PHP7 jobrunner on mw1303 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Jobrunner
[15:07:01] <wikibugs>	 (03CR) 10Hnowlan: [V: 03+1] "PCC SUCCESS (DIFF 1): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/33192/console" [puppet] - 10https://gerrit.wikimedia.org/r/753057 (https://phabricator.wikimedia.org/T298246) (owner: 10Hnowlan)
[15:07:41] <wikibugs>	 (03CR) 10Hnowlan: maps: add cassandra toggle, disable cassandra on maps hosts [puppet] - 10https://gerrit.wikimedia.org/r/753057 (https://phabricator.wikimedia.org/T298246) (owner: 10Hnowlan)
[15:08:33] <logmsgbot>	 !log jmm@cumin2002 END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM rpki1001.eqiad.wmnet
[15:08:34] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:10:05] <logmsgbot>	 !log aokoth@cumin1001 END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM otrs1001.eqiad.wmnet
[15:10:06] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:10:15] <icinga-wm>	 PROBLEM - Check systemd state on otrs1001 is CRITICAL: CRITICAL - degraded: The following units failed: ifup@ens5.service https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[15:11:03] <icinga-wm>	 PROBLEM - Prometheus jobs reduced availability on alert1001 is CRITICAL: job=routinator site=eqiad https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[15:11:41] <wikibugs>	 (03CR) 10David Caro: [C: 03+2] wmcs::db: remove used roles and profiles [puppet] - 10https://gerrit.wikimedia.org/r/753010 (https://phabricator.wikimedia.org/T272559) (owner: 10David Caro)
[15:12:08] <wikibugs>	 10SRE, 10Infrastructure-Foundations: Migrate eqiad Ganeti cluster to KVM machine type pc-i440fx-2.8 - https://phabricator.wikimedia.org/T294120 (10Arnoldokoth)
[15:12:15] <icinga-wm>	 PROBLEM - Disk space on prometheus2003 is CRITICAL: DISK CRITICAL - free space: /boot 7 MB (3% inode=99%): https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space https://grafana.wikimedia.org/dashboard/db/host-overview?var-server=prometheus2003&var-datasource=codfw+prometheus/ops
[15:12:35] <icinga-wm>	 RECOVERY - PHP7 jobrunner on mw1303 is OK: HTTP OK: HTTP/1.1 200 OK - 324 bytes in 0.037 second response time https://wikitech.wikimedia.org/wiki/Jobrunner
[15:12:55] <wikibugs>	 10Puppet, 10SRE, 10Infrastructure-Foundations, 10Patch-For-Review: Unused puppet resources audit, 2021 - https://phabricator.wikimedia.org/T272559 (10dcaro)
[15:13:17] <icinga-wm>	 PROBLEM - Disk space on prometheus1003 is CRITICAL: DISK CRITICAL - free space: /boot 7 MB (3% inode=99%): https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space https://grafana.wikimedia.org/dashboard/db/host-overview?var-server=prometheus1003&var-datasource=eqiad+prometheus/ops
[15:14:37] <wikibugs>	 10SRE, 10SRE-Access-Requests: Add bking as icinga user - https://phabricator.wikimedia.org/T298738 (10cmooney) p:05Triage→03Medium a:03cmooney
[15:14:50] <wikibugs>	 10SRE, 10SRE-Access-Requests: Bing Webmaster Tools access request for Andrew Green - https://phabricator.wikimedia.org/T298723 (10cmooney) p:05Triage→03Low
[15:15:05] <wikibugs>	 10SRE, 10Infrastructure-Foundations: Migrate eqiad Ganeti cluster to KVM machine type pc-i440fx-2.8 - https://phabricator.wikimedia.org/T294120 (10MoritzMuehlenhoff)
[15:15:21] <wikibugs>	 10Puppet, 10SRE, 10Infrastructure-Foundations, 10Patch-For-Review: Unused puppet resources audit, 2021 - https://phabricator.wikimedia.org/T272559 (10dcaro)
[15:15:59] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P18577 and previous config saved to /var/cache/conftool/dbconfig/20220111-151558-marostegui.json
[15:16:00] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:16:59] <icinga-wm>	 PROBLEM - PHP7 jobrunner on mw1303 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Jobrunner
[15:17:21] <icinga-wm>	 RECOVERY - Prometheus jobs reduced availability on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[15:19:45] <wikibugs>	 (03PS1) 10Cparle: Updated maint script to use fewer queries [extensions/MediaSearch] (wmf/1.38.0-wmf.17) - 10https://gerrit.wikimedia.org/r/753060 (https://phabricator.wikimedia.org/T297484)
[15:19:58] <wikibugs>	 (03PS1) 10Cathal Mooney: Adding user Antoine Qu'hen to analytics-admin group [puppet] - 10https://gerrit.wikimedia.org/r/753061 (https://phabricator.wikimedia.org/T298657)
[15:20:47] <icinga-wm>	 RECOVERY - Check systemd state on otrs1001 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[15:21:11] <wikibugs>	 (03CR) 10Cathal Mooney: [C: 03+2] Adding user Antoine Qu'hen to analytics-admin group [puppet] - 10https://gerrit.wikimedia.org/r/753061 (https://phabricator.wikimedia.org/T298657) (owner: 10Cathal Mooney)
[15:22:45] <arnoldokoth>	 !log systemctl reset-failed ifup@ens5.service on otrs1001 T273026
[15:22:47] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:22:48] <stashbot>	 T273026: Errors for ifup@ens5.service after rebooting Ganeti VMs - https://phabricator.wikimedia.org/T273026
[15:22:53] <wikibugs>	 (03PS2) 10Giuseppe Lavagetto: shellbox: rationalize version handling, promote to 1.0 [deployment-charts] - 10https://gerrit.wikimedia.org/r/753020
[15:22:55] <wikibugs>	 (03PS2) 10Giuseppe Lavagetto: shellbox-*: promote to new build [deployment-charts] - 10https://gerrit.wikimedia.org/r/753021 (https://phabricator.wikimedia.org/T292322)
[15:22:57] <wikibugs>	 (03PS1) 10Giuseppe Lavagetto: shellbox: remove useless files/stanzas [deployment-charts] - 10https://gerrit.wikimedia.org/r/753062
[15:23:25] <wikibugs>	 (03CR) 10jerkins-bot: [V: 04-1] shellbox: remove useless files/stanzas [deployment-charts] - 10https://gerrit.wikimedia.org/r/753062 (owner: 10Giuseppe Lavagetto)
[15:24:26] <wikibugs>	 (03CR) 10jerkins-bot: [V: 04-1] shellbox: rationalize version handling, promote to 1.0 [deployment-charts] - 10https://gerrit.wikimedia.org/r/753020 (owner: 10Giuseppe Lavagetto)
[15:24:36] <wikibugs>	 (03CR) 10jerkins-bot: [V: 04-1] shellbox-*: promote to new build [deployment-charts] - 10https://gerrit.wikimedia.org/r/753021 (https://phabricator.wikimedia.org/T292322) (owner: 10Giuseppe Lavagetto)
[15:24:55] <wikibugs>	 10SRE, 10SRE-Access-Requests, 10Data-Engineering-Kanban, 10Patch-For-Review: Requesting access to the data engineering team resources for Antoine Qu'hen - https://phabricator.wikimedia.org/T298657 (10cmooney) a:05BTullis→03cmooney On the back of Olja's explicit approval I've added the username to the '...
[15:30:10] <hnowlan>	 !log Decommissioning cassandra instance restbase2009-a via nodetool
[15:30:11] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:31:04] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P18578 and previous config saved to /var/cache/conftool/dbconfig/20220111-153103-marostegui.json
[15:31:05] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:33:01] <jinxer-wm>	 (CirrusSearchJVMGCOldPoolFlatlined) firing: Elasticsearch instance elastic1049-production-search-psi-eqiad is showing memory pressure in the old pool - https://wikitech.wikimedia.org/wiki/Search#Stuck_in_old_GC_hell  - https://alerts.wikimedia.org
[15:33:29] <wikibugs>	 10SRE, 10Infrastructure-Foundations, 10Mail, 10Znuny, 10fundraising-tech-ops: move donation,donate, donations (otrs, wikimania) exim aliases from SRE to ITS - https://phabricator.wikimedia.org/T297915 (10bcampbell) I can assist with this. I believe once SRE removes the aliases from their side, ITS can ad...
[15:39:32] <wikibugs>	 (03PS1) 10Cathal Mooney: Add Elliot Eggleston (ejegg) to fr-tech-ops Icinga contact group. [puppet] - 10https://gerrit.wikimedia.org/r/753065 (https://phabricator.wikimedia.org/T298649)
[15:44:30] <wikibugs>	 10SRE, 10SRE-Access-Requests, 10Fundraising-Backlog, 10observability, and 2 others: Fundraising-Tech engineers unable to ACK icinga alerts on fr-tech host groups - https://phabricator.wikimedia.org/T298649 (10cmooney) a:03cmooney @jglesson hey just following up on this as I am on Clinic Duty this week....
[15:44:34] <wikibugs>	 (03CR) 10Cathal Mooney: [C: 03+2] Add Elliot Eggleston (ejegg) to fr-tech-ops Icinga contact group. [puppet] - 10https://gerrit.wikimedia.org/r/753065 (https://phabricator.wikimedia.org/T298649) (owner: 10Cathal Mooney)
[15:46:08] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1141 (T297191)', diff saved to https://phabricator.wikimedia.org/P18579 and previous config saved to /var/cache/conftool/dbconfig/20220111-154608-marostegui.json
[15:46:09] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
[15:46:11] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:46:11] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
[15:46:12] <stashbot>	 T297191: Schema change for dropping page_restrictions.pr_user field on wmf sites - https://phabricator.wikimedia.org/T297191
[15:46:13] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:46:14] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:46:16] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Depooling db1144:3314 (T297191)', diff saved to https://phabricator.wikimedia.org/P18580 and previous config saved to /var/cache/conftool/dbconfig/20220111-154615-marostegui.json
[15:46:18] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:46:26] <wikibugs>	 10SRE, 10ops-eqiad, 10DC-Ops: Rack msw2-eqiad in new cage - https://phabricator.wikimedia.org/T298980 (10ayounsi) p:05Triage→03Medium
[15:47:23] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T297191)', diff saved to https://phabricator.wikimedia.org/P18582 and previous config saved to /var/cache/conftool/dbconfig/20220111-154722-marostegui.json
[15:47:25] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:47:37] <wikibugs>	 10SRE, 10LDAP-Access-Requests: Grant Access to ldap/wmf for Marco_Fossati - https://phabricator.wikimedia.org/T298766 (10cmooney) p:05Triage→03Medium a:03cmooney
[15:47:47] <logmsgbot>	 !log vgutierrez@cumin1001 START - Cookbook sre.ganeti.reboot-vm for VM acmechief-test1001.eqiad.wmnet
[15:47:48] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:47:52] <wikibugs>	 10SRE, 10Infrastructure-Foundations: Migrate eqiad Ganeti cluster to KVM machine type pc-i440fx-2.8 - https://phabricator.wikimedia.org/T294120 (10ops-monitoring-bot) VM acmechief-test1001.eqiad.wmnet rebooted by vgutierrez@cumin1001 with reason: None
[15:48:06] <wikibugs>	 10SRE, 10ops-eqiad, 10DC-Ops: eqiad: Master Tracking Ticket for eqiad expansion cage - https://phabricator.wikimedia.org/T296966 (10ayounsi)
[15:51:05] <ebernhardson>	 !log restart elasticsearch_6@production-search-psi-eqiad on elastic1049 to resolve issue with full heap
[15:51:06] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:52:50] <logmsgbot>	 !log vgutierrez@cumin1001 END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM acmechief-test1001.eqiad.wmnet
[15:52:51] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:53:01] <jinxer-wm>	 (CirrusSearchJVMGCOldPoolFlatlined) resolved: Elasticsearch instance elastic1049-production-search-psi-eqiad is showing memory pressure in the old pool - https://wikitech.wikimedia.org/wiki/Search#Stuck_in_old_GC_hell  - https://alerts.wikimedia.org
[15:55:52] <vgutierrez>	 !log disable puppet on acme-chief clients for acmechief1001 reboot - T294120
[15:55:54] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:55:55] <stashbot>	 T294120: Migrate eqiad Ganeti cluster to KVM machine type pc-i440fx-2.8 - https://phabricator.wikimedia.org/T294120
[15:56:12] <logmsgbot>	 !log hnowlan@cumin1001 START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase2009.codfw.wmnet with reason: Decommissioning - hnowlan
[15:56:13] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:56:14] <logmsgbot>	 !log hnowlan@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase2009.codfw.wmnet with reason: Decommissioning - hnowlan
[15:56:15] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:56:31] <logmsgbot>	 !log vgutierrez@cumin1001 START - Cookbook sre.ganeti.reboot-vm for VM acmechief1001.eqiad.wmnet
[15:56:32] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:56:37] <wikibugs>	 10SRE, 10Infrastructure-Foundations: Migrate eqiad Ganeti cluster to KVM machine type pc-i440fx-2.8 - https://phabricator.wikimedia.org/T294120 (10ops-monitoring-bot) VM acmechief1001.eqiad.wmnet rebooted by vgutierrez@cumin1001 with reason: None
[15:58:40] <logmsgbot>	 !log vgutierrez@cumin1001 END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM acmechief1001.eqiad.wmnet
[15:58:41] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:59:11] <vgutierrez>	 !log re-enable puppet on acme-chief clients after acmechief1001 reboot - T294120
[15:59:13] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[16:00:19] <wikibugs>	 10SRE, 10Infrastructure-Foundations: Migrate eqiad Ganeti cluster to KVM machine type pc-i440fx-2.8 - https://phabricator.wikimedia.org/T294120 (10Vgutierrez)
[16:02:28] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P18583 and previous config saved to /var/cache/conftool/dbconfig/20220111-160227-marostegui.json
[16:02:29] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[16:03:05] <cwhite>	 !log begin rolling restart of opensearch in codfw - jvm upgrade
[16:03:06] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[16:04:08] <icinga-wm>	 PROBLEM - PHP7 jobrunner on mw1302 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Jobrunner
[16:05:26] <icinga-wm>	 RECOVERY - PHP7 jobrunner on mw1302 is OK: HTTP OK: HTTP/1.1 200 OK - 325 bytes in 0.784 second response time https://wikitech.wikimedia.org/wiki/Jobrunner
[16:06:08] <icinga-wm>	 PROBLEM - PHP7 rendering on mw1302 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Application_servers/Runbook%23PHP7_rendering
[16:06:38] <icinga-wm>	 RECOVERY - PHP7 rendering on mw1303 is OK: HTTP OK: HTTP/1.1 200 OK - 324 bytes in 0.031 second response time https://wikitech.wikimedia.org/wiki/Application_servers/Runbook%23PHP7_rendering
[16:07:02] <icinga-wm>	 RECOVERY - PHP7 jobrunner on mw1303 is OK: HTTP OK: HTTP/1.1 200 OK - 324 bytes in 0.024 second response time https://wikitech.wikimedia.org/wiki/Jobrunner
[16:09:34] <wikibugs>	 10SRE, 10Data-Engineering, 10Data-Engineering-Kanban, 10Infrastructure-Foundations, and 3 others: Collect netflow data for internal traffic - https://phabricator.wikimedia.org/T263277 (10JAllemandou) a:03JAllemandou
[16:10:10] <icinga-wm>	 PROBLEM - PHP7 jobrunner on mw1302 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Jobrunner
[16:10:58] <wikibugs>	 10SRE, 10SRE-Access-Requests, 10Fundraising-Backlog, 10observability, 10serviceops-radar: Fundraising-Tech engineers unable to ACK icinga alerts on fr-tech host groups - https://phabricator.wikimedia.org/T298649 (10jgleeson) Thanks @cmooney and yeah it makes sense not to give us permissions we don't need...
[16:12:37] <wikibugs>	 (03CR) 10Jgiannelos: [C: 04-1] maps: add cassandra toggle, disable cassandra on maps hosts (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/753057 (https://phabricator.wikimedia.org/T298246) (owner: 10Hnowlan)
[16:13:30] <wikibugs>	 10SRE-Access-Requests, 10Data-Engineering, 10LDAP-Access-Requests: Create Kerberos login for Brian King (bking) - https://phabricator.wikimedia.org/T298981 (10bking)
[16:13:55] <wikibugs>	 10SRE-Access-Requests, 10Data-Engineering, 10LDAP-Access-Requests: Create Kerberos login for Brian King (bking) - https://phabricator.wikimedia.org/T298981 (10bking)
[16:14:38] <wikibugs>	 (03CR) 10Jgiannelos: [C: 04-1] maps: add cassandra toggle, disable cassandra on maps hosts (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/753057 (https://phabricator.wikimedia.org/T298246) (owner: 10Hnowlan)
[16:14:48] <wikibugs>	 10SRE-Access-Requests, 10Data-Engineering, 10LDAP-Access-Requests: Create Kerberos login for Brian King (bking) - https://phabricator.wikimedia.org/T298981 (10bking)
[16:16:44] <icinga-wm>	 PROBLEM - k8s API server requests latencies on kubestagemaster2001 is CRITICAL: instance=10.192.48.10 verb={CREATE,UPDATE} https://wikitech.wikimedia.org/wiki/Kubernetes https://grafana.wikimedia.org/dashboard/db/kubernetes-api?viewPanel=27
[16:17:32] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P18584 and previous config saved to /var/cache/conftool/dbconfig/20220111-161732-marostegui.json
[16:17:34] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[16:18:44] <icinga-wm>	 RECOVERY - k8s API server requests latencies on kubestagemaster2001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Kubernetes https://grafana.wikimedia.org/dashboard/db/kubernetes-api?viewPanel=27
[16:19:01] <wikibugs>	 (03CR) 10Ottomata: [C: 03+2] Import commons mediainfo json dumps to HDFS [puppet] - 10https://gerrit.wikimedia.org/r/738874 (https://phabricator.wikimedia.org/T258834) (owner: 10Joal)
[16:19:42] <wikibugs>	 10SRE-Access-Requests, 10Data-Engineering, 10LDAP-Access-Requests: Create Kerberos login for Brian King (bking) - https://phabricator.wikimedia.org/T298981 (10Ottomata) Approved.
[16:20:10] <wikibugs>	 10SRE-Access-Requests, 10Data-Engineering, 10LDAP-Access-Requests: Create Kerberos login for Brian King (bking) - https://phabricator.wikimedia.org/T298981 (10Gehel) Approved
[16:22:34] <wikibugs>	 10SRE, 10SRE-Access-Requests, 10Fundraising-Backlog, 10observability, 10serviceops-radar: Fundraising-Tech engineers unable to ACK icinga alerts on fr-tech host groups - https://phabricator.wikimedia.org/T298649 (10Dzahn) @cmooney Perfect, I wanted to add exactly that but you already got it :) thanks
[16:23:50] <arturo>	 !log aborrero@apt1001:~ $ sudo -i reprepro --noskipold --component thirdparty/kubeadm-k8s-1-21 update buster-wikimedia
[16:23:51] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[16:25:07] <wikibugs>	 (03CR) 10Arturo Borrero Gonzalez: [C: 03+2] kubeadm: raise default to 1.20 [puppet] - 10https://gerrit.wikimedia.org/r/739402 (owner: 10Majavah)
[16:25:59] <wikibugs>	 (03CR) 10Arturo Borrero Gonzalez: [C: 03+2] aptrepo: drop k8s 1.19 repos [puppet] - 10https://gerrit.wikimedia.org/r/739403 (owner: 10Majavah)
[16:26:09] <wikibugs>	 (03PS3) 10Arturo Borrero Gonzalez: aptrepo: drop k8s 1.19 repos [puppet] - 10https://gerrit.wikimedia.org/r/739403 (owner: 10Majavah)
[16:26:40] <wikibugs>	 (03CR) 10Arturo Borrero Gonzalez: [V: 03+2 C: 03+2] aptrepo: drop k8s 1.19 repos [puppet] - 10https://gerrit.wikimedia.org/r/739403 (owner: 10Majavah)
[16:29:20] <arturo>	 !log aborrero@apt1001:~ $ sudo -i reprepro clearvanished
[16:29:22] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[16:30:20] <icinga-wm>	 RECOVERY - PHP7 rendering on mw1302 is OK: HTTP OK: HTTP/1.1 200 OK - 324 bytes in 0.021 second response time https://wikitech.wikimedia.org/wiki/Application_servers/Runbook%23PHP7_rendering
[16:32:37] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T297191)', diff saved to https://phabricator.wikimedia.org/P18585 and previous config saved to /var/cache/conftool/dbconfig/20220111-163237-marostegui.json
[16:32:38] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
[16:32:40] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[16:32:40] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
[16:32:40] <stashbot>	 T297191: Schema change for dropping page_restrictions.pr_user field on wmf sites - https://phabricator.wikimedia.org/T297191
[16:32:42] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[16:32:43] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[16:32:45] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Depooling db1143 (T297191)', diff saved to https://phabricator.wikimedia.org/P18586 and previous config saved to /var/cache/conftool/dbconfig/20220111-163244-marostegui.json
[16:32:47] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[16:33:51] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1143 (T297191)', diff saved to https://phabricator.wikimedia.org/P18587 and previous config saved to /var/cache/conftool/dbconfig/20220111-163351-marostegui.json
[16:33:54] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[16:35:50] <icinga-wm>	 PROBLEM - PHP7 rendering on mw1302 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Application_servers/Runbook%23PHP7_rendering
[16:37:23] <wikibugs>	 (03PS1) 10Jcrespo: mediabackup: Backup s1 (enwiki) media files on codfw [puppet] - 10https://gerrit.wikimedia.org/r/753095 (https://phabricator.wikimedia.org/T262668)
[16:38:15] <wikibugs>	 (03CR) 10Jcrespo: [C: 03+2] mediabackup: Backup s1 (enwiki) media files on codfw [puppet] - 10https://gerrit.wikimedia.org/r/753095 (https://phabricator.wikimedia.org/T262668) (owner: 10Jcrespo)
[16:40:32] <icinga-wm>	 RECOVERY - PHP7 jobrunner on mw1302 is OK: HTTP OK: HTTP/1.1 200 OK - 324 bytes in 0.021 second response time https://wikitech.wikimedia.org/wiki/Jobrunner
[16:42:12] <icinga-wm>	 RECOVERY - PHP7 rendering on mw1302 is OK: HTTP OK: HTTP/1.1 200 OK - 324 bytes in 0.024 second response time https://wikitech.wikimedia.org/wiki/Application_servers/Runbook%23PHP7_rendering
[16:44:07] <wikibugs>	 (03PS1) 10Ladsgroup: export: Remove ignoring rev_page_id index [core] (wmf/1.38.0-wmf.17) - 10https://gerrit.wikimedia.org/r/753069 (https://phabricator.wikimedia.org/T163532)
[16:44:25] <wikibugs>	 (03CR) 10JHathaway: [C: 03+1] "looks good, one question" [puppet] - 10https://gerrit.wikimedia.org/r/753016 (https://phabricator.wikimedia.org/T298087) (owner: 10Jbond)
[16:44:34] <Amir1>	 jouncebot: nowandnext
[16:44:34] <jouncebot>	 No deployments scheduled for the next 0 hour(s) and 15 minute(s)
[16:44:35] <jouncebot>	 In 0 hour(s) and 15 minute(s): Puppet request window (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20220111T1700)
[16:44:46] <wikibugs>	 (03CR) 10Ladsgroup: [C: 03+2] "Catching the train" [core] (wmf/1.38.0-wmf.17) - 10https://gerrit.wikimedia.org/r/753069 (https://phabricator.wikimedia.org/T163532) (owner: 10Ladsgroup)
[16:45:33] <wikibugs>	 10SRE-Access-Requests, 10Data-Engineering, 10LDAP-Access-Requests: Create Kerberos login for Brian King (bking) - https://phabricator.wikimedia.org/T298981 (10bking) This is confirmed working, feel free to close this ticket.
[16:46:44] <wikibugs>	 10SRE, 10SRE-Access-Requests: Requesting access to Analytics Data for Michael Große (WMDE) - https://phabricator.wikimedia.org/T269610 (10Michael) 05Resolved→03Open Nothing was compromised, but I was stupid when playing around with my password manager and am no longer able to unlock the ssh key added for m...
[16:47:00] <wikibugs>	 (03PS7) 10Ayounsi: profile::installserver::proxy: update squid template [puppet] - 10https://gerrit.wikimedia.org/r/753016 (https://phabricator.wikimedia.org/T298087) (owner: 10Jbond)
[16:47:48] <wikibugs>	 10SRE, 10LDAP-Access-Requests: Grant Access to ldap/wmf for Marco_Fossati - https://phabricator.wikimedia.org/T298766 (10MarkTraceur) Approved! Re: Specific access, this is part of our onboarding checklist. It says:  "Create a Phabricator task to request access to the group ldap/wmf for your Gerrit account[.]...
[16:48:56] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P18588 and previous config saved to /var/cache/conftool/dbconfig/20220111-164856-marostegui.json
[16:48:57] <wikibugs>	 SRE, ops-eqiad, DC-Ops, cloud-services-team (Hardware): Q2:(Need By: TBD) rack/setup/install cloudbackup100[34] - https://phabricator.wikimedia.org/T293934 (Cmjohnson) a:Cmjohnson→elukey @elukey when you have a moment can you look at the partman recipe for this and let me know if it's cor...
[16:48:57] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[16:54:06] <wikibugs>	 SRE, ops-eqiad: Degraded RAID on dumpsdata1004 - https://phabricator.wikimedia.org/T298582 (Cmjohnson) @ArielGlenn The part is arriving today, can I do this tomorrow 1530 UTC?
[16:54:29] <wikibugs>	 (CR) JHathaway: [C: +1] "looks good, one question?" [puppet] - https://gerrit.wikimedia.org/r/753046 (https://phabricator.wikimedia.org/T284052) (owner: Jbond)
[16:54:31] <wikibugs>	 (PS1) Jcrespo: mediabackup: Update mediawiki replica for s1 backup on codfw [puppet] - https://gerrit.wikimedia.org/r/753099 (https://phabricator.wikimedia.org/T262668)
[16:54:50] <wikibugs>	 (PS2) Jcrespo: mediabackup: Update mediawiki replica for s1 backup on codfw [puppet] - https://gerrit.wikimedia.org/r/753099 (https://phabricator.wikimedia.org/T262668)
[16:54:53] <wikibugs>	 SRE, ops-eqiad: Installation issues on PowerEdge R440 Kafka main eqiad servers with buster / firmware update needed - https://phabricator.wikimedia.org/T298867 (Cmjohnson) @elukey Can we plan to do this tomorrow (12 Jan)  starting around 1500UTC?
[16:55:17] <wikibugs>	 (PS1) Andrew Bogott: profile::wmcs::nfs::standalone: bind service IP to VM [puppet] - https://gerrit.wikimedia.org/r/753100
[16:56:05] <wikibugs>	 SRE, ops-eqiad, Analytics-Clusters, DC-Ops: (Need By: TBD) rack/setup/install an-test-coord1002 - https://phabricator.wikimedia.org/T293938 (Cmjohnson) I've tried a different partman recipe. I do not know what is wrong or why the raid fails.
[16:56:17] <wikibugs>	 (CR) jerkins-bot: [V: -1] profile::wmcs::nfs::standalone: bind service IP to VM [puppet] - https://gerrit.wikimedia.org/r/753100 (owner: Andrew Bogott)
[16:57:08] <wikibugs>	 (CR) Jcrespo: [C: +2] mediabackup: Update mediawiki replica for s1 backup on codfw [puppet] - https://gerrit.wikimedia.org/r/753099 (https://phabricator.wikimedia.org/T262668) (owner: Jcrespo)
[16:57:21] <wikibugs>	 SRE, ops-eqiad: Degraded RAID on dumpsdata1004 - https://phabricator.wikimedia.org/T298582 (ArielGlenn) >>! In T298582#7613348, @Cmjohnson wrote: > @ArielGlenn The part is arriving today, can I do this tomorrow 1530 UTC?  Yes please!
[16:58:26] <icinga-wm>	 PROBLEM - PHP7 jobrunner on mw1302 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Jobrunner
[16:59:56] <wikibugs>	 (PS2) Andrew Bogott: profile::wmcs::nfs::standalone: bind service IP to VM [puppet] - https://gerrit.wikimedia.org/r/753100 (https://phabricator.wikimedia.org/T291405)
[17:00:05] <jouncebot>	 jbond and rzl: #bothumor Q:How do functions break up? A:They stop calling each other. Rise for Puppet request window deploy. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20220111T1700).
[17:00:05] <jouncebot>	 No Gerrit patches in the queue for this window AFAICS.
[17:00:13] <logmsgbot>	 !log bking@cumin1001 START - Cookbook sre.wdqs.data-reload
[17:00:19] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[17:00:34] <icinga-wm>	 PROBLEM - PHP7 rendering on mw1302 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Application_servers/Runbook%23PHP7_rendering
[17:00:36] <wikibugs>	 (CR) jerkins-bot: [V: -1] profile::wmcs::nfs::standalone: bind service IP to VM [puppet] - https://gerrit.wikimedia.org/r/753100 (https://phabricator.wikimedia.org/T291405) (owner: Andrew Bogott)
[17:01:52] <wikibugs>	 (PS3) Andrew Bogott: profile::wmcs::nfs::standalone: bind service IP to VM [puppet] - https://gerrit.wikimedia.org/r/753100 (https://phabricator.wikimedia.org/T291405)
[17:03:10] <logmsgbot>	 !log bking@cumin1001 START - Cookbook sre.wdqs.data-reload
[17:03:11] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[17:03:31] <logmsgbot>	 !log vgutierrez@cumin1001 START - Cookbook sre.ganeti.reboot-vm for VM ncredir1001.eqiad.wmnet
[17:03:32] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[17:03:38] <wikibugs>	 SRE, Infrastructure-Foundations: Migrate eqiad Ganeti cluster to KVM machine type pc-i440fx-2.8 - https://phabricator.wikimedia.org/T294120 (ops-monitoring-bot) VM ncredir1001.eqiad.wmnet rebooted by vgutierrez@cumin1001 with reason: None
[17:03:43] <logmsgbot>	 !log bking@cumin1001 START - Cookbook sre.wdqs.data-reload
[17:03:44] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[17:04:01] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P18589 and previous config saved to /var/cache/conftool/dbconfig/20220111-170400-marostegui.json
[17:04:02] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[17:04:34] <logmsgbot>	 !log bking@cumin1001 START - Cookbook sre.wdqs.data-reload
[17:04:35] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[17:05:46] <wikibugs>	 (CR) jerkins-bot: [V: -1] export: Remove ignoring rev_page_id index [core] (wmf/1.38.0-wmf.17) - https://gerrit.wikimedia.org/r/753069 (https://phabricator.wikimedia.org/T163532) (owner: Ladsgroup)
[17:06:02] <logmsgbot>	 !log bking@cumin1001 START - Cookbook sre.wdqs.data-reload
[17:06:03] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[17:06:06] <icinga-wm>	 RECOVERY - PHP7 rendering on mw1302 is OK: HTTP OK: HTTP/1.1 200 OK - 326 bytes in 9.536 second response time https://wikitech.wikimedia.org/wiki/Application_servers/Runbook%23PHP7_rendering
[17:06:16] <icinga-wm>	 RECOVERY - PHP7 jobrunner on mw1302 is OK: HTTP OK: HTTP/1.1 200 OK - 324 bytes in 0.097 second response time https://wikitech.wikimedia.org/wiki/Jobrunner
[17:06:19] <wikibugs>	 (CR) Ladsgroup: [C: +2] "." [core] (wmf/1.38.0-wmf.17) - https://gerrit.wikimedia.org/r/753069 (https://phabricator.wikimedia.org/T163532) (owner: Ladsgroup)
[17:06:38] <logmsgbot>	 !log bking@cumin1001 START - Cookbook sre.wdqs.data-reload
[17:06:39] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[17:07:14] <logmsgbot>	 !log jgiannelos@deploy1002 Started deploy [kartotherian/deploy@65895c0]: Remove cassandra from kartotherian sources
[17:07:17] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[17:07:26] <logmsgbot>	 !log vgutierrez@cumin1001 END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ncredir1001.eqiad.wmnet
[17:07:27] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[17:07:55] <wikibugs>	 SRE, ops-eqiad: Installation issues on PowerEdge R440 Kafka main eqiad servers with buster / firmware update needed - https://phabricator.wikimedia.org/T298867 (elukey) @Cmjohnson perfect thanks!
[17:08:50] <logmsgbot>	 !log vgutierrez@cumin1001 START - Cookbook sre.ganeti.reboot-vm for VM ncredir1002.eqiad.wmnet
[17:08:51] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[17:08:56] <wikibugs>	 SRE, Infrastructure-Foundations: Migrate eqiad Ganeti cluster to KVM machine type pc-i440fx-2.8 - https://phabricator.wikimedia.org/T294120 (ops-monitoring-bot) VM ncredir1002.eqiad.wmnet rebooted by vgutierrez@cumin1001 with reason: None
[17:10:48] <logmsgbot>	 !log jgiannelos@deploy1002 Finished deploy [kartotherian/deploy@65895c0]: Remove cassandra from kartotherian sources (duration: 03m 33s)
[17:10:49] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[17:11:48] <logmsgbot>	 !log jgiannelos@deploy1002 Started deploy [kartotherian/deploy@65895c0]: Remove cassandra from kartotherian sources
[17:11:49] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[17:12:43] <logmsgbot>	 !log vgutierrez@cumin1001 END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ncredir1002.eqiad.wmnet
[17:12:45] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[17:13:51] <wikibugs>	 SRE, Infrastructure-Foundations: Migrate eqiad Ganeti cluster to KVM machine type pc-i440fx-2.8 - https://phabricator.wikimedia.org/T294120 (Vgutierrez)
[17:13:51] <logmsgbot>	 !log jgiannelos@deploy1002 Finished deploy [kartotherian/deploy@65895c0]: Remove cassandra from kartotherian sources (duration: 02m 04s)
[17:13:53] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[17:14:07] <wikibugs>	 (CR) Andrew Bogott: [C: +2] profile::wmcs::nfs::standalone: bind service IP to VM [puppet] - https://gerrit.wikimedia.org/r/753100 (https://phabricator.wikimedia.org/T291405) (owner: Andrew Bogott)
[17:15:16] <wikibugs>	 (CR) JMeybohm: [C: +1] deployment_server,::helm: remove helm2 support [puppet] - https://gerrit.wikimedia.org/r/753026 (https://phabricator.wikimedia.org/T251305) (owner: Jelto)
[17:17:32] <wikibugs>	 (CR) JHathaway: P:installserver::proxy: Add domain whitelist to proxy (1 comment) [puppet] - https://gerrit.wikimedia.org/r/753029 (https://phabricator.wikimedia.org/T298087) (owner: Jbond)
[17:19:05] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1143 (T297191)', diff saved to https://phabricator.wikimedia.org/P18590 and previous config saved to /var/cache/conftool/dbconfig/20220111-171905-marostegui.json
[17:19:07] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
[17:19:08] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[17:19:08] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
[17:19:09] <stashbot>	 T297191: Schema change for dropping page_restrictions.pr_user field on wmf sites - https://phabricator.wikimedia.org/T297191
[17:19:10] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[17:19:12] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[17:19:13] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Depooling db1146:3314 (T297191)', diff saved to https://phabricator.wikimedia.org/P18591 and previous config saved to /var/cache/conftool/dbconfig/20220111-171912-marostegui.json
[17:19:15] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[17:20:20] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T297191)', diff saved to https://phabricator.wikimedia.org/P18592 and previous config saved to /var/cache/conftool/dbconfig/20220111-172019-marostegui.json
[17:20:22] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[17:21:25] <wikibugs>	 (PS1) Jbond: bgpalerter: add new class to configure bgpalerter [puppet] - https://gerrit.wikimedia.org/r/753102 (https://phabricator.wikimedia.org/T230600)
[17:22:02] <wikibugs>	 (CR) jerkins-bot: [V: -1] bgpalerter: add new class to configure bgpalerter [puppet] - https://gerrit.wikimedia.org/r/753102 (https://phabricator.wikimedia.org/T230600) (owner: Jbond)
[17:24:26] <icinga-wm>	 PROBLEM - PHP7 jobrunner on mw1318 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Jobrunner
[17:25:11] <wikibugs>	 (Merged) jenkins-bot: export: Remove ignoring rev_page_id index [core] (wmf/1.38.0-wmf.17) - https://gerrit.wikimedia.org/r/753069 (https://phabricator.wikimedia.org/T163532) (owner: Ladsgroup)
[17:28:34] <icinga-wm>	 RECOVERY - PHP7 jobrunner on mw1318 is OK: HTTP OK: HTTP/1.1 200 OK - 324 bytes in 0.024 second response time https://wikitech.wikimedia.org/wiki/Jobrunner
[17:28:48] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
[17:28:49] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[17:29:08] <icinga-wm>	 PROBLEM - PHP7 jobrunner on mw1302 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Jobrunner
[17:29:32] <icinga-wm>	 PROBLEM - PHP7 rendering on mw1302 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Application_servers/Runbook%23PHP7_rendering
[17:31:26] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
[17:31:27] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
[17:31:27] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[17:31:29] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[17:31:36] <icinga-wm>	 PROBLEM - PHP7 rendering on mw1318 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Application_servers/Runbook%23PHP7_rendering
[17:33:04] <icinga-wm>	 PROBLEM - PHP7 jobrunner on mw1318 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Jobrunner
[17:34:04] <wikibugs>	 (PS3) Giuseppe Lavagetto: shellbox: rationalize version handling, promote to 1.0 [deployment-charts] - https://gerrit.wikimedia.org/r/753020
[17:34:06] <wikibugs>	 (PS1) Giuseppe Lavagetto: Rakefile: when copying over helmfile directories, resolve symlinks [deployment-charts] - https://gerrit.wikimedia.org/r/753104
[17:35:18] <icinga-wm>	 RECOVERY - PHP7 jobrunner on mw1318 is OK: HTTP OK: HTTP/1.1 200 OK - 326 bytes in 8.630 second response time https://wikitech.wikimedia.org/wiki/Jobrunner
[17:35:25] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P18593 and previous config saved to /var/cache/conftool/dbconfig/20220111-173524-marostegui.json
[17:35:26] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[17:35:29] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
[17:35:30] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[17:36:04] <icinga-wm>	 RECOVERY - PHP7 rendering on mw1318 is OK: HTTP OK: HTTP/1.1 200 OK - 326 bytes in 8.610 second response time https://wikitech.wikimedia.org/wiki/Application_servers/Runbook%23PHP7_rendering
[17:36:06] <wikibugs>	 SRE, SRE-Access-Requests: Requesting access to Analytics Data for Michael Große (WMDE) - https://phabricator.wikimedia.org/T269610 (Dzahn) a:ssingh→None
[17:43:58] <icinga-wm>	 PROBLEM - PHP7 jobrunner on mw1318 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Jobrunner
[17:44:18] <icinga-wm>	 RECOVERY - PHP7 jobrunner on mw1302 is OK: HTTP OK: HTTP/1.1 200 OK - 326 bytes in 7.507 second response time https://wikitech.wikimedia.org/wiki/Jobrunner
[17:44:21] <logmsgbot>	 !log hnowlan@puppetmaster1001 conftool action : set/pooled=no; selector: name=restbase2009.codfw.wmnet
[17:44:23] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[17:44:36] <icinga-wm>	 RECOVERY - PHP7 rendering on mw1302 is OK: HTTP OK: HTTP/1.1 200 OK - 324 bytes in 0.022 second response time https://wikitech.wikimedia.org/wiki/Application_servers/Runbook%23PHP7_rendering
[17:46:04] <icinga-wm>	 RECOVERY - PHP7 jobrunner on mw1318 is OK: HTTP OK: HTTP/1.1 200 OK - 326 bytes in 6.012 second response time https://wikitech.wikimedia.org/wiki/Jobrunner
[17:47:55] <wikibugs>	 (CR) Btullis: [V: +1 C: +2] Exclude log4j_extras from the classpath for coordinators [puppet] - https://gerrit.wikimedia.org/r/752673 (https://phabricator.wikimedia.org/T297468) (owner: Btullis)
[17:48:46] <icinga-wm>	 PROBLEM - PHP7 jobrunner on mw1302 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Jobrunner
[17:50:06] <icinga-wm>	 PROBLEM - PHP7 rendering on mw1302 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Application_servers/Runbook%23PHP7_rendering
[17:50:29] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P18594 and previous config saved to /var/cache/conftool/dbconfig/20220111-175029-marostegui.json
[17:50:31] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[17:57:14] <icinga-wm>	 PROBLEM - PHP7 jobrunner on mw1318 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Jobrunner
[17:59:20] <icinga-wm>	 RECOVERY - PHP7 jobrunner on mw1318 is OK: HTTP OK: HTTP/1.1 200 OK - 323 bytes in 0.004 second response time https://wikitech.wikimedia.org/wiki/Jobrunner
[18:00:04] <jouncebot>	 chrisalbon and accraze: Your horoscope predicts another unfortunate Services – Graphoid / ORES deploy. May Zuul be (nice) with you. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20220111T1800).
[18:01:57] <wikibugs>	 (PS1) Elukey: install_server: fix netboot settings for an-test-coord1002 [puppet] - https://gerrit.wikimedia.org/r/753110 (https://phabricator.wikimedia.org/T293938)
[18:01:58] <icinga-wm>	 RECOVERY - PHP7 jobrunner on mw1302 is OK: HTTP OK: HTTP/1.1 200 OK - 324 bytes in 0.022 second response time https://wikitech.wikimedia.org/wiki/Jobrunner
[18:03:22] <icinga-wm>	 RECOVERY - PHP7 rendering on mw1302 is OK: HTTP OK: HTTP/1.1 200 OK - 324 bytes in 0.036 second response time https://wikitech.wikimedia.org/wiki/Application_servers/Runbook%23PHP7_rendering
[18:03:24] <wikibugs>	 (CR) Elukey: [C: +2] install_server: fix netboot settings for an-test-coord1002 [puppet] - https://gerrit.wikimedia.org/r/753110 (https://phabricator.wikimedia.org/T293938) (owner: Elukey)
[18:05:34] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T297191)', diff saved to https://phabricator.wikimedia.org/P18595 and previous config saved to /var/cache/conftool/dbconfig/20220111-180534-marostegui.json
[18:05:36] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
[18:05:36] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:05:38] <stashbot>	 T297191: Schema change for dropping page_restrictions.pr_user field on wmf sites - https://phabricator.wikimedia.org/T297191
[18:05:38] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
[18:05:39] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:05:39] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
[18:05:40] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:05:41] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:05:43] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
[18:05:44] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:05:47] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Depooling db1121 (T297191)', diff saved to https://phabricator.wikimedia.org/P18596 and previous config saved to /var/cache/conftool/dbconfig/20220111-180547-marostegui.json
[18:05:49] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:06:54] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1121 (T297191)', diff saved to https://phabricator.wikimedia.org/P18597 and previous config saved to /var/cache/conftool/dbconfig/20220111-180653-marostegui.json
[18:06:56] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:08:46] <logmsgbot>	 !log elukey@cumin1001 START - Cookbook sre.hosts.reimage for host an-test-coord1002.eqiad.wmnet with OS buster
[18:08:48] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:08:56] <wikibugs>	 SRE, ops-eqiad, Analytics-Clusters, DC-Ops, Patch-For-Review: (Need By: TBD) rack/setup/install an-test-coord1002 - https://phabricator.wikimedia.org/T293938 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by elukey@cumin1001 for host an-test-coord1002.eqiad.wmnet with O...
[18:11:54] <wikibugs>	 SRE, LDAP-Access-Requests: Grant Access to cn=wmf and cn=ops for Nmaphophe - https://phabricator.wikimedia.org/T298868 (cmooney) @CMacholan can you confirm and approve this request if appropriate?  thanks.
[18:12:31] <wikibugs>	 (PS1) Jgiannelos: Disable triggering tile pregeneration on OSM syncs [puppet] - https://gerrit.wikimedia.org/r/753111
[18:13:33] <wikibugs>	 (PS2) Jgiannelos: Disable triggering tile pregeneration on OSM syncs [puppet] - https://gerrit.wikimedia.org/r/753111 (https://phabricator.wikimedia.org/T298246)
[18:14:41] <wikibugs>	 SRE, LDAP-Access-Requests: Grant Access to cn=wmf and cn=ops for Nmaphophe - https://phabricator.wikimedia.org/T298868 (CMacholan) @cmooney approved
[18:18:16] <wikibugs>	 (PS2) Giuseppe Lavagetto: Rakefile: when copying over helmfile directories, resolve symlinks [deployment-charts] - https://gerrit.wikimedia.org/r/753104
[18:18:18] <wikibugs>	 (PS4) Giuseppe Lavagetto: shellbox: rationalize version handling, promote to 1.0 [deployment-charts] - https://gerrit.wikimedia.org/r/753020
[18:21:59] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P18598 and previous config saved to /var/cache/conftool/dbconfig/20220111-182158-marostegui.json
[18:22:00] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:25:00] <wikibugs>	 (CR) Giuseppe Lavagetto: [C: +2] Rakefile: when copying over helmfile directories, resolve symlinks [deployment-charts] - https://gerrit.wikimedia.org/r/753104 (owner: Giuseppe Lavagetto)
[18:28:38] <wikibugs>	 (Merged) jenkins-bot: Rakefile: when copying over helmfile directories, resolve symlinks [deployment-charts] - https://gerrit.wikimedia.org/r/753104 (owner: Giuseppe Lavagetto)
[18:28:59] <_joe_>	 !log uploaded scap 4.1.1-1 to apt T298986
[18:29:01] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:29:03] <stashbot>	 T298986: Deploy Scap version 4.1.1 - https://phabricator.wikimedia.org/T298986
[18:29:54] <wikibugs>	 SRE, ops-eqiad: msw-a8-eqiad potentially down - https://phabricator.wikimedia.org/T298869 (RobH) >>! In T298869#7610301, @Cmjohnson wrote: > The mgmt switch power led was amber, tried pulling the power and plugging back in but no change. We had a spare  wmf4921, racked it, and moved all the mgmt cables....
[18:29:56] * dancy touches his fingers together and says "eeeexcellent" like Mr Burns.
[18:32:30] <wikibugs>	 SRE, ops-eqiad: msw-a8-eqiad potentially down - https://phabricator.wikimedia.org/T298869 (Cmjohnson) We had a spare on-site thankfully but we should probably purchase a new one just in case or save a couple of the older models for emergencies.
[18:34:28] <logmsgbot>	 !log elukey@cumin1001 END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-test-coord1002.eqiad.wmnet with OS buster
[18:34:29] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:34:35] <wikibugs>	 SRE, ops-eqiad, Analytics-Clusters, DC-Ops: (Need By: TBD) rack/setup/install an-test-coord1002 - https://phabricator.wikimedia.org/T293938 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by elukey@cumin1001 for host an-test-coord1002.eqiad.wmnet with OS buster completed: - an-t...
[18:34:41] <wikibugs>	 SRE, Analytics-Legal: Options for creating internal (NDA-requiring) dashboards based on data from Google and Big search consoles - https://phabricator.wikimedia.org/T298991 (AndyRussG)
[18:34:48] <_joe_>	 dancy: I will deploy it to the mwdebug servers for now, and try a scap pull
[18:34:58] <dancy>	 👍🏾
[18:37:04] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P18599 and previous config saved to /var/cache/conftool/dbconfig/20220111-183703-marostegui.json
[18:37:05] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:39:27] <wikibugs>	 (CR) JMeybohm: [C: -1] shellbox: rationalize version handling, promote to 1.0 (1 comment) [deployment-charts] - https://gerrit.wikimedia.org/r/753020 (owner: Giuseppe Lavagetto)
[18:39:41] <wikibugs>	 SRE, ops-eqiad, Analytics-Clusters, DC-Ops: (Need By: TBD) rack/setup/install an-test-coord1002 - https://phabricator.wikimedia.org/T293938 (elukey) @Cmjohnson an-test-coord1002 done, there was an issue with your partman patch (it was targeting an-test-worker1002 instead of an-test-coord1002), bu...
[18:41:04] <_joe_>	 !log installed scap 4.1.1 on mwdebug1002 T298986, ran scap pull successfully
[18:41:06] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:41:07] <stashbot>	 T298986: Deploy Scap version 4.1.1 - https://phabricator.wikimedia.org/T298986
[18:41:21] <_joe_>	 !log also ran apt-get autoremove on mwdebug1002
[18:41:22] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:45:17] <dancy>	 thx _joe_
[18:46:03] <dancy>	 Stepping out for a bit.
[18:46:31] <wikibugs>	 (PS5) Giuseppe Lavagetto: shellbox: rationalize version handling, promote to 1.0 [deployment-charts] - https://gerrit.wikimedia.org/r/753020
[18:46:54] <wikibugs>	 (CR) Giuseppe Lavagetto: shellbox: rationalize version handling, promote to 1.0 (1 comment) [deployment-charts] - https://gerrit.wikimedia.org/r/753020 (owner: Giuseppe Lavagetto)
[18:49:06] <wikibugs>	 SRE, serviceops, Patch-For-Review: Remove mediawiki::packages::fonts from non thumbor servers - https://phabricator.wikimedia.org/T294378 (Dzahn) Resolved→Open reopening  While we have purged all the font packages that were specifically in the config, these had also pulled in more font packag...
[18:50:11] <wikibugs>	 SRE, serviceops, Patch-For-Review: Remove mediawiki::packages::fonts from non thumbor servers - https://phabricator.wikimedia.org/T294378 (Dzahn) example:  a font that was in our list, fonts-alee is properly gone: https://debmonitor.wikimedia.org/packages/fonts-alee (except on thumbor, expected)  a f...
[18:51:10] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
[18:51:11] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:51:13] <wikibugs>	 (CR) JMeybohm: [C: +1] shellbox: rationalize version handling, promote to 1.0 [deployment-charts] - https://gerrit.wikimedia.org/r/753020 (owner: Giuseppe Lavagetto)
[18:52:08] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
[18:52:08] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1121 (T297191)', diff saved to https://phabricator.wikimedia.org/P18600 and previous config saved to /var/cache/conftool/dbconfig/20220111-185208-marostegui.json
[18:52:09] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
[18:52:09] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:52:09] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
[18:52:11] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
[18:52:12] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:52:13] <stashbot>	 T297191: Schema change for dropping page_restrictions.pr_user field on wmf sites - https://phabricator.wikimedia.org/T297191
[18:52:14] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:52:15] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:52:15] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Depooling db1160 (T297191)', diff saved to https://phabricator.wikimedia.org/P18601 and previous config saved to /var/cache/conftool/dbconfig/20220111-185215-marostegui.json
[18:52:16] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:52:19] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:53:07] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
[18:53:08] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:53:14] <wikibugs>	 (PS2) Jbond: bgpalerter: add new class to configure bgpalerter [puppet] - https://gerrit.wikimedia.org/r/753102 (https://phabricator.wikimedia.org/T230600)
[18:53:16] <wikibugs>	 (PS1) Jbond: hieradata - cloud: add config for prefies [puppet] - https://gerrit.wikimedia.org/r/753117
[18:53:22] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1160 (T297191)', diff saved to https://phabricator.wikimedia.org/P18602 and previous config saved to /var/cache/conftool/dbconfig/20220111-185322-marostegui.json
[18:53:24] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:53:52] <wikibugs>	 (CR) jerkins-bot: [V: -1] bgpalerter: add new class to configure bgpalerter [puppet] - https://gerrit.wikimedia.org/r/753102 (https://phabricator.wikimedia.org/T230600) (owner: Jbond)
[18:55:27] <wikibugs>	 (PS3) Jbond: bgpalerter: add new class to configure bgpalerter [puppet] - https://gerrit.wikimedia.org/r/753102 (https://phabricator.wikimedia.org/T230600)
[18:56:04] <wikibugs>	 (CR) jerkins-bot: [V: -1] bgpalerter: add new class to configure bgpalerter [puppet] - https://gerrit.wikimedia.org/r/753102 (https://phabricator.wikimedia.org/T230600) (owner: Jbond)
[18:56:13] <wikibugs>	 SRE, Infrastructure-Foundations, netops: Eqiad Expansion - LVS Connectivity Options - https://phabricator.wikimedia.org/T292630 (RobH)
[18:57:30] <ebernhardson>	 !log clear wcqs.jnl and aliases.map for all wcqs instances T296470
[18:57:33] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:57:34] <stashbot>	 T296470: Initialize WCQS production servers - https://phabricator.wikimedia.org/T296470
[18:58:38] <logmsgbot>	 !log sukhe@cumin1001 START - Cookbook sre.ganeti.reboot-vm for VM doh1001.wikimedia.org
[18:58:40] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:59:47] <wikibugs>	 10SRE, 10Analytics-Legal: Options for creating internal (NDA-requiring) dashboards based on data from Google and Big search consoles - https://phabricator.wikimedia.org/T298991 (10RhinosF1) #Analytics-Legal says "Public project for the Analytics and Techops team for reviewing incoming requests from WMF-Legal....
[19:00:02] <icinga-wm>	 PROBLEM - BGP status on cr2-eqiad is CRITICAL: BGP CRITICAL - AS64605/IPv4: Active - Anycast https://wikitech.wikimedia.org/wiki/Network_monitoring%23BGP_status
[19:00:04] <jouncebot>	 Deploy window Pre MediaWiki train break (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20220111T1900)
[19:00:57] <logmsgbot>	 !log sukhe@cumin1001 END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM doh1001.wikimedia.org
[19:00:58] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:01:53] <logmsgbot>	 !log sukhe@cumin1001 START - Cookbook sre.ganeti.reboot-vm for VM doh1002.wikimedia.org
[19:01:54] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:02:00] <wikibugs>	 10SRE, 10Infrastructure-Foundations: Migrate eqiad Ganeti cluster to KVM machine type pc-i440fx-2.8 - https://phabricator.wikimedia.org/T294120 (10ops-monitoring-bot) VM doh1002.wikimedia.org rebooted by sukhe@cumin1001 with reason: rebooting for T294120
[19:02:54] <icinga-wm>	 PROBLEM - BGP status on cr2-eqiad is CRITICAL: BGP CRITICAL - AS64605/IPv4: Active - Anycast https://wikitech.wikimedia.org/wiki/Network_monitoring%23BGP_status
[19:04:09] <wikibugs>	 (03CR) 10Herron: [C: 03+1] site: reprovision eqiad logging cluster to opensearch (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/752756 (https://phabricator.wikimedia.org/T288621) (owner: 10Cwhite)
[19:04:27] <wikibugs>	 10SRE, 10WMF-Legal: Options for creating internal (NDA-requiring) dashboards based on data from Google and Big search consoles - https://phabricator.wikimedia.org/T298991 (10AndyRussG) As the #WMF-Legal project tag was added to this task, some general information to avoid wrong expectations: Please note that p...
[19:04:43] <logmsgbot>	 !log dduvall@deploy1002 Pruned MediaWiki: 1.38.0-wmf.9 (duration: 15m 51s)
[19:04:45] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:05:02] <logmsgbot>	 !log sukhe@cumin1001 END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM doh1002.wikimedia.org
[19:05:03] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:05:08] <wikibugs>	 (03CR) 10Herron: [C: 03+1] hiera: add opensearch production configuration (eqiad) [puppet] - 10https://gerrit.wikimedia.org/r/752755 (https://phabricator.wikimedia.org/T288621) (owner: 10Cwhite)
[19:05:34] <logmsgbot>	 !log sukhe@cumin1001 START - Cookbook sre.ganeti.reboot-vm for VM durum1001.eqiad.wmnet
[19:05:36] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:05:41] <wikibugs>	 10SRE, 10Infrastructure-Foundations: Migrate eqiad Ganeti cluster to KVM machine type pc-i440fx-2.8 - https://phabricator.wikimedia.org/T294120 (10ops-monitoring-bot) VM durum1001.eqiad.wmnet rebooted by sukhe@cumin1001 with reason: rebooting for T294120
[19:06:06] <icinga-wm>	 PROBLEM - BGP status on cr2-eqiad is CRITICAL: BGP CRITICAL - AS64605/IPv4: Active - Anycast https://wikitech.wikimedia.org/wiki/Network_monitoring%23BGP_status
[19:06:06] <wikibugs>	 10SRE: Options for creating internal (NDA-requiring) dashboards based on data from Google and Big search consoles - https://phabricator.wikimedia.org/T298991 (10AndyRussG)
[19:08:12] <wikibugs>	 10SRE: Options for creating internal (NDA-requiring) dashboards based on data from Google and Bing search consoles - https://phabricator.wikimedia.org/T298991 (10AndyRussG)
[19:08:27] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P18603 and previous config saved to /var/cache/conftool/dbconfig/20220111-190827-marostegui.json
[19:08:28] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:09:15] <wikibugs>	 10SRE: Options for creating internal (NDA-requiring) dashboards based on data from Google and Bing search consoles - https://phabricator.wikimedia.org/T298991 (10AndyRussG) Thanks much @RhinosF1! I'll reach out directly to Legal about this as specified.
[19:12:09] <wikibugs>	 (03PS4) 10Jbond: bgpalerter: add new class to configure bgpalerter [puppet] - 10https://gerrit.wikimedia.org/r/753102 (https://phabricator.wikimedia.org/T230600)
[19:12:47] <wikibugs>	 (03CR) 10jerkins-bot: [V: 04-1] bgpalerter: add new class to configure bgpalerter [puppet] - 10https://gerrit.wikimedia.org/r/753102 (https://phabricator.wikimedia.org/T230600) (owner: 10Jbond)
[19:13:22] <logmsgbot>	 !log sukhe@cumin1001 END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM durum1001.eqiad.wmnet
[19:13:23] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:13:36] <logmsgbot>	 !log sukhe@cumin1001 START - Cookbook sre.ganeti.reboot-vm for VM durum1002.eqiad.wmnet
[19:13:37] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:13:44] <wikibugs>	 10SRE, 10Infrastructure-Foundations: Migrate eqiad Ganeti cluster to KVM machine type pc-i440fx-2.8 - https://phabricator.wikimedia.org/T294120 (10ops-monitoring-bot) VM durum1002.eqiad.wmnet rebooted by sukhe@cumin1001 with reason: rebooting for T294120
[19:15:14] <icinga-wm>	 PROBLEM - BGP status on cr2-eqiad is CRITICAL: BGP CRITICAL - AS64605/IPv4: Active - Anycast https://wikitech.wikimedia.org/wiki/Network_monitoring%23BGP_status
[19:15:36] <wikibugs>	 (03PS5) 10Jbond: bgpalerter: add new class to configure bgpalerter [puppet] - 10https://gerrit.wikimedia.org/r/753102 (https://phabricator.wikimedia.org/T230600)
[19:16:47] <wikibugs>	 (03CR) 10jerkins-bot: [V: 04-1] bgpalerter: add new class to configure bgpalerter [puppet] - 10https://gerrit.wikimedia.org/r/753102 (https://phabricator.wikimedia.org/T230600) (owner: 10Jbond)
[19:17:05] <logmsgbot>	 !log sukhe@cumin1001 END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM durum1002.eqiad.wmnet
[19:17:06] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:17:58] <wikibugs>	 (03PS1) 10Urbanecm: wgGEMentorDashboardDeploymentMode should be alpha in all of beta [mediawiki-config] - 10https://gerrit.wikimedia.org/r/753119 (https://phabricator.wikimedia.org/T298993)
[19:20:16] <wikibugs>	 (03PS1) 10Dduvall: testwikis wikis to 1.38.0-wmf.17  refs T293958 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/753120
[19:20:18] <wikibugs>	 (03CR) 10Dduvall: [C: 03+2] testwikis wikis to 1.38.0-wmf.17  refs T293958 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/753120 (owner: 10Dduvall)
[19:20:22] <wikibugs>	 (03PS6) 10Jbond: bgpalerter: add new class to configure bgpalerter [puppet] - 10https://gerrit.wikimedia.org/r/753102 (https://phabricator.wikimedia.org/T230600)
[19:21:40] <wikibugs>	 (03CR) 10jerkins-bot: [V: 04-1] bgpalerter: add new class to configure bgpalerter [puppet] - 10https://gerrit.wikimedia.org/r/753102 (https://phabricator.wikimedia.org/T230600) (owner: 10Jbond)
[19:21:42] <wikibugs>	 (03Merged) 10jenkins-bot: testwikis wikis to 1.38.0-wmf.17  refs T293958 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/753120 (owner: 10Dduvall)
[19:21:46] <logmsgbot>	 !log dduvall@deploy1002 Started scap: testwikis wikis to 1.38.0-wmf.17  refs T293958
[19:21:48] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:21:49] <stashbot>	 T293958: 1.38.0-wmf.17 deployment blockers - https://phabricator.wikimedia.org/T293958
[19:22:59] <wikibugs>	 10SRE, 10Infrastructure-Foundations: Migrate eqiad Ganeti cluster to KVM machine type pc-i440fx-2.8 - https://phabricator.wikimedia.org/T294120 (10ssingh)
[19:23:01] <wikibugs>	 10SRE, 10Patch-For-Review, 10SRE Observability (FY2021/2022-Q3): Decommission old ELK5 Logstash cluster - https://phabricator.wikimedia.org/T281266 (10herron)
[19:23:24] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
[19:23:25] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:23:32] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P18604 and previous config saved to /var/cache/conftool/dbconfig/20220111-192331-marostegui.json
[19:23:33] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:23:44] <wikibugs>	 10SRE, 10Infrastructure-Foundations: Migrate eqiad Ganeti cluster to KVM machine type pc-i440fx-2.8 - https://phabricator.wikimedia.org/T294120 (10ssingh) `doh1001` was also restarted but I forgot to add the `-t` switch and that's why you ops-bot didn't catch it :) Updated the hosts.
[19:24:24] <wikibugs>	 10SRE, 10Patch-For-Review, 10SRE Observability (FY2021/2022-Q3): DX App Synthetic Monitoring App - watchmouse alert flapping due to CA expiration - https://phabricator.wikimedia.org/T292603 (10herron)
[19:24:36] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
[19:24:37] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
[19:24:37] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:24:38] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:25:03] <wikibugs>	 (03PS7) 10Jbond: bgpalerter: add new class to configure bgpalerter [puppet] - 10https://gerrit.wikimedia.org/r/753102 (https://phabricator.wikimedia.org/T230600)
[19:27:07] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
[19:27:08] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:28:39] <wikibugs>	 (03CR) 10Jbond: [C: 03+2] "merging this is currently not used but will test in cloud" [puppet] - 10https://gerrit.wikimedia.org/r/753102 (https://phabricator.wikimedia.org/T230600) (owner: 10Jbond)
[19:29:06] <wikibugs>	 (03PS8) 10Jbond: bgpalerter: add new class to configure bgpalerter [puppet] - 10https://gerrit.wikimedia.org/r/753102 (https://phabricator.wikimedia.org/T230600)
[19:30:10] <sukhe>	 !log upload pdns-recursor_4.6.0-1wm1 to apt.wm.o (buster) - T252132
[19:30:12] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:30:16] <stashbot>	 T252132: Deploy Wikidough: Experimental DNS-over-HTTPS (DoH) public resolver - https://phabricator.wikimedia.org/T252132
[19:30:17] <wikibugs>	 (03CR) 10Jbond: [C: 03+2] bgpalerter: add new class to configure bgpalerter [puppet] - 10https://gerrit.wikimedia.org/r/753102 (https://phabricator.wikimedia.org/T230600) (owner: 10Jbond)
[19:30:29] <wikibugs>	 (03CR) 10Jbond: [C: 03+2] hieradata - cloud: add config for prefies [puppet] - 10https://gerrit.wikimedia.org/r/753117 (owner: 10Jbond)
[19:30:46] <wikibugs>	 (03PS2) 10Jbond: hieradata - cloud: add config for prefies [puppet] - 10https://gerrit.wikimedia.org/r/753117
[19:34:49] <wikibugs>	 10SRE, 10SRE-Access-Requests, 10Data-Engineering, 10LDAP-Access-Requests: Create Kerberos login for Brian King (bking) - https://phabricator.wikimedia.org/T298981 (10cmooney) 05Open→03Resolved a:03cmooney Ok no problem if there is anything not working just drop me a line on irc :)
[19:35:20] <wikibugs>	 (03PS1) 10Jbond: hieradata: fix profix_options hash [puppet] - 10https://gerrit.wikimedia.org/r/753121
[19:36:22] <wikibugs>	 (03CR) 10Jbond: [C: 03+2] hieradata: fix profix_options hash [puppet] - 10https://gerrit.wikimedia.org/r/753121 (owner: 10Jbond)
[19:38:37] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1160 (T297191)', diff saved to https://phabricator.wikimedia.org/P18605 and previous config saved to /var/cache/conftool/dbconfig/20220111-193836-marostegui.json
[19:38:38] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
[19:38:39] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:38:39] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
[19:38:40] <stashbot>	 T297191: Schema change for dropping page_restrictions.pr_user field on wmf sites - https://phabricator.wikimedia.org/T297191
[19:38:41] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:38:42] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:38:44] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Depooling db1149 (T297191)', diff saved to https://phabricator.wikimedia.org/P18606 and previous config saved to /var/cache/conftool/dbconfig/20220111-193844-marostegui.json
[19:38:46] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:39:07] <wikibugs>	 (03PS1) 10Ssingh: dnsrecursor: remove redundant setting delegation-only [puppet] - 10https://gerrit.wikimedia.org/r/753122
[19:39:49] <wikibugs>	 (03CR) 10Ssingh: [V: 03+1] "PCC SUCCESS (NOOP 2): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/33198/console" [puppet] - 10https://gerrit.wikimedia.org/r/753122 (owner: 10Ssingh)
[19:39:51] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1149 (T297191)', diff saved to https://phabricator.wikimedia.org/P18607 and previous config saved to /var/cache/conftool/dbconfig/20220111-193951-marostegui.json
[19:39:54] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:39:54] <wikibugs>	 (03PS1) 10Jbond: bgpalerter: fix type definition [puppet] - 10https://gerrit.wikimedia.org/r/753123
[19:40:47] <wikibugs>	 (03CR) 10Jbond: [C: 03+2] bgpalerter: fix type definition [puppet] - 10https://gerrit.wikimedia.org/r/753123 (owner: 10Jbond)
[19:41:38] <wikibugs>	 (03CR) 10Ssingh: [V: 03+1 C: 03+2] dnsrecursor: remove redundant setting delegation-only [puppet] - 10https://gerrit.wikimedia.org/r/753122 (owner: 10Ssingh)
[19:43:13] <wikibugs>	 (03PS1) 10Jbond: hieradata: fix typo [puppet] - 10https://gerrit.wikimedia.org/r/753124
[19:44:03] <wikibugs>	 (03CR) 10Jbond: [C: 03+2] hieradata: fix typo [puppet] - 10https://gerrit.wikimedia.org/r/753124 (owner: 10Jbond)
[19:51:13] <wikibugs>	 (03PS1) 10Jbond: bgpalerter: fix prefixes content [puppet] - 10https://gerrit.wikimedia.org/r/753125
[19:52:21] <wikibugs>	 (03CR) 10Jbond: [C: 03+2] bgpalerter: fix prefixes content [puppet] - 10https://gerrit.wikimedia.org/r/753125 (owner: 10Jbond)
[19:53:49] <logmsgbot>	 !log cwhite@cumin1001 START - Cookbook sre.ganeti.reboot-vm for VM logstash1023.eqiad.wmnet
[19:53:51] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:54:56] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P18608 and previous config saved to /var/cache/conftool/dbconfig/20220111-195456-marostegui.json
[19:54:58] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:58:26] <icinga-wm>	 PROBLEM - PyBal backends health check on lvs1020 is CRITICAL: PYBAL CRITICAL - CRITICAL - kibana7_443: Servers logstash1023.eqiad.wmnet are marked down but pooled https://wikitech.wikimedia.org/wiki/PyBal
[19:59:29] <cwhite>	 ^^ is me. kibana7 in eqiad is not the active DC
[19:59:31] <logmsgbot>	 !log cwhite@cumin1001 END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM logstash1023.eqiad.wmnet
[19:59:32] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:00:05] <jouncebot>	 dduvall and twentyafterfour: I seem to be stuck in Groundhog week. Sigh. Time for (yet another) MediaWiki train - Utc-7 Version deploy. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20220111T2000).
[20:00:40] <icinga-wm>	 RECOVERY - PyBal backends health check on lvs1020 is OK: PYBAL OK - All pools are healthy https://wikitech.wikimedia.org/wiki/PyBal
[20:01:24] <logmsgbot>	 !log dduvall@deploy1002 Finished scap: testwikis wikis to 1.38.0-wmf.17  refs T293958 (duration: 39m 38s)
[20:01:27] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:01:28] <stashbot>	 T293958: 1.38.0-wmf.17 deployment blockers - https://phabricator.wikimedia.org/T293958
[20:03:00] <hauskatze>	 taavi: temp global rights are fully shipped with wmf.17?
[20:06:23] <taavi>	 hauskatze: yes, but behind a config flag
[20:06:37] <taavi>	 I expect they'll be enabled like next Monday
[20:06:58] <hauskatze>	 taavi: awesome
[20:08:24] <logmsgbot>	 !log cwhite@cumin1001 START - Cookbook sre.ganeti.reboot-vm for VM logstash1024.eqiad.wmnet
[20:08:26] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:08:30] <wikibugs>	 10SRE, 10Infrastructure-Foundations: Migrate eqiad Ganeti cluster to KVM machine type pc-i440fx-2.8 - https://phabricator.wikimedia.org/T294120 (10ops-monitoring-bot) VM logstash1024.eqiad.wmnet rebooted by cwhite@cumin1001 with reason: None
[20:09:53] <logmsgbot>	 !log cwhite@cumin1001 START - Cookbook sre.ganeti.reboot-vm for VM logstash1025.eqiad.wmnet
[20:09:54] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:10:01] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P18609 and previous config saved to /var/cache/conftool/dbconfig/20220111-201000-marostegui.json
[20:10:01] <wikibugs>	 10SRE, 10Infrastructure-Foundations: Migrate eqiad Ganeti cluster to KVM machine type pc-i440fx-2.8 - https://phabricator.wikimedia.org/T294120 (10ops-monitoring-bot) VM logstash1025.eqiad.wmnet rebooted by cwhite@cumin1001 with reason: None
[20:10:02] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:11:10] <wikibugs>	 (03PS4) 10Ssingh: dnsrecursor: add support for DoT to auth servers [puppet] - 10https://gerrit.wikimedia.org/r/752706
[20:11:49] <wikibugs>	 (03PS1) 10Jbond: bgpalerter: use absolute path for prefixes and log directory [puppet] - 10https://gerrit.wikimedia.org/r/753128
[20:12:23] <wikibugs>	 (03CR) 10Ssingh: [V: 03+1] "PCC SUCCESS (DIFF 2): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/33200/console" [puppet] - 10https://gerrit.wikimedia.org/r/752706 (owner: 10Ssingh)
[20:12:56] <wikibugs>	 (03CR) 10Jbond: [C: 03+2] bgpalerter: use absolute path for prefixes and log directory [puppet] - 10https://gerrit.wikimedia.org/r/753128 (owner: 10Jbond)
[20:16:36] <wikibugs>	 (03CR) 10Ssingh: [V: 03+1] "Merging since there is no change for the internal recursor configuration and PCC looks OK. Thank you!" [puppet] - 10https://gerrit.wikimedia.org/r/752706 (owner: 10Ssingh)
[20:16:42] <wikibugs>	 (03CR) 10Ssingh: [V: 03+1 C: 03+2] dnsrecursor: add support for DoT to auth servers [puppet] - 10https://gerrit.wikimedia.org/r/752706 (owner: 10Ssingh)
[20:17:37] <logmsgbot>	 !log cwhite@cumin1001 END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM logstash1025.eqiad.wmnet
[20:17:38] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:17:45] <logmsgbot>	 !log cwhite@cumin1001 END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM logstash1024.eqiad.wmnet
[20:17:46] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:17:58] <icinga-wm>	 PROBLEM - Check systemd state on logstash1025 is CRITICAL: CRITICAL - degraded: The following units failed: ifup@ens5.service https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[20:18:06] <icinga-wm>	 PROBLEM - Check systemd state on logstash1024 is CRITICAL: CRITICAL - degraded: The following units failed: ifup@ens5.service https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[20:18:59] <wikibugs>	 (03PS2) 10Ssingh: O:wikidough: enable DoT to auth servers [puppet] - 10https://gerrit.wikimedia.org/r/752726
[20:20:05] <wikibugs>	 (03PS11) 10Dzahn: phabricator: move vcs firewall rules to profile [puppet] - 10https://gerrit.wikimedia.org/r/751510 (https://phabricator.wikimedia.org/T114209)
[20:20:14] <wikibugs>	 (03CR) 10Dzahn: phabricator: move vcs firewall rules to profile (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/751510 (https://phabricator.wikimedia.org/T114209) (owner: 10Dzahn)
[20:21:45] <wikibugs>	 (03CR) 10Ssingh: [V: 03+1] "PCC SUCCESS (DIFF 1): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/33201/console" [puppet] - 10https://gerrit.wikimedia.org/r/752726 (owner: 10Ssingh)
[20:22:14] <wikibugs>	 (03CR) 10Ssingh: [V: 03+1 C: 03+2] O:wikidough: enable DoT to auth servers [puppet] - 10https://gerrit.wikimedia.org/r/752726 (owner: 10Ssingh)
[20:22:20] <icinga-wm>	 RECOVERY - Check systemd state on logstash1025 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[20:22:28] <icinga-wm>	 RECOVERY - Check systemd state on logstash1024 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[20:23:08] <logmsgbot>	 !log cwhite@cumin1001 START - Cookbook sre.ganeti.reboot-vm for VM logstash1030.eqiad.wmnet
[20:23:09] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:23:12] <logmsgbot>	 !log cwhite@cumin1001 START - Cookbook sre.ganeti.reboot-vm for VM logstash1031.eqiad.wmnet
[20:23:13] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:23:14] <wikibugs>	 10SRE, 10Infrastructure-Foundations: Migrate eqiad Ganeti cluster to KVM machine type pc-i440fx-2.8 - https://phabricator.wikimedia.org/T294120 (10ops-monitoring-bot) VM logstash1030.eqiad.wmnet rebooted by cwhite@cumin1001 with reason: None
[20:23:19] <wikibugs>	 10SRE, 10Infrastructure-Foundations: Migrate eqiad Ganeti cluster to KVM machine type pc-i440fx-2.8 - https://phabricator.wikimedia.org/T294120 (10ops-monitoring-bot) VM logstash1031.eqiad.wmnet rebooted by cwhite@cumin1001 with reason: None
[20:25:06] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1149 (T297191)', diff saved to https://phabricator.wikimedia.org/P18610 and previous config saved to /var/cache/conftool/dbconfig/20220111-202505-marostegui.json
[20:25:07] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
[20:25:09] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
[20:25:09] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:25:10] <stashbot>	 T297191: Schema change for dropping page_restrictions.pr_user field on wmf sites - https://phabricator.wikimedia.org/T297191
[20:25:11] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:25:13] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:25:13] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Depooling db1148 (T297191)', diff saved to https://phabricator.wikimedia.org/P18611 and previous config saved to /var/cache/conftool/dbconfig/20220111-202513-marostegui.json
[20:25:16] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:25:40] <wikibugs>	 (03PS1) 10Jbond: hieradata - clod: add pullapi endpoint [puppet] - 10https://gerrit.wikimedia.org/r/753132
[20:26:20] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1148 (T297191)', diff saved to https://phabricator.wikimedia.org/P18612 and previous config saved to /var/cache/conftool/dbconfig/20220111-202620-marostegui.json
[20:26:22] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:26:49] <logmsgbot>	 !log cwhite@cumin1001 END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM logstash1030.eqiad.wmnet
[20:26:51] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:26:52] <logmsgbot>	 !log cwhite@cumin1001 END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM logstash1031.eqiad.wmnet
[20:26:53] <wikibugs>	 (03CR) 10Jbond: [C: 03+2] hieradata - clod: add pullapi endpoint [puppet] - 10https://gerrit.wikimedia.org/r/753132 (owner: 10Jbond)
[20:26:53] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:27:31] <logmsgbot>	 !log cwhite@cumin1001 START - Cookbook sre.ganeti.reboot-vm for VM logstash1032.eqiad.wmnet
[20:27:33] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:27:37] <wikibugs>	 10SRE, 10Infrastructure-Foundations: Migrate eqiad Ganeti cluster to KVM machine type pc-i440fx-2.8 - https://phabricator.wikimedia.org/T294120 (10ops-monitoring-bot) VM logstash1032.eqiad.wmnet rebooted by cwhite@cumin1001 with reason: None
[20:27:52] <logmsgbot>	 !log cwhite@cumin1001 START - Cookbook sre.ganeti.reboot-vm for VM logstash1007.eqiad.wmnet
[20:27:53] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:27:59] <wikibugs>	 10SRE, 10Infrastructure-Foundations: Migrate eqiad Ganeti cluster to KVM machine type pc-i440fx-2.8 - https://phabricator.wikimedia.org/T294120 (10ops-monitoring-bot) VM logstash1007.eqiad.wmnet rebooted by cwhite@cumin1001 with reason: None
[20:31:03] <wikibugs>	 (03PS1) 10Jbond: bgpalerter - hierdata: use standard port [puppet] - 10https://gerrit.wikimedia.org/r/753135
[20:31:12] <logmsgbot>	 !log cwhite@cumin1001 END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM logstash1032.eqiad.wmnet
[20:31:14] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:31:52] <logmsgbot>	 !log cwhite@cumin1001 END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM logstash1007.eqiad.wmnet
[20:31:53] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:32:08] <logmsgbot>	 !log cwhite@cumin1001 START - Cookbook sre.ganeti.reboot-vm for VM logstash1008.eqiad.wmnet
[20:32:09] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:32:14] <wikibugs>	 10SRE, 10Infrastructure-Foundations: Migrate eqiad Ganeti cluster to KVM machine type pc-i440fx-2.8 - https://phabricator.wikimedia.org/T294120 (10ops-monitoring-bot) VM logstash1008.eqiad.wmnet rebooted by cwhite@cumin1001 with reason: None
[20:32:28] <wikibugs>	 (03CR) 10Jbond: [C: 03+2] bgpalerter - hierdata: use standard port [puppet] - 10https://gerrit.wikimedia.org/r/753135 (owner: 10Jbond)
[20:34:14] <wikibugs>	 10SRE, 10ops-eqiad: msw-a8-eqiad potentially down - https://phabricator.wikimedia.org/T298869 (10wiki_willy) For sure, agreed @Cmjohnson.  Once the new Netgear switches arrive for the expansion cage in April, we can hold onto some of the temp msw's we're currently using as future spares.  @ayounsi - are we goo...
[20:35:07] <wikibugs>	 (03PS1) 10Dduvall: group0 wikis to 1.38.0-wmf.17  refs T293958 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/753138
[20:35:09] <wikibugs>	 (03CR) 10Dduvall: [C: 03+2] group0 wikis to 1.38.0-wmf.17  refs T293958 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/753138 (owner: 10Dduvall)
[20:36:03] <logmsgbot>	 !log cwhite@cumin1001 END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM logstash1008.eqiad.wmnet
[20:36:04] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:36:47] <wikibugs>	 (03Merged) 10jenkins-bot: group0 wikis to 1.38.0-wmf.17  refs T293958 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/753138 (owner: 10Dduvall)
[20:38:30] <logmsgbot>	 !log dduvall@deploy1002 rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.17  refs T293958
[20:38:32] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:38:33] <stashbot>	 T293958: 1.38.0-wmf.17 deployment blockers - https://phabricator.wikimedia.org/T293958
[20:38:47] <logmsgbot>	 !log cwhite@cumin1001 START - Cookbook sre.ganeti.reboot-vm for VM logstash1009.eqiad.wmnet
[20:38:48] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:38:53] <wikibugs>	 10SRE, 10Infrastructure-Foundations: Migrate eqiad Ganeti cluster to KVM machine type pc-i440fx-2.8 - https://phabricator.wikimedia.org/T294120 (10ops-monitoring-bot) VM logstash1009.eqiad.wmnet rebooted by cwhite@cumin1001 with reason: None
[20:41:25] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P18613 and previous config saved to /var/cache/conftool/dbconfig/20220111-204124-marostegui.json
[20:41:26] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:42:42] <logmsgbot>	 !log cwhite@cumin1001 END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM logstash1009.eqiad.wmnet
[20:42:43] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:42:48] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
[20:42:49] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:43:56] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
[20:43:57] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
[20:43:57] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:43:59] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:44:35] <wikibugs>	 10SRE, 10Infrastructure-Foundations: Migrate eqiad Ganeti cluster to KVM machine type pc-i440fx-2.8 - https://phabricator.wikimedia.org/T294120 (10colewhite)
[20:45:06] <wikibugs>	 10ops-codfw, 10DC-Ops, 10Infrastructure-Foundations: Q3:(Need By: TBD) rack/setup/install ganeti2029.codfw.wmnet, ganeti2030.codfw.wmnet - https://phabricator.wikimedia.org/T298998 (10RobH)
[20:45:07] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
[20:45:08] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:45:29] <wikibugs>	 ops-codfw, DC-Ops, Infrastructure-Foundations: Q3:(Need By: TBD) rack/setup/install ganeti2029.codfw.wmnet, ganeti2030.codfw.wmnet - https://phabricator.wikimedia.org/T298998 (RobH)
[20:49:59] <wikibugs>	 (CR) Dzahn: "https://puppet-compiler.wmflabs.org/pcc-worker1001/33202/" [puppet] - https://gerrit.wikimedia.org/r/751510 (https://phabricator.wikimedia.org/T114209) (owner: Dzahn)
[20:50:19] <wikibugs>	 (CR) Dzahn: [C: +1] phabricator: move vcs firewall rules to profile [puppet] - https://gerrit.wikimedia.org/r/751510 (https://phabricator.wikimedia.org/T114209) (owner: Dzahn)
[20:54:22] <wikibugs>	 (PS1) Jbond: hieradata: add ASN name comments [puppet] - https://gerrit.wikimedia.org/r/753147
[20:56:00] <mutante>	 !log mw1418 (lowest numbered canary appserver that we use for httpbb hourly tests on cumin1001) - apt-get autoremove - removed font* and python3* packages - reason: T294378
[20:56:02] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:56:03] <stashbot>	 T294378: Remove mediawiki::packages::fonts from non thumbor servers - https://phabricator.wikimedia.org/T294378
[20:56:30] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P18614 and previous config saved to /var/cache/conftool/dbconfig/20220111-205629-marostegui.json
[20:56:31] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:59:29] <wikibugs>	 SRE, serviceops, Patch-For-Review: Remove mediawiki::packages::fonts from non thumbor servers - https://phabricator.wikimedia.org/T294378 (Dzahn) Doing the `apt-get autoremove` and accepting what it suggests also removes python packages in addition to font packages.  When running puppet afterwards th...
[21:04:40] <wikibugs>	 (CR) Jbond: "This just adds some comments to the as list generated bu bgpalerter.  Im guessing its so big and random due to the route servers." [puppet] - https://gerrit.wikimedia.org/r/753147 (owner: Jbond)
[21:05:35] <wikibugs>	 (PS2) Jdlrobson: Skip vector-2022 skin in config, not Vector skin [mediawiki-config] - https://gerrit.wikimedia.org/r/752760 (https://phabricator.wikimedia.org/T298923)
[21:05:45] <wikibugs>	 (PS3) Jdlrobson: Skip vector-2022 skin in config, not inside Vector skin codebase [mediawiki-config] - https://gerrit.wikimedia.org/r/752760 (https://phabricator.wikimedia.org/T298923)
[21:11:34] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1148 (T297191)', diff saved to https://phabricator.wikimedia.org/P18615 and previous config saved to /var/cache/conftool/dbconfig/20220111-211134-marostegui.json
[21:11:37] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[21:11:38] <stashbot>	 T297191: Schema change for dropping page_restrictions.pr_user field on wmf sites - https://phabricator.wikimedia.org/T297191
[21:15:31] <wikibugs>	 (CR) Jbond: hieradata: add ASN name comments (1 comment) [puppet] - https://gerrit.wikimedia.org/r/753147 (owner: Jbond)
[21:18:45] <wikibugs>	 SRE, ops-eqiad: msw-a8-eqiad potentially down - https://phabricator.wikimedia.org/T298869 (Cmjohnson) Open→Resolved netbox updated with msw1 connection changed the broken name to msw-a8-eqiad-broken and placed as failed for the time being.
[21:20:23] <wikibugs>	 SRE, ops-eqiad, Analytics-Clusters, DC-Ops: (Need By: TBD) rack/setup/install an-test-coord1002 - https://phabricator.wikimedia.org/T293938 (Cmjohnson) Open→Resolved Thanks @elukey resolving the task
[21:29:50] <mutante>	 !log mw1418 - apt-get remove --purge fonts*; apt-get remove --purge xfonts*; running puppet - nothing gets reinstalled and with --purge it means 'dpkg -l | grep fonts' is actually empty, not full of "rc" still - T294378
[21:29:53] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[21:29:54] <stashbot>	 T294378: Remove mediawiki::packages::fonts from non thumbor servers - https://phabricator.wikimedia.org/T294378
[22:13:48] <wikibugs>	 SRE, ops-eqiad, DC-Ops: Rack msw2-eqiad in new cage - https://phabricator.wikimedia.org/T298980 (wiki_willy) a:Jclark-ctr
[22:15:33] <tgr>	 dduvall: T298999 probably merits a rollback
[22:15:34] <stashbot>	 T298999: [regression-wmf.16] testwiki - cannot publish an edit - https://phabricator.wikimedia.org/T298999
[22:15:48] <tgr>	 (can repro outside testwiki)
[22:16:02] <dduvall>	 tgr: will do. thanks for the report
[22:16:21] <MatmaRex>	 dduvall: tgr: oops, i think we have a fix already
[22:16:41] <dduvall>	 i'll have to review the risky changes to make sure there isn't something blocking rollback
[22:16:45] <MatmaRex>	 https://gerrit.wikimedia.org/r/c/mediawiki/extensions/VisualEditor/+/753113
[22:16:50] <dduvall>	 MatmaRex: ok. what's the eta?
[22:17:03] <MatmaRex>	 but we haven't checked if the wmf branch was affected, oops
[22:17:07] <dduvall>	 ah, merged
[22:17:42] <MatmaRex>	 dduvall: yeah, just need to backport
[22:18:02] <MatmaRex>	 was caused by https://gerrit.wikimedia.org/r/c/mediawiki/extensions/VisualEditor/+/734332
[22:18:39] <wikibugs>	 (PS1) Bartosz Dziewoński: Watchlist API update: Call correct method [extensions/VisualEditor] (wmf/1.38.0-wmf.17) - https://gerrit.wikimedia.org/r/753071
[22:18:49] <MatmaRex>	 dduvall: backport is https://gerrit.wikimedia.org/r/c/mediawiki/extensions/VisualEditor/+/753071
[22:19:02] <dduvall>	 ah, you beat me to it :)
[22:19:08] <MatmaRex>	 can you merge/deploy it? sorry about the problem
[22:19:24] <dduvall>	 sure
[22:19:44] <dduvall>	 jouncebot: now
[22:19:44] <jouncebot>	 No deployments scheduled for the next 1 hour(s) and 40 minute(s)
[22:20:30] <wikibugs>	 (PS2) Bartosz Dziewoński: Watchlist API update: Call correct method [extensions/VisualEditor] (wmf/1.38.0-wmf.17) - https://gerrit.wikimedia.org/r/753071 (https://phabricator.wikimedia.org/T298999)
[22:22:11] * urbanecm waves and is around to help if needed
[22:38:08] <wikibugs>	 (CR) Dduvall: [C: +2] Watchlist API update: Call correct method [extensions/VisualEditor] (wmf/1.38.0-wmf.17) - https://gerrit.wikimedia.org/r/753071 (https://phabricator.wikimedia.org/T298999) (owner: Bartosz Dziewoński)
[22:39:47] * dduvall waves to urbanecm in appreciation
[22:40:09] <urbanecm>	 no problem :). Ping if you need me.
[22:40:21] <dduvall>	 will do. just waiting on jenkins to deploy
[22:40:33] <dduvall>	 waiting on jenkins *before* i deploy
[22:40:53] <dduvall>	 in case i confuse someone that we suddenly have continuous deployment :)
[22:41:12] <urbanecm>	 well, we sort of have :)
[22:41:18] <urbanecm>	 at master branch and beta
[22:41:32] <dduvall>	 that is true
[22:48:13] <wikibugs>	 SRE, Discovery-Search (Current work): Upgrade Cirrus Elasticsearch clusters to Debian Bullseye - https://phabricator.wikimedia.org/T289135 (bking) The output of 'run-puppet-agent' : https://phabricator.wikimedia.org/P18581
[22:56:03] <wikibugs>	 (Merged) jenkins-bot: Watchlist API update: Call correct method [extensions/VisualEditor] (wmf/1.38.0-wmf.17) - https://gerrit.wikimedia.org/r/753071 (https://phabricator.wikimedia.org/T298999) (owner: Bartosz Dziewoński)
[23:03:12] <wikibugs>	 (PS1) Samwilson: Enable Disambiguator notifications for French Wikipedia [mediawiki-config] - https://gerrit.wikimedia.org/r/753175 (https://phabricator.wikimedia.org/T293319)
[23:04:14] <dduvall>	 !log syncing backport to fix VE regression that followed testwiki/group0 deployment (cc T293958)
[23:04:17] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[23:04:18] <stashbot>	 T293958: 1.38.0-wmf.17 deployment blockers - https://phabricator.wikimedia.org/T293958
[23:05:39] <logmsgbot>	 !log dduvall@deploy1002 Synchronized php-1.38.0-wmf.17/extensions/VisualEditor/modules/ve-mw/init/targets/ve.init.mw.DesktopArticleTarget.js: Backport: [[gerrit:753071|Watchlist API update: Call correct method (T298999)]] (duration: 02m 40s)
[23:05:41] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[23:05:42] <stashbot>	 T298999: [regression-wmf.17] testwiki - cannot publish an edit - https://phabricator.wikimedia.org/T298999
[23:06:27] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
[23:06:28] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[23:06:36] <dduvall>	 tgr or urbanecm, the fix is deployed. would you mind verifying?
[23:06:46] <urbanecm>	 certainly
[23:07:14] <urbanecm>	 i confirm i can edit at testwiki
[23:07:31] <dduvall>	 yay. thank you :)
[23:12:43] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
[23:12:44] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
[23:12:45] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[23:12:47] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[23:19:00] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
[23:19:01] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[23:23:16] <wikibugs>	 SRE, Data-Engineering, Research-Backlog, WMF-Legal, User-Elukey: Enable layered data-access and sharing for a new form of collaboration - https://phabricator.wikimedia.org/T245833 (odimitrijevic)
[23:24:02] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
[23:24:04] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[23:27:32] <icinga-wm>	 PROBLEM - SSH on restbase2010.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[23:28:59] <wikibugs>	 (CR) Cwhite: [C: +2] hiera: add opensearch production configuration (eqiad) [puppet] - https://gerrit.wikimedia.org/r/752755 (https://phabricator.wikimedia.org/T288621) (owner: Cwhite)
[23:30:14] <wikibugs>	 (CR) Cwhite: [C: +2] role: add apifeatureusage role [puppet] - https://gerrit.wikimedia.org/r/747635 (https://phabricator.wikimedia.org/T297239) (owner: Cwhite)
[23:30:22] <wikibugs>	 (PS11) Cwhite: role: add apifeatureusage role [puppet] - https://gerrit.wikimedia.org/r/747635 (https://phabricator.wikimedia.org/T297239)
[23:30:44] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
[23:30:45] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
[23:30:46] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[23:30:48] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[23:37:09] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
[23:37:11] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[23:37:58] <wikibugs>	 SRE, serviceops, Wikimedia-production-error: PHP7 corruption reports in 2020-2022 (Call on wrong object, etc.) - https://phabricator.wikimedia.org/T245183 (Krinkle)
[23:38:21] <wikibugs>	 SRE, serviceops, Wikimedia-production-error: PHP7 corruption reports in 2020-2022 (Call on wrong object, etc.) - https://phabricator.wikimedia.org/T245183 (Krinkle)
[23:46:47] <wikibugs>	 SRE, ops-eqiad, DC-Ops: eqiad: Master Tracking Ticket for eqiad expansion cage - https://phabricator.wikimedia.org/T296966 (wiki_willy)
[23:48:58] <logmsgbot>	 !log bking@cumin1001 END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
[23:49:00] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[23:50:20] <wikibugs>	 SRE, ops-eqiad, DC-Ops: eqiad: Master Tracking Ticket for eqiad expansion cage - https://phabricator.wikimedia.org/T296966 (Papaul)
[23:52:05] <wikibugs>	 SRE, ops-eqiad, DC-Ops: Rack msw2-eqiad in new cage - https://phabricator.wikimedia.org/T298980 (Jclark-ctr) Relocated msw2-eqiad and completed cross connect in  new and old cage  Please confirm link before I close ticket @ayounsi.  All Pdu's and Switches are connected to msw2-eqiad but netbox is not...
[23:52:54] <wikibugs>	 SRE, serviceops, User-Ladsgroup, Wikimedia-production-error: wtp* hosts: Out of memory (allocated 39845888) (tried to allocate 131072 bytes) in OutputHandler.php - https://phabricator.wikimedia.org/T297517 (Krinkle) Open→Resolved a:Ladsgroup The immediate issue appears resolved, as ev...
[23:55:48] <wikibugs>	 (PS1) Clare Ming: Add new vector skin key to RelatedArticlesFooterAllowedSkins. [mediawiki-config] - https://gerrit.wikimedia.org/r/753187 (https://phabricator.wikimedia.org/T298916)
[23:56:12] <logmsgbot>	 !log bking@cumin1001 END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
[23:56:13] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[23:59:53] <wikibugs>	 SRE, serviceops, Wikimedia-production-error: PHP7 corruption reports in 2020-2022 (Call on wrong object, etc.) - https://phabricator.wikimedia.org/T245183 (Krinkle) From {T297316}  ` from /srv/mediawiki/php-1.38.0-wmf.9/includes/libs/objectcache/MemcachedPeclBagOStuff.php(341) #0 /srv/mediawiki/php-1...
[23:59:58] <wikibugs>	 (CR) Jdlrobson: [C: +1] Add new vector skin key to RelatedArticlesFooterAllowedSkins. [mediawiki-config] - https://gerrit.wikimedia.org/r/753187 (https://phabricator.wikimedia.org/T298916) (owner: Clare Ming)