[00:00:04] <jouncebot>	 brennen: Dear deployers, time to do the UTC late backport and config training deploy. Don't look at me like that. You signed up for it. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20220107T0000).
[00:00:05] <jouncebot>	 nn1l2: A patch you scheduled for UTC late backport and config training is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker.
[00:00:10] <nn1l2>	 hi
[00:00:25] <jhathaway>	 nn1l2: hi
[00:00:41] <nn1l2>	 I have a patch with a -1 from jenkins-bot
[00:00:42] <icinga-wm>	 PROBLEM - Query Service HTTP Port on wdqs1006 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 380 bytes in 0.001 second response time https://wikitech.wikimedia.org/wiki/Wikidata_query_service
[00:00:53] <nn1l2>	 I don't know what's wrong with jenkins
[00:01:09] <nn1l2>	 Could you please have a look
[00:01:30] <nn1l2>	 https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/752036
[00:01:50] <bd808>	 nn1l2: "Unexpected space in '100' namespace title for viwiktionary, use underscores instead" "Failed asserting that 'Phụ lục' does not contain " ""
[00:02:01] <bd808>	 https://integration.wikimedia.org/ci/job/operations-mw-config-php72-composer-test-docker/15416/console
[00:02:25] <nn1l2>	 give me a sec and I'll fix it
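The CI failure quoted above comes down to a space in a namespace title where an underscore is expected. A minimal sketch of that kind of check, in Python; the function name and the config dictionaries are hypothetical illustrations and are not the actual mediawiki-config test:

```python
def find_namespace_titles_with_spaces(extra_namespaces):
    """Return (wiki, namespace_id, title) for every title that contains a space."""
    problems = []
    for wiki, namespaces in extra_namespaces.items():
        for ns_id, title in namespaces.items():
            if " " in title:
                problems.append((wiki, ns_id, title))
    return problems


# Hypothetical config shaped like the failing patch set: ID 100 uses a space.
failing_config = {"viwiktionary": {100: "Phụ lục"}}
# The same entry written with an underscore, as the error message asks.
fixed_config = {"viwiktionary": {100: "Phụ_lục"}}

print(find_namespace_titles_with_spaces(failing_config))
print(find_namespace_titles_with_spaces(fixed_config))
```

The first call reports the offending entry; the second returns an empty list.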
[00:02:50] <icinga-wm>	 RECOVERY - Query Service HTTP Port on wdqs1006 is OK: HTTP OK: HTTP/1.1 200 OK - 448 bytes in 0.030 second response time https://wikitech.wikimedia.org/wiki/Wikidata_query_service
[00:03:40] <wikibugs>	 (PS3) 4nn1l2: viwiktionary: add namespaces “Appendix” and “Appendix talk” [mediawiki-config] - https://gerrit.wikimedia.org/r/752036 (https://phabricator.wikimedia.org/T298289)
[00:05:17] <nn1l2>	 Good to go
[00:06:17] <jhathaway>	 cdanis: docs say I should depool any servers with lag greater than an hour
[00:06:59] <jhathaway>	 where can I find docs on how to do that?
[00:07:56] <nn1l2>	 Is B&C going on?
[00:08:23] <thcipriani>	 nn1l2: I can deploy your change, looking now
[00:08:37] <nn1l2>	 thanks
[00:09:37] <wikibugs>	 (CR) Thcipriani: [C: +2] "backport" [mediawiki-config] - https://gerrit.wikimedia.org/r/752036 (https://phabricator.wikimedia.org/T298289) (owner: 4nn1l2)
[00:09:49] <jhathaway>	 I have restarted wdqs-blazegraph.service on all the laggy nodes
[00:10:20] <topranks>	 ok, that was just a short time ago was it?
[00:10:21] <wikibugs>	 (Merged) jenkins-bot: viwiktionary: add namespaces “Appendix” and “Appendix talk” [mediawiki-config] - https://gerrit.wikimedia.org/r/752036 (https://phabricator.wikimedia.org/T298289) (owner: 4nn1l2)
[00:10:30] <jhathaway>	 topranks: yup
[00:10:42] <topranks>	 ok let's see how it progresses.
[00:10:42] <jhathaway>	 lag seems to be dropping fast, according to the graphs?
[00:11:17] <cdanis>	 yeah if it is making good progress I would leave that
[00:11:24] <cdanis>	 btw 'sudo depool' on each host is the easiest way
[00:11:33] <jhathaway>	 cdanis: noted, thanks
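The rule jhathaway cites (depool any server whose lag exceeds an hour) can be sketched as a simple filter. This is an illustration only: the function name is hypothetical, the lag figures are made up, and the sketch only selects hosts; the depool itself would still be done by hand with 'sudo depool' on each host, as cdanis suggests.

```python
LAG_THRESHOLD_SECONDS = 3600  # "greater than an hour", per the docs cited in the log


def hosts_to_depool(lag_by_host, threshold=LAG_THRESHOLD_SECONDS):
    """Return the hosts whose lag in seconds is strictly above the threshold."""
    return sorted(host for host, lag in lag_by_host.items() if lag > threshold)


# Made-up lag values in seconds; the real numbers come from the graphs.
sample_lag = {"wdqs1004": 120, "wdqs1006": 5400, "wdqs1012": 9000}

for host in hosts_to_depool(sample_lag):
    print(f"{host}: lag above one hour, candidate for depool")
```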
[00:11:49] <thcipriani>	 nn1l2: namespace change is live on mwdebug1002, check please
[00:14:31] <nn1l2>	 thcipriani: I don't see anything on https://vi.wiktionary.org/wiki/%C4%90%E1%BA%B7c_bi%E1%BB%87t:Ti%E1%BB%81n_t%E1%BB%91
[00:14:53] <nn1l2>	 when I open the drop down menu
[00:15:01] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
[00:15:02] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[00:15:04] <topranks>	 It's shot down on wdqs1006/1004.  Still relatively high on wdqs1012 but it's not increasing at least
[00:15:26] <nn1l2>	 I expect to see the new namespace "Phụ lục", but I can't see it
[00:15:51] <jhathaway>	 topranks: yeah
[00:16:06] <thcipriani>	 nn1l2: indeed, checking that everything synced
[00:16:37] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
[00:16:38] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[00:16:38] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
[00:16:40] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[00:17:24] <topranks>	 definitely looking a lot healthier 
[00:17:34] <jhathaway>	 yeah I think so as well
[00:17:48] <jhathaway>	 anything else we should check before stepping away?
[00:17:58] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
[00:17:59] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[00:18:32] <topranks>	 Looking at the OpenSearch page I can see the queries from AWS still there.
[00:18:44] <topranks>	 https://logstash.wikimedia.org/app/dashboards#/view/259a4460-8e7e-11e7-9846-4f694cbd6a14?_g=h@a91e569&_a=h@e4186ec
[00:19:00] <nn1l2>	 It's working now
[00:19:05] <topranks>	 So maybe that's gonna eventually knock something out of whack again
[00:19:10] <nn1l2>	 https://vi.wiktionary.org/wiki/%C4%90%E1%BA%B7c_bi%E1%BB%87t:Thay_%C4%91%E1%BB%95i_g%E1%BA%A7n_%C4%91%C3%A2y?hidebots=1&hidecategorization=1&hideWikibase=1&limit=50&days=7&urlversion=2
[00:19:16] <topranks>	 nn1l2: great thanks for confirming :)
[00:20:44] <jhathaway>	 topranks: that logstash url doesn't load correctly, it says to use the share button?
[00:21:00] <thcipriani>	 topranks: I think nn1l2 is probably talking about different things :) (also doing a quick backport)
[00:21:13] <thcipriani>	 nn1l2: cool, yeah, seeing it, too, going live
[00:21:19] <topranks>	 ok try 2:  https://logstash.wikimedia.org/goto/22072eac35d8a1785258521fd2cc27c8
[00:21:52] <jhathaway>	 yup that worked, thanks
[00:22:35] <topranks>	 jhathaway: ok thanks, that answers my "how is this Vietnamese wiktionary somehow related to wikidata" question anyway!
[00:23:08] <topranks>	 thcipriani:  apologies pasted wrong nick :)
[00:23:48] <logmsgbot>	 !log thcipriani@deploy1002 Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:752036|viwiktionary: add namespaces "Appendix" and "Appendix talk" (T298289)]] (duration: 00m 59s)
[00:23:50] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[00:23:51] <stashbot>	 T298289: Add namespace “Appendix” and “Appendix talk” to Vietnamese Wiktionary - https://phabricator.wikimedia.org/T298289
[00:24:06] <thcipriani>	 ^ nn1l2 should be live everywhere here shortly
[00:24:30] <nn1l2>	 Yeah, it's live now
[00:24:33] <nn1l2>	 Thanks
[00:24:39] <topranks>	 jhathaway: I'm not sure of anything else we should do.  Those lag metrics and everything else that shot up have returned to better levels than they were earlier.
[00:25:11] <jhathaway>	 yeah I agree, the bad queries appear to not be coming back, at least at the moment
[00:25:38] <topranks>	 If it happens again we might need to see if we could rate-limit those incoming queries based on user-agent or something.  
[00:26:02] <topranks>	 But let's hope it stays as it is
[00:26:08] <jhathaway>	 yeah I saw that mentioned in the docs, I'm going to step away and cook dinner, but feel free to page me if something pops up again, thanks for your help!
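topranks' idea of rate-limiting incoming queries by user-agent could look roughly like a per-user-agent token bucket. The sketch below is a generic illustration under that assumption; the class name, parameters, and user-agent strings are hypothetical, and nothing here reflects how WDQS or its frontend actually throttles traffic.

```python
class UserAgentRateLimiter:
    """Token bucket per user agent: `burst` requests at once, refilled at `rate_per_second`."""

    def __init__(self, rate_per_second, burst):
        self.rate = rate_per_second
        self.burst = burst
        self._buckets = {}  # user agent -> (tokens, timestamp of last update)

    def allow(self, user_agent, now):
        """Return True if a request from `user_agent` at time `now` (seconds) may proceed."""
        tokens, last = self._buckets.get(user_agent, (self.burst, now))
        # Refill for the elapsed time, never beyond the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1:
            self._buckets[user_agent] = (tokens - 1, now)
            return True
        self._buckets[user_agent] = (tokens, now)
        return False


limiter = UserAgentRateLimiter(rate_per_second=1.0, burst=2)
for t in (0.0, 0.0, 0.0, 1.0):
    print(t, limiter.allow("example-crawler/1.0", now=t))
```

Passing the clock in as `now` keeps the sketch deterministic; a real deployment would read a monotonic clock instead.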
[00:34:22] <wikibugs>	 (PS1) Addshore: planet: add wikidatacon tag to my blog feed [puppet] - https://gerrit.wikimedia.org/r/752040
[00:38:28] <icinga-wm>	 PROBLEM - SSH on mw2252.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[00:50:00] <icinga-wm>	 PROBLEM - Prometheus jobs reduced availability on alert1001 is CRITICAL: job=sidekiq site=codfw https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[00:54:22] <icinga-wm>	 RECOVERY - Prometheus jobs reduced availability on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[01:30:34] <icinga-wm>	 PROBLEM - SSH on db2086.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[02:09:34] <icinga-wm>	 PROBLEM - Check systemd state on deneb is CRITICAL: CRITICAL - degraded: The following units failed: package_builder_Clean_up_build_directory.service https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[02:31:40] <icinga-wm>	 RECOVERY - SSH on db2086.mgmt is OK: SSH OK - OpenSSH_7.0 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[02:39:26] <wikibugs>	 SRE-swift-storage, MW-on-K8s, Shellbox, serviceops: Support large files in Shellbox - https://phabricator.wikimedia.org/T292322 (tstarling) Is the procedure the one documented at https://wikitech.wikimedia.org/wiki/Kubernetes/Deployments ?
[03:41:52] <icinga-wm>	 RECOVERY - SSH on mw2252.mgmt is OK: SSH OK - OpenSSH_7.0 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[04:22:31] <wikibugs>	 (CR) Ladsgroup: passwords: Add ladsgroup to the cloud root (1 comment) [labs/private] - https://gerrit.wikimedia.org/r/748699 (owner: Ladsgroup)
[05:39:18] <wikibugs>	 SRE-swift-storage, MW-on-K8s, Shellbox, serviceops: Support large files in Shellbox - https://phabricator.wikimedia.org/T292322 (Legoktm) Yep, you'll need to create a commit like https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/751918 for shellbox-media, +2 it, wait for the cron to...
[05:47:38] <marostegui>	 !log rename wikishared.wikimedia_editor_tasks_targets_passed on db1120 T264225
[05:47:40] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[05:47:41] <stashbot>	 T264225: Drop table wikimedia_editor_tasks_targets_passed on wmf wikis - https://phabricator.wikimedia.org/T264225
[06:08:21] <wikibugs>	 (PS1) Marostegui: Revert "dbproxy200[1,2]: Disable notifications" [puppet] - https://gerrit.wikimedia.org/r/752011
[06:08:27] <wikibugs>	 (PS1) Marostegui: Revert "dbproxy2003: Disable notifications" [puppet] - https://gerrit.wikimedia.org/r/752012
[06:11:42] <wikibugs>	 (CR) Marostegui: [C: +2] Revert "dbproxy200[1,2]: Disable notifications" [puppet] - https://gerrit.wikimedia.org/r/752011 (owner: Marostegui)
[06:11:48] <wikibugs>	 (CR) Marostegui: [C: +2] Revert "dbproxy2003: Disable notifications" [puppet] - https://gerrit.wikimedia.org/r/752012 (owner: Marostegui)
[06:14:50] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db[2076,2095].codfw.wmnet with reason: Maintenance
[06:14:51] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:14:53] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db[2076,2095].codfw.wmnet with reason: Maintenance
[06:14:53] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:15:00] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db2089.codfw.wmnet with reason: Maintenance
[06:15:01] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:15:02] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2089.codfw.wmnet with reason: Maintenance
[06:15:03] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:15:10] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db2114.codfw.wmnet with reason: Maintenance
[06:15:11] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:15:12] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2114.codfw.wmnet with reason: Maintenance
[06:15:13] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:15:19] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db2117.codfw.wmnet with reason: Maintenance
[06:15:20] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:15:20] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2117.codfw.wmnet with reason: Maintenance
[06:15:21] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:15:28] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db2124.codfw.wmnet with reason: Maintenance
[06:15:29] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:15:30] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2124.codfw.wmnet with reason: Maintenance
[06:15:31] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:15:38] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db2141.codfw.wmnet with reason: Maintenance
[06:15:39] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:15:40] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2141.codfw.wmnet with reason: Maintenance
[06:15:41] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:22:54] <icinga-wm>	 PROBLEM - SSH on db2083.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[06:41:13] <logmsgbot>	 !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
[06:41:15] <logmsgbot>	 !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
[06:41:15] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:41:17] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:41:20] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Depooling db1098:3316 (T297191)', diff saved to https://phabricator.wikimedia.org/P18409 and previous config saved to /var/cache/conftool/dbconfig/20220107-064119-marostegui.json
[06:41:22] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:41:22] <stashbot>	 T297191: Schema change for dropping page_restrictions.pr_user field on wmf sites - https://phabricator.wikimedia.org/T297191
[06:42:29] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T297191)', diff saved to https://phabricator.wikimedia.org/P18410 and previous config saved to /var/cache/conftool/dbconfig/20220107-064228-marostegui.json
[06:42:32] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:50:08] <wikibugs>	 SRE, SRE-Access-Requests, Patch-For-Review: Requesting access to the data engineering team resources for Antoine Qu'hen - https://phabricator.wikimedia.org/T298657 (odimitrijevic) Approved
[06:57:33] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P18411 and previous config saved to /var/cache/conftool/dbconfig/20220107-065733-marostegui.json
[06:57:35] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:58:46] <wikibugs>	 (CR) Marostegui: [C: +1] "testing looks good" [software] - https://gerrit.wikimedia.org/r/748726 (https://phabricator.wikimedia.org/T288235) (owner: Ladsgroup)
[07:00:45] <wikibugs>	 (CR) Ladsgroup: [C: +2] auto_schema: Automatic detection of active dc [software] - https://gerrit.wikimedia.org/r/748726 (https://phabricator.wikimedia.org/T288235) (owner: Ladsgroup)
[07:01:20] <wikibugs>	 (Merged) jenkins-bot: auto_schema: Automatic detection of active dc [software] - https://gerrit.wikimedia.org/r/748726 (https://phabricator.wikimedia.org/T288235) (owner: Ladsgroup)
[07:12:38] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P18412 and previous config saved to /var/cache/conftool/dbconfig/20220107-071237-marostegui.json
[07:12:39] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:24:02] <icinga-wm>	 RECOVERY - SSH on db2083.mgmt is OK: SSH OK - OpenSSH_7.0 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[07:27:43] <logmsgbot>	 !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T297191)', diff saved to https://phabricator.wikimedia.org/P18413 and previous config saved to /var/cache/conftool/dbconfig/20220107-072742-marostegui.json
[07:27:45] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:27:46] <stashbot>	 T297191: Schema change for dropping page_restrictions.pr_user field on wmf sites - https://phabricator.wikimedia.org/T297191
[07:56:36] <icinga-wm>	 PROBLEM - SSH on kubernetes1004.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[08:00:05] <jouncebot>	 Deploy window No deploys all day! See Deployments/Emergencies if things are broken. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20220107T0800)
[08:17:51] <wikibugs>	 (PS1) Gehel: icinga: add multiple case for Gehel in Icinga authorization [puppet] - https://gerrit.wikimedia.org/r/752130
[08:23:05] <dcausse>	 jhathaway: thanks for taking care of blazegraph! <3
[08:46:38] <icinga-wm>	 PROBLEM - SSH on mw2258.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[08:56:37] <wikibugs>	 (CR) Hashar: Refactor git-daemon use in profile::zuul::merger (1 comment) [puppet] - https://gerrit.wikimedia.org/r/751816 (owner: Ahmon Dancy)
[08:57:40] <icinga-wm>	 RECOVERY - SSH on kubernetes1004.mgmt is OK: SSH OK - OpenSSH_7.0 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[09:09:00] <wikibugs>	 (CR) JMeybohm: [C: +1] kubernetes: point to new kubestage node [dns] - https://gerrit.wikimedia.org/r/751976 (https://phabricator.wikimedia.org/T293729) (owner: AOkoth)
[09:27:52] <wikibugs>	 (CR) David Caro: [C: +2] c:kafka:broker:jmxtrans: remove unused module [puppet] - https://gerrit.wikimedia.org/r/751085 (https://phabricator.wikimedia.org/T272559) (owner: David Caro)
[09:29:21] <wikibugs>	 (CR) David Caro: [C: +2] osm: remove unused profile/role [puppet] - https://gerrit.wikimedia.org/r/751703 (https://phabricator.wikimedia.org/T272559) (owner: David Caro)
[09:30:22] <wikibugs>	 (CR) David Caro: {p,r}:gerrit:migration/migration_base: remove unused role/profile (1 comment) [puppet] - https://gerrit.wikimedia.org/r/751696 (https://phabricator.wikimedia.org/T272559) (owner: David Caro)
[09:30:31] <wikibugs>	 (Abandoned) David Caro: {p,r}:gerrit:migration/migration_base: remove unused role/profile [puppet] - https://gerrit.wikimedia.org/r/751696 (https://phabricator.wikimedia.org/T272559) (owner: David Caro)
[09:34:39] <wikibugs>	 (PS6) Jbond: exim: add the ability to silently drop senders [puppet] - https://gerrit.wikimedia.org/r/748884 (https://phabricator.wikimedia.org/T298038) (owner: JHathaway)
[09:35:16] <wikibugs>	 (CR) jerkins-bot: [V: -1] exim: add the ability to silently drop senders [puppet] - https://gerrit.wikimedia.org/r/748884 (https://phabricator.wikimedia.org/T298038) (owner: JHathaway)
[09:35:23] <wikibugs>	 Puppet, SRE, Infrastructure-Foundations, Patch-For-Review: Unused puppet resources audit, 2021 - https://phabricator.wikimedia.org/T272559 (dcaro)
[09:36:05] <wikibugs>	 (CR) Jbond: [C: +1] "LGTM" [puppet] - https://gerrit.wikimedia.org/r/748884 (https://phabricator.wikimedia.org/T298038) (owner: JHathaway)
[09:36:59] <wikibugs>	 (PS7) Jbond: exim: add the ability to silently drop senders [puppet] - https://gerrit.wikimedia.org/r/748884 (https://phabricator.wikimedia.org/T298038) (owner: JHathaway)
[09:40:02] <wikibugs>	 (CR) Jbond: [C: +1] exim: add the ability to silently drop senders [puppet] - https://gerrit.wikimedia.org/r/748884 (https://phabricator.wikimedia.org/T298038) (owner: JHathaway)
[09:43:16] <wikibugs>	 (CR) Jbond: [C: +1] "LGTM" [puppet] - https://gerrit.wikimedia.org/r/751956 (https://phabricator.wikimedia.org/T298657) (owner: Aqu)
[09:47:48] <icinga-wm>	 RECOVERY - SSH on mw2258.mgmt is OK: SSH OK - OpenSSH_7.0 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[09:48:08] <icinga-wm>	 PROBLEM - Debian mirror in sync with upstream on sodium is CRITICAL: /srv/mirrors/debian is over 14 hours old. https://wikitech.wikimedia.org/wiki/Mirrors
[09:52:00] <wikibugs>	 (CR) Btullis: [C: +2] admin: create shell user aqu, add to analytics-privatedata-users [puppet] - https://gerrit.wikimedia.org/r/751956 (https://phabricator.wikimedia.org/T298657) (owner: Aqu)
[09:53:19] <wikibugs>	 SRE, Performance-Team, Traffic, User-ema: Package and deploy Varnish 6.0.9 - https://phabricator.wikimedia.org/T298758 (ema)
[09:53:26] <wikibugs>	 SRE, Performance-Team, Traffic, User-ema: Package and deploy Varnish 6.0.9 - https://phabricator.wikimedia.org/T298758 (ema) p:Triage→Medium
[09:57:36] <wikibugs>	 (CR) David Caro: "Waiting for review from @mpopov, when he's back from paternal leave" [puppet] - https://gerrit.wikimedia.org/r/751704 (https://phabricator.wikimedia.org/T272559) (owner: David Caro)
[09:57:45] <wikibugs>	 (CR) David Caro: "Waiting for review from @mpopov, when he's back from paternal leave" [puppet] - https://gerrit.wikimedia.org/r/751710 (https://phabricator.wikimedia.org/T272559) (owner: David Caro)
[10:04:21] <wikibugs>	 (PS1) Joal: Update AQS druid datasource for new month [puppet] - https://gerrit.wikimedia.org/r/752132
[10:04:36] <joal>	 btullis: Heya - I posted that for when you have a minute --^
[10:05:18] <btullis>	 joal: Will do this morning. Thanks.
[10:19:37] <wikibugs>	 (CR) Btullis: [C: +2] Update AQS druid datasource for new month [puppet] - https://gerrit.wikimedia.org/r/752132 (owner: Joal)
[10:20:46] <wikibugs>	 (PS1) Majavah: Update wikitech etcd readonly exemption [mediawiki-config] - https://gerrit.wikimedia.org/r/752134
[10:33:16] <logmsgbot>	 !log btullis@cumin1001 START - Cookbook sre.aqs.roll-restart for AQS aqs cluster: Roll restart of all AQS's nodejs daemons.
[10:33:18] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[10:38:03] <wikibugs>	 (CR) Marostegui: "Adding Amir as he is more capable than me to review MW code :)" [mediawiki-config] - https://gerrit.wikimedia.org/r/752134 (owner: Majavah)
[10:39:52] <wikibugs>	 (CR) Jelto: [C: +2] gitlab_runner: use config template for registering new runners [puppet] - https://gerrit.wikimedia.org/r/747539 (https://phabricator.wikimedia.org/T295481) (owner: Jelto)
[10:40:38] <logmsgbot>	 !log btullis@cumin1001 END (PASS) - Cookbook sre.aqs.roll-restart (exit_code=0) for AQS aqs cluster: Roll restart of all AQS's nodejs daemons.
[10:40:39] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[10:47:15] <wikibugs>	 SRE, SRE-Access-Requests: Requesting access to the data engineering team resources for Antoine Qu'hen - https://phabricator.wikimedia.org/T298657 (BTullis)
[10:50:36] <wikibugs>	 (PS1) Jelto: gitlab_runner: fix missing url in registration command [puppet] - https://gerrit.wikimedia.org/r/752137 (https://phabricator.wikimedia.org/T295481)
[10:56:16] <wikibugs>	 (CR) Jelto: [C: +2] gitlab_runner: fix missing url in registration command [puppet] - https://gerrit.wikimedia.org/r/752137 (https://phabricator.wikimedia.org/T295481) (owner: Jelto)
[11:03:33] <wikibugs>	 (PS1) Jelto: gitlab_runner: fix missing parameters in registration command [puppet] - https://gerrit.wikimedia.org/r/752138 (https://phabricator.wikimedia.org/T295481)
[11:07:01] <wikibugs>	 (CR) Jelto: [C: +2] gitlab_runner: fix missing parameters in registration command [puppet] - https://gerrit.wikimedia.org/r/752138 (https://phabricator.wikimedia.org/T295481) (owner: Jelto)
[11:14:15] <wikibugs>	 (PS1) RhinosF1: Revert "Use strict equality when safe to do so" [extensions/Flow] (wmf/1.38.0-wmf.16) - https://gerrit.wikimedia.org/r/752014
[11:14:41] <RhinosF1>	 taavi, kostajh: ^
[11:16:55] <wikibugs>	 SRE, SRE-Access-Requests: Requesting access to the data engineering team resources for Antoine Qu'hen - https://phabricator.wikimedia.org/T298657 (BTullis) I have added `aqu` to the `wmf` LDAP group as per: https://wikitech.wikimedia.org/wiki/SRE/LDAP#Add_a_user_to_a_group ` btullis@mwmaint1002:~$ sudo m...
[11:18:01] <wikibugs>	 (PS2) Kosta Harlan: Revert "Use strict equality when safe to do so" [extensions/Flow] (wmf/1.38.0-wmf.16) - https://gerrit.wikimedia.org/r/752014 (https://phabricator.wikimedia.org/T298760) (owner: RhinosF1)
[11:22:00] <wikibugs>	 SRE, SRE-Access-Requests: Requesting access to the data engineering team resources for Antoine Qu'hen - https://phabricator.wikimedia.org/T298657 (BTullis) Maybe I jumped the gun here. I think that perhaps this ought to have been more correctly handled by the person on SRE clinic duty. https://wikitech.w...
[11:30:16] <zabe>	 I think T298694 is seeking an emergency deployment (it could have been a train blocker imo)
[11:30:16] <stashbot>	 T298694: ProofreadPage: zoom/pan not working in side-by-side editing mode - https://phabricator.wikimedia.org/T298694
[11:32:27] <taavi>	 zabe: I'm happy to deploy as long as the patch author is available and gets releng+sre approval
[11:34:43] <wikibugs>	 (CR) Kosta Harlan: [C: +1] Revert "Use strict equality when safe to do so" [extensions/Flow] (wmf/1.38.0-wmf.16) - https://gerrit.wikimedia.org/r/752014 (https://phabricator.wikimedia.org/T298760) (owner: RhinosF1)
[11:35:30] <wikibugs>	 (CR) Hashar: [C: +2] Revert "Use strict equality when safe to do so" [extensions/Flow] (wmf/1.38.0-wmf.16) - https://gerrit.wikimedia.org/r/752014 (https://phabricator.wikimedia.org/T298760) (owner: RhinosF1)
[11:37:06] <inductiveload>	 hello! I'm not the patch author, but I am able to test it
[11:38:58] <zabe>	 hashar: could we do an emergency deployment for T298694 as well?
[11:38:58] <stashbot>	 T298694: ProofreadPage: zoom/pan not working in side-by-side editing mode - https://phabricator.wikimedia.org/T298694
[11:45:01] <wikibugs>	 (CR) Hnowlan: [C: +2] maps: correctly template swift credentials [puppet] - https://gerrit.wikimedia.org/r/751928 (https://phabricator.wikimedia.org/T292700) (owner: Hnowlan)
[11:47:39] <wikibugs>	 (PS1) Hnowlan: maps: fix incorrect variable reference [puppet] - https://gerrit.wikimedia.org/r/752140 (https://phabricator.wikimedia.org/T292700)
[11:52:03] <wikibugs>	 (Merged) jenkins-bot: Revert "Use strict equality when safe to do so" [extensions/Flow] (wmf/1.38.0-wmf.16) - https://gerrit.wikimedia.org/r/752014 (https://phabricator.wikimedia.org/T298760) (owner: RhinosF1)
[11:52:11] <hashar>	 zabe: what is the change ?
[11:52:14] <taavi>	 I'll sync that Flow patch out
[11:52:24] <hashar>	 +1
[11:52:46] <inductiveload>	 hashar: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/ProofreadPage/+/751843
[11:52:59] <hashar>	 oh it is attached to the task
[11:53:00] <hashar>	 ;D
[11:53:16] <wikibugs>	 (CR) Hashar: [C: +2] Makes sure $imgContHorizontal is always initialized [extensions/ProofreadPage] (wmf/1.38.0-wmf.16) - https://gerrit.wikimedia.org/r/751843 (https://phabricator.wikimedia.org/T298694) (owner: Tpt)
[11:53:20] <hashar>	 +2ed
[11:53:39] <inductiveload>	 it is, but now if it's the wrong one you can blame someone :-D
[11:54:48] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
[11:54:49] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:54:54] <taavi>	 tested the flow patch, syncing
[11:56:00] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
[11:56:00] <logmsgbot>	 !log taavi@deploy1002 Synchronized php-1.38.0-wmf.16/extensions/Flow: Backport: [[gerrit:752014|Revert "Use strict equality when safe to do so" (T298760)]] (duration: 01m 00s)
[11:56:01] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:56:01] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
[11:56:03] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:56:04] <stashbot>	 T298760: Flow\Exception\FlowException: A required post has not been loaded: tn9fp3z7fq89497j - https://phabricator.wikimedia.org/T298760
[11:56:05] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:56:24] <hashar>	 ;)
[11:56:30] <Tpt>	 hashar: Thank you!
[11:56:50] <Tpt>	 I'm around if you need someone to test the change
[11:57:13] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
[11:57:14] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:57:14] <taavi>	 Tpt: we need someone to test it once Jenkins is happy with it
[11:58:01] <Tpt>	 great! I have a Firefox instance around with the WikimediaDebug extension
[11:58:49] <wikibugs>	 (CR) Hnowlan: [C: +2] maps: fix incorrect variable reference [puppet] - https://gerrit.wikimedia.org/r/752140 (https://phabricator.wikimedia.org/T292700) (owner: Hnowlan)
[11:59:20] <hashar>	 https://gerrit.wikimedia.org/r/c/mediawiki/extensions/ProofreadPage/+/751843 is still in CI
[12:07:33] <wikibugs>	 (CR) Jbond: [C: +1] elasticsearch:decommission: remove unused module [puppet] - https://gerrit.wikimedia.org/r/751088 (https://phabricator.wikimedia.org/T272559) (owner: David Caro)
[12:11:29] <wikibugs>	 SRE, LDAP-Access-Requests: Grant Access to ldap/wmf for Marco_Fossati - https://phabricator.wikimedia.org/T298766 (mfossati) Does //shell access// mean regular or **production** one? I don't have the latter yet.
[12:12:12] <wikibugs>	 (Merged) jenkins-bot: Makes sure $imgContHorizontal is always initialized [extensions/ProofreadPage] (wmf/1.38.0-wmf.16) - https://gerrit.wikimedia.org/r/751843 (https://phabricator.wikimedia.org/T298694) (owner: Tpt)
[12:12:14] <taavi>	 finally
[12:12:55] <taavi>	 Tpt: inductiveload: the patch is live on mwdebug1002, could you test please?
[12:13:28] <inductiveload>	 yep that's working
[12:13:42] <taavi>	 great, syncing
[12:14:28] <hashar>	 \o/
[12:14:37] <logmsgbot>	 !log taavi@deploy1002 Synchronized php-1.38.0-wmf.16/extensions/ProofreadPage/modules/page: Backport: [[gerrit:751843|Makes sure $imgContHorizontal is always initialized (T298694)]] (duration: 00m 59s)
[12:14:39] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[12:14:40] <stashbot>	 T298694: ProofreadPage: zoom/pan not working in side-by-side editing mode - https://phabricator.wikimedia.org/T298694
[12:14:56] <wikibugs>	 (03CR) 10Jelto: [V: 03+1 C: 03+2] P:prometheus::ops: add prometheus job and ferm rules for gitlab_runner metrics [puppet] - 10https://gerrit.wikimedia.org/r/751452 (https://phabricator.wikimedia.org/T295481) (owner: 10Jelto)
[12:15:04] <taavi>	 the patch is now live
[12:15:06] <taavi>	 anything else?
[12:15:10] <inductiveload>	 ¡hola! xover, just in time
[12:15:31] <xover>	 Indeed.
[12:15:33] <inductiveload>	 not from me, thank you very much for the backport
[12:15:35] <Tpt>	 thank you!
[12:15:35] <Tpt>	 I believe everything else is fine on Wikisource
[12:16:08] <taavi>	 great
[12:17:19] <wikibugs>	 (03PS3) 10Jelto: P:prometheus::ops: add prometheus job and ferm rules for gitlab_runner metrics [puppet] - 10https://gerrit.wikimedia.org/r/751452 (https://phabricator.wikimedia.org/T295481)
[12:17:25] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
[12:17:26] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[12:18:27] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
[12:18:28] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[12:18:28] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
[12:18:30] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[12:19:52] <logmsgbot>	 !log mwdebug-deploy@deploy1002 helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
[12:19:53] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[12:20:13] <xover>	 inductiveload, Tpt: verified. all the issues I noticed / saw reported appear to be fixed.
[12:20:55] <hashar>	 taavi: thank you for the backports deployments!
[12:21:57] <wikibugs>	 10SRE, 10SRE-tools, 10DC-Ops, 10Infrastructure-Foundations, and 2 others: Allow idrac tftp fetching of firmware updates (either to existing tftp or new solution) - https://phabricator.wikimedia.org/T283771 (10jbond) While looking at  Open Manage Enterprise i noticed that it appeared to download the informa...
[12:25:58] <icinga-wm>	 PROBLEM - MediaWiki exceptions and fatals per minute for parsoid on alert1001 is CRITICAL: 408 gt 100 https://wikitech.wikimedia.org/wiki/Application_servers https://grafana.wikimedia.org/d/000000438/mediawiki-alerts?panelId=18&fullscreen&orgId=1&var-datasource=eqiad+prometheus/ops
[12:26:12] <wikibugs>	 (03CR) 10Matthias Mullie: [C: 03+1] "Other patch has been approved; this is good to go" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/747868 (https://phabricator.wikimedia.org/T297863) (owner: 10Matthias Mullie)
[12:27:04] <wikibugs>	 (03PS2) 10Matthias Mullie: Add MediaSearch profiles [mediawiki-config] - 10https://gerrit.wikimedia.org/r/747868 (https://phabricator.wikimedia.org/T297863)
[12:28:14] <icinga-wm>	 RECOVERY - MediaWiki exceptions and fatals per minute for parsoid on alert1001 is OK: (C)100 gt (W)50 gt 6 https://wikitech.wikimedia.org/wiki/Application_servers https://grafana.wikimedia.org/d/000000438/mediawiki-alerts?panelId=18&fullscreen&orgId=1&var-datasource=eqiad+prometheus/ops
[12:37:46] <wikibugs>	 (03PS1) 10Jgiannelos: Disable tilerator in all envs maps are deployed [puppet] - 10https://gerrit.wikimedia.org/r/752145 (https://phabricator.wikimedia.org/T298246)
[12:38:01] <wikibugs>	 (03PS1) 10Ssingh: hieradata: add durum cluster [puppet] - 10https://gerrit.wikimedia.org/r/752146
[12:47:49] <wikibugs>	 10SRE, 10Move-Files-To-Commons, 10Wikimedia-Extension-setup, 10Patch-For-Review, 10Wikimedia-extension-review-queue: Deploying FileExporter and FileImporter - https://phabricator.wikimedia.org/T190716 (10thiemowmde) 05Open→03Resolved Deployed to all wikis since T213425. Not a Beta feature any more si...
[13:11:29] <wikibugs>	 10SRE: Add user nmaphophe@wikimedia.org to the analytics-alerts mailing list - https://phabricator.wikimedia.org/T298770 (10ntsako)
[13:24:30] <wikibugs>	 10SRE, 10LDAP-Access-Requests: Grant Access to ldap/wmf for Marco_Fossati - https://phabricator.wikimedia.org/T298766 (10RhinosF1) Just production shell access
[13:26:38] <wikibugs>	 10SRE: Add user nmaphophe@wikimedia.org to the analytics-alerts mailing list - https://phabricator.wikimedia.org/T298770 (10Aklapper) (For the records, this is not a mailing list. It's an alias, see T289807.)
[13:26:50] <wikibugs>	 10SRE: Add user nmaphophe@wikimedia.org to the analytics-alerts mail alias - https://phabricator.wikimedia.org/T298770 (10Aklapper)
[13:45:25] <wikibugs>	 (03PS1) 10Ema: Use libunwind for backtraces [debs/varnish4] (debian-wmf) - 10https://gerrit.wikimedia.org/r/752151 (https://phabricator.wikimedia.org/T298758)
[13:56:14] <wikibugs>	 10SRE, 10Performance-Team, 10Traffic, 10Patch-For-Review, 10User-ema: Package and deploy Varnish 6.0.9 - https://phabricator.wikimedia.org/T298758 (10ema) >>! In T298758#7604333, @gerritbot wrote: > Change 752151 had a related patch set uploaded (by Ema; author: Ema): > %%%[operations/debs/varnish4@debia...
[13:56:39] <wikibugs>	 10SRE, 10Performance-Team, 10Traffic, 10Patch-For-Review, 10User-ema: Package and deploy Varnish 6.0.9 - https://phabricator.wikimedia.org/T298758 (10ema)
[14:04:33] <wikibugs>	 (03CR) 10Ladsgroup: [C: 03+1] "Looks straightforward enough to me." [mediawiki-config] - 10https://gerrit.wikimedia.org/r/752134 (owner: 10Majavah)
[14:05:40] <ema>	 !log upgrade varnish on deployment-cache-text06 to 6.0.9 T298758
[14:05:42] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:05:42] <wikibugs>	 10SRE, 10SRE-Access-Requests: Requesting access to the data engineering team resources for Antoine Qu'hen - https://phabricator.wikimedia.org/T298657 (10Ottomata) No I think any SRE can do the work; IIUC clinic duty exists to make sure things like this don't fall through the cracks.  Proceed!
[14:05:43] <stashbot>	 T298758: Package and deploy Varnish 6.0.9 - https://phabricator.wikimedia.org/T298758
[14:06:16] <wikibugs>	 (03PS2) 10Majavah: Update wikitech etcd readonly exemption [mediawiki-config] - 10https://gerrit.wikimedia.org/r/752134
[14:06:34] <wikibugs>	 (03CR) 10Majavah: Update wikitech etcd readonly exemption (031 comment) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/752134 (owner: 10Majavah)
[14:07:56] <wikibugs>	 10SRE, 10SRE-Access-Requests: Requesting access to the data engineering team resources for Antoine Qu'hen - https://phabricator.wikimedia.org/T298657 (10BTullis) p:05Triage→03Medium a:03BTullis
[14:08:14] <wikibugs>	 (03CR) 10Ladsgroup: [C: 03+1] Update wikitech etcd readonly exemption [mediawiki-config] - 10https://gerrit.wikimedia.org/r/752134 (owner: 10Majavah)
[14:09:07] <wikibugs>	 (03CR) 10Ema: [V: 03+2 C: 03+2] Use libunwind for backtraces [debs/varnish4] (debian-wmf) - 10https://gerrit.wikimedia.org/r/752151 (https://phabricator.wikimedia.org/T298758) (owner: 10Ema)
[14:10:28] <wikibugs>	 (03PS1) 10Ema: Release 6.0.9-1wm1 [debs/varnish4] (debian-wmf) - 10https://gerrit.wikimedia.org/r/752153 (https://phabricator.wikimedia.org/T293879)
[14:11:30] <wikibugs>	 10SRE, 10SRE-Access-Requests, 10Data-Engineering-Kanban: Requesting access to the data engineering team resources for Antoine Qu'hen - https://phabricator.wikimedia.org/T298657 (10BTullis)
[14:12:46] <wikibugs>	 (03PS2) 10Ema: Release 6.0.9-1wm1 [debs/varnish4] (debian-wmf) - 10https://gerrit.wikimedia.org/r/752153 (https://phabricator.wikimedia.org/T298758)
[14:14:17] <wikibugs>	 10SRE, 10Performance-Team, 10Traffic, 10Patch-For-Review, 10User-ema: Package and deploy Varnish 6.0.9 - https://phabricator.wikimedia.org/T298758 (10ema) Smoke testing of 6.0.9 is fine on deployment-prep, I'll start upgrading production nodes next week.
[14:38:35] <wikibugs>	 10SRE, 10SRE-Access-Requests, 10Data-Engineering-Kanban: Requesting access to the data engineering team resources for Antoine Qu'hen - https://phabricator.wikimedia.org/T298657 (10BTullis) I have created a Kerberos principal for Antoine. ` btullis@krb1001:~$ sudo manage_principals.py get aqu get_principal: P...
[14:43:39] <wikibugs>	 (03CR) 10jerkins-bot: [V: 04-1] Release 6.0.9-1wm1 [debs/varnish4] (debian-wmf) - 10https://gerrit.wikimedia.org/r/752153 (https://phabricator.wikimedia.org/T298758) (owner: 10Ema)
[14:54:52] <icinga-wm>	 PROBLEM - SSH on mw2252.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[15:03:07] <wikibugs>	 10SRE, 10Move-Files-To-Commons, 10Wikimedia-Extension-setup, 10Patch-For-Review, 10Wikimedia-extension-review-queue: Deploy FileExporter and FileImporter to group0 - https://phabricator.wikimedia.org/T195370 (10thiemowmde)
[15:04:36] <wikibugs>	 10SRE, 10Move-Files-To-Commons, 10Wikimedia-Extension-setup, 10Patch-For-Review, 10Wikimedia-extension-review-queue: Deploying FileExporter and FileImporter - https://phabricator.wikimedia.org/T190716 (10thiemowmde)
[15:08:19] <ottomata>	 !log creating mediainfo-streaming-updater.mutation topics on kafka main-eqiad and main-codfw and setting retention to 30 days - T296470
[15:08:21] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:08:22] <stashbot>	 T296470: Initialize WCQS production servers - https://phabricator.wikimedia.org/T296470
[15:11:01] <wikibugs>	 10SRE: Adding aquhen@wikimedia.org to the analytics-alerts mailing list - https://phabricator.wikimedia.org/T298778 (10BTullis) I can verify this request. Antoine has recently joined our team.
[15:18:50] <taavi>	 !log reset email address for Ollie Shotton developer account per T298779
[15:18:52] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:18:53] <stashbot>	 T298779: Account recovery help needed for Developer account Ollie Shotton - https://phabricator.wikimedia.org/T298779
[15:25:42] <icinga-wm>	 PROBLEM - Check systemd state on doc1001 is CRITICAL: CRITICAL - degraded: The following units failed: rsync-doc-doc2001.codfw.wmnet.service https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[15:26:49] <taavi>	 ^ that seems to flap once in a while and then self-recovers after an hour or so, but I'm still curious about why it fails occasionally
[15:30:07] <wikibugs>	 10SRE, 10SRE-Access-Requests, 10Data-Engineering-Kanban: Requesting access to the data engineering team resources for Antoine Qu'hen - https://phabricator.wikimedia.org/T298657 (10BTullis) I believe that this is now complete, but feel free to respond on this ticket Antoine if anything doesn't behave as you'd...
[15:31:19] <wikibugs>	 (03PS8) 10Andrew Bogott: Added cookbook to create an nfs server [cookbooks] (wmcs) - 10https://gerrit.wikimedia.org/r/736915
[15:35:18] <wikibugs>	 (03CR) 10AOkoth: [C: 03+2] kubernetes: point to new kubestage node [dns] - 10https://gerrit.wikimedia.org/r/751976 (https://phabricator.wikimedia.org/T293729) (owner: 10AOkoth)
[15:35:26] <wikibugs>	 (03CR) 10AOkoth: [C: 03+2] kubernetes: remove kubestage1001 & kubestage1002 [puppet] - 10https://gerrit.wikimedia.org/r/751752 (https://phabricator.wikimedia.org/T293729) (owner: 10AOkoth)
[15:43:14] <wikibugs>	 10SRE, 10Two-Column-Edit-Conflict-Merge, 10Patch-For-Review: Deploy TwoColConflict extension to beta - https://phabricator.wikimedia.org/T154927 (10thiemowmde)
[15:49:26] <wikibugs>	 10SRE, 10Two-Column-Edit-Conflict-Merge, 10Patch-For-Review, 10Wikimedia-extension-review-queue: Deploy TwoColConflict extension to production - https://phabricator.wikimedia.org/T150184 (10thiemowmde)
[16:00:28] <icinga-wm>	 PROBLEM - SSH on contint1001.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[16:04:14] <wikibugs>	 (03CR) 10AOkoth: [C: 03+2] kubernetes: remove kubestage1001 & kubestage1002 [homer/public] - 10https://gerrit.wikimedia.org/r/751754 (https://phabricator.wikimedia.org/T293729) (owner: 10AOkoth)
[16:11:04] <wikibugs>	 (03CR) 10Andrew Bogott: [C: 03+2] Added cookbook to create an nfs server [cookbooks] (wmcs) - 10https://gerrit.wikimedia.org/r/736915 (owner: 10Andrew Bogott)
[16:12:43] <wikibugs>	 (03PS2) 10JHathaway: sodium: change role to insetup, to prep for decom [puppet] - 10https://gerrit.wikimedia.org/r/751990
[16:14:08] <wikibugs>	 (03Merged) 10jenkins-bot: Added cookbook to create an nfs server [cookbooks] (wmcs) - 10https://gerrit.wikimedia.org/r/736915 (owner: 10Andrew Bogott)
[16:20:44] <wikibugs>	 (03PS3) 10JHathaway: sodium: change role to spare::system, to prep for decom [puppet] - 10https://gerrit.wikimedia.org/r/751990
[16:21:07] <wikibugs>	 10SRE, 10SRE-Access-Requests, 10Data-Engineering-Kanban: Requesting access to the data engineering team resources for Antoine Qu'hen - https://phabricator.wikimedia.org/T298657 (10BTullis) @Antoine_Quhen - I notice that you haven't added yourself to the `analytics-admins` group in `data.yaml`, only the `anal...
[16:21:10] <icinga-wm>	 RECOVERY - Check systemd state on doc1001 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[16:21:24] <wikibugs>	 (03CR) 10Jbond: [C: 03+1] "LGTM" [puppet] - 10https://gerrit.wikimedia.org/r/751990 (owner: 10JHathaway)
[16:23:26] <wikibugs>	 (03PS4) 10JHathaway: sodium: change role to spare::system, to prep for decom [puppet] - 10https://gerrit.wikimedia.org/r/751990
[16:27:41] <wikibugs>	 (03PS1) 10Urbanecm: Do not delete the suppress group [mediawiki-config] - 10https://gerrit.wikimedia.org/r/752162 (https://phabricator.wikimedia.org/T112147)
[16:27:43] <wikibugs>	 (03PS1) 10Urbanecm: Remove the oversight group hack [mediawiki-config] - 10https://gerrit.wikimedia.org/r/752163 (https://phabricator.wikimedia.org/T112147)
[16:27:51] <wikibugs>	 10SRE, 10SRE-OnFire, 10Infrastructure-Foundations, 10Patch-For-Review: "User-reported connectivity errors" (NEL data) not being posted to statuspage since 1 Jan 00:00 UTC - https://phabricator.wikimedia.org/T298619 (10colewhite) Index curation is affected as well because python's datetime formatter doesn't...
[16:31:31] <wikibugs>	 (03CR) 10JHathaway: [C: 03+2] sodium: change role to spare::system, to prep for decom [puppet] - 10https://gerrit.wikimedia.org/r/751990 (owner: 10JHathaway)
[16:32:31] <wikibugs>	 10SRE-Access-Requests: Requesting access to analytics cluster for Sandra Ebele Nwachukwu - https://phabricator.wikimedia.org/T298786 (10Snwachukwu)
[16:34:10] <wikibugs>	 10SRE-Access-Requests: Requesting access to analytics cluster for Sandra Ebele Nwachukwu - https://phabricator.wikimedia.org/T298786 (10Snwachukwu)
[16:38:11] <wikibugs>	 10SRE-Access-Requests: Requesting access to analytics cluster for Sandra Ebele Nwachukwu - https://phabricator.wikimedia.org/T298786 (10Snwachukwu)
[16:40:27] <wikibugs>	 10SRE-Access-Requests: Requesting access to analytics cluster for Sandra Ebele Nwachukwu - https://phabricator.wikimedia.org/T298786 (10Snwachukwu)
[16:44:49] <wikibugs>	 10SRE-Access-Requests: Requesting access to Data Engineering team resources for Sandra Ebele Nwachukwu - https://phabricator.wikimedia.org/T298786 (10Snwachukwu)
[16:46:22] <wikibugs>	 10SRE, 10SRE-Access-Requests, 10Data-Engineering-Kanban: Requesting access to the data engineering team resources for Antoine Qu'hen - https://phabricator.wikimedia.org/T298657 (10odimitrijevic) Approved
[16:46:54] <wikibugs>	 10SRE, 10LDAP-Access-Requests: Grant Access to ldap/wmf for Marco_Fossati - https://phabricator.wikimedia.org/T298766 (10mfossati)
[16:47:11] <wikibugs>	 (03CR) 10Andrew Bogott: [C: 03+2] designate sink: fix proxy cleanup when proxy domain == project domain [puppet] - 10https://gerrit.wikimedia.org/r/751963 (https://phabricator.wikimedia.org/T298681) (owner: 10Andrew Bogott)
[16:47:51] <wikibugs>	 10SRE-swift-storage, 10serviceops, 10Patch-For-Review: Allow maps2009/maps1009 (master nodes) access thanos-swift - https://phabricator.wikimedia.org/T292700 (10Jgiannelos) 05Open→03Resolved a:03Jgiannelos
[16:57:06] <icinga-wm>	 RECOVERY - SSH on mw2252.mgmt is OK: SSH OK - OpenSSH_7.0 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[16:58:19] <wikibugs>	 10SRE, 10ops-codfw, 10DBA: Degraded RAID on db2147 - https://phabricator.wikimedia.org/T298301 (10Papaul) 05Open→03Resolved Disk replaced
[16:58:45] <wikibugs>	 (03PS1) 10David Caro: check_haproxy: improve failover output [puppet] - 10https://gerrit.wikimedia.org/r/752170
[16:59:24] <wikibugs>	 (03CR) 10jerkins-bot: [V: 04-1] check_haproxy: improve failover output [puppet] - 10https://gerrit.wikimedia.org/r/752170 (owner: 10David Caro)
[16:59:29] <wikibugs>	 (03PS2) 10David Caro: check_haproxy: improve failover output [puppet] - 10https://gerrit.wikimedia.org/r/752170
[17:00:06] <wikibugs>	 (03CR) 10jerkins-bot: [V: 04-1] check_haproxy: improve failover output [puppet] - 10https://gerrit.wikimedia.org/r/752170 (owner: 10David Caro)
[17:01:36] <wikibugs>	 (03PS3) 10David Caro: check_haproxy: improve failover output [puppet] - 10https://gerrit.wikimedia.org/r/752170
[17:05:30] <wikibugs>	 (03PS1) 10Elukey: Use a flag to deploy log4j-extras on Hadoop-related nodes [puppet] - 10https://gerrit.wikimedia.org/r/752171
[17:06:49] <wikibugs>	 (03CR) 10Elukey: [V: 03+1] "PCC SUCCESS (DIFF 3): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/33154/console" [puppet] - 10https://gerrit.wikimedia.org/r/752171 (owner: 10Elukey)
[17:11:45] <wikibugs>	 10SRE, 10SRE-Access-Requests: Requesting access to Data Engineering team resources for Sandra Ebele Nwachukwu - https://phabricator.wikimedia.org/T298786 (10Aklapper) a:05Snwachukwu→03None
[17:12:14] <wikibugs>	 10SRE, 10ops-codfw, 10Traffic: cp2029 crashed, hardware memory error - https://phabricator.wikimedia.org/T298293 (10Papaul) 05Open→03Resolved Checked the server today no error so far on DIMM B1, closing the task. if we have the problem we can re-open the task.
[17:12:16] <wikibugs>	 (03CR) 10Btullis: [C: 03+1] "LGTM" [puppet] - 10https://gerrit.wikimedia.org/r/752171 (owner: 10Elukey)
[17:13:27] <wikibugs>	 (03CR) 10Elukey: [V: 03+1 C: 03+2] Use a flag to deploy log4j-extras on Hadoop-related nodes [puppet] - 10https://gerrit.wikimedia.org/r/752171 (owner: 10Elukey)
[17:18:14] <wikibugs>	 10SRE, 10ops-codfw, 10DBA: Degraded RAID on db2147 - https://phabricator.wikimedia.org/T298301 (10Marostegui) Thanks - it is rebuilding: ` Enclosure Device ID: 32 Slot Number: 4 Drive's position: DiskGroup: 0, Span: 0, Arm: 4 Enclosure position: 1 Device Id: 4 WWN: 55cd2e41537dbf9a Sequence Number: 13 Media...
[17:31:25] <wikibugs>	 (03PS2) 10Cwhite: logstash: update weekly indexes to use weekyear pattern syntax [puppet] - 10https://gerrit.wikimedia.org/r/751765 (https://phabricator.wikimedia.org/T298619)
[17:31:27] <wikibugs>	 (03PS2) 10Cwhite: prometheus: update affected es-exporter configs to use weekyear [puppet] - 10https://gerrit.wikimedia.org/r/751766 (https://phabricator.wikimedia.org/T298619)
[17:32:02] <wikibugs>	 (03CR) 10Cwhite: logstash: update weekly indexes to use weekyear pattern syntax (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/751765 (https://phabricator.wikimedia.org/T298619) (owner: 10Cwhite)
[17:32:04] <wikibugs>	 (03CR) 10jerkins-bot: [V: 04-1] logstash: update weekly indexes to use weekyear pattern syntax [puppet] - 10https://gerrit.wikimedia.org/r/751765 (https://phabricator.wikimedia.org/T298619) (owner: 10Cwhite)
[17:35:44] <wikibugs>	 (03PS3) 10Cwhite: logstash: update weekly indexes to use weekyear pattern syntax [puppet] - 10https://gerrit.wikimedia.org/r/751765 (https://phabricator.wikimedia.org/T298619)
[17:35:46] <wikibugs>	 (03PS3) 10Cwhite: prometheus: update affected es-exporter configs to use weekyear [puppet] - 10https://gerrit.wikimedia.org/r/751766 (https://phabricator.wikimedia.org/T298619)
[17:41:05] <wikibugs>	 10SRE, 10SRE-Access-Requests: Requesting access to Data Engineering team resources for Sandra Ebele Nwachukwu - https://phabricator.wikimedia.org/T298786 (10Ottomata) Approved.
[17:41:45] <wikibugs>	 10SRE, 10SRE-Access-Requests: Requesting access to Data Engineering team resources for Sandra Ebele Nwachukwu - https://phabricator.wikimedia.org/T298786 (10odimitrijevic) Approved
[17:47:35] <wikibugs>	 (03CR) 10CDanis: [C: 03+1] "thanks!" [puppet] - 10https://gerrit.wikimedia.org/r/751765 (https://phabricator.wikimedia.org/T298619) (owner: 10Cwhite)
[17:58:17] <wikibugs>	 (03PS1) 10Andrew Bogott: wmf_sink base: fix the calculation of proxy parent zone [puppet] - 10https://gerrit.wikimedia.org/r/752181 (https://phabricator.wikimedia.org/T298681)
[18:00:39] <wikibugs>	 (03CR) 10Andrew Bogott: [C: 03+2] wmf_sink base: fix the calculation of proxy parent zone [puppet] - 10https://gerrit.wikimedia.org/r/752181 (https://phabricator.wikimedia.org/T298681) (owner: 10Andrew Bogott)
[18:01:19] <wikibugs>	 (03PS1) 10BryanDavis: wikitech: Remove password clear on block [mediawiki-config] - 10https://gerrit.wikimedia.org/r/752185
[18:08:29] <wikibugs>	 (03PS10) 10Herron: prometheus: add blackbox generic http/s static check support [puppet] - 10https://gerrit.wikimedia.org/r/747550 (https://phabricator.wikimedia.org/T292603)
[18:08:48] <wikibugs>	 (03CR) 10jerkins-bot: [V: 04-1] prometheus: add blackbox generic http/s static check support [puppet] - 10https://gerrit.wikimedia.org/r/747550 (https://phabricator.wikimedia.org/T292603) (owner: 10Herron)
[18:10:32] <wikibugs>	 (03PS11) 10Herron: prometheus: add blackbox generic http/s static check support [puppet] - 10https://gerrit.wikimedia.org/r/747550 (https://phabricator.wikimedia.org/T292603)
[18:11:09] <wikibugs>	 (03CR) 10jerkins-bot: [V: 04-1] prometheus: add blackbox generic http/s static check support [puppet] - 10https://gerrit.wikimedia.org/r/747550 (https://phabricator.wikimedia.org/T292603) (owner: 10Herron)
[18:12:27] <wikibugs>	 (03PS12) 10Herron: prometheus: add blackbox generic http/s static check support [puppet] - 10https://gerrit.wikimedia.org/r/747550 (https://phabricator.wikimedia.org/T292603)
[18:15:14] <wikibugs>	 (03PS1) 10Urbanecm: Growth: Add GEMentorDashboardDeploymentMode [mediawiki-config] - 10https://gerrit.wikimedia.org/r/752187 (https://phabricator.wikimedia.org/T298792)
[18:15:58] <wikibugs>	 (03CR) 10Reedy: [C: 03+1] wikitech: Remove password clear on block [mediawiki-config] - 10https://gerrit.wikimedia.org/r/752185 (owner: 10BryanDavis)
[18:16:35] <wikibugs>	 (03CR) 10Herron: prometheus: add blackbox generic http/s static check support (032 comments) [puppet] - 10https://gerrit.wikimedia.org/r/747550 (https://phabricator.wikimedia.org/T292603) (owner: 10Herron)
[18:27:24] <wikibugs>	 10SRE, 10vm-requests: eqiad/codfw: 2 VMs requested for apifeatureusage - https://phabricator.wikimedia.org/T298794 (10herron) p:05Triage→03Medium
[18:29:32] <logmsgbot>	 !log pt1979@cumin2002 START - Cookbook sre.hosts.reimage for host elastic2051.codfw.wmnet with OS bullseye
[18:29:34] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:29:40] <wikibugs>	 10SRE, 10ops-codfw, 10Discovery-Search, 10Patch-For-Review: Degraded RAID on elastic2051 - https://phabricator.wikimedia.org/T298674 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by pt1979@cumin2002 for host elastic2051.codfw.wmnet with OS bullseye
[18:51:47] <icinga-wm>	 RECOVERY - MegaRAID on db2147 is OK: OK: optimal, 1 logical, 10 physical, WriteBack policy https://wikitech.wikimedia.org/wiki/MegaCli%23Monitoring
[19:10:54] <wikibugs>	 10SRE, 10vm-requests: eqiad/codfw: 2 VMs requested for apifeatureusage - https://phabricator.wikimedia.org/T298794 (10herron) a:03herron
[19:11:01] <wikibugs>	 (03CR) 10Dzahn: [C: 03+2] planet: add wikidatacon tag to my blog feed [puppet] - 10https://gerrit.wikimedia.org/r/752040 (owner: 10Addshore)
[19:11:26] <logmsgbot>	 !log herron@cumin1001 START - Cookbook sre.ganeti.makevm for new host apifeatureusage1001.eqiad.wmnet
[19:11:28] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:12:13] <icinga-wm>	 RECOVERY - SSH on contint1001.mgmt is OK: SSH OK - OpenSSH_6.6 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[19:16:23] <logmsgbot>	 !log pt1979@cumin2002 END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2051.codfw.wmnet with OS bullseye
[19:16:25] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:16:31] <wikibugs>	 10SRE, 10ops-codfw, 10Discovery-Search, 10Patch-For-Review: Degraded RAID on elastic2051 - https://phabricator.wikimedia.org/T298674 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by pt1979@cumin2002 for host elastic2051.codfw.wmnet with OS bullseye executed with errors: - elastic2051...
[19:18:52] <logmsgbot>	 !log pt1979@cumin2002 START - Cookbook sre.hosts.reimage for host elastic2051.codfw.wmnet with OS bullseye
[19:18:53] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:18:59] <wikibugs>	 10SRE, 10ops-codfw, 10Discovery-Search, 10Patch-For-Review: Degraded RAID on elastic2051 - https://phabricator.wikimedia.org/T298674 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by pt1979@cumin2002 for host elastic2051.codfw.wmnet with OS bullseye
[19:21:03] <logmsgbot>	 !log herron@cumin1001 START - Cookbook sre.ganeti.makevm for new host apifeatureusage2001.codfw.wmnet
[19:21:05] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:24:59] <wikibugs>	 10SRE, 10vm-requests: eqiad/codfw: 2 VMs requested for apifeatureusage - https://phabricator.wikimedia.org/T298794 (10herron)
[19:25:02] <wikibugs>	 10SRE, 10Observability-Logging, 10Patch-For-Review: Move logstash api-feature-usage output away from v5 cluster - https://phabricator.wikimedia.org/T297239 (10herron)
[19:25:14] <wikibugs>	 10SRE, 10ops-codfw: host ps1-d1-codfw down since a long time but still monitored - https://phabricator.wikimedia.org/T298800 (10Dzahn)
[19:26:58] <icinga-wm>	 ACKNOWLEDGEMENT - Host ps1-d1-codfw is DOWN: PING CRITICAL - Packet loss = 100% daniel_zahn https://phabricator.wikimedia.org/T298800
[19:29:23] <icinga-wm>	 ACKNOWLEDGEMENT - SSH on db2083.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds daniel_zahn https://phabricator.wikimedia.org/T283582 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[19:29:23] <icinga-wm>	 ACKNOWLEDGEMENT - SSH on db2086.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds daniel_zahn https://phabricator.wikimedia.org/T283582 https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[19:30:00] <wikibugs>	 10ops-codfw, 10Continuous-Integration-Infrastructure, 10DC-Ops, 10netops: DRAC firmware upgrades codfw (was: Flapping codfw management alarm ( contint2001.mgmt/SSH is CRITICAL )) - https://phabricator.wikimedia.org/T283582 (10Dzahn) db2063 and db2068 were affected today
[19:32:30] <wikibugs>	 10ops-codfw, 10Continuous-Integration-Infrastructure, 10DC-Ops, 10netops: DRAC firmware upgrades codfw (was: Flapping codfw management alarm ( contint2001.mgmt/SSH is CRITICAL )) - https://phabricator.wikimedia.org/T283582 (10Dzahn) for the record: I have absolutely no idea why contint2001.mgmt disappeared...
[19:33:22] <wikibugs>	 10ops-codfw, 10Continuous-Integration-Infrastructure, 10DC-Ops, 10netops: DRAC firmware upgrades codfw (was: Flapping codfw management alarm ( contint2001.mgmt/SSH is CRITICAL )) - https://phabricator.wikimedia.org/T283582 (10Dzahn) a:05Dzahn→03None
[19:34:05] <wikibugs>	 10SRE, 10DC-Ops: Confirm support of PERC 750 raid controller - https://phabricator.wikimedia.org/T297913 (10RobH)
[19:35:16] <wikibugs>	 10SRE, 10DC-Ops: Confirm support of PERC 750 raid controller - https://phabricator.wikimedia.org/T297913 (10RobH)
[19:36:08] <logmsgbot>	 !log pt1979@cumin2002 END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2051.codfw.wmnet with OS bullseye
[19:36:10] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:36:18] <wikibugs>	 10SRE, 10ops-codfw, 10Discovery-Search, 10Patch-For-Review: Degraded RAID on elastic2051 - https://phabricator.wikimedia.org/T298674 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by pt1979@cumin2002 for host elastic2051.codfw.wmnet with OS bullseye executed with errors: - elastic2051...
[19:36:34] <wikibugs>	 10ops-codfw, 10Continuous-Integration-Infrastructure, 10DC-Ops, 10netops: DRAC firmware upgrades codfw (was: Flapping codfw management alarm ( contint2001.mgmt/SSH is CRITICAL )) - https://phabricator.wikimedia.org/T283582 (10Dzahn) @Papaul Do you know about contint2001.mgmt status?
[19:37:39] <wikibugs>	 10ops-codfw, 10Continuous-Integration-Infrastructure, 10DC-Ops, 10netops: DRAC firmware upgrades codfw (was: Flapping codfw management alarm ( contint2001.mgmt/SSH is CRITICAL )) - https://phabricator.wikimedia.org/T283582 (10Papaul) @Dzahn no
[19:40:15] <wikibugs>	 10SRE, 10DC-Ops: Confirm support of PERC 750 raid controller - https://phabricator.wikimedia.org/T297913 (10RobH)
[19:40:58] <Juan_90264>	 Can any deployers purge this change: https://gerrit.wikimedia.org/r/c/751530 ? The user who requested the task in Phabricator tells me he can't see the new logo from this change on the wiki; he says he still sees only the old one.
[19:41:45] <logmsgbot>	 !log herron@cumin1001 END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host apifeatureusage1001.eqiad.wmnet
[19:41:46] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:45:14] <wikibugs>	 10SRE, 10SRE-Access-Requests: Add bking as icinga user - https://phabricator.wikimedia.org/T298738 (10Dzahn) 05Open→03In progress
[19:46:06] <wikibugs>	 10SRE, 10SRE-Access-Requests: Add bking as icinga user - https://phabricator.wikimedia.org/T298738 (10Dzahn) @bking Is it working?
[19:46:40] <wikibugs>	 10ops-codfw, 10Continuous-Integration-Infrastructure, 10DC-Ops, 10netops, 10observability: DRAC firmware upgrades codfw (was: Flapping codfw management alarm ( contint2001.mgmt/SSH is CRITICAL )) - https://phabricator.wikimedia.org/T283582 (10Dzahn)
[19:47:01] <wikibugs>	 10ops-codfw, 10Continuous-Integration-Infrastructure, 10DC-Ops, 10netops, 10observability: contint2001.mgmt disappeared from Icinga (was: DRAC firmware upgrades codfw (was: Flapping codfw management alarm ( contint2001.mgmt/SSH is CRITICAL ))) - https://phabricator.wikimedia.org/T283582 (10Dzahn)
[19:49:18] <Juan_90264>	 Can any deployers purge this change: https://gerrit.wikimedia.org/r/c/751530 ?
[19:49:53] <logmsgbot>	 !log herron@cumin1001 END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host apifeatureusage2001.codfw.wmnet
[19:49:54] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:51:17] <wikibugs>	 10SRE: Adding aquhen@wikimedia.org to the analytics-alerts mailing list - https://phabricator.wikimedia.org/T298778 (10Dzahn) @BTullis You can do this self-service with your root privileges:  It's in the private puppet repo.   See:   puppetmaster1001:/srv/private/modules/privateexim/files/wikimedia.org  This is...
[19:51:59] <wikibugs>	 10SRE: Add user nmaphophe@wikimedia.org to the analytics-alerts mail alias - https://phabricator.wikimedia.org/T298770 (10Dzahn) @BTullis same as T298778#7606120
[19:53:15] <wikibugs>	 (03CR) 10Urbanecm: [C: 03+1] "LGTM" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/575390 (https://phabricator.wikimedia.org/T237890) (owner: 10Jforrester)
[19:54:28] <wikibugs>	 10SRE, 10SRE-Access-Requests: Requesting access to Data Engineering team resources for Sandra Ebele Nwachukwu - https://phabricator.wikimedia.org/T298786 (10razzi) I'm going to go ahead and grant the permissions
[19:54:55] <Juan_90264>	 Urbanecm: Can any deployers purge this change: https://gerrit.wikimedia.org/r/c/751530 ?
[19:55:17] <urbanecm>	 Juan_90264: what do you mean by "purge"?
[20:00:36] <Reedy>	 urbanecm: purgeList presumably
[20:00:40] <wikibugs>	 10SRE, 10Data-Engineering, 10Generated Data Platform, 10Platform Engineering: Import Debian package of Cassandra 3.11.11 as 'dev' version - https://phabricator.wikimedia.org/T298805 (10Eevans)
[20:01:22] <urbanecm>	 Reedy: well, unless you did it, https://en.wikipedia.org/static/images/project-logos/zhwikinews.png returns the new image on my end
[20:01:29] <Reedy>	 I didn't
[20:01:39] <Reedy>	 It's possible not all DCs do though
[20:02:41] <Juan_90264>	 On my end it also returns the new logo, but for others this is not happening
[20:02:59] <urbanecm>	 which one doesn't work?
[20:03:01] <Juan_90264>	 Yes, it's exactly the PurgeList
[20:04:41] <Juan_90264>	 Urbanecm: The user on Phabricator seems to have reported that this is happening in the zh version
[20:05:22] <wikibugs>	 10SRE, 10LDAP-Access-Requests: Grant Access to ldap/wmf for Marco_Fossati - https://phabricator.wikimedia.org/T298766 (10Dzahn) verified user via https://wikitech.wikimedia.org/wiki/SRE/Clinic_Duty/Access_requests#Verifying_WMF_developer_accounts  @MarkTraceur can you approve?
[20:05:23] <Juan_90264>	 (The zh-hans version he hasn't tested yet, but I believe it's the same for him because it's part of the same change)
[20:06:03] <urbanecm>	 well, i can do a purgeList, no problem
[20:06:12] <urbanecm>	 but keep in mind that /static is cached in browser too
[20:06:28] <Juan_90264>	 Okay
[20:06:46] <urbanecm>	 so changes to files in /static generally take weeks to take effect everywhere (until each and every browser cache expires)
[20:06:51] <urbanecm>	 Ctrl+Shift+R gets rid of that
[20:07:19] <wikibugs>	 10SRE, 10LDAP-Access-Requests: Grant Access to ldap/wmf for Marco_Fossati - https://phabricator.wikimedia.org/T298766 (10Dzahn) 05Open→03In progress
[20:08:52] <urbanecm>	 !log Purge https://en.wikipedia.org/static/images/project-logos/{zhwikinews,zhwikinews-1.5x,zhwikinews-2x,zhwikinews-hans,zhwikinews-hans-1.5x,zhwikinews-hans-2x}.png via purgeList.php
[20:08:54] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:09:26] <wikibugs>	 10SRE, 10LDAP-Access-Requests: Grant Access to <Idap/wmf> for <JVargas> - https://phabricator.wikimedia.org/T298719 (10Dzahn) to SRE on clinic duty:  could not verify user, no defined manager triggers bug in check script?   `  Username: JVargas  Verified Email: 20220106184544 Traceback (most recent call last):...
[20:09:35] <urbanecm>	 Juan_90264: done
[20:10:25] <Juan_90264>	 Perfect thanks
[20:10:43] <urbanecm>	 np
[20:10:56] <wikibugs>	 (03PS1) 10Ssingh: durum: use the correct directive to disable error_logging [puppet] - 10https://gerrit.wikimedia.org/r/752197
[20:11:36] <wikibugs>	 10SRE, 10LDAP-Access-Requests: Grant Access to <Idap/wmf> for <JVargas> - https://phabricator.wikimedia.org/T298719 (10Dzahn) 05Open→03In progress
[20:11:53] <wikibugs>	 (03CR) 10Ssingh: [V: 03+1] "PCC SUCCESS (DIFF 1): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/33155/console" [puppet] - 10https://gerrit.wikimedia.org/r/752197 (owner: 10Ssingh)
[20:14:02] <wikibugs>	 (03CR) 10Ssingh: [V: 03+1 C: 03+2] durum: use the correct directive to disable error_logging [puppet] - 10https://gerrit.wikimedia.org/r/752197 (owner: 10Ssingh)
[20:14:50] <wikibugs>	 10SRE: check_user - KeyError: 'relations' - https://phabricator.wikimedia.org/T298808 (10Dzahn)
[20:15:17] <wikibugs>	 10SRE: check_user - KeyError: 'relations' - https://phabricator.wikimedia.org/T298808 (10Dzahn)
[20:15:49] <wikibugs>	 10SRE: check_user - KeyError: 'relations' - https://phabricator.wikimedia.org/T298808 (10Dzahn)
[20:17:18] <wikibugs>	 10SRE, 10Infrastructure-Foundations: check_user - KeyError: 'relations' - https://phabricator.wikimedia.org/T298808 (10Dzahn)
[20:18:38] <wikibugs>	 (03PS1) 10Jbond: profile::logstash::gelf_relay: pass correct package name [puppet] - 10https://gerrit.wikimedia.org/r/752200
[20:18:56] <wikibugs>	 10SRE, 10Fundraising-Backlog, 10observability, 10serviceops-radar: Fundraising-Tech engineers unable to ACK icinga alerts on fr-tech host groups - https://phabricator.wikimedia.org/T298649 (10Dzahn) a:05Dzahn→03jgleeson
[20:19:45] <wikibugs>	 10SRE, 10Fundraising-Backlog, 10observability, 10serviceops-radar: Fundraising-Tech engineers unable to ACK icinga alerts on fr-tech host groups - https://phabricator.wikimedia.org/T298649 (10Dzahn) @jgleeson We can either resolve this if it works for you or keep using it for the other people that need to...
[20:20:14] <wikibugs>	 (03CR) 10jerkins-bot: [V: 04-1] profile::logstash::gelf_relay: pass correct package name [puppet] - 10https://gerrit.wikimedia.org/r/752200 (owner: 10Jbond)
[20:20:36] <wikibugs>	 (03PS2) 10Jbond: profile::logstash::gelf_relay: pass correct package name [puppet] - 10https://gerrit.wikimedia.org/r/752200
[20:21:22] <wikibugs>	 (03CR) 10Jbond: [V: 03+1] "PCC SUCCESS (DIFF 3): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/33157/console" [puppet] - 10https://gerrit.wikimedia.org/r/752200 (owner: 10Jbond)
[20:22:56] <wikibugs>	 (03CR) 10Jbond: [V: 03+1] "PCC SUCCESS (DIFF 1): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/33158/console" [puppet] - 10https://gerrit.wikimedia.org/r/752200 (owner: 10Jbond)
[20:28:25] <wikibugs>	 (03CR) 10Jbond: [V: 03+1 C: 03+2] profile::logstash::gelf_relay: pass correct package name [puppet] - 10https://gerrit.wikimedia.org/r/752200 (owner: 10Jbond)
[20:35:19] <wikibugs>	 (03CR) 10jerkins-bot: [V: 04-1] check_user: catch manager being None [puppet] - 10https://gerrit.wikimedia.org/r/752018 (https://phabricator.wikimedia.org/T298808) (owner: 10RhinosF1)
[20:35:23] <RhinosF1>	 mutante: ^
[20:36:30] <wikibugs>	 (03PS3) 10RhinosF1: check_user: catch manager being None [puppet] - 10https://gerrit.wikimedia.org/r/752018 (https://phabricator.wikimedia.org/T298808)
[20:39:41] <mutante>	 RhinosF1: thanks!
[20:39:55] <RhinosF1>	 mutante: np
[20:42:58] <wikibugs>	 (03PS1) 10Herron: install_server: add dhcp/netboot entries for apifeatureusage VMs [puppet] - 10https://gerrit.wikimedia.org/r/752207 (https://phabricator.wikimedia.org/T298794)
[20:44:18] <wikibugs>	 10SRE, 10Fundraising-Backlog, 10observability, 10serviceops-radar: Fundraising-Tech engineers unable to ACK icinga alerts on fr-tech host groups - https://phabricator.wikimedia.org/T298649 (10jgleeson) Sorry Dan I forgot to check in on this today and have finished work as I'm working from the UK.  I'll tes...
[20:44:56] <wikibugs>	 10SRE, 10ops-codfw, 10Discovery-Search, 10Patch-For-Review: Degraded RAID on elastic2051 - https://phabricator.wikimedia.org/T298674 (10Papaul) 05Open→03Resolved This is ready but according to @jbond, it still has some puppet errors but that looks like it is related to this puppet policy not being read...
[20:45:18] <wikibugs>	 (03CR) 10Herron: [C: 03+2] install_server: add dhcp/netboot entries for apifeatureusage VMs [puppet] - 10https://gerrit.wikimedia.org/r/752207 (https://phabricator.wikimedia.org/T298794) (owner: 10Herron)
[20:46:02] <wikibugs>	 10SRE, 10Fundraising-Backlog, 10observability, 10serviceops-radar: Fundraising-Tech engineers unable to ACK icinga alerts on fr-tech host groups - https://phabricator.wikimedia.org/T298649 (10Dzahn) Yes yes, there was no expectation that this happens right now or you work on the weekend. This was for Monda...
[20:46:33] <wikibugs>	 10SRE, 10Fundraising-Backlog, 10observability, 10serviceops-radar: Fundraising-Tech engineers unable to ACK icinga alerts on fr-tech host groups - https://phabricator.wikimedia.org/T298649 (10Dzahn) p:05Triage→03Medium
[20:50:19] <wikibugs>	 10SRE, 10ops-codfw, 10Discovery-Search, 10Patch-For-Review: Degraded RAID on elastic2051 - https://phabricator.wikimedia.org/T298674 (10jbond) >>! In T298674#7606290, @Papaul wrote: > This is ready but according to @jbond, it still has some puppet errors but that looks like it is related to this puppet pol...
[21:00:46] <mutante>	 searching for "Error: error" in Phabricator:  =  Query Error    why?  unknown search function "Error"
[21:01:12] <wikibugs>	 (03PS1) 10Herron: assign role::apifeatureusage::logstash to apifeatureusages[12]001 hosts [puppet] - 10https://gerrit.wikimedia.org/r/752211 (https://phabricator.wikimedia.org/T297239)
[21:01:28] <wikibugs>	 (03PS2) 10Herron: assign role::apifeatureusage::logstash to apifeatureusage[12]001 hosts [puppet] - 10https://gerrit.wikimedia.org/r/752211 (https://phabricator.wikimedia.org/T297239)
[21:01:48] <wikibugs>	 (03PS1) 10Aaron Schulz: Simplify comments and stubs for etcd-defined DB config [mediawiki-config] - 10https://gerrit.wikimedia.org/r/752212
[21:03:42] <wikibugs>	 (03PS8) 10Herron: role: add apifeatureusage role [puppet] - 10https://gerrit.wikimedia.org/r/747635 (https://phabricator.wikimedia.org/T297239) (owner: 10Cwhite)
[21:04:35] <wikibugs>	 (03PS3) 10Herron: assign role::apifeatureusage::logstash to apifeatureusage[12]001 hosts [puppet] - 10https://gerrit.wikimedia.org/r/752211 (https://phabricator.wikimedia.org/T297239)
[21:10:43] <wikibugs>	 10SRE, 10Observability-Logging, 10Patch-For-Review: Move logstash api-feature-usage output away from v5 cluster - https://phabricator.wikimedia.org/T297239 (10herron)
[21:10:52] <wikibugs>	 10SRE, 10vm-requests: eqiad/codfw: 2 VMs requested for apifeatureusage - https://phabricator.wikimedia.org/T298794 (10herron) 05Open→03Resolved `apifeatureusage[12]001.(eqiad|codfw).wmnet` are now online
[21:25:09] <icinga-wm>	 PROBLEM - Elasticsearch HTTPS for production-search-codfw on elastic2051 is CRITICAL: SSL CRITICAL - failed to connect or SSL handshake:Connection refused https://wikitech.wikimedia.org/wiki/Search
[21:36:59] <icinga-wm>	 PROBLEM - SSH on mw2254.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[21:45:47] <icinga-wm>	 PROBLEM - Check systemd state on elastic2051 is CRITICAL: CRITICAL - degraded: The following units failed: elasticsearch-disable-readahead.service,nginx.service,prometheus-elasticsearch-exporter-9200.service,prometheus-elasticsearch-exporter-9400.service,prometheus-wmf-elasticsearch-exporter-9200.service,prometheus-wmf-elasticsearch-exporter-9400.service https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[21:46:38] <wikibugs>	 (03CR) 10Cwhite: [C: 03+2] logstash: update weekly indexes to use weekyear pattern syntax [puppet] - 10https://gerrit.wikimedia.org/r/751765 (https://phabricator.wikimedia.org/T298619) (owner: 10Cwhite)
[21:48:01] <icinga-wm>	 PROBLEM - Elasticsearch HTTPS for production-search-omega-codfw on elastic2051 is CRITICAL: SSL CRITICAL - failed to connect or SSL handshake:Connection refused https://wikitech.wikimedia.org/wiki/Search
[21:50:13] <urbanecm>	 mutante: you need to put it in quotes. Then it should work.
[21:51:36] <wikibugs>	 (03CR) 10Cwhite: [C: 03+2] prometheus: update affected es-exporter configs to use weekyear [puppet] - 10https://gerrit.wikimedia.org/r/751766 (https://phabricator.wikimedia.org/T298619) (owner: 10Cwhite)
[21:53:59] <mutante>	 urbanecm: confirmed. quoting with ' works :) thanks, I was more sharing this as a curiosity
[21:54:14] <mutante>	 Error: is a magic word
[21:54:16] <urbanecm>	 thought you're looking for a fix :)
[21:54:18] <urbanecm>	 apparently it is
[21:55:09] <mutante>	 No, no, I am also going to actually use that
[21:55:17] <mutante>	 it just wasn't "Error: error" 
[21:55:28] <mutante>	 but "Error: somethingelse" from some other ticket
[21:55:58] <mutante>	 and then I noticed "Error: " is a different error from "Error: error" 
[21:58:19] <wikibugs>	 10SRE, 10SRE-Access-Requests: Requesting access to Data Engineering team resources for Sandra Ebele Nwachukwu - https://phabricator.wikimedia.org/T298786 (10razzi)
[22:00:36] <wikibugs>	 10SRE, 10SRE-Access-Requests: Requesting access to Data Engineering team resources for Sandra Ebele Nwachukwu - https://phabricator.wikimedia.org/T298786 (10razzi) According to the [access requests procedure](https://wikitech.wikimedia.org/wiki/SRE/Clinic_Duty/Access_requests#Production_shell_access), we need...
[22:05:57] <wikibugs>	 (03CR) 10Dzahn: "the change in data.yaml seems to have started some "Malformed membership for ops user ..., has additional group(s): {'deployment-ci-admins" [puppet] - 10https://gerrit.wikimedia.org/r/751166 (https://phabricator.wikimedia.org/T297673) (owner: 10Giuseppe Lavagetto)
[22:16:07] <icinga-wm>	 PROBLEM - SSH on contint1001.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[22:24:57] <wikibugs>	 10SRE, 10ops-codfw: host ps1-d1-codfw down since a long time but still monitored - https://phabricator.wikimedia.org/T298800 (10Dzahn) p:05Triage→03Medium
[22:32:33] <wikibugs>	 10SRE, 10Discovery-Search (Current work): Get familiar with ES non-prod enviroments - https://phabricator.wikimedia.org/T298817 (10bking)
[23:04:32] <RhinosF1>	 mutante: easy fix
[23:07:58] <wikibugs>	 (03CR) 10RhinosF1: "This change is ready for review." [puppet] - 10https://gerrit.wikimedia.org/r/752022 (https://phabricator.wikimedia.org/T298815) (owner: 10RhinosF1)
[23:08:07] <RhinosF1>	 mutante: ^ should do it
[23:09:36] <mutante>	 RhinosF1: cool, thank you
[23:09:53] <mutante>	 not merging it right now, but appreciate it
[23:10:46] <wikibugs>	 10SRE, 10Infrastructure-Foundations, 10Patch-For-Review, 10User-RhinosF1: check_user - KeyError: 'relations' - https://phabricator.wikimedia.org/T298808 (10RhinosF1)
[23:11:30] <RhinosF1>	 mutante: np, i also have https://gerrit.wikimedia.org/r/c/operations/puppet/+/749875 waiting review
[23:16:50] <RhinosF1>	 mutante: can I bribe you into being a phab admin and disabling H394? It's far too broad.
[23:17:05] <icinga-wm>	 RECOVERY - SSH on contint1001.mgmt is OK: SSH OK - OpenSSH_6.6 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[23:17:26] <RhinosF1>	 cc zabe 
[23:17:28] <zabe>	 I think it is meant to subscribe instead of assigning
[23:18:43] <RhinosF1>	 https://phabricator.wikimedia.org/T298818
[23:18:50] <RhinosF1>	 zabe: title says that
[23:19:28] <RhinosF1>	 zabe: it's also anytime a rule matches
[23:28:06] <wikibugs>	 (03CR) 10Zabe: cross-validate-accounts: add deployment-ci-admins to ops expected list (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/752022 (https://phabricator.wikimedia.org/T298815) (owner: 10RhinosF1)
[23:29:39] <wikibugs>	 (03PS3) 10RhinosF1: cross-validate-accounts: add deployment-ci-admins to ops expected list [puppet] - 10https://gerrit.wikimedia.org/r/752022 (https://phabricator.wikimedia.org/T298815)
[23:29:48] <RhinosF1>	 zabe: good spot
[23:37:11] <wikibugs>	 10SRE, 10Patch-For-Review, 10Service-deployment-requests: New Service Request miscweb - https://phabricator.wikimedia.org/T281538 (10Dzahn) I could debug the gzip encoding issue and at the same time test using my production image in cloud VPS in k8splay project.  `</html>dzahn@dzahn:~$ curl --compressed dzah...
[23:41:34] <wikibugs>	 10SRE, 10Infrastructure-Foundations, 10Patch-For-Review, 10User-RhinosF1: check_user - KeyError: 'relations' - https://phabricator.wikimedia.org/T298808 (10Dzahn) p:05Triage→03Medium
[23:45:48] <mutante>	 RhinosF1: for the dumps change, please ask on https://phabricator.wikimedia.org/T273585 or during Europe time
[23:46:00] <mutante>	 RhinosF1: I see H349 as already disabled (now at least)
[23:46:12] <mutante>	 kind of busy on something else
[23:47:08] <mutante>	 ah, wrong H I was looking at
[23:47:31] <mutante>	 RhinosF1: I don't have permission to disable that
[23:48:03] <mutante>	 not without some shell procedure, so I'd like to let other admins do that
[23:48:19] <mutante>	 unless it's emergency
[23:50:52] <mutante>	 please start by asking https://phabricator.wikimedia.org/p/Lens0021/ directly. It's their rule
[23:51:02] <dwisehaupt>	 /ac/ac
[23:51:36] <mutante>	 Lens0021: ^ :)