[00:03:06] !log mwdebug-deploy@deploy1002 helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
[00:03:10] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[00:04:27] (PS1) Bstorm: cloudnfs: remove the redundant nfs-common file [puppet] - https://gerrit.wikimedia.org/r/722478 (https://phabricator.wikimedia.org/T291406)
[00:04:59] !log mwdebug-deploy@deploy1002 helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
[00:05:03] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[00:09:02] (CR) Bstorm: [C: +2] cloudnfs: remove the redundant nfs-common file [puppet] - https://gerrit.wikimedia.org/r/722478 (https://phabricator.wikimedia.org/T291406) (owner: Bstorm)
[00:15:40] (CR) Krinkle: "Testing via /w/fatal-error.php in beta (action=nomethod&from=shutdown) it shows the expected text:" [puppet] - https://gerrit.wikimedia.org/r/721923 (https://phabricator.wikimedia.org/T291192) (owner: Krinkle)
[00:16:27] !log tgr@deploy1002 Synchronized php-1.37.0-wmf.23/extensions/GrowthExperiments/modules/ext.growthExperiments.StructuredTask/addlink/AddLinkArticleTarget.js: Backport: [[gerrit:722449|AddLink: Skip over headings in phrase matching (T291361)]] (duration: 00m 57s)
[00:16:32] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[00:16:34] T291361: Phrase matching: Skip over headings - https://phabricator.wikimedia.org/T291361
[00:16:44] !log Evening deploys done
[00:16:48] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[00:17:17] (CR) Krinkle: "nvm, header shows up as well (Have to enable WikimediaDebug for that to not be filtered out)." [puppet] - https://gerrit.wikimedia.org/r/721923 (https://phabricator.wikimedia.org/T291192) (owner: Krinkle)
[00:17:45] (PS1) Bstorm: cloudnfs: remove deprecated dependency on nfs-manage-binds [puppet] - https://gerrit.wikimedia.org/r/722479 (https://phabricator.wikimedia.org/T291406)
[00:17:47] (CR) Krinkle: "I would test this in beta but can't per T233134 etc" [puppet] - https://gerrit.wikimedia.org/r/721924 (https://phabricator.wikimedia.org/T291192) (owner: Krinkle)
[00:21:15] (CR) Bstorm: [C: +2] cloudnfs: remove deprecated dependency on nfs-manage-binds [puppet] - https://gerrit.wikimedia.org/r/722479 (https://phabricator.wikimedia.org/T291406) (owner: Bstorm)
[00:53:57] RECOVERY - Long running screen/tmux on gitlab1001 is OK: OK: No SCREEN or tmux processes detected. https://wikitech.wikimedia.org/wiki/Monitoring/Long_running_screens
[02:00:05] Deploy window Branching MediaWiki, extensions, skins, and vendor – See Heterogeneous_deployment/Train_deploys (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20210921T0200)
[02:03:51] !log mwdebug-deploy@deploy1002 helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
[02:03:55] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[02:05:47] !log mwdebug-deploy@deploy1002 helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
[02:05:49] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[02:06:51] (PS1) TrainBranchBot: Branch commit for wmf/1.37.0-wmf.24 [core] (wmf/1.37.0-wmf.24) - https://gerrit.wikimedia.org/r/722480
[02:06:53] (CR) TrainBranchBot: [C: +2] Branch commit for wmf/1.37.0-wmf.24 [core] (wmf/1.37.0-wmf.24) - https://gerrit.wikimedia.org/r/722480 (owner: TrainBranchBot)
[02:26:29] (Merged) jenkins-bot: Branch commit for wmf/1.37.0-wmf.24 [core] (wmf/1.37.0-wmf.24) - https://gerrit.wikimedia.org/r/722480 (owner: TrainBranchBot)
[02:33:18] !log mwdebug-deploy@deploy1002 helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
[02:33:21] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[02:35:13] !log mwdebug-deploy@deploy1002 helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
[02:35:16] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[03:01:51] (PS2) Krinkle: mediawiki: Add request ID to php-wmerrors error page [puppet] - https://gerrit.wikimedia.org/r/721923 (https://phabricator.wikimedia.org/T291192)
[03:01:53] (PS2) Krinkle: mediawiki: Set php-wmerrors reqId to "unknown" for Logstash [puppet] - https://gerrit.wikimedia.org/r/721924 (https://phabricator.wikimedia.org/T291192)
[03:01:55] (PS2) Krinkle: mediawiki: Move statsd call from php-wmerrors page to end of script [puppet] - https://gerrit.wikimedia.org/r/721925
[03:01:57] (PS1) Krinkle: mediawiki: Set "mwversion" for Logstash entries from php-wmerrors [puppet] - https://gerrit.wikimedia.org/r/722483 (https://phabricator.wikimedia.org/T253781)
[03:19:21] PROBLEM - WDQS high update lag on wdqs1004 is CRITICAL: 1.113e+05 ge 4.32e+04 https://wikitech.wikimedia.org/wiki/Wikidata_query_service/Runbook%23Update_lag https://grafana.wikimedia.org/dashboard/db/wikidata-query-service?orgId=1&panelId=8&fullscreen
[03:19:53] RECOVERY - Maps - OSM synchronization lag - codfw on alert1001 is OK: (C)2.592e+05 ge (W)1.764e+05 ge 1.74e+05 https://wikitech.wikimedia.org/wiki/Maps/Runbook https://grafana.wikimedia.org/dashboard/db/maps-performances?panelId=12&fullscreen&orgId=1
[03:34:03] PROBLEM - WDQS high update lag on wdqs1004 is CRITICAL: 1.119e+05 ge 4.32e+04 https://wikitech.wikimedia.org/wiki/Wikidata_query_service/Runbook%23Update_lag https://grafana.wikimedia.org/dashboard/db/wikidata-query-service?orgId=1&panelId=8&fullscreen
[03:40:23] PROBLEM - WDQS high update lag on wdqs1004 is CRITICAL: 1.123e+05 ge 4.32e+04 https://wikitech.wikimedia.org/wiki/Wikidata_query_service/Runbook%23Update_lag https://grafana.wikimedia.org/dashboard/db/wikidata-query-service?orgId=1&panelId=8&fullscreen
[04:07:53] SRE, LDAP-Access-Requests: Request to add Georgina Burnett to the ldap/nda group - https://phabricator.wikimedia.org/T291391 (Marostegui) a:Marostegui→Dzahn Thanks a lot Daniel!!
[04:32:08] (PS5) Marostegui: admin: Access request for Mew Ophaswongse [puppet] - https://gerrit.wikimedia.org/r/722375 (https://phabricator.wikimedia.org/T290200)
[04:33:20] (CR) Marostegui: [C: +2] admin: Access request for Mew Ophaswongse [puppet] - https://gerrit.wikimedia.org/r/722375 (https://phabricator.wikimedia.org/T290200) (owner: Marostegui)
[04:37:17] SRE, SRE-Access-Requests, Patch-For-Review: Requesting access to production shell for Mew Ophaswongse - https://phabricator.wikimedia.org/T290200 (Marostegui) In progress→Resolved a:Marostegui This has been granted - give it around 1h to make sure puppet runs everywhere. @mewoph please r...
[04:48:08] (PS2) KartikMistry: Update cxserver to 2021-09-16-130208-production [deployment-charts] - https://gerrit.wikimedia.org/r/722268
[04:48:24] * kart_ updating cxserver..
[04:53:57] (CR) KartikMistry: [C: +2] Update cxserver to 2021-09-16-130208-production [deployment-charts] - https://gerrit.wikimedia.org/r/722268 (owner: KartikMistry)
[04:57:53] (Merged) jenkins-bot: Update cxserver to 2021-09-16-130208-production [deployment-charts] - https://gerrit.wikimedia.org/r/722268 (owner: KartikMistry)
[04:58:34] !log kartik@deploy1002 helmfile [staging] Ran 'sync' command on namespace 'cxserver' for release 'staging' .
[04:58:38] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[05:03:15] !log kartik@deploy1002 helmfile [codfw] Ran 'sync' command on namespace 'cxserver' for release 'production' .
[05:03:18] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[05:12:14] !log kartik@deploy1002 helmfile [eqiad] Ran 'sync' command on namespace 'cxserver' for release 'production' .
[05:12:18] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[05:14:33] SRE, MW-on-K8s, serviceops, Patch-For-Review, Performance-Team (Radar): Benchmark performance of MediaWiki on k8s - https://phabricator.wikimedia.org/T280497 (Joe) >>! In T280497#7365189, @jijiki wrote: > @ssastry we have done some benchmarks, but non of those were parsoid urls, it would grea...
[05:15:28] SRE, MW-on-K8s, serviceops, Patch-For-Review, Performance-Team (Radar): Benchmark performance of MediaWiki on k8s - https://phabricator.wikimedia.org/T280497 (Joe) Also, parsoid *might* need more memory and we might need to adapt mediawiki-config so that we can raise php's memory limit in k8s...
[05:16:33] !log Upgraded cxserver to 2021-09-16-130208-production
[05:16:37] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[05:28:55] (CR) Urbanecm: admin: Access request for Mew Ophaswongse (1 comment) [puppet] - https://gerrit.wikimedia.org/r/722375 (https://phabricator.wikimedia.org/T290200) (owner: Marostegui)
[05:36:48] PROBLEM - dump of s6 in codfw on alert1001 is CRITICAL: Last dump for s6 at codfw (db2141.codfw.wmnet:3316) taken on 2021-09-21 04:20:13 is 102 GB, but previous one was 85 GB, a change of 20.4% https://wikitech.wikimedia.org/wiki/MariaDB/Backups%23Alerting
[05:37:10] ^ that's wikitech
[05:37:57] ACKNOWLEDGEMENT - dump of s6 in codfw on alert1001 is CRITICAL: Last dump for s6 at codfw (db2141.codfw.wmnet:3316) taken on 2021-09-21 04:20:13 is 102 GB, but previous one was 85 GB, a change of 20.4% Marostegui Wikitech migration https://wikitech.wikimedia.org/wiki/MariaDB/Backups%23Alerting
[06:24:42] (CR) Elukey: [V: +2 C: +2] Refactor kubernetes tokens and secrets [labs/private] - https://gerrit.wikimedia.org/r/721850 (owner: Elukey)
[06:25:10] (PS28) Elukey: kubernetes: add revscoring-editquality in the services configs [puppet] - https://gerrit.wikimedia.org/r/720048 (https://phabricator.wikimedia.org/T286791)
[06:25:58] SRE, SRE-Access-Requests: Updating mbinder's keys for phabricator-bulk-manager - https://phabricator.wikimedia.org/T291141 (Marostegui) Open→Resolved Resolving this - reopen if needed.
[06:27:54] (CR) Elukey: [V: +1] "PCC SUCCESS (DIFF 2): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/31146/console" [puppet] - https://gerrit.wikimedia.org/r/720048 (https://phabricator.wikimedia.org/T286791) (owner: Elukey)
[06:32:46] (CR) Elukey: [V: +1 C: -1] kubernetes: add revscoring-editquality in the services configs [puppet] - https://gerrit.wikimedia.org/r/720048 (https://phabricator.wikimedia.org/T286791) (owner: Elukey)
[06:36:02] (PS29) Elukey: kubernetes: add revscoring-editquality in the services configs [puppet] - https://gerrit.wikimedia.org/r/720048 (https://phabricator.wikimedia.org/T286791)
[06:37:29] (CR) Elukey: [V: +1] "PCC SUCCESS (DIFF 2): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/31147/console" [puppet] - https://gerrit.wikimedia.org/r/720048 (https://phabricator.wikimedia.org/T286791) (owner: Elukey)
[06:40:27] (PS1) Marostegui: valid_section.pp: Remove reference to s10 [puppet] - https://gerrit.wikimedia.org/r/722544 (https://phabricator.wikimedia.org/T167973)
[06:41:52] (CR) Marostegui: "kormat I am merging this as it only removes two commented lines. Just CCing you in case there's something else that needs to be done" [puppet] - https://gerrit.wikimedia.org/r/722544 (https://phabricator.wikimedia.org/T167973) (owner: Marostegui)
[06:50:48] (CR) Elukey: [V: +1] "Thanks a lot for the suggestions! Tried to implement all of them, but I am still a bit confused by the pcc's output. I see new/old private" [puppet] - https://gerrit.wikimedia.org/r/720048 (https://phabricator.wikimedia.org/T286791) (owner: Elukey)
[06:51:46] SRE, Infrastructure-Foundations, netops, procurement: Move AMS-IX port to 802.1q tagged and get "private vlan" added - https://phabricator.wikimedia.org/T291407 (ayounsi) a:wiki_willy→ayounsi AMS-IX NOC emailed to schedule the change, with vlan 380 for IX and 381 for NaWas.
[07:16:44] (CR) Muehlenhoff: "We had this before in the original version of apt::package_from_component, but then there were bugs caused by installation order and lack " [puppet] - https://gerrit.wikimedia.org/r/722345 (https://phabricator.wikimedia.org/T291370) (owner: Jbond)
[07:26:22] (CR) ZPapierski: Add kafka clusters' brokers to spicerack config (1 comment) [puppet] - https://gerrit.wikimedia.org/r/721857 (https://phabricator.wikimedia.org/T276469) (owner: ZPapierski)
[07:47:29] (PS3) Elukey: helmfile.d: move private dirs to the new format [deployment-charts] - https://gerrit.wikimedia.org/r/722276 (https://phabricator.wikimedia.org/T286791)
[07:47:31] (PS18) Elukey: Add revscoring-editquality as first ml-service to helmfile.d [deployment-charts] - https://gerrit.wikimedia.org/r/719128 (https://phabricator.wikimedia.org/T286791)
[07:47:33] (PS16) Elukey: Rakefile: change HELMFILE_GLOB to include ml-services [deployment-charts] - https://gerrit.wikimedia.org/r/719522 (https://phabricator.wikimedia.org/T286791)
[07:47:35] (PS10) Elukey: helmfile: add the ability to inject labels to Namespaces [deployment-charts] - https://gerrit.wikimedia.org/r/720997 (https://phabricator.wikimedia.org/T290476)
[07:47:37] (PS6) Elukey: kubeflow-kfserving: move Namespace creation to helmfile [deployment-charts] - https://gerrit.wikimedia.org/r/721268 (https://phabricator.wikimedia.org/T288829)
[07:50:28] (PS1) Muehlenhoff: Temporarily filter port 25 on mx1001 for reimage [homer/public] - https://gerrit.wikimedia.org/r/722551 (https://phabricator.wikimedia.org/T286911)
[08:02:15] (CR) David Caro: [C: +2] wmcs: fix lints [cookbooks] (wmcs) - https://gerrit.wikimedia.org/r/722282 (owner: David Caro)
[08:02:19] (CR) David Caro: [V: +2 C: +2] wmcs: fix lints [cookbooks] (wmcs) - https://gerrit.wikimedia.org/r/722282 (owner: David Caro)
[08:05:26] (CR) jerkins-bot: [V: -1] wmcs: fix lints [cookbooks] (wmcs) - https://gerrit.wikimedia.org/r/722282 (owner: David Caro)
[08:07:39] (PS2) David Caro: wmcs: fix lints [cookbooks] (wmcs) - https://gerrit.wikimedia.org/r/722282
[08:13:04] SRE, serviceops: Rebuild production Stretch images with GNUTLS/OpenSSL updates for LE issue chain update - https://phabricator.wikimedia.org/T291458 (MoritzMuehlenhoff)
[08:17:53] (CR) Ayounsi: [C: +1] "Looks good to me as occasional one off." [homer/public] - https://gerrit.wikimedia.org/r/722551 (https://phabricator.wikimedia.org/T286911) (owner: Muehlenhoff)
[08:21:37] (PS2) Hashar: docker: add security updates to Bullseye base image [puppet] - https://gerrit.wikimedia.org/r/720241
[08:22:53] (CR) Ayounsi: [C: +1] "Thanks!" [puppet] - https://gerrit.wikimedia.org/r/721854 (https://phabricator.wikimedia.org/T273673) (owner: Dzahn)
[08:23:06] (CR) Hashar: "I have made some adjustment to the commit message." [puppet] - https://gerrit.wikimedia.org/r/720241 (owner: Hashar)
[08:23:11] (CR) Jelto: "Adding you as reviewer because @JMeybohm is out next two weeks." [puppet] - https://gerrit.wikimedia.org/r/721373 (https://phabricator.wikimedia.org/T251305) (owner: Jelto)
[08:23:31] (CR) Jelto: "Adding you as reviewer because @JMeybohm is out next two weeks." [deployment-charts] - https://gerrit.wikimedia.org/r/721301 (https://phabricator.wikimedia.org/T251305) (owner: Jelto)
[08:27:39] (CR) Volans: "Reply inline" [puppet] - https://gerrit.wikimedia.org/r/721857 (https://phabricator.wikimedia.org/T276469) (owner: ZPapierski)
[08:52:39] (PS5) Jbond: apt::package_from_component: use apt-get update exec from init class [puppet] - https://gerrit.wikimedia.org/r/722345 (https://phabricator.wikimedia.org/T291370)
[08:55:57] (PS6) Jbond: apt::package_from_component: Fix bug in dependency mapping [puppet] - https://gerrit.wikimedia.org/r/722345 (https://phabricator.wikimedia.org/T291370)
[08:59:08] (PS7) Jbond: apt::package_from_component: use apt-get update exec from init class [puppet] - https://gerrit.wikimedia.org/r/722345 (https://phabricator.wikimedia.org/T291370)
[08:59:10] (CR) Jbond: "thanks updated" [puppet] - https://gerrit.wikimedia.org/r/722345 (https://phabricator.wikimedia.org/T291370) (owner: Jbond)
[09:01:59] SRE, Infrastructure-Foundations, Traffic: OpenSSL < 1.1.0 compatibility issues with new LE issuance chain - https://phabricator.wikimedia.org/T283165 (Joe)
[09:02:03] SRE, serviceops: Rebuild production Stretch images with GNUTLS/OpenSSL updates for LE issue chain update - https://phabricator.wikimedia.org/T291458 (Joe) Open→In progress a:Joe
[09:25:52] SRE, serviceops: Rebuild production Stretch images with GNUTLS/OpenSSL updates for LE issue chain update - https://phabricator.wikimedia.org/T291458 (Joe) The debmonitor query for [[https://debmonitor.wikimedia.org/packages/libssl1.0.2 | libssl 1.0.2]] tells us it's mostly images under the `/releng` pref...
[09:36:49] SRE, serviceops: Rebuild production Stretch images with GNUTLS/OpenSSL updates for LE issue chain update - https://phabricator.wikimedia.org/T291458 (Joe) The debmonitor query for [[ https://debmonitor.wikimedia.org/packages/libgnutls30 || libgnutls30]] tells us again it's mostly releng images, plus: *...
[09:40:08] (CR) Arturo Borrero Gonzalez: create role to deploy staging instance for quarry (1 comment) [puppet] - https://gerrit.wikimedia.org/r/721585 (https://phabricator.wikimedia.org/T291204) (owner: Michael DiPietro)
[09:42:48] (PS1) Giuseppe Lavagetto: Update a few stretch-based images for openssl / gnutls updates [docker-images/production-images] - https://gerrit.wikimedia.org/r/722565 (https://phabricator.wikimedia.org/T291458)
[09:44:58] <_joe_> !log deleting images for graphoid, T291458
[09:45:03] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:45:05] T291458: Rebuild production Stretch images with GNUTLS/OpenSSL updates for LE issue chain update - https://phabricator.wikimedia.org/T291458
[09:46:08] <_joe_> !log deneb:~# docker-registryctl delete-tags docker-registry.wikimedia.org/fluentd T291458
[09:46:12] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:52:01] (PS1) Arturo Borrero Gonzalez: openstack: manila: introduce rabbit configuration [puppet] - https://gerrit.wikimedia.org/r/722567 (https://phabricator.wikimedia.org/T291257)
[09:52:03] (PS1) Arturo Borrero Gonzalez: openstack: manila: fix datatype for db_host config value [puppet] - https://gerrit.wikimedia.org/r/722568 (https://phabricator.wikimedia.org/T291257)
[09:54:34] (PS1) Arturo Borrero Gonzalez: hieradata: openstack: manila: introduce placeholder for rabbit password [labs/private] - https://gerrit.wikimedia.org/r/722569 (https://phabricator.wikimedia.org/T291257)
[09:57:27] (CR) Arturo Borrero Gonzalez: [C: +2] openstack: manila: introduce rabbit configuration [puppet] - https://gerrit.wikimedia.org/r/722567 (https://phabricator.wikimedia.org/T291257) (owner: Arturo Borrero Gonzalez)
[09:57:33] (CR) Arturo Borrero Gonzalez: [C: +2] openstack: manila: fix datatype for db_host config value [puppet] - https://gerrit.wikimedia.org/r/722568 (https://phabricator.wikimedia.org/T291257) (owner: Arturo Borrero Gonzalez)
[09:57:40] (CR) Arturo Borrero Gonzalez: [V: +2 C: +2] hieradata: openstack: manila: introduce placeholder for rabbit password [labs/private] - https://gerrit.wikimedia.org/r/722569 (https://phabricator.wikimedia.org/T291257) (owner: Arturo Borrero Gonzalez)
[09:58:09] (CR) Giuseppe Lavagetto: [V: +2 C: +2] Update a few stretch-based images for openssl / gnutls updates [docker-images/production-images] - https://gerrit.wikimedia.org/r/722565 (https://phabricator.wikimedia.org/T291458) (owner: Giuseppe Lavagetto)
[09:59:27] <_joe_> !log rebuilding openjdk8* image, ruby, nodejs-slim for T291458
[09:59:32] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:59:33] T291458: Rebuild production Stretch images with GNUTLS/OpenSSL updates for LE issue chain update - https://phabricator.wikimedia.org/T291458
[10:11:12] (PS1) Giuseppe Lavagetto: Fix openjdk8 images build [docker-images/production-images] - https://gerrit.wikimedia.org/r/722572 (https://phabricator.wikimedia.org/T291458)
[10:12:42] (PS2) Giuseppe Lavagetto: Fix openjdk8 images build [docker-images/production-images] - https://gerrit.wikimedia.org/r/722572 (https://phabricator.wikimedia.org/T291458)
[10:16:18] (CR) Muehlenhoff: [C: +1] "Looks good to me, I also doublechecked the current openjdk-8-jdk-headless package and all manpages are in man1" [docker-images/production-images] - https://gerrit.wikimedia.org/r/722572 (https://phabricator.wikimedia.org/T291458) (owner: Giuseppe Lavagetto)
[10:20:45] (CR) Giuseppe Lavagetto: [V: +2 C: +2] Fix openjdk8 images build [docker-images/production-images] - https://gerrit.wikimedia.org/r/722572 (https://phabricator.wikimedia.org/T291458) (owner: Giuseppe Lavagetto)
[10:25:04] !log jmm@cumin2002 START - Cookbook sre.hosts.decommission for hosts testvm2001.codfw.wmnet
[10:25:07] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[10:25:18] (PS2) David Caro: wmcs: enforce a minimum spicerack version [cookbooks] (wmcs) - https://gerrit.wikimedia.org/r/713812
[10:25:54] (PS1) Urbanecm: Undeploy GettingStarted I: Disable on all wikis [mediawiki-config] - https://gerrit.wikimedia.org/r/722574 (https://phabricator.wikimedia.org/T235752)
[10:25:56] (PS1) Urbanecm: Undeploy GettingStarted II: Don't load regardless of config [mediawiki-config] - https://gerrit.wikimedia.org/r/722575 (https://phabricator.wikimedia.org/T235752)
[10:25:58] (PS1) Urbanecm: Undeploy getting started III: Don't set wmgUseGettingStarted, now ignored [mediawiki-config] - https://gerrit.wikimedia.org/r/722576 (https://phabricator.wikimedia.org/T235752)
[10:26:00] (PS1) Urbanecm: Undeploy GettingStarted IV: Don't build i18n [mediawiki-config] - https://gerrit.wikimedia.org/r/722577
[10:26:02] (PS1) Urbanecm: Undeploy GettingStarted V: Remove now-obsolete logging channels [mediawiki-config] - https://gerrit.wikimedia.org/r/722578 (https://phabricator.wikimedia.org/T235752)
[10:29:22] (CR) David Caro: [C: +2] wmcs: enforce a minimum spicerack version [cookbooks] (wmcs) - https://gerrit.wikimedia.org/r/713812 (owner: David Caro)
[10:32:25] (Merged) jenkins-bot: wmcs: enforce a minimum spicerack version [cookbooks] (wmcs) - https://gerrit.wikimedia.org/r/713812 (owner: David Caro)
[10:37:39] (PS1) Jbond: git - schema: Add new schema for adding git information [software/ecs] - https://gerrit.wikimedia.org/r/722580 (https://phabricator.wikimedia.org/T222826)
[10:38:06] (CR) jerkins-bot: [V: -1] git - schema: Add new schema for adding git information [software/ecs] - https://gerrit.wikimedia.org/r/722580 (https://phabricator.wikimedia.org/T222826) (owner: Jbond)
[10:42:36] (PS2) Jbond: git - schema: Add new schema for adding git information [software/ecs] - https://gerrit.wikimedia.org/r/722580 (https://phabricator.wikimedia.org/T222826)
[10:43:00] (CR) jerkins-bot: [V: -1] git - schema: Add new schema for adding git information [software/ecs] - https://gerrit.wikimedia.org/r/722580 (https://phabricator.wikimedia.org/T222826) (owner: Jbond)
[10:43:27] (PS1) Arturo Borrero Gonzalez: hieradata: openstack: codfw1dev: add missing manila key [puppet] - https://gerrit.wikimedia.org/r/722581 (https://phabricator.wikimedia.org/T291257)
[10:43:45] (CR) Muehlenhoff: [C: +1] "Looks good!" [puppet] - https://gerrit.wikimedia.org/r/722345 (https://phabricator.wikimedia.org/T291370) (owner: Jbond)
[10:44:04] (CR) Arturo Borrero Gonzalez: [C: +2] hieradata: openstack: codfw1dev: add missing manila key [puppet] - https://gerrit.wikimedia.org/r/722581 (https://phabricator.wikimedia.org/T291257) (owner: Arturo Borrero Gonzalez)
[10:45:19] (PS7) Jbond: puppetmaster: drop log messages from logstash reporter [puppet] - https://gerrit.wikimedia.org/r/719368 (https://phabricator.wikimedia.org/T222826)
[10:54:12] !log jmm@cumin2002 END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2001.codfw.wmnet
[10:54:14] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[10:54:17] SRE, Infrastructure-Foundations, Patch-For-Review: Create Ganeti test cluster - https://phabricator.wikimedia.org/T286206 (ops-monitoring-bot) cookbooks.sre.hosts.decommission executed by jmm@cumin2002 for hosts: `testvm2001.codfw.wmnet` - testvm2001.codfw.wmnet (**WARN**) - //Host not found on Ici...
[10:56:04] (PS3) Jbond: git - schema: Add new schema for adding git information [software/ecs] - https://gerrit.wikimedia.org/r/722580 (https://phabricator.wikimedia.org/T222826)
[10:56:07] !log jmm@cumin2002 START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
[10:56:10] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:00:05] Amir1, Lucas_WMDE, awight, and Urbanecm: I seem to be stuck in Groundhog week. Sigh. Time for (yet another) European mid-day backport window deploy. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20210921T1100).
[11:00:05] No Gerrit patches in the queue for this window AFAICS.
[11:00:12] o/
[11:00:15] nothing to do \o/
[11:07:46] !log jmm@cumin2002 END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet
[11:07:49] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:09:06] (CR) Giuseppe Lavagetto: [C: -1] "Please fix the text of the comment, which is misleading. Other than that LGTM" [puppet] - https://gerrit.wikimedia.org/r/720048 (https://phabricator.wikimedia.org/T286791) (owner: Elukey)
[11:09:38] PROBLEM - Check systemd state on ganeti2026 is CRITICAL: CRITICAL - degraded: The following units failed: networking.service https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[11:09:58] (PS1) Arturo Borrero Gonzalez: hieradata: openstack: manila: add user password placeholder [labs/private] - https://gerrit.wikimedia.org/r/722588 (https://phabricator.wikimedia.org/T291257)
[11:10:34] (CR) Jbond: [C: +2] apt::package_from_component: use apt-get update exec from init class [puppet] - https://gerrit.wikimedia.org/r/722345 (https://phabricator.wikimedia.org/T291370) (owner: Jbond)
[11:16:16] (PS1) Jbond: P:java: create new variable to track the default package name [puppet] - https://gerrit.wikimedia.org/r/722589
[11:17:16] (CR) Hnowlan: "Thanks for this!" [puppet] - https://gerrit.wikimedia.org/r/722345 (https://phabricator.wikimedia.org/T291370) (owner: Jbond)
[11:18:07] (PS1) Arturo Borrero Gonzalez: openstack: manila: fix typo in hiera key for rabbit pass [puppet] - https://gerrit.wikimedia.org/r/722590 (https://phabricator.wikimedia.org/T291257)
[11:18:09] (PS1) Arturo Borrero Gonzalez: openstack: manila: introduce support for manila user [puppet] - https://gerrit.wikimedia.org/r/722591 (https://phabricator.wikimedia.org/T291257)
[11:18:27] (PS3) Jgiannelos: Configure event stream for map tile state change [mediawiki-config] - https://gerrit.wikimedia.org/r/722289
[11:18:32] (CR) Arturo Borrero Gonzalez: [V: +2 C: +2] hieradata: openstack: manila: add user password placeholder [labs/private] - https://gerrit.wikimedia.org/r/722588 (https://phabricator.wikimedia.org/T291257) (owner: Arturo Borrero Gonzalez)
[11:19:38] (CR) Jbond: [C: +2] P:java: create new variable to track the default package name [puppet] - https://gerrit.wikimedia.org/r/722589 (owner: Jbond)
[11:19:54] (CR) Arturo Borrero Gonzalez: [C: +2] openstack: manila: fix typo in hiera key for rabbit pass [puppet] - https://gerrit.wikimedia.org/r/722590 (https://phabricator.wikimedia.org/T291257) (owner: Arturo Borrero Gonzalez)
[11:20:18] (PS2) Arturo Borrero Gonzalez: openstack: manila: introduce support for manila user [puppet] - https://gerrit.wikimedia.org/r/722591 (https://phabricator.wikimedia.org/T291257)
[11:21:58] (CR) Jgiannelos: [C: +2] Configure event stream for map tile state change [mediawiki-config] - https://gerrit.wikimedia.org/r/722289 (owner: Jgiannelos)
[11:22:46] (Merged) jenkins-bot: Configure event stream for map tile state change [mediawiki-config] - https://gerrit.wikimedia.org/r/722289 (owner: Jgiannelos)
[11:23:35] Hi can I deploy this? https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/722289
[11:24:43] (PS2) Volans: remote: add support to enable/disable Cumin output [software/spicerack] - https://gerrit.wikimedia.org/r/720993
[11:24:45] (PS2) Volans: dhcp: reduce verbosity of Cumin's output [software/spicerack] - https://gerrit.wikimedia.org/r/720994
[11:24:47] (PS2) Volans: icinga: reduce verbosity of Cumin's output [software/spicerack] - https://gerrit.wikimedia.org/r/720995
[11:24:49] (PS2) Volans: puppet: reduce verbosity of Cumin's output [software/spicerack] - https://gerrit.wikimedia.org/r/720996
[11:24:55] (CR) Volans: "replies inline" [software/spicerack] - https://gerrit.wikimedia.org/r/720993 (owner: Volans)
[11:25:22] (CR) Arturo Borrero Gonzalez: [C: +2] openstack: manila: introduce support for manila user [puppet] - https://gerrit.wikimedia.org/r/722591 (https://phabricator.wikimedia.org/T291257) (owner: Arturo Borrero Gonzalez)
[11:27:37] !log mwdebug-deploy@deploy1002 helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
[11:27:39] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:29:32] !log mwdebug-deploy@deploy1002 helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
[11:29:35] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:34:19] nemo-yiannis: everything merged to mediawiki-config master must be deployed or reverted, usually config patches should be deployed in dedicated windows https://wikitech.wikimedia.org/wiki/Backport_windows
[11:35:06] (PS30) Elukey: kubernetes: add revscoring-editquality in the services configs [puppet] - https://gerrit.wikimedia.org/r/720048 (https://phabricator.wikimedia.org/T286791)
[11:35:10] PROBLEM - Uncommitted DNS changes in Netbox on netbox1001 is CRITICAL: Netbox has uncommitted DNS changes https://wikitech.wikimedia.org/wiki/Monitoring/Netbox_DNS_uncommitted_changes
[11:35:27] (CR) Elukey: kubernetes: add revscoring-editquality in the services configs (1 comment) [puppet] - https://gerrit.wikimedia.org/r/720048 (https://phabricator.wikimedia.org/T286791) (owner: Elukey)
[11:35:31] nemo-yiannis: since you merged it, please deploy it as well
[11:35:38] and ideally add it to the backport+config window in the calendar
[11:36:36] ok
[11:40:58] !log jmm@cumin2002 START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
[11:41:01] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:42:59] I am running the netbox dns cookbook to see what's hanging
[11:43:39] seems related to "testvm2001"
[11:44:24] moritzm: --^ ok to clean up the records?
[11:45:08] !log jmm@cumin2002 END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host ganeti2026.codfw.wmnet
[11:45:10] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:45:30] !log jgiannelos@deploy1002 Synchronized wmf-config/InitialiseSettings.php: Configure event stream for map tile state change - 3b01ef587 (duration: 00m 57s)
[11:45:33] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:46:22] !log elukey@cumin1001 START - Cookbook sre.dns.netbox
[11:46:25] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:46:31] since there was a decom I'll proceed
[11:47:36] RECOVERY - Check systemd state on ganeti2026 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[11:47:45] elukey: ack, please go ahead
[11:48:01] I'll also re-create the VM later, but please go ahead
[11:55:31] !log elukey@cumin1001 END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
[11:55:35] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:56:37] SRE, LDAP-Access-Requests: Grant Access to ldap/wmf for KCVelaga_(wikimf) - https://phabricator.wikimedia.org/T291475 (Marostegui) p:Triage→Medium @KFrancis could you confirm that @KCVelaga_WMF has the NDA signed? Thanks!
[11:57:42] (03CR) 10Marostegui: [C: 03+2] valid_section.pp: Remove reference to s10 [puppet] - 10https://gerrit.wikimedia.org/r/722544 (https://phabricator.wikimedia.org/T167973) (owner: 10Marostegui) [11:58:30] !log jmm@cumin2002 START - Cookbook sre.hosts.reboot-single for host ganeti2025.codfw.wmnet [11:58:33] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:01:34] !log jmm@cumin2002 END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host ganeti2025.codfw.wmnet [12:01:37] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:06:40] PROBLEM - Check systemd state on an-launcher1002 is CRITICAL: CRITICAL - degraded: The following units failed: produce_canary_events.service https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [12:08:37] 10SRE, 10LDAP-Access-Requests: Grant Access to ldap/wmf for KCVelaga_(wikimf) - https://phabricator.wikimedia.org/T291475 (10Marostegui) @KCVelaga_WMF can you confirm your email is kcvelaga-ctr@wikimedia.org? 
[12:08:46] (03PS11) 10Jbond: cassandra: use profile::java [puppet] - 10https://gerrit.wikimedia.org/r/631789 (https://phabricator.wikimedia.org/T261966) (owner: 10Hnowlan) [12:11:44] RECOVERY - Uncommitted DNS changes in Netbox on netbox1001 is OK: Netbox has zero uncommitted DNS changes https://wikitech.wikimedia.org/wiki/Monitoring/Netbox_DNS_uncommitted_changes [12:15:06] RECOVERY - Check systemd state on an-launcher1002 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [12:15:34] (03PS1) 10Marostegui: data.yaml: Add KCVelaga [puppet] - 10https://gerrit.wikimedia.org/r/722598 (https://phabricator.wikimedia.org/T291475) [12:15:53] (03CR) 10Marostegui: [C: 04-2] "Waiting for approvals" [puppet] - 10https://gerrit.wikimedia.org/r/722598 (https://phabricator.wikimedia.org/T291475) (owner: 10Marostegui) [12:15:55] 10SRE, 10LDAP-Access-Requests, 10Patch-For-Review: Grant Access to ldap/wmf for KCVelaga_(wikimf) - https://phabricator.wikimedia.org/T291475 (10Marostegui) @TAndic or @JAnstee_WMF could you approve this? 
[12:17:02] (03PS1) 10Jbond: C:cassandra: add optional java_package variable [puppet] - 10https://gerrit.wikimedia.org/r/722599 (https://phabricator.wikimedia.org/T261966) [12:18:26] (03PS12) 10Jbond: cassandra: use profile::java [puppet] - 10https://gerrit.wikimedia.org/r/631789 (https://phabricator.wikimedia.org/T261966) (owner: 10Hnowlan) [12:18:40] (03PS2) 10Jbond: C:cassandra: add optional java_package variable [puppet] - 10https://gerrit.wikimedia.org/r/722599 (https://phabricator.wikimedia.org/T261966) [12:18:46] (03CR) 10jerkins-bot: [V: 04-1] C:cassandra: add optional java_package variable [puppet] - 10https://gerrit.wikimedia.org/r/722599 (https://phabricator.wikimedia.org/T261966) (owner: 10Jbond) [12:19:04] (03CR) 10Jbond: "pcc: https://puppet-compiler.wmflabs.org/compiler1001/31149/" [puppet] - 10https://gerrit.wikimedia.org/r/722599 (https://phabricator.wikimedia.org/T261966) (owner: 10Jbond) [12:20:31] (03CR) 10Jbond: [V: 03+1] "PCC SUCCESS (DIFF 7): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/31150/console" [puppet] - 10https://gerrit.wikimedia.org/r/631789 (https://phabricator.wikimedia.org/T261966) (owner: 10Hnowlan) [12:20:33] (03CR) 10jerkins-bot: [V: 04-1] C:cassandra: add optional java_package variable [puppet] - 10https://gerrit.wikimedia.org/r/722599 (https://phabricator.wikimedia.org/T261966) (owner: 10Jbond) [12:20:38] 10SRE, 10LDAP-Access-Requests, 10Patch-For-Review: Grant Access to ldap/wmf for KCVelaga_(wikimf) - https://phabricator.wikimedia.org/T291475 (10KCVelaga_WMF) Yes, I confirm my email. 
[12:21:26] PROBLEM - Check systemd state on an-launcher1002 is CRITICAL: CRITICAL - degraded: The following units failed: produce_canary_events.service https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [12:22:28] (03CR) 10Jbond: [V: 03+1 C: 03+1] "lgtm" [puppet] - 10https://gerrit.wikimedia.org/r/631789 (https://phabricator.wikimedia.org/T261966) (owner: 10Hnowlan) [12:23:27] (03PS3) 10Jbond: C:cassandra: add optional java_package variable [puppet] - 10https://gerrit.wikimedia.org/r/722599 (https://phabricator.wikimedia.org/T261966) [12:29:23] (03PS13) 10Jbond: cassandra: use profile::java [puppet] - 10https://gerrit.wikimedia.org/r/631789 (https://phabricator.wikimedia.org/T261966) (owner: 10Hnowlan) [12:29:25] (03PS4) 10Jbond: C:cassandra: add optional java_package variable [puppet] - 10https://gerrit.wikimedia.org/r/722599 (https://phabricator.wikimedia.org/T261966) [12:29:27] (03PS1) 10Jbond: P:java: fix default_package_name [puppet] - 10https://gerrit.wikimedia.org/r/722600 [12:30:19] (03CR) 10Jbond: [V: 03+2 C: 03+2] P:java: fix default_package_name [puppet] - 10https://gerrit.wikimedia.org/r/722600 (owner: 10Jbond) [12:31:12] (03CR) 10Jbond: [V: 03+1] "PCC SUCCESS (DIFF 7): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/31152/console" [puppet] - 10https://gerrit.wikimedia.org/r/722599 (https://phabricator.wikimedia.org/T261966) (owner: 10Jbond) [12:34:56] (03PS1) 10Hashar: Enable 'DuplicateParse' logging bucket [mediawiki-config] - 10https://gerrit.wikimedia.org/r/722601 [12:41:27] (03PS1) 10Arturo Borrero Gonzalez: openstack: manila: configure additional bits for the service image [puppet] - 10https://gerrit.wikimedia.org/r/722602 (https://phabricator.wikimedia.org/T291257) [12:44:05] (03PS1) 10Arturo Borrero Gonzalez: hieradata: openstack: manila: add service instance password [labs/private] - 10https://gerrit.wikimedia.org/r/722603 (https://phabricator.wikimedia.org/T291257) [12:46:48] (03CR) 10Arturo Borrero 
Gonzalez: [C: 03+2] openstack: manila: configure additional bits for the service image [puppet] - 10https://gerrit.wikimedia.org/r/722602 (https://phabricator.wikimedia.org/T291257) (owner: 10Arturo Borrero Gonzalez) [12:46:59] (03CR) 10Arturo Borrero Gonzalez: [V: 03+2 C: 03+2] hieradata: openstack: manila: add service instance password [labs/private] - 10https://gerrit.wikimedia.org/r/722603 (https://phabricator.wikimedia.org/T291257) (owner: 10Arturo Borrero Gonzalez) [12:51:32] (03CR) 10Michael DiPietro: create role to deploy staging instance for quarry (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/721585 (https://phabricator.wikimedia.org/T291204) (owner: 10Michael DiPietro) [12:52:55] (03PS1) 10MSantos: kartographer: enable tegola in testwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/722605 (https://phabricator.wikimedia.org/T291178) [13:02:04] (03CR) 10Jelto: [C: 03+2] "lgtm, just make sure to also modify the path in the repo if needed https://gitlab.wikimedia.org/search?search=gerrit&group_id=15&project_i" [puppet] - 10https://gerrit.wikimedia.org/r/719363 (https://phabricator.wikimedia.org/T290259) (owner: 10Brennen Bearnes) [13:02:18] (03PS1) 10Arturo Borrero Gonzalez: openstack: codfw1dev: manila: enable services [puppet] - 10https://gerrit.wikimedia.org/r/722607 (https://phabricator.wikimedia.org/T291257) [13:03:08] (03PS2) 10Arturo Borrero Gonzalez: openstack: codfw1dev: manila: enable services [puppet] - 10https://gerrit.wikimedia.org/r/722607 (https://phabricator.wikimedia.org/T291257) [13:04:47] 10SRE, 10User-herron: Rebalance kafka partitions in main-{eqiad,codfw} clusters - https://phabricator.wikimedia.org/T288825 (10elukey) I was able to add the list of json commands to https://gitlab.wikimedia.org/Elukey/kafka_main_rebalance/-/tree/main/main-codfw/topicmappr_json [13:07:12] (03CR) 10Arturo Borrero Gonzalez: [C: 03+2] "PCC: https://puppet-compiler.wmflabs.org/compiler1001/31153/" [puppet] - 
10https://gerrit.wikimedia.org/r/722607 (https://phabricator.wikimedia.org/T291257) (owner: 10Arturo Borrero Gonzalez) [13:07:28] (03CR) 10Ottomata: "ah! I had forgotten (we haven't done this in a while) that eventgate-main docker image uses a baked in clone of the schema repos (so it d" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/722289 (owner: 10Jgiannelos) [13:08:04] !log jmm@cumin2002 START - Cookbook sre.ganeti.makevm for new host testvm2002.codfw.wmnet [13:08:11] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:08:51] 10SRE, 10MW-on-K8s, 10serviceops, 10Patch-For-Review, 10Performance-Team (Radar): Benchmark performance of MediaWiki on k8s - https://phabricator.wikimedia.org/T280497 (10ssastry) >>! In T280497#7367757, @Joe wrote: >>>! In T280497#7365189, @jijiki wrote: >> @ssastry we have done some benchmarks, but non... [13:09:28] 10SRE, 10User-herron: Rebalance kafka partitions in main-{eqiad,codfw} clusters - https://phabricator.wikimedia.org/T288825 (10elukey) The final goal is to make https://grafana.wikimedia.org/d/000000027/kafka?viewPanel=48&orgId=1&var-datasource=codfw%20prometheus%2Fops&var-kafka_cluster=main-codfw&var-cluster=... 
[13:13:39] 10SRE, 10ops-eqiad: rack spare switches in c1-eqiad - https://phabricator.wikimedia.org/T185337 (10ayounsi) [13:17:34] 10SRE, 10Observability-Logging: Provision plaintext syslog collectors in esams/ulsfo/eqsin - https://phabricator.wikimedia.org/T243065 (10ayounsi) [13:18:52] !log upgrading php on wtp* servers to 7.2.34-18+0~20210223.60+debian10~1.gbpb21322+wmf2 && rolling service restart - T291052 [13:18:56] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:18:57] T291052: Deploy PHP patch for DOM replaceChild/removeChild performance - https://phabricator.wikimedia.org/T291052 [13:24:24] 10Puppet, 10Infrastructure-Foundations: error while resolving custom fact "lldp_neighbors" on ms-be105[1-9], ms-be205[1-6] and relforge100[3-4] - https://phabricator.wikimedia.org/T290984 (10joanna_borun) 05Open→03In progress [13:32:30] 10SRE, 10Infrastructure-Foundations, 10Patch-For-Review: Create Ganeti test cluster - https://phabricator.wikimedia.org/T286206 (10joanna_borun) 05Open→03In progress [13:32:38] !log jmm@cumin2002 END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host testvm2002.codfw.wmnet [13:32:42] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:33:08] 10SRE, 10Infrastructure-Foundations, 10netops, 10procurement: Move AMS-IX port to 802.1q tagged and get "private vlan" added - https://phabricator.wikimedia.org/T291407 (10joanna_borun) 05Open→03In progress [13:33:29] 10SRE, 10Infrastructure-Foundations, 10Mail, 10Patch-For-Review: Upgrade MXes to Bullseye - https://phabricator.wikimedia.org/T286911 (10joanna_borun) 05Open→03In progress [13:34:45] (03PS1) 10Ottomata: Bump eventgate-main image version to get maps/tile_change schema [deployment-charts] - 10https://gerrit.wikimedia.org/r/722618 (https://phabricator.wikimedia.org/T289771) [13:35:10] (03CR) 10Ottomata: [V: 03+2 C: 03+2] Bump eventgate-main image version to get maps/tile_change schema [deployment-charts] - 
10https://gerrit.wikimedia.org/r/722618 (https://phabricator.wikimedia.org/T289771) (owner: 10Ottomata) [13:36:50] 10Puppet, 10Cloud-VPS, 10Infrastructure-Foundations, 10User-jbond, 10cloud-services-team (Kanban): Audit puppet usage in cloud hosts - https://phabricator.wikimedia.org/T289658 (10joanna_borun) 05Open→03In progress [13:36:57] 10Puppet, 10Cloud Services Proposals, 10Cloud-VPS, 10Infrastructure-Foundations, and 3 others: Easing pain points caused by divergence between cloudservices and production puppet usecases - https://phabricator.wikimedia.org/T285539 (10joanna_borun) [13:37:18] 10Puppet, 10Cloud-VPS, 10Infrastructure-Foundations, 10User-jbond, 10cloud-services-team (Kanban): Normalise hiera default values - https://phabricator.wikimedia.org/T289665 (10joanna_borun) 05Open→03In progress [13:37:26] 10Puppet, 10Cloud Services Proposals, 10Cloud-VPS, 10Infrastructure-Foundations, and 3 others: Easing pain points caused by divergence between cloudservices and production puppet usecases - https://phabricator.wikimedia.org/T285539 (10joanna_borun) [13:37:56] !log otto@deploy1002 helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'production' . 
[13:38:00] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:38:24] 10Puppet, 10Cloud-VPS, 10Infrastructure-Foundations, 10User-jbond, 10cloud-services-team (Kanban): Gather a list of puppet modules shared between cloud and production - https://phabricator.wikimedia.org/T289666 (10joanna_borun) 05Open→03In progress [13:38:30] 10Puppet, 10Cloud Services Proposals, 10Cloud-VPS, 10Infrastructure-Foundations, and 3 others: Easing pain points caused by divergence between cloudservices and production puppet usecases - https://phabricator.wikimedia.org/T285539 (10joanna_borun) [13:40:14] (03PS1) 10Muehlenhoff: os-reports: Flag missing/broken state in red [puppet] - 10https://gerrit.wikimedia.org/r/722619 [13:41:27] !log otto@deploy1002 helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' . [13:41:27] !log otto@deploy1002 helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'production' . [13:41:31] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:41:34] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:41:43] (03CR) 10jerkins-bot: [V: 04-1] os-reports: Flag missing/broken state in red [puppet] - 10https://gerrit.wikimedia.org/r/722619 (owner: 10Muehlenhoff) [13:45:26] (03PS2) 10Muehlenhoff: os-reports: Flag missing/broken state in red [puppet] - 10https://gerrit.wikimedia.org/r/722619 [13:48:06] (03CR) 10Muehlenhoff: [C: 03+2] os-reports: Flag missing/broken state in red [puppet] - 10https://gerrit.wikimedia.org/r/722619 (owner: 10Muehlenhoff) [13:48:52] !log otto@deploy1002 helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'production' . [13:48:52] !log otto@deploy1002 helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' . 
[13:48:55] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:49:00] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:49:26] 10SRE, 10MW-on-K8s, 10serviceops, 10Patch-For-Review, 10Performance-Team (Radar): Benchmark performance of MediaWiki on k8s - https://phabricator.wikimedia.org/T280497 (10ssastry) Times from scandium.eqiad.wmnet: * http://en.wikipedia.org/w/rest.php/en.wikipedia.org/v3/page/html/Hospet/1043074958 takes b... [13:52:34] !log disable AMS-IX peering sessions for maintenance - T291407 [13:52:39] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:52:40] T291407: Move AMS-IX port to 802.1q tagged and get "private vlan" added - https://phabricator.wikimedia.org/T291407 [13:53:06] RECOVERY - BGP status on cr2-esams is OK: BGP OK - up: 20, down: 0, shutdown: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23BGP_status [13:56:16] RECOVERY - Check systemd state on an-launcher1002 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [13:56:52] 10SRE, 10Infrastructure-Foundations, 10netops, 10procurement: Move AMS-IX port to 802.1q tagged and get "private vlan" added - https://phabricator.wikimedia.org/T291407 (10ayounsi) Updated Netbox, resulting diff: `lang=diff [edit interfaces ae2] + flexible-vlan-tagging; + encapsulation flexible-ethern... 
[14:01:40] PROBLEM - BGP status on cr2-esams is CRITICAL: BGP CRITICAL - AS6939/IPv6: Active - HE https://wikitech.wikimedia.org/wiki/Network_monitoring%23BGP_status [14:02:37] (03PS1) 10Jbond: primary_interface: filter out SLAAC addresses [puppet] - 10https://gerrit.wikimedia.org/r/722621 [14:04:43] (03CR) 10CDanis: [C: 03+1] sections.yaml: Remove s10 from codfw [puppet] - 10https://gerrit.wikimedia.org/r/722273 (https://phabricator.wikimedia.org/T167973) (owner: 10Marostegui) [14:08:03] Ty cdanis [14:10:44] (03CR) 10RhinosF1: [C: 03+1] sections.yaml: Remove s10 from codfw [puppet] - 10https://gerrit.wikimedia.org/r/722273 (https://phabricator.wikimedia.org/T167973) (owner: 10Marostegui) [14:15:11] (03PS2) 10ZPapierski: Add kafka clusters' brokers to spicerack config [puppet] - 10https://gerrit.wikimedia.org/r/721857 (https://phabricator.wikimedia.org/T276469) [14:15:42] (03CR) 10Volans: [C: 03+1] "Seems sane, afaict, but I'm no expert" [puppet] - 10https://gerrit.wikimedia.org/r/722621 (owner: 10Jbond) [14:17:25] !log temporarily downpref Telia-Deutsch Telekom to not saturate Telia transit - T291407 [14:17:32] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:17:33] T291407: Move AMS-IX port to 802.1q tagged and get "private vlan" added - https://phabricator.wikimedia.org/T291407 [14:18:32] RECOVERY - BGP status on cr2-esams is OK: BGP OK - up: 20, down: 0, shutdown: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23BGP_status [14:25:22] (03CR) 10Volans: "The compiler is happy [1], paths are ok. I've left two totally optional questions inline." 
[puppet] - 10https://gerrit.wikimedia.org/r/721857 (https://phabricator.wikimedia.org/T276469) (owner: 10ZPapierski) [14:32:06] (03CR) 10ZPapierski: Add kafka clusters' brokers to spicerack config (032 comments) [puppet] - 10https://gerrit.wikimedia.org/r/721857 (https://phabricator.wikimedia.org/T276469) (owner: 10ZPapierski) [14:32:49] (03PS1) 10Herron: wip [puppet] - 10https://gerrit.wikimedia.org/r/722630 [14:32:56] (03PS1) 10Giuseppe Lavagetto: _tls_helpers: bump to envoy config v3 api [deployment-charts] - 10https://gerrit.wikimedia.org/r/722631 [14:33:28] (03CR) 10jerkins-bot: [V: 04-1] _tls_helpers: bump to envoy config v3 api [deployment-charts] - 10https://gerrit.wikimedia.org/r/722631 (owner: 10Giuseppe Lavagetto) [14:33:50] (03CR) 10Volans: [C: 03+1] "LGTM" [puppet] - 10https://gerrit.wikimedia.org/r/721857 (https://phabricator.wikimedia.org/T276469) (owner: 10ZPapierski) [14:35:00] !log re-enable AMS-IX peering sessions - T291407 [14:35:03] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:35:05] T291407: Move AMS-IX port to 802.1q tagged and get "private vlan" added - https://phabricator.wikimedia.org/T291407 [14:36:21] 10SRE, 10Infrastructure-Foundations, 10netops, 10procurement: Move AMS-IX port to 802.1q tagged and get "private vlan" added - https://phabricator.wikimedia.org/T291407 (10ayounsi) 05In progress→03Resolved All done. [14:39:49] (03CR) 10Ottomata: "IIUC the intention here is to be able to write cookbooks that use Kafka without hardcoding Kafka cluster information in the cookbook?" 
[puppet] - 10https://gerrit.wikimedia.org/r/721857 (https://phabricator.wikimedia.org/T276469) (owner: 10ZPapierski) [14:42:30] (03PS1) 10Ayounsi: Downpref Telia->DT [homer/public] - 10https://gerrit.wikimedia.org/r/722634 (https://phabricator.wikimedia.org/T291495) [14:44:10] (03PS3) 10ZPapierski: Add kafka clusters' brokers to spicerack config [puppet] - 10https://gerrit.wikimedia.org/r/721857 (https://phabricator.wikimedia.org/T276469) [14:44:23] (03CR) 10ZPapierski: Add kafka clusters' brokers to spicerack config (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/721857 (https://phabricator.wikimedia.org/T276469) (owner: 10ZPapierski) [14:45:29] (03CR) 10Cathal Mooney: [C: 03+1] "LGTM. Based on today's traffic profile (while AMS-IX was down) I think this makes sense, to reduce load on Telia circuit. DT are large p" [homer/public] - 10https://gerrit.wikimedia.org/r/722634 (https://phabricator.wikimedia.org/T291495) (owner: 10Ayounsi) [14:46:09] (03CR) 10Ayounsi: [C: 03+2] Downpref Telia->DT [homer/public] - 10https://gerrit.wikimedia.org/r/722634 (https://phabricator.wikimedia.org/T291495) (owner: 10Ayounsi) [14:46:46] (03Merged) 10jenkins-bot: Downpref Telia->DT [homer/public] - 10https://gerrit.wikimedia.org/r/722634 (https://phabricator.wikimedia.org/T291495) (owner: 10Ayounsi) [14:47:17] (03CR) 10Ottomata: "And/or... would it be better to have cookbooks pull this info out of zookeeper instead? E.g." 
[puppet] - 10https://gerrit.wikimedia.org/r/721857 (https://phabricator.wikimedia.org/T276469) (owner: 10ZPapierski) [14:49:12] (03CR) 10ZPapierski: Add kafka clusters' brokers to spicerack config (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/721857 (https://phabricator.wikimedia.org/T276469) (owner: 10ZPapierski) [14:51:35] (03CR) 10Ottomata: Add kafka clusters' brokers to spicerack config (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/721857 (https://phabricator.wikimedia.org/T276469) (owner: 10ZPapierski) [14:52:39] (03CR) 10Ottomata: Add kafka clusters' brokers to spicerack config (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/721857 (https://phabricator.wikimedia.org/T276469) (owner: 10ZPapierski) [14:53:49] (03CR) 10Ottomata: Add kafka clusters' brokers to spicerack config (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/721857 (https://phabricator.wikimedia.org/T276469) (owner: 10ZPapierski) [14:54:07] (03CR) 10Marostegui: [C: 03+2] sections.yaml: Remove s10 from codfw [puppet] - 10https://gerrit.wikimedia.org/r/722273 (https://phabricator.wikimedia.org/T167973) (owner: 10Marostegui) [14:59:13] (03CR) 10Herron: "This change is ready for review." 
[puppet] - 10https://gerrit.wikimedia.org/r/722630 (https://phabricator.wikimedia.org/T286911) (owner: 10Herron) [15:00:58] (03CR) 10Ottomata: [C: 03+1] Add kafka clusters' brokers to spicerack config [puppet] - 10https://gerrit.wikimedia.org/r/721857 (https://phabricator.wikimedia.org/T276469) (owner: 10ZPapierski) [15:08:08] PROBLEM - Uncommitted dbctl configuration changes- check dbctl config diff on cumin1001 is CRITICAL: CRITICAL - Uncommitted dbctl configuration changes, check dbctl config diff https://wikitech.wikimedia.org/wiki/Dbctl%23Uncommitted_dbctl_diffs [15:08:16] PROBLEM - Uncommitted dbctl configuration changes- check dbctl config diff on cumin2002 is CRITICAL: CRITICAL - Uncommitted dbctl configuration changes, check dbctl config diff https://wikitech.wikimedia.org/wiki/Dbctl%23Uncommitted_dbctl_diffs [15:09:46] ^ fixing [15:09:59] !log marostegui@cumin1001 dbctl commit (dc=all): 'Remove s10 from codfw T167973', diff saved to https://phabricator.wikimedia.org/P17307 and previous config saved to /var/cache/conftool/dbconfig/20210921-150958-marostegui.json [15:10:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:10:05] T167973: Move database for wikitech (labswiki) to a main cluster section - https://phabricator.wikimedia.org/T167973 [15:10:24] 10SRE, 10ops-eqiad, 10DC-Ops: Q4:(Need By: TBD) rack/setup/install cloudcephosd102[1-4].eqiad.wmnet - https://phabricator.wikimedia.org/T284471 (10Cmjohnson) @papaul or @robh I've hit a roadblock with cloudcephosd1021. The install is failing at 45% in the partitioner. I am not seeing any failed disks. I... 
[15:14:01] 10SRE, 10Observability-Metrics: Tooling for end-of-quarter SLO reporting - https://phabricator.wikimedia.org/T290924 (10lmata) [15:14:14] RECOVERY - Uncommitted dbctl configuration changes- check dbctl config diff on cumin1001 is OK: OK - no diffs https://wikitech.wikimedia.org/wiki/Dbctl%23Uncommitted_dbctl_diffs [15:14:20] RECOVERY - Uncommitted dbctl configuration changes- check dbctl config diff on cumin2002 is OK: OK - no diffs https://wikitech.wikimedia.org/wiki/Dbctl%23Uncommitted_dbctl_diffs [15:15:24] 10SRE, 10ops-codfw, 10DC-Ops, 10SRE Observability (FY2021/2022-Q1): Q1: (Need By: TBD) rack/setup/install centrallog2002.codfw.wmnet - https://phabricator.wikimedia.org/T289624 (10lmata) [15:17:31] 10SRE, 10DBA, 10Observability-Alerting, 10observability, and 2 others: Database read_only alert has a changing service description - https://phabricator.wikimedia.org/T290591 (10lmata) [15:19:52] 10SRE, 10Traffic, 10SRE Observability (FY2021/2022-Q2): VarnishTrafficDrop alert false positives due to DCs depooled - https://phabricator.wikimedia.org/T291148 (10lmata) [15:21:27] !log cmjohnson@cumin1001 START - Cookbook sre.dns.netbox [15:21:30] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:21:48] (03CR) 10ZPapierski: Add kafka clusters' brokers to spicerack config (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/721857 (https://phabricator.wikimedia.org/T276469) (owner: 10ZPapierski) [15:22:05] (03PS1) 10Arturo Borrero Gonzalez: openstack: manila: install python3-manilaclient [puppet] - 10https://gerrit.wikimedia.org/r/722639 (https://phabricator.wikimedia.org/T291257) [15:22:19] 10SRE, 10ops-eqiad, 10DC-Ops, 10Traffic, 10decommission-hardware: reclaim cescout1001.eqiad.wmnet - https://phabricator.wikimedia.org/T275696 (10Cmjohnson) [15:22:32] 10SRE, 10ops-eqiad, 10DC-Ops, 10Traffic, 10decommission-hardware: reclaim cescout1001.eqiad.wmnet - https://phabricator.wikimedia.org/T275696 (10Cmjohnson) 05Open→03Resolved 
[15:22:51] (03CR) 10Arturo Borrero Gonzalez: [C: 03+2] openstack: manila: install python3-manilaclient [puppet] - 10https://gerrit.wikimedia.org/r/722639 (https://phabricator.wikimedia.org/T291257) (owner: 10Arturo Borrero Gonzalez) [15:24:31] 10SRE, 10Patch-For-Review, 10SRE Observability (FY2021/2022-Q1): rsyslog service should fail on configuration errors - https://phabricator.wikimedia.org/T290870 (10lmata) [15:24:54] !log cmjohnson@cumin1001 END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [15:24:56] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:25:20] 10SRE, 10Icinga, 10Observability-Alerting, 10observability: Extend dpkg Icinga check to also check for inconsistent apt state - https://phabricator.wikimedia.org/T190693 (10lmata) [15:26:15] !log upgrade php7.2 on app-canaries and restart service - T291052 [15:26:19] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:26:20] T291052: Deploy PHP patch for DOM replaceChild/removeChild performance - https://phabricator.wikimedia.org/T291052 [15:34:08] (03PS1) 10Bstorm: cloudnfs: remove another conflict with the wmcs/instance profile [puppet] - 10https://gerrit.wikimedia.org/r/722643 (https://phabricator.wikimedia.org/T291406) [15:36:24] (03CR) 10Giuseppe Lavagetto: "The way we're managing this for MediaWiki and other stuff is to actually define the "semver" label (which is *guaranteed* to be immutable)" [deployment-charts] - 10https://gerrit.wikimedia.org/r/722472 (https://phabricator.wikimedia.org/T291442) (owner: 10BryanDavis) [15:37:12] (03CR) 10Giuseppe Lavagetto: "For the record, base images with versions like "0.1" are immutable tags; only "latest" is mutable, and should thus only used in developmen" [deployment-charts] - 10https://gerrit.wikimedia.org/r/722472 (https://phabricator.wikimedia.org/T291442) (owner: 10BryanDavis) [15:39:14] (03CR) 10Bstorm: [C: 03+2] "This is net new and broken without so I'm just merging this." 
[puppet] - 10https://gerrit.wikimedia.org/r/722643 (https://phabricator.wikimedia.org/T291406) (owner: 10Bstorm) [15:39:17] (03CR) 10Brennen Bearnes: dev-images: migrate repository to gitlab remote (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/719363 (https://phabricator.wikimedia.org/T290259) (owner: 10Brennen Bearnes) [15:39:39] !log update pcc facts [15:39:42] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:42:51] (03PS1) 10Arturo Borrero Gonzalez: openstack: manila: install manila-data package [puppet] - 10https://gerrit.wikimedia.org/r/722645 (https://phabricator.wikimedia.org/T291257) [15:44:04] (03CR) 10Cwhite: opensearch: fork elasticsearch module into opensearch module (032 comments) [puppet] - 10https://gerrit.wikimedia.org/r/721359 (https://phabricator.wikimedia.org/T288618) (owner: 10Cwhite) [15:46:19] !log jgiannelos@deploy1002 helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' . [15:46:22] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:50:23] (03CR) 10Ottomata: [C: 03+1] Add kafka clusters' brokers to spicerack config (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/721857 (https://phabricator.wikimedia.org/T276469) (owner: 10ZPapierski) [15:54:17] (03PS1) 10Bstorm: cloudnfs: some cleanup in case we want more forking [puppet] - 10https://gerrit.wikimedia.org/r/722647 (https://phabricator.wikimedia.org/T291406) [15:54:19] !log mwdebug-deploy@deploy1002 helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [15:54:24] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:55:30] (03PS2) 10Giuseppe Lavagetto: _tls_helpers: bump to envoy config v3 api [deployment-charts] - 10https://gerrit.wikimedia.org/r/722631 [15:56:14] !log mwdebug-deploy@deploy1002 helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . 
[15:56:18] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:59:57] 10SRE-tools, 10Infrastructure-Foundations: Cookbooks: convert wmf-auto-reimage scripts to Cookbooks - https://phabricator.wikimedia.org/T205885 (10joanna_borun) 05Open→03In progress [16:00:03] 10SRE, 10SRE-tools, 10Infrastructure-Foundations, 10Goal: Expand Spicerack library and SRE Cookbooks - Q2 2018-19 Goal - https://phabricator.wikimedia.org/T205867 (10joanna_borun) [16:00:05] jbond and rzl: My dear minions, it's time we take the moon! Just kidding. Time for Puppet request window deploy. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20210921T1600). [16:00:05] toan: A patch you scheduled for Puppet request window is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker. [16:00:08] 10Puppet, 10Infrastructure-Foundations, 10Patch-For-Review, 10User-jbond: puppetdb seems to be slow on host reimage - https://phabricator.wikimedia.org/T263578 (10joanna_borun) 05Open→03In progress [16:00:20] 10SRE, 10Infrastructure-Foundations, 10netops: Create an alert for output discards on network devices - https://phabricator.wikimedia.org/T284593 (10joanna_borun) 05Open→03In progress [16:00:37] 10SRE, 10SRE-tools, 10Infrastructure-Foundations, 10homer, and 3 others: Investigate Capirca - https://phabricator.wikimedia.org/T273865 (10joanna_borun) 05Open→03In progress [16:01:28] (03PS1) 10Daniel Kinzler: Create functional values-beta.yaml [deployment-charts] - 10https://gerrit.wikimedia.org/r/722649 [16:01:37] I'm around o/ [16:01:43] toan: hi! stand by, we're having some issues with the puppet compiler so may need to postpone this to the next puppet window :/ [16:02:01] 10SRE, 10Infrastructure-Foundations, 10netops: ripe-atlas-codfw is down - https://phabricator.wikimedia.org/T267714 (10joanna_borun) 05Open→03In progress [16:02:07] okay! 
[16:02:28] looks like a pretty straightforward apache change as far as they go though, so maybe we can just run without it? give me a minute [16:02:31] 10SRE-tools, 10Infrastructure-Foundations, 10netbox: Netbox support for svc allocation - https://phabricator.wikimedia.org/T263429 (10joanna_borun) 05Open→03In progress [16:03:04] (03PS1) 10Jgiannelos: tegola-vector-tiles: Use envoy as DB proxy in all envs [deployment-charts] - 10https://gerrit.wikimedia.org/r/722652 [16:03:25] (03CR) 10jerkins-bot: [V: 04-1] Create functional values-beta.yaml [deployment-charts] - 10https://gerrit.wikimedia.org/r/722649 (owner: 10Daniel Kinzler) [16:03:55] I honestly don't know what it means not running it through the puppet compiler, but yes the changes aren't too crazy [16:04:05] (03PS2) 10Jgiannelos: tegola-vector-tiles: Use envoy as DB proxy in all envs [deployment-charts] - 10https://gerrit.wikimedia.org/r/722652 (https://phabricator.wikimedia.org/T283159) [16:04:52] 10SRE, 10Infrastructure-Foundations, 10observability, 10CAS-SSO, 10User-jbond: Icinga Monitoring for CAS - https://phabricator.wikimedia.org/T233935 (10joanna_borun) 05Open→03In progress [16:04:54] 10SRE, 10Infrastructure-Foundations, 10Security-Team, 10CAS-SSO, 10User-jbond: Further steps for CAS/web SSO - https://phabricator.wikimedia.org/T233921 (10joanna_borun) [16:05:00] toan: oh, https://wikitech.wikimedia.org/wiki/Help:Puppet-compiler if you're interested -- the TLDR is it's a tool for checking the result of a Puppet change before rolling it out [16:05:08] 10SRE-tools, 10Infrastructure-Foundations: Spicerack: split wmf-auto-reimage-lib into Spicerack modules - https://phabricator.wikimedia.org/T205884 (10joanna_borun) 05Open→03In progress [16:05:14] 10SRE, 10SRE-tools, 10Infrastructure-Foundations, 10Goal: Expand Spicerack library and SRE Cookbooks - Q2 2018-19 Goal - https://phabricator.wikimedia.org/T205867 (10joanna_borun) [16:05:31] I see, thanks [16:07:28] (03CR) 10Jgiannelos: "This came up as an issue 
on the first deployment to codfw k8s (tegola raised DB related connection errors)" [deployment-charts] - 10https://gerrit.wikimedia.org/r/722652 (https://phabricator.wikimedia.org/T283159) (owner: 10Jgiannelos) [16:08:29] (03PS1) 10Jbond: hieradata - O:puppetmaster::puppetdb: dont filter facts used for compilation [puppet] - 10https://gerrit.wikimedia.org/r/722653 [16:08:43] volans: can you check ^^ [16:08:52] sure [16:09:14] thanks [16:10:20] side note - can we add a note to https://wikitech.wikimedia.org/wiki/Help:Puppet-compiler ? [16:10:38] jbond: +1 ship it [16:10:50] (I'll add it to gerrit in a sec... solving a local issue) [16:11:10] thanks volans [16:11:25] (03CR) 10Jbond: [C: 03+2] hieradata - O:puppetmaster::puppetdb: dont filter facts used for compilation [puppet] - 10https://gerrit.wikimedia.org/r/722653 (owner: 10Jbond) [16:12:17] toan: okay, sorry to take a little longer :) if you're still about, I'll go ahead and merge, then will ask you to test [16:12:18] (03PS1) 10Ppchelko: Eventgate: Symlink _helpers and _tls_helpers [deployment-charts] - 10https://gerrit.wikimedia.org/r/722654 [16:12:30] (03CR) 10Volans: [C: 03+1] "LGTM" [puppet] - 10https://gerrit.wikimedia.org/r/722653 (owner: 10Jbond) [16:12:32] rzl: sounds good [16:12:41] (03CR) 10RLazarus: [C: 03+2] miscweb: Add CSP headers for query builder [puppet] - 10https://gerrit.wikimedia.org/r/708463 (https://phabricator.wikimedia.org/T285761) (owner: 10Ladsgroup) [16:13:49] running puppet on miscweb1002... [16:14:24] !log jgiannelos@deploy1002 helmfile [eqiad] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' . [16:14:27] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:14:37] (03CR) 10Cwhite: "Looking good! This is a great start and there is freedom to expand as need expands over time." [software/ecs] - 10https://gerrit.wikimedia.org/r/722580 (https://phabricator.wikimedia.org/T222826) (owner: 10Jbond) [16:15:41] toan: okay, have a look! 
[16:15:47] cheers! [16:18:02] seems to be working as far as I can tell [16:19:20] sweet [16:21:17] (03PS2) 10Bstorm: cloudnfs: some cleanup in case we want more forking [puppet] - 10https://gerrit.wikimedia.org/r/722647 (https://phabricator.wikimedia.org/T291406) [16:21:21] rzl: Seems to work indeed, thanks [16:21:32] (03PS2) 10Ppchelko: Eventgate: Symlink _helpers and _tls_helpers [deployment-charts] - 10https://gerrit.wikimedia.org/r/722654 [16:21:35] \o/ [16:22:13] (03PS3) 10Ppchelko: Eventgate: Symlink _helpers and _tls_helpers [deployment-charts] - 10https://gerrit.wikimedia.org/r/722654 (https://phabricator.wikimedia.org/T291504) [16:24:02] 10SRE, 10LDAP-Access-Requests, 10Patch-For-Review: Grant Access to ldap/wmf for KCVelaga_(wikimf) - https://phabricator.wikimedia.org/T291475 (10KFrancis) @Marostegui I am confirming Krishna Chaitanya Velaga has an NDA on file. Thanks! [16:24:48] (03CR) 10MSantos: [C: 03+2] tegola-vector-tiles: Use envoy as DB proxy in all envs [deployment-charts] - 10https://gerrit.wikimedia.org/r/722652 (https://phabricator.wikimedia.org/T283159) (owner: 10Jgiannelos) [16:25:38] (03CR) 10Bstorm: [C: 03+2] cloudnfs: some cleanup in case we want more forking [puppet] - 10https://gerrit.wikimedia.org/r/722647 (https://phabricator.wikimedia.org/T291406) (owner: 10Bstorm) [16:27:51] (03CR) 10RLazarus: [C: 03+1] remote: add support to enable/disable Cumin output (032 comments) [software/spicerack] - 10https://gerrit.wikimedia.org/r/720993 (owner: 10Volans) [16:28:20] (03PS1) 10Ppchelko: DNM: Demo of the changes between eventgate and common_templates [deployment-charts] - 10https://gerrit.wikimedia.org/r/722656 [16:28:44] (03CR) 10Muehlenhoff: [C: 03+1] "Looks good and the Exim 4.90 changelog confirms:" [puppet] - 10https://gerrit.wikimedia.org/r/722630 (https://phabricator.wikimedia.org/T286911) (owner: 10Herron) [16:29:05] (03Merged) 10jenkins-bot: tegola-vector-tiles: Use envoy as DB proxy in all envs [deployment-charts] - 
10https://gerrit.wikimedia.org/r/722652 (https://phabricator.wikimedia.org/T283159) (owner: 10Jgiannelos) [16:29:45] (03CR) 10Ppchelko: "Given that this is impossible to review, here's a diff between eventgate helpers and common_templates I've symlinked: https://gerrit.wikim" [deployment-charts] - 10https://gerrit.wikimedia.org/r/722654 (https://phabricator.wikimedia.org/T291504) (owner: 10Ppchelko) [16:30:09] (03PS1) 10Bstorm: cloudnfs: connect toolsbeta to the VM-based NFS server for testing [puppet] - 10https://gerrit.wikimedia.org/r/722658 (https://phabricator.wikimedia.org/T291406) [16:33:26] !log jgiannelos@deploy1002 helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' . [16:33:29] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:33:56] !log mwdebug-deploy@deploy1002 helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [16:33:59] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:34:06] (03PS1) 10TrainBranchBot: Branch commit for wmf/1.38.0-wmf.1 [core] (wmf/1.38.0-wmf.1) - 10https://gerrit.wikimedia.org/r/722659 [16:34:08] (03CR) 10TrainBranchBot: [C: 03+2] Branch commit for wmf/1.38.0-wmf.1 [core] (wmf/1.38.0-wmf.1) - 10https://gerrit.wikimedia.org/r/722659 (owner: 10TrainBranchBot) [16:35:50] !log mwdebug-deploy@deploy1002 helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . 
[16:35:53] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:36:15] (03PS2) 10Bstorm: cloudnfs: connect toolsbeta to the VM-based NFS server for testing [puppet] - 10https://gerrit.wikimedia.org/r/722658 (https://phabricator.wikimedia.org/T291406) [16:38:27] (03CR) 10Jbond: [C: 03+2] primary_interface: filter out SLAAC addresses [puppet] - 10https://gerrit.wikimedia.org/r/722621 (owner: 10Jbond) [16:39:32] RECOVERY - WDQS high update lag on wdqs1004 is OK: (C)4.32e+04 ge (W)2.16e+04 ge 2.122e+04 https://wikitech.wikimedia.org/wiki/Wikidata_query_service/Runbook%23Update_lag https://grafana.wikimedia.org/dashboard/db/wikidata-query-service?orgId=1&panelId=8&fullscreen [16:41:23] (03CR) 10Muehlenhoff: data.yaml: Add KCVelaga (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/722598 (https://phabricator.wikimedia.org/T291475) (owner: 10Marostegui) [16:51:47] !log jgiannelos@deploy1002 helmfile [eqiad] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' . 
[16:51:50] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:53:03] (03CR) 10Marostegui: [C: 04-2] data.yaml: Add KCVelaga (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/722598 (https://phabricator.wikimedia.org/T291475) (owner: 10Marostegui) [16:53:12] (03PS4) 10Ppchelko: Eventgate: Symlink _helpers and _tls_helpers [deployment-charts] - 10https://gerrit.wikimedia.org/r/722654 (https://phabricator.wikimedia.org/T291504) [16:53:36] (03PS2) 10Marostegui: data.yaml: Add KCVelaga [puppet] - 10https://gerrit.wikimedia.org/r/722598 (https://phabricator.wikimedia.org/T291475) [16:54:00] (03PS2) 10Ppchelko: DNM: Demo of the changes between eventgate and common_templates [deployment-charts] - 10https://gerrit.wikimedia.org/r/722656 [16:54:36] (03PS5) 10Ppchelko: Eventgate: Symlink _helpers and _tls_helpers [deployment-charts] - 10https://gerrit.wikimedia.org/r/722654 (https://phabricator.wikimedia.org/T291504) [16:55:15] (03Merged) 10jenkins-bot: Branch commit for wmf/1.38.0-wmf.1 [core] (wmf/1.38.0-wmf.1) - 10https://gerrit.wikimedia.org/r/722659 (owner: 10TrainBranchBot) [16:58:24] !log mwdebug-deploy@deploy1002 helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [16:58:27] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:59:37] (03CR) 10Volans: "reply inline" [software/spicerack] - 10https://gerrit.wikimedia.org/r/720993 (owner: 10Volans) [17:00:05] chrisalbon and accraze: I seem to be stuck in Groundhog week. Sigh. Time for (yet another) Services – Graphoid / ORES deploy. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20210921T1700). [17:00:21] !log mwdebug-deploy@deploy1002 helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . 
[17:00:24] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:07:14] (03PS2) 10Jelto: modules::gitlab add missing fields from ansible gitlab.rb template [puppet] - 10https://gerrit.wikimedia.org/r/722370 (https://phabricator.wikimedia.org/T283076) [17:08:40] !log jgiannelos@deploy1002 helmfile [eqiad] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' . [17:08:44] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:11:42] (03PS1) 10Dduvall: testwikis wikis to 1.38.0-wmf.1 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/722661 [17:11:44] (03CR) 10Dduvall: [C: 03+2] testwikis wikis to 1.38.0-wmf.1 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/722661 (owner: 10Dduvall) [17:11:49] 10SRE, 10Traffic, 10Patch-For-Review: Deploy durum: check service for Wikidough - https://phabricator.wikimedia.org/T289536 (10cmooney) Thanks for all the background info here. Regarding the use cases for the manual entries, yes there are probably some (like wikimedia-dns.org) that we could adjust the scrip... [17:12:33] (03Merged) 10jenkins-bot: testwikis wikis to 1.38.0-wmf.1 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/722661 (owner: 10Dduvall) [17:12:34] !log dduvall@deploy1002 Started scap: testwikis wikis to 1.38.0-wmf.1 [17:12:38] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:12:55] (03CR) 10Muehlenhoff: [C: 03+1] "LGTM" [puppet] - 10https://gerrit.wikimedia.org/r/722598 (https://phabricator.wikimedia.org/T291475) (owner: 10Marostegui) [17:18:37] 10SRE, 10CommRel-Specialists-Support (Jul-Sep-2021), 10Datacenter-Switchover: CommRel support for September 2021 Switchover - https://phabricator.wikimedia.org/T287546 (10Trizek-WMF) >>! In T287546#7365753, @Elitre wrote: > Can this be closed then? Szymon and I will debrief it next week. Then we will close... 
[17:25:05] (03CR) 10Jelto: "When moving from ansible to puppet I would like to have a similar gitlab.rb config file. This change makes the gitlab.rb created by puppet" [puppet] - 10https://gerrit.wikimedia.org/r/722370 (https://phabricator.wikimedia.org/T283076) (owner: 10Jelto) [17:27:24] !log jgiannelos@deploy1002 helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' . [17:27:27] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:31:53] (03CR) 10Herron: opensearch: fork elasticsearch module into opensearch module (032 comments) [puppet] - 10https://gerrit.wikimedia.org/r/721359 (https://phabricator.wikimedia.org/T288618) (owner: 10Cwhite) [17:33:13] 10SRE-Access-Requests: Requesting access to RESOURCE for USER[S] - https://phabricator.wikimedia.org/T291508 (10NRodriguez) [17:33:30] 10SRE-Access-Requests: Requesting access to RESOURCE for NRodriguez - https://phabricator.wikimedia.org/T291508 (10NRodriguez) [17:33:46] 10SRE-Access-Requests: Requesting access to analytics-privatedata-access for NRodriguez - https://phabricator.wikimedia.org/T291508 (10NRodriguez) [17:35:09] !log jiji@deploy1002 helmfile [eqiad] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' . [17:35:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:35:54] (03CR) 10Ppchelko: "One difference between eventgate fork of _helpers and common_templates/_helpers is a redefined wmf.releasename template. eventgate was usi" [deployment-charts] - 10https://gerrit.wikimedia.org/r/722654 (https://phabricator.wikimedia.org/T291504) (owner: 10Ppchelko) [17:37:49] !log mwdebug-deploy@deploy1002 helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . 
[17:37:53] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:38:13] (03PS6) 10Ppchelko: Eventgate: Symlink _helpers and _tls_helpers [deployment-charts] - 10https://gerrit.wikimedia.org/r/722654 (https://phabricator.wikimedia.org/T291504) [17:39:15] !log update pcc facts [17:39:17] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:42:27] !log mwdebug-deploy@deploy1002 helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [17:42:34] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:46:38] (03CR) 10Dzahn: [C: 03+1] data.yaml: Add KCVelaga [puppet] - 10https://gerrit.wikimedia.org/r/722598 (https://phabricator.wikimedia.org/T291475) (owner: 10Marostegui) [17:48:19] !log dduvall@deploy1002 Finished scap: testwikis wikis to 1.38.0-wmf.1 (duration: 35m 44s) [17:48:22] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:48:35] !log jiji@deploy1002 helmfile [eqiad] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' . 
[17:48:38] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:49:21] !log 1.38.0-wmf.1 deployed to testwikis (T281165) [17:49:25] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:49:26] T281165: 1.38.0-wmf.1 deployment blockers - https://phabricator.wikimedia.org/T281165 [17:52:00] (03PS1) 10Zabe: build: Upgrade composer testing stack to latest as used Wikimedia-wide [mediawiki-config] - 10https://gerrit.wikimedia.org/r/722663 [17:52:32] (03CR) 10jerkins-bot: [V: 04-1] build: Upgrade composer testing stack to latest as used Wikimedia-wide [mediawiki-config] - 10https://gerrit.wikimedia.org/r/722663 (owner: 10Zabe) [17:53:14] (03CR) 10Volans: remote: add support to enable/disable Cumin output (031 comment) [software/spicerack] - 10https://gerrit.wikimedia.org/r/720993 (owner: 10Volans) [17:56:13] (03PS2) 10Zabe: build: Upgrade composer dependencies [mediawiki-config] - 10https://gerrit.wikimedia.org/r/722663 [17:56:39] (03CR) 10jerkins-bot: [V: 04-1] build: Upgrade composer dependencies [mediawiki-config] - 10https://gerrit.wikimedia.org/r/722663 (owner: 10Zabe) [18:00:05] Deploy window Pre MediaWiki train break (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20210921T1800) [18:05:07] (03PS1) 10Bstorm: toolforge harbor: install clients for redis and postgres [puppet] - 10https://gerrit.wikimedia.org/r/722664 (https://phabricator.wikimedia.org/T267616) [18:15:19] (03CR) 10Bstorm: [C: 03+2] toolforge harbor: install clients for redis and postgres [puppet] - 10https://gerrit.wikimedia.org/r/722664 (https://phabricator.wikimedia.org/T267616) (owner: 10Bstorm) [18:24:15] (03CR) 10Legoktm: [C: 04-1] "Looks correct except for the IP address." 
[puppet] - 10https://gerrit.wikimedia.org/r/713959 (owner: 10Ebernhardson) [18:27:27] (03CR) 10Ottomata: Eventgate: Symlink _helpers and _tls_helpers (031 comment) [deployment-charts] - 10https://gerrit.wikimedia.org/r/722654 (https://phabricator.wikimedia.org/T291504) (owner: 10Ppchelko) [18:28:18] (03PS5) 10Ryan Kemper: Add wcqs.svc.{codfw,eqiad}.wmnet [dns] - 10https://gerrit.wikimedia.org/r/713929 (https://phabricator.wikimedia.org/T280001) (owner: 10Ebernhardson) [18:28:26] (03CR) 10jerkins-bot: [V: 04-1] Add wcqs.svc.{codfw,eqiad}.wmnet [dns] - 10https://gerrit.wikimedia.org/r/713929 (https://phabricator.wikimedia.org/T280001) (owner: 10Ebernhardson) [18:30:14] PROBLEM - Mobileapps LVS codfw on mobileapps.svc.codfw.wmnet is CRITICAL: /{domain}/v1/page/description/{title} (Get description for test page) timed out before a response was received https://wikitech.wikimedia.org/wiki/Mobileapps_%28service%29 [18:32:14] RECOVERY - Mobileapps LVS codfw on mobileapps.svc.codfw.wmnet is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Mobileapps_%28service%29 [18:35:54] (03PS6) 10Ryan Kemper: Add wcqs.svc.{codfw,eqiad}.wmnet [dns] - 10https://gerrit.wikimedia.org/r/713929 (https://phabricator.wikimedia.org/T280001) (owner: 10Ebernhardson) [18:36:02] (03CR) 10Ppchelko: Eventgate: Symlink _helpers and _tls_helpers (031 comment) [deployment-charts] - 10https://gerrit.wikimedia.org/r/722654 (https://phabricator.wikimedia.org/T291504) (owner: 10Ppchelko) [18:36:37] (03CR) 10jerkins-bot: [V: 04-1] Add wcqs.svc.{codfw,eqiad}.wmnet [dns] - 10https://gerrit.wikimedia.org/r/713929 (https://phabricator.wikimedia.org/T280001) (owner: 10Ebernhardson) [18:37:04] (03CR) 10Andrew Bogott: [C: 03+1] "lgtm!" 
[puppet] - 10https://gerrit.wikimedia.org/r/722658 (https://phabricator.wikimedia.org/T291406) (owner: 10Bstorm) [18:38:18] (03PS7) 10Ryan Kemper: Add wcqs.svc.{codfw,eqiad}.wmnet [dns] - 10https://gerrit.wikimedia.org/r/713929 (https://phabricator.wikimedia.org/T280001) (owner: 10Ebernhardson) [18:38:55] (03CR) 10jerkins-bot: [V: 04-1] Add wcqs.svc.{codfw,eqiad}.wmnet [dns] - 10https://gerrit.wikimedia.org/r/713929 (https://phabricator.wikimedia.org/T280001) (owner: 10Ebernhardson) [18:39:10] (03PS8) 10Ryan Kemper: Add wcqs.svc.{codfw,eqiad}.wmnet [dns] - 10https://gerrit.wikimedia.org/r/713929 (https://phabricator.wikimedia.org/T280001) (owner: 10Ebernhardson) [18:40:37] (03CR) 10Bstorm: [C: 03+2] cloudnfs: connect toolsbeta to the VM-based NFS server for testing [puppet] - 10https://gerrit.wikimedia.org/r/722658 (https://phabricator.wikimedia.org/T291406) (owner: 10Bstorm) [18:44:49] (03CR) 10Ryan Kemper: [C: 03+2] Add wcqs.svc.{codfw,eqiad}.wmnet [dns] - 10https://gerrit.wikimedia.org/r/713929 (https://phabricator.wikimedia.org/T280001) (owner: 10Ebernhardson) [18:44:58] !log T280001 Merging https://gerrit.wikimedia.org/r/c/operations/dns/+/713929; will follow steps in https://wikitech.wikimedia.org/wiki/DNS#Changing_records_in_a_zonefile post-merge [18:45:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:45:05] T280001: Set up puppet configuration for new WCQS cluster - https://phabricator.wikimedia.org/T280001 [18:45:31] !log T280001 `ryankemper@authdns1001:~$ sudo authdns-update` [18:45:35] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:46:58] (03CR) 10Ppchelko: Eventgate: Symlink _helpers and _tls_helpers (031 comment) [deployment-charts] - 10https://gerrit.wikimedia.org/r/722654 (https://phabricator.wikimedia.org/T291504) (owner: 10Ppchelko) [18:48:08] !log T280001 `OK - authdns-update successful on all nodes!` [18:48:12] Logged the message at 
https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:53:18] !log T280001 `for i in 0 1 2 ; do dig @ns${i}.wikimedia.org -t any wcqs.svc.[eqiad,codfw].wmnet ; done` looks as expected [18:53:22] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:53:24] T280001: Set up puppet configuration for new WCQS cluster - https://phabricator.wikimedia.org/T280001 [18:54:26] 10SRE, 10Wikimedia-Mailing-lists: Wikimedia-l Digests include $hyperkitty_url in the footer instead of a url - https://phabricator.wikimedia.org/T291511 (10Legoktm) 05Open→03Resolved a:03Legoktm I updated the digest footer to directly link to the archives rather than using $hyperkitty_url (cc: @Ijon). I'... [18:56:19] (03PS7) 10Ppchelko: Eventgate: Symlink _helpers and _tls_helpers [deployment-charts] - 10https://gerrit.wikimedia.org/r/722654 (https://phabricator.wikimedia.org/T291504) [18:56:55] !log T280001 Running `sudo -i cookbook sre.dns.netbox -t T280001 'Added wcqs.svc.[eqiad,codfw].wmnet'` per final step of https://wikitech.wikimedia.org/wiki/LVS#DNS_changes_(svc_zone_only)... [18:56:59] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:57:04] !log ryankemper@cumin1001 START - Cookbook sre.dns.netbox [18:57:07] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:00:05] dduvall and hashar: That opportune time is upon us again. Time for a MediaWiki train - American+European Version deploy. Don't be afraid. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20210921T1900). 
[19:03:59] !log ryankemper@cumin1001 END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [19:04:02] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:04:39] (03PS1) 10Dduvall: group0 wikis to 1.38.0-wmf.1 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/722667 [19:04:41] (03CR) 10Dduvall: [C: 03+2] group0 wikis to 1.38.0-wmf.1 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/722667 (owner: 10Dduvall) [19:05:05] (03CR) 10Ppchelko: "The latest version is following the logic described in https://phabricator.wikimedia.org/T242861#5819686" [deployment-charts] - 10https://gerrit.wikimedia.org/r/722654 (https://phabricator.wikimedia.org/T291504) (owner: 10Ppchelko) [19:05:09] (03CR) 10Gergő Tisza: [C: 03+1] Undeploy GettingStarted I: Disable on all wikis [mediawiki-config] - 10https://gerrit.wikimedia.org/r/722574 (https://phabricator.wikimedia.org/T235752) (owner: 10Urbanecm) [19:05:25] (03Merged) 10jenkins-bot: group0 wikis to 1.38.0-wmf.1 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/722667 (owner: 10Dduvall) [19:06:45] !log dduvall@deploy1002 rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.1 [19:06:48] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:08:07] (03CR) 10Gergő Tisza: [C: 03+1] Undeploy getting started III: Don't set wmgUseGettingStarted, now ignored [mediawiki-config] - 10https://gerrit.wikimedia.org/r/722576 (https://phabricator.wikimedia.org/T235752) (owner: 10Urbanecm) [19:08:19] (03CR) 10Gergő Tisza: [C: 03+1] Undeploy GettingStarted IV: Don't build i18n [mediawiki-config] - 10https://gerrit.wikimedia.org/r/722577 (owner: 10Urbanecm) [19:08:31] (03CR) 10Gergő Tisza: [C: 03+1] Undeploy GettingStarted V: Remove now-obsolete logging channels [mediawiki-config] - 10https://gerrit.wikimedia.org/r/722578 (https://phabricator.wikimedia.org/T235752) (owner: 10Urbanecm) [19:09:21] (03PS1) 10Bstorm: cloudnfs: adding DNS to nfs server [puppet] - 
10https://gerrit.wikimedia.org/r/722668 (https://phabricator.wikimedia.org/T291406) [19:10:43] !log T280001 `sre.dns.netbox` completed successfully [19:10:48] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:10:49] T280001: Set up puppet configuration for new WCQS cluster - https://phabricator.wikimedia.org/T280001 [19:12:37] (03CR) 10Bstorm: [C: 03+2] cloudnfs: adding DNS to nfs server [puppet] - 10https://gerrit.wikimedia.org/r/722668 (https://phabricator.wikimedia.org/T291406) (owner: 10Bstorm) [19:12:45] (03CR) 10Ottomata: "I think this will work, buuuut IIRC Alex had some resistance to using main_app.name in wmf.releasename?" [deployment-charts] - 10https://gerrit.wikimedia.org/r/722654 (https://phabricator.wikimedia.org/T291504) (owner: 10Ppchelko) [19:15:38] !log mwdebug-deploy@deploy1002 helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [19:15:45] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:19:06] !log mwdebug-deploy@deploy1002 helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [19:19:09] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:22:46] Pchelolo, DannyS712: please let me know asap whether T291515 is severe enough to warrant immediate rollback. my first inclination is to hold at group0 since there are so few instances of the error [19:22:46] T291515: PHP Deprecated: $wgUser reassignment detected [Called from MediaWiki\Extension\CentralAuth\Special\SpecialCentralLogin::execute] - https://phabricator.wikimedia.org/T291515 [19:23:11] dduvall: this is just log. no user-visible impact [19:23:29] great. thanks [19:23:50] since it's a deprecation, we can make it a blocker for the next deploy. [19:24:37] sounds good! 
[19:25:37] I first got a bit scared my CA work was causing issues, but thankfully that seems unrelated to it [19:25:39] but in this case, it's interesting - the thing we're trying to deprecate is massive, so it's expected we'd have some warnings about forgotten things [19:28:45] Pchelolo: I'm going offline now, but if you need code review / whatever from centralauth perspective tomorrow please ping me on phab [19:28:54] ok, will do [19:29:10] need to think what to do about this. maybe DannyS712 has good ideas [19:37:33] dduvall: if you are done, I would like to push a patch to turn on a logging bucket :) should be easy https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/722601 [19:38:27] 10Puppet, 10Infrastructure-Foundations, 10GitLab (Infrastructure), 10Patch-For-Review, and 3 others: Puppetise gitlab-ansible playbook - https://phabricator.wikimedia.org/T283076 (10brennen) [19:39:14] (03CR) 10Herron: "https://puppet-compiler.wmflabs.org/compiler1003/31179/" [puppet] - 10https://gerrit.wikimedia.org/r/722630 (https://phabricator.wikimedia.org/T286911) (owner: 10Herron) [19:39:49] (03CR) 10Herron: [C: 03+2] profile::mail::mx: add +dkim_verbose to log_selector on bullseye [puppet] - 10https://gerrit.wikimedia.org/r/722630 (https://phabricator.wikimedia.org/T286911) (owner: 10Herron) [19:42:48] 10SRE, 10GitLab (Auth & Access), 10Release-Engineering-Team (Doing), 10User-brennen: Define auth strategy for GitLab - https://phabricator.wikimedia.org/T274461 (10brennen) [19:50:29] hashar: sorry, yes. go for it [19:51:12] doing it ;) [19:51:29] (03CR) 10Hashar: [C: 03+2] "Talked about it with Bill, Cindy, Ariel and Daniel." [mediawiki-config] - 10https://gerrit.wikimedia.org/r/722601 (owner: 10Hashar) [19:51:38] how do I deploy? 
[19:51:39] ;D [19:52:13] (03Merged) 10jenkins-bot: Enable 'DuplicateParse' logging bucket [mediawiki-config] - 10https://gerrit.wikimedia.org/r/722601 (owner: 10Hashar) [19:52:17] that logging bucket should be low traffic [19:54:04] !log hashar@deploy1002 Synchronized wmf-config/InitialiseSettings.php: Enable 'DuplicateParse' logging bucket (duration: 01m 07s) [19:54:08] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:56:12] nothing has exploded [19:56:40] and nothing got logged to that `DuplicateParse` logging bucket yet [19:56:48] so I am assuming it is very low traffic [19:57:55] !log mwdebug-deploy@deploy1002 helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [19:57:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:59:51] !log mwdebug-deploy@deploy1002 helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [19:59:54] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:06:54] 10SRE, 10Infrastructure-Foundations, 10Mail, 10Patch-For-Review: Upgrade MXes to Bullseye - https://phabricator.wikimedia.org/T286911 (10herron) Re: the above patch -- DKIM metrics dropped off on mx2001 beginning yesterday. Our Exim metrics are generated via mtail parsing of the Exim log, and Exim 4.90 in... [20:10:27] I am done for today. Good night! [20:21:24] (03CR) 10MSantos: "This change is ready for review." [mediawiki-config] - 10https://gerrit.wikimedia.org/r/722605 (https://phabricator.wikimedia.org/T291178) (owner: 10MSantos) [20:21:29] (03PS2) 10MSantos: kartographer: enable tegola in testwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/722605 (https://phabricator.wikimedia.org/T291178) [20:23:32] (03CR) 10MSantos: "Once we're already, this will enable the source right away." 
[mediawiki-config] - 10https://gerrit.wikimedia.org/r/722605 (https://phabricator.wikimedia.org/T291178) (owner: 10MSantos) [20:49:17] 10SRE, 10Editing-team, 10Fundraising-Backlog, 10Performance-Team, and 5 others: RFC: Serve Main Page of Wikimedia wikis from a consistent URL - https://phabricator.wikimedia.org/T120085 (10Jdlrobson) [20:54:28] 10SRE, 10ops-eqiad, 10DC-Ops, 10Infrastructure-Foundations: Q1:(Need By: TBD) rack/setup/install puppetmaster100[45].eqiad.wmnet - https://phabricator.wikimedia.org/T289732 (10Jclark-ctr) [20:59:25] 10SRE, 10ops-eqiad, 10DC-Ops: hw troubleshooting: megaraid reset due to fatal error for labstore1005.eqiad.wmnet - https://phabricator.wikimedia.org/T290318 (10Jclark-ctr) [20:59:58] 10SRE, 10ops-eqiad, 10Analytics-Clusters, 10DC-Ops, 10Data-Engineering: Q1:(Need By: ASAP) rack/setup/install an-db100[12].eqiad.wmnet - https://phabricator.wikimedia.org/T289632 (10Ottomata) Awesome! Servers in the DC! I can/would work on these boxes ASAP...in case that factors into the priority for t... 
[21:01:22] SRE, ops-eqiad, Analytics-Clusters, DC-Ops, Data-Engineering: Q1:(Need By: ASAP) rack/setup/install an-db100[12].eqiad.wmnet - https://phabricator.wikimedia.org/T289632 (Jclark-ctr)
[21:19:39] (PS1) Brennen Bearnes: set session lifetime to 604800s (1w) [gitlab-ansible] - https://gerrit.wikimedia.org/r/722682 (https://phabricator.wikimedia.org/T288757)
[21:24:17] (PS3) Ryan Kemper: blazegraph: LVS for WCQS step 1 [puppet] - https://gerrit.wikimedia.org/r/713959 (https://phabricator.wikimedia.org/T280001) (owner: Ebernhardson)
[21:24:38] (CR) jerkins-bot: [V: -1] blazegraph: LVS for WCQS step 1 [puppet] - https://gerrit.wikimedia.org/r/713959 (https://phabricator.wikimedia.org/T280001) (owner: Ebernhardson)
[21:27:42] (CR) Ryan Kemper: blazegraph: LVS for WCQS step 1 (2 comments) [puppet] - https://gerrit.wikimedia.org/r/713959 (https://phabricator.wikimedia.org/T280001) (owner: Ebernhardson)
[21:30:35] (PS4) Ryan Kemper: blazegraph: LVS for WCQS step 1 [puppet] - https://gerrit.wikimedia.org/r/713959 (https://phabricator.wikimedia.org/T280001) (owner: Ebernhardson)
[21:31:17] (CR) jerkins-bot: [V: -1] blazegraph: LVS for WCQS step 1 [puppet] - https://gerrit.wikimedia.org/r/713959 (https://phabricator.wikimedia.org/T280001) (owner: Ebernhardson)
[21:31:33] (PS5) Ryan Kemper: blazegraph: LVS for WCQS step 1 [puppet] - https://gerrit.wikimedia.org/r/713959 (https://phabricator.wikimedia.org/T280001) (owner: Ebernhardson)
[21:32:03] (CR) Ryan Kemper: "check experimental" [puppet] - https://gerrit.wikimedia.org/r/713959 (https://phabricator.wikimedia.org/T280001) (owner: Ebernhardson)
[21:32:26] (CR) Brennen Bearnes: [C: -1] "Looks like the necessary change to omnibus packages will be published in 14.3, which isn't out yet, so holding off on this." [gitlab-ansible] - https://gerrit.wikimedia.org/r/722682 (https://phabricator.wikimedia.org/T288757) (owner: Brennen Bearnes)
[21:36:28] (PS1) BryanDavis: toolhub: pin mcrouter container versions; remove pull_policy [deployment-charts] - https://gerrit.wikimedia.org/r/722684 (https://phabricator.wikimedia.org/T291442)
[21:36:30] (PS1) BryanDavis: toolhub: Add config.public.SOCIAL_AUTH_PROXIES setting [deployment-charts] - https://gerrit.wikimedia.org/r/722685 (https://phabricator.wikimedia.org/T291447)
[21:36:32] (PS1) BryanDavis: toolhub: bump container version to 2021-09-21-211503-production [deployment-charts] - https://gerrit.wikimedia.org/r/722686
[21:36:34] (PS1) BryanDavis: toolhub: Set SOCIAL_AUTH_PROXIES [deployment-charts] - https://gerrit.wikimedia.org/r/722687 (https://phabricator.wikimedia.org/T291447)
[21:46:25] (CR) BryanDavis: toolhub: set "docker.pull_policy: Always" (1 comment) [deployment-charts] - https://gerrit.wikimedia.org/r/722472 (https://phabricator.wikimedia.org/T291442) (owner: BryanDavis)
[21:52:52] (CR) BryanDavis: [C: +2] toolhub: pin mcrouter container versions; remove pull_policy [deployment-charts] - https://gerrit.wikimedia.org/r/722684 (https://phabricator.wikimedia.org/T291442) (owner: BryanDavis)
[21:56:47] (Merged) jenkins-bot: toolhub: pin mcrouter container versions; remove pull_policy [deployment-charts] - https://gerrit.wikimedia.org/r/722684 (https://phabricator.wikimedia.org/T291442) (owner: BryanDavis)
[21:57:17] (PS1) Bstorm: cloudnfs: refactor profile to allow hiera to set server vols [puppet] - https://gerrit.wikimedia.org/r/722689 (https://phabricator.wikimedia.org/T291406)
[21:58:39] !log bd808@deploy1002 helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
[21:58:50] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[22:01:04] (CR) Ebernhardson: "check experimental" [puppet] - https://gerrit.wikimedia.org/r/713959 (https://phabricator.wikimedia.org/T280001) (owner: Ebernhardson)
[22:04:25] (CR) BryanDavis: [C: +2] toolhub: Add config.public.SOCIAL_AUTH_PROXIES setting [deployment-charts] - https://gerrit.wikimedia.org/r/722685 (https://phabricator.wikimedia.org/T291447) (owner: BryanDavis)
[22:04:37] (CR) Bstorm: [C: +2] cloudnfs: refactor profile to allow hiera to set server vols [puppet] - https://gerrit.wikimedia.org/r/722689 (https://phabricator.wikimedia.org/T291406) (owner: Bstorm)
[22:05:39] (PS1) Ebernhardson: [DNM] Trigger pcc against wcqs2001 [puppet] - https://gerrit.wikimedia.org/r/722691
[22:06:01] (CR) Ebernhardson: "check experimental" [puppet] - https://gerrit.wikimedia.org/r/722691 (owner: Ebernhardson)
[22:08:32] (PS8) Ppchelko: Eventgate: Symlink _helpers and _tls_helpers [deployment-charts] - https://gerrit.wikimedia.org/r/722654 (https://phabricator.wikimedia.org/T291504)
[22:08:44] (Merged) jenkins-bot: toolhub: Add config.public.SOCIAL_AUTH_PROXIES setting [deployment-charts] - https://gerrit.wikimedia.org/r/722685 (https://phabricator.wikimedia.org/T291447) (owner: BryanDavis)
[22:11:46] (CR) Ppchelko: "This is still a workaround - we need a standard for another dimension of installs for eventgate, using chart name + release name is not en" [deployment-charts] - https://gerrit.wikimedia.org/r/722654 (https://phabricator.wikimedia.org/T291504) (owner: Ppchelko)
[22:15:36] (PS9) Ppchelko: Eventgate: Symlink _helpers and _tls_helpers [deployment-charts] - https://gerrit.wikimedia.org/r/722654 (https://phabricator.wikimedia.org/T291504)
[22:17:11] (Abandoned) Ebernhardson: [DNM] Trigger pcc against wcqs2001 [puppet] - https://gerrit.wikimedia.org/r/722691 (owner: Ebernhardson)
[22:18:25] (CR) BryanDavis: [C: +2] toolhub: bump container version to 2021-09-21-211503-production [deployment-charts] - https://gerrit.wikimedia.org/r/722686 (owner: BryanDavis)
[22:18:39] (CR) BryanDavis: "recheck" [deployment-charts] - https://gerrit.wikimedia.org/r/722687 (https://phabricator.wikimedia.org/T291447) (owner: BryanDavis)
[22:22:17] (Merged) jenkins-bot: toolhub: bump container version to 2021-09-21-211503-production [deployment-charts] - https://gerrit.wikimedia.org/r/722686 (owner: BryanDavis)
[22:26:53] (PS10) Ppchelko: Eventgate: Symlink _helpers and _tls_helpers [deployment-charts] - https://gerrit.wikimedia.org/r/722654 (https://phabricator.wikimedia.org/T291504)
[22:27:14] (CR) Cwhite: opensearch: fork elasticsearch module into opensearch module (2 comments) [puppet] - https://gerrit.wikimedia.org/r/721359 (https://phabricator.wikimedia.org/T288618) (owner: Cwhite)
[22:28:31] (CR) Cwhite: [C: +1] "LGTM" [puppet] - https://gerrit.wikimedia.org/r/721773 (https://phabricator.wikimedia.org/T246470) (owner: Elukey)
[22:29:06] !log bd808@deploy1002 helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
[22:29:11] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[22:33:07] (CR) BryanDavis: [C: +2] toolhub: Set SOCIAL_AUTH_PROXIES [deployment-charts] - https://gerrit.wikimedia.org/r/722687 (https://phabricator.wikimedia.org/T291447) (owner: BryanDavis)
[22:35:19] (CR) Ppchelko: "I don't quite understand the output of the CI, it seems to be rendering old chart version with new helm file at some point?" [deployment-charts] - https://gerrit.wikimedia.org/r/722654 (https://phabricator.wikimedia.org/T291504) (owner: Ppchelko)
[22:37:13] (Merged) jenkins-bot: toolhub: Set SOCIAL_AUTH_PROXIES [deployment-charts] - https://gerrit.wikimedia.org/r/722687 (https://phabricator.wikimedia.org/T291447) (owner: BryanDavis)
[22:44:45] (CR) Cwhite: search-platform: add flink alerts (9 comments) [alerts] - https://gerrit.wikimedia.org/r/720066 (https://phabricator.wikimedia.org/T276467) (owner: DCausse)
[22:56:23] !log bd808@deploy1002 helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
[22:56:27] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[23:00:04] RoanKattouw, Niharika, and Urbanecm: (Dis)respected human, time to deploy Evening backport window (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20210921T2300). Please do the needful.
[23:00:05] No Gerrit patches in the queue for this window AFAICS.
[23:12:26] (PS7) Michael DiPietro: create role to deploy staging instance for quarry [puppet] - https://gerrit.wikimedia.org/r/721585 (https://phabricator.wikimedia.org/T291204)
[23:19:00] !log mwdebug-deploy@deploy1002 helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
[23:19:03] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[23:20:58] !log mwdebug-deploy@deploy1002 helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
[23:21:01] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[23:25:21] (CR) Cwhite: profile::logstash::gelf_relay: ingest GELF logs and output as JSON over UDP (1 comment) [puppet] - https://gerrit.wikimedia.org/r/721345 (https://phabricator.wikimedia.org/T288620) (owner: Herron)
[23:28:27] (CR) Cwhite: profile::logstash::gelf_relay: ingest GELF logs and output as JSON over UDP (1 comment) [puppet] - https://gerrit.wikimedia.org/r/721345 (https://phabricator.wikimedia.org/T288620) (owner: Herron)
[23:41:25] SRE, Infrastructure-Foundations, Observability-Metrics, CAS-SSO, and 3 others: Sign-in links from Grafana dashboards don't work when not signed into SSO - https://phabricator.wikimedia.org/T269272 (RLazarus) Open→Resolved a:jbond Nope, I don't still get the loop described in the origi...