[07:04:32] serviceops, Patch-For-Review: Allow coexisting php version in our puppet code - https://phabricator.wikimedia.org/T293450 (Joe) a:Joe
[11:12:28] serviceops, Prod-Kubernetes, Kubernetes: Provision TLS certificates for k8s services in istio-system namespace - https://phabricator.wikimedia.org/T295385 (JMeybohm)
[12:07:09] serviceops, CFSSL-PKI, Infrastructure-Foundations, Prod-Kubernetes, and 2 others: Automate issuing of TLS certificates in kubernetes clusters - https://phabricator.wikimedia.org/T294560 (JMeybohm)
[13:27:50] serviceops, CFSSL-PKI, Infrastructure-Foundations, Prod-Kubernetes, and 2 others: Automate issuing of TLS certificates in kubernetes clusters - https://phabricator.wikimedia.org/T294560 (JMeybohm)
[14:36:29] serviceops, Infrastructure-Foundations, SRE, Znuny: upgrade/replace VRTS (formerly ORTS) buster to bullseye - https://phabricator.wikimedia.org/T295416 (akosiaris) +1 on the general concept and actions. Some more information on the `some magic to import data if needed` part: Not really much is...
[15:35:17] serviceops, GitLab, Release-Engineering-Team, Security-Team: Setup GitLab Runner in trusted environment - https://phabricator.wikimedia.org/T295481 (Jelto)
[16:40:20] serviceops, Performance-Team (Radar): Migrate WMF Production from PHP 7.2 to PHP 7.4 - https://phabricator.wikimedia.org/T271736 (Legoktm)
[16:44:16] serviceops, Shellbox: Migrate Shellbox to PHP 7.4 - https://phabricator.wikimedia.org/T295489 (Legoktm)
[17:07:11] serviceops, SRE, Continuous-Integration-Config, Release-Engineering-Team (CI & Testing services), Test-Coverage: Add pcov PHP extension to wikimedia apt so it can be used in Wikimedia CI - https://phabricator.wikimedia.org/T243847 (Daimona) Invalid→Open This is actually still relevant...
[17:20:41] serviceops, SRE, Continuous-Integration-Config, Release-Engineering-Team (CI & Testing services), Test-Coverage: Add pcov PHP extension to wikimedia apt so it can be used in Wikimedia CI - https://phabricator.wikimedia.org/T243847 (hashar)
[17:34:31] Hi we are facing this issue with tegola tile pregeneration on k8s: https://phabricator.wikimedia.org/T293366#7496674 Can somebody help us with increasing the partitions on the kafka topics we use?
[17:46:18] serviceops, Maps, Patch-For-Review, Product-Infrastructure-Team-Backlog (Kanban): Performance considerations about the current usage of EventPlatform from maps - https://phabricator.wikimedia.org/T293366 (Jgiannelos)
[17:57:45] serviceops, Infrastructure-Foundations, SRE, Znuny: upgrade/replace VRTS (formerly ORTS) buster to bullseye - https://phabricator.wikimedia.org/T295416 (Dzahn) p:Triage→Medium Sounds great, thanks for that, Alex.
[17:59:18] serviceops, GitLab, Release-Engineering-Team, Security-Team: Setup GitLab Runner in trusted environment - https://phabricator.wikimedia.org/T295481 (Dzahn)
[18:03:33] serviceops, GitLab, Release-Engineering-Team, Security-Team: Setup GitLab Runner in trusted environment - https://phabricator.wikimedia.org/T295481 (Dzahn) > I would like to reuse the existing puppet code for the Shared Runners in WMCS. Yes, this is great, all for this. > we could start with V...
[18:11:34] serviceops, GitLab, Release-Engineering-Team, Security-Team: Setup GitLab Runner in trusted environment - https://phabricator.wikimedia.org/T295481 (Joe) The only things I could see as potentially different between these runners and those running in wmcs are: - Need to use the http proxy to reac...
[18:22:21] serviceops, GitLab, Security-Team, Release-Engineering-Team (Radar): Setup GitLab Runner in trusted environment - https://phabricator.wikimedia.org/T295481 (thcipriani)
[18:28:05] serviceops, Release-Engineering-Team: contint hardware refresh? - https://phabricator.wikimedia.org/T294276 (thcipriani)
[18:28:35] serviceops, Release-Engineering-Team: contint hardware refresh - https://phabricator.wikimedia.org/T294276 (thcipriani)
[18:29:56] serviceops, Release-Engineering-Team (Seen): contint hardware refresh - https://phabricator.wikimedia.org/T294276 (thcipriani) Runs out of warranty this calendar year or fiscal year? I'm inclined to go ahead and replace this hardware.
[18:31:20] serviceops, Release-Engineering-Team (Radar): Puppet failure on deploy-1002.devtools.eqiad1.wikimedia.cloud due to missing profile::kubernetes::deployment_server::user_defaults - https://phabricator.wikimedia.org/T294174 (thcipriani)
[18:31:55] serviceops, Release-Engineering-Team (Seen): contint hardware refresh - https://phabricator.wikimedia.org/T294276 (Dzahn) The purchase date was 2016-03-24 (and one day earlier for contint1001) and I expect it to last 5 years.
[18:45:24] serviceops, SRE, Continuous-Integration-Config, Release-Engineering-Team (CI & Testing services), Test-Coverage: Add pcov PHP extension to wikimedia apt so it can be used in Wikimedia CI - https://phabricator.wikimedia.org/T243847 (Legoktm) a:Legoktm Sure.
[19:19:45] nemo-yiannis: o/
[19:19:56] so, i can increase to 6 partitions
[19:20:23] i will do so in main eqiad, main codfw, and in jumbo eqiad, to be consistent everywhere
[19:20:37] just checking that this is def okay before I do (reducing # of partitions is not really possible)
[19:21:55] will run
[19:21:58] kafka topics --alter --topic eqiad.maps.tiles_change --partitions 6
[19:21:58] kafka topics --alter --topic codfw.maps.tiles_change --partitions 6
[19:22:05] in each kafka cluster
[19:23:28] serviceops, SRE, Continuous-Integration-Config, Release-Engineering-Team (CI & Testing services), Test-Coverage: Add pcov PHP extension to wikimedia apt so it can be used in Wikimedia CI - https://phabricator.wikimedia.org/T243847 (Legoktm) Open→Resolved
[19:48:32] hey ottomata, yeah it should be ok
[19:48:43] ok
[19:51:19] serviceops, Maps, Patch-For-Review, Product-Infrastructure-Team-Backlog (Kanban): Performance considerations about the current usage of EventPlatform from maps - https://phabricator.wikimedia.org/T293366 (Ottomata) Running the following in Kafka clusters main-eqiad, main-codfw, and jumbo-eqiad:...
[19:53:57] nemo-yiannis: done
[19:54:04] serviceops, Maps, Patch-For-Review, Product-Infrastructure-Team-Backlog (Kanban): Performance considerations about the current usage of EventPlatform from maps - https://phabricator.wikimedia.org/T293366 (Ottomata) ` 19:52:22 [@kafka-main1001:/home/otto] $ kafka topics --describe --topic eqiad.ma...
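(Annotation, not part of the channel traffic.) The partition change discussed between 19:19 and 19:53 can be sketched as a small dry-run helper. This is only an illustration under stated assumptions: the topic names, the target of 6 partitions, and the `kafka topics --alter` form come from the log, while the function names and the `current` partition count passed to `check_increase` are hypothetical. Nothing here contacts a broker; it only prints the commands so they can be reviewed before being run by hand in each cluster.

```python
import shlex

# Topic names and the target count are taken from the log;
# every helper name below is hypothetical.
TOPICS = ["eqiad.maps.tiles_change", "codfw.maps.tiles_change"]
TARGET_PARTITIONS = 6


def check_increase(current: int, target: int) -> None:
    """Reject anything that is not a strict increase.

    Kafka does not support reducing a topic's partition count,
    so a mistaken target cannot simply be rolled back.
    """
    if target <= current:
        raise ValueError(f"cannot go from {current} to {target} partitions")


def build_alter_commands(topics, partitions):
    """Return one argv list per topic, mirroring the commands quoted in the log."""
    return [
        ["kafka", "topics", "--alter", "--topic", topic, "--partitions", str(partitions)]
        for topic in topics
    ]


def render(commands):
    """Join each argv list into a shell-safe string for review."""
    return [shlex.join(argv) for argv in commands]


if __name__ == "__main__":
    for line in render(build_alter_commands(TOPICS, TARGET_PARTITIONS)):
        print(line)
```

The guard exists because of the point made at 19:20:37: since the count can only go up, it is worth confirming the target before running the commands in main-eqiad, main-codfw, and jumbo-eqiad.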
[19:54:13] thanks ottomata
[19:54:59] serviceops, Maps, Patch-For-Review, Product-Infrastructure-Team-Backlog (Kanban): Performance considerations about the current usage of EventPlatform from maps - https://phabricator.wikimedia.org/T293366 (Jgiannelos) Open→Resolved
[20:26:28] serviceops, Patch-For-Review, Release-Engineering-Team (Radar): Puppet failure on deploy-1002.devtools.eqiad1.wikimedia.cloud due to missing profile::kubernetes::deployment_server::user_defaults - https://phabricator.wikimedia.org/T294174 (Dzahn) >>! In T294174#7454163, @hashar wrote: > #beta-cluster...
[20:28:48] serviceops, Patch-For-Review, Release-Engineering-Team (Radar): Puppet failure on deploy-1002.devtools.eqiad1.wikimedia.cloud due to missing profile::kubernetes::deployment_server::user_defaults - https://phabricator.wikimedia.org/T294174 (Dzahn) puppet run finished on deploy1002 again for the first...
[20:54:42] serviceops, Patch-For-Review, Release-Engineering-Team (Radar): Puppet failure on deploy-1002.devtools.eqiad1.wikimedia.cloud due to missing profile::kubernetes::deployment_server::user_defaults - https://phabricator.wikimedia.org/T294174 (Dzahn) Open→Resolved a:Dzahn Horizon Hiera empty,...
[20:57:24] serviceops, Arc-Lamp, Performance-Team, SRE: webperf*002 running out of disk space (arc lamp, xhgui) - https://phabricator.wikimedia.org/T235425 (Dzahn) Resolved→Open
[20:59:07] serviceops, Arc-Lamp, Performance-Team, SRE: webperf*002 running out of disk space (arc lamp, xhgui) - https://phabricator.wikimedia.org/T235425 (Dzahn) This is happening again. The webperf*2 hosts are alerting in Icinga about disk space.
[21:03:41] serviceops, Arc-Lamp, Performance-Team, SRE: webperf*002 running out of disk space (arc lamp, xhgui) - https://phabricator.wikimedia.org/T235425 (Krinkle) a:Krinkle→None
[21:11:14] serviceops, Arc-Lamp, Performance-Team, SRE: webperf*002 running out of disk space (arc lamp, xhgui) - https://phabricator.wikimedia.org/T235425 (dpifke) a:dpifke This is arguably a new issue, unrelated to the last time. I don't see anything obviously wrong with the jobs to compress and even...
[21:35:17] serviceops, Arc-Lamp, Performance-Team, SRE, Patch-For-Review: webperf*002 running out of disk space (arc lamp, xhgui) - https://phabricator.wikimedia.org/T235425 (Krinkle) From [Grafana: Host overview](https://grafana.wikimedia.org/d/000000377/host-overview?viewPanel=12&orgId=1&var-server=we...
[23:11:25] serviceops, Arc-Lamp, Performance-Team, SRE: webperf*002 running out of disk space (arc lamp, xhgui) - https://phabricator.wikimedia.org/T235425 (dpifke) Open→Resolved Before reducing compression age threshold: ` Filesystem Size Used Avail Use% Mounted on /dev/vdb 295G 268G...