[00:26:05] serviceops, serviceops-collab: Review DNS TTLs for ServiceOps-Collab owned services - https://phabricator.wikimedia.org/T315319 (Dzahn) The "standard values" at WMF should be either "1H" (normal) or "600" (5 minutes, what we lower it to for migrations/fail-over). Other values I would consider non-standa...
[00:31:34] serviceops, serviceops-collab: Review DNS TTLs for ServiceOps-Collab owned services - https://phabricator.wikimedia.org/T315319 (Dzahn) current status: gerrit.wikimedia.org/gerrit-replica.wikimedia.org: 600 gitlab.wikimedia.org/gitlab-replica.wikimedia.org/gitlab-replica-old.wikimedia.org: 300 phabr...
[06:49:13] serviceops, Patch-For-Review, Performance-Team (Radar): Migrate WMF production from PHP 7.2 to PHP 7.4 - https://phabricator.wikimedia.org/T271736 (Joe) My current plan is to proceed as follows: [] Move 0.1% of the traffic on monday 22nd [] Ramp up to 5% by the end of the week [] Get to 15% the week...
[07:15:46] serviceops, Patch-For-Review, Performance-Team (Radar): Migrate WMF production from PHP 7.2 to PHP 7.4 - https://phabricator.wikimedia.org/T271736 (ArielGlenn) I'd like to convert the snapshot hosts sometime between aug 26 and aug 31 if that works for folks.
[07:29:35] serviceops, Patch-For-Review, Performance-Team (Radar): Migrate WMF production from PHP 7.2 to PHP 7.4 - https://phabricator.wikimedia.org/T271736 (tstarling) >>! In T271736#8160364, @Joe wrote: > For jobrunners, I would say that if jobs are working in beta where they're supposedly running on 7.4, we...
[11:51:53] serviceops, Dumps-Generation, Patch-For-Review, Performance-Team (Radar): Migrate WMF production from PHP 7.2 to PHP 7.4 - https://phabricator.wikimedia.org/T271736 (ArielGlenn)
[11:52:20] Hello.
I'm currently working on setting up the new dse-k8s cluster (T310196) and I'm considering using the cfssl based PKI for the control-plane and kubelet certificates, instead of cergen and puppet based certificates.
[11:53:19] Firstly, would anyone be opposed to this or could suggest a reason why I shouldn't do it?
[11:53:55] Secondly, if I did go down this route, would it be better to use one new intermediate CA for kubernetes, or one per cluster?
[11:55:07] Thanks for your time. There's a CR here if you'd like to make any comments on that: https://gerrit.wikimedia.org/r/c/operations/puppet/+/824161
[11:57:23] serviceops, Performance-Team, SRE, Traffic: multi-dc.lua ATS script failing in production - https://phabricator.wikimedia.org/T315434 (Vgutierrez)
[12:03:16] btullis: I think it's ultimately the right thing to do but will involve quite some changes to the k8s puppet code where we would also need to ensure backwards compatibility with existing clusters
[12:03:29] changing certificates in k8s context is not an easy thing to do
[12:04:08] so it might be wise to handle that outside of the dse-k8s context and with a separate ticket so we can discuss there in detail
[12:23:53] jayme: ok, understood. Thanks. I thought it would be fairly easy to make sure it was a noop for existing clusters. But I'm sure you're right. Maybe we will have time to look at it before the dse-k8s cluster goes into full production mode.
[12:28:43] maybe it's easy after all but I would like this to be a more explicit change. There are a couple of different certificates involved (TLS for the kube-apiserver, certificates for signing service accounts, possibly certificates for nodes etc.) and I think it might make sense to take the time and think about the whole picture
[12:45:36] Ok, will do. Thanks.
[13:02:36] serviceops, Thumbor, Thumbor Migration, Performance-Team (Radar), User-jijiki: Terminate Thumbor with SSL - https://phabricator.wikimedia.org/T180696 (TheDJ) I'm speculating here that this will be automatically fixed by the thumbor migration, which I assume already will do ssl termination ?
[13:30:19] serviceops, Beta-Cluster-Infrastructure: Serve beta cluster via PHP 7.4 by default - https://phabricator.wikimedia.org/T306042 (Joe) Open→In progress a:Joe
[13:30:26] serviceops, Dumps-Generation, Patch-For-Review, Performance-Team (Radar): Migrate WMF production from PHP 7.2 to PHP 7.4 - https://phabricator.wikimedia.org/T271736 (Joe)
[13:31:41] serviceops, Performance-Team, SRE, Traffic: multi-dc.lua ATS script failing in production - https://phabricator.wikimedia.org/T315434 (Vgutierrez) Open→Resolved a:Vgutierrez Fix has been deployed. I'll reopen the task if we are still seeing errors
[13:31:51] serviceops, Performance-Team, SRE, SRE-swift-storage, and 2 others: Progressive Multi-DC roll out - https://phabricator.wikimedia.org/T279664 (Vgutierrez)
[14:35:17] serviceops, Data-Persistence-Backup, serviceops-collab, GitLab (Infrastructure), and 2 others: Backups for GitLab - https://phabricator.wikimedia.org/T274463 (Jelto) #### New partman config The new partman config on `gitlab2003` increased the size of the backup volume: ` vg-root...
[14:39:51] if anyone around, I could use a `puppet-merge` to have the docker daemon started after contint servers are rebooted https://gerrit.wikimedia.org/r/c/operations/puppet/+/814157 ;)
[14:40:41] it merely adds `service { 'docker': enable => true }` the systemd unit shipped by the Debian package intentionally does not enable the service
[14:44:36] looking
[14:47:53] hashar: left a comment.
I'm not sure enable => true is what you really want
[14:52:21] jayme: thanks, I do need `enable`
[14:52:57] AIUI ensure => 'running' will enable as well and it will ensure the service is running :-)
[14:52:59] `ensure => running` would bring docker when puppet runs at some point but I would rather have systemd do it automatically on start
[14:53:13] and would like to avoid Puppet to magically bring up docker for us when we explicitly stopped it for some reason
[14:53:28] ah, okay
[14:54:14] if it ever crashed, the unit has `Restart=always` so that is covered ;]
[14:54:41] those `enable` vs `ensure` always confuse me :-\
[14:56:56] hashar: merged
[14:58:46] jayme: thank you! will run puppet ):
[14:58:47] :)
[15:00:47] Notice: /Stage[main]/Profile::Ci::Docker/Service[docker]/enable: enable changed 'false' to 'true'
[15:00:50] jayme: solved thank you ;)
[15:02:35] yw
[18:42:31] serviceops, DC-Ops, SRE, ops-codfw: Q1:rack/setup/install contint1002 - https://phabricator.wikimedia.org/T313830 (Jclark-ctr)
[18:44:43] serviceops, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install kubernetes102[01] - https://phabricator.wikimedia.org/T313873 (Jclark-ctr)
[18:48:04] serviceops, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install new eqiad memcached hosts - https://phabricator.wikimedia.org/T313963 (Jclark-ctr)
[22:43:50] serviceops, Phabricator, serviceops-collab, Patch-For-Review, Release-Engineering-Team (Bonus Level 🕹️): Setup rsync for phab data on disk - https://phabricator.wikimedia.org/T313360 (Dzahn) >>! In T313360#8159924, @Dzahn wrote: > regarding the UIDs.. user 'phd' has a reserved UID of 498. per...
[22:57:19] serviceops, Data-Persistence-Backup, serviceops-collab, GitLab (Infrastructure), and 2 others: Backups for GitLab - https://phabricator.wikimedia.org/T274463 (thcipriani) @LSobanski and I talked about predicting storage space of GitLab as we grow. We've learned that storage growth is much differe...
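
Editor's note on the `enable` vs `ensure` exchange (14:47 to 14:54): in Puppet's `service` resource type the two attributes are managed independently. `enable` controls whether the service starts at boot, while `ensure` controls whether it is running at the time of the Puppet run. A minimal sketch of the two variants follows. The first resource mirrors the one-liner quoted in the log; the second uses a hypothetical service name (`example`) purely for contrast and is not part of the merged change.

```puppet
# Variant discussed in the log: boot-time enablement only.
service { 'docker':
  # Start the unit at boot; Puppet reported this as
  # "enable changed 'false' to 'true'".
  enable => true,
  # `ensure` is deliberately left unset, so Puppet will not restart
  # a daemon that an operator has stopped on purpose.
}

# Contrasting variant (hypothetical service name, for illustration only):
# Puppet also starts the service on every run if it finds it stopped.
service { 'example':
  ensure => running,
  enable => true,
}
```

With the first form, recovery from crashes is left to systemd, which matches the `Restart=always` remark in the conversation.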
[23:54:18] serviceops, DC-Ops, SRE, ops-codfw: Q1:rack/setup/install kubernetes202[34] - https://phabricator.wikimedia.org/T313870 (Papaul)