[06:41:58] serviceops, Infrastructure-Foundations, Parsoid, SRE, and 3 others: Deployers unable to ssh to parse* hosts - https://phabricator.wikimedia.org/T290144 (fgiunchedi) p:Triage→Medium
[07:21:39] hello folks
[07:22:07] I'd need a little brainbounce for TLS + istio + knative
[07:22:19] after reading https://knative.dev/docs/serving/using-a-tls-cert/#manually-adding-a-tls-certificate (see Istio tab)
[07:22:55] I have tested it on minikube in the past and it works, but the main problem for me now is where to put the Secret containing the TLS cert
[07:24:06] since it needs to be in the istio-system namespace, but it is knative that should place it (since knative is what configures the L7 settings for Istio, like https)
[07:24:46] so in theory this should mean having a Secret in the knative-serving chart that is injected into the istio-system namespace
[07:25:13] is it possible / doable? I don't like the cross-dependency but at the moment knative-serving doesn't work without istio
[07:25:45] in theory knative-serving's helmfile config is under admin_ng so I suppose so, but I wanted to know some opinions
[08:34:07] elukey: I'm not 100% sure about the implications, but I think you might get into trouble trying to install objects into two different namespaces from one chart
[08:34:49] but you could potentially use the "raw" chart (as we do a lot in admin_ng) to populate the secret I guess
[08:37:12] jayme: ah interesting! I didn't know about the raw
[08:37:15] *raw chart
[08:40:12] jayme: just to understand - should I create a new admin_ng dir (like knative-serving-istio or similar) and then sync it, or is it possible to add the raw chart to the pre-existing knative-serving admin_ng dir?
(I guess the former but I am a little ignorant about helmfile, I can RTM in case :)
[08:41:15] elukey: you can add an additional release to the "knative-serving/helmfile.yaml"
[08:41:31] just don't use the template :-p
[08:44:01] jayme: ack I can try, I am not sure what you mean with "just don't use the template"
[08:45:27] elukey: there is this release template (default) defined at the top which is included in the releases
[08:45:51] sorry, potentially confusing comment...what I mean is, do something like: https://phabricator.wikimedia.org/P17139
[08:47:06] jayme: my fault, very n00b with helmfile, thanks for the detailed comment, will try :)
[08:47:31] the interesting part will be how to include values from the definitions in /etc/ there
[08:52:35] jayme: I was wondering the same, I like your use of "interesting" :D
[08:57:53] serviceops, DC-Ops, SRE, ops-eqiad: Q1:(Need By: TBD) rack/setup/install kubernetes10[19-22] - https://phabricator.wikimedia.org/T290202 (JMeybohm) The latest kubernetes node there is is kubernetes1017, so I'd say the new nodes should be `kubernetes10[18-21]`. We also need them to run **Stretch**...
[09:15:31] serviceops, MW-on-K8s, Kubernetes, Patch-For-Review: Kubernetes timeing out before pulling the mediawiki-multiversion image - https://phabricator.wikimedia.org/T284628 (JMeybohm) >>! In T284628#7325551, @Legoktm wrote: > I put up a simple patch using `:latest` for now (I was already in the proces...
[09:38:37] jayme: what if I add the raw-chart-yaml config for the secret directly in profile::kubernetes::deployment_server_secrets::services ?
[09:39:20] it is a bit of a stretch but then I could simply declare the raw chart dep, and its template would get populated via helmfile
[09:39:36] elukey: the complete yaml spec you mean?
[09:39:48] that's an evil plan...
:)
[09:39:52] hahahahah
[09:40:34] I find it cleaner than trying to mess with the helmfile.yaml interpolation
[09:40:43] but yeah different from the rest
[09:40:45] for sure
[09:41:09] I don't completely like the snowflake nature of this
[09:41:40] but otoh it's strictly valid as it actually *is* a value for a helm chart (the complete yaml spec)
[09:43:05] elukey: if you need to quickly solve this I'd say go ahead but plan on cleaning this up later
[09:43:17] an alternative could also be to have a helm chart to support istioctl's values, for example with a templated Secret that could be populated more cleanly via puppet private
[09:43:42] that I don't get
[09:44:24] that Secret needs to be defined somewhere :D
[09:44:29] I thought about adding an additional helm chart that installs arbitrary secrets (like raw, but just for secrets)
[09:44:51] yes something like that, this is the idea (not sure if right or not)
[09:45:03] ahh you mean generic
[09:45:11] not istio-related
[09:45:13] I maybe got confused at the "support for istioctl values" point
[09:45:19] yeah
[09:45:32] I meant just random, whatever secrets
[09:45:38] yes my bad, what I meant was "other than what we already set with istioctl"
[09:45:48] could be an option yes
[09:46:29] no need to have snowflakes in puppet private, a little more verbose but who cares
[09:46:54] yeah...and you get the bonus of validation and linting
[09:46:59] which we don't have for raw
[09:47:16] going to create a simple proof of concept in gerrit, then if we find it horrible we drop it :)
[09:47:22] y3
[09:47:27] <3*
[11:25:00] serviceops, ops-codfw: mw2264 went down - https://phabricator.wikimedia.org/T290242 (MoritzMuehlenhoff)
[11:31:14] serviceops: Migrate WMF Production from PHP 7.2 to a newer version - https://phabricator.wikimedia.org/T271736 (tstarling) >>!
In T271736#6737009, @Reedy wrote: > Marking stalled as we're a bit away from this happening as {T245757} isn't finished yet Apparently it will be finished around the middle of Septe...
[12:40:01] serviceops: Migrate WMF Production from PHP 7.2 to a newer version - https://phabricator.wikimedia.org/T271736 (MoritzMuehlenhoff) > I'm getting tired of putting up with dinosaur versions of PHP in production. PHP is our core business. > 7.4.0 was released in November 2019, so presumably will stop receiving...
[12:42:44] serviceops: Migrate WMF Production from PHP 7.2 to a newer version - https://phabricator.wikimedia.org/T271736 (Majavah) >>! In T271736#7328048, @tstarling wrote: > 8.0.x is the latest stable and has lots of nice features, like the JIT. That seems like a reasonable target at this point. 8.0 or newer is bloc...
[12:54:57] serviceops: Migrate WMF Production from PHP 7.2 to a newer version - https://phabricator.wikimedia.org/T271736 (Reedy) >>! In T271736#7328191, @Majavah wrote: >>>! In T271736#7328048, @tstarling wrote: >> 8.0.x is the latest stable and has lots of nice features, like the JIT. That seems like a reasonable tar...
[13:09:03] serviceops, SRE, ops-codfw: mw2264 went down - https://phabricator.wikimedia.org/T290242 (Dzahn) ` Record: 5 Date/Time: 08/30/2021 10:22:49 Source: system Severity: Non-Critical Description: Correctable memory error rate exceeded for DIMM_B1. -------------------------------------------...
[13:16:17] serviceops, SRE, ops-codfw: mw2264 went down - https://phabricator.wikimedia.org/T290242 (Dzahn) @Papaul Hi, this host went down as described above and I pasted the relevant entries from 'racadm getsel' above. As you can see it looks like the DIMM B1 is broken. It is in rack B3 and purchase date wa...
[13:16:40] serviceops, SRE, ops-codfw: mw2264 went down - https://phabricator.wikimedia.org/T290242 (Dzahn) p:Triage→Medium
[13:24:17] serviceops, Release-Engineering-Team (Radar): nodejs-devel image does not contain npm - https://phabricator.wikimedia.org/T290209 (Dzahn) I don't know if this is the reason it was left out of the package but using npm to install software would conflict with L3.
[14:19:27] jayme: https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/716235 is the poc (kubeyaml still hates me), lemme know more or less if it could work
[14:19:56] I wanted to have a chart with the possibility of defining multiple secrets, but it can also be one
[14:40:15] serviceops, Prod-Kubernetes, SRE, Kubernetes: Add TLS termination to services running on kubernetes - https://phabricator.wikimedia.org/T235411 (JMeybohm)
[14:40:38] serviceops, Prod-Kubernetes, SRE, Kubernetes: Add TLS termination to services running on kubernetes - https://phabricator.wikimedia.org/T235411 (JMeybohm)
[14:40:46] serviceops, Prod-Kubernetes, SRE, Kubernetes, Patch-For-Review: Move cxserver to use TLS only - https://phabricator.wikimedia.org/T255879 (JMeybohm) Open→Resolved
[14:40:49] serviceops, Prod-Kubernetes, SRE, Kubernetes, Patch-For-Review: Move wikifeeds to use TLS only - https://phabricator.wikimedia.org/T255878 (JMeybohm) Open→Resolved a:JMeybohm
[14:40:57] serviceops, Prod-Kubernetes, SRE, Kubernetes: Add TLS termination to services running on kubernetes - https://phabricator.wikimedia.org/T235411 (JMeybohm)
[14:41:05] serviceops, Prod-Kubernetes, SRE, Kubernetes: Add TLS termination to services running on kubernetes - https://phabricator.wikimedia.org/T235411 (JMeybohm)
[14:41:16] serviceops, Prod-Kubernetes, SRE, Kubernetes: Add TLS termination to services running on kubernetes - https://phabricator.wikimedia.org/T235411 (JMeybohm)
[14:41:22] serviceops,
Citoid, Prod-Kubernetes, SRE, and 2 others: Move citoid to use TLS only - https://phabricator.wikimedia.org/T255868 (JMeybohm) Open→Resolved
[14:41:28] serviceops, Prod-Kubernetes, SRE, Kubernetes: Add TLS termination to services running on kubernetes - https://phabricator.wikimedia.org/T235411 (JMeybohm)
[14:41:39] serviceops, Prod-Kubernetes, SRE, Traffic, and 3 others: Move termbox to use TLS only - https://phabricator.wikimedia.org/T254581 (JMeybohm) Open→Resolved
[14:41:50] serviceops, Prod-Kubernetes, SRE, Kubernetes, Patch-For-Review: Move zotero to use TLS only - https://phabricator.wikimedia.org/T255869 (JMeybohm) Open→Resolved
[14:42:31] serviceops, SRE, Kubernetes, Patch-For-Review, Release Pipeline (Blubber): Move blubberoid to use TLS only. - https://phabricator.wikimedia.org/T236017 (JMeybohm) Open→Resolved
[14:54:23] serviceops, MW-on-K8s, Kubernetes, Patch-For-Review: Kubernetes timeing out before pulling the mediawiki-multiversion image - https://phabricator.wikimedia.org/T284628 (dancy) Just so everyone is clear, under normal circumstances there will be one **big** image per week (at the start of the train...
[15:01:15] jelto, moritzm: noting i'm about to be afk 'til tuesday, shall we plan gitlab 14.x upgrade then?
[15:04:22] brennen: ack sounds good for me
[15:04:59] cool. i'll check upgrade path on gitlab-ansible-test in the meanwhile.
[15:05:40] serviceops, Infrastructure-Foundations, Parsoid, SRE, and 3 others: Deployers unable to ssh to parse* hosts - https://phabricator.wikimedia.org/T290144 (Dzahn) After merging the change above I ran puppet on parse2001 and saw all the deployer shell accounts being created. On all other hosts it wi...
[15:07:56] serviceops, GitLab, Release-Engineering-Team (Radar), User-brennen: GitLab patch release: 13.12.10: Resolves "Username ending with MIME type format is not allowed" errors - https://phabricator.wikimedia.org/T288631 (brennen) Stalled→Declined Declining this one specifically, see T289802 f...
[15:13:08] serviceops, SRE, ops-codfw: mw2264 went down - https://phabricator.wikimedia.org/T290242 (Papaul) a:Papaul
[15:14:01] brennen: wfm
[16:08:00] serviceops, GitLab, Patch-For-Review, Release-Engineering-Team (Next), User-brennen: GitLab major version upgrade: 14.x - https://phabricator.wikimedia.org/T289802 (brennen) Per IRC discussion, we're planning to conduct upgrade week of 2021-09-06. (I'm out 'til US morning of the 7th.) Using...
[17:41:20] serviceops, MW-on-K8s, Kubernetes, Patch-For-Review: Kubernetes timeing out before pulling the mediawiki-multiversion image - https://phabricator.wikimedia.org/T284628 (Legoktm) The initial pull of the 12G image took ~11 minutes on kubestage1002, after that it just needed to pull one more layer w...
[21:19:54] serviceops, Infrastructure-Foundations, SRE, SRE-Access-Requests, and 2 others: Deployers unable to ssh to parse* hosts - https://phabricator.wikimedia.org/T290144 (Arlolra)