[07:16:19] serviceops, MW-on-K8s: Migrate mwmaint server functionality to mw-on-k8s - https://phabricator.wikimedia.org/T341560#9968163 (JMeybohm) >>! In T341560#9967304, @RLazarus wrote: > We'll eventually send it to logstash too, but that hasn't happened yet. Everything a container produces on stdout and stderr...
[07:27:43] serviceops, DC-Ops, ops-eqiad, Prod-Kubernetes, SRE: kubernetes1051.eqiad.wmnet failed to pull mediawiki images - https://phabricator.wikimedia.org/T369011#9968167 (JMeybohm) >>! In T369011#9965709, @cmooney wrote: >>>! In T369011#9948452, @JMeybohm wrote: >> I've deleted the node from the k8...
[10:53:45] serviceops, DC-Ops, ops-eqiad, Prod-Kubernetes, SRE: kubernetes1051.eqiad.wmnet failed to pull mediawiki images - https://phabricator.wikimedia.org/T369011#9968705 (cmooney) >>! In T369011#9968167, @JMeybohm wrote: > I've disabled BGP for this node for now. A cool. It's generally not a prob...
[12:29:00] hnowlan: is it unfair to pick on you to help with the switch upgrades because you kindly volunteered yesterday?
[12:31:07] I don't think too bad today only kubernetes1059 in rack E1
[12:43:26] serviceops, Infrastructure-Foundations, Patch-For-Review, Security: Upgrade K8s docker images to running in production on Buster with either Bullseye or Bookworm - https://phabricator.wikimedia.org/T368366#9969064 (Lucas_Werkmeister_WMDE) In T368523 we’re seeing an “unable to get local issuer cer...
[13:02:52] topranks: I can do it :) 15.00Z again?
[13:04:29] jayme: yep same time again, that would be great thanks :)
[13:05:50] topranks: ack
[13:48:01] oh, just saw this. I've already done kubernetes1059
[13:55:17] ah cool - thanks akosiaris!
[13:56:59] serviceops, DC-Ops, ops-eqiad, Prod-Kubernetes, SRE: kubernetes1051.eqiad.wmnet failed to pull mediawiki images - https://phabricator.wikimedia.org/T369011#9969387 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=3f55d01c-31c3-4e2a-8c13-b4c6da9484f8) set by cgoubert@cumin1002...
[14:05:56] serviceops, DC-Ops, ops-eqiad, Prod-Kubernetes, SRE: hw troubleshooting: Management and main interfaces down for kubernetes1051.eqiad.wmnet - https://phabricator.wikimedia.org/T369011#9969436 (Clement_Goubert) p:Triage→Medium a:Jclark-ctr
[14:34:13] serviceops, MW-on-K8s, SRE: Re-think how we separate traffic to mediawiki in clusters. - https://phabricator.wikimedia.org/T291918#9969588 (Clement_Goubert) Open→Resolved a:Clement_Goubert This seems pretty well settled now with: - `mw-api-ext` for external api calls - `mw-api-int` fo...
[14:37:15] ok then :)
[14:43:37] serviceops, Infrastructure-Foundations, SRE: ferm sometimes fails to restart on Kubernetes workers via xtables lock held by kube-proxy - https://phabricator.wikimedia.org/T354855#9969636 (Clement_Goubert) Open→Resolved We've only had one spike of job enqueuing errors since merging `Restar...
[14:51:20] serviceops, conftool: Move conftool to gitlab, turn on deb package auto-generation - https://phabricator.wikimedia.org/T369594#9969683 (Joe)
[16:08:47] sukhe: btw, have you done any running confctl locally to play with it? it's not very hard
[16:09:58] cdanis: no actually! I just have a local etcd instance and my own one-liners but maybe I should just do confctl
[16:10:01] good idea, thanks
[16:10:13] your own one-liners for etcd?
[16:10:37] all I do is start up etcd.service as installed from debian and then run confctl from the tox venv
[16:10:54] cdanis: basically a setup for confd + confctl + changing a few things, making sure it outputs correctly etc
[16:11:02] (+ etcd)
[16:11:03] oh! with confd
[16:11:21] yeah!
[16:11:22] I wonder how hard it would be to get that all glued together in `dcl`
[16:11:56] dcl user here as well but never tested that stuff :)
[16:12:16] would be pretty nice -- so far my dcl usage has been limited to testing Puppet stuff and ERB-ing
[16:12:32] but golang TPL + ERB is another level of pain anyway so
[16:13:23] 12:10:37 < cdanis> all I do is start up etcd.service as installed from debian and then run confctl from the tox venv
[16:13:42] helpful, thanks
[16:14:12] I think you probably also need to pass it the config file in the integration directory
[16:14:32] will try. but yeah, dcl for this would be pretty nice.
[16:14:54] sorry, from the fixtures directory
[16:15:44] so meaning, add it to dbconfig/schema.yaml?
[16:16:24] ah
[16:16:35] no I think I'm wrong, I think you don't need any config file
[16:16:56] (which controls stuff like the etcd backend used)
[16:17:28] but if you are using your own object type you'll have to do confctl --schema $YOURFILE
[16:17:59] ah, right
[16:18:17] I very much like your idea of alleviating the need for --schema and automatically detecting it
[16:18:25] I am not sure it will work
[16:18:35] not even if we have specific tags?
[16:18:35] I'm pretty sure sw.french will find a hole to poke in it
[16:18:50] ha!
[16:18:53] oh, sorry yes, I meant "work without any schema changes for other users"
[16:19:37] and yeah you mean --object-type
[16:20:49] yep
[16:21:01] like I said, I am overthinking this 100%
[16:21:14] yeah I just wanted to clarify
[16:21:27] --object-type is the one we're proposing being autodetected
[16:21:44] --schema is a yaml file that tells conftool what the object-types are
[16:22:09] and that one is automatic in production, but not when doing local dev (except that the integration test runners for e.g. dbconfig do it)
[16:22:10] `conftool/tests/fixtures/dbconfig/schema.yaml` is supposed to be kept in-sync with the same file in the puppet repo
[16:22:14] got it, so I conflated the two
[16:22:30] so yeah you should just edit dbconfig/schema.yaml with your proposed changes
[16:22:52] yeah I added that here https://gitlab.wikimedia.org/repos/sre/conftool/-/merge_requests/4
[16:23:03] and you can test locally with something like
[16:23:04] ./.tox/py311-unit/bin/confctl --schema conftool/tests/fixtures/dbconfig/schema.yaml --object-type geodns
[16:23:04] really split about the name/dc part but I guess I will wait for bblack to come and see what he thinks since the rest of us are split as well
[16:23:11] thanks!
[16:24:32] serviceops, SRE, Data Products (Data Products Sprint 16), Patch-For-Review, Service-deployment-requests: Commons Impact Metrics AQS 2.0 Deployment to Staging and Production - https://phabricator.wikimedia.org/T361835#9970307 (Scott_French) Alright, good(er) news: the service is now live at `/...
[16:29:05] serviceops, DNS, fundraising-tech-ops, SRE, and 2 others: redirect benefactors.wikimedia.org (was: Cleanup unused DNS subdomains) - https://phabricator.wikimedia.org/T367012#9970327 (Dzahn) Thanks Clément! I was about to make a subtask and rename this again but you already took it on. It's apprec...
[16:38:23] serviceops, Wikidata, Wikidata Dev Team, Wikidata-Termbox, wmde-wikidata-tech: [SW] [GENERAL] Simplify Termbox SSR test release - https://phabricator.wikimedia.org/T355955#9970399 (Lucas_Werkmeister_WMDE) > I am also not at all sure right now that the test release can easily be folded in like...
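For reference, the local confctl invocation discussed above (around 16:17 to 16:23) can be sketched as below. This is only a minimal sketch, not part of conftool: `build_confctl_cmd` is a hypothetical helper, the paths are the ones quoted in the chat and assume you are in a conftool checkout with the tox venv built, and only the `--schema` and `--object-type` flags mentioned in the discussion are used. Any confctl action that would follow them is left out on purpose.

```python
import shlex
from typing import List, Optional

# Paths as quoted in the chat; both are assumptions about your local checkout.
CONFCTL = "./.tox/py311-unit/bin/confctl"
DBCONFIG_SCHEMA = "conftool/tests/fixtures/dbconfig/schema.yaml"


def build_confctl_cmd(
    object_type: str,
    schema: Optional[str] = DBCONFIG_SCHEMA,
    confctl: str = CONFCTL,
) -> List[str]:
    """Hypothetical helper: assemble the argv prefix for a local confctl run.

    --schema points at the yaml file that tells conftool which object types
    exist; per the discussion it is set automatically in production but has
    to be passed by hand in local dev. --object-type selects one of them.
    """
    cmd = [confctl]
    if schema is not None:
        cmd += ["--schema", schema]
    cmd += ["--object-type", object_type]
    return cmd


if __name__ == "__main__":
    # Print the command rather than running it, since running it needs a
    # local etcd and the tox venv from the conftool repo.
    print(shlex.join(build_confctl_cmd("geodns")))
```

Passing `schema=None` drops the `--schema` flag, which corresponds to the case where the default schema already knows the object type.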
[16:48:30] serviceops, DC-Ops, ops-eqiad: Q1:rack/setup/install wikikube-worker1240 to wikikube-worker1304 - https://phabricator.wikimedia.org/T369743 (RobH) NEW
[16:55:23] serviceops: wikikube-worker1240 to wikikube-worker1304 implementation tracking - https://phabricator.wikimedia.org/T369744 (RobH) NEW
[22:06:35] serviceops, Infrastructure-Foundations, SRE, ARM support: Adoption of aarch64 (aka arm64) in WMF production? (SRE Summit 2022 Session) - https://phabricator.wikimedia.org/T320811#9971586 (bd808) >>! In T320811#8777225, @Ladsgroup wrote: > This might be interesting, specially in choosing a manufac...
[23:42:21] serviceops, DNS, fundraising-tech-ops, SRE, Traffic: redirect benefactors.wikimedia.org (was: Cleanup unused DNS subdomains) - https://phabricator.wikimedia.org/T367012#9971801 (Dzahn)
[23:46:03] serviceops, DNS, fundraising-tech-ops, SRE, Traffic: redirect benefactors.wikimedia.org (was: Cleanup unused DNS subdomains) - https://phabricator.wikimedia.org/T367012#9971806 (Dzahn) Thanks to Clement and Reuven for the redirect change and deploying it. benefactors redirects now. @Pppery...
[23:49:12] serviceops, DNS, fundraising-tech-ops, SRE, Traffic: Cleanup DNS subdomains displaying wikimedia.org homepage when they shouldn't - https://phabricator.wikimedia.org/T367012#9971809 (Pppery) Open→Resolved
[23:51:41] serviceops, Wikimedia-Apache-configuration: Change redirect target of sep11.wikipedia.org - https://phabricator.wikimedia.org/T367014#9971812 (Pppery) Another possibility is https://meta.wikimedia.org/wiki/Sep11wiki. On second thought that's probably better than the Wayback Machine as it explains the con...