[06:34:08] serviceops, PHP 7.4 support, Platform Team Workboards (Clinic Duty Team): Rename articles and users to prepare for PHP 7.3 unicode changes - https://phabricator.wikimedia.org/T292552 (tstarling) I'm trying a dry run. But the proposed suffix "former Unicode character" seems weird. I think I would pref...
[08:21:05] jayme: Did you plan to work on https://phabricator.wikimedia.org/T310721 with i.nflatador or can j.oe and I grab it?
[08:21:32] claime: no plans, grab ahead :)
[08:22:00] ack :)
[09:10:29] jayme: Thanks for the review on https://gerrit.wikimedia.org/r/c/operations/docker-images/production-images/+/838151 - Much appreciated.
[09:11:11] I still can't get `FROM {{ "openjdk-11-jdk" | image_tag }}` to work for me though. I'm wondering what I'm doing wrong: https://phabricator.wikimedia.org/P35417
[09:14:46] that is the exact command you're running and which docker-pkg version?
[09:20:40] docker-pkg version is 3.0.3 - Yes, I've tried it with both single and double-quotes around the image name.
[09:32:09] <_joe_> btullis: it's not that that errors out
[09:32:11] <_joe_> 2022-10-12 09:50:13,506 [docker-pkg-build] ERROR - Unexpected error building image docker-registry.wikimedia.org/spark-build:3.3.0: Image docker-registry.wikimedia.org/golang1.18 not found (image.py:208)
[09:32:26] <_joe_> we only have golang1.15 atm
[09:32:37] <_joe_> if you need 1.18, we need to create an image i guess
[09:32:37] not true
[09:32:54] Thanks, but it's the same if I put golang1.15
[09:33:07] <_joe_> let me see the patch
[09:33:30] Also `FROM {{ "openjdk-11-jdk" | image_tag }}` fails in the same way.
[09:34:06] With pleasure: https://gerrit.wikimedia.org/r/c/operations/docker-images/production-images/+/838151
[09:34:36] <_joe_> yeah I am looking, I'm frankly at a loss, I'll try to reproduce in a few
[09:36:42] Thanks. Running under python 3.9 on bullseye on my workstation.
[09:39:15] you could try again with "docker-pkg --debug -c config.yaml build images/"
[09:40:08] and check if you can "docker pull docker-registry.wikimedia.org/golang1.18"
[09:41:03] Second one is easy. I did that a few minutes ago too.
[09:41:06] https://www.irccloud.com/pastebin/EOyrwFLo/
[09:41:59] Hi, i don't see a dedicated deployment window for restbase. What would be a good time to push a deployment today ?
[09:43:30] <_joe_> nemo-yiannis: I guess any time we don't have a deployment for anything else
[09:43:42] Sounds good, thanks!
[09:45:53] <_joe_> btullis: I just built your images successfully after downloading your change
[09:45:57] <_joe_> so I can't reproduce
[09:46:07] <_joe_> I freshly installed a venv with docker-pkg
[09:46:11] ditto
[09:46:16] <_joe_> to be sure I would start afresh
[09:46:32] but your hostname in the screenshot says ... wsl ... 😬
[09:46:53] <_joe_> ahh
[09:47:04] if that is what it sounds like, maybe it does things differently
[09:47:06] <_joe_> yeah that might be of relevance
[09:47:12] <_joe_> I suggest running with --debug then
[09:47:31] <_joe_> also: are you using docker in wsl or docker for windows?
[09:48:00] OK, thanks. Could it be because I had images/spark in the docker-pkg command? `docker-pkg --info --config config.yaml build images/spark`
[09:49:24] yes. that won't work
[09:49:51] Ah, good. My debug build with `docker-pkg --debug -c config.yaml build images/` also appears to be building successfully.
[09:50:18] <_joe_> btullis: yeah if you just give it the spark directory, docker-pkg will only "see" images under that tree
[09:51:05] serviceops, Data-Engineering, SRE, Event-Platform Value Stream (Sprint 02), Patch-For-Review: eventstreams chart should use latest common_templates - https://phabricator.wikimedia.org/T310721 (Clement_Goubert) Hi, I'll be your SRE support for today, and will handle de/repooling, destroying th...
[09:51:40] OK, great. That makes sense now. Thanks.
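A note on the docker-pkg exchange above: the point that a build only "sees" images under the directory it is given can be illustrated with a toy resolver. This is a minimal sketch, not docker-pkg's actual code. It assumes each image lives in its own directory holding a `Dockerfile.template`, and it uses the directory name as the image name, which is a simplification made up for this illustration. The function names `discover`, `missing_parents` and `demo` are hypothetical.

```python
import re
import tempfile
from pathlib import Path

# Matches lines such as: FROM {{ "openjdk-11-jdk" | image_tag }}
FROM_RE = re.compile(r'FROM\s+\{\{\s*["\']([^"\']+)["\']\s*\|\s*image_tag\s*\}\}')


def discover(root):
    """Map image name -> names of templated parents, for templates under root."""
    images = {}
    for template in Path(root).rglob("Dockerfile.template"):
        # Assumption: the directory name stands in for the image name.
        images[template.parent.name] = set(FROM_RE.findall(template.read_text()))
    return images


def missing_parents(root):
    """Return, per image, the parents that were not discovered under root."""
    images = discover(root)
    missing = {}
    for name, parents in images.items():
        unknown = sorted(p for p in parents if p not in images)
        if unknown:
            missing[name] = unknown
    return missing


def demo():
    """Build a throwaway tree and scan it from two different roots."""
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp) / "images"
        jdk = base / "java" / "openjdk-11-jdk"
        spark = base / "spark" / "spark-build"
        jdk.mkdir(parents=True)
        spark.mkdir(parents=True)
        (jdk / "Dockerfile.template").write_text("FROM scratch\n")
        (spark / "Dockerfile.template").write_text(
            'FROM {{ "openjdk-11-jdk" | image_tag }}\n'
        )
        # Whole tree: the parent is found. Subtree only: it is not.
        return missing_parents(base), missing_parents(base / "spark")
```

Scanning the whole `images/` tree finds the parent image, while scanning only `images/spark` leaves `openjdk-11-jdk` unresolved, which matches the behaviour described in the channel.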
Yeah, I hate having to run Windows but it's the only way I could get this laptop stable.
[09:51:44] https://www.irccloud.com/pastebin/PWQ1Owv0/
[09:52:37] <_joe_> btullis: a carbon x1?
[09:53:47] X1 Extreme Gen 3 - dual GPUs
[09:55:34] <_joe_> ahhh yeah
[09:55:38] <_joe_> the dual gpu ones
[09:55:40] I ran Debian on it for about 8 months, but I was forever dropping out of meetings, or unable to share my screen, or I could share anything except gnome-terminal, or just crashing X and/or Wayland.
[09:55:56] <_joe_> hee I would probably try ubuntu
[09:57:03] Yeah, you're probably right.
[09:58:23] Anyway, thanks for fixing this issue. I'll fix up those review comments. Now I'm in the process of switching to a source tarball download with verification, rather than a git checkout. Then I'll ask for another review.
[09:58:55] serviceops, Page Content Service, Product-Infrastructure-Team-Backlog: Production configuration for mobileapps need to be adapted to codechanges - https://phabricator.wikimedia.org/T320505 (Jgiannelos) Open→Resolved a:Jgiannelos
[10:01:32] <_joe_> btullis: I fear you'll have some challenges ahead of you in production too
[10:02:46] <_joe_> you are downloading a lot of stuff from the internet, which is expected given it's a maven project, I doubt having https_proxy as an env variable is enough
[10:02:57] <_joe_> you might need to change settings.xml
[10:12:34] _joe_: Oh, ok. Is https_proxy set automatically in the build environment? I see other images pulling from github etc without mentioning the proxy.
[10:12:54] <_joe_> btullis: it's set in the configuration
[10:13:22] serviceops, Data-Engineering, SRE, Event-Platform Value Stream (Sprint 02), Patch-For-Review: eventstreams chart should use latest common_templates - https://phabricator.wikimedia.org/T310721 (Clement_Goubert) Destroy/apply done in staging: ` # helmfile -e staging status helmfile.yaml: basePa...
[10:14:06] Well that seems to have worked
[10:14:20] <_joe_> btullis:
[10:14:22] <_joe_> root@build2001:~# cat /etc/production-images/config.yaml
[10:14:24] <_joe_> http_proxy: "http://webproxy.codfw.wmnet:8080"
[10:14:41] Ack, thanks.
[10:15:39] OK, I will check to see whatever else maven might need to be able to use the proxy. I thought I remembered doing something similar for datahub, but can't immediately find it.
[10:16:22] serviceops, PHP 7.4 support, Platform Team Workboards (Clinic Duty Team): Rename articles and users to prepare for PHP 7.3 unicode changes - https://phabricator.wikimedia.org/T292552 (tstarling) There's 11584 global users to be renamed, which is a lot. Most of them are in the Georgian script -- the c...
[10:43:24] serviceops, PHP 7.4 support, Platform Team Workboards (Clinic Duty Team): Rename articles and users to prepare for PHP 7.3 unicode changes - https://phabricator.wikimedia.org/T292552 (tstarling) The script would rename 306,244 out of 462,201 pages on kawiki. I'm pretty sure we shouldn't go ahead with...
[11:16:02] serviceops, Data-Engineering, SRE, Event-Platform Value Stream (Sprint 02), Patch-For-Review: eventstreams chart should use latest common_templates - https://phabricator.wikimedia.org/T310721 (JArguello-WMF) @Clement_Goubert Thank you so much! Please let us know if there is anything we need...
[11:20:58] serviceops, Data-Engineering, SRE, Event-Platform Value Stream (Sprint 02), Patch-For-Review: eventstreams chart should use latest common_templates - https://phabricator.wikimedia.org/T310721 (Clement_Goubert) `eventstream` redeployed in codfw. @JArguello-WMF Apart from checking everything i...
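On the Maven proxy question from 10:02 to 10:15: Maven reads proxy settings from a `<proxies>` section in `settings.xml` rather than relying on the `https_proxy` environment variable. The following is a minimal sketch, assuming one wants to derive that section from the `http_proxy` URL shown in `config.yaml`. The function name `maven_proxy_settings`, the proxy id `build-proxy` and the `nonProxyHosts` default are hypothetical and chosen only for illustration. Whether the production image build needs more than this was not settled in the channel.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit


def maven_proxy_settings(proxy_url, non_proxy_hosts="localhost"):
    """Render a settings.xml document with one active proxy entry."""
    parts = urlsplit(proxy_url)
    if not parts.hostname or parts.port is None:
        raise ValueError(f"proxy URL needs a host and a port: {proxy_url!r}")

    settings = ET.Element("settings")
    proxies = ET.SubElement(settings, "proxies")
    proxy = ET.SubElement(proxies, "proxy")
    fields = (
        ("id", "build-proxy"),               # hypothetical identifier
        ("active", "true"),
        ("protocol", parts.scheme),
        ("host", parts.hostname),
        ("port", str(parts.port)),
        ("nonProxyHosts", non_proxy_hosts),  # hypothetical default
    )
    for tag, value in fields:
        ET.SubElement(proxy, tag).text = value

    ET.indent(settings)  # pretty-print; needs Python 3.9 or later
    return ET.tostring(settings, encoding="unicode")


if __name__ == "__main__":
    print(maven_proxy_settings("http://webproxy.codfw.wmnet:8080"))
```

The output could be written to a `settings.xml` that the build passes to Maven with `mvn -s`, so the proxy choice stays in one place instead of being repeated per image.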
[11:52:23] serviceops, Data-Engineering, SRE, Event-Platform Value Stream (Sprint 02), Patch-For-Review: eventstreams chart should use latest common_templates - https://phabricator.wikimedia.org/T310721 (Clement_Goubert) `eventstream` redeployed in eqiad
[11:59:57] serviceops, Data-Engineering, SRE, Event-Platform Value Stream (Sprint 02), Patch-For-Review: eventstreams chart should use latest common_templates - https://phabricator.wikimedia.org/T310721 (Clement_Goubert) Everything looks healthy from my end, both are getting traffic and not throwing err...
[12:01:42] serviceops, observability: Port openapi/swagger checks/alerts to Prometheus - https://phabricator.wikimedia.org/T320620 (fgiunchedi)
[12:40:46] serviceops, Data-Engineering, SRE, Event-Platform Value Stream (Sprint 02), Patch-For-Review: eventstreams chart should use latest common_templates - https://phabricator.wikimedia.org/T310721 (Ottomata) > eventstreams-internal is still used? I am not sure! I'd imagine folks use it, as it is...
[12:57:12] serviceops, Security-Team, serviceops-collab, GitLab (CI & Job Runners), and 3 others: Setup GitLab Runner in trusted environment - https://phabricator.wikimedia.org/T295481 (Jelto) While doing more debugging around the firewall rules for Trusted Runners I found out the firewall mostly work, as d...
[13:37:21] ottomata: Regarding eventstreams-internal, I have no dash to view current connections to the service, so while I can depool/wait/repool, I can't guarantee a "clean" switchover and deployment. From what you said, I assume nothing would really suffer from a dropped connection if I happen to redeploy before all clients have switched, is that correct?
[14:10:34] serviceops, Prod-Kubernetes, Kubernetes: Kubernetes services with externalTrafficPolicy: Local don't work - https://phabricator.wikimedia.org/T300500 (JMeybohm) Resolved→Open p:Medium→High ouch, this never made it to prod
[14:18:56] serviceops, Observability-Alerting, observability: Port openapi/swagger checks/alerts to Prometheus - https://phabricator.wikimedia.org/T320620 (colewhite)
[14:49:39] claime: oh yes, you won't break anything or anyone
[14:49:54] please restart that at will, it is only for internal convenience for real users. it is not used for any prod things.
[14:50:07] real=wmf shell users
[14:50:50] Just realized that the EventStreams dash is not multi service! will try to make it so.
[14:53:07] <3
[15:09:36] claime: ok i think i did it, although there are basically 0 metrics for eventstreams-internal (except for the k8s emitted ones). I guess that's expected.
[15:20:29] I'm out until Monday (moving on Saturday) - see you all then o/
[15:21:31] see you Saturday ;) o/
[15:21:53] 💪
[15:25:28] likewise see you Saturday :-)
[15:29:37] <_joe_> oh he's exploiting both of you
[15:29:46] <_joe_> fight the power, steal his stuff
[15:29:55] <_joe_> (good luck)
[15:35:39] ottomata: eventstreams-internal redeployed on staging and prod, you should be good to go now
[15:39:13] serviceops, Data-Engineering, SRE, Event-Platform Value Stream (Sprint 02), Patch-For-Review: eventstreams chart should use latest common_templates - https://phabricator.wikimedia.org/T310721 (Clement_Goubert) `eventstreams-internal` fully redeployed, this task can probably be closed now.
[15:42:26] serviceops, Data-Engineering, SRE, Event-Platform Value Stream (Sprint 02), Patch-For-Review: eventstreams chart should use latest common_templates - https://phabricator.wikimedia.org/T310721 (Ottomata) Thank you so much!
[15:42:26] claime: thank you!
[15:42:46] ottomata: np!
Glad to help
[16:20:45] workout, back in ~30
[17:14:08] serviceops, Discovery-Search, SRE, serviceops-collab, and 2 others: Sunset search.wikimedia.org service - https://phabricator.wikimedia.org/T316296 (Clement_Goubert) a:Dzahn→Clement_Goubert Just for clarification, we are talking about the service named `apple-search` in service discovery...
[17:43:00] serviceops, PHP 7.4 support, Platform Team Workboards (Clinic Duty Team): Rename articles and users to prepare for PHP 7.3 unicode changes - https://phabricator.wikimedia.org/T292552 (GiorgiOkropiridze) In Georgian we don't have lowercase and uppercase. Mtavruli and Mkhedruli cannot be confused. Or...
[18:57:53] sorry, been back
[19:51:32] serviceops, DC-Ops, SRE, ops-eqiad, serviceops-collab: Q2:rack/setup/install webperf1005.eqiad.wmnet - https://phabricator.wikimedia.org/T319433 (Dzahn)
[19:51:34] serviceops, SRE: service implementation tracking: webperf1005.eqiad.wmnet - https://phabricator.wikimedia.org/T319434 (Dzahn) Open→Stalled
[19:52:18] serviceops, SRE: service implementation tracking: webperf2005.codfw.wmnet - https://phabricator.wikimedia.org/T319429 (Dzahn) Open→Stalled
[19:52:24] serviceops, DC-Ops, SRE, ops-codfw: Q2:rack/setup/install webperf2005.codfw.wmnet - https://phabricator.wikimedia.org/T319428 (Dzahn)
[22:14:44] serviceops, Observability-Logging, Patch-For-Review, SRE Observability (FY2022/2023-Q1): Increase of ~50 million access logs per day from mobileapps-production-tls-proxy - https://phabricator.wikimedia.org/T313099 (colewhite) Open→Resolved Clean up as the indexes age is needed to graceful...
[22:16:05] serviceops: Envoy can't connect to servers using TLS 1.3 (but can serve TLS 1.3 to clients) - https://phabricator.wikimedia.org/T246083 (bking) Setting both maximum and minimum TLS versions in the envoy static config file appears to work.
I confirmed this via packet captures, the results of which are availab...
[23:01:04] serviceops, PHP 7.4 support, Patch-For-Review, Platform Team Workboards (Clinic Duty Team): Rename articles and users to prepare for PHP 7.3 unicode changes - https://phabricator.wikimedia.org/T292552 (tstarling) I removed the Georgian characters from Pchelolo's case map, and I'm running the dry...
[23:39:29] serviceops, PHP 7.4 support, Patch-For-Review, Platform Team Workboards (Clinic Duty Team): Rename articles and users to prepare for PHP 7.3 unicode changes - https://phabricator.wikimedia.org/T292552 (tstarling) That reduced the number of user renames to 441, of which 370 are for decorative Esze...