[00:10:50] cloud-services-team (Hardware), DC-Ops, ops-eqiad, SRE: Q2:rack/setup/install clouddb1026-1033 - https://phabricator.wikimedia.org/T409162#12043251 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host clouddb1029.eqiad.wmnet with OS trixie completed: - cl...
[00:11:07] cloud-services-team (Hardware), DC-Ops, ops-eqiad, SRE: Q2:rack/setup/install clouddb1026-1033 - https://phabricator.wikimedia.org/T409162#12043252 (Jclark-ctr)
[00:14:55] cloud-services-team (Hardware), DC-Ops, ops-eqiad, SRE: Q2:rack/setup/install clouddb1026-1033 - https://phabricator.wikimedia.org/T409162#12043255 (Jclark-ctr) a:Jclark-ctr
[00:19:46] (CR) Dzahn: [V:+2 C:+2] add repos/sre/sloth, repos/sre/slothslos to config [labs/codesearch] - https://gerrit.wikimedia.org/r/1304848 (https://phabricator.wikimedia.org/T429819) (owner: Dzahn)
[00:21:02] cloud-services-team (Hardware), DC-Ops, ops-eqiad, SRE: Q2:rack/setup/install clouddb1026-1033 - https://phabricator.wikimedia.org/T409162#12043266 (Jclark-ctr)
[00:21:05] cloud-services-team (Hardware), DC-Ops, ops-eqiad, SRE: Q2:rack/setup/install clouddb1026-1033 - https://phabricator.wikimedia.org/T409162#12043267 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host clouddb1028.eqiad.wmnet with OS trixie completed: - cl...
[00:24:34] cloud-services-team (Hardware), DC-Ops, ops-eqiad, SRE: Q2:rack/setup/install clouddb1026-1033 - https://phabricator.wikimedia.org/T409162#12043269 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host clouddb1026.eqiad.wmnet with OS trixie completed: - cl...
[00:24:39] cloud-services-team (Hardware), DC-Ops, ops-eqiad, SRE: Q2:rack/setup/install clouddb1026-1033 - https://phabricator.wikimedia.org/T409162#12043270 (Jclark-ctr)
[00:30:23] cloud-services-team (Hardware), DC-Ops, ops-eqiad, SRE: Q2:rack/setup/install clouddb1026-1033 - https://phabricator.wikimedia.org/T409162#12043271 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host clouddb1033.eqiad.wmnet with OS trixie completed: - cl...
[00:30:24] cloud-services-team (Hardware), DC-Ops, ops-eqiad, SRE: Q2:rack/setup/install clouddb1026-1033 - https://phabricator.wikimedia.org/T409162#12043272 (Jclark-ctr)
[00:33:27] cloud-services-team (Hardware), DC-Ops, ops-eqiad, SRE: Q2:rack/setup/install clouddb1026-1033 - https://phabricator.wikimedia.org/T409162#12043275 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host clouddb1027.eqiad.wmnet with OS trixie completed: - cl...
[00:33:28] cloud-services-team (Hardware), DC-Ops, ops-eqiad, SRE: Q2:rack/setup/install clouddb1026-1033 - https://phabricator.wikimedia.org/T409162#12043276 (Jclark-ctr)
[00:33:45] cloud-services-team (Hardware), DC-Ops, ops-eqiad, SRE: Q2:rack/setup/install clouddb1026-1033 - https://phabricator.wikimedia.org/T409162#12043277 (Jclark-ctr) Open→Resolved
[00:59:14] VPS-project-Codesearch: Index the slothslo gitlab repo in codesearch - https://phabricator.wikimedia.org/T429819#12043296 (Dzahn) - added to config with the change above - run puppet which git pulls - systemctl start codesearch-write-config to create new config - systemctl restart hound-operations.service...
[00:59:53] VPS-project-Codesearch: Index the slothslo gitlab repo in codesearch - https://phabricator.wikimedia.org/T429819#12043297 (Dzahn) Open→Resolved It has been added. https://codesearch.wmcloud.org/operations/?q=Sloth&i=nope&files=&excludeFiles=&repos=
[01:18:39] Cloud-Services: Unable to create Toolforge project "Tool-odivoir" – HTTP 500 Internal Server Error on toolsadmin - https://phabricator.wikimedia.org/T429857 (poro26) NEW The #Cloud-Services project tag is not intended to have any tasks. Please check the list on https://phabricator.wikimedia.org/project/p...
[01:19:51] Cloud-Services: Unable to create Toolforge project "Tool-odivoir" – HTTP 500 Internal Server Error on toolsadmin - https://phabricator.wikimedia.org/T429857#12043357 (poro26) >>! In T429857#12043354, @Herald wrote: > The #Cloud-Services project tag is not intended to have any tasks. Please check the lis...
[01:21:37] Tools: Unable to create Toolforge project "Tool-odivoir" – HTTP 500 Internal Server Error on toolsadmin - https://phabricator.wikimedia.org/T429857#12043359 (poro26)
[01:21:56] Tools: Unable to create Toolforge project "Tool-odivoir" – HTTP 500 Internal Server Error on toolsadmin - https://phabricator.wikimedia.org/T429857#12043363 (poro26) >>! In T429857#12043357, @poro26 wrote: >>>! In T429857#12043354, @Herald wrote: >> The #Cloud-Services project tag is not intended t...
[01:24:15] cloud-services-team, Striker: Unable to create Toolforge project "Tool-odivoir" – HTTP 500 Internal Server Error on toolsadmin - https://phabricator.wikimedia.org/T429857#12043364 (JJMC89)
[06:19:46] cloud-services-team, Quarry, Patch-For-Review, patch-welcome, Plural-Support: "1 rows" should be "1 row" - https://phabricator.wikimedia.org/T419564#12043764 (Aklapper) @Nihar_Chakravarti: That patch seems to add lots of unrelated `\r`?
[06:45:22] cloud-services-team, Quarry, Patch-For-Review, patch-welcome, Plural-Support: "1 rows" should be "1 row" - https://phabricator.wikimedia.org/T419564#12043802 (Nihar_Chakravarti) Thanks for pointing that out!! I edited the files on Windows, so unintended line-ending changes may have been intro...
[08:14:09] !log dcaro@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.component.deploy for component jobs-api
[08:25:44] cloud-services-team, Striker: Unable to create Toolforge project "Tool-odivoir" – HTTP 500 Internal Server Error on toolsadmin - https://phabricator.wikimedia.org/T429857#12043959 (dcaro) Hi @poro26, the odivoir tool already exists (you linked to it https://toolsadmin.wikimedia.org/tools/id/odivoir), can...
[08:26:12] !log dcaro@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component jobs-api
[08:27:12] !log dcaro@cloudcumin1001 tools START - Cookbook wmcs.toolforge.component.deploy for component jobs-api
[08:29:51] Tool-glam-tool-hospital, GLAM-Tech: GLAMorgan - Bug - Test - https://phabricator.wikimedia.org/T429877 (GLAM-Tool-Hospital-Bot) NEW p:Triage→Low
[08:31:09] (update) fnegri: Do not require --filelog if --filelog-std(out|err) is used [repos/cloud/toolforge/jobs-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/158 (https://phabricator.wikimedia.org/T428354)
[08:31:24] (update) dcaro: api_get_jobs: compute correct value for missing fields [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/316 (https://phabricator.wikimedia.org/T428354) (owner: fnegri)
[08:31:50] (update) dcaro: api_get_jobs: compute correct value for missing fields [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/316 (https://phabricator.wikimedia.org/T428354) (owner: fnegri)
[08:32:05] Tool-glam-tool-hospital, GLAM-Tech: GLAMorgan - Bug - Test - https://phabricator.wikimedia.org/T429877#12043984 (Dactylantha) Open→Invalid
[08:32:33] (update) dcaro: Do not require --filelog if --filelog-std(out|err) is used [repos/cloud/toolforge/jobs-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/158 (https://phabricator.wikimedia.org/T428354) (owner: fnegri)
[08:40:32] !log dcaro@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component jobs-api
[08:41:25] (approved) dcaro: jobs-api: bump to 0.0.513-20260622155500-30972a04 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1304 (https://phabricator.wikimedia.org/T429231) (owner: group_203_bot_3c0afd0d9fd9529f3b7bc7e69a4a3bce)
[08:41:30] (merge) dcaro: jobs-api: bump to 0.0.513-20260622155500-30972a04 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1304 (https://phabricator.wikimedia.org/T429231) (owner: group_203_bot_3c0afd0d9fd9529f3b7bc7e69a4a3bce)
[08:54:22] cloud-services-team (FY2025/2026-Q3-Q4), Cloud-VPS, Data-Services, Data-Platform-SRE, and 3 others: Plan to make clouddumps more resilient and easier to operate - https://phabricator.wikimedia.org/T411248#12044146 (brouberol) This will only be used by NFS clients, right? Will it impact humans goi...
[09:03:56] Toolforge, tools-platform-team, Patch-For-Review: [jobs-cli] emits a warning to re-create valid jobs - https://phabricator.wikimedia.org/T429231#12044203 (dcaro) okok, so some of it was solved, now the issue is a difference in the logs path: ` [2026-06-23 08:53:54,309] p12:t139856736306880 /app/tjf/...
[09:10:55] (update) dcaro: cleanup by image name [repos/cloud/toolforge/builds-api] (dont_chmod_on_build) - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-api/-/merge_requests/165
[09:11:39] (update) dcaro: cleanup by image name [repos/cloud/toolforge/builds-api] (dont_chmod_on_build) - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-api/-/merge_requests/165
[10:20:47] tools-infrastructure-team: Implement magnum monitoring - https://phabricator.wikimedia.org/T429557#12044618 (fnegri)
[10:21:33] tools-infrastructure-team: Streamline cloud services reboots to minimize admin impact - https://phabricator.wikimedia.org/T427338#12044620 (fnegri)
[10:26:14] cloud-services-team, Data-Services, Data-Persistence: CPU spikes in clouddb1014@s2 - https://phabricator.wikimedia.org/T429892 (fnegri) NEW
[10:26:41] cloud-services-team, Data-Services, Data-Persistence: CPU spikes in clouddb1014@s2 - https://phabricator.wikimedia.org/T429892#12044637 (fnegri)
[10:26:57] cloud-services-team, Data-Services, Data-Persistence: CPU spikes in clouddb1014@s2 - https://phabricator.wikimedia.org/T429892#12044638 (fnegri)
[10:30:18] cloud-services-team, Data-Services, Data-Persistence: CPU spikes in clouddb1014@s2 - https://phabricator.wikimedia.org/T429892#12044665 (fnegri) 30-day view of CPU, Memory and Network for clouddb1014: {F90116488}
[10:32:18] cloud-services-team, Data-Services, Data-Persistence: CPU spikes in clouddb1014@s2 - https://phabricator.wikimedia.org/T429892#12044678 (fnegri)
[10:33:14] cloud-services-team, Data-Services, Data-Persistence: CPU spikes in clouddb1014@s2 - https://phabricator.wikimedia.org/T429892#12044681 (Marostegui) Just to clarify: alerts weren't routed to us specifically. They arrived to #wikimedia-operations channel and I saw them.
[10:36:04] cloud-services-team, Data-Services, Data-Persistence: CPU spikes in clouddb1014@s2 - https://phabricator.wikimedia.org/T429892#12044694 (fnegri)
[10:36:08] cloud-services-team, Data-Services, Data-Persistence: CPU spikes in clouddb1014@s2 - https://phabricator.wikimedia.org/T429892#12044695 (fnegri) > Just to clarify: alerts weren't routed to us specifically. They arrived to #wikimedia-operations channel and I saw them. Ack, I corrected the task descri...
[10:38:29] cloud-services-team, Data-Services, Data-Persistence: CPU spikes in clouddb1014@s2 - https://phabricator.wikimedia.org/T429892#12044698 (fnegri) I checked the alert trends in Icinga and this is the only time the alert fired in the past 30 days, so unless this repeats I would treat it as a one-off and...
[10:54:51] cloud-services-team, Data-Services, Data-Persistence: CPU spikes in clouddb1014@s2 - https://phabricator.wikimedia.org/T429892#12044744 (CWilliams-WMF) @fnegri here is a quick test to point you in the direction: https://phabricator.wikimedia.org/P94365 In the example, heavyload.sh is just checking f...
[11:46:05] cloud-services-team (FY2025/2026-Q3-Q4), Cloud-VPS, Data-Services, Data-Platform-SRE, and 3 others: Plan to make clouddumps more resilient and easier to operate - https://phabricator.wikimedia.org/T411248#12044919 (fgiunchedi) >>! In T411248#12044146, @brouberol wrote: > This will only be used by...
[11:48:44] VPS-Projects, VisualEditor, Catalyst (PatchDemo): Clean up the patchdemo3.visualeditor.eqiad1.wikimedia.cloud instance - https://phabricator.wikimedia.org/T429836#12044934 (fgiunchedi) Yes, you'll need to move the proxy from `visualeditor` project to `catalyst` and point it to an instance which will...
[12:03:56] cloud-services-team (FY2025/2026-Q3-Q4), Cloud-VPS, Data-Services, Data-Platform-SRE, and 3 others: Plan to make clouddumps more resilient and easier to operate - https://phabricator.wikimedia.org/T411248#12045019 (brouberol) No objection from my part!
[12:09:29] (approved) dcaro: api_get_jobs: compute correct value for missing fields [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/316 (https://phabricator.wikimedia.org/T428354) (owner: fnegri)
[12:09:30] (update) dcaro: api_get_jobs: compute correct value for missing fields [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/316 (https://phabricator.wikimedia.org/T428354) (owner: fnegri)
[12:10:18] (update) dcaro: Do not require --filelog if --filelog-std(out|err) is used [repos/cloud/toolforge/jobs-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/158 (https://phabricator.wikimedia.org/T428354) (owner: fnegri)
[12:27:44] tools-platform-team: [jobs-api] can't run a buildservice job with filelog - https://phabricator.wikimedia.org/T429907 (dcaro) NEW
[12:27:49] tools-platform-team: [jobs-api] can't run a buildservice job with filelog - https://phabricator.wikimedia.org/T429907#12045120 (dcaro) p:Triage→High
[12:34:23] VPS-project-Codesearch: Index the slothslo gitlab repo in codesearch - https://phabricator.wikimedia.org/T429819#12045146 (stjn) @Dzahn: did this change break all customisations to Codesearch somehow or is it something different that caused this? Compare - https://codesearch.wmcloud.org (now) - https://w...
[12:50:49] (open) dcaro: conftest: Added an environment guard for DEBUG [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/317
[12:51:02] (update) dcaro: conftest: Added an environment guard for DEBUG [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/317
[13:03:13] Tool-subtitler: Silence detection fails - https://phabricator.wikimedia.org/T429912 (psubhashish1) NEW
[13:03:23] VPS-Projects, VisualEditor, Catalyst (PatchDemo): Clean up the patchdemo3.visualeditor.eqiad1.wikimedia.cloud instance - https://phabricator.wikimedia.org/T429836#12045315 (matmarex) Can you advise on how to do that? I don't see an option to move it to another project in the Horizon interface. I pres...
[13:05:05] Tool-subtitler: Silence detection fails - https://phabricator.wikimedia.org/T429912#12045318 (psubhashish1)
[13:09:59] VPS-Projects, VisualEditor, Catalyst (PatchDemo): Clean up the patchdemo3.visualeditor.eqiad1.wikimedia.cloud instance - https://phabricator.wikimedia.org/T429836#12045334 (matmarex) I found this: https://wikitech.wikimedia.org/wiki/Help:Using_a_web_proxy_to_reach_Cloud_VPS_servers_from_the_internet#...
[13:11:33] Tool-subtitler: ASR for silence detection instead of/in addition to the current, faulty silence detection - https://phabricator.wikimedia.org/T429914 (psubhashish1) NEW
[13:13:43] cloud-services-team (FY2025/2026-Q3-Q4), Cloud-VPS, Data-Services, Data-Platform-SRE, and 3 others: Plan to make clouddumps more resilient and easier to operate - https://phabricator.wikimedia.org/T411248#12045381 (Gehel) Moving this to "watching" as I don't expect #data-platform-sre to be involv...
[13:13:54] Tool-subtitler: ASR for silence detection instead of/in addition to the current, faulty silence detection - https://phabricator.wikimedia.org/T429914#12045387 (psubhashish1)
[13:14:58] Tool-phab-ban: Add a button to phab-ban that turns on manual account approval - https://phabricator.wikimedia.org/T420251#12045394 (Aklapper)
[13:15:24] Tool-subtitler: ASR for silence detection instead of/in addition to the current, faulty silence detection - https://phabricator.wikimedia.org/T429914#12045396 (psubhashish1)
[13:20:20] VPS-Projects, VisualEditor, Catalyst (PatchDemo): Clean up the patchdemo3.visualeditor.eqiad1.wikimedia.cloud instance - https://phabricator.wikimedia.org/T429836#12045432 (matmarex) Open→Resolved a:matmarex That worked! I went ahead and deleted the instance now, as well as the old "p...
[13:20:58] Tool-subtitler: Support Wikimedia API for machine translation - https://phabricator.wikimedia.org/T429916 (psubhashish1) NEW
[13:22:48] tools-platform-team: [jobs-api] can't run a buildservice job with filelog - https://phabricator.wikimedia.org/T429907#12045450 (fnegri) a:fnegri
[13:25:33] Tool-subtitler: Update/upload TimedText on Commons - https://phabricator.wikimedia.org/T429917 (psubhashish1) NEW
[13:31:12] cloud-services-team, Cloud-VPS, Wikimedia-Incident: Openstack cinder volumes backups are broken - https://phabricator.wikimedia.org/T428867#12045519 (Volans) Open→Resolved All cinder backups are now working fine, the retention period has been restored to a sane value (10 days, was 8 befor...
[13:39:12] Tool-inteGraality: Support P1963 "properties for this type" as a source for dashboard properties - https://phabricator.wikimedia.org/T429918 (JeanFred) NEW
[13:41:09] (update) fnegri: Do not require --filelog if --filelog-std(out|err) is used [repos/cloud/toolforge/jobs-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/158 (https://phabricator.wikimedia.org/T428354)
[13:43:32] cloud-services-team, Cloud-VPS, Incident Severity 3, Wikimedia-Incident: Openstack cinder volumes backups are broken - https://phabricator.wikimedia.org/T428867#12045576 (MLechvien-WMF)
[13:47:26] cloud-services-team, Cloud-VPS, Incident Severity 3, Wikimedia-Incident: Openstack cinder volumes backups are broken - https://phabricator.wikimedia.org/T428867#12045616 (MLechvien-WMF)
[13:48:25] tools-platform-team: [jobs-api] can't run a buildservice job with filelog - https://phabricator.wikimedia.org/T429907#12045632 (fnegri) This is slightly more complicated. I could not reproduce this at first: ` lang=shell-session tools.whopaintedthis@tools-bastion-15:~$ toolforge jobs run --command 'sleep 10...
[13:50:58] (update) dcaro: deploy_task: wait for empty slot when starting build [repos/cloud/toolforge/components-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/173
[13:52:03] Tool-inteGraality: Support P1963 "properties for this type" as a source for dashboard properties - https://phabricator.wikimedia.org/T429918#12045693 (JeanFred)
[13:52:43] tools-platform-team: [jobs-api] can't run a buildservice job with filelog - https://phabricator.wikimedia.org/T429907#12045694 (dcaro) > But if I try to create a job with --mount=all it creates a "corrupt" job that cannot be listed or deleted: You mean without?
[13:54:42] Tool-inteGraality: Support P1963 "properties for this type" as a source for dashboard properties - https://phabricator.wikimedia.org/T429918#12045711 (JeanFred)
[13:55:30] (update) dcaro: deploy_task: wait for empty slot when starting build [repos/cloud/toolforge/components-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/173
[14:03:41] tools-platform-team: [jobs-api] can't run a buildservice job with filelog - https://phabricator.wikimedia.org/T429907#12045752 (fnegri) > You mean without? Yes sorry :D Can you reproduce or do you get a different behavior?
[14:08:41] FIRING: CloudVPSDesignateLeaks: Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[14:08:43] tools-platform-team: [jobs-api] can't run a buildservice job with filelog - https://phabricator.wikimedia.org/T429907#12045776 (dcaro) > Can you reproduce or do you get a different behavior? I have the same issue yep, had to flush to get it unstuck
[14:14:32] Tool-inteGraality: Support P1963 "properties for this type" as a source for dashboard properties - https://phabricator.wikimedia.org/T429918#12045812 (JeanFred)
[14:15:35] (update) dcaro: deploy_task: wait for empty slot when starting build [repos/cloud/toolforge/components-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/173
[14:16:18] Tool-subtitler: Remove any leading white space in the beginning of a subtitle - https://phabricator.wikimedia.org/T429923 (psubhashish1) NEW
[14:17:17] !log andrew@cloudcumin1001 video START - Cookbook wmcs.nfs.migrate_service
[14:17:26] !log andrew@cloudcumin1001 video END (FAIL) - Cookbook wmcs.nfs.migrate_service (exit_code=99)
[14:18:33] Tool-inteGraality: Support P1963 "properties for this type" as a source for dashboard properties - https://phabricator.wikimedia.org/T429918#12045852 (JeanFred)
[14:19:27] Tool-inteGraality: Support P1963 "properties for this type" as a source for dashboard properties - https://phabricator.wikimedia.org/T429918#12045858 (JeanFred)
[14:19:28] tools-platform-team: [jobs-api] can't run a buildservice job with filelog - https://phabricator.wikimedia.org/T429907#12045859 (dcaro) It's weird that it complains there, when a line before also does the `storage_job.get_resolved_core_job().model_dump(exclude=to_exclude)`
[14:20:18] !log andrew@cloudcumin1001 video START - Cookbook wmcs.nfs.add_server
[14:20:29] !log andrew@cloudcumin1001 video END (FAIL) - Cookbook wmcs.nfs.add_server (exit_code=99)
[14:21:19] tools-platform-team: [jobs-api] can't run a buildservice job with filelog - https://phabricator.wikimedia.org/T429907#12045861 (dcaro) This also means that somehow it was able to save it (when it should have failed before instead), so there's some validation that's not happening
[14:24:35] !log andrew@cloudcumin1001 video START - Cookbook wmcs.nfs.add_server
[14:24:43] !log andrew@cloudcumin1001 video END (FAIL) - Cookbook wmcs.nfs.add_server (exit_code=99)
[14:25:40] tools-platform-team: [jobs-api] can't run a buildservice job with filelog - https://phabricator.wikimedia.org/T429907#12045878 (dcaro) ahh, the logs above are misleading, those are from the `jobs list` from the creation of the job you get: ` jobs-api jobs-api-65f9dcd976-cks85 webservice File "/app/tjf/api/...
[14:26:47] !log andrew@cloudcumin1001 video START - Cookbook wmcs.nfs.add_server
[14:26:50] Tool-wmf-openapi-linter, MW-Interfaces-Team, Tech-Docs-Team, MW-1.47-notes (1.47.0-wmf.8; 2026-06-23), OKR-Work: Fix summary issues in the MediaWiki REST API OAD - https://phabricator.wikimedia.org/T428150#12045882 (TBurmeister) In progress→Resolved
[14:27:10] Tool-wmf-openapi-linter, MW-Interfaces-Team, Tech-Docs-Team, MW-1.47-notes (1.47.0-wmf.8; 2026-06-23), OKR-Work: Fix summary issues in the MediaWiki REST API OAD - https://phabricator.wikimedia.org/T428150#12045886 (TBurmeister)
[14:29:12] tools-platform-team: [jobs-api] can't run a buildservice job with filelog - https://phabricator.wikimedia.org/T429907#12045894 (dcaro) okok, so we do the resolve_job only when trying to create in runtime, we should do that before as validation, something like: ` ## core.py def create_job(self, job: Any...
[14:30:12] tools-platform-team: [jobs-api] can't run a buildservice job with filelog - https://phabricator.wikimedia.org/T429907#12045899 (dcaro) Looking though why it does not validate it before.
[14:38:41] !log andrew@cloudcumin1001 video END (PASS) - Cookbook wmcs.nfs.add_server (exit_code=0)
[14:40:24] !log andrew@cloudcumin1001 video START - Cookbook wmcs.nfs.migrate_service
[14:41:07] !log andrew@cloudcumin1001 video END (FAIL) - Cookbook wmcs.nfs.migrate_service (exit_code=99)
[14:41:23] tools-platform-team: [jobs-api] can't run a buildservice job with filelog - https://phabricator.wikimedia.org/T429907#12045976 (dcaro) >>! In T429907#12045894, @dcaro wrote: > okok, so we do the resolve_job only when trying to create in runtime, we should do that before as validation, something like: > > `p...
[14:42:45] !log andrew@cloudcumin1001 video START - Cookbook wmcs.nfs.migrate_service
[14:43:00] VPS-project-devtools, VPS-project-Phabricator: Puppet broken on phabricator-bookworm-3.devtools.eqiad1.wikimedia.cloud - https://phabricator.wikimedia.org/T428069#12045978 (dancy) >>! In T428069#12042082, @Arnoldokoth wrote: > I terminated this host so should not be an issue now. I'll mark this ticke...
[14:43:30] !log andrew@cloudcumin1001 video END (FAIL) - Cookbook wmcs.nfs.migrate_service (exit_code=99)
[14:45:58] !log andrew@cloudcumin1001 video START - Cookbook wmcs.nfs.migrate_service
[14:46:34] (open) cgoubert: Remove apiportalwiki from config [toolforge-repos/api-explorer] - https://gitlab.wikimedia.org/toolforge-repos/api-explorer/-/merge_requests/1 (https://phabricator.wikimedia.org/T418494)
[14:48:10] tools-platform-team: [jobs-api] can't run a buildservice job with filelog - https://phabricator.wikimedia.org/T429907#12045999 (dcaro) And after that, you can create jobs with mount=all: ` local.tf-test2@toolslocal:~$ toolforge jobs run --continuous --command 'sleep 100' --image tool-tf-test2/my-cronjob4 --f...
[14:53:40] !log andrew@cloudcumin1001 video END (FAIL) - Cookbook wmcs.nfs.migrate_service (exit_code=99)
[14:54:53] !log andrew@cloudcumin1001 video START - Cookbook wmcs.nfs.migrate_service
[14:54:53] !log andrew@cloudcumin1001 video END (FAIL) - Cookbook wmcs.nfs.migrate_service (exit_code=99)
[14:55:06] !log andrew@cloudcumin1001 video START - Cookbook wmcs.nfs.migrate_service
[14:55:06] !log andrew@cloudcumin1001 video END (FAIL) - Cookbook wmcs.nfs.migrate_service (exit_code=99)
[15:14:24] Cloud-VPS (Project-requests): Request creation of wikimedia_commons_uploader_telegram_bot VPS project - https://phabricator.wikimedia.org/T429729#12046135 (fgiunchedi) Thank you for explaining the workflow -- this sounds to me like a great use case for a Toolforge tool. Have you considered deploying on Toolf...
[15:19:36] cloud-services-team (FY2025/2026-Q3-Q4), Cloud-VPS, tools-infrastructure-team: Put cloudvirt10[77-80] in service - https://phabricator.wikimedia.org/T429563#12046165 (fgiunchedi) Hosts are provisioned, though not yet on cloud-private. @cmooney has a cookbook in the testing phase which will help autom...
[15:19:39] Cloud-VPS, tools-platform-team: Convert dynamicproxy from nginx to haproxy - https://phabricator.wikimedia.org/T429930 (Volans) NEW p:Triage→Medium
[15:25:05] Cloud-VPS (Debian Bullseye Deprecation), The-Wikipedia-Library, Moderator-Tools-Team (Kanban): twl: Replace deprecated Bullseye VMs in Cloud VPS - https://phabricator.wikimedia.org/T402054#12046224 (Samwalton9-WMF)
[15:25:06] Cloud-VPS (Debian Bullseye Deprecation), The-Wikipedia-Library, Moderator-Tools-Team (Kanban): wikilink: Replace deprecated Bullseye VM in Cloud VPS - https://phabricator.wikimedia.org/T402055#12046226 (Samwalton9-WMF)
[15:25:46] tools-infrastructure-team: Stop hardcoding cloudcephosd networking data in puppet - https://phabricator.wikimedia.org/T429932 (fgiunchedi) NEW
[15:25:57] Cloud-VPS (Project-requests): Request creation of wikimedia_commons_uploader_telegram_bot VPS project - https://phabricator.wikimedia.org/T429729#12046243 (vitaly-zdanevich) >>! In T429729#12046135, @fgiunchedi wrote: > this sounds to me like a great use case for a Toolforge tool. For this project, we need...
[15:26:28] Cloud-VPS (Debian Bullseye Deprecation), The-Wikipedia-Library, Essential-Work, Moderator-Tools-Team (Kanban): wikilink: Replace deprecated Bullseye VM in Cloud VPS - https://phabricator.wikimedia.org/T402055#12046246 (DMburugu)
[15:27:31] VPS-project-Codesearch: Index the slothslo gitlab repo in codesearch - https://phabricator.wikimedia.org/T429819#12046253 (SomeRandomDeveloper) >>! In T429819#12045146, @stjn wrote: > @Dzahn: did this change break all customisations to Codesearch somehow or is it something different that caused this? Com...
[15:27:50] Cloud-VPS (Project-requests): Request creation of wikimedia_commons_uploader_telegram_bot VPS project - https://phabricator.wikimedia.org/T429729#12046257 (vitaly-zdanevich) I can implement oauth token, if you think that it will be better.
[15:28:22] Cloud-VPS (Debian Bullseye Deprecation), The-Wikipedia-Library, Essential-Work, Moderator-Tools-Team (Kanban): twl: Replace deprecated Bullseye VMs in Cloud VPS - https://phabricator.wikimedia.org/T402054#12046260 (DMburugu)
[15:28:45] Cloud-VPS (Debian Bullseye Deprecation), tools-platform-team: Reach out to Cloud VPS project maintainers about Debian Bullseye deprecation - https://phabricator.wikimedia.org/T428196#12046264 (Andrew) >>! In T428196#12041187, @Don-vip wrote: >>>! In T428196#12040703, @Andrew wrote: >> You're right! I...
[15:30:39] Cloud-VPS, tools-platform-team: Convert dynamicproxy from nginx to haproxy - https://phabricator.wikimedia.org/T429930#12046276 (Volans)
[15:33:51] Cloud-VPS (Project-requests): Request creation of wikimedia_commons_uploader_telegram_bot VPS project - https://phabricator.wikimedia.org/T429729#12046289 (fgiunchedi) >>! In T429729#12046243, @vitaly-zdanevich wrote: >>>! In T429729#12046135, @fgiunchedi wrote: >> this sounds to me like a great use case for...
[15:40:02] Cloud-VPS (Project-requests): Request creation of wikimedia_commons_uploader_telegram_bot VPS project - https://phabricator.wikimedia.org/T429729#12046532 (vitaly-zdanevich) Ok, if I will be able to install official Telegram server to Toolforge - I can try it. I will have ssh to that Toolforge, right? Will...
[15:43:26] Cloud-VPS (Project-requests): Request creation of wikimedia_commons_uploader_telegram_bot VPS project - https://phabricator.wikimedia.org/T429729#12046562 (vitaly-zdanevich) And does Toolforge accept zip/rar that are a few GB in size?
[15:43:56] Cloud-VPS (Project-requests): Request creation of wikimedia_commons_uploader_telegram_bot VPS project - https://phabricator.wikimedia.org/T429729#12046563 (fgiunchedi) >>! In T429729#12046532, @vitaly-zdanevich wrote: > Ok, if I will be able to install official Telegram server to Toolforge - I can try it. I...
[15:46:11] Cloud-VPS (Project-requests): Request creation of wikimedia_commons_uploader_telegram_bot VPS project - https://phabricator.wikimedia.org/T429729#12046581 (fgiunchedi) >>! In T429729#12046562, @vitaly-zdanevich wrote: > And does Toolforge accept zip/rar that are a few GB in size? Yes you have local filesyst...
[15:48:17] Cloud-VPS, tools-infrastructure-team: Convert dynamicproxy from nginx to haproxy - https://phabricator.wikimedia.org/T429930#12046589 (Volans)
[15:51:41] Tool-inteGraality: Support for 'mul' in integraality - https://phabricator.wikimedia.org/T393166#12046619 (JeanFred) Open→Resolved a:JeanFred With {T237276} done, I think we are in good shape when it comes to mul support
[15:57:34] Cloud-VPS, tools-infrastructure-team: Convert dynamicproxy from nginx to haproxy - https://phabricator.wikimedia.org/T429930#12046641 (Volans)
[16:00:22] Cloud-VPS (Project-requests): Request creation of wikimedia_commons_uploader_telegram_bot VPS project - https://phabricator.wikimedia.org/T429729#12046663 (vitaly-zdanevich) Ok, thanks, I never used Toolforge - according to the instruction - I sent the Toolforge membership request.
[16:00:31] 10Cloud-VPS, 06tools-platform-team: Horizon UI: limit to one backend per web proxy - https://phabricator.wikimedia.org/T429960 (10Volans) 03NEW [16:00:46] 10Cloud-VPS, 06tools-infrastructure-team: Horizon UI: limit to one backend per web proxy - https://phabricator.wikimedia.org/T429960#12046679 (10Volans) [16:03:41] !log andrew@cloudcumin1001 maps START - Cookbook wmcs.nfs.add_server [16:04:27] 06cloud-services-team, 10Striker: Add dark mode - https://phabricator.wikimedia.org/T429962 (10vitaly-zdanevich) 03NEW [16:05:01] !log andrew@cloudcumin1001 maps END (FAIL) - Cookbook wmcs.nfs.add_server (exit_code=99) [16:09:07] !log andrew@cloudcumin1001 maps START - Cookbook wmcs.nfs.add_server [16:09:50] 10Toolforge, 06tools-platform-team: [jobs-cli] emits a warning to re-create valid jobs - https://phabricator.wikimedia.org/T429231#12046755 (10dcaro) 05Open→03In progress [16:10:27] !log andrew@cloudcumin1001 maps END (FAIL) - Cookbook wmcs.nfs.add_server (exit_code=99) [16:12:16] 10VPS-project-Wikistats: Add minwikiquote to wikistats - https://phabricator.wikimedia.org/T429948#12046783 (10Dzahn) T429922 [16:12:39] 10VPS-project-Wikistats: Add isvwiki to wikistats - https://phabricator.wikimedia.org/T429940#12046788 (10Dzahn) T429920 [16:13:23] 10VPS-project-Wikistats: Add bolwiki to wikistats - https://phabricator.wikimedia.org/T429956#12046794 (10Dzahn) T429921 [16:17:49] !log andrew@cloudcumin1001 maps START - Cookbook wmcs.nfs.add_server [16:19:02] 10Data-Services, 06tools-platform-team, 06Data-Persistence: CPU spikes in clouddb1014@s2 - https://phabricator.wikimedia.org/T429892#12046844 (10fnegri) [16:29:44] !log andrew@cloudcumin1001 maps END (PASS) - Cookbook wmcs.nfs.add_server (exit_code=0) [16:31:00] !log andrew@cloudcumin1001 maps START - Cookbook wmcs.nfs.add_server [16:38:32] FIRING: PuppetStaleCertificates: Found non-revoked Puppet certificates for 1 deleted instances on paws-puppetserver-1 - 
https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates [16:42:39] 06cloud-services-team, 10Striker: [striker] Add dark mode - https://phabricator.wikimedia.org/T429962#12047046 (10dcaro) [16:43:57] !log andrew@cloudcumin1001 maps END (PASS) - Cookbook wmcs.nfs.add_server (exit_code=0) [16:46:25] 10VPS-project-Codesearch: Index the slothslo gitlab repo in codesearch - https://phabricator.wikimedia.org/T429819#12047072 (10Dzahn) >>! In T429819#12045146, @stjn wrote: > @Dzahn: did this change break all customisations to Codesearch somehow This change did nothing besides adding 2 new repos to the index... [16:49:54] 10VPS-project-Codesearch: Index the slothslo gitlab repo in codesearch - https://phabricator.wikimedia.org/T429819#12047083 (10Dzahn) >>! In T429819#12046253, @SomeRandomDeveloper wrote: > The post-merge job for the change (https://integration.wikimedia.org/ci/job/codesearch-pipeline-publish/190/console) fai... [16:49:56] 10VPS-project-Codesearch, 07Regression: Codesearch suddenly uses the default Hound UI - https://phabricator.wikimedia.org/T429967 (10SomeRandomDeveloper) 03NEW [16:50:22] 10VPS-project-Codesearch: Index the slothslo gitlab repo in codesearch - https://phabricator.wikimedia.org/T429819#12047110 (10SomeRandomDeveloper) (Filed {T429967}) [16:51:19] 10Cloud-VPS (Project-requests): Request creation of wikimedia_commons_uploader_telegram_bot VPS project - https://phabricator.wikimedia.org/T429729#12047126 (10vitaly-zdanevich) Created this oauth client without "allow consumer to specify a callback" - it's a mistake, already did a new one. Can somebody remove i... 
[16:54:49] 10VPS-project-Codesearch, 07Regression: Codesearch suddenly uses the default Hound UI - https://phabricator.wikimedia.org/T429967#12047163 (10Dzahn) [16:58:01] 10Toolforge, 06tools-platform-team: [jobs-cli] emits a warning to re-create valid jobs - https://phabricator.wikimedia.org/T429231#12047186 (10dcaro) > log paths seem to get expanded at the wrong time This happens because the k8s job does not have the full path in the logs, it was created in 2023, I think th... [17:07:04] 10Toolforge, 06tools-platform-team: [toolforge-weld] Fails to publish to pypi - https://phabricator.wikimedia.org/T429241#12047233 (10dcaro) p:05Triage→03Medium a:03dcaro [17:07:06] 10Toolforge, 06tools-platform-team: [toolforge-weld] Fails to publish to pypi - https://phabricator.wikimedia.org/T429241#12047236 (10dcaro) 05Open→03In progress [17:15:46] 10Toolforge, 06tools-platform-team: [toolforge-weld] Fails to publish to pypi - https://phabricator.wikimedia.org/T429241#12047346 (10dcaro) Manually it works (using the exported user/pass set by CI), so maybe CI is not setting the vars anymore? [17:18:00] (03open) 10fnegri: CommonJob: don't force mount=none if filelog is set [repos/cloud/toolforge/jobs-api] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/318 (https://phabricator.wikimedia.org/T429907) [17:21:19] 10Toolforge, 06tools-platform-team: [toolforge-weld] Fails to publish to pypi - https://phabricator.wikimedia.org/T429241#12047373 (10dcaro) Found it, it was a protected tag issue, the new tag uses debian/*, and was not added to the list of protected tags. Now it fails with bad request because I already up... 
[17:21:27] 10Toolforge, 06tools-platform-team: [toolforge-weld] Fails to publish to pypi - https://phabricator.wikimedia.org/T429241#12047375 (10dcaro) 05In progress→03Resolved [17:30:08] 06cloud-services-team, 10Cloud-VPS: Project members cannot ssh into newly created deployment-prep instances - https://phabricator.wikimedia.org/T429978 (10dancy) 03NEW [17:30:20] 10Tool-wmf-openapi-linter, 06MW-Interfaces-Team (MWI-Sprint-36 (2026-06-16 to 2026-06-30)), 07OKR-Work: Fix type, media type, and array issues in the MediaWiki REST API OAD - https://phabricator.wikimedia.org/T428149#12047443 (10MGoncalves-WMF) 05Open→03In progress [17:33:02] (03open) 10dcaro: pypi: check for properly set auth [repos/cloud/cicd/gitlab-ci] - 10https://gitlab.wikimedia.org/repos/cloud/cicd/gitlab-ci/-/merge_requests/89 [17:39:00] (03update) 10dcaro: pypi: check for properly set auth [repos/cloud/cicd/gitlab-ci] - 10https://gitlab.wikimedia.org/repos/cloud/cicd/gitlab-ci/-/merge_requests/89 [17:44:29] 06tools-platform-team, 13Patch-For-Review: [jobs-api] can't run a buildservice job with filelog - https://phabricator.wikimedia.org/T429907#12047500 (10fnegri) I didn't read your comments above because I was busy working on a different fix, that I pushed to https://gitlab.wikimedia.org/repos/cloud/toolforge/jo... 
[17:46:59] (03close) 10fnegri: CommonJob: don't force mount=none if filelog is set [repos/cloud/toolforge/jobs-api] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/318 (https://phabricator.wikimedia.org/T429907) [17:47:32] 06tools-platform-team: [sample-complex-app] currently failing to start both api and worker - https://phabricator.wikimedia.org/T429981 (10dcaro) 03NEW [17:55:34] 06cloud-services-team, 10Cloud-VPS: Request creation of wiki-community-metrics-service VPS project - https://phabricator.wikimedia.org/T429983 (10Hermes213412) 03NEW [18:03:20] !log andrew@cloudcumin1001 maps START - Cookbook wmcs.nfs.migrate_service [18:03:47] !log andrew@cloudcumin1001 maps END (FAIL) - Cookbook wmcs.nfs.migrate_service (exit_code=99) [18:08:56] FIRING: CloudVPSDesignateLeaks: Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [18:16:19] (03PS1) 10Dzahn: add repos/test-platform/catalyst/ci-charts to config [labs/codesearch] - 10https://gerrit.wikimedia.org/r/1305224 [18:16:58] !log andrew@cloudcumin1001 maps START - Cookbook wmcs.nfs.add_server [18:17:06] !log andrew@cloudcumin1001 maps END (FAIL) - Cookbook wmcs.nfs.add_server (exit_code=99) [18:17:22] !log andrew@cloudcumin1001 maps START - Cookbook wmcs.nfs.add_server [18:17:30] !log andrew@cloudcumin1001 maps END (FAIL) - Cookbook wmcs.nfs.add_server (exit_code=99) [18:17:56] !log andrew@cloudcumin1001 maps START - Cookbook wmcs.nfs.add_server [18:18:41] RESOLVED: CloudVPSDesignateLeaks: Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [18:18:59] 
06tools-platform-team, 13Patch-For-Review: [jobs-api] using "--filelog" without "--mount" saves invalid jobs into storage - https://phabricator.wikimedia.org/T429907#12047638 (10fnegri) [18:19:26] (03open) 10fnegri: create_job: call get_resolved_core_job() to validate [repos/cloud/toolforge/jobs-api] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/319 (https://phabricator.wikimedia.org/T429907) [18:22:01] (03CR) 10Dzahn: "want to see if CI issue has been fixed" [labs/codesearch] - 10https://gerrit.wikimedia.org/r/1305224 (owner: 10Dzahn) [18:22:24] (03update) 10fnegri: create_job: call get_resolved_core_job() to validate [repos/cloud/toolforge/jobs-api] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/319 (https://phabricator.wikimedia.org/T429907) [18:23:15] (03CR) 10CI reject: [V:04-1] add repos/test-platform/catalyst/ci-charts to config [labs/codesearch] - 10https://gerrit.wikimedia.org/r/1305224 (owner: 10Dzahn) [18:29:45] !log andrew@cloudcumin1001 maps END (PASS) - Cookbook wmcs.nfs.add_server (exit_code=0) [18:31:57] !log andrew@cloudcumin1001 maps START - Cookbook wmcs.nfs.migrate_service [18:32:24] !log andrew@cloudcumin1001 maps END (FAIL) - Cookbook wmcs.nfs.migrate_service (exit_code=99) [18:42:41] (03update) 10fnegri: create_job: call get_resolved_core_job() to validate [repos/cloud/toolforge/jobs-api] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/319 (https://phabricator.wikimedia.org/T429907) [18:46:17] 10VPS-project-Codesearch, 07Regression: Codesearch suddenly uses the default Hound UI - https://phabricator.wikimedia.org/T429967#12047749 (10Dzahn) a:03Dzahn [18:47:14] 10VPS-project-Codesearch, 07Regression: Codesearch suddenly uses the default Hound UI - https://phabricator.wikimedia.org/T429967#12047751 (10Dzahn) also see previous comments on T429819#12047072 trying to get the post-merge build fixed as a first step [18:50:57] 
06tools-infrastructure-team: Patch demo resource check - https://phabricator.wikimedia.org/T429990 (10BLiviero-WMF) 03NEW [18:53:24] (03update) 10fnegri: create_job: call get_resolved_core_job() to validate [repos/cloud/toolforge/jobs-api] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/319 (https://phabricator.wikimedia.org/T429907) [18:59:38] (03update) 10fnegri: create_job: call get_resolved_core_job() to validate [repos/cloud/toolforge/jobs-api] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/319 (https://phabricator.wikimedia.org/T429907) [18:59:40] (03update) 10fnegri: create_job: call get_resolved_core_job() to validate [repos/cloud/toolforge/jobs-api] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/319 (https://phabricator.wikimedia.org/T429907) [19:08:27] 10Cloud-VPS (Debian Bullseye Deprecation), 06tools-platform-team: Reach out to Cloud VPS project maintainers about Debian Bullseye deprecation - https://phabricator.wikimedia.org/T428196#12047853 (10Don-vip) >>! In T428196#12046264, @Andrew wrote: > @Don-vip, this should be done; can you confirm that thing... 
[19:08:42] !log andrew@cloudcumin1001 paws START - Cookbook wmcs.nfs.add_server [19:17:18] 06cloud-services-team, 10Data-Services, 06Data-Persistence: [wikireplicas] Publish sys.user_summary to Prometheus - https://phabricator.wikimedia.org/T429993 (10fnegri) 03NEW [19:19:56] !log andrew@cloudcumin1001 paws END (PASS) - Cookbook wmcs.nfs.add_server (exit_code=0) [19:20:38] 06cloud-services-team, 10Data-Services, 06Data-Persistence: [wikireplicas] Publish sys.user_summary to Prometheus - https://phabricator.wikimedia.org/T429993#12047876 (10fnegri) [19:23:32] RESOLVED: PuppetStaleCertificates: Found non-revoked Puppet certificates for 1 deleted instances on paws-puppetserver-1 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates [19:26:40] 10Data-Services, 06tools-platform-team, 06Data-Persistence: CPU spikes in clouddb1014@s2 - https://phabricator.wikimedia.org/T429892#12047883 (10fnegri) a:03fnegri [19:30:43] !log andrew@cloudcumin1001 paws START - Cookbook wmcs.nfs.migrate_service [19:30:52] !log andrew@cloudcumin1001 paws END (FAIL) - Cookbook wmcs.nfs.migrate_service (exit_code=99) [19:31:08] 10Data-Services, 06tools-platform-team, 06Data-Persistence: CPU spikes in clouddb1014@s2 - https://phabricator.wikimedia.org/T429892#12047901 (10fnegri) @CWilliams-WMF thanks, that's very helpful. I also created {T429892} to track the idea of adding `sys.user_summary` to Prometheus. I don't plan to work on... 
[19:44:59] FIRING: PawsJupyterHubDown: PAWS JupyterHub is down https://wikitech.wikimedia.org/wiki/PAWS/Admin - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPawsJupyterHubDown [19:45:32] FIRING: TargetDown: Job jupyterhub is unreachable in project paws instance hub-paws.wmcloud.org:443 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTargetDown [20:08:24] 10Toolforge, 06tools-platform-team: [jobs-cli] emits a warning to re-create valid jobs - https://phabricator.wikimedia.org/T429231#12048014 (10Wbm1058) Whatever you've done seems to have made the warning messages go away. I just did another `toolforge jobs list` and did not get any warning messages. [20:19:56] 06cloud-services-team, 10Data-Services, 06Data-Persistence, 10Observability-Metrics: [wikireplicas] Publish sys.user_summary to Prometheus - https://phabricator.wikimedia.org/T429993#12048061 (10jcrespo) I am adding observability because they transmitted to me their worry when adding high cardinality metri... [20:21:02] RESOLVED: TargetDown: Job jupyterhub is unreachable in project paws instance hub-paws.wmcloud.org:443 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTargetDown [20:21:32] RESOLVED: PawsJupyterHubDown: PAWS JupyterHub is down https://wikitech.wikimedia.org/wiki/PAWS/Admin - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPawsJupyterHubDown [20:29:11] 06cloud-services-team, 10Data-Services, 06Data-Persistence, 10Observability-Metrics: [wikireplicas] Publish sys.user_summary to Prometheus - https://phabricator.wikimedia.org/T429993#12048094 (10jcrespo) @fnegri I agree that the view seems quite worry-free, specially for wmcs as they are already highly agg... 
[20:48:34] (03PS1) 10Dzahn: update tag for the go image [labs/codesearch] - 10https://gerrit.wikimedia.org/r/1305248 (https://phabricator.wikimedia.org/T429967) [20:50:20] (03CR) 10Dzahn: [C:03+2] update tag for the go image [labs/codesearch] - 10https://gerrit.wikimedia.org/r/1305248 (https://phabricator.wikimedia.org/T429967) (owner: 10Dzahn) [20:50:25] (03PS2) 10Dzahn: update tag for the go image [labs/codesearch] - 10https://gerrit.wikimedia.org/r/1305248 (https://phabricator.wikimedia.org/T429967) [20:52:34] (03CR) 10Dzahn: ""unsure what produces that image, but the 1.15-1 tag doesn't seem to point to the latest 1.15-1 build (https://docker-registry.wikimedia.o" [labs/codesearch] - 10https://gerrit.wikimedia.org/r/1305248 (https://phabricator.wikimedia.org/T429967) (owner: 10Dzahn) [20:53:33] (03Merged) 10jenkins-bot: update tag for the go image [labs/codesearch] - 10https://gerrit.wikimedia.org/r/1305248 (https://phabricator.wikimedia.org/T429967) (owner: 10Dzahn) [20:53:39] (03PS2) 10Dzahn: add repos/test-platform/catalyst/ci-charts to config [labs/codesearch] - 10https://gerrit.wikimedia.org/r/1305224 [20:57:28] (03CR) 10Dzahn: [C:03+2] add repos/test-platform/catalyst/ci-charts to config [labs/codesearch] - 10https://gerrit.wikimedia.org/r/1305224 (owner: 10Dzahn) [20:58:50] (03Merged) 10jenkins-bot: add repos/test-platform/catalyst/ci-charts to config [labs/codesearch] - 10https://gerrit.wikimedia.org/r/1305224 (owner: 10Dzahn) [21:16:24] 06cloud-services-team: Update all cloud-vps Bullseye NFS servers - https://phabricator.wikimedia.org/T429793#12048281 (10Andrew) [21:16:59] 06cloud-services-team: Update all cloud-vps Bullseye NFS servers - https://phabricator.wikimedia.org/T429793#12048285 (10Andrew) [21:45:30] 10VPS-project-Codesearch, 07Regression: Codesearch suddenly uses the default Hound UI - https://phabricator.wikimedia.org/T429967#12048370 (10stjn) Now Codesearch seems to say this (https://codesearch.wmcloud.org/_health says `invalid backend`): 
Unable to contact hound. If...
10VPS-project-Codesearch, 07Regression: Codesearch suddenly uses the default Hound UI - https://phabricator.wikimedia.org/T429967#12048451 (10Dzahn) @stjn That's because I was working on it to fix this. [22:13:56] 10VPS-project-Codesearch, 07Regression: Codesearch suddenly uses the default Hound UI - https://phabricator.wikimedia.org/T429967#12048452 (10Dzahn) - CI has been fixed by updating to an image tag that does not use the deprecated mirrors.wikimedia.org (https://gerrit.wikimedia.org/r/c/labs/codesearch/+/1305248... [22:14:04] 06cloud-services-team, 10Cloud-VPS: debian-12.0-bookworm and debian-13.0-trixie image still reference mirrors.wikimedia.org - https://phabricator.wikimedia.org/T429542#12048454 (10Andrew) 05Open→03Resolved a:03Andrew I built fresh base images for Bookworm and Trixie which should resolve this. [22:15:34] 10VPS-project-Codesearch, 07Regression: Codesearch suddenly uses the default Hound UI - https://phabricator.wikimedia.org/T429967#12048459 (10Dzahn) {F90195358} [22:15:37] 10VPS-project-Codesearch, 07Regression: Codesearch suddenly uses the default Hound UI - https://phabricator.wikimedia.org/T429967#12048460 (10Dzahn) 05Open→03Resolved [22:19:47] 10VPS-project-Codesearch: Index the slothslo gitlab repo in codesearch - https://phabricator.wikimedia.org/T429819#12048467 (10Dzahn) The CI problem has been solved by https://gerrit.wikimedia.org/r/c/labs/codesearch/+/1305248 because that newer image for go-lang did not reference mirrors.wikimedia.org... [22:41:37] 06cloud-services-team, 10Striker: Unable to create Toolforge project "Tool-odivoir" – HTTP 500 Internal Server Error on toolsadmin - https://phabricator.wikimedia.org/T429857#12048498 (10poro26) >>! In T429857#12043959, @dcaro wrote: > Hi @poro26, the odivoir tool already exists (you linked to it https://...