[00:08:28] FIRING: PuppetAgentStaleLastRun: Last Puppet run was over 24 hours ago on instance tf-infra-test in project tf-infra-test - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[00:13:28] RESOLVED: PuppetAgentStaleLastRun: Last Puppet run was over 24 hours ago on instance tf-infra-test in project tf-infra-test - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[00:25:10] RESOLVED: CephClusterInWarning: Ceph cluster in eqiad is in warning status - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephClusterInWarning - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephClusterInWarning
[01:19:57] FIRING: CloudVPSDesignateLeaks: Detected 6 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[03:15:16] Cloud-VPS (Debian Buster Deprecation), Beta-Cluster-Infrastructure: Migrate deployment-prep away from Debian Buster to Bullseye/Bookworm - https://phabricator.wikimedia.org/T327742#9947848 (Andrew) I have further increased the core quota for this project from 220 to 240.
[03:56:48] cloud-services-team, Cloud-VPS (Debian Buster Deprecation), Beta-Cluster-Infrastructure, serviceops: Replace deployment-memc[08-10] with Bullseye or Bookworm - https://phabricator.wikimedia.org/T361384#9947872 (Andrew) The old memc hosts run redis, and are referred to as such in deployment-prep p...
[05:04:42] RESOLVED: CloudVPSDesignateLeaks: Detected 6 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[06:31:39] Toolforge (Toolforge iteration 12), Patch-For-Review, Upstream: [maintain-harbor] Manage project quotas via maintain-harbor - https://phabricator.wikimedia.org/T352417#9948001 (Slst2020) It now seems likely that this will make it into v2.12. The planned release date is in October 2024. https://github...
[06:34:17] Toolforge (Toolforge iteration 12): [toolforge-deploy] envvars functional tests fail when out of quota - https://phabricator.wikimedia.org/T367169#9948014 (Slst2020) >>! In T367169#9944551, @dcaro wrote: > @Slst2020 do you think that this is still relevant? Given that the quota has increased to 64, making t...
[06:34:23] Toolforge (Toolforge iteration 12): [toolforge-deploy] envvars functional tests fail when out of quota - https://phabricator.wikimedia.org/T367169#9948015 (Slst2020) Open→Declined
[06:51:26] (close) sstefanova: Draft: Testing error generation for envvars-api [repos/cloud/toolforge/envvars-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-api/-/merge_requests/35 (https://phabricator.wikimedia.org/T360147 https://phabricator.wikimedia.org/T366697) (owner: ebomani)
[06:51:44] (close) sstefanova: Draft: Testing error generation for envvars-api [repos/cloud/toolforge/envvars-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-api/-/merge_requests/36 (https://phabricator.wikimedia.org/T366697)
[07:34:58] cloud-services-team, Technical-blog-posts: Tech blog post: "Wikimedia Toolforge: migrating Kubernetes from PodSecurityPolicy to kyverno" - https://phabricator.wikimedia.org/T368948#9948079 (aborrero) >>! In T368948#9947286, @debt wrote: > @aborrero @Andrew Can you take a look at the [[ https://techblog.w...
[08:00:41] cloud-services-team (FY2023/2024-Q3-Q4), Cloud-VPS, DC-Ops, ops-eqiad, SRE: cloudcephosd1021-1034: hard drive sector errors increasing - https://phabricator.wikimedia.org/T348643#9948101 (dcaro) Doing some tests this morning with rados bench from several of the nodes. Running on 12 osd nodes...
[08:26:24] cloud-services-team (FY2023/2024-Q3-Q4), superset.wmcloud.org: Allow Superset to query ToolsDB public databases - https://phabricator.wikimedia.org/T367393#9948169 (KCVelaga_WMF) @fnegri I have created a db named `s55986__automod_metrics_p` with a couple of tables and some dummy data. Please add it to S...
[08:45:49] cloud-services-team (FY2023/2024-Q3-Q4), Cloud-VPS, DC-Ops, ops-eqiad, SRE: cloudcephosd1021-1034: hard drive sector errors increasing - https://phabricator.wikimedia.org/T348643#9948214 (dcaro) To minimize the routers load I'm going to use a spread-out set of nodes for the tests and try agai...
[08:56:32] cloud-services-team (FY2023/2024-Q3-Q4), Cloud-VPS, DC-Ops, ops-eqiad, SRE: cloudcephosd1021-1034: hard drive sector errors increasing - https://phabricator.wikimedia.org/T348643#9948349 (dcaro) using 12 spread nodes hits the discards again: {F56197512} and nothing popping up on the disks s...
[08:58:22] Toolforge, Epic: [components-api] First iteration of the component API - https://phabricator.wikimedia.org/T362051#9948416 (dcaro)
[08:58:25] Cloud Services Proposals, cloud-services-team (FY2023/2024-Q3-Q4), Toolforge, Cloud-Services-Origin-Team, and 3 others: [Epic,builds-api,components-api,webservice,jobs-api] Make Toolforge a proper platform as a service with push-to-deploy and build ... - https://phabricator.wikimedia.org/T194332#9948417
[09:03:28] cloud-services-team (FY2023/2024-Q3-Q4), Cloud-VPS, Cloud-Services-Origin-Team, Cloud-Services-Worktype-Maintenance, Goal: [ceph] Upgrade hosts to bullseye - https://phabricator.wikimedia.org/T309789#9948449 (dcaro)
[09:08:54] (approved) dcaro: poetry: Autoupdate [repos/cloud/toolforge/toolforge-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-cli/-/merge_requests/21 (owner: group_203_bot_4866fc124f4b41659f667468a6115cf3)
[09:08:57] (update) dcaro: poetry: Autoupdate [repos/cloud/toolforge/toolforge-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-cli/-/merge_requests/21 (owner: group_203_bot_4866fc124f4b41659f667468a6115cf3)
[09:09:00] (merge) dcaro: poetry: Autoupdate [repos/cloud/toolforge/toolforge-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-cli/-/merge_requests/21 (owner: group_203_bot_4866fc124f4b41659f667468a6115cf3)
[09:11:20] (approved) dcaro: utils/update_component.sh: fail if no chartVersion tag was found at all [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/378 (owner: aborrero)
[09:17:51] Tool-Global-user-contributions, Stewards-and-global-tools, XTools, Epic, Temporary accounts (Create/update essential tools/anti-abuse management): Investigate: How to make the GUC query performant - https://phabricator.wikimedia.org/T355672#9948515 (Tchanders) Open→Resolved a:T...
[09:19:35] Tool-Global-user-contributions, Stewards-and-global-tools, XTools, Epic, Temporary accounts (Create/update essential tools/anti-abuse management): Investigate: How to make the GUC query performant - https://phabricator.wikimedia.org/T355672#9948520 (kostajh) We are going with {T368151}
[09:26:32] Tools, Wikidata, Wikidata Dev Team, wmde-wikidata-tech: [GENERAL] Deprecate connecting senses prototype - https://phabricator.wikimedia.org/T351829#9948596 (Arian_Bozorg)
[09:26:38] Tool-Global-user-contributions, Stewards-and-global-tools, Epic, Temporary accounts (Create/update essential tools/anti-abuse management): [Epic] Implement global user contributions feature - https://phabricator.wikimedia.org/T337089#9948597 (kostajh)
[09:26:42] Tool-Global-user-contributions, Stewards-and-global-tools, Epic, Temporary accounts (Create/update essential tools/anti-abuse management): [Epic] Implement global user contributions feature - https://phabricator.wikimedia.org/T337089#9948598 (kostajh)
[09:27:53] (update) dcaro: poetry: Autoupdate [repos/cloud/toolforge/api-gateway] - https://gitlab.wikimedia.org/repos/cloud/toolforge/api-gateway/-/merge_requests/25 (owner: group_203_bot_4866fc124f4b41659f667468a6115cf3)
[09:28:25] (update) dcaro: pre-commit: Autoupdate [repos/cloud/toolforge/volume-admission] - https://gitlab.wikimedia.org/repos/cloud/toolforge/volume-admission/-/merge_requests/11 (owner: group_203_bot_4866fc124f4b41659f667468a6115cf3)
[09:28:27] cloud-services-team (FY2023/2024-Q3-Q4), Cloud-VPS, DC-Ops, ops-eqiad, SRE: cloudcephosd1021-1034: hard drive sector errors increasing - https://phabricator.wikimedia.org/T348643#9948601 (dcaro) Created the data: ` dcaro@cumin1002:~$ sudo cumin -x cloudcephosd[1006,1016,1021].eqiad.wmnet,clou...
[09:28:32] (approved) dcaro: pre-commit: Autoupdate [repos/cloud/toolforge/volume-admission] - https://gitlab.wikimedia.org/repos/cloud/toolforge/volume-admission/-/merge_requests/11 (owner: group_203_bot_4866fc124f4b41659f667468a6115cf3)
[09:28:49] (approved) dcaro: pre-commit: Autoupdate [repos/cloud/toolforge/envvars-admission] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-admission/-/merge_requests/5 (owner: group_203_bot_4866fc124f4b41659f667468a6115cf3)
[09:28:52] (merge) dcaro: pre-commit: Autoupdate [repos/cloud/toolforge/envvars-admission] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-admission/-/merge_requests/5 (owner: group_203_bot_4866fc124f4b41659f667468a6115cf3)
[09:29:40] (merge) dcaro: poetry: Autoupdate [repos/cloud/toolforge/api-gateway] - https://gitlab.wikimedia.org/repos/cloud/toolforge/api-gateway/-/merge_requests/25 (owner: group_203_bot_4866fc124f4b41659f667468a6115cf3)
[09:31:07] (merge) dcaro: pre-commit: Autoupdate [repos/cloud/toolforge/volume-admission] - https://gitlab.wikimedia.org/repos/cloud/toolforge/volume-admission/-/merge_requests/11 (owner: group_203_bot_4866fc124f4b41659f667468a6115cf3)
[09:31:39] (open) project_1317_bot_df3177307bed93c3f34e421e26c86e38: api-gateway: bump to 0.0.27-20240703092947-1dcdf608 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/379
[09:33:10] (update) sstefanova: api: remove unprefixed endpoints [repos/cloud/toolforge/envvars-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-api/-/merge_requests/33 (https://phabricator.wikimedia.org/T363808)
[09:34:02] (open) project_1317_bot_df3177307bed93c3f34e421e26c86e38: volume-admission: bump to 0.0.49-20240703093120-2a4ec3c3 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/380
[09:37:45] (update) sstefanova: api: remove unprefixed endpoints [repos/cloud/toolforge/envvars-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-api/-/merge_requests/33 (https://phabricator.wikimedia.org/T363808)
[09:41:48] cloud-services-team (FY2023/2024-Q3-Q4), Cloud-VPS, DC-Ops, ops-eqiad, SRE: cloudcephosd1021-1034: hard drive sector errors increasing - https://phabricator.wikimedia.org/T348643#9948660 (dcaro) Compare the traffic generated when the cluster is rebalancing some data: {F56198741} :/
[09:50:55] !log dcaro@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.k8s.component.deploy for component volume-admission
[09:51:07] !log dcaro@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component volume-admission
[09:59:06] !log dcaro@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.component.deploy for component volume-admission
[09:59:17] !log dcaro@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component volume-admission
[10:12:28] cloud-services-team (FY2023/2024-Q3-Q4), Cloud-VPS, DC-Ops, ops-eqiad, SRE: cloudcephosd1021-1034: hard drive sector errors increasing - https://phabricator.wikimedia.org/T348643#9948751 (dcaro) Ok, using 16 nodes, with 64 parallel operations each still does not trigger any issues on the driv...
[10:14:47] (approved) dcaro: volume-admission: bump to 0.0.49-20240703093120-2a4ec3c3 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/380 (owner: project_1317_bot_df3177307bed93c3f34e421e26c86e38)
[10:14:49] (merge) dcaro: volume-admission: bump to 0.0.49-20240703093120-2a4ec3c3 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/380 (owner: project_1317_bot_df3177307bed93c3f34e421e26c86e38)
[10:15:08] (update) dcaro: api-gateway: bump to 0.0.27-20240703092947-1dcdf608 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/379 (owner: project_1317_bot_df3177307bed93c3f34e421e26c86e38)
[10:15:19] !log dcaro@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.k8s.component.deploy for component api-gateway
[10:15:30] !log dcaro@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component api-gateway
[10:20:41] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[10:21:17] !log dcaro@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.component.deploy for component api-gateway
[10:21:29] !log dcaro@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component api-gateway
[10:22:28] (approved) dcaro: api-gateway: bump to 0.0.27-20240703092947-1dcdf608 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/379 (owner: project_1317_bot_df3177307bed93c3f34e421e26c86e38)
[10:22:31] (merge) dcaro: api-gateway: bump to 0.0.27-20240703092947-1dcdf608 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/379 (owner: project_1317_bot_df3177307bed93c3f34e421e26c86e38)
[10:31:44] Cloud-Services, superset.wmcloud.org: Analysis and metrics collection for quarry and superset adoption - https://phabricator.wikimedia.org/T369150 (joanna_borun) NEW The #Cloud-Services project tag is not intended to have any tasks. Please check the list on https://phabricator.wikimedia.org/project/p...
[11:00:42] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[11:01:20] (open) dcaro: toolforge_get_versions: add some color when versions are weird [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/156
[11:01:39] (update) dcaro: toolforge_get_versions: add some color when versions are weird [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/156
[11:13:56] (update) dcaro: toolforge_get_versions: add some color when versions are weird [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/156
[11:20:20] (update) dcaro: toolforge_get_versions: add some color when versions are weird [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/156
[11:22:51] FIRING: ProbeDown: Service tools-static-15:80 has failed probes (http_tools_static_wmflabs_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-static-15:80 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[11:27:51] RESOLVED: ProbeDown: Service tools-static-15:80 has failed probes (http_tools_static_wmflabs_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-static-15:80 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[11:34:55] (approved) dcaro: pre-commit: Autoupdate [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/100 (owner: group_203_bot_4866fc124f4b41659f667468a6115cf3)
[11:34:59] (merge) dcaro: pre-commit: Autoupdate [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/100 (owner: group_203_bot_4866fc124f4b41659f667468a6115cf3)
[11:35:14] (approved) dcaro: pre-commit: Autoupdate [repos/cloud/toolforge/envvars-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-cli/-/merge_requests/49 (owner: group_203_bot_4866fc124f4b41659f667468a6115cf3)
[11:35:18] (merge) dcaro: pre-commit: Autoupdate [repos/cloud/toolforge/envvars-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-cli/-/merge_requests/49 (owner: group_203_bot_4866fc124f4b41659f667468a6115cf3)
[11:38:56] (open) project_1317_bot_df3177307bed93c3f34e421e26c86e38: jobs-api: bump to 0.0.312-20240703113513-fb748479 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/381
[11:44:53] (open) sstefanova: api: rename params for clarity [repos/cloud/toolforge/builds-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-api/-/merge_requests/100
[11:49:29] (approved) dcaro: poetry: Autoupdate [repos/cloud/toolforge/builds-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-cli/-/merge_requests/76 (owner: group_203_bot_4866fc124f4b41659f667468a6115cf3)
[11:49:33] (merge) dcaro: poetry: Autoupdate [repos/cloud/toolforge/builds-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-cli/-/merge_requests/76 (owner: group_203_bot_4866fc124f4b41659f667468a6115cf3)
[12:01:13] Toolforge (Toolforge iteration 12), Patch-For-Review: [builds-api, builds-cli] Prefix all endpoints with `/tool/` - https://phabricator.wikimedia.org/T363808#9949147 (Slst2020)
[12:04:50] Toolforge (Toolforge iteration 12), Patch-For-Review: [builds-api, builds-cli] Prefix all endpoints with `/tool/` - https://phabricator.wikimedia.org/T363808#9949150 (Slst2020) In progress→Resolved
[12:05:00] (open) dcaro: show package origin mr [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/157
[12:06:36] (update) dcaro: show package origin mr [repos/cloud/toolforge/lima-kilo] (version_with_colors) - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/157
[12:07:08] (update) dcaro: show package origin mr [repos/cloud/toolforge/lima-kilo] (version_with_colors) - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/157
[12:10:13] Data-Services, Data-Persistence, Data-Platform-SRE (2024.06.17 - 2024.07.07), Patch-For-Review: Modify db-mysql to connect to an-redacteddb1001 from cumin hosts - https://phabricator.wikimedia.org/T368354#9949169 (BTullis) I have merged https://gitlab.wikimedia.org/repos/sre/wmfdb/-/merge_request...
[12:13:48] (update) dcaro: poetry: Autoupdate [repos/cloud/toolforge/envvars-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-cli/-/merge_requests/48 (owner: group_203_bot_4866fc124f4b41659f667468a6115cf3)
[12:14:20] (approved) dcaro: poetry: Autoupdate [repos/cloud/toolforge/envvars-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-cli/-/merge_requests/48 (owner: group_203_bot_4866fc124f4b41659f667468a6115cf3)
[12:14:23] (update) dcaro: poetry: Autoupdate [repos/cloud/toolforge/envvars-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-cli/-/merge_requests/48 (owner: group_203_bot_4866fc124f4b41659f667468a6115cf3)
[12:14:56] (merge) dcaro: poetry: Autoupdate [repos/cloud/toolforge/envvars-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-cli/-/merge_requests/48 (owner: group_203_bot_4866fc124f4b41659f667468a6115cf3)
[12:18:12] Data-Services, Data-Persistence, Data-Platform-SRE (2024.06.17 - 2024.07.07), Patch-For-Review: Modify db-mysql to connect to an-redacteddb1001 from cumin hosts - https://phabricator.wikimedia.org/T368354#9949200 (Marostegui) @ABran-WMF could you assist @BTullis with this ^? Thanks!
[12:19:18] !log dcaro@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.k8s.component.deploy for component jobs-api
[12:19:29] !log dcaro@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component jobs-api
[12:22:17] !log dcaro@urcuchillay admin START - Cookbook wmcs.ceph.osd.bootstrap_and_add (T309789)
[12:22:22] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[12:22:22] T309789: [ceph] Upgrade hosts to bullseye - https://phabricator.wikimedia.org/T309789
[12:23:31] cloud-services-team (FY2023/2024-Q3-Q4), Cloud-VPS, DC-Ops, ops-eqiad, SRE: cloudcephosd1021-1034: hard drive sector errors increasing - https://phabricator.wikimedia.org/T348643#9949203 (dcaro) I'll try adding the `sdc` drive to `cloudcephosd1034`, that should force it to get populated with...
[12:24:32] cloud-services-team (FY2023/2024-Q3-Q4), Cloud-VPS, DC-Ops, ops-eqiad, SRE: cloudcephosd1021-1034: hard drive sector errors increasing - https://phabricator.wikimedia.org/T348643#9949207 (dcaro) Current error counters (before adding `sdc`): ` root@cloudcephosd1034:~# for i in /dev/sd?; do ech...
[12:25:26] !log dcaro@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.component.deploy for component jobs-api
[12:25:38] !log dcaro@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component jobs-api
[12:28:32] !log dcaro@urcuchillay admin END (FAIL) - Cookbook wmcs.ceph.osd.bootstrap_and_add (exit_code=99) (T309789)
[12:28:37] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[12:28:37] T309789: [ceph] Upgrade hosts to bullseye - https://phabricator.wikimedia.org/T309789
[12:29:10] FIRING: CephClusterInWarning: Ceph cluster in eqiad is in warning status - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephClusterInWarning - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephClusterInWarning
[12:34:10] RESOLVED: CephClusterInWarning: Ceph cluster in eqiad is in warning status - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephClusterInWarning - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephClusterInWarning
[12:38:25] (approved) dcaro: api: remove unprefixed endpoints [repos/cloud/toolforge/envvars-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-api/-/merge_requests/33 (https://phabricator.wikimedia.org/T363808) (owner: sstefanova)
[12:38:33] cloud-services-team, Toolforge (Toolforge iteration 12): [infra,k8s] Upgrade Toolforge Kubernetes to version 1.25 - https://phabricator.wikimedia.org/T316107#9949250 (aborrero)
[12:41:48] (approved) dcaro: jobs-api: bump to 0.0.312-20240703113513-fb748479 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/381 (owner: project_1317_bot_df3177307bed93c3f34e421e26c86e38)
[12:41:51] (merge) dcaro: jobs-api: bump to 0.0.312-20240703113513-fb748479 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/381 (owner: project_1317_bot_df3177307bed93c3f34e421e26c86e38)
[12:41:52] (update) dcaro: jobs-api: bump to 0.0.312-20240703113513-fb748479 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/381 (owner: project_1317_bot_df3177307bed93c3f34e421e26c86e38)
[12:42:15] cloud-services-team, Toolforge (Toolforge iteration 12): toolforge: prepare deb packages for k8s 1.25 - https://phabricator.wikimedia.org/T369163 (aborrero) NEW
[12:42:19] (update) dcaro: poetry: Autoupdate [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/99 (owner: group_203_bot_4866fc124f4b41659f667468a6115cf3)
[12:46:49] cloud-services-team, Toolforge: toolforge: Upgrade all Kubernetes components to versions supporting Kubernetes 1.25 - https://phabricator.wikimedia.org/T329671#9949293 (aborrero)
[12:46:51] cloud-services-team, Toolforge: toolforge: upgrade all Kubernetes components to versions supporting Kubernetes 1.25 - https://phabricator.wikimedia.org/T329671#9949301 (aborrero)
[12:48:53] cloud-services-team, Toolforge (Toolforge iteration 12): toolforge: review k8s API usage by custom components for 1.25 upgrade - https://phabricator.wikimedia.org/T369164 (aborrero) NEW
[12:49:15] (merge) sstefanova: api: remove unprefixed endpoints [repos/cloud/toolforge/envvars-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-api/-/merge_requests/33 (https://phabricator.wikimedia.org/T363808)
[12:49:46] cloud-services-team, Toolforge (Toolforge iteration 12): toolforge: review k8s API usage by custom components for 1.25 upgrade - https://phabricator.wikimedia.org/T369164#9949322 (aborrero)
[12:50:32] cloud-services-team, Toolforge: toolforge: upgrade all Kubernetes components to versions supporting Kubernetes 1.25 - https://phabricator.wikimedia.org/T329671#9949323 (aborrero)
[12:51:01] cloud-services-team, Toolforge (Toolforge iteration 12): toolforge: upgrade all Kubernetes components to versions supporting Kubernetes 1.25 - https://phabricator.wikimedia.org/T329671#9949325 (aborrero)
[12:51:41] cloud-services-team, Toolforge (Toolforge iteration 12): toolforge: upgrade lima-kilo for kubernetes 1.25 - https://phabricator.wikimedia.org/T369165 (aborrero) NEW
[12:52:28] (update) sstefanova: toolforge_get_versions: add some color when versions are weird [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/156 (owner: dcaro)
[12:53:02] Data-Services, Data-Persistence, Data-Platform-SRE (2024.06.17 - 2024.07.07), Patch-For-Review: Modify db-mysql to connect to an-redacteddb1001 from cumin hosts - https://phabricator.wikimedia.org/T368354#9949328 (ABran-WMF) @Marostegui yep no worries! @BTullis you can remove and create `dgit/$di...
[12:53:41] cloud-services-team, Toolforge (Toolforge iteration 12): toolforge: refresh kubernetes cookbooks for the 1.25 upgrade - https://phabricator.wikimedia.org/T369166 (aborrero) NEW
[12:53:44] cloud-services-team, Toolforge (Toolforge iteration 12): toolforge: upgrade lima-kilo for kubernetes 1.25 - https://phabricator.wikimedia.org/T369165#9949361 (Slst2020) a:Slst2020
[12:53:51] (open) project_1317_bot_df3177307bed93c3f34e421e26c86e38: envvars-api: bump to 0.0.51-20240703124925-5829df80 [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/382
[12:54:07] cloud-services-team, Toolforge (Toolforge iteration 12): toolforge: upgrade toolsbeta to k8s 1.25 - https://phabricator.wikimedia.org/T369167 (aborrero) NEW
[12:55:06] cloud-services-team, Toolforge (Toolforge iteration 12): toolsbeta: upgrade control plane nodes to k8s 1.25 - https://phabricator.wikimedia.org/T369168 (aborrero) NEW
[12:55:13] (approved) sstefanova: toolforge_get_versions: add some color when versions are weird [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/156 (owner: dcaro)
[12:55:33] cloud-services-team, Toolforge (Toolforge iteration 12): toolsbeta: upgrade data plane nodes to k8s 1.25 - https://phabricator.wikimedia.org/T369170 (aborrero) NEW
[12:55:47] (update) sstefanova: toolforge_get_versions: add some color when versions are weird [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/156 (owner: dcaro)
[12:56:52] cloud-services-team, Toolforge (Toolforge iteration 12): toolforge: upgrade worker nodes to k8s 1.25 - https://phabricator.wikimedia.org/T369171 (aborrero) NEW
[12:57:30] cloud-services-team, Toolforge (Toolforge iteration 12): toolforge: upgrade control plane nodes to k8s 1.25 - https://phabricator.wikimedia.org/T369172 (aborrero) NEW
[13:00:31] Data-Services: SQL function to recover the normal hostname, to install on Wiki Replica instances - https://phabricator.wikimedia.org/T344877#9949460 (Ladsgroup) I don't know if MariaDB could allow a function being defined but not used in WHERE (and only in SELECT for example). Not to mention that someone cou...
[13:04:13] Data-Services: SQL function to recover the normal hostname, to install on Wiki Replica instances - https://phabricator.wikimedia.org/T344877#9949471 (Ladsgroup) To emphasize. It's be nice to have that function in quarry/superset itself and deal with it in those services but not on mariadb level.
[13:07:03] cloud-services-team, Toolforge (Toolforge iteration 12): toolforge: upgrade all Kubernetes components to versions supporting Kubernetes 1.25 - https://phabricator.wikimedia.org/T329671#9949494 (Slst2020) ` [~/repos/work/toolforge/toolforge-deploy (upgrade-k8s-to-1.25)] % git grep kubeVersion: '*tools.yam...
[13:07:56] cloud-services-team, Data-Services, SRE: [wikireplicas] Make sure there is no sensitive data in clouddb hosts - https://phabricator.wikimedia.org/T368136#9949511 (Ladsgroup) >>! In T368136#9935249, @fnegri wrote: > Can we somehow remove the data that is currently filtered at the view layer, and inste...
[13:07:59] cloud-services-team, Toolforge (Toolforge iteration 12): toolforge: upgrade all Kubernetes components to versions supporting Kubernetes 1.25 - https://phabricator.wikimedia.org/T329671#9949512 (Slst2020)
[13:14:12] cloud-services-team, Toolforge (Toolforge iteration 12): toolforge: upgrade all Kubernetes components to versions supporting Kubernetes 1.25 - https://phabricator.wikimedia.org/T329671#9949525 (Slst2020) Per the docs, cert-manager v.1.11.0 is compatible with Kubernetes version >= 1.21.0-0 https://artifac...
[13:24:12] cloud-services-team, Toolforge (Toolforge iteration 12): toolforge: upgrade all Kubernetes components to versions supporting Kubernetes 1.25 - https://phabricator.wikimedia.org/T329671#9949568 (Slst2020)
[13:32:35] cloud-services-team, Toolforge (Toolforge iteration 12): toolforge: upgrade all Kubernetes components to versions supporting Kubernetes 1.25 - https://phabricator.wikimedia.org/T329671#9949585 (Slst2020)
[13:34:15] cloud-services-team, Toolforge (Toolforge iteration 12): toolforge: upgrade all Kubernetes components to versions supporting Kubernetes 1.25 - https://phabricator.wikimedia.org/T329671#9949588 (Slst2020) ` [~/repos/work/toolforge/builds-api (main)] % grep "tekton" go.mod github.com/tektoncd/pipeline v0....
[13:40:24] Data-Services: SQL function to recover the normal hostname, to install on Wiki Replica instances - https://phabricator.wikimedia.org/T344877#9949627 (fnegri) > I don't know if MariaDB could allow a function being defined but not used in WHERE (and only in SELECT for example) But you can define a function in...
[13:41:56] cloud-services-team, Cloud-VPS, collaboration-services: VMs in Cloud VPS share the same machine-id - https://phabricator.wikimedia.org/T351507#9949631 (Andrew) Hello @jelto -- assuming that this fix requires me to rebuild debian base images, do you need new Bullseye and Bookworm both, or just one or...
[13:48:34] cloud-services-team, Cloud-VPS, collaboration-services: VMs in Cloud VPS share the same machine-id - https://phabricator.wikimedia.org/T351507#9949638 (Jelto) >>! In T351507#9949631, @Andrew wrote: > Hello @jelto -- assuming that this fix requires me to rebuild debian base images, do you need new Bul...
[13:50:35] cloud-services-team, Data-Services, SRE: [wikireplicas] Make sure there is no sensitive data in clouddb hosts - https://phabricator.wikimedia.org/T368136#9949639 (fnegri) Open→Declined > I highly doubt it'd be possible honestly for everything. I tend to agree, I underestimated the amount...
[13:50:42] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[14:00:42] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[14:06:07] Cloud-VPS (Debian Buster Deprecation), Beta-Cluster-Infrastructure: Migrate deployment-prep away from Debian Buster to Bullseye/Bookworm - https://phabricator.wikimedia.org/T327742#9949679 (Andrew) >>! In T327742#9812917, @elukey wrote: > I have asked something on IRC's `wikimedia-cloud` about a way to s...
[14:22:00] cloud-services-team (FY2023/2024-Q3-Q4), Data-Services, Goal: [toolsdb] Migrate quickstatements db to Trove - https://phabricator.wikimedia.org/T369177 (fnegri) NEW
[14:23:38] Cloud-VPS (Debian Buster Deprecation), Beta-Cluster-Infrastructure: Migrate deployment-prep away from Debian Buster to Bullseye/Bookworm - https://phabricator.wikimedia.org/T327742#9949756 (elukey) >>! In T327742#9949679, @Andrew wrote: >>>! In T327742#9812917, @elukey wrote: >> I have asked something on...
[14:25:17] cloud-services-team, Technical-blog-posts: Tech blog post: "Wikimedia Toolforge: migrating Kubernetes from PodSecurityPolicy to kyverno" - https://phabricator.wikimedia.org/T368948#9949774 (debt) whoops, see attached file, please! It looks prettier on wordpress, but this should give you a good idea. {F56...
[14:35:54] cloud-services-team (FY2023/2024-Q3-Q4), Data-Services, Goal: [toolsdb] Migrate quickstatements db to Trove - https://phabricator.wikimedia.org/T369177#9949838 (fnegri) p:Triage→Medium
[14:38:44] cloud-services-team, Data-Services, Toolforge: toolsdb: evaluate storage usage by some tools - https://phabricator.wikimedia.org/T301967#9949843 (fnegri) p:Triage→Low DB storage is tracked in the subtask {T291782} NFS storage does not seem to be an immediate issue, but we should probably che...
[14:43:05] cloud-services-team (FY2023/2024-Q3-Q4), Data-Services, Epic: Migrate largest ToolsDB users to Trove - https://phabricator.wikimedia.org/T291782#9949872 (fnegri)
[14:43:15] cloud-services-team (FY2023/2024-Q3-Q4), Data-Services, Goal: [toolsdb] Migrate mixnmatch db to Trove - https://phabricator.wikimedia.org/T350862#9949873 (fnegri)
[14:53:19] (update) sstefanova: [jobs-api] fix issues in openapi schema [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/96 (owner: raymond-ndibe)
[14:53:25] (approved) sstefanova: [jobs-api] fix issues in openapi schema [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/96 (owner: raymond-ndibe)
[15:01:51] (update) sstefanova: [jobs-api] fix issues in openapi schema [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/96 (owner: raymond-ndibe)
[15:04:59] Quarry, superset.wmcloud.org: Analysis and metrics collection for quarry and superset adoption - 
https://phabricator.wikimedia.org/T369150#9949976 (JJMC89)
[15:08:08] (update) raymond-ndibe: [lima-kilo] update bookworm arm64 image [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/155
[15:09:18] (update) raymond-ndibe: [lima-kilo] update bookworm arm64 image [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/155
[15:27:32] cloud-services-team (FY2023/2024-Q3-Q4), Cloud-VPS, DC-Ops, ops-eqiad, SRE: cloudcephosd1021-1034: hard drive sector errors increasing - https://phabricator.wikimedia.org/T348643#9950063 (dcaro) The osd is now in, no changes in the error counter: ` root@cloudcephosd1034:~# for i in /dev/sd?;...
[15:27:36] (update) fnegri: [lima-kilo] update bookworm arm64 image [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/155 (owner: raymond-ndibe)
[15:27:41] (approved) fnegri: [lima-kilo] update bookworm arm64 image [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/155 (owner: raymond-ndibe)
[15:46:30] (update) aborrero: utils/update_component.sh: fail if no chartVersion tag was found at all [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/378
[15:46:54] (merge) aborrero: utils/update_component.sh: fail if no chartVersion tag was found at all [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/378
[15:50:12] (update) dcaro: show package origin mr [repos/cloud/toolforge/lima-kilo] (version_with_colors) - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/157
[15:55:00] cloud-services-team, Toolforge (Toolforge iteration 12): toolforge: upgrade all Kubernetes 
components to versions supporting Kubernetes 1.25 - https://phabricator.wikimedia.org/T329671#9950216 (aborrero) hey @Slst2020 maybe we can translate that cert-manager and tekton version information into `kubeVersi...
[16:04:00] !log dcaro@urcuchillay admin START - Cookbook wmcs.ceph.osd.depool_and_destroy (T309789)
[16:04:06] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[16:04:07] T309789: [ceph] Upgrade hosts to bullseye - https://phabricator.wikimedia.org/T309789
[16:04:17] cloud-services-team (FY2023/2024-Q3-Q4), Cloud-VPS, Cloud-Services-Origin-Team, Cloud-Services-Worktype-Maintenance, Goal: [ceph] Upgrade hosts to bullseye - https://phabricator.wikimedia.org/T309789#9950278 (dcaro)
[16:26:26] (open) andrew: Add special flavor for wmcs-image-create [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/12
[16:26:49] (merge) andrew: Add special flavor for wmcs-image-create [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/12
[16:30:40] cloud-services-team, Toolforge (Toolforge iteration 12): toolforge: upgrade all Kubernetes components to versions supporting Kubernetes 1.25 - https://phabricator.wikimedia.org/T329671#9950432 (dcaro) +1 yep, we should do a pass to all the components that don't have it and add it (when the kubeVersion wa...
[16:33:07] Data-Services, Data-Persistence, Data-Platform-SRE (2024.06.17 - 2024.07.07), Patch-For-Review: Modify db-mysql to connect to an-redacteddb1001 from cumin hosts - https://phabricator.wikimedia.org/T368354#9950437 (BTullis) Thanks. I think I have done that, so the CI jobs have run. Do I have to do...
[16:36:33] cloud-services-team (FY2023/2024-Q3-Q4), Data-Services: [wikireplicas] frequent replag spikes in clouddb hosts - https://phabricator.wikimedia.org/T367778#9950469 (fnegri) Two days later, it's still not looking great: {F56203636} There are several long queries currently running, from several different...
[16:51:07] cloud-services-team (FY2023/2024-Q3-Q4), superset.wmcloud.org: Allow Superset to query ToolsDB public databases - https://phabricator.wikimedia.org/T367393#9950578 (github-toolforge-bot) dhinus opened https://github.com/toolforge/superset-deploy/pull/26
[16:51:15] dhinus opened https://github.com/toolforge/superset-deploy/pull/26
[16:57:36] cloud-services-team (FY2023/2024-Q3-Q4), superset.wmcloud.org: Allow Superset to query ToolsDB public databases - https://phabricator.wikimedia.org/T367393#9950656 (fnegri) I created the user and grants in ToolsDB, similar to the Quarry ones I created in T348407. ` MariaDB [(none)]> CREATE USER 'superse...
[16:58:27] cloud-services-team (FY2023/2024-Q3-Q4), Data-Services, Quarry: Allow Quarry to query ToolsDB public databases - https://phabricator.wikimedia.org/T348407#9950680 (fnegri) Additional clean-up: I removed the grant for `heartbeat_p` as that is already implied in the grant for `%\_p`. ` MariaDB [(n...
[17:11:53] (open) dcaro: openapi_version_bump: add pre-commit hook [repos/cloud/cicd/gitlab-ci] - https://gitlab.wikimedia.org/repos/cloud/cicd/gitlab-ci/-/merge_requests/41 (https://phabricator.wikimedia.org/T356972)
[17:17:46] Toolforge (Toolforge iteration 12): [cicd,infra] the pre-commit updater job fails trying to push to builds-admission - https://phabricator.wikimedia.org/T369188 (dcaro) NEW
[17:18:18] (update) dcaro: openapi_version_bump: add pre-commit hook [repos/cloud/cicd/gitlab-ci] - https://gitlab.wikimedia.org/repos/cloud/cicd/gitlab-ci/-/merge_requests/41 (https://phabricator.wikimedia.org/T356972)
[17:23:51] (update) dcaro: openapi_version_bump: add pre-commit hook [repos/cloud/cicd/gitlab-ci] - https://gitlab.wikimedia.org/repos/cloud/cicd/gitlab-ci/-/merge_requests/41 (https://phabricator.wikimedia.org/T356972)
[17:27:40] (update) dcaro: openapi_version_bump: add pre-commit hook [repos/cloud/cicd/gitlab-ci] - https://gitlab.wikimedia.org/repos/cloud/cicd/gitlab-ci/-/merge_requests/41 (https://phabricator.wikimedia.org/T356972)
[17:28:42] !log dcaro@urcuchillay admin END (PASS) - Cookbook wmcs.ceph.osd.depool_and_destroy (exit_code=0) (T309789)
[17:28:48] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[17:28:48] T309789: [ceph] Upgrade hosts to bullseye - https://phabricator.wikimedia.org/T309789
[17:32:41] (update) dcaro: openapi_version_bump: add pre-commit hook [repos/cloud/cicd/gitlab-ci] - https://gitlab.wikimedia.org/repos/cloud/cicd/gitlab-ci/-/merge_requests/41 (https://phabricator.wikimedia.org/T356972)
[17:34:25] Toolforge: Allow custom tagging of build service generated images - https://phabricator.wikimedia.org/T369192 (bd808) NEW
[17:42:35] (open) dcaro: ci: add check-openapi-version-bump pre-commit hook [repos/cloud/toolforge/envvars-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-api/-/merge_requests/37 
(https://phabricator.wikimedia.org/T356972)
[17:48:29] (open) dcaro: create_mrs: skip archived repos [repos/cloud/cicd/gitlab-ci] - https://gitlab.wikimedia.org/repos/cloud/cicd/gitlab-ci/-/merge_requests/42 (https://phabricator.wikimedia.org/T369188)
[17:48:43] Toolforge (Toolforge iteration 12), Patch-For-Review: [cicd,infra] the pre-commit updater job fails trying to push to builds-admission - https://phabricator.wikimedia.org/T369188#9951002 (dcaro) p:Triage→Medium
[18:50:41] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[19:30:42] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[19:39:14] (approved) aborrero: create_mrs: skip archived repos [repos/cloud/cicd/gitlab-ci] - https://gitlab.wikimedia.org/repos/cloud/cicd/gitlab-ci/-/merge_requests/42 (https://phabricator.wikimedia.org/T369188) (owner: dcaro)
[19:49:13] cloud-services-team, Technical-blog-posts: Tech blog post: "Wikimedia Toolforge: migrating Kubernetes from PodSecurityPolicy to kyverno" - https://phabricator.wikimedia.org/T368948#9951522 (aborrero) LGTM, thanks @debt, feel free to publish.
[19:52:56] (open) raymond-ndibe: [jobs-cli] refactor error handling [repos/cloud/toolforge/jobs-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/48
[20:00:21] (update) raymond-ndibe: [jobs-cli] refactor error handling [repos/cloud/toolforge/jobs-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/48
[20:03:00] (update) raymond-ndibe: [jobs-cli] refactor error handling [repos/cloud/toolforge/jobs-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/48
[20:20:42] FIRING: CloudVPSDesignateLeaks: Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[20:30:42] RESOLVED: CloudVPSDesignateLeaks: Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[20:54:03] cloud-services-team, Technical-blog-posts: Tech blog post: "Wikimedia Toolforge: migrating Kubernetes from PodSecurityPolicy to kyverno" - https://phabricator.wikimedia.org/T368948#9951719 (debt) Open→Resolved Published, thanks for the writing! :) https://techblog.wikimedia.org/2024/07/03/wi...
[21:05:46] (update) raymond-ndibe: [jobs-cli] refactor error handling [repos/cloud/toolforge/jobs-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/48
[21:06:16] (close) raymond-ndibe: [jobs-cli] refactor handle_http_exception [repos/cloud/toolforge/jobs-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/46
[21:06:46] (update) raymond-ndibe: Draft: [jobs-cli] refactor job validation [repos/cloud/toolforge/jobs-cli] (refactor_error_handling) - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/45 (https://phabricator.wikimedia.org/T366209)
[21:09:24] (update) raymond-ndibe: [jobs-cli] refactor error handling [repos/cloud/toolforge/jobs-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/48
[21:28:22] (update) raymond-ndibe: Draft: [jobs-cli] refactor job validation [repos/cloud/toolforge/jobs-cli] (refactor_handle_http_exception) - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/45 (https://phabricator.wikimedia.org/T366209)
[21:41:24] (update) raymond-ndibe: Draft: [jobs-cli] refactor job validation [repos/cloud/toolforge/jobs-cli] (refactor_handle_http_exception) - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/45 (https://phabricator.wikimedia.org/T366209)
[21:41:56] (update) raymond-ndibe: [jobs-cli] refactor job validation [repos/cloud/toolforge/jobs-cli] (refactor_error_handling) - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/45 (https://phabricator.wikimedia.org/T366209)
[21:43:00] (update) raymond-ndibe: [jobs-cli] refactor job validation [repos/cloud/toolforge/jobs-cli] (refactor_error_handling) - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/45 (https://phabricator.wikimedia.org/T366209)
[21:43:46] (update) raymond-ndibe: [jobs-cli] refactor job validation 
[repos/cloud/toolforge/jobs-cli] (refactor_error_handling) - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/45 (https://phabricator.wikimedia.org/T366209)
[21:53:50] (update) raymond-ndibe: [jobs-cli] refactor job validation [repos/cloud/toolforge/jobs-cli] (refactor_error_handling) - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/45 (https://phabricator.wikimedia.org/T366209)
[22:26:10] (update) raymond-ndibe: [jobs-cli] refactor job validation [repos/cloud/toolforge/jobs-cli] (refactor_error_handling) - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/45 (https://phabricator.wikimedia.org/T366209)
[22:37:29] FIRING: PuppetStaleCertificates: Found non-revoked Puppet certificates for 4 deleted instances on toolsbeta-puppetserver-1 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates
[22:38:02] (update) raymond-ndibe: [jobs-cli] refactor job validation [repos/cloud/toolforge/jobs-cli] (refactor_error_handling) - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/45 (https://phabricator.wikimedia.org/T366209)
[22:44:23] cloud-services-team, Cloud-VPS, collaboration-services, Patch-For-Review: VMs in Cloud VPS share the same machine-id - https://phabricator.wikimedia.org/T351507#9952269 (Andrew) Open→Resolved I've rebuilt the bullseye and bookworm base images, and they should now produce a unique mach...
[22:46:26] cloud-services-team, Cloud-VPS (Debian Buster Deprecation), Beta-Cluster-Infrastructure, serviceops, Patch-For-Review: Replace deployment-memc[08-10] with Bullseye or Bookworm - https://phabricator.wikimedia.org/T361384#9952292 (Andrew) The three old VMs have been replaced by: deployment-mem...
[23:00:23] (update) raymond-ndibe: [jobs-cli] refactor job validation [repos/cloud/toolforge/jobs-cli] (refactor_error_handling) - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/45 (https://phabricator.wikimedia.org/T366209)
[23:19:41] FIRING: CloudVPSDesignateLeaks: Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[23:29:42] RESOLVED: CloudVPSDesignateLeaks: Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[23:39:04] (update) raymond-ndibe: [jobs-cli] refactor job validation [repos/cloud/toolforge/jobs-cli] (refactor_error_handling) - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/45 (https://phabricator.wikimedia.org/T366209)