[00:10:49] (TfInfraTestDestroyFailed) firing: Terraform failed to destroy the resources on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed
[00:10:49] (TfInfraTestApplyFailed) firing: Terraform failed to apply/create the resources on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestApplyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestApplyFailed
[01:15:38] (ProbeDown) firing: Service toolsbeta-test-k8s-haproxy-3:30000 has failed probes (http_this_tool_does_not_exist_beta_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#toolsbeta-test-k8s-haproxy-3:30000 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[01:20:38] (ProbeDown) resolved: Service toolsbeta-test-k8s-haproxy-3:30000 has failed probes (http_this_tool_does_not_exist_beta_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#toolsbeta-test-k8s-haproxy-3:30000 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[03:10:49] (TfInfraTestDestroyFailed) firing: Terraform failed to destroy the resources on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed
[03:10:49] (TfInfraTestApplyFailed) firing: Terraform failed to apply/create the resources on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestApplyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestApplyFailed
[06:10:49] (TfInfraTestDestroyFailed) firing: Terraform failed to destroy the resources on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed
[06:10:49] (TfInfraTestApplyFailed) firing: Terraform failed to apply/create the resources on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestApplyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestApplyFailed
[06:11:06] (CR) Eugene233: Bug:T357238. Updated the Help Page on the ISA Tool. (2 comments) [labs/tools/Isa] - https://gerrit.wikimedia.org/r/1006160 (owner: Ketulucas)
[06:11:45] (CR) Eugene233: "recheck" [labs/tools/Isa] - https://gerrit.wikimedia.org/r/1006160 (owner: Ketulucas)
[09:10:49] (TfInfraTestDestroyFailed) firing: Terraform failed to destroy the resources on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed
[09:10:49] (TfInfraTestApplyFailed) firing: Terraform failed to apply/create the resources on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestApplyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestApplyFailed
[09:26:11] !log aborrero@cloudcumin1001 tools START - Cookbook wmcs.toolforge.add_k8s_node for a control role in the tools cluster (T284656)
[09:26:16] T284656: Toolforge k8s: Migrate workers to Containerd and Bookworm - https://phabricator.wikimedia.org/T284656
[09:35:33] !log aborrero@cloudcumin1001 tools Added a new k8s control tools-k8s-control-9.tools.eqiad1.wikimedia.cloud to the cluster
[09:35:33] !log aborrero@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.add_k8s_node (exit_code=0) for a control role in the tools cluster
[09:53:59] !log aborrero@cloudcumin1001 tools START - Cookbook wmcs.toolforge.remove_k8s_node (T284656)
[09:54:04] T284656: Toolforge k8s: Migrate workers to Containerd and Bookworm - https://phabricator.wikimedia.org/T284656
[09:54:45] !log aborrero@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.remove_k8s_node (exit_code=0) (T284656)
[10:04:20] Tools, Wikidata, Wikidata Dev Team: [GENERAL] Deprecate connecting senses prototype - https://phabricator.wikimedia.org/T351829#9575706 (HasanAkgun_WMDE)
[10:22:35] cloud-services-team, wikitech.wikimedia.org, sre-alert-triage: Alert in need of triage: Wikitech-static MW version up to date (instance wikitech-static.wikimedia.org) - https://phabricator.wikimedia.org/T357880#9575766 (taavi) a:taavi
[10:36:47] Cloud-VPS (Project-requests): Create labs project for NonFreeWiki - https://phabricator.wikimedia.org/T108167#9575815 (Josve05a)
[10:41:41] (CloudVPSDesignateLeaks) firing: Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[10:46:41] (CloudVPSDesignateLeaks) firing: (2) Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[10:51:54] Toolforge (Toolforge iteration 06), cloud-services-team, Kubernetes, User-aborrero: toolforge k8s: some static pods needs manual restart - https://phabricator.wikimedia.org/T358476#9575850 (aborrero)
[10:52:22] Toolforge (Toolforge iteration 06), cloud-services-team, Kubernetes, Patch-For-Review, User-aborrero: Toolforge k8s: Migrate workers to Containerd and Bookworm - https://phabricator.wikimedia.org/T284656#9575860 (aborrero)
[10:55:01] Toolforge (Toolforge iteration 06), cloud-services-team, Kubernetes, Patch-For-Review, User-aborrero: Toolforge k8s: Migrate workers to Containerd and Bookworm - https://phabricator.wikimedia.org/T284656#9575861 (aborrero) In progress→Resolved This is done: `lang=shell-session aborre...
[10:55:08] Toolforge (Toolforge iteration 06), Patch-For-Review: Create a pool of NFS-less Toolforge Kubernetes workers - https://phabricator.wikimedia.org/T355883#9575863 (aborrero)
[10:55:13] Toolforge, cloud-services-team, Kubernetes: Migrate Toolforge Kubernetes hosts to Debian Bullseye or later - https://phabricator.wikimedia.org/T311908#9575864 (aborrero)
[10:55:18] Toolforge (Toolforge iteration 06), cloud-services-team, User-aborrero: Upgrade Toolforge Kubernetes to version 1.24 - https://phabricator.wikimedia.org/T307651#9575865 (aborrero)
[10:55:23] Toolforge, cloud-services-team: Fix the mis-named k8s service in tools and toolsbeta projects - https://phabricator.wikimedia.org/T262562#9575866 (aborrero)
[11:08:00] Cloud-VPS, cloud-services-team, Patch-For-Review: "HAProxy service mysql has no available backends" fires when galera primary is down - https://phabricator.wikimedia.org/T357406#9575889 (taavi) Open→Resolved a:taavi
[11:10:16] Cloud-VPS (Quota-requests): Request for more compute and storage for the GLAMS dashboard project - https://phabricator.wikimedia.org/T358477#9575902 (YonatanWMIL)
[11:25:55] cloud-services-team, wikitech.wikimedia.org, Patch-For-Review, sre-alert-triage: Alert in need of triage: Wikitech-static MW version up to date (instance wikitech-static.wikimedia.org) - https://phabricator.wikimedia.org/T357880#9576000 (taavi) Open→Resolved
[11:42:57] Toolforge (Toolforge iteration 06): [builds-api,jobs-api,envvars-api,api-gateway] FIgure out and document how to do non-backwards compatible changes - https://phabricator.wikimedia.org/T356974#9524110 (dcaro) a:dcaro
[11:43:22] Toolforge (Toolforge iteration 06): [builds-api,jobs-api,envvars-api,api-gateway] FIgure out and document how to do non-backwards compatible changes - https://phabricator.wikimedia.org/T356974#9576048 (dcaro)
[11:44:30] Toolforge (Toolforge iteration 06): [builds-api,jobs-api,envvars-api,api-gateway] FIgure out and document how to do non-backwards compatible changes - https://phabricator.wikimedia.org/T356974#9524110 (dcaro)
[11:44:34] Toolforge (Toolforge iteration 06), Patch-For-Review, User-aborrero: [toolforge API] expose all backend APIs OpenAPI specs - https://phabricator.wikimedia.org/T358100#9576052 (dcaro)
[11:45:07] Toolforge (Toolforge iteration 06): [builds-api,jobs-api,envvars-api,api-gateway] FIgure out and document how to do non-backwards compatible changes - https://phabricator.wikimedia.org/T356974#9576053 (dcaro)
[11:45:12] Toolforge (Toolforge iteration 06), Patch-For-Review, User-aborrero: [toolforge API] Investigate ways to present our multiple Openapi definitions to a future consolidated CLI client - https://phabricator.wikimedia.org/T354745#9576055 (dcaro)
[11:45:14] Toolforge (Toolforge iteration 06), Patch-For-Review, User-aborrero: [toolforge API] expose all backend APIs OpenAPI specs - https://phabricator.wikimedia.org/T358100#9563353 (dcaro)
[11:46:00] Toolforge (Toolforge iteration 06): [builds-api,jobs-api,envvars-api,api-gateway] FIgure out and document how to do non-backwards compatible changes - https://phabricator.wikimedia.org/T356974#9524110 (dcaro)
[11:46:04] Toolforge, cloud-services-team, Documentation, Kubernetes: Figure out and document how to call the Kubernetes API as your tool user from inside a pod - https://phabricator.wikimedia.org/T321919#9576059 (dcaro)
[11:46:07] Toolforge (Toolforge iteration 06), Patch-For-Review, User-aborrero: [toolforge API] Investigate ways to present our multiple Openapi definitions to a future consolidated CLI client - https://phabricator.wikimedia.org/T354745#9449702 (dcaro)
[11:46:48] Toolforge (Toolforge iteration 06): [builds-api,jobs-api,envvars-api,api-gateway] FIgure out and document how to do non-backwards compatible changes - https://phabricator.wikimedia.org/T356974#9576060 (CodeReviewBot) dcaro updated https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-api/-/merge_requests...
[11:47:31] Toolforge (Toolforge iteration 06), cloud-services-team, Kubernetes, User-aborrero: toolforge k8s: some static pods needs manual restart - https://phabricator.wikimedia.org/T358476#9576061 (aborrero) p:Triage→Medium
[12:07:55] (CR) Btullis: "Thanks Merlijn," [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/999007 (https://phabricator.wikimedia.org/T352783) (owner: Btullis)
[12:09:25] (CR) Btullis: [C: +2] Move #data-platform-sre announcements to a dedicated channel [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/999007 (https://phabricator.wikimedia.org/T352783) (owner: Btullis)
[12:10:31] (Merged) jenkins-bot: Move #data-platform-sre announcements to a dedicated channel [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/999007 (https://phabricator.wikimedia.org/T352783) (owner: Btullis)
[12:10:49] (TfInfraTestDestroyFailed) firing: Terraform failed to destroy the resources on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed
[12:10:49] (TfInfraTestApplyFailed) firing: Terraform failed to apply/create the resources on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestApplyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestApplyFailed
[12:11:26] Cloud-VPS, cloud-services-team: Rescue DBapp trove instance in glamwikidashboard project - https://phabricator.wikimedia.org/T355138#9576116 (YonatanWMIL) Took longer than expected, daily update is running now
[12:21:42] (CloudVPSDesignateLeaks) firing: (2) Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[12:23:27] Toolforge (Toolforge iteration 06): Upgrade Toolforge image builder to Bookworm - https://phabricator.wikimedia.org/T358483#9576131 (taavi)
[12:23:34] (DiskSpace) firing: Disk space cloudbackup1004:9100:/ 5.621% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=cloudbackup1004 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[12:25:18] (CR) CI reject: [V: -1] Localisation updates from https://translatewiki.net. [labs/tools/weapon-of-mass-description] - https://gerrit.wikimedia.org/r/1006514 (owner: L10n-bot)
[12:25:20] (CR) CI reject: [V: -1] Localisation updates from https://translatewiki.net. [labs/tools/watch-translations] - https://gerrit.wikimedia.org/r/1006515 (owner: L10n-bot)
[12:26:42] (CloudVPSDesignateLeaks) resolved: (2) Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[12:38:34] (DiskSpace) resolved: Disk space cloudbackup1004:9100:/ 5.799% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=cloudbackup1004 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[12:53:52] Toolforge (Toolforge iteration 06), Patch-For-Review: [builds-api,jobs-api,envvars-api,api-gateway] FIgure out and document how to do non-backwards compatible changes - https://phabricator.wikimedia.org/T356974#9576163 (CodeReviewBot) dcaro opened https://gitlab.wikimedia.org/repos/cloud/toolforge/builds...
[12:55:27] Cloud-VPS (Project-requests): Request creation of logger-discord-bot VPS project - https://phabricator.wikimedia.org/T358337#9576172 (fnegri) `logger-discord-bot` is a valid tool name in Toolforge, but for the Trove project dashes in the name are [discouraged](https://wikitech.wikimedia.org/wiki/Portal:Cloud...
[13:06:00] !log taavi@cloudcumin1001 toolsbeta START - Cookbook wmcs.vps.refresh_puppet_certs on toolsbeta-docker-imagebuilder-2.toolsbeta.eqiad1.wikimedia.cloud
[13:07:08] !log taavi@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.vps.refresh_puppet_certs (exit_code=0) on toolsbeta-docker-imagebuilder-2.toolsbeta.eqiad1.wikimedia.cloud
[13:15:38] Tools, Maps (Kartographer), Privacy: Wikivoyage should provide non external Nearby articles - https://phabricator.wikimedia.org/T194088#9576196 (WMDE-Fisch)
[13:17:40] Tools, Maps (Kartographer), Privacy: Wikivoyage should provide non external Nearby articles - https://phabricator.wikimedia.org/T194088#9576205 (WMDE-Fisch) Open→Invalid Yes , the special WikiVoyage nearby feature got deprecated and the code removed {T332785}. So this ticket should be invalid...
[13:26:20] (CR) Nikerabbit: [V: +2] Localisation updates from https://translatewiki.net. [labs/tools/weapon-of-mass-description] - https://gerrit.wikimedia.org/r/1006514 (owner: L10n-bot)
[13:26:39] (CR) Nikerabbit: [V: +2] Localisation updates from https://translatewiki.net. [labs/tools/watch-translations] - https://gerrit.wikimedia.org/r/1006515 (owner: L10n-bot)
[13:45:02] Toolforge: Cannot delete directory from incolabot project on Toolforge - https://phabricator.wikimedia.org/T357342#9576315 (dcaro) p:Triage→Medium
[13:50:25] (PS1) Arturo Borrero Gonzalez: kubernetes: refactor static pod restart logic [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1006529 (https://phabricator.wikimedia.org/T358476)
[13:56:55] (PS2) Arturo Borrero Gonzalez: kubernetes: refactor static pod restart logic [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1006529 (https://phabricator.wikimedia.org/T358476)
[14:01:01] Cloud-VPS (Project-requests): Request creation of logger-discord-bot VPS project - https://phabricator.wikimedia.org/T358337#9576380 (0xDeadbeef) `loggerdiscordbot` is fine by me.
[14:01:46] Cloud-VPS (Project-requests): Request creation of wikiauthbot-ng VPS project - https://phabricator.wikimedia.org/T358427#9576381 (0xDeadbeef) The redis trove database can be named `wikiauthbot2` since dashes are discouraged.
[14:07:11] Tool-Global-user-contributions, Stewards-and-global-tools, Temporary accounts, XTools, Epic: [Epic] Implement global user contributions feature - https://phabricator.wikimedia.org/T337089#9576391 (Tchanders)
[14:43:38] cloud-services-team (FY2023/2024-Q3-Q4), Infrastructure-Foundations, SRE-tools, Spicerack, Patch-For-Review: spicerack: tox fails to install PyYAML using python 3.11 on bookworm - https://phabricator.wikimedia.org/T345337#9576504 (fnegri) a:fnegri I have updated the patch by @dcaro (https...
[14:43:44] cloud-services-team (FY2023/2024-Q3-Q4), Infrastructure-Foundations, SRE-tools, Spicerack, Patch-For-Review: spicerack: tox fails to install PyYAML using python 3.11 on bookworm - https://phabricator.wikimedia.org/T345337#9576507 (fnegri)
[15:07:16] Wikibugs, Data-Platform-SRE: wikibugs test bug part II - https://phabricator.wikimedia.org/T90594#9576587 (taavi) this should appear in #wikimedia-data-platform
[15:08:36] Wikibugs, Data-Platform-SRE: wikibugs test bug part II - https://phabricator.wikimedia.org/T90594#9576589 (taavi) aaaa
[15:09:56] Wikibugs, Data-Platform-SRE: wikibugs test bug part II - https://phabricator.wikimedia.org/T90594#9576592 (taavi) one more test
[15:10:49] (TfInfraTestDestroyFailed) firing: Terraform failed to destroy the resources on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed
[15:10:49] (TfInfraTestApplyFailed) firing: Terraform failed to apply/create the resources on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestApplyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestApplyFailed
[15:14:56] Wikibugs, Data-Platform-SRE: wikibugs test bug part II - https://phabricator.wikimedia.org/T90594#9576619 (taavi) test
[15:15:53] Wikibugs: wikibugs test bug part II - https://phabricator.wikimedia.org/T90594#9576621 (taavi)
[15:23:35] PAWS: Remove paws-prometheus-[12] - https://phabricator.wikimedia.org/T356429#9576652 (rook) Open→Resolved
[15:28:53] Cloud-VPS, cloud-services-team: Spicerack: Add CI step to test with wmcs cookbooks - https://phabricator.wikimedia.org/T325758#9576678 (joanna_borun)
[15:34:24] vivian-rook opened https://github.com/toolforge/paws/pull/380
[15:34:28] (PuppetStaleCertificates) firing: Found non-revoked Puppet certificates for 2 deleted instances on paws-puppetmaster-2 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates
[15:34:40] PAWS: Jupyter command `jupyter-nbextension` not found. - https://phabricator.wikimedia.org/T312234#9576707 (github-toolforge-bot) vivian-rook opened https://github.com/toolforge/paws/pull/380
[15:36:19] Cloud-VPS, cloud-services-team, Infrastructure-Foundations, Puppet: wmf_auto_restart_cron.service failing in Cloud VPS bookworm instances - https://phabricator.wikimedia.org/T358343#9576710 (MoritzMuehlenhoff) p:Triage→Medium a:MoritzMuehlenhoff
[15:38:04] Toolforge, cloud-services-team: Provide per-tool access to cloud-vps object storage - https://phabricator.wikimedia.org/T358496#9576712 (Andrew)
[15:38:57] cloud-services-team, Infrastructure-Foundations, SRE, User-aborrero: ACPI kernel failure on debian installer last step - https://phabricator.wikimedia.org/T357896#9576723 (aborrero) p:Triage→Low I haven't checked if the server has the latest firmware updates issued by Dell. Out of cautio...
[16:04:46] Tool-Global-user-contributions, Stewards-and-global-tools, Temporary accounts, XTools, and 2 others: [Design] Prototype and user testing plan - https://phabricator.wikimedia.org/T356099#9576814 (KColeman-WMF)
[16:08:08] Toolforge, cloud-services-team: Provide per-tool access to cloud-vps object storage - https://phabricator.wikimedia.org/T358496#9576824 (bd808)
[16:09:14] PAWS: Jupyter command `jupyter-nbextension` not found. - https://phabricator.wikimedia.org/T312234#9576838 (github-toolforge-bot) vivian-rook closed https://github.com/toolforge/paws/pull/380
[16:09:29] vivian-rook closed https://github.com/toolforge/paws/pull/380
[16:09:50] Toolforge, cloud-services-team: Provide per-tool access to cloud-vps object storage - https://phabricator.wikimedia.org/T358496#9576712 (bd808) `lang=irc [15:54] < taavi> andrewbogott: did you consider the option of having a second radosgw instance where authentication is not tied to openstack? [16:0...
[16:10:17] PAWS: Jupyter command `jupyter-nbextension` not found. - https://phabricator.wikimedia.org/T312234#9576844 (rook) Open→Resolved a:rook
[16:11:41] (CloudVPSDesignateLeaks) firing: Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[16:15:58] wikitech.wikimedia.org, Content-Transform-Team-WIP, DiscussionTools, Parsoid-Read-Views (Phase 1 - DiscussionTools support): Use Parsoid for DiscussionTools on wikitech - https://phabricator.wikimedia.org/T355374#9576884 (cscott) Open→Resolved a:cscott
[16:16:41] (CloudVPSDesignateLeaks) firing: (2) Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[16:16:56] cloud-services-team, Infrastructure-Foundations, SRE, User-aborrero: ACPI kernel failure on debian installer last step - https://phabricator.wikimedia.org/T357896#9576914 (MoritzMuehlenhoff) >>! In T357896#9576723, @aborrero wrote: > I haven't checked if the server has the latest firmware updates...
[16:20:36] Tool-global-search: Global Search is down: 500: Internal Server Error / Could not resolve host: cloudelastic1004.wikimedia.org - https://phabricator.wikimedia.org/T358061#9576965 (EBernhardson) I suspect at the time we initially setup global-search we didn't have the cloudelastic.wikimedia.org alias up and r...
[16:21:32] Toolforge, cloud-services-team: Provide per-tool access to cloud-vps object storage - https://phabricator.wikimedia.org/T358496#9576975 (dcaro)
[16:23:31] Toolforge, cloud-services-team: [toolforge,storage] Provide per-tool access to cloud-vps object storage - https://phabricator.wikimedia.org/T358496#9576984 (dcaro) p:Triage→High
[16:31:32] (PS3) Arturo Borrero Gonzalez: kubernetes: refactor static pod restart logic [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1006529 (https://phabricator.wikimedia.org/T358476)
[16:34:48] (CR) CI reject: [V: -1] kubernetes: refactor static pod restart logic [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1006529 (https://phabricator.wikimedia.org/T358476) (owner: Arturo Borrero Gonzalez)
[16:51:41] (CloudVPSDesignateLeaks) firing: (2) Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[16:51:59] (PS4) Arturo Borrero Gonzalez: kubernetes: refactor static pod restart logic [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1006529 (https://phabricator.wikimedia.org/T358476)
[16:56:41] (CloudVPSDesignateLeaks) resolved: (2) Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[17:07:00] Toolforge, cloud-services-team: [toolforge,storage] Provide per-tool access to cloud-vps object storage - https://phabricator.wikimedia.org/T358496#9577235 (Andrew)
[17:07:28] Toolforge, cloud-services-team: [toolforge,storage] Provide per-tool access to cloud-vps object storage - https://phabricator.wikimedia.org/T358496#9576712 (Andrew)
[17:08:28] (PuppetAgentNoResources) firing: No Puppet resources found on instance metricsinfra-alertmanager-1 on project metricsinfra - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[17:30:30] PROBLEM - Host wikitech-static.wikimedia.org is DOWN: PING CRITICAL - Packet loss = 100%
[18:02:14] RECOVERY - Host wikitech-static.wikimedia.org is UP: PING OK - Packet loss = 0%, RTA = 28.68 ms
[18:10:49] (TfInfraTestDestroyFailed) firing: Terraform failed to destroy the resources on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed
[18:10:49] (TfInfraTestApplyFailed) firing: Terraform failed to apply/create the resources on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestApplyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestApplyFailed
[18:15:05] (PS1) AgnesAbah: Bug:T357376 Remove unused/not-required imports from _init_.py [labs/tools/Isa] - https://gerrit.wikimedia.org/r/1006202
[18:21:16] cloud-services-team (FY2023/2024-Q3-Q4), Infrastructure-Foundations, SRE-tools, Spicerack, Patch-For-Review: spicerack: tox fails to install PyYAML using python 3.11 on bookworm - https://phabricator.wikimedia.org/T345337#9577617 (Volans) @fnegri Thanks a lot for resuming this and taking care...
[18:23:58] (CR) AgnesAbah: "I commented/Removed on import json because it was not used in the file Isa\__init__.py" [labs/tools/Isa] - https://gerrit.wikimedia.org/r/1006202 (owner: AgnesAbah)
[18:34:28] (PuppetStaleCertificates) firing: Found non-revoked Puppet certificates for 2 deleted instances on paws-puppetmaster-2 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates
[18:54:44] Wikibugs, Patch-For-Review, User-bd808: wikibugs having a hard time staying connected to libera.chat IRC network - https://phabricator.wikimedia.org/T357729#9577718 (CodeReviewBot) bd808 opened https://gitlab.wikimedia.org/toolforge-repos/wikibugs2-znc/-/merge_requests/2 Make ZNC run
[18:59:28] (PuppetStaleCertificates) resolved: Found non-revoked Puppet certificates for 2 deleted instances on paws-puppetmaster-2 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates
[19:18:41] Cloud Services Proposals, cloud-services-team, User-dcaro: paws prometheus no longer 'trusted' in metricsinfra::alertmanager - https://phabricator.wikimedia.org/T358519#9577837 (Andrew)
[19:22:48] Cloud Services Proposals, cloud-services-team, User-dcaro: paws prometheus no longer 'trusted' in metricsinfra::alertmanager - https://phabricator.wikimedia.org/T358519#9577862 (rook) So far as I know the prometheus nodes that I removed in T356429 hadn't been doing anything for some time (A year or m...
[19:24:35] Cloud Services Proposals, cloud-services-team, User-dcaro: paws prometheus no longer 'trusted' in metricsinfra::alertmanager - https://phabricator.wikimedia.org/T358519#9577871 (taavi)
[19:24:38] Cloud Services Proposals, User-dcaro: Cloud services enhancement proposal: Prometheus metrics for Toolforge/Toolsbeta/Paws Kubernetes clusters - https://phabricator.wikimedia.org/T304716#9577872 (taavi)
[19:24:50] Cloud-VPS, PAWS, cloud-services-team: paws prometheus no longer 'trusted' in metricsinfra::alertmanager - https://phabricator.wikimedia.org/T358519#9577837 (taavi)
[19:30:35] Cloud-VPS, PAWS, cloud-services-team: paws prometheus no longer 'trusted' in metricsinfra::alertmanager - https://phabricator.wikimedia.org/T358519#9577884 (taavi) Please remove that bit of config. This config is used to configure access for [[ https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admi...
[19:52:07] Cloud-VPS, PAWS, cloud-services-team: paws prometheus no longer 'trusted' in metricsinfra::alertmanager - https://phabricator.wikimedia.org/T358519#9577923 (Andrew) Open→Resolved > Please remove that bit of config. Done, and puppet is happy again. I will stop thinking about this!
[19:58:28] (PuppetAgentNoResources) resolved: No Puppet resources found on instance metricsinfra-alertmanager-1 on project metricsinfra - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentNoResources
[20:43:15] Tool-global-search: Global Search is down: 500: Internal Server Error / Could not resolve host: cloudelastic1004.wikimedia.org - https://phabricator.wikimedia.org/T358061#9578054 (MusikAnimal) Open→Resolved a:MusikAnimal >>! In T358061#9576965, @EBernhardson wrote: > I suspect at the time we init...
[21:06:43] Tool-ducttape, Abstract Wikipedia team, User-vaughnwalters: DUCT exits with "panic: runtime error: invalid memory address or nil pointer dereference" on every run during setup-web-proxy - https://phabricator.wikimedia.org/T357354#9578168 (vaughnwalters)
[21:10:49] (TfInfraTestDestroyFailed) firing: Terraform failed to destroy the resources on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed
[21:10:49] (TfInfraTestApplyFailed) firing: Terraform failed to apply/create the resources on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestApplyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestApplyFailed
[21:41:41] (CloudVPSDesignateLeaks) firing: Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[21:46:41] (CloudVPSDesignateLeaks) firing: (2) Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[21:52:15] Wikibugs, User-bd808: wikibugs having a hard time staying connected to libera.chat IRC network - https://phabricator.wikimedia.org/T357729#9578292 (CodeReviewBot) bd808 merged https://gitlab.wikimedia.org/toolforge-repos/wikibugs2-znc/-/merge_requests/2 Make ZNC run
[23:34:09] Toolforge: Expose Toolforge service names via environment variables - https://phabricator.wikimedia.org/T151002#9578601 (bd808)
[23:48:44] Toolforge: Expose Toolforge service names via environment variables - https://phabricator.wikimedia.org/T151002#9578641 (bd808) > But until then, it'd be good to be able to adopt services like Redis in a tool while still being able to easily run them locally (without having to set up a hostname called "tools...