[00:46:05] Tool-Pageviews: Add some more core data to Langviews and Massviews. - https://phabricator.wikimedia.org/T355627 (Sadads)
[03:13:54] Tools: CropTool is down - https://phabricator.wikimedia.org/T355633 (IagoQnsi)
[03:18:49] Tools: CropTool is down - https://phabricator.wikimedia.org/T355633 (IagoQnsi)
[04:25:44] Grid-Engine-to-K8s-Migration: Migrate croptool from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319653 (SecretName101) This really should be a priority item. For those who are frequent contributors to Commons, this is an especially important tool. Especially for creating...
[04:28:08] Grid-Engine-to-K8s-Migration: Migrate croptool from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319653 (Soda) I did take a look at this, but I'm currently blocked on {T355575}
[05:49:10] Grid-Engine-to-K8s-Migration: Migrate croptool from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319653 (Soda) Btw @nskaggs would it be possible to keep the tool's services running for now, since this is a widely used tool ?
[05:49:42] Grid-Engine-to-K8s-Migration: Migrate croptool from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319653 (Soda) Open→Stalled
[06:19:51] Grid-Engine-to-K8s-Migration: Migrate panoviewer from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319953 (tstarling) I hit a problem: Hugin is missing from Ubuntu 22.04, which is also the only distro available for Toolforge buildpacks. It's back again in Ubuntu 23.04. I...
[08:06:26] Grid-Engine-to-K8s-Migration: Migrate croptool from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319653 (taavi) >>! In T319653#9478785, @Jmabel wrote: > As of yesterday, this tool is down. I'm guessing GridEngine has been turned off, and the issue with CropTool using it h...
[08:08:04] Tools: CropTool is down - https://phabricator.wikimedia.org/T355633 (taavi) Open→Resolved a:taavi I have restarted the tool following the discussion in {T319653}.
[08:44:37] Cloud-VPS, cloud-services-team, DC-Ops, SRE, ops-eqiad: cloudrabbit: connect them via cloudsw and cloud-private - https://phabricator.wikimedia.org/T345610 (taavi)
[09:25:23] (HAProxyBackendUnavailable) firing: HAProxy service neutron-api_backend backend cloudcontrol1006.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[09:30:22] (HAProxyBackendUnavailable) resolved: HAProxy service neutron-api_backend backend cloudcontrol1006.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[09:38:19] superset.wmcloud.org: Database not transferring - https://phabricator.wikimedia.org/T355652 (rook)
[10:03:34] Toolforge (Toolforge iteration 03), Patch-For-Review: [lima-kilo] see how much we can strip off if we only support VM-based setup - https://phabricator.wikimedia.org/T354941 (CodeReviewBot) dcaro merged https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/90 Vm support only
[10:04:48] Toolforge (Toolforge iteration 03), Patch-For-Review: [lima-kilo] see how much we can strip off if we only support VM-based setup - https://phabricator.wikimedia.org/T354941 (CodeReviewBot) dcaro merged https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-builder/-/merge_requests/29 setup_harbor: u...
[10:06:08] Toolforge (Toolforge iteration 03), Patch-For-Review: [lima-kilo] see how much we can strip off if we only support VM-based setup - https://phabricator.wikimedia.org/T354941 (CodeReviewBot) project_1317_bot_df3177307bed93c3f34e421e26c86e38 opened https://gitlab.wikimedia.org/repos/cloud/toolforge/toolfor...
[10:06:36] (PS2) David Caro: toolsdb: add cookbook to retrieve stuck table+query [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/992215
[10:06:46] !log dcaro@urcuchillay toolsbeta START - Cookbook wmcs.toolforge.k8s.component.deploy for component builds-builder
[10:06:50] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL
[10:07:17] !log dcaro@urcuchillay toolsbeta END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component builds-builder
[10:07:20] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL
[10:08:09] cloud-services-team (FY2023/2024-Q1-Q2), Cloud-Services-Origin-Alert, Cloud-Services-Worktype-Unplanned, User-dcaro: [toolsdb] ToolsToolsDBReplicationLagIsTooHigh - 2024-01-19 - https://phabricator.wikimedia.org/T355411 (dcaro) The query is now unstuck, the replica started processing other querie...
[10:10:15] (CR) CI reject: [V: -1] toolsdb: add cookbook to retrieve stuck table+query [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/992215 (owner: David Caro)
[10:21:46] Toolforge Build Service: Toolforge refuses to install build-essential - https://phabricator.wikimedia.org/T355575 (dcaro) If this is in order to compile jpetrans on the fly, you can try using the ubuntu package instead: https://launchpad.net/ubuntu/trusty/+package/libjpeg-turbo-progs C/C++ are not supported...
[10:23:17] !log dcaro@urcuchillay tools START - Cookbook wmcs.toolforge.k8s.component.deploy for component builds-builder
[10:23:20] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[10:23:52] !log dcaro@urcuchillay tools END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component builds-builder
[10:23:54] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[10:31:37] Toolforge Build Service: Toolforge refuses to install build-essential - https://phabricator.wikimedia.org/T355575 (dcaro) Had a quick look at the code, I see also that there's a lot going on on the [[ https://github.com/sohomdatta1/croptool/blob/master/startserver.sh | start script ]], you should try to move...
[10:33:39] Toolforge (Toolforge iteration 03), Patch-For-Review: [lima-kilo] see how much we can strip off if we only support VM-based setup - https://phabricator.wikimedia.org/T354941 (CodeReviewBot) dcaro merged https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/182 builds-build...
[10:51:31] Toolforge (Toolforge iteration 03), Patch-For-Review: [lima-kilo] see how much we can strip off if we only support VM-based setup - https://phabricator.wikimedia.org/T354941 (dcaro) So summarizing, it does help considerably, not only on raw lines of yaml, but also complexity: ` dcaro@urcuchillay$ git dif...
[10:52:19] Toolforge (Toolforge iteration 03): [jobs-api,toolforge-deploy] allow using local harbor instance - https://phabricator.wikimedia.org/T355299 (dcaro) In progress→Resolved
[10:52:32] Toolforge (Toolforge iteration 03), Patch-For-Review: [lima-kilo] see how much we can strip off if we only support VM-based setup - https://phabricator.wikimedia.org/T354941 (dcaro) In progress→Resolved
[10:54:20] Toolforge (Toolforge iteration 03), Toolforge Build Service, cloud-services-team (FY2023/2024-Q1-Q2), User-dcaro: [harbor] Redis using all available memory - https://phabricator.wikimedia.org/T354176 (dcaro) Still seems to be going down, but slower and slower: {F41709936}
[11:03:18] (PS3) David Caro: toolsdb: add cookbook to retrieve stuck table+query [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/992215
[11:24:13] Toolforge (Toolforge iteration 03), cloud-services-team, Kubernetes, Patch-For-Review: Upgrade cadvisor - https://phabricator.wikimedia.org/T349795 (CodeReviewBot) taavi opened https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/183 Revert "wmcs-k8s-metrics: roll...
[11:24:55] Toolforge, cloud-services-team, Patch-For-Review: toolforge prometheus servers OOMing - https://phabricator.wikimedia.org/T350227 (CodeReviewBot) taavi opened https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/183 Revert "wmcs-k8s-metrics: rollback tools"
[11:36:45] cloud-services-team, Bitu, Infrastructure-Foundations, LDAP: Allocate more available UNIX UIDs for human users - https://phabricator.wikimedia.org/T355663 (taavi)
[11:39:07] Data-Services, cloud-services-team, Data-Platform-SRE: move cloudelastic behind cloudlb - https://phabricator.wikimedia.org/T346946 (cmooney) >>! In T346946#9477701, @bking wrote: > @taavi a few questions to clarify scope and amount of work required, since we've already been asked to [[ https://phabr...
[12:43:17] Toolforge (Toolforge iteration 03), Toolforge Build Service, User-Raymond_Ndibe: builds log streaming times out when time between two loglines exceeds ~1min - https://phabricator.wikimedia.org/T354189 (dcaro) In progress→Resolved
[12:51:37] Toolforge (Toolforge iteration 02), Patch-For-Review: [envvars-cli] use toolforge-weld for error handling - https://phabricator.wikimedia.org/T351459 (CodeReviewBot) dcaro opened https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-cli/-/merge_requests/25 bump to 0.0.4
[12:52:27] Toolforge (Toolforge iteration 02), Patch-For-Review, User-dcaro: [envvars-cli] Ask for the contents of the envvar when setting it if not passed - https://phabricator.wikimedia.org/T354196 (CodeReviewBot) dcaro opened https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-cli/-/merge_requests/25...
[12:52:44] Toolforge (Toolforge iteration 02), Patch-For-Review, User-dcaro: [envvars-cli] Ask for the contents of the envvar when setting it if not passed - https://phabricator.wikimedia.org/T354196 (CodeReviewBot) dcaro updated https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-cli/-/merge_requests/25...
[12:52:46] Toolforge (Toolforge iteration 02), Patch-For-Review: [envvars-cli] use toolforge-weld for error handling - https://phabricator.wikimedia.org/T351459 (CodeReviewBot) dcaro updated https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-cli/-/merge_requests/25 bump to 0.0.4
[12:57:27] Data-Services, cloud-services-team, Data-Platform-SRE: move cloudelastic behind cloudlb - https://phabricator.wikimedia.org/T346946 (ayounsi) Thanks for the thorough comment ! My vote goes to option 1 :) * It's a design we've done 1000 times (expose a prod service externally through the LVS), so it...
[12:58:42] Toolforge (Toolforge iteration 02), Patch-For-Review, User-dcaro: [envvars-cli] Ask for the contents of the envvar when setting it if not passed - https://phabricator.wikimedia.org/T354196 (CodeReviewBot) dcaro merged https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-cli/-/merge_requests/25...
[12:58:44] Toolforge (Toolforge iteration 02), Patch-For-Review: [envvars-cli] use toolforge-weld for error handling - https://phabricator.wikimedia.org/T351459 (CodeReviewBot) dcaro merged https://gitlab.wikimedia.org/repos/cloud/toolforge/envvars-cli/-/merge_requests/25 bump to 0.0.4
[13:00:21] Toolforge (Toolforge iteration 03): Indicate when long envvars are cutoff when listing - https://phabricator.wikimedia.org/T353287 (dcaro) Deployed it with toolforge-envvars-cli 0.0.4
[13:01:02] Toolforge (Toolforge iteration 03): Indicate when long envvars are cutoff when listing - https://phabricator.wikimedia.org/T353287 (dcaro) In progress→Resolved
[13:02:24] Tool-Global-user-contributions, Stewards-and-global-tools, Temporary accounts, XTools, Epic: Investigate: How to make the GUC query performant - https://phabricator.wikimedia.org/T355672 (Tchanders)
[13:37:28] (InstanceDown) firing: Project tools instance tools-sgeexec-10-22 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[13:42:28] (InstanceDown) resolved: Project tools instance tools-sgeexec-10-22 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[13:54:12] cloud-services-team (FY2023/2024-Q1-Q2), Goal: have cloud hardware servers in the cloud realm using a dedicated LB layer - https://phabricator.wikimedia.org/T297596 (cmooney)
[13:55:08] Data-Services, cloud-services-team, Data-Platform-SRE: move cloudelastic behind cloudlb - https://phabricator.wikimedia.org/T346946 (cmooney) Open→Declined >>! In T346946#9480830, @ayounsi wrote: > My vote goes to option 1 :) Ok. I've no strong objection. > * "still have VM traffic connect...
[14:08:56] Data-Services, cloud-services-team, Data-Platform-SRE: move cloudelastic behind cloudlb - https://phabricator.wikimedia.org/T346946 (bking) Thanks @cmooney , @taavi and @ayounsi . I've created T355617 for the private IP migration and will reach out after discussing the timetable with my team lead @G...
[14:10:28] Toolforge (Toolforge iteration 03), Toolforge Build Service, User-Raymond_Ndibe: [builds-api,logs] Increase pod starting timeout to the same as the request - https://phabricator.wikimedia.org/T354856 (dcaro) Open→Resolved
[14:10:33] Toolforge (Toolforge iteration 03), Toolforge Build Service, User-Raymond_Ndibe: builds log streaming times out when time between two loglines exceeds ~1min - https://phabricator.wikimedia.org/T354189 (dcaro)
[14:18:19] Toolforge Build Service: [builds-api] Improve error message when logs time out - https://phabricator.wikimedia.org/T354755 (dcaro)
[14:18:28] Toolforge Build Service: [builds-api] Improve error message when logs time out - https://phabricator.wikimedia.org/T354755 (dcaro) from @taavi: maybe the best solution here is to patch the api gateway to return a nicely formatted json so the client can parse it in a nicer way
[14:20:15] Toolforge (Toolforge iteration 03): [jobs-cli] AttributeError: module 'requests.exceptions' has no attribute 'InvalidJSONError' when getting 5xx from the server - https://phabricator.wikimedia.org/T354748 (dcaro) Open→Declined Let's wait until the grid goes away
[14:20:17] Toolforge (Toolforge iteration 03): Create a kubernetes container with mono and dotnet - https://phabricator.wikimedia.org/T311466 (dcaro)
[14:23:28] Toolforge (Toolforge iteration 03), Toolforge Build Service, User-Raymond_Ndibe: alert users when they are about to exceed their harbor quota - https://phabricator.wikimedia.org/T353535 (dcaro) a:Raymond_Ndibe
[14:23:49] Toolforge (Toolforge iteration 03): Toolforge next user stories - 2024 version - https://phabricator.wikimedia.org/T352857 (dcaro) a:dcaro
[14:23:56] (ProbeDown) firing: Service tools-k8s-haproxy-3:30000 has failed probes (http_admin_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-k8s-haproxy-3:30000 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[14:24:14] Toolforge (Toolforge iteration 03): Toolforge next user stories - 2024 version - https://phabricator.wikimedia.org/T352857 (dcaro) Open→In progress
[14:25:02] Toolforge (Toolforge iteration 03), Toolforge Build Service, User-Raymond_Ndibe: [tbs] Give a meaningful error message when a user exceeds their Harbor quota - https://phabricator.wikimedia.org/T351178 (dcaro) a:Raymond_Ndibe
[14:25:53] Toolforge (Toolforge iteration 03): [jobs-cli] AttributeError: module 'requests.exceptions' has no attribute 'InvalidJSONError' when getting 5xx from the server - https://phabricator.wikimedia.org/T354748 (dcaro) Declined→Resolved
[14:25:59] Toolforge (Toolforge iteration 03): Create a kubernetes container with mono and dotnet - https://phabricator.wikimedia.org/T311466 (dcaro)
[14:26:04] Toolforge: Expose tool-labs service names via environment variables - https://phabricator.wikimedia.org/T151002 (taavi)
[14:27:11] Toolforge (Toolforge iteration 03): [jobs-cli] AttributeError: module 'requests.exceptions' has no attribute 'InvalidJSONError' when getting 5xx from the server - https://phabricator.wikimedia.org/T354748 (taavi) Resolved→Declined
[14:27:15] Toolforge (Toolforge iteration 03): Create a kubernetes container with mono and dotnet - https://phabricator.wikimedia.org/T311466 (taavi)
[14:28:56] (ProbeDown) firing: (2) Service tools-k8s-haproxy-3:30000 has failed probes (http_admin_toolforge_org_ip4) - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[14:33:56] (ProbeDown) resolved: Service tools-k8s-haproxy-4:30000 has failed probes (http_admin_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-k8s-haproxy-4:30000 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[14:40:55] Toolforge: [builds-builder] Consider building our own bash image with extra tooling for toml parsing - https://phabricator.wikimedia.org/T355228 (dcaro) p:Triage→Low
[14:43:22] Toolforge, cloud-services-team, Patch-For-Review: toolforge prometheus servers OOMing - https://phabricator.wikimedia.org/T350227 (CodeReviewBot) taavi merged https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/183 Revert "wmcs-k8s-metrics: rollback tools"
[14:43:27] Toolforge (Toolforge iteration 03), cloud-services-team, Kubernetes, Patch-For-Review: Upgrade cadvisor - https://phabricator.wikimedia.org/T349795 (CodeReviewBot) taavi merged https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/183 Revert "wmcs-k8s-metrics: roll...
[14:43:27] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.component.deploy for component wmcs-k8s-metrics
[14:43:43] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component wmcs-k8s-metrics
[14:51:23] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.component.deploy for component wmcs-k8s-metrics
[14:51:38] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component wmcs-k8s-metrics
[14:54:56] (ProbeDown) firing: Service tools-k8s-haproxy-3:30000 has failed probes (http_admin_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-k8s-haproxy-3:30000 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[14:59:56] (ProbeDown) resolved: Service tools-k8s-haproxy-3:30000 has failed probes (http_admin_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-k8s-haproxy-3:30000 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[15:20:09] Toolforge (Toolforge iteration 03), cloud-services-team, Kubernetes, Patch-For-Review: Upgrade cadvisor - https://phabricator.wikimedia.org/T349795 (CodeReviewBot) taavi opened https://gitlab.wikimedia.org/repos/cloud/toolforge/wmcs-k8s-metrics/-/merge_requests/7 cadvisor: Update disabled metric...
[15:39:56] (ProbeDown) firing: Service tools-k8s-haproxy-3:30000 has failed probes (http_admin_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-k8s-haproxy-3:30000 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[15:44:56] (ProbeDown) firing: (2) Service tools-k8s-haproxy-3:30000 has failed probes (http_admin_toolforge_org_ip4) - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[15:49:56] (ProbeDown) resolved: (2) Service tools-k8s-haproxy-3:30000 has failed probes (http_admin_toolforge_org_ip4) - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[15:53:22] (HAProxyBackendUnavailable) firing: HAProxy service neutron-api_backend backend cloudcontrol1007.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[15:56:20] cloud-services-team: SystemdUnitDown Unit wmf_auto_restart_prometheus-mysqld-exporter.service on node cloudcontrol1007 has been down for long. - https://phabricator.wikimedia.org/T355572 (Andrew) Open→Resolved a:Andrew This was a side-effect of galera work, cleared itself after a bit.
[15:58:22] (HAProxyBackendUnavailable) resolved: HAProxy service neutron-api_backend backend cloudcontrol1007.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[16:37:16] Cloud-VPS, cloud-services-team (FY2023/2024-Q1-Q2), Goal: Support 'unmanaged' projects in cloud-vps - https://phabricator.wikimedia.org/T326818 (fnegri) Open→Resolved
[16:56:38] Toolforge Build Service: Toolforge refuses to install build-essential - https://phabricator.wikimedia.org/T355575 (Soda) >>! In T355575#9480266, @dcaro wrote: > If this is in order to compile jpetrans on the fly, you can try using the ubuntu package instead: https://launchpad.net/ubuntu/trusty/+package/libjp...
[17:45:12] Toolforge Build Service: Toolforge refuses to install build-essential - https://phabricator.wikimedia.org/T355575 (Soda) >>! In T355575#9480311, @dcaro wrote: > Had a quick look at the code, I see also that there's a lot going on on the [[ https://github.com/sohomdatta1/croptool/blob/master/startserver.sh |...
[18:06:10] vivian-rook opened https://github.com/toolforge/superset-deploy/pull/18
[18:06:25] vivian-rook closed https://github.com/toolforge/superset-deploy/pull/18
[18:06:32] superset.wmcloud.org: Database not transferring - https://phabricator.wikimedia.org/T355652 (rook) Upgrade required reencryption of db.
[18:07:00] superset.wmcloud.org: Upgrade to 3.1.0 - https://phabricator.wikimedia.org/T355652 (rook) Open→Resolved a:rook
[18:08:30] superset.wmcloud.org: Remove superset-123-3 cluster - https://phabricator.wikimedia.org/T355707 (rook)
[18:59:50] (ProbeDown) firing: Service tools-static-14:80 has failed probes (http_tools_static_wmflabs_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-static-14:80 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[19:04:50] (ProbeDown) resolved: Service tools-static-14:80 has failed probes (http_tools_static_wmflabs_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-static-14:80 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[19:07:38] Toolforge (Toolforge iteration 03), cloud-services-team, Kubernetes, Patch-For-Review: Upgrade cadvisor - https://phabricator.wikimedia.org/T349795 (CodeReviewBot) taavi merged https://gitlab.wikimedia.org/repos/cloud/toolforge/wmcs-k8s-metrics/-/merge_requests/7 cadvisor: Update disabled metric...
[19:08:59] Toolforge (Toolforge iteration 03), cloud-services-team, Kubernetes, Patch-For-Review: Upgrade cadvisor - https://phabricator.wikimedia.org/T349795 (CodeReviewBot) project_1317_bot_df3177307bed93c3f34e421e26c86e38 opened https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_...
[19:09:26] !log taavi@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.k8s.component.deploy for component wmcs-k8s-metrics
[19:09:28] !log taavi@cloudcumin1001 toolsbeta END (ERROR) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=97) for component wmcs-k8s-metrics
[19:09:35] Toolforge (Toolforge iteration 03), cloud-services-team, Kubernetes, Patch-For-Review: Upgrade cadvisor - https://phabricator.wikimedia.org/T349795 (CodeReviewBot) taavi merged https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/185 wmcs-k8s-metrics: bump to 0.0....
[19:10:31] !log taavi@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.k8s.component.deploy for component wmcs-k8s-metrics
[19:10:47] !log taavi@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component wmcs-k8s-metrics
[19:11:44] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.component.deploy for component wmcs-k8s-metrics
[19:11:59] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component wmcs-k8s-metrics
[19:14:35] Tool-Global-user-contributions, Stewards-and-global-tools, Temporary accounts, XTools, Epic: Investigate: How to make the GUC query performant - https://phabricator.wikimedia.org/T355672 (MusikAnimal) I don't know how to make it WMF-agnostic, but XTools and I believe GUC basically do the foll...
[19:19:36] Toolforge, cloud-services-team: tools-sgeweblight / drives very full - https://phabricator.wikimedia.org/T352802 (taavi) Open→Invalid
[19:21:31] Toolforge: Add a Thanos frontend in front of Toolforge Prometheus - https://phabricator.wikimedia.org/T355711 (taavi)
[19:21:46] Toolforge: Add a Thanos frontend in front of Toolforge Prometheus - https://phabricator.wikimedia.org/T355711 (taavi) p:Triage→Low
[19:40:51] Cloud-VPS, Toolforge (Toolforge iteration 03), cloud-services-team: Ensure Toolforge and Cloud VPS comply with Google's new email sender guidelines - https://phabricator.wikimedia.org/T354112 (Tgr)
[21:35:30] (PS1) Eric Gardner: releases: Bump Codex to 1.3.0 [labs/libraryupgrader/config] - https://gerrit.wikimedia.org/r/992536
[21:45:36] (CR) LWatson: [C: +2] releases: Bump Codex to 1.3.0 [labs/libraryupgrader/config] - https://gerrit.wikimedia.org/r/992536 (owner: Eric Gardner)
[21:51:23] (Merged) jenkins-bot: releases: Bump Codex to 1.3.0 [labs/libraryupgrader/config] - https://gerrit.wikimedia.org/r/992536 (owner: Eric Gardner)
[22:17:50] VPS-project-Wikistats, User-RhinosF1: Wikistats is using a malformed user agent - https://phabricator.wikimedia.org/T354101 (Dzahn) https://gitlab.wikimedia.org/cloudvps-repos/wikistats/-/commit/47a146b0cfc987e7838a016ffd19a4acf3682c81
[22:18:07] VPS-project-Wikistats, User-RhinosF1: Wikistats is using a malformed user agent - https://phabricator.wikimedia.org/T354101 (Dzahn) @RhinosF1 What do you think?
[22:50:49] Cloud-VPS, Data-Services, cloud-services-team, User-Marostegui: Fix 'openstack database instance rebuild' - https://phabricator.wikimedia.org/T355721 (Andrew)
[23:22:47] Grid-Engine-to-K8s-Migration: Migrate copyclear from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319647 (Hannolans) I read some documentation, but it's not clear to me if I should migrate the files as well, or should I just re-enable the scripts by adding a schedule, som...
[23:25:42] Grid-Engine-to-K8s-Migration: Migrate panoviewer from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319953 (tstarling) While option 5 might seem attractive for productionization, it would create a new maintenance responsibility. Hugin is actually doing the remapping, it's...
[23:31:56] (ProbeDown) firing: Service tools-k8s-haproxy-4:30000 has failed probes (http_admin_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-k8s-haproxy-4:30000 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[23:36:56] (ProbeDown) resolved: (2) Service tools-k8s-haproxy-4:30000 has failed probes (http_admin_toolforge_org_ip4) - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[23:42:00] Grid-Engine-to-K8s-Migration: Migrate copyclear from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319647 (Hannolans) This is my cronjob 0 5 * * * jsub -once -mem 1000m -j y -o /data/project/copyclear/logs/DACS_status.log -N DACSstatus /data/project/copyclear/tool-copyclea...