[04:42:26] Phabricator, DBA: Sporadic MySQL connection errors in Phabricator - https://phabricator.wikimedia.org/T341311 (Novem_Linguae)
[04:42:34] Phabricator, DBA: Sporadic MySQL connection errors in Phabricator - https://phabricator.wikimedia.org/T341311 (Novem_Linguae)
[04:47:13] Phabricator, DBA, SRE: Sporadic MySQL connection errors in Phabricator - https://phabricator.wikimedia.org/T341311 (Novem_Linguae)
[05:01:41] Phabricator, DBA, SRE: Sporadic MySQL connection errors in Phabricator - https://phabricator.wikimedia.org/T341311 (colewhite) Open→Resolved a:colewhite The extra load on Phabricator has been removed. Going to optimistically resolve this, but please reopen if it comes back.
[05:44:56] Phabricator, DBA, SRE: Sporadic MySQL connection errors in Phabricator - https://phabricator.wikimedia.org/T341311 (Aklapper) > The extra load on Phabricator has been removed. @colewhite: Are there any more details about actions taken that could be shared in public please, if available? Thanks!
[07:35:51] integration.wikimedia.org seems to return 404 for some ongoing jobs, such as https://integration.wikimedia.org/ci/job/quibble-vendor-mysql-php74-noselenium-docker/112570/console. is that expected?
[07:36:38] (okay, now it works, but...still)
[09:19:30] GitLab (Pipeline Services Migration🐤), Release-Engineering-Team, MediaWiki-Docker, dev-images, and 3 others: MySQL/MariaDB images for development environments - https://phabricator.wikimedia.org/T238925 (kostajh) >>! In T238925#8956118, @hashar wrote: > Else go with the official MariaDB image. I...
[09:34:38] GitLab (CI & Job Runners), collaboration-services: Disable unprivileged userns on GitLab Runners - https://phabricator.wikimedia.org/T341334 (Jelto)
[09:34:49] GitLab (CI & Job Runners), collaboration-services: Disable unprivileged userns on GitLab Runners - https://phabricator.wikimedia.org/T341334 (Jelto) p:Triage→Medium
[10:24:08] Phabricator, DBA, SRE, collaboration-services: Sporadic MySQL connection errors in Phabricator - https://phabricator.wikimedia.org/T341311 (LSobanski)
[11:11:35] Continuous-Integration-Infrastructure, SRE, collaboration-services, Patch-For-Review: contint2002 service implementation tracking - https://phabricator.wikimedia.org/T324659 (hashar) For switching the services via Puppet, that is nowadays done via single Hiera variable (introduced by [[ https://g...
[11:31:01] PROBLEM - zuul_merger_service_running on contint1002 is CRITICAL: PROCS CRITICAL: 0 processes with regex args bin/zuul-merger https://www.mediawiki.org/wiki/Continuous_integration/Zuul
[11:37:09] RECOVERY - zuul_merger_service_running on contint1002 is OK: PROCS OK: 1 process with regex args bin/zuul-merger https://www.mediawiki.org/wiki/Continuous_integration/Zuul
[11:49:59] Continuous-Integration-Infrastructure, SRE, collaboration-services, Patch-For-Review: contint2002 service implementation tracking - https://phabricator.wikimedia.org/T324659 (Jelto)
[11:51:12] Continuous-Integration-Infrastructure, Zuul: zuul-merger fails when repository names overlaps - https://phabricator.wikimedia.org/T157818 (hashar) When bringing a new zuul-merger instance I went to populate all active code git repositories with: ` gerrit ls-projects --type CODE --state ACTIVE | \ xargs...
[11:52:13] Continuous-Integration-Infrastructure, SRE, collaboration-services, Patch-For-Review: contint2002 service implementation tracking - https://phabricator.wikimedia.org/T324659 (Jelto) Thanks for testing and running rsync!
I created a rough checklist in the task description. Feel free to edit if I...
[11:52:51] Continuous-Integration-Infrastructure, SRE, collaboration-services, Patch-For-Review: contint2002 service implementation tracking - https://phabricator.wikimedia.org/T324659 (Jelto)
[11:55:39] Continuous-Integration-Infrastructure, SRE, collaboration-services, Patch-For-Review: contint2002 service implementation tracking - https://phabricator.wikimedia.org/T324659 (Jelto)
[12:31:46] Continuous-Integration-Infrastructure, SRE, collaboration-services, Patch-For-Review: contint2002 service implementation tracking - https://phabricator.wikimedia.org/T324659 (hashar)
[12:32:03] Continuous-Integration-Infrastructure, SRE, collaboration-services, Patch-For-Review: contint2002 service implementation tracking - https://phabricator.wikimedia.org/T324659 (hashar) >>! In T324659#8997167, @Jelto wrote: > Thanks for testing and running rsync! > > I created a rough checklist in...
[12:47:03] Continuous-Integration-Infrastructure, SRE, collaboration-services, Patch-For-Review: contint2002 service implementation tracking - https://phabricator.wikimedia.org/T324659 (hashar)
[12:49:23] Continuous-Integration-Infrastructure, SRE, collaboration-services, Patch-For-Review: contint2002 service implementation tracking - https://phabricator.wikimedia.org/T324659 (hashar) Fun Friday finding: neither contint1002 (recently moved) nor contint2002 were allowed to ssh to the `integration`...
[12:55:02] GitLab (Infrastructure), collaboration-services, Patch-For-Review: gitlab.wikimedia.org ssh host key should appear in wmf-known-host - https://phabricator.wikimedia.org/T337107 (LSobanski)
[12:57:25] Continuous-Integration-Infrastructure, SRE, collaboration-services, Patch-For-Review: contint2002 service implementation tracking - https://phabricator.wikimedia.org/T324659 (hashar)
[14:15:32] I’ve noticed that in-progress builds often just give me “HTTP ERROR 404 Not Found” now, is that happening for anyone else too?
[14:15:46] random example (though that build will probably finish pretty soon) at https://integration.wikimedia.org/ci/job/mediawiki-quibble-vendor-mysql-php74-docker/27265/console
[14:16:21] That number doesn't even show on https://integration.wikimedia.org/ci/job/mediawiki-quibble-vendor-mysql-php74-docker/
[14:16:22] :/
[14:16:26] Some recent jenkins update?
[14:16:33] I got the link from Zuul
[14:16:46] now it finished and the log is available
[14:17:26] though the number still doesn’t show up on https://integration.wikimedia.org/ci/job/mediawiki-quibble-vendor-mysql-php74-docker/, hm
[14:17:34] should I file a task for more investigation?
[14:18:03] I'd say so...
It feels very regression-y
[14:18:08] (cause it definitely worked before)
[14:18:18] alright, will do
[14:22:26] Continuous-Integration-Infrastructure, Jenkins: In-progress Jenkins logs sometimes unavailable (HTTP ERROR 404 Not Found) - https://phabricator.wikimedia.org/T341348 (Lucas_Werkmeister_WMDE)
[14:22:36] ^ done
[15:30:28] (PS1) Ssingh: Zuul: [operations/debs/pdns-recursor] Add debian-glue CI [integration/config] - https://gerrit.wikimedia.org/r/936296
[15:50:22] well that's interesting
[15:51:35] Gerrit, Release-Engineering-Team: Install gerrit image-diff plugin - https://phabricator.wikimedia.org/T341291 (hashar) + @Paladox since I am pretty sure you talked about image-diff in the future and you have the relevant expertise for PolyGerrit plugins :) Of course I had to send a few patches due to...
[16:01:09] Gerrit, Release-Engineering-Team: Install gerrit image-diff plugin - https://phabricator.wikimedia.org/T341291 (hashar)
[16:04:23] Continuous-Integration-Infrastructure, Jenkins: In-progress Jenkins logs sometimes unavailable (HTTP ERROR 404 Not Found) - https://phabricator.wikimedia.org/T341348 (thcipriani) I was able to recreate this pretty fast. I looked at https://integration.wikimedia.org/ci/job/quibble-vendor-mysql-php74-sele...
[16:18:28] Continuous-Integration-Infrastructure, Jenkins, Upstream: In-progress Jenkins logs sometimes unavailable (HTTP ERROR 404 Not Found) - https://phabricator.wikimedia.org/T341348 (hashar) From `sudo journalctl -u jenkins|grep 52953` I got: ` Jul 07 15:39:59 contint2001 jenkins[32446]: WARNING: [jenkins....
[16:18:46] thcipriani: I am restarting Jenkins CI in a few
[16:19:03] thanks for the repro case :]
[16:19:13] I am waiting for a Wikibase build to complete
[16:22:05] 0b12c291c72e (Antoine Musso 2023-01-03 12:15:18 +0100)|# CI Jenkins takes longer than the default 90 seconds to start up
[16:22:05] 0dc81e2bbd25 (Antoine Musso 2023-01-03 14:35:54 +0100)|TimeoutStartSec=300
[16:22:06] :]
[16:23:50] Continuous-Integration-Infrastructure, Jenkins, Upstream: In-progress Jenkins logs sometimes unavailable (HTTP ERROR 404 Not Found) - https://phabricator.wikimedia.org/T341348 (hashar) Open→Resolved a:hashar Fixed by restarting Jenkins. Ideally guess that could deserves a report to the u...
[16:23:53] fixed by restarting the CI Jenkins
[16:24:02] I am off for groceries shopping and well... the week-end
[16:38:30] nice, thanks hashar
[18:31:46] GitLab (Infrastructure), collaboration-services: Create alerting for GitLab CI failures - https://phabricator.wikimedia.org/T339370 (Dzahn) Open→In progress
[18:32:22] ^ merged change by Jelto that adds monitoring/alerting for GitLab CI for the first time
[18:33:10] keeping an eye on it, but severity is just set to "task". so if it alerts it just means new phab ticket.
[18:33:51] this is still the "trying it out" phase but a start for sure
[18:37:07] (https://grafana.wikimedia.org/d/Chb-gC07k/gitlab-ci-overview?orgId=1)
[19:13:10] !log restarted gitlab-webhooks to apply https://gitlab.wikimedia.org/repos/releng/gitlab-phabricator/-/merge_requests/14 and https://gitlab.wikimedia.org/repos/releng/gitlab-phabricator/-/merge_requests/13
[19:13:11] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[19:15:07] GitLab (Integrations), Phabricator, Release-Engineering-Team (GitLab III: GitLab in LA 🪃), User-brennen: Sandbox task for gitlab-phabricator comment integration - https://phabricator.wikimedia.org/T324164 (CodeReviewBot) brennen closed https://gitlab.wikimedia.org/brennen/test/-/merge_requests/7...
[19:15:42] (^ and verified gitlab-webhooks still working)
[19:38:38] Release-Engineering-Team (Deployment Training Requests): Deployment Training Request for xcollazo - https://phabricator.wikimedia.org/T341377 (xcollazo)
[19:39:33] Release-Engineering-Team (Deployment Training Requests): Deployment Training Request for xcollazo - https://phabricator.wikimedia.org/T341377 (xcollazo) Opening this as part of T341045.
[21:59:31] brennen: well done :-]
[23:22:58] GitLab (CI & Job Runners), commit-message-validator, Patch-For-Review, User-bd808: Update commit-message-validator to work nicely with GitLab repos - https://phabricator.wikimedia.org/T339307 (CodeReviewBot) bd808 opened https://gitlab.wikimedia.org/repos/ci-tools/commit-message-validator/-/merge...