[00:43:44] GitLab (Account Approval), Release-Engineering-Team: Requesting GitLab account activation for USER[S] - https://phabricator.wikimedia.org/T344211 (Nux)
[00:44:32] GitLab (Account Approval), Release-Engineering-Team: Requesting GitLab account activation for USER[S] - https://phabricator.wikimedia.org/T344211 (Nux)
[00:46:31] GitLab (Account Approval), Release-Engineering-Team: Requesting GitLab account activation for eccenux - https://phabricator.wikimedia.org/T344211 (Nux)
[08:06:24] (PS1) Umherirrender: zuul: [mediawiki/extensions/PageImages] Run phan with MobileFrontend [integration/config] - https://gerrit.wikimedia.org/r/948989
[09:00:26] GitLab (Pipeline Services Migration🐤), Research, collaboration-services, Patch-For-Review: Move research webpages to gitlab - https://phabricator.wikimedia.org/T334511 (Jelto) Both services are migrated to GitLab (https://gitlab.wikimedia.org/repos/sre/miscweb/wikiworkshop and https://gitlab.wiki...
[10:07:04] GitLab, Release-Engineering-Team: Strange gitlab behavior - https://phabricator.wikimedia.org/T344156 (Jelto) What remotes are configured in your local repo? `git remote show origin`. The `warning: redirecting to https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags.git/` looks a bit suspicio...
[10:28:22] Phabricator (Upstream), Upstream: Query for global Feed Transaction Logs: Exception: ConpherenceTransactionQuery overheated - https://phabricator.wikimedia.org/T344232 (Aklapper) p:Triage→Low
[10:31:08] hi folks
[10:31:35] CI is failing with "This change or one of its cross-repo dependencies was unable to be automatically merged with the current state of its repository. Please rebase the change and upload a new patchset."
[10:31:40] on operations/puppet
[10:31:52] I see a few existing tickets related to this but thought I'd try here first before I go debugging
[10:42:07] Release-Engineering-Team (Escape Goats🐐), Patch-For-Review, Release, Train Deployments: 1.41.0-wmf.22 deployment blockers - https://phabricator.wikimedia.org/T343724 (taavi)
[10:50:27] the error is https://phabricator.wikimedia.org/P50585
[11:05:08] fwiw, this was fixed by https://sal.toolforge.org/log/7szV-IkBhuQtenzvIe8C
[11:06:34] anddd it's back
[11:08:48] hi! gerrit seems to be reporting "merge failed" for all repos despite patches being up to date with master. is that a known issue?
[11:09:40] jakob_WMDE: was just talking about it here
[11:09:48] ah, thanks!
[11:09:51] I tried fixing it by https://sal.toolforge.org/log/7szV-IkBhuQtenzvIe8C
[11:10:02] which works, though a hack and not a proper fix
[11:10:04] * MichaelG_WMDE reads along
[11:10:09] definitely needs someone who knows what they are doing here and not me
[11:14:04] Is there already a task?
[11:14:12] Otherwise I'll create one and make it UBN
[11:14:22] MichaelG_WMDE: sorry, I didn't create one but please do
[11:14:23] thank you
[11:14:37] * MichaelG_WMDE creates task
[11:17:35] sukhe: I would normally say hash.ar but looks like they are off in EU time
[11:18:00] RhinosF1: yeah I think they are on PTO IIRC
[11:19:34] I created this for now: [⚓ T344238 CI "Merge Failed. because cross-repo dependencies" on CI jobs, even up-to-date ones](https://phabricator.wikimedia.org/T344238)
[11:19:34] sukhe: there are docs https://www.mediawiki.org/wiki/Continuous_integration/Zuul#All_Gerrit_patches_complain_of_merge_conflicts
[11:19:35] T344238: CI "Merge Failed. because cross-repo dependencies" on CI jobs, even up-to-date ones - https://phabricator.wikimedia.org/T344238
[11:19:46] But I'm not seeing much in terms of a runbook for this error
[11:20:01] oh interesting
[11:20:12] well we can surely try that
[11:20:37] sukhe: I don't see anything helpful on the docs though to try
[11:21:29] well restarting zuul, maybe
[11:21:54] sukhe: it can be tried
[11:21:58] It might work
[11:22:09] yeah
[11:22:15] I see the same errors in logstash
[11:22:17] I am going to try this
[11:23:15] Cool
[11:24:30] should be back
[11:24:37] thanks RhinosF1 for pointing to the right task
[11:24:42] MichaelG_WMDE: jakob_WMDE: ^
[11:24:51] sukhe: hopeful guess
[11:25:08] RhinosF1: we are having a busy day here with the knams stuff, so anything helps and it did :)
[11:25:33] nice, thanks sukhe!
[11:25:37] was it maxSessionCount errors?
[11:25:47] TheresNoTime: yep :)
[11:27:20] Phabricator (Upstream), Upstream: Query for global Feed Transaction Logs: Exception: ConpherenceTransactionQuery overheated - https://phabricator.wikimedia.org/T344232 (Aklapper) Created and set up a new custom [default query](https://phabricator.wikimedia.org/feed/transactions/query/kLO7GnoM8gVl/) not c...
[11:28:04] Are we sure that this is fixed?
[11:28:21] still seeing issues? recheck worked for me
[11:28:50] Aug 15 11:25:08 contint2002 zuul-server[40306]: AttributeError: 'NoneType' object has no attribute 'getJobs'
[11:30:25] sukhe yeah, just tried a recheck on https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Wikibase/+/949015 and it still fails with "Merge failed"
[11:30:31] maxSessionCount / IOException are starting up again :/
[11:32:12] yeah I guess it's time for the task then, because I don't think a restart is the solution
[11:32:20] a few initial rechecks worked
[11:38:18] @sukhe @TheresNoTime can you bring this to the attention of the right WMF folks? Not sure if me pinging @here in the wmf slack engineering-all channel is the best way forward and I'm barely in any other channels
[11:40:58] MichaelG_WMDE: I am not sure who to ping for this as well but I can find out
[11:44:49] Continuous-Integration-Infrastructure, Release-Engineering-Team, ci-test-error (WMF-deployed Build Failure): CI "Merge Failed. because cross-repo dependencies" on CI jobs, even up-to-date ones - https://phabricator.wikimedia.org/T344238 (Michael)
[11:44:50] sukhe: probably someone releng like thcipriani
[11:45:01] yeah I thought of that but it's too early for them
[11:45:09] as I understand it
[11:45:13] checking with -sre folks on who to ping
[11:46:09] sukhe: jnuche ?
[11:46:20] They are the only other relenger in the EU
[11:46:22] jnuche: ^ hi, around?
[11:46:32] (PS1) Umherirrender: zuul: [mediawiki/extensions/Scribunto] Run phan with CodeEditor [integration/config] - https://gerrit.wikimedia.org/r/949022
[11:54:11] Continuous-Integration-Infrastructure, Release-Engineering-Team, ci-test-error (WMF-deployed Build Failure): CI "Merge Failed. because cross-repo dependencies" on CI jobs, even up-to-date ones - https://phabricator.wikimedia.org/T344238 (Michael)
[11:55:17] Continuous-Integration-Infrastructure, Release-Engineering-Team, ci-test-error (WMF-deployed Build Failure): CI "Merge Failed. because cross-repo dependencies" on CI jobs, even up-to-date ones - https://phabricator.wikimedia.org/T344238 (ssingh) Thanks @Michael for filing this task. Restarting zuul h...
[11:57:11] Continuous-Integration-Infrastructure, Release-Engineering-Team, ci-test-error (WMF-deployed Build Failure): CI "Merge Failed. because cross-repo dependencies" on CI jobs, even up-to-date ones - https://phabricator.wikimedia.org/T344238 (Michael)
[12:04:02] I have restarted it once again
[12:04:13] the last time, but since so many things were stuck
[12:07:21] public holiday for jnuche
[12:07:27] ah sorry about that
[12:07:47] but I'm around, looks like we tried two restarts, but I still see it has 3 connections?
[12:08:39] thcipriani: not that I am an expert but I think it's beyond that now in a way
[12:08:42] Aug 15 12:03:06 contint2002 zuul-server[14365]: Exception: ('Gerrit error executing gerrit review --project operations/puppet --messag
[12:10:38] where are you seeing that one?
[12:10:47] contint2002
[12:10:58] I have to step out for breakfast shortly, be back soon
[12:11:16] so will reply when back
[12:11:19] cool, thanks for all your help
[12:16:16] Phabricator: Exception: Map returned by "newPagingMapFromCursorObject()" in class "ManiphestTaskQuery" omits required key - https://phabricator.wikimedia.org/T344241 (Aklapper) p:Triage→Low
[12:16:20] alright so that gerrit error seems like it was refusing to comment on a closed patchset, but I do still see zuul-merger problems.
[12:17:00] Is there anything special when deploying gerrit config changes? In https://gerrit.wikimedia.org/r/c/operations/puppet/+/949026/ I'd like to raise the maxConnectionsPerUser because of the zuul issues but I'm not sure if that can be done just by merging the change
[12:17:48] the last time I did this it was merge + restart gerrit
[12:17:57] well. merge, run puppet, restart gerrit
[12:19:12] !log gerrit.wikimedia.org -p 29418 gerrit
[12:19:13] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[12:19:21] ^ trying that to see if the connection comes back
[12:19:57] !log (correction) gerrit.wikimedia.org -p 29418 gerrit close-connection
[12:19:58] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[12:20:32] there are still issues with the merge failed error (T344238) even after restarting zuul
[12:20:36] T344238: CI "Merge Failed. because cross-repo dependencies" on CI jobs, even up-to-date ones - https://phabricator.wikimedia.org/T344238
[12:20:49] so what do you think about deploying https://gerrit.wikimedia.org/r/c/operations/puppet/+/949026/?
[12:22:15] give me one sec to monitor what happens post manually closing that hanging connection
[12:22:24] ack :)
[12:25:11] Looking at the logs things look a bit healthier.
[12:26:16] yeah, so far so good
[12:27:28] for the record I did `ssh gerrit.wikimedia.org -p 29418 gerrit show-connections | grep jenkins`, found the oldest one, and ran the `close-connection ` command above.
[12:28:02] Good to know. Would it be worth adding to the hints here? https://www.mediawiki.org/wiki/Continuous_integration/Zuul#All_Gerrit_patches_complain_of_merge_conflicts
[12:28:34] I showed up right after the restart of zuul and there were three connections (it should only have two unless something is...strange)
[12:29:24] my suspicion is something inside zuul-merger is hung, but I forgot to check lsof for a connection before killing it
[12:29:29] yeah, I can update.
[12:36:53] Looks like you need to be a gerrit admin to do this. But good to have it documented anyway.
[12:38:06] so zuul and jenkins are looking better now? Can we lower the priority of https://phabricator.wikimedia.org/T344238? it's unbreak now at the moment
[12:40:46] updated: https://www.mediawiki.org/wiki/Continuous_integration/Zuul#All_Gerrit_patches_complain_of_merge_conflicts
[12:41:08] I have not seen that error since I closed the old connection
[12:41:37] I am tempted to close it and ask folks to reopen if they see it again
[12:42:29] Sounds good to me, I'll happily leave further handling of that Phabricator task to you :)
[12:42:58] <3 thanks for filing
[12:50:08] Continuous-Integration-Infrastructure, Release-Engineering-Team, Patch-For-Review, ci-test-error (WMF-deployed Build Failure): CI "Merge Failed. because cross-repo dependencies" on CI jobs, even up-to-date ones - https://phabricator.wikimedia.org/T344238 (thcipriani) Open→Resolved a:th...
[12:52:18] Continuous-Integration-Infrastructure, Release-Engineering-Team, Patch-For-Review, ci-test-error (WMF-deployed Build Failure): CI "Merge Failed. because cross-repo dependencies" on CI jobs, even up-to-date ones - https://phabricator.wikimedia.org/T344238 (thcipriani) Thank you @ssingh for the zuu...
[13:58:06] (PS1) Umherirrender: zuul: [mediawiki/extensions/Gadgets] Run phan with CodeEditor [integration/config] - https://gerrit.wikimedia.org/r/949047
[14:27:55] (PS1) Umherirrender: zuul: [mediawiki/extensions/DiscussionTools] Run phan with BetaFeatures [integration/config] - https://gerrit.wikimedia.org/r/949054
[14:35:59] GitLab (Pipeline Services Migration🐤), Research, collaboration-services, Patch-For-Review: Move research webpages to gitlab - https://phabricator.wikimedia.org/T334511 (fkaelin) Great, thank you for the update @Jelto.
[14:51:05] GitLab, Release-Engineering-Team: Strange gitlab behavior - https://phabricator.wikimedia.org/T344156 (fkaelin) Open→Resolved a:fkaelin Thanks for looking into this. I looked at the redirect too, but it is only due to the missing `.git` in the url. It is good to know that the replica should...
[15:22:58] For whatever reason the "There should be two connections." sentence thcipriani wrote about zuul made me think of "There are four lights!" and then fall into a shallow rabbit hole of essays about that ST:TNG episode.
[15:23:30] you're welcome
[15:24:22] one of the best episodes in terms of patrick stewart out-acting the writing
[15:24:58] up there with darmok
[15:27:05] Janeway is the best captain, but Stewart is the best actor (/me will not debate this)
[15:27:23] Sokath! His eyes open!
[15:27:47] :D
[15:51:08] Hi, selenium tests are failing: see https://integration.wikimedia.org/ci/job/quibble-composer-mysql-php74-selenium-docker/8414/console
[15:51:12] the error is "00:01:11 node: ../src/coroutine.cc:134: void* find_thread_id_key(void*): Assertion `thread_id_key != 0x7777' failed."
[15:51:31] re-running the test fails with the same error constantly
[15:55:32] oh maybe https://github.com/wikimedia/integration-config/commit/7c02ff6db5c5f1b6d4aa7210e8d9a06f0f5eb52d is the cause cc James_F
[15:55:51] i found a similar error @ https://github.com/laverdet/node-fibers/issues/451 which suggests it's node 16.
[16:00:02] https://github.com/webdriverio/webdriverio/pull/9480 and https://github.com/wikimedia/mediawiki-extensions-Wikibase/blob/83776d1e0a45b86c1419332e64ef87b12d7ea11a/rm-node16-incompatible-packages.sh#L3
[16:01:47] https://github.com/webdriverio/webdriverio/issues/6703
[16:13:46] https://gerrit.wikimedia.org/r/c/mediawiki/core/+/948602
[16:16:45] (PS1) Umherirrender: zuul: [mediawiki/extensions/VisualEditor] Run phan with BetaFeatures [integration/config] - https://gerrit.wikimedia.org/r/949077
[16:44:50] Phabricator, Dumps-Generation: Compress phabricator dump - https://phabricator.wikimedia.org/T262744 (dancy) >>! In T262744#9071677, @Ladsgroup wrote: > https://gitlab.wikimedia.org/repos/phabricator/tools/-/merge_requests/1 > > I had to fork the whole repo into ladsgroup/tools. Can we make it possible...
[17:37:00] GitLab (Upstream pit of despair 🕳️), Release-Engineering-Team (Radar), Upstream, User-brennen: GitLab group permissions are not inherited by sub-groups for groups of users invited to the parent repo - https://phabricator.wikimedia.org/T300939 (thcipriani) Clearer explanation on the upstream task...
[18:54:04] GitLab (Project Migration), Release-Engineering-Team (Escape Goats🐐), Platform Engineering, Patch-For-Review: Migrate mediawiki/services/kask to GitLab - https://phabricator.wikimedia.org/T335691 (CodeReviewBot) dancy opened https://gitlab.wikimedia.org/repos/releng/gitlab-trusted-runner/-/merge_...
[18:54:18] GitLab (Project Migration), Release-Engineering-Team (Escape Goats🐐), Platform Engineering, Patch-For-Review: Migrate mediawiki/services/kask to GitLab - https://phabricator.wikimedia.org/T335691 (CodeReviewBot) dancy merged https://gitlab.wikimedia.org/repos/releng/gitlab-trusted-runner/-/merge_...
[18:56:38] GitLab (Project Migration), Release-Engineering-Team (Escape Goats🐐), Platform Engineering, Patch-For-Review: Migrate mediawiki/services/kask to GitLab - https://phabricator.wikimedia.org/T335691 (dancy)
[19:21:36] GitLab (Project Migration), Release-Engineering-Team (Escape Goats🐐), Platform Engineering, Patch-For-Review: Migrate mediawiki/services/kask to GitLab - https://phabricator.wikimedia.org/T335691 (CodeReviewBot) dancy opened https://gitlab.wikimedia.org/repos/mediawiki/services/kask/-/merge_reque...
[19:27:12] GitLab (Project Migration), Release-Engineering-Team (Escape Goats🐐), Platform Engineering, Patch-For-Review: Migrate mediawiki/services/kask to GitLab - https://phabricator.wikimedia.org/T335691 (CodeReviewBot) dancy merged https://gitlab.wikimedia.org/repos/mediawiki/services/kask/-/merge_reque...
[19:29:21] GitLab (Project Migration), Release-Engineering-Team (Escape Goats🐐), Platform Engineering, Patch-For-Review: Migrate mediawiki/services/kask to GitLab - https://phabricator.wikimedia.org/T335691 (dancy)
[19:29:41] GitLab (Project Migration), Release-Engineering-Team (Escape Goats🐐), Platform Engineering, Patch-For-Review: Migrate mediawiki/services/kask to GitLab - https://phabricator.wikimedia.org/T335691 (dancy)
[20:10:45] (PS1) Krinkle: build: Add `npm run changelog` command to help with releases [performance/WikimediaDebug] - https://gerrit.wikimedia.org/r/949108
[21:45:49] (PS1) Umherirrender: zuul: [mediawiki/extensions/AbuseFilter] Run phan/tests with UserMerge [integration/config] - https://gerrit.wikimedia.org/r/949123
[21:52:49] (PS1) Umherirrender: zuul: [mediawiki/extensions/Echo] Run phan with UserMerge [integration/config] - https://gerrit.wikimedia.org/r/949124
[22:23:40] Phabricator, Release-Engineering-Team (Escape Goats🐐): Exception: No such implementation "HarbormasterLeaseHostBuildStepImplementation" exists - https://phabricator.wikimedia.org/T344296 (Aklapper) p:Triage→Low
[22:34:40] (Queue (Jenkins jobs + Zuul functions) alert) firing: - https://alerts.wikimedia.org/?q=alertname%3DQueue+%28Jenkins+jobs+%2B+Zuul+functions%29+alert
[22:38:35] PROBLEM - Work requests waiting in Zuul Gearman server on contint2002 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [400.0] https://www.mediawiki.org/wiki/Continuous_integration/Zuul https://grafana.wikimedia.org/d/000000322/zuul-gearman?orgId=1&viewPanel=10
[22:41:17] Phabricator, Release-Engineering-Team (Escape Goats🐐): Exception: No such implementation "HarbormasterLeaseHostBuildStepImplementation" exists - https://phabricator.wikimedia.org/T344296 (Aklapper) Open→Resolved `HarbormasterLeaseHostBuildStepImplementation` got removed in upstream https://we.pho...
[22:41:23] Phabricator, Release-Engineering-Team (Priority Backlog 📥), Patch-For-Review, User-brennen: Clean up Phabricator production error logs - https://phabricator.wikimedia.org/T337500 (Aklapper)
[22:49:57] (Queue (Jenkins jobs + Zuul functions) alert) resolved: - https://alerts.wikimedia.org/?q=alertname%3DQueue+%28Jenkins+jobs+%2B+Zuul+functions%29+alert
[22:49:57] (Queue (Jenkins jobs + Zuul functions) alert) firing: - https://alerts.wikimedia.org/?q=alertname%3DQueue+%28Jenkins+jobs+%2B+Zuul+functions%29+alert
[22:54:55] (Queue (Jenkins jobs + Zuul functions) alert) resolved: - https://alerts.wikimedia.org/?q=alertname%3DQueue+%28Jenkins+jobs+%2B+Zuul+functions%29+alert
[22:56:51] RECOVERY - Work requests waiting in Zuul Gearman server on contint2002 is OK: OK: Less than 100.00% above the threshold [200.0] https://www.mediawiki.org/wiki/Continuous_integration/Zuul https://grafana.wikimedia.org/d/000000322/zuul-gearman?orgId=1&viewPanel=10