[02:32:46] Release-Engineering-Team (Priority Backlog 📥), Release, Train Deployments: 1.39.0-wmf.19 deployment blockers - https://phabricator.wikimedia.org/T308072 (AlexisJazz) Never too sure where to post. T311799 might be something.
[07:31:47] (PS2) Hashar: zuul: Adjust description for several gate pipelines [integration/config] - https://gerrit.wikimedia.org/r/810074 (owner: Stang)
[07:32:02] (CR) Hashar: [C: +2] "Excellent, thank you very much!" [integration/config] - https://gerrit.wikimedia.org/r/810074 (owner: Stang)
[07:34:29] (Merged) jenkins-bot: zuul: Adjust description for several gate pipelines [integration/config] - https://gerrit.wikimedia.org/r/810074 (owner: Stang)
[07:43:02] (CR) Hashar: [C: +2] "I have deployed the change which shows up on https://integration.wikimedia.org/zuul/" [integration/config] - https://gerrit.wikimedia.org/r/810074 (owner: Stang)
[07:46:00] (CR) Hashar: [C: +2] Zuul: [mediawiki/extensions/MultiMail] add basic quibble and phan jobs [integration/config] - https://gerrit.wikimedia.org/r/809918 (https://phabricator.wikimedia.org/T311542) (owner: Mainframe98)
[07:47:54] (Merged) jenkins-bot: Zuul: [mediawiki/extensions/MultiMail] add basic quibble and phan jobs [integration/config] - https://gerrit.wikimedia.org/r/809918 (https://phabricator.wikimedia.org/T311542) (owner: Mainframe98)
[07:49:02] (CR) Hashar: [C: +2] "Deployed!" [integration/config] - https://gerrit.wikimedia.org/r/809918 (https://phabricator.wikimedia.org/T311542) (owner: Mainframe98)
[07:49:43] (CR) Mainframe98: Zuul: [mediawiki/extensions/MultiMail] add basic quibble and phan jobs (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/809918 (https://phabricator.wikimedia.org/T311542) (owner: Mainframe98)
[08:17:59] Phabricator: Delete account xcollazo - https://phabricator.wikimedia.org/T311772 (Aklapper) Open→Resolved a:Aklapper Unused account `@xcollazo` deleted; then renamed `@XCollazo-WMF` to `@xcollazo`.
[09:59:50] Phabricator: Should not require an NDA privilege or special template to see in Phabricator the “Other assignee” field - https://phabricator.wikimedia.org/T310638 (Aklapper) a:Aklapper > Also interestingly, I can't see the other assignee on field on this task Me neither, if you refer to the "Add Action.....
[10:00:37] Phabricator: Allow setting “Other assignee” field when editing tasks of subtype "Bug Report" - https://phabricator.wikimedia.org/T310638 (Aklapper) p:Triage→Medium
[10:01:10] Phabricator: Allow setting “Other assignee” field when editing tasks of subtype "Bug Report" - https://phabricator.wikimedia.org/T310638 (Aklapper) Open→Resolved This task uses https://phabricator.wikimedia.org/transactions/editengine/maniphest.task/edit/59/ as its Edit (and not Create) Form, due to...
[10:19:15] Beta-Cluster-Infrastructure, DBA, Wikimedia-Rdbms, Epic, and 2 others: Enable MariaDB/MySQL's Strict Mode - https://phabricator.wikimedia.org/T108255 (Diesel_kapasule)
[10:25:52] Phabricator, Design: Task symbol for "Open" suggests "alert" and is inconsistent in its placement when task types are also used - https://phabricator.wikimedia.org/T297249 (Aklapper) Open→Resolved It looks like this got deployed at some point. If not, then please reopen.
[10:29:55] Phabricator: Edit task missing space/visible/editable options for @robh (Administrative Request subtype; Form 66; Form 39) - https://phabricator.wikimedia.org/T310722 (Aklapper)
[10:33:55] Phabricator: Edit task missing space/visible/editable options for @robh (Administrative Request subtype; Form 66; Form 39) - https://phabricator.wikimedia.org/T310722 (Aklapper) Open→Resolved a:Aklapper @robh: I went to Edit Form https://phabricator.wikimedia.org/transactions/editengine/maniphest...
[10:42:47] Project-Admins: Create project tag for Json5Config - https://phabricator.wikimedia.org/T308199 (Aklapper) Hi, are there any plans to push code into Gerrit soon?
[11:04:49] Phabricator: Some deadline stamps change color, while others do not, despite having the same deadline - https://phabricator.wikimedia.org/T311168 (Aklapper) See also {T310188}
[11:04:58] Phabricator: Due Date stamp doesn't show on a Phab task even though the field is filled - https://phabricator.wikimedia.org/T310188 (Aklapper) See also {T311168}
[11:54:59] Project-Admins: Create project tag for Json5Config - https://phabricator.wikimedia.org/T308199 (Jasonkhanlar) Hello. Unfortunately I am still stuck on this extension, and haven't much progress on my website as I am still determined to prepare a functional JSON/JSON5 environment implementation. As of this tim...
[12:43:51] Phabricator: Delete account xcollazo - https://phabricator.wikimedia.org/T311772 (xcollazo) Thank you @Aklapper!
[15:08:09] Release-Engineering-Team, Add-Link, Growth-Team, Release Pipeline, Patch-For-Review: Copy test coverage artifacts from test to coverage variant - https://phabricator.wikimedia.org/T307772 (MShilova_WMF)
[15:25:53] Scap, Performance-Team, serviceops: MW wmf-config tmp cache stays outdated after Scap deploy (opcache revalidation is off) - https://phabricator.wikimedia.org/T311788 (dancy) Thanks for the analysis Krinkle.
[15:43:47] Scap, Performance-Team, serviceops: MW wmf-config tmp cache stays outdated after Scap deploy (opcache revalidation is off) - https://phabricator.wikimedia.org/T311788 (Joe) A 10 seconds grace period, which was the previous value for revalidate_freq, would be ok I think in the meantime.
[15:47:31] (CR) Ahmon Dancy: Move serializing_lock_file to a setgid directory (1 comment) [tools/scap] - https://gerrit.wikimedia.org/r/810069 (https://phabricator.wikimedia.org/T310395) (owner: Ahmon Dancy)
[15:51:49] (CR) Ahmon Dancy: scap prep: Ensure umask is 002 before running (1 comment) [tools/scap] - https://gerrit.wikimedia.org/r/809708 (owner: Ahmon Dancy)
[16:02:07] (CR) Ahmon Dancy: Scap backport --list: Add mergeable column (1 comment) [tools/scap] - https://gerrit.wikimedia.org/r/810078 (https://phabricator.wikimedia.org/T303967) (owner: Jeena Huneidi)
[16:09:40] (CR) Ahmon Dancy: Scap backport --list: Add mergeable column (1 comment) [tools/scap] - https://gerrit.wikimedia.org/r/810078 (https://phabricator.wikimedia.org/T303967) (owner: Jeena Huneidi)
[16:25:56] https://sal.toolforge.org 500ing
[16:26:08] and of course it fixes itself :)
[16:42:10] (PS1) Ahmon Dancy: Many changes to support mwpresync [tools/train-dev] - https://gerrit.wikimedia.org/r/810384
[16:42:33] (CR) CI reject: [V: -1] Many changes to support mwpresync [tools/train-dev] - https://gerrit.wikimedia.org/r/810384 (owner: Ahmon Dancy)
[16:45:18] (PS2) Ahmon Dancy: Many changes to support mwpresync [tools/train-dev] - https://gerrit.wikimedia.org/r/810384
[16:48:16] (CR) Ahmon Dancy: Move serializing_lock_file to a setgid directory (2 comments) [tools/scap] - https://gerrit.wikimedia.org/r/810069 (https://phabricator.wikimedia.org/T310395) (owner: Ahmon Dancy)
[16:53:22] (PS1) Ahmon Dancy: Text changes [tools/scap] - https://gerrit.wikimedia.org/r/810388
[16:53:57] (CR) Ahmon Dancy: [C: +2] Text changes [tools/scap] - https://gerrit.wikimedia.org/r/810388 (owner: Ahmon Dancy)
[16:57:17] (Merged) jenkins-bot: Text changes [tools/scap] - https://gerrit.wikimedia.org/r/810388 (owner: Ahmon Dancy)
[17:48:10] (PS2) Ahmon Dancy: Only restart gerrit when the config actually changed [tools/train-dev] - https://gerrit.wikimedia.org/r/810085
[18:54:26] (CR) Ahmon Dancy: Move serializing_lock_file to a setgid directory (1 comment) [tools/scap] - https://gerrit.wikimedia.org/r/810069 (https://phabricator.wikimedia.org/T310395) (owner: Ahmon Dancy)
[18:59:44] (CR) Ahmon Dancy: [C: -1] Move serializing_lock_file to a setgid directory [tools/scap] - https://gerrit.wikimedia.org/r/810069 (https://phabricator.wikimedia.org/T310395) (owner: Ahmon Dancy)
[19:22:32] (PS1) Ahmon Dancy: scap prep chmod 777 cache dir bugfix [tools/scap] - https://gerrit.wikimedia.org/r/810396
[19:33:16] (PS1) Ahmon Dancy: Move serializing_lock_file to a setgid directory [tools/scap] - https://gerrit.wikimedia.org/r/810397 (https://phabricator.wikimedia.org/T310395)
[19:35:11] (CR) Ahmon Dancy: "This is an alternative to https://gerrit.wikimedia.org/r/c/mediawiki/tools/scap/+/810069." [tools/scap] - https://gerrit.wikimedia.org/r/810397 (https://phabricator.wikimedia.org/T310395) (owner: Ahmon Dancy)
[19:38:00] (PS3) Ahmon Dancy: Add directory mode check to TimeoutLock [tools/scap] - https://gerrit.wikimedia.org/r/810069 (https://phabricator.wikimedia.org/T310395)
[19:39:18] (PS4) Ahmon Dancy: Add directory mode check to TimeoutLock [tools/scap] - https://gerrit.wikimedia.org/r/810069 (https://phabricator.wikimedia.org/T310395)
[19:53:59] (PS3) Ahmon Dancy: Only restart gerrit when the config actually changed [tools/train-dev] - https://gerrit.wikimedia.org/r/810085
[19:59:41] Continuous-Integration-Infrastructure, Release-Engineering-Team (Seen), SRE, serviceops, and 2 others: replace doc1001.eqiad.wmnet with a buster VM and create the codfw equivalent - https://phabricator.wikimedia.org/T247653 (Dzahn) >>! In T247653#7982883, @Krinkle wrote: > 1. [change 744763 (pupp...
[20:00:16] Continuous-Integration-Infrastructure, Release-Engineering-Team (Seen), SRE, serviceops, and 2 others: replace doc1001.eqiad.wmnet with a buster VM and create the codfw equivalent - https://phabricator.wikimedia.org/T247653 (Dzahn) a:hashar→Dzahn
[20:02:02] (PS4) Ahmon Dancy: Only restart gerrit when the config actually changed [tools/train-dev] - https://gerrit.wikimedia.org/r/810085
[20:12:49] Continuous-Integration-Infrastructure, Release-Engineering-Team (Seen), SRE, serviceops, and 2 others: replace doc1001.eqiad.wmnet with a buster VM and create the codfw equivalent - https://phabricator.wikimedia.org/T247653 (Dzahn) >>! In T247653#7982883, @Krinkle wrote: > I propose the following...
[20:13:14] (CR) Jeena Huneidi: [C: +2] "LGTM" [tools/train-dev] - https://gerrit.wikimedia.org/r/810085 (owner: Ahmon Dancy)
[20:13:43] (Merged) jenkins-bot: Only restart gerrit when the config actually changed [tools/train-dev] - https://gerrit.wikimedia.org/r/810085 (owner: Ahmon Dancy)
[20:58:40] Continuous-Integration-Infrastructure, Release-Engineering-Team (Seen), SRE, serviceops, and 2 others: replace doc1001.eqiad.wmnet with a buster VM and create the codfw equivalent - https://phabricator.wikimedia.org/T247653 (Dzahn) >>! In T247653#7982883, @Krinkle wrote: > I propose the following...
[21:01:28] mutante: I'm here :)
[21:02:04] (CR) Jeena Huneidi: [C: +2] "LGTM" [tools/train-dev] - https://gerrit.wikimedia.org/r/810086 (owner: Ahmon Dancy)
[21:02:07] Krinkle: I am here as well. do you want to join the meet or was it only for the calendar part?
[21:02:22] just calendar
[21:02:31] (Merged) jenkins-bot: Use pregenerated ssh host keys for gerrit container [tools/train-dev] - https://gerrit.wikimedia.org/r/810086 (owner: Ahmon Dancy)
[21:03:08] Krinkle: ok, please see what I just added: https://phabricator.wikimedia.org/T247653#8045398
[21:03:32] added a few steps to actually remove it etc
[21:04:55] mutante: ok, reviewed, LGTM
[21:05:03] cool!, thanks
[21:05:13] doing the first step with cumin
[21:05:47] done, puppet disabled on 3 hosts
[21:06:05] next step: systemctl list-timers | grep doc
[21:06:13] it's empty on 2001 and 1002
[21:06:37] manually starting "last sync" from doc1001 to doc1002
[21:06:52] Jul 01 21:06:41 doc1001 systemd[1]: Started rsync documentation to a non-active server.
[21:07:41] waiting for └─6189 /usr/bin/rsync -avp --delete /srv/doc/ rsync://doc1002.eqiad.wmnet/doc-between-nodes
[21:08:38] there is currently 1 running doc building job (mediawiki-core-doxygen-docker) no problem
[21:08:53] will probably fail at some point and can retry after we're done
[21:08:56] (PS1) QChris: Allow “Gerrit Managers” to import history [software/varnish/libvmod-querysort] (refs/meta/config) - https://gerrit.wikimedia.org/r/810409
[21:09:02] (CR) QChris: [V: +2 C: +2] Allow “Gerrit Managers” to import history [software/varnish/libvmod-querysort] (refs/meta/config) - https://gerrit.wikimedia.org/r/810409 (owner: QChris)
[21:09:06] this is based on the list at https://integration.wikimedia.org/ci/job/publish-to-doc/
[21:09:20] logged in operations that doc has maintenance
[21:09:25] rsync to doc1002 finished
[21:09:28] now to doc2001
[21:09:31] (PS1) QChris: Import done. Revoke import grants [software/varnish/libvmod-querysort] (refs/meta/config) - https://gerrit.wikimedia.org/r/810410
[21:09:33] (CR) QChris: [V: +2 C: +2] Import done. Revoke import grants [software/varnish/libvmod-querysort] (refs/meta/config) - https://gerrit.wikimedia.org/r/810410 (owner: QChris)
[21:09:56] waiting for └─6493 /usr/bin/rsync -avp --delete /srv/doc/ rsync://doc2001.codfw.wmnet/doc-between-nodes
[21:11:03] systemctl stop rsync-doc-doc1002.eqiad.wmnet.timer
[21:11:10] systemctl stop rsync-doc-doc1002.eqiad.wmnet.service
[21:11:31] systemctl mask rsync-doc-doc1002.eqiad.wmnet.timer
[21:11:37] systemctl mask rsync-doc-doc1002.eqiad.wmnet.service
[21:12:03] sync to 2001 finished
[21:12:09] Jul 01 21:11:56 doc1001 rsync[6493]: total size is 67,382,346,880 speedup is 546.97
[21:12:29] systemctl stop rsync-doc-doc2001.codfw.wmnet.timer
[21:12:36] systemctl stop rsync-doc-doc2001.codfw.wmnet.service
[21:12:55] systemctl mask rsync-doc-doc2001.codfw.wmnet.timer
[21:13:00] systemctl mask rsync-doc-doc2001.codfw.wmnet.service
[21:13:17] now it gets more interesting: merging
[21:13:18] https://gerrit.wikimedia.org/r/c/operations/puppet/+/744763
[21:14:28] Krinkle: merged on master.. are we ready? then I enable puppet again on 2001 and 1002
[21:14:41] then we should see a new timer on 1002
[21:14:57] mutante: did we do the manual sync? I thought the above was the scheduled one waiting to finish
[21:15:12] e.g. the job build and finished two minutes ago - that's presumably gone to 1001 but not synced
[21:15:24] once incoming syncs are disabled, is when we do the final sync right?
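A note on the ordering question raised at 21:14-21:15: a final sync only captures everything if it runs after incoming writes have stopped. The toy model below is a sketch of that reasoning, not the actual rsync/systemd setup. `mirror`, `source`, and `target` are hypothetical names, and plain dictionaries stand in for `/srv/doc` on the old and new host.

```python
def mirror(source: dict, target: dict) -> None:
    """Make target an exact copy of source (roughly what rsync --delete does)."""
    target.clear()
    target.update(source)


# Hypothetical stand-ins for /srv/doc on the old and the new host.
source = {"puppet/index.html": "build-1"}
target = {}

# Wrong order: sync first, while publishers can still write to the source.
mirror(source, target)
source["mediawiki-core/index.html"] = "build-2"  # a job publishes after the sync
missed = sorted(set(source) - set(target))

# Right order: incoming writes are now stopped, so one more sync catches up.
mirror(source, target)
in_sync = source == target
```

Here `missed` lists whatever was published between the early sync and the write freeze, which is exactly the gap a repeated sync is meant to close.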
[21:15:34] Krinkle: I started the service manually like systemctl start rsync-doc-doc1002.eqiad.wmnet.service
[21:15:37] or I can retry it afterwards either way is fine
[21:15:47] it won't work afterwards though
[21:15:48] but that was when incoming syncs from jenkins were still enabled
[21:15:53] let's repeat the sync now then
[21:15:56] yeah
[21:16:12] reverting the "mask"
[21:16:15] so with the puppet change, jenkins can't send new data to 1001 but discovery points to 1001 so now we're in read-only basically
[21:16:31] I misread your steps I guess, sorry
[21:16:42] np, hold on
[21:17:06] systemctl start rsync-doc-doc1002.eqiad.wmnet.service
[21:17:24] ...
[21:17:24] Jul 01 21:17:10 doc1001 rsync[6982]: cover-extensions/ORES/includes/Storage/dashboard.html
[21:17:28] ..
[21:19:14] I started both, to 1002 and to 2001
[21:19:21] ack, that's the one :)
[21:19:39] rsync[6982]: mediawiki-core/master/php/classSpecialWithoutInterwiki__inherit__graph.md5
[21:20:00] sync to 1002: Main PID: 6982 (code=exited, status=0/SUCCESS)
[21:22:17] sync to 2001: Active: inactive (dead)
[21:23:20] I did not really see it finish.. I used "watch " though
[21:23:32] still running
[21:24:34] `server: Apache`, I guess we don't have the fqdn thing we do for mw and noc.wm.o etc.
[21:24:57] https://doc.wikimedia.org/cover-extensions/ORES/ and https://doc.wikimedia.org/mediawiki-core/master/php/ I will use as test cases (last modified footer)
[21:25:19] Jul 01 21:25:06 doc1001 rsync[7452]: total size is 67,382,356,700 speedup is 556.69
[21:25:31] ok, done. manual sync to both. I am masking stuff again
[21:26:44] Krinkle: I'll go enable puppet, ok?
[21:26:51] ack
[21:26:53] yes
[21:27:33] enabled on 2001. running puppet on 2001
[21:28:03] -hosts allow = doc1001.eqiad.wmnet localhost
[21:28:03] +hosts allow = doc1002.eqiad.wmnet localhost
[21:28:15] -&R_SERVICE(tcp, 873, @resolve((doc1001.eqiad.wmnet),AAAA));
[21:28:15] +&R_SERVICE(tcp, 873, @resolve((doc1002.eqiad.wmnet),AAAA));
[21:28:20] rsync and firewall ^
[21:28:27] done. now doc1002
[21:29:31] Notice: /Stage[main]/Rsync::Server/Service[rsync]: Triggered 'refresh' from 1 event
[21:29:57] oh, I made a small mistake
[21:30:10] should have merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/810399 before
[21:30:21] now it created timers to sync from 1002 to 1001
[21:30:29] but I'll manually have to remove them
[21:30:44] merging the removal of doc1001 from scap
[21:31:37] manually submitting. not waiting for jerkins a second time
[21:32:13] ack
[21:33:19] on doc1002: stopping and masking service/timer to sync to doc1001
[21:34:06] root@doc1002:/home/dzahn# rm /lib/systemd/system/rsync-doc-doc1001.eqiad.wmnet.*
[21:34:09] cleaned up
[21:34:19] now starting sync from 1002 to 2001
[21:34:30] mutante: ah, that's because no puppet ensure=>absent, right, so we're doing that by hand for now
[21:34:41] I would have avoided that
[21:34:47] if I had merged in the right order
[21:34:52] right
[21:34:56] I did not follow my own plan :p
[21:35:00] but no biggie
[21:35:30] actually running puppet one more time to see if it corrects anything I did
[21:35:34] just in case
[21:36:07] noop. moving on
[21:36:43] on doc1002: systemctl start rsync-doc-doc2001.codfw.wmnet.service
[21:38:09] re-enabled puppet on doc1001 (we don't want it to stay disabled for days but also not delete it right now)
[21:38:55] but that still recreates the timers.. narf
[21:39:24] disabling it again
[21:40:26] so.. what this does is doc1001 is now a passive host that would allow 1002 to push TO it
[21:40:27] I'm not sure I follow. do we not want 1002 to sync also to 1001 meanwhile given it's up and provisioned?
[21:40:29] doesn't matter
[21:40:36] I guess it won't since we removed it from the hiera list.
[21:41:01] what happened is 1002 is the new source of truth
[21:41:09] and if we told it to it COULD now sync back to 1001
[21:41:16] but it has no timers to tell it to do that
[21:41:22] so it's more theoretical
[21:41:27] ok
[21:41:48] yeah, the hiera delisting I originally didn't think about
[21:41:51] question. do you think we should keep 1001 up for a while?
[21:42:21] I thought we would, it seemed simpler. then delist from hiera (removing receiving crons from 1002 then) and decom next week.
[21:42:35] but I don't mind either way.
[21:42:39] yea, over the weekend at least. sounds good
[21:43:00] let's continue with the DNS switch then
[21:43:08] first
[21:45:06] sync _from_ 1002 to 2001 worked btw
[21:45:10] and is finished
[21:45:30] puppet is enabled on doc1001 and nothing bad happens
[21:46:47] DNS change merged.. syncing to DNS servers
[21:47:47] doc.discovery.wmnet is an alias for doc1002.eqiad.wmnet.
[21:47:57] Krinkle: I see the first hits in apache log on 1002
[21:48:05] but also still some on 1001.. in progress
[21:48:29] yeah, I guess it varies by DC
[21:48:36] or even individual cp
[21:49:35] my own request went to 1002 and looks good to me
[21:49:36] in browser
[21:49:48] could tell by tailing logs on both
[21:51:38] already pretty quiet in 1001 logs by now.. and active on 1002
[21:52:46] mutante: okay to try a sync from Jenkins?
[21:52:55] Krinkle: yes, go ahead please
[21:53:00] it seems done to me
[21:53:34] last try was 7 min ago:
[21:53:35] 21:47:26 rsync: failed to connect to doc.discovery.wmnet (2620:0:861:101:10:64:0:142): Connection timed out (110)
[21:53:35] 21:47:26 rsync: failed to connect to doc.discovery.wmnet (10.64.0.142): Connection timed out (110)
[21:53:37] will retry now
[21:54:24] on contint1001 : doc.discovery.wmnet is an alias for doc1002.eqiad.wmnet.
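The `... is an alias for ...` line above is the usual shape of `host` output for a CNAME. As a small sketch for scripting that check, the hypothetical helper `parse_alias` below extracts both names from such a line. It only parses text and does not query DNS; the expected target `doc1002.eqiad.wmnet` is taken from this log.

```python
import re
from typing import Optional, Tuple

# Matches lines such as "name is an alias for target."
ALIAS_RE = re.compile(r"^(\S+) is an alias for (\S+?)\.?$")


def parse_alias(line: str) -> Optional[Tuple[str, str]]:
    """Return (name, target) from an alias line, or None if it does not match."""
    match = ALIAS_RE.match(line.strip())
    if match is None:
        return None
    return match.group(1), match.group(2)


line = "doc.discovery.wmnet is an alias for doc1002.eqiad.wmnet."
result = parse_alias(line)
# True once the discovery record points at the new host.
switched = result is not None and result[1] == "doc1002.eqiad.wmnet"
```

Running the same check from each CI host would show whether every resolver has picked up the change, which is what the `contint1001` check above does by hand.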
[21:54:42] cont2001 is currently primary for this job it seems
[21:55:00] ah, yea
[21:55:04] for everything I think
[22:00:33] 21:56:40 + rsync --archive --stats --compress --delete-after . rsync://doc.discovery.wmnet/doc/puppet
[22:00:37] 21:56:42 Published at https://doc.wikimedia.org/puppet/
[22:01:21] that was the only job we missed, so all good. Even if we missed none, I would've rebuilt the last one to see it succeed and sync of course.
[22:01:41] now checking the PHP portions of it
[22:02:29] https://doc.wikimedia.org/mediawiki-core/master/php/search.php?query=WANObjectCache
[22:02:30] https://doc.wikimedia.org/oojs-ui/master/demos/demos.php
[22:02:34] ref T297035
[22:02:34] T297035: Demos page for OOUI in php is broken - https://phabricator.wikimedia.org/T297035
[22:02:56] doxygen search still working + OOUI working for the first time in a long time
[22:03:33] great :)
[22:11:16] mutante: so.. that's it?
[22:12:58] Krinkle: yea, I think that's it indeed
[22:13:05] there are 2 things left for next week
[22:13:14] well, 3
[22:13:24] run decom cookbook to delete VM
[22:13:28] merge https://gerrit.wikimedia.org/r/c/operations/puppet/+/810400/
[22:13:45] merge https://gerrit.wikimedia.org/r/c/operations/puppet/+/810401/
[22:13:55] the very last one is currently V-1
[22:14:51] profile not supported by stretch (file: /srv/workspace/puppet/modules/profile/manifests/doc.pp, line: 17, column: 23)
[22:18:36] PROBLEM - Check systemd state on doc1002 is CRITICAL: CRITICAL - degraded: The following units failed: rsync-doc-doc1001.eqiad.wmnet.service https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[22:19:00] well. I need to fix that one ..sigh
[22:19:09] on it though
[22:25:27] systemctl reset-failed will resolve it
[22:27:10] RECOVERY - Check systemd state on doc1002 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[22:28:04] GitLab (Infrastructure), Data-Persistence-Backup, serviceops, serviceops-collab, and 2 others: Backups for GitLab - https://phabricator.wikimedia.org/T274463 (Dzahn) unfortunately just noticed an Icinga alert for gitlab1003 (nothing mails us about this, that's just if you happen to log at web UI...
[22:28:06] Krinkle: ^ I consider this all done now. thanks for doing that today with me
[22:28:30] it was time and several people will be happy because stretch
[22:31:29] GitLab (Infrastructure), Data-Persistence-Backup, serviceops, serviceops-collab, and 2 others: Backups for GitLab - https://phabricator.wikimedia.org/T274463 (Dzahn) ` Jul 01 01:33:24 gitlab1003 gitlab-restore.sh[2196430]: /opt/gitlab/embedded/service/gitlab-rails/lib/backup/manager.rb:94:in `ea...
[23:48:11] Continuous-Integration-Infrastructure, OOUI: Demos page for OOUI in php is broken - https://phabricator.wikimedia.org/T297035 (Krinkle) Open→Resolved
[23:48:23] Continuous-Integration-Infrastructure, OOUI, Performance-Team: Demos page for OOUI in php is broken - https://phabricator.wikimedia.org/T297035 (Krinkle)