[08:18:53] <_joe_> tests for some profile are running super slow in CI
[08:18:56] <_joe_> 10:18:00 Finished in 1 minute 18.42 seconds (files took 4.93 seconds to load)
[08:19:10] <_joe_> and also
[08:19:13] <_joe_> 10:18:51 Finished in 2 minutes 11.5 seconds (files took 3.04 seconds to load)
[08:19:29] <_joe_> not sure if it's CI being super slow as I imagine or if we changed something
[10:19:21] Just checking, where do I run `authdns-update` now? Is it dns1004? This page still says dns1001: https://wikitech.wikimedia.org/wiki/DNS#Changing_records_in_a_zonefile
[10:53:07] btullis: any dns box, so dns1004 is fine
[10:53:20] I will update the page, thanks
[10:53:41] sukhe: Many thanks.
[10:54:28] you can run authdns-update on any A:dns-rec box (or A:dns-auth, same thing) and it syncs the changes to others
[10:56:50] sukhe: Ack, thanks. Just thought it worth checking because I hadn't seen an email to ops or similar.
[10:58:28] you are right, I did announce during the sre meeting but should have done an email instead
[10:58:54] I will do it today
[10:59:28] <3
[12:26:17] hi all, puppetboard is now using the newer puppet7 infrastructure, let me know if you see issues (also sent an email to sre-at-large)
[12:33:46] looks much faster too1
[12:33:46] !
[12:34:41] it's got a new maintainer now and they have done quite a bit of refactoring, good to see this paid off
[12:37:42] for anyone interested the old version is 3.1 the new version is 4.3 https://github.com/voxpupuli/puppetboard/blob/master/CHANGELOG.md
[12:44:45] it's definitely a lot faster for when you click on the host!
[12:45:14] /radiator is quite something :)
[12:46:28] :)
[12:49:49] looks like something to put on a NOC room's screen
[13:01:02] yeah. now we know why jbond has four monitors
[13:05:12] lol i only have three ;)
[13:05:29] i got rid of the fourth when i moved to spain
[13:08:17] sad times
[13:08:27] indeed
[13:08:53] * Emperor can fit 3 emacsen on their one monitor, why would they need more?
:)
[13:35:32] At one of my old jobs, there used to be a guy with 9 monitors on his desk...his cubicle became quite the tourist attraction
[14:04:07] imagine the heat of 9 monitors in front of you...
[14:07:07] !log mw141[23] downtimes and relocating per T308339
[14:07:10] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:07:11] T308339: eqiad: move non WMCS servers out of rack C8 - https://phabricator.wikimedia.org/T308339
[14:42:09] ok booting mw141[34] back from rack migration (same row), let's see if I did it correctly
[14:48:43] !log mw141[34] returned to service per T308339
[14:48:46] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:48:47] T308339: eqiad: move non WMCS servers out of rack C8 - https://phabricator.wikimedia.org/T308339
[14:49:33] !log mw141[23] returned to service per T308339. ignore typo of mw1414 it is uninvolved
[14:49:36] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:11:54] !log mw141[01] returned to service per T308339
[15:11:57] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:11:58] T308339: eqiad: move non WMCS servers out of rack C8 - https://phabricator.wikimedia.org/T308339
[15:14:30] !log mw140[89] downtime for relocation per T308339
[15:14:33] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:20:07] fyi going to reimage sretest1002
[15:24:24] please hold off on running authdns-update
[15:26:41] all resolved, good to go
[17:36:11] XioNoX: hi
[17:36:14] E003|MISSING_OR_WRONG_PTR_FOR_NAME_AND_IP: Missing PTR '2.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.2.1.e.f.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa.' for name 'ethernet72.lsw1-e8-eqiad.eqiad.wmnet.' and IP '2620:0:861:fe12::2', PTRs are: 13.0.66.10.in-addr.arpa.
[17:36:18] E003|MISSING_OR_WRONG_PTR_FOR_NAME_AND_IP: Missing PTR '2.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.3.1.e.f.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa.' for name 'ethernet76.lsw1-e8-eqiad.eqiad.wmnet.'
and IP '2620:0:861:fe13::2', PTRs are:
[17:36:23] sukhe: eh
[17:36:31] I think I know what's up
[17:36:36] sorry for the ping, but saw it on netbox and the changelog
[17:36:42] so thought I should ping you without trying to fix
[17:37:15] I need to send a patch to the dns repo
[17:41:57] XioNoX: hth
[17:42:08] (on it)
[17:46:09] sukhe, topranks, https://gerrit.wikimedia.org/r/c/operations/dns/+/939752
[17:55:44] XioNoX: thanks!
[17:55:52] XioNoX: patch looks good thanks
[17:56:34] thanks!
[17:56:37] RESULT: 0 Errors, 304 Warnings, 1965 Ignored violations, 43 Ignored lines
[17:56:39] all good
[17:56:57] nice
[17:57:16] I am now thinking
[17:57:24] that for whatever reason authdns-update fails to run, we should have some monitoring
[17:57:42] for sure
[17:57:45] the only reason we discovered this was because I did a dummy run to see if a DNS host came back properly
[17:57:52] I will think about it
[17:58:58] sukhe: but also that one is on me, I had the terminal open with the dns cookbook error in red
[17:59:11] just forgot in all the multitasking
[17:59:33] XioNoX: I think even then there should be some alert, something similar to the netbox one, like for uncommitted DNS changes or something similar
[17:59:47] because like in the morning with the incorrect hostname
[17:59:50] again, we discovered it by chance
[17:59:59] yeah I agree
[20:25:44] FYI SRE, `ssh: connect to host parse1002.eqiad.wmnet port 22: Connection timed out` during a backport, https://phabricator.wikimedia.org/P49604
[20:25:55] dancy: ^
[20:26:30] Can we get it marked inactive so it doesn't get deployments while down
[20:27:06] TheresNoTime: maybe worth a task if the host is down?
[20:27:41] will do after deployment
[20:27:51] TheresNoTime: I'll save you time
[20:27:54] (and/or feel free)
[20:29:06] https://phabricator.wikimedia.org/T342298
[23:43:27] MW-on-k8s got a round of applause at today's NYC meetup :)
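
Note on the E003 errors at 17:36: each error names the `ip6.arpa.` PTR record that the linter expected for the interface address. As a minimal sketch of how that name follows from the address (this uses Python's standard `ipaddress` module, not the DNS repo's actual linter, and the helper name `expected_ptr_name` is hypothetical), the reverse name is the address's 32 hex nibbles in reverse order:

```python
import ipaddress


def expected_ptr_name(ip: str) -> str:
    """Return the fully qualified reverse-DNS name for an IPv4 or IPv6 address.

    Hypothetical helper for illustration only; not part of any Wikimedia tooling.
    """
    # reverse_pointer yields e.g. "...ip6.arpa" without the trailing dot,
    # so append one to match the fully qualified form shown in the log.
    return ipaddress.ip_address(ip).reverse_pointer + "."


if __name__ == "__main__":
    # Addresses taken from the two E003 messages in the log.
    for addr in ("2620:0:861:fe12::2", "2620:0:861:fe13::2"):
        print(addr, "->", expected_ptr_name(addr))
```

Run against the two addresses from the log, this yields the same names the E003 messages report as missing, which can help when checking a DNS patch by hand.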