[07:01:15] netops, Infrastructure-Foundations: cr2-eqiad:FPC3 partial failure (PIC2/3) - https://phabricator.wikimedia.org/T312745 (ayounsi) p:Triage→High
[07:01:53] netops, Infrastructure-Foundations: cr2-eqiad:FPC3 partial failure (PIC2/3) - https://phabricator.wikimedia.org/T312745 (ayounsi) High severity Case Number 2022-0711-508366
[08:31:17] hi all back in today, just catching up and plan to do some work on beaker for inspiration week but ping me if there is anything that needs to go to the top of the stack
[08:34:19] jbond: what's beaker?
[08:35:23] XioNoX: https://github.com/voxpupuli/beaker
[08:36:00] its application testing for puppet, it allows you to spin up machines, apply a puppet policy and then run some tests
[08:36:10] nice!
[08:36:24] the patch set i have is here https://gerrit.wikimedia.org/r/c/operations/puppet/+/809224
[08:36:41] if interested you should be able to run it with
[08:36:42] BEAKER_debug=yes BEAKER_destroy=no BEAKER_set="debian10" bundle exec rake beaker
[08:37:03] and it will spin up a docker instance and run puppet on it using the sretest role
[08:37:35] welcome back jbond! Depending on what dcops plans to do this week there are a couple of patches (spicerack + provision cookbook) that might need a review. Both quite small
[08:37:59] volans: ack ill go through my mail and then ping you for anything i have missed
[08:38:38] <3
[08:39:06] p.s. add your project to the officewiki page ;)
[08:39:17] volans: i thought i did let me check
[08:39:37] I didn't actually check :)
[08:39:51] I didn't see it when I looked, but might have been before you added it, sorry for the confusion
[08:40:05] jbond
[08:40:26] no not there must have not submitted or something
[08:40:46] ack
[08:43:52] ahh seems it got reverted im not mad :P https://office.wikimedia.org/w/index.php?title=Engineering%2FInspiration_Week_2022%2FCoordination&type=revision&diff=315505&oldid=315501
[08:44:18] ahahaha
[08:48:44] netbox, Infrastructure-Foundations: Netbox: use Custom Model Validation - https://phabricator.wikimedia.org/T310590 (ayounsi) A concrete example: https://gist.github.com/tyler-8/0a99763cae01c97e8f80f5aca09db968
[09:08:18] netops, Infrastructure-Foundations, SRE, ops-eqiad: eqiad: Move links to new MPC7E linecard - https://phabricator.wikimedia.org/T304712 (ayounsi) Great, so next steps are: # Install the breakout panels, (document them, similar to {T304710}) # Pre-populate the ports/panels that will be used with th...
[09:24:01] Puppet, puppet-compiler, Infrastructure-Foundations, Patch-For-Review: pcc-uploader failing on tools-puppetmaster-02 - https://phabricator.wikimedia.org/T311742 (jbond) i think this is related to when the ssl certificate needed to be extended. I have uploaded the [[ https://gerrit.wikimedia.org/...
[09:24:58] Puppet, puppet-compiler, Infrastructure-Foundations, Patch-For-Review: pcc-uploader failing on tools-puppetmaster-02 - https://phabricator.wikimedia.org/T311742 (jbond) Open→Resolved a:jbond
[13:10:18] wb jbond :)
[13:10:57] I'm trying to reimage cloudelastic1005 and DHCP doesn't seem to be working, is that y'all or should I hit up DCOps?
[13:15:10] thanks cdanis :)
[13:15:26] inflatador: afaik it should be working id check dcops
[13:19:55] inflatador: is not getting dhcp for PXE or d-i?
[13:21:00] I don't see incoming dhcp requests on install1003 besides cloudswift1002 that is a planned host
[13:22:31] Thanks jbond and volans .... I just got done reimaging the rest of cloudelastic with no problems, will check w/ DCops
[13:42:41] netops, Infrastructure-Foundations, SRE, ops-eqiad: cr2-eqiad:FPC3 partial failure (PIC2/3) - https://phabricator.wikimedia.org/T312745 (ayounsi) a:Cmjohnson Juniper agreed on an RMA, forwarded the email thread to Chris for the shipping details. @Cmjohnson please sync up with Netops once rec...
[14:36:24] topranks: any chance i could swap my august oncall with you? (im 08-29, you are 08-08)
[14:38:00] jbond: just looking there...
[14:38:07] mostly it's ok, I have something on Fri 2nd might need to take half day that day, but otherwise no issue with that week
[14:38:40] topranks: ack thanks worst case i can cover the half day on the 2nd
[14:39:29] jbond: ok cool, yep let's do it then.
[14:39:55] I can probably cover through until about 15:00 CEST at least, will try to work it out
[14:40:09] awesome thanks ill do the updates
[14:40:46] great, thanks!
[15:09:25] topranks: jbond following up on shift swap can I request your individual TZs to schedule the override?
[15:10:42] lmata: Europe/Dublin. UTC+1 at this time of year.
[15:11:21] lmata: im CEST (UTC+2) 8:00 UTC is my preferred start time
[15:13:13] thanks I'll confirm in a few mins as i double check all the dates and schedule the overrides
[15:13:20] ack thanks
[15:54:47] overrides scheduled please confirm details in: https://portal.victorops.com/dash/wikimedia#/team/team-ra3ayi0mHc3Nr6qu/scheduled-overrides
[15:55:56] please imagine there are 2 lost commas (,) somewhere in the previous sentence
[16:00:16] lmata: can you confirm what is happening with the scheduled times for oncall im still confused on what the expectation is. last time i did it the time was 11:00 UTC -> 19:00 UTC; there were some comments earlier in #w-sre-private that this is flexible and up to the engineer; however i see you have put me scheduled for 8:00->17:00 localtime which a) is 9 hours not 8 b) not what i
[16:00:22] requested i.e. 10:00 -> 18:00 (8 hours).
[16:00:40] let me fix that
[16:00:47] tbh i dont mind changing my schedule for a week and getting up earlier but the 9 hour window seemed strange
[16:01:07] no no, i didnt specify start/stop times that were specific to either of you
[16:01:28] so this is more of just updating the times right now, part of the reason why im asking you to review
[16:01:40] ahh ok thanks
[16:02:17] yeah i have a hard time with timezones
[16:02:23] :D
[16:05:04] topranks: can you confirm your "start/end" time for the override?
[16:05:14] jbond: please refresh now and let me know if that looks better
[16:07:00] lmata: i dont see an override for me now :/
[16:07:32] right i need Cathal's start/stop to schedule your override with his hours :-)
[16:07:39] ahh ok
[16:07:44] sorry for the hassle
[16:07:47] np
[16:07:52] happy to assist with this
[16:07:57] thx <3
[16:10:55] lmata: 08.00 UTC works for me in terms of start, so I guess 17.00 finish?
[16:12:22] topranks: just to confirm thats 0900-1600 Dublin right?
[16:12:43] or 1800?
[16:12:57] sorry I'm getting mixed up, 9-6 local is what I had in mind
[16:13:02] cool
[16:13:06] thanks :)
[16:14:14] topranks: that's with lunch break? :D
[16:14:40] 5-courses minimum :P
[16:15:33] :D
[16:15:57] ok overrides are updated please review and lmk if you see anything off.
[16:19:12] lmata: timezone IST ?
[16:19:24] yeah Splunk did that on its own ¯\_(ツ)_/¯
[16:19:36] i corrected it 2x so im hoping its overlap and not a bug
[16:19:46] do the times match?
[16:21:01] ill try UTC
[16:21:33] lmata: from my pov its always better to just work in UTC :)
[16:21:57] UTC 8 -> 16:00 ideally :)
[16:30:27] jhathaway: you may be interested in https://gerrit.wikimedia.org/r/c/operations/puppet/+/812904 and the follow up patch. it should prevent bolt from throwing an error when a policy uses puppetdb (however you will probably get an unexpected diff)
[16:30:52] jbond: ooh, thanks, I'll take a look
[16:31:18] and if you can think of better ideas for wmflib::have_puppetdb please send a patch :)