[10:22:40] anyone else having issues with gerrit? https://phabricator.wikimedia.org/P45929
[10:30:00] not at the moment, but you can check if there is any client stuck or abusing our git
[10:30:47] seems I did have some stale connections around, possibly from different machines, and TIL `ss -K`
[10:45:30] apergos: if/when you have a moment, and given that you're the person I most associate with the data dumps, would you mind taking a quick look at T331761? tl;dr I think I have a script which does what Hokwelum is suggesting, and I'd be interested in figuring out the next steps
[10:45:31] T331761: Some dumps do not have checksums - https://phabricator.wikimedia.org/T331761
[10:46:37] TheresNoTime: I would reply right on the task so both of us can see it; we have very little bandwidth, but anyway, if the script is around we can look at it, etc.
[10:47:50] I'll commit it somewhere and we can maybe go from there :) thanks!
[10:48:33] sounds great, just link it on the task, say what it does, say what's not tested, all that kind of stuff; it would be great if we can just plop it into use :-)
[12:34:48] hey JustHannah and TheresNoTime, maybe we can cut down some of the time by chatting in here and getting the details down for the next comment on the task
[12:35:13] what sort of access are you looking for? i.e. what exactly do you want to do with it?
[12:36:10] well, firstly, test that my assumptions about the directory layout are reasonable... but other than that, it can do dry runs, so ideally seeing if it'd work...
[12:36:40] so I would suggest the script gets committed somewhere in our gerrit repo before any kind of testing
[12:36:56] * TheresNoTime is just making sure it's nicely commented etc. before committing it for review :P
[12:37:16] for the directory layout we can be gophers for you in the meantime and tell you what several of the directory trees look like; there are a few different flavours
[12:38:07] these would run on the clouddumps hosts, I guess? so we could checksum e.g. phabricator dumps and other things that are fetched only to those hosts, not created by us
[12:38:31] and then those files would be rsynced over to the other clouddumps host?
[12:38:59] these are questions, I don't have proper answers for them; probably best again to loop WMCS folks into the discussion on the task
[12:40:19] Okay :) admittedly I was hoping that the result of T331761 would be "oh yes, we forgot to turn on the `--create-hashes` flag" or something, but having a dedicated script for creating hash files almost sounds more reasonable anyway
[12:40:20] T331761: Some dumps do not have checksums - https://phabricator.wikimedia.org/T331761
[12:40:24] presumably all of those files are publicly readable, and if you're not actually writing into the directories (some of which might be owned by root? we'd have to check)
[12:41:42] there is no --create-hashes flag: we get page views from hdfs from the data engineering group, wikidata entity dumps from wikibase maintenance scripts, shorturls from maybe it's an sql query, category dumps from category-specific maintenance scripts, elasticsearch dumps from cirrus search maintenance scripts...
[12:41:46] all maintained by different people
[12:41:50] etc.
[12:42:02] lots more of that. so there's no one flag to rule them all :-P
[12:42:30] I admit it was wishful thinking ^^
[12:42:50] did I leave out content translation and global blocks and ... yeah I did. I left out a bunch more stuff
[12:42:56] anyways... :-P
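For readers following along: TheresNoTime's actual script is linked later in this log (the gitlab snippet); below is only a minimal illustrative sketch of the general approach under discussion — walk a dump tree and write a checksum file next to any data file that lacks one. The `.sha1` suffix, the `--dry-run` flag, and streaming SHA-1 are assumptions for illustration, not details from T331761.

```python
#!/usr/bin/env python3
"""Minimal sketch of a "fill in missing checksums" pass over a dump tree.

Not the script from T331761 -- the ".sha1" suffix, the --dry-run flag,
and the sha1sum-style output format are assumptions for illustration.
"""
import argparse
import hashlib
from pathlib import Path

CHECKSUM_SUFFIX = ".sha1"  # assumed naming convention


def sha1_of(path: Path) -> str:
    """Stream the file through SHA-1 so multi-GB dumps aren't read into memory."""
    digest = hashlib.sha1()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def main() -> None:
    parser = argparse.ArgumentParser(description="write missing checksum files")
    parser.add_argument("root", type=Path, help="dump directory to scan")
    parser.add_argument("--dry-run", action="store_true",
                        help="only report what would be written")
    args = parser.parse_args()

    for path in sorted(args.root.rglob("*")):
        # Skip symlinks (e.g. a "current" alias), directories, and
        # files that are themselves checksums.
        if path.is_symlink() or not path.is_file():
            continue
        if path.suffix == CHECKSUM_SUFFIX:
            continue
        target = path.with_name(path.name + CHECKSUM_SUFFIX)
        if target.exists():
            continue
        if args.dry_run:
            print(f"would write {target}")
        else:
            # Same "<hash>  <filename>" layout that sha1sum(1) emits.
            target.write_text(f"{sha1_of(path)}  {path.name}\n")


if __name__ == "__main__":
    main()
```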
[12:44:19] so ideally you would want to commit the script somewhere so people could look at it, discuss with people where it should run, what impact that might have on cpu or whatever (maybe not much, but nice to ask), and request access with your regular dev account on the clouddumps hosts if that's where it should be, which I guess it will?
[12:44:32] JustHannah: stop me if I'm overlooking something please
[12:45:10] one of these hosts is the public-facing web server and the other serves nfs with these datasets to all wmcs instances and the stats boxes,
[12:46:01] the nfs server sometimes is swapped in to take on web server duties in case of maintenance and such, so it needs to have copies of everything
[12:46:31] sure! apergos
[12:47:44] stuff on clouddumps hosts lives under /srv/dumps/xmldatadumps/public/other/ but this should not be hardcoded into the script; try getting those values out of hiera in puppet, I would say
[12:48:12] common/profile/dumps/distribution.yaml:profile::dumps::distribution::miscdumpsdir: '/srv/dumps/xmldatadumps/public/other'
[12:49:22] ariel@clouddumps1001:/srv/dumps/xmldatadumps/public/other$ ls cirrussearch/
[12:49:22] 20230109 20230116 20230123 20230130 20230206 20230213 20230220 20230227 20230306 20230313 20230320 current
[12:49:52] that's one way directories might be set up. current is a symlink; that's not a standard name, everyone uses their own terminology
[12:50:36] ariel@clouddumps1001:/srv/dumps/xmldatadumps/public/other$ ls shorturls/
[12:50:36] shorturls-20230206.gz shorturls-20230213.gz shorturls-20230220.gz shorturls-20230227.gz shorturls-20230306.gz shorturls-20230313.gz shorturls-20230320.gz
[12:50:39] no subdir here at all
[12:51:25] if it's going to be easier for you to poke around directly than looking at these copy-pastas, then I would go ahead and describe exactly what you need on an access task and poke WMCS; maybe link from the current task to that one
[12:53:47] another layout: incr/wikiname-here/YYYYMMDD
[12:53:55] there's not a lot of uniformity
[12:54:15] (put the script at https://gitlab.wikimedia.org/-/snippets/70 pending a repo to commit it to)
[12:54:29] and hmm on the non-uniformity of directories
[12:54:29] (I saw the email already ;-) )
[12:56:25] file names usually have a timestamp someplace in them, but it would need double-checking to see if that's always true
[13:00:56] should one of us write a summary of this conversation as a reply comment? or do you want to do that, TheresNoTime?
[13:02:26] apergos: I'm just about to do the backport deploy, so would prefer someone else write this up, if that's okay?
[13:02:37] that's fine!
[13:02:51] have fun running the window!
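Tying together the two suggestions above — read the root from puppet rather than hardcoding it, and assume no single directory layout — a rough sketch of how the script could resolve and walk the tree. The config file path `/etc/dumps/checksummer.yaml` and its key are hypothetical inventions for illustration; only the hiera key profile::dumps::distribution::miscdumpsdir quoted above is real.

```python
from pathlib import Path

import yaml  # PyYAML

# Hypothetical puppet-templated config carrying the value of
# profile::dumps::distribution::miscdumpsdir; this file and its key
# are illustrative, not an existing interface.
CONFIG = Path("/etc/dumps/checksummer.yaml")


def misc_dumps_root() -> Path:
    """Read the dumps root from puppet-managed config instead of hardcoding it."""
    with CONFIG.open() as fh:
        return Path(yaml.safe_load(fh)["miscdumpsdir"])


def iter_dump_files(root: Path):
    """Yield data files without assuming a uniform layout.

    Covers all three flavours shown above: dated subdirectories with a
    non-standard "current"-style symlink (cirrussearch), flat timestamped
    files (shorturls), and incr/wikiname/YYYYMMDD trees. Skipping symlinks
    keeps the same file from being checksummed twice via its alias.
    """
    for path in sorted(root.rglob("*")):
        if path.is_symlink() or not path.is_file():
            continue
        yield path
```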
[14:01:58] anyone know if the makevm recipe is idempotent? The dns update piece failed, and I need to re-run it.
[14:02:49] jhathaway: if you didn't kill -9 or ctrl+c twice, it does cleanup
[14:03:09] and should have shown you the diffs of reverting the dns, if that was already done, for example
[14:03:34] or delete the IPs from netbox
[14:04:31] I let the cookbook finish, and all the other steps completed successfully
[14:04:55] so I assume the correct IP was provisioned in netbox and I only need to push the dns updates?
[14:05:12] that sounds weird, let me check
[14:05:29] https://netbox.wikimedia.org/virtualization/virtual-machines/571/
[14:05:38] the dns-snippets piece is the one that failed
[14:06:17] but it is wrapped in a confirm_on_failure
[14:06:29] * volans looking at the logs
[14:07:45] https://phabricator.wikimedia.org/P45931
[14:08:36] thx
[14:11:19] jhathaway: doh, that's a bug in the makevm code, I'll send a patch
[14:11:40] so yes, I'd say run the sre.dns.netbox cookbook if the change was not already pushed by someone else
[14:11:55] cool, thanks!
[14:18:31] jhathaway: I wonder if all the other steps have worked fine without the DNS, but I guess so; for the sre.ganeti.reimage one instead you need it
[14:19:06] volans: well, I'm not sure, since the host isn't started in ganeti, so something failed
[14:20:32] the reimage will do shutdown+startup anyway, so that shouldn't be a problem
[14:20:57] I thought it's expected for it to be off after makevm
[14:21:09] but not sure I recall it correctly
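For the curious: the confirm_on_failure mentioned at 14:06:17 comes from wmflib, which the cookbooks use to let the operator retry, skip, or abort a failed step instead of losing the whole run. A rough sketch of the pattern, assuming wmflib.interactive.confirm_on_failure's callable-plus-args signature; the step and VM name below are made up, and this is not the actual makevm code:

```python
from wmflib.interactive import confirm_on_failure


def push_dns_snippets(vm_name: str) -> None:
    """Hypothetical stand-in for makevm's DNS update step."""
    raise RuntimeError(f"dns-snippets push failed for {vm_name}")


# confirm_on_failure() runs the callable and, if it raises, prompts the
# operator to retry, skip, or abort -- which is how a failed DNS step
# can be skipped while the rest of the cookbook completes, as above.
confirm_on_failure(push_dns_snippets, "examplevm1001")
```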
[14:41:13] apergos: hey, could you please leave your feedback here? https://gerrit.wikimedia.org/r/c/operations/puppet/+/879274
[14:46:57] sure, I figured this was an internal thing for y'all, but I can give a thumbs up
[14:50:33] {{done}} arturo
[15:22:41] thanks apergos !!
[15:23:50] no problem!
[15:56:16] Web Shooter starts in 5: https://narrow.one/#974B | https://meet.google.com/dud-puoy-xvt
[17:07:45] effie: https://github.com/wikimedia/operations-puppet/blob/production/modules/role/manifests/webperf/xhgui.pp#L21
[17:07:49] mod_php alive and kicking
[17:20:54] * _joe_ waves cane at mod_php
[17:21:37] <_joe_> Krinkle: can you imagine a way to express, in an html file, "margin-top: 14%" without using an actual percent sign?
[17:21:56] _joe_: 14vh
[17:22:13] <_joe_> Krinkle: and for margin? the same?
[17:22:19] <_joe_> I knew I should ask you :P
[17:24:31] <_joe_> uhm, for margin it's trickier - I guess vmin?
[17:27:09] if you set both vh and vw?
[17:28:08] margin: 14vh 14vw 14vh 14vw;
[17:31:30] _joe_: do I want to know why you're avoiding a % sign?
[17:32:06] <_joe_> Krinkle: envoy doesn't have a way to escape % in its format strings
[17:32:33] <_joe_> so when importing our standard error page to be served by envoy for its own errors
[17:33:22] I was going to point to that error page as an example of why something might use % and why you might not want to use it.
[17:33:25] * Krinkle wrote that error page
[17:33:47] but yeah, that kind of centering is a rare example of where %/vh makes a lot of sense
[17:34:18] it's using it for margin-top there
[17:34:36] <_joe_> Krinkle: ok so, in that context, what is better? margin: 7vmax, or what volans was proposing?
[17:35:02] Hey all, is there anyone who might be able to do a quick k8s deployment for https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/902443 ?
[17:35:33] margin: 7vh auto 0 auto;
[17:35:41] instead of margin: 7% auto 0;
[17:35:55] if you're not sure, open https://en.wikipedia.org/sdf and change the value and see if it's visually different
[17:36:00] <_joe_> Seddon: anyone with deployment rights can; however, usually we ask the person who +2'd a change to deploy it
[17:36:49] _joe_: https://usercontent.irccloud-cdn.com/file/cccuP9IS/Screenshot%202023-03-23%20at%2010.36.36.png
[17:36:51] <_joe_> nemo-yiannis: are you able to deploy wikifeeds now?
[17:37:08] Looks like this needs vw, not vh
[17:37:20] _joe_: ok
[17:37:53] <_joe_> nemo-yiannis: <3, but if you're about to leave I can pick it up
[17:38:24] in CSS, % of margin refers to logical width, even when used for top or bottom margin
[17:38:34] nah, it's easy to do the deployment fast. I just didn't want to leave it broken :)
[17:38:40] https://w3c.github.io/csswg-drafts/css-box/#margin-shorthand
[17:39:24] lol, the box model is something that never sticks in my brain
[17:39:33] nemo-yiannis: if anything breaks, I'll ping someone else to revert
[17:39:48] might have a relationship to how much I've tried to stay away from it in my career :D
[17:41:23] <_joe_> Seddon: I can help with the revert
[17:41:36] _joe_: awesome
[17:41:40] <_joe_> Seddon: I'm around cargo-culting CSS anyways
[17:43:38] _joe_: What is the process for getting one of my tech leads trained in doing deployments? Having that self-sufficiency for stuff like this, which was an important but literal one-character change, would I think be useful all round
[17:44:20] <_joe_> Seddon: so first of all I guess having someone in the deployment group is a precondition
[17:44:40] <_joe_> then I guess they can pair with one of us - there's also a good chunk of documentation on wikitech
[17:46:56] _joe_: for adding to the deployment group, should I get them to request via phab task?
[17:47:20] jhathaway: https://gerrit.wikimedia.org/r/c/operations/cookbooks/+/902449 should fix your earlier issue, sorry for the trouble
[17:47:21] <_joe_> Seddon: yes. But are you sure no one from your teams is? :)
[17:47:39] volans: thanks, no problem at all!
[17:48:42] Do you (not) need to do a whole training thing? https://wikitech.wikimedia.org/w/index.php?title=Deployments/Training
[17:55:39] <_joe_> TheresNoTime: I think it might be time to start separating mediawiki deployers from the rest
[17:55:54] <_joe_> given the amount of footguns in deploying mw :)
[17:57:20] Sprint Week Trivia starts in 5 mins https://meet.google.com/tpx-nqwm-obs
[18:02:18] lmata: Sorry, can't make it to this one.
[18:15:46] * kamila_ saw the whole % thing and thus is signing off for the day