[01:31:31] (SystemdUnitFailed) firing: (2) wmf_auto_restart_prometheus-mysqld-exporter@s7.timer Failed on db1101:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status?orgId=1&forceLogin&editPanel=13 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[04:58:08] <_joe_> Good morning DP folks, one moment of your attention :) The annual Product and Tech tentative OKRs are open for feedback from the org for *this week*. You can read them here https://office.wikimedia.org/wiki/Annual_planning/FY23-24/P%26T_OKRs - please read the instructions carefully and then please spend the half hour it will take to go through those.
[04:58:46] <_joe_> Specifically - this plan isn't designed to include *everything* we do, but most of our "project" time should be focused on supporting initiatives related to it.
[04:59:30] <_joe_> I would ask you to go through it, make a note about things that aren't clear to you, or things that look risky / unachievable, and discuss that with me.
[04:59:49] <_joe_> (I am your Annual Plan representative wrangler :))
[05:00:31] <_joe_> if you see a non-defined number there, and you think your subject-matter expertise can help define it better, please comment.
[05:01:09] <_joe_> We'll also try to schedule a meeting with you all so that I can answer your doubts and/or help you give feedback with the necessary amount of context
[05:02:02] <_joe_> I repeat the deadline is *this friday*. I'm sorry for the short notice, and I am sorry I am notifying you one day late; OTOH, this is both crucially important and shouldn't take more than 1 hour of your day to go through, so please take the time.
[05:31:31] (SystemdUnitFailed) firing: (2) wmf_auto_restart_prometheus-mysqld-exporter@s7.timer Failed on db1101:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status?orgId=1&forceLogin&editPanel=13 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[07:45:18] _joe_: thanks; please note our manager is OOO all week, so if you need to co-ordinate with the team, please don't rely on k.wakuofori being responsive
[07:48:32] <_joe_> Emperor: yes I'm aware
[07:48:39] <_joe_> I sent you a meeting invite already :)
[08:11:31] (SystemdUnitFailed) resolved: (2) wmf_auto_restart_prometheus-mysqld-exporter@s7.timer Failed on db1101:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status?orgId=1&forceLogin&editPanel=13 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[09:18:56] _joe_: if I have questions about this, should I hold them 'til that meeting?
[09:28:11] <_joe_> Emperor: do as you prefer, really, it's ok also to ask here on IRC, just expect some asynchronicity
[09:29:12] 👍
[09:37:15] _joe_: so the instructions say to not think "about how any given team can contribute to achieving these KRs"; but e.g. WE2 is about reading & media experience but says nothing about Commons at all despite a) that being widely known as an under-resourced area b) it being a community concern. Instead commons only turns up in the small "Future Audiences" bucket. That's going to make it hard to prioritise work like e.g. sorting out mult
[09:37:15] thumbs, improving the general state of swift, improving the visibility of how image/thumb requests move through the stack. Which is another way of saying the swift-y bits of DP are going to struggle to contribute to these KRs, but we're not meant to bring that up?
[09:37:54] <_joe_> Emperor: go look at the talk page, ctrl+f commons :P
[09:38:05] <_joe_> but having said that
[09:38:31] <_joe_> these KRs are not supposed to cover everything we do
[09:38:59] <_joe_> there is, distinctively, a 50% time reserved for "fundamental work" that we cannot not do
[09:39:25] <_joe_> I assume that improving the reliability of swift and/or fix our multi-dc logic in mediawiki fit in that slot
[09:41:27] TY
[09:58:42] <_joe_> to make a counterpoint, I'm not 100% sure a large initiative like e.g. tracing (which would give you the visibility you were talking about) will only happen if they're instrumental to moving the needle of some KR
[09:59:02] <_joe_> err I messed up the sentence sorry
[09:59:20] <_joe_> I'm not 100% sure it will happen unless it's instrumental is what I wanted to mean :)
[10:00:27] <_joe_> but I'd ask you to focus on feasibility/vagueness first. If you see most of the comments I left are exactly in that direction - clarify language, define percentages
[14:45:56] jynus: sorry but I need your help :D I can't find any doc for creating users for wikireplicas, do you know where it is?
[14:46:09] users?
[14:46:21] we don't create users on wikireplicas
[14:48:28] sorry, I mean grants
[14:48:33] https://wikitech.wikimedia.org/wiki/Add_a_wiki#Cloud_Services
[14:48:44] > Ensure a DBA has created the ${wiki}_p database and granted access to labsdbuser
[14:49:02] it doesn't say what the aforementioned DBA should do
[14:49:24] it has something but with a big disclaimer: "This what they will probably do: "
[14:50:18] if the question is "how to make sure that is right"? I would just check with an existing user to make sure it is the same - e.g. for enwiki_p
[14:50:36] if users don't complain about enwiki, you can just copy that
[14:51:02] that "FLUSH PRIVILEGES;" is completely unneeded
[14:51:02] ah, good point
[14:51:07] thanks
[14:51:31] with 'SHOW GRANTS FOR X' you can check existing grants
[14:51:58] I believe that was only necessary because of a mariadb bug
[14:52:15] in theory wikireplicas automation handles that
[14:52:45] but a bug needed some workaround
[14:52:56] check with analytics if the bug is still there
[14:58:14] jynus: on an unrelated note, when do you think would be a good time for m1 switchover? backup-wise
[14:59:34] Amir1: sadly there was a restart on a long running backup process and we have a bit of overload due to docs
[15:00:17] we have until Monday, monday morning would be okay as well, just give me a time
[15:01:03] let me wait until tomorrow and I can tell you if on friday or monday
[15:01:13] sounds good
[15:01:14] thanks
[15:12:32] Amir1: let's try to schedule it for early day Friday?
[15:12:50] doesn't have to be super early
[15:13:19] sure, yeah
[15:14:55] what time is good for you?
[15:43:29] ^ Amir1
[15:59:02] sorry I was afk
[15:59:20] jynus: one hour before noon?
[16:00:13] "yay" the three codfw databases for the container wikipedia-en-local-public.a8 _all_ have different object counts in
[16:01:06] Amir1: ok, will send an invite
[16:01:16] noon CEST or UTC?
[16:01:34] any of them would work
[22:43:53] (SystemdUnitFailed) firing: ferm.service Failed on ms-be2067:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status?orgId=1&forceLogin&editPanel=13 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[22:44:37] (SystemdUnitFailed) resolved: ferm.service Failed on ms-be2067:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status?orgId=1&forceLogin&editPanel=13 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[22:48:34] (SystemdUnitFailed) firing: (2) puppet-agent-timer.service Failed on ms-be2059:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status?orgId=1&forceLogin&editPanel=13 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[22:53:49] (SystemdUnitFailed) firing: (2) puppet-agent-timer.service Failed on ms-be2059:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status?orgId=1&forceLogin&editPanel=13 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[22:58:49] (SystemdUnitFailed) firing: (2) puppet-agent-timer.service Failed on ms-be2059:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status?orgId=1&forceLogin&editPanel=13 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[23:03:34] (SystemdUnitFailed) firing: (3) puppet-agent-timer.service Failed on ms-be2059:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status?orgId=1&forceLogin&editPanel=13 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[23:08:34] (SystemdUnitFailed) firing: (3) puppet-agent-timer.service Failed on ms-be2059:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status?orgId=1&forceLogin&editPanel=13 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[23:28:49] (SystemdUnitFailed) resolved: systemd-timedated.service Failed on ms-be2059:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status?orgId=1&forceLogin&editPanel=13 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed