[05:54:49] btullis dcaro arturo dhinus we have something big going on on wikireplicas for the affected sections: https://phabricator.wikimedia.org/T337446 I am going to rebuild s5 for now (and I will see how to deal with the other 3 later) but you'll need to re-run the users/views scripts on s5 on the wikireplicas once I let you know
[07:47:56] marostegui: I'm around now, anything I can help with?
[07:48:22] dcaro: not for now, but I will ping here and the task https://phabricator.wikimedia.org/T337446 once it is time to start running the script for users and views
[07:48:24] tanks
[07:48:25] thanks
[07:48:28] 👍
[08:14:17] marostegui: ack
[08:23:38] marostegui: ack - thanks for the heads-up
[08:33:44] jbond: all kafka clusters running mirror maker with PKI :)
[08:34:32] elukey: awesome great work :)
[08:35:18] \o.
[08:35:19] elukey: What John said :-)
[08:35:22] \o/
[08:41:56] I'll try to clean up the classes as suggested, next step is also to move varnishkafka to PKI
[08:42:00] it should be doable
[08:52:13] elukey: \o Looking at the test failure of 922809, I seem to be unable to get any kind of useful error message out of Jenkins. Am I blind?
[08:52:27] oops, this should've been on -ml
[08:53:37] klausman: ci: exit 1 (13.33 seconds) /srv/app> pre-commit run --all-files --show-diff-on-failure pid=26
[08:55:04] the comma, sigh
[08:55:06] Where/how do I find that? All I have is https://integration.wikimedia.org/ci/job/trigger-inference-services-pipeline-ores-migration/60/console
[08:55:07] fixing
[08:55:22] klausman: you need to click on the runs listed in the console logs
[08:55:31] you'll get to another CI run
[08:55:47] and in its console logs you'll find what Riccardo mentioned
[08:55:55] basically black runs and wants to reformat
[08:56:03] I see
[08:56:48] Sounds like something that one should be able to run locally, if not even a git pre-commit hook :)
[08:59:03] if it might help, this is how spicerack is set up, each one can run it how they please: https://doc.wikimedia.org/spicerack/master/development.html#code-style
[08:59:50] neat
[09:00:49] I also recently ran into https://github.com/tummychow/git-absorb but have yet to try it
[09:04:36] klausman: in our case it is my bad, I should've run `tox` to see the CI failure (so we can do it locally)
[09:05:35] I am a firm believer that these things should be run automagically as early as feasible, possibly even pre-commit. Relying on humans to remember this stuff is a case of making humans do what computers are better at
[09:07:54] klausman: right, but we should also strive for a compromise between things to do and their priorities
[09:08:16] absolutely, I wasn't implying that this needs fixing _now_
[09:11:50] ack, let's specify these things, a lot of folks work on CI and automation :)
[09:18:52] I meant it as a "it would be cool if", not "why wasn't this done"
[09:21:17] yep yep I wanted to clarify since IRC is not always the best tool to disambiguate sentences :D
[09:21:32] :+1:
[09:39:42] klausman: git-absorb looks very nice
[09:40:05] Yeah, I like the idea. But there's always the practicalities :)
[09:40:33] yes indeed, i often have stacked changes so will give it a play
[09:53:14] jbond: did you forget to press yes on the puppet-merge? :)
[09:53:22] the process has been locked for a while :)
[09:54:38] marostegui: yes sorry, it's merged now
[09:54:49] thanks!
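For reference, the 08:53 CI failure can be reproduced locally along these lines, assuming the repository is wired up like the spicerack example linked above (a pre-commit config in the repo, with tox driving the same checks); the exact setup is an assumption:

```
# One-time per clone: register the git pre-commit hook so the checks
# run automatically on every commit.
pip install pre-commit
pre-commit install

# Reproduce exactly what the CI job ran in the failure pasted above:
pre-commit run --all-files --show-diff-on-failure

# Or run the full suite the way elukey suggests at 09:04:
tox
```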
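And a short usage sketch for the git-absorb tool discussed at 09:00 and 09:39, aimed at the stacked-changes case jbond mentions (the upstream branch name is an assumption):

```
# Stage the follow-up fixes, then let git-absorb work out which commit
# in the stack each hunk belongs to and generate fixup! commits for them.
git add -u
git absorb

# Squash the generated fixup! commits into their target commits:
git rebase -i --autosquash origin/master

# Or do both steps in one go:
git absorb --and-rebase
```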
[13:04:10] hello folks, as an FYI we are experimenting with Data Engineering to extend the webrequest_sampled_live retention (in Druid) to 3+ days
[13:04:33] \o/
[13:04:57] the idea would be to incrementally update the retention up to a week
[13:05:01] atm it is 3 days
[13:29:23] _joe_: ftr, I still want to kill RESTbase. I just want to make sure we have buy-in, and the rationale is properly documented :)
[13:30:31] <_joe_> duesen: I am not on a plane to come pay you a visit, don't worry :P
[13:31:01] <_joe_> elukey: I'm not 100% sure we do need 3 days there
[13:31:49] _joe_ why not?
[13:32:08] <_joe_> elukey: oh wait, I read your next message a second later
[13:32:18] <_joe_> I was about to add "but a week would give us some value"
[13:32:20] <_joe_> :)
[13:32:24] ahhhh :)
[13:32:35] yes yes gentle ramp up to avoid breaking Druid
[13:32:36] <_joe_> elukey: basically I think comparing stuff WoW is useful
[13:32:48] <_joe_> 3 days doesn't give you much more info
[13:33:32] the ideal goal would be to have a single datasource IMHO with all the data, from live up to max retention
[13:33:49] which one that will be is not important to me
[13:34:05] and DE might have preferences for optimization reasons on the Druid side
[13:35:28] _joe_: you are welcome any time :)
[13:35:46] hm...
[13:35:56] is this a good time to poke you about jobrunners?
[13:37:18] <_joe_> duesen: yeah sorry eff.ie was handling it and is out atm
[13:37:27] <_joe_> we can get someone else on it :)
[13:37:39] <_joe_> we'll have something for you on monday, hopefully
[13:43:42] thank you
[14:11:17] !incidents
[14:11:17] 3681 (UNACKED) ProbeDown sre (10.2.2.28 ip4 parsoid-php:443 probes/service http_parsoid-php_ip4 eqiad)
[14:11:17] 3680 (RESOLVED) Primary outbound port utilisation over 80% (paged) global noc (cr2-eqsin.wikimedia.org)
[14:11:17] 3678 (RESOLVED) Host db2110 (paged) - PING - Packet loss = 100%
[14:11:17] 3679 (RESOLVED) Primary outbound port utilisation over 80% (paged) global noc (cr2-eqsin.wikimedia.org)
[14:11:24] !ack 3681
[14:11:24] 3681 (ACKED) ProbeDown sre (10.2.2.28 ip4 parsoid-php:443 probes/service http_parsoid-php_ip4 eqiad)
[14:11:48] #-operations is full of errors related to the above
[14:12:44] topranks: you sure it was you?
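On the Druid retention change discussed at 13:04-13:35: in a stock Druid cluster that would be a coordinator rule update for the datasource. A hypothetical sketch follows; the coordinator host, tier name, and replicant count are assumptions, while the API path and rule types are standard Druid:

```
# Load the last 7 days of webrequest_sampled_live, drop everything older.
curl -s -X POST \
  'http://druid-coordinator.example.org:8081/druid/coordinator/v1/rules/webrequest_sampled_live' \
  -H 'Content-Type: application/json' \
  -d '[
        {"type": "loadByPeriod", "period": "P7D", "includeFuture": true,
         "tieredReplicants": {"_default_tier": 2}},
        {"type": "dropForever"}
      ]'
```

The gentle ramp-up mentioned at 13:32 would then just mean bumping the period (P4D, P5D, ...) in successive updates rather than jumping straight to P7D.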
[14:13:00] jayme: moving to security
[14:13:04] ack
[14:17:00] !incidents
[14:17:00] !incidents
[14:17:01] 3681 (ACKED) ProbeDown sre (10.2.2.28 ip4 parsoid-php:443 probes/service http_parsoid-php_ip4 eqiad)
[14:17:01] 3682 (UNACKED) VarnishUnavailable global sre (varnish-text)
[14:17:01] 3683 (UNACKED) HaproxyUnavailable cache_text global sre ()
[14:17:01] 3680 (RESOLVED) Primary outbound port utilisation over 80% (paged) global noc (cr2-eqsin.wikimedia.org)
[14:17:01] 3678 (RESOLVED) Host db2110 (paged) - PING - Packet loss = 100%
[14:17:02] 3679 (RESOLVED) Primary outbound port utilisation over 80% (paged) global noc (cr2-eqsin.wikimedia.org)
[14:17:02] 3681 (ACKED) ProbeDown sre (10.2.2.28 ip4 parsoid-php:443 probes/service http_parsoid-php_ip4 eqiad)
[14:17:02] 3682 (UNACKED) VarnishUnavailable global sre (varnish-text)
[14:17:02] 3683 (UNACKED) HaproxyUnavailable cache_text global sre ()
[14:17:03] 3680 (RESOLVED) Primary outbound port utilisation over 80% (paged) global noc (cr2-eqsin.wikimedia.org)
[14:17:03] 3678 (RESOLVED) Host db2110 (paged) - PING - Packet loss = 100%
[14:17:04] 3679 (RESOLVED) Primary outbound port utilisation over 80% (paged) global noc (cr2-eqsin.wikimedia.org)
[14:17:07] /o\
[14:17:07] those escalated
[14:17:15] !ack 3682
[14:17:15] 3682 (ACKED) VarnishUnavailable global sre (varnish-text)
[14:17:16] !ack 3682
[14:17:16] 3682 (ACKED) VarnishUnavailable global sre (varnish-text)
[14:17:21] haha akosiaris !
[14:17:22] lol
[14:17:23] great timing
[14:17:24] touch read
[14:17:26] red*
[14:17:35] I'll ack the rest
[14:17:38] ok thanks
[14:17:39] !ack 3683
[14:17:39] 3683 (ACKED) HaproxyUnavailable cache_text global sre ()
[14:17:41] I am backing off
[14:17:50] tis all good
[14:18:07] "touch red" is a local game kids play when they simultaneously say the same thing btw
[16:06:11] Amir1: Do you have a way of matching dbname to url that isn't jq-ing sitematrix?
[16:06:46] depends on how accurate you want it to be
[16:07:27] because we even have exceptions such as be-tarask.wikipedia.org being be_x_oldwiki
[16:07:53] i.e. no, you have to do jq. Sorry.
[16:08:00] Fair enough
[16:49:14] <_joe_> Amir1: I hoped something something mwscript would work :P
[16:50:16] it depends on how you want to tackle it, we could add something to multiversion, it does the routing and has the logic there
[16:50:25] or get it from mwscript
[16:50:29] `echo $wgServer | mwscript eval.php --wiki=$wiki`
[16:51:39] Lego's suggestion would work if you need it for one wiki only
[16:51:51] or you can make it loop through it
[16:54:06] I need to know the exact use case to be able to suggest something
[16:54:50] Honestly I went with a very dirty curl | jq | grep | cut
[16:55:29] But the use case is: we're redirecting to mw-on-k8s based on URL, and I want to migrate all closed wikis to it
[16:55:54] So I needed a way to do wikiname -> url
[17:17:53] <_joe_> yeah legoktm that's what I was thinking of
[17:18:07] <_joe_> it's the most reliable way to get things from a dblist
[17:18:17] <_joe_> given we also have foreachwikiindblist
[17:18:28] <_joe_> (which is a wrapper around mwscript)
[19:35:05] I guess we never turned on the svc domains for netbox->authdns includes?
[19:36:02] yeah appears not
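A hedged version of the curl | jq pipeline Amir describes at 16:54, mapping a dbname to its URL using only documented sitematrix fields; the dbname is the be_x_oldwiki example from 16:07, and the if/elif handles the fact that "specials" is a bare array of sites while the language groups wrap theirs in a .site key:

```
curl -s 'https://meta.wikimedia.org/w/api.php?action=sitematrix&format=json' |
  jq -r --arg db 'be_x_oldwiki' '
    .sitematrix | to_entries[] | .value
    | if type == "array" then .                  # "specials": a plain list of sites
      elif type == "object" then (.site // [])   # language groups: sites under .site
      else empty end                             # skips scalar fields such as "count"
    | .[] | select(.dbname == $db) | .url'
```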
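The mwscript route from 16:50 can be sketched as below, under the assumption that eval.php evaluates PHP statements read from stdin (which is what the quoted one-liner seems to intend); the closed.dblist filename is hypothetical:

```
# Single wiki:
echo 'echo $wgServer, "\n";' | mwscript eval.php --wiki=be_x_oldwiki

# Looping over every wiki in a dblist, as Amir suggests at 16:51:
while read -r wiki; do
  printf '%s\t' "$wiki"
  echo 'echo $wgServer, "\n";' | mwscript eval.php --wiki="$wiki"
done < closed.dblist
```

foreachwikiindblist, which _joe_ mentions at 17:18, wraps this kind of per-wiki loop around mwscript for any maintenance script.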