[07:30:57] o/ we're having issues with wdqs in codfw, still unclear what's the cause yet but could someone depool wdqs@codfw and route all traffic to wdqs@eqiad while we investigate what's going on? User impact is bot being throttled because mediawiki maxlag.
[07:58:08] dcausse: you want "confctl --object-type discovery select 'dnsdisc=wdqs,name=codfw' set/pooled=false" ?
[07:58:33] Emperor: yes I think that would be it
[07:58:52] dcausse: OK, is there a phab item?
[07:59:04] no sorry, filing one
[07:59:32] it doesn't matter if not, it's just if there is I can associate the depool with it in SAL
[08:00:22] (but if you're filing one I'll wait so I can do so :) )
[08:01:12] Emperor: T362508
[08:01:12] T362508: WDQS updater misbehaving in codfw - https://phabricator.wikimedia.org/T362508
[08:01:56] dcausse: {{done}}
[08:02:09] Emperor: thanks! <3
[08:02:46] NP
[08:04:18] Is there a known issue that would cause MWAPI requests about ruwiki to timeout? We're seeing timeouts from LW, but only for ruwiki requests (triggered by the ORES extension querying revscoring-editquality-damaging)
[08:53:22] ^^^ this was a wedged service on our side. not root-caused quite yet, but it's always DNS, isn't it.
[09:21:00] klausman: (Not sure if related but) we've had issues in the past with ruwiki specifically because of too many flaggedrevs of articles not existing in parsercache that caused increase in latency or timeouts when we asked for parsed content
[09:22:30] ah, thanks for the info. I'm pretty sure this was different since a simple restart on our side fixed it, but that's good to know.
[13:07:09] not sure if your schedule is right, but rzl, so far so good today
[13:07:42] yesterday there were database issues
[13:20:09] I guess technically, network issue?
[14:00:54] hmm I'm having some sort of split brain issues in gitlab.wm.o if that's even possible
[14:04:49] or just a brain fart on my side :)
[14:06:52] vgutierrez: did you solve the issue or can I help in some way?
[14:07:05] solved :D
[14:07:10] great, then!
[14:07:22] cdanis: see my non-notes above
[14:10:00] thanks jynus
[14:28:26] jynus: oops, yeah I just forgot to update the schedule over the weekend, thanks
[14:29:09] yeah, this is true for many people (including myself) on mondays
[14:29:43] that's why I didn't assume it was your work start
[15:11:36] elukey we're having some issues w/WDQS updater in CODFW. We wanted to rule out Kafka (esp mirrormaker), any suggestions? I've just been looking at the grafana dashboard for MM
[15:12:33] inflatador: any specific issue in WDQS that points to kafka? In theory you can check the per-topic dashboard to see if anything changed in the traffic patterns of the topics used by WDQS
[15:14:31] elukey no evidence of kafka misbehaving, we're still struggling to find a root cause. Will take a look at the topic dashboard
[15:15:24] ack!
[15:38:53] Not sure if others have seen this, but: http://cheat.sh/
[15:39:12] (curl-able cheat sheets)
[15:43:02] elukey: `{"msg":"Error connecting to Cassandra: gocql: unable to create session: unable to discover protocol version: x509: certificate signed by unknown authority","appname":"sessionstore","time":"2024-04-15T15:42:16Z","level":"FATAL"}`
[15:54:25] urandom nice, thanks for sharing!
[15:54:36] just don't curl|sudo ;)
[15:58:47] everyone's favorite install instructions ;)
[16:49:22] that HW discussion got me thinking about OCP again (not the one from Robocop). i guess it's still a thing: https://www.opencompute.org/
[17:18:47] jynus: Feel free to merge my puppet change when you're ready (yours looks more significant and perhaps needing timing)
[17:19:36] mine is ok, if you are just updating ncredir, we can go