[00:27:54] Thanks ebernhardson , will update
[07:53:22] ryankemper, inflatador: I left a Quick Q on your access request task
[08:02:35] gehel: the task says analytics-privatedata-users is wanted for shell not ops
[08:04:07] Oh, that might indeed be a mistake. Our onboarding notes for the team specify analytics-privatedata as an additional group for all team members, but that's probably not required for an SRE, who will already have access.
[08:04:22] We don't onboard SREs often enough in our team!
[08:06:38] gehel: might be worth checking with Otto
[08:06:49] ottomata: mind clarifying ^
[08:07:04] I assume it's ops shell + analytics kerbos
[08:07:08] RhinosF1: will do (cc ryankemper)
[08:10:18] for what it's worth, the analytics-private-data group contains quite a few people who are also ops
[08:11:19] It's new to me too :)
[08:11:32] I'll let ryankemper sort this out with the relevant people
[08:11:39] Just wanted to make sure he gets the right access
[08:12:39] and digging a bit more into it, we have an `analytics-search` group that is a member of `analytics-privatedata-users`. So in our case, the more specific group probably makes more sense.
[08:12:45] I'll add a note on the ticket
[08:18:40] gehel: I updated the task description to reflect that too and added a link to your comment
[08:18:53] RhinosF1: thanks!
[08:23:58] time to get my (hopefully) last covid shot. Back later
[08:26:33] Good luck
[08:31:30] is there a difference between advanced search and special search?
[09:01:19] wcqs import eta: 1172 out of 1435
[09:07:45] going to try to head to town to find a dentist. be back later
[09:10:44] mpham: I think advanced search allows more filters
[09:25:14] dcausse: about advanced/special search, you probably know
[09:25:52] oops did not see the question
[09:26:38] advanced search is an extension to Special:Search that adds more filters to help search users discover advanced search features
[11:15:17] Lunch
[12:10:48] lunch
[12:20:20] errand
[13:46:48] Good day!
[13:55:04] o/
[14:13:53] o/
[15:41:48] * inflatador watches 'vagrant up' spew messages across my terminal
[15:52:34] quick workout, back in ~30
[16:06:08] \o
[16:06:30] o/
[16:18:42] dcausse: where would i find the code that writes lastModified tripples? I was trying to verify what the tripple will look like for wcqs (probably httpS://commons.wikimedia.org?)
[16:18:59] * ebernhardson has no clue why triple got so many p's
[16:19:22] it's the updater lemme check
[16:20:35] and then relatedly, i'll have a patch up today that adds a lag endpoint to nginx from puppet, right now the wcqs lag check is being caught in auth
[16:21:18] so it's here: https://gerrit.wikimedia.org/r/plugins/gitiles/wikidata/query/rdf/+/refs/heads/master/tools/src/main/java/org/wikidata/query/rdf/tool/rdf/RdfRepositoryUpdater.java#109
[16:21:37] oh oauth is capturing everything that goes to /sparql?
[16:21:45] so all updates will be captured
[16:22:08] dcausse: oauth catches everything, period :)
[16:22:13] ah :)
[16:22:13] there is one escape for /readiness-probe
[16:22:43] even if we access :9999 (bypass nginx)?
[16:23:11] we can access :9999, but then we have to understand aliases.map. I suppose maybe that isn't important here
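(A minimal sketch of the lastModified lookup discussed above, for context. It assumes the WCQS subject URI really is https://commons.wikimedia.org as guessed at 16:18, and the endpoint URL and namespace below are placeholders rather than confirmed values.)

```python
# Sketch only: read the schema:dateModified triple the updater writes and
# derive a lag value from it. The subject URI for WCQS is the guess from the
# conversation above; the endpoint URL/namespace is an assumption.
import datetime
import requests

SPARQL_ENDPOINT = "http://localhost:9999/bigdata/namespace/wcqs/sparql"  # assumed local path
QUERY = """
PREFIX schema: <http://schema.org/>
SELECT ?lastModified WHERE {
  <https://commons.wikimedia.org> schema:dateModified ?lastModified .
}
"""

def update_lag_seconds() -> float:
    resp = requests.get(
        SPARQL_ENDPOINT,
        params={"query": QUERY},
        headers={"Accept": "application/sparql-results+json"},
        timeout=10,
    )
    resp.raise_for_status()
    bindings = resp.json()["results"]["bindings"]
    last_modified = datetime.datetime.fromisoformat(
        bindings[0]["lastModified"]["value"].replace("Z", "+00:00")
    )
    return (datetime.datetime.now(datetime.timezone.utc) - last_modified).total_seconds()

if __name__ == "__main__":
    print(f"lastModified lag: {update_lag_seconds():.0f}s")
```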
[16:23:26] but the patch that switched prometheus from querying :9999 to querying :80 said it was to be able to use that map
[16:23:44] ah ok I see
[16:24:15] i suppose it probably collects metrics from other endpoints, i guess could split it and only pull this from 9999
[16:24:27] was worried about updates but they're most probably hitting :9999
[16:25:13] lag is also available on another metric (produced by the updater)
[16:25:30] hmm, do we need per-instance lag metrics still?
[16:25:48] i suppose it might help notice an instance falling behind still
[16:26:02] yes I suppose for instances being hit by deadly queries
[16:26:09] ahh, ya
[16:26:41] so wdqs_streaming_updater_kafka_stream_consumer_lag_Value might have the same value as the triple
[16:26:50] I'm fine switching to this
[16:27:04] and back
[16:27:06] oh, of course that's still per instance
[16:27:17] but we might have the same problem for the number of triples
[16:27:27] yea that should be just fine. We probably need a ticket to follow up other places that reference lag?
[16:27:37] * ebernhardson doesn't know where all it's used, just that it is :P
[16:28:02] might be a good opportunity to switch to alertmanager for these ones
[16:28:28] I think it's mostly used in grafana and the icinga lag alert
[16:28:29] i don't follow on the problem re number of triples?
[16:28:40] how does wikidata itself decide to stop taking edits based on lag?
[16:28:43] the number of triples is obtained by querying blazegraph too
[16:28:52] and will commons have same i suppose
[16:29:04] oh right this must be changed too
[16:29:20] I have no clue if commons can have the same :/
[16:29:24] :)
[16:29:59] hmm, for obtaining the number of triples it will have to query through :9999. If more things need to bypass auth we should probably think about opening port localhost:81 or some such
[16:30:19] ok
[16:30:23] render a copy of the nginx config without oauth for a port that's only accessible to localhost
[17:15:36] the technical aspects of this are over the top: https://googleprojectzero.blogspot.com/2021/12/a-deep-dive-into-nso-zero-click.html .
[17:15:38] "JBIG2 doesn't have scripting capabilities, but when combined with a vulnerability, it does have the ability to emulate circuits of arbitrary logic gates operating on arbitrary memory. So why not just use that to build your own computer architecture and script that!?"
[17:38:10] Is there an official position on whether or not to use a VPN (not the wikipedia VPN)? Typically I have it on all the time, but I haven't been using it during work
[17:38:33] ebernhardson I saw that one yesterday, pretty insane
[18:12:04] inflatador: i'm not aware of any particular position, generally we are permissive when it comes to how you work. If you prefer to use a VPN i doubt anyone would have issue
[18:20:54] Thanks ebernhardson . Will use VPN until someone tells me not to ;)
[19:25:21] when we graph blazegraph_lastupdated next to the consumer lag metric an interesting thing happens. Sometimes the lines are close together, sometimes they are as much as 30s apart, and this changes when refreshing. Since the old metric is `time() - last_updated` it increases over the 60s between metric updates and provides the extra time :)
[19:26:05] i suspect using the consumer lag will be more accurate in that case as well
[20:37:17] late lunch
[23:20:14] OK all, have a great weekend/holiday/rest of your year.
[23:28:04] will do :)
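(A rough illustration of the jitter described at 19:25, with invented numbers: a lag derived at query time as `time() - last_updated` keeps climbing for the whole interval between metric updates, while a lag exported directly by the updater stays flat. Nothing here is taken from the real dashboards.)

```python
# Toy example: compare the old derived lag (time() - last_updated, where the
# timestamp is only refreshed every 60s) against a lag value the updater
# exports directly. The derived value drifts upward by up to a full interval.
METRIC_UPDATE_INTERVAL = 60  # seconds between refreshes of last_updated (assumed)
TRUE_LAG = 5.0               # pretend the pipeline is really 5s behind

last_updated = 0.0
for now in range(0, 121, 15):
    if now % METRIC_UPDATE_INTERVAL == 0:
        last_updated = now - TRUE_LAG      # timestamp only refreshed once per interval
    derived = now - last_updated           # old style: time() - last_updated
    print(f"t={now:>3}s  derived lag={derived:>4.0f}s  exported lag={TRUE_LAG:.0f}s")
```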