[09:56:13] Krinkle: I tweaked https://grafana.wikimedia.org/d/lqE4lcGWz/wanobjectcache-key-group?orgId=1 to use sample_rate & count for the hit/miss metrics; my understanding is that after switching to a timing metric it should use those rather than rate & sum
[09:57:01] the general wanobjectcache dashboard might need some changes too, but I can't seem to load this one in firefox
[09:57:22] but please revert if I'm wrong
[10:15:51] lunch
[13:08:28] o/
[13:18:30] cleaning up some hourly dags that are handled by the SUP; wondering what to do with mw_sql_to_hive.py in discolytics. I vaguely remember it was connected to ores but I can't seem to find where it's used now
[13:21:17] dcausse: https://wikitech.wikimedia.org/wiki/Graphite#Extended_properties
[13:22:08] reviewing now :)
[13:22:38] thx!
[13:26:08] \o
[13:26:15] reading this doc, Timer.count seems discouraged though, and I used it
[13:26:17] o/
[13:27:48] dcausse: git history says mw_sql_to_hive was for ores propagation via wbitem, but we dropped it when switching to the outlink topic model
[13:30:14] dcausse: yeah, if you plot Timer.count and show the last 2 days you might see 2K/min, and then when you zoom out to the last 30 days the same recent days will report as 14K/min or something like that. It's tied to the resolution window and summed during aggregation instead of averaged. The shape is the same, but it means there is no unit you can choose; it will always be wrong at the majority of zoom levels and time ranges. Even if you zoom in to 24 hours, the number depends on whether it's the last 24h vs some 24-hour period in the past.
[13:30:45] for the percentage plot, that seems to work fine though.
[13:30:57] you're only comparing it against other .counts, not plotting the number itself
[13:31:31] Count.sum has the same behaviour
[13:31:43] and similarly is rarely what you want to plot, but LGTM
[13:31:55] These are basically all written on the assumption that they are counters.
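[Editor's note: a small plain-Python sketch of the Timer.count pitfall Krinkle describes above; this is an illustration of the rollup behaviour, not Graphite's actual code, and the numbers are hypothetical.]

```python
# Graphite stores one count per resolution window; when you zoom out,
# consecutive windows get rolled up into coarser points. If the rollup
# sums the windows, steady traffic of 2K/min appears as 20K "per point"
# at 10-minute resolution; averaging keeps the value stable across zooms.

def rollup(series, factor, agg):
    """Collapse each run of `factor` consecutive windows into one value."""
    return [agg(series[i:i + factor]) for i in range(0, len(series), factor)]

# 60 one-minute windows at a steady 2000 events/minute.
per_minute = [2000] * 60

summed = rollup(per_minute, 10, sum)                          # reads as "20K/min"
averaged = rollup(per_minute, 10, lambda w: sum(w) / len(w))  # stays "2K/min"

print(summed[0])    # 20000
print(averaged[0])  # 2000.0
```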
[13:32:31] I guess we changed that at some point. It's so easy to get fooled
[13:33:14] Krinkle: definitely... this got changed in https://gerrit.wikimedia.org/r/c/mediawiki/core/+/617291
[13:34:28] I'm viewing a few different popular core metrics with an unsaved revert of your change. The order of magnitude and the percentage stay the same for a lot of them.
[13:34:39] 200K -> 300K. 98% -> 97%
[13:34:48] but yeah, it's nonsensical for what we want
[13:35:08] but for yours, it's a different story.
[13:35:09] yes, I think for the cirrus one it got massively different because the timing of "compute" is high
[13:35:27] how did you get to this?
[13:36:01] Krinkle: when you first warned us about the poor hit rate of this key, Erik filed T370796
[13:36:02] T370796: Improve cache hit rate of CirrusSearchParserOutputPageProperties during cirrusbuilddoc - https://phabricator.wikimedia.org/T370796
[13:36:54] then I looked at the numbers and realized they made no sense; they would amount to 10M parses/minute
[13:39:04] ebernhardson: thanks! do you see a reason to keep this script around?
[13:40:06] dcausse: nice, so these changes are for the first 2 rows only, right? The diff doesn't really say, but I think that's it. The 2 rate charts and the pie charts to the side
[13:40:55] Krinkle: and the regen rate breakdown IIRC
[13:41:11] dcausse: i don't see anything that needs it today. Poking the old analytics repo, it looks like this is the only thing it was used for. Can probably drop it (the analytics team has a different method to accomplish the same; we used this because we could configure it ourselves)
[13:41:28] sure
[13:41:39] could always be brought back out of git history if we really needed it
[13:41:45] ok
[14:21:43] dcausse: fixed the main wancache dash as well now
[14:21:52] Krinkle: thanks!
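[Editor's note: a quick sketch of why the hit-rate percentage panels discussed earlier survive the sum-on-rollup behaviour: both the hit and miss .count series get inflated by the same factor, so the ratio cancels it out. Hypothetical numbers, not real dashboard data.]

```python
# Hit rate as a fraction of all lookups. Scaling hits and misses by the
# same rollup factor k leaves the ratio unchanged: (kh)/(kh+km) == h/(h+m),
# which is why comparing .count against other .counts still works.

def hit_rate(hits, misses):
    """Fraction of cache lookups that were hits."""
    return hits / (hits + misses)

h, m, k = 2000, 40, 10  # k = rollup factor applied to both series
print(hit_rate(h, m) == hit_rate(k * h, k * m))  # True
```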
[14:24:04] dcausse: https://grafana-rw.wikimedia.org/d/lqE4lcGWz/wanobjectcache-key-group?orgId=1&var-kClass=CirrusSearchParserOutputPageProperties&from=now-1y&to=now
[14:24:10] in case you hadn't zoomed out yet.
[14:24:22] fortunately the change works retroactively!
[14:24:26] so we have the data
[14:27:21] thanks!
[14:40:39] can't remember how to run tests on discolytics; following the README I'm seeing "ModuleNotFoundError: No module named 'setuptools'"
[14:40:54] but pip install says it's already there
[14:40:56] dcausse: hmm, setuptools is part of the base python stuff
[14:41:05] something odd :S
[14:41:29] I'm building the image and running it
[14:42:08] "which python" says /opt/conda/bin/python, so that seems correct
[14:42:53] i wonder if conda does something; setuptools also does package installation and such
[14:43:11] it's the thing that handles setup.py and building eggs and such
[14:44:14] dcausse: and the conda env is activated?
[14:45:18] ebernhardson: I think it is, setup.sh seems to run and it has conda activate
[14:47:10] dcausse: and invoking tests via `tox` or `tox -e pytest`?
[14:47:54] oh, actually i guess i have a docker image for running it
[14:48:11] ran it without args
[14:49:11] looks like i run via: docker run -it --rm -u 0:0 --name conda_env -v $PWD:/srv/app:rw -e XDG_CACHE_HOME=/srv/app/.cache --entrypoint .pipeline/entrypoint.sh discolytics-tox:latest
[14:49:27] but double checking if that image actually exists or is some local hack
[14:49:55] looks like it should be from blubber
[14:50:24] yes, from the README: DOCKER_BUILDKIT=1 docker build --target tox --tag discolytics-tox:latest -f .pipeline/blubber.yaml .
[14:50:47] yea, that looks like what i see in my history
[14:51:51] passing -e pytest is the same: (docker run --rm discolytics-tox:latest -e pytest) ends up running python -m tox -e pytest but fails on setuptools
[14:52:13] trying a fresh build of the image, to see if it does anything different
[14:53:02] fails here too :S
[14:53:14] my old image was fine, so it's something in there
[14:53:21] (i ran the tests also before rebuilding the image)
[14:53:40] i can look closer today; doesn't seem like it's going to be solved in 7 minutes :)
[14:53:55] i'm working on a discolytics patch anyways
[14:55:22] thanks!
[14:55:47] for extra funsies, loading the bash shell and running python3 directly finds setuptools just fine :P
[14:56:41] ah, removing the version constraint in setup.sh seems to do it
[14:56:50] for tox
[14:56:51] the tox<4?
[14:56:53] yes
[14:56:59] curious, i don't remember why i had that there
[14:57:39] ouch, getting other failures now
[14:58:05] ImportError: numpy.core.multiarray failed to import
[14:58:07] A module that was compiled using NumPy 1.x cannot be run in
[14:58:09] NumPy 2.0.1 as it may crash.
[14:59:24] might be pyarrow
[14:59:44] yea, was trying to think what we do with numpy; it should mostly be transitive via pandas, pyspark or pyarrow
[15:00:12] pyspark should be leading there i suppose, specifying compatible versions of pyarrow and pandas
[16:28:19] dinner
[18:33:58] since we met with Jan during the Wednesday meeting, I've reduced the number of people on the similar meeting this Friday.
[19:15:37] back
[20:34:44] dcausse: the answer i came up with was: https://gitlab.wikimedia.org/repos/search-platform/discolytics/-/merge_requests/44/diffs#61be067c7cf3bdbf8a6b021a2b5167eb30612d0c_1_1
[20:34:55] but i'm not sure how to stack MRs, so it's mixed into my other MR
[20:52:04] is it even possible to stack MRs with gitlab?
[20:58:31] they are working on it; i saw some link to their gitlab instance showing they were working on it.
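[Editor's note: circling back to the NumPy ImportError above, a tiny hypothetical guard; the function name and placement are mine, not from the discolytics repo. It only inspects a version string so the mismatch is reported up front instead of crashing later inside pyarrow/pandas; the actual fix discussed is letting pyspark drive compatible pyarrow/pandas pins.]

```python
# Hypothetical helper: detect the NumPy 1.x/2.x ABI mismatch early instead of
# failing later with "ImportError: numpy.core.multiarray failed to import"
# when extensions built against the 1.x C ABI are loaded under NumPy 2.x.

def numpy2_abi_mismatch(installed_version: str, built_against_major: int = 1) -> bool:
    """True when the installed NumPy major version exceeds the major
    version the compiled extensions (e.g. an older pyarrow wheel) target."""
    installed_major = int(installed_version.split(".")[0])
    return installed_major > built_against_major

print(numpy2_abi_mismatch("2.0.1"))   # True: 1.x-built wheels may crash
print(numpy2_abi_mismatch("1.26.4"))  # False: matches the 1.x ABI
```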
I wasn't too inspired by what they put together so far, but there is always room for improvement
[20:58:36] i don't think we have that on our instance yet though
[21:00:19] that'd be cool. I have an old coworker who's now at Gitlab, I should look him up
[21:06:49] if they get it right; imo the current version is still far too focused on MRs instead of individual commits. Can't interact with stacked MRs with normal git tooling. But that may change as they get further along; basically they said they were trying to avoid changing the backend until they are more decided on how exactly it will work
[21:21:57] curiously, the audienceCan() error messages had already declined significantly prior to deploying the fix