[02:59:07] Data-Engineering, Product-Analytics, wmfdata-python: Decide whether and how to consolidate Wmfdata-Python and Refinery's Python modules - https://phabricator.wikimedia.org/T293700 (nshahquinn-wmf)
[03:42:41] Data-Engineering, Product-Analytics, wmfdata-python: Decide whether and how to consolidate Wmfdata-Python and Refinery's Python modules - https://phabricator.wikimedia.org/T293700 (Ottomata) More related stuff: In {T296543} are working on some [[ https://gitlab.wikimedia.org/repos/data-engineering/wo...
[08:27:30] is there any way to tell the split of people using WikiEditor vs not on a specific wiki?
[08:27:48] (and, specifically, also not using VisualEditor)
[11:51:24] inductiveload: many editors apply tags to edits made via them. For example, `visualeditor-wikitext` is a tag added to all edits made via the 2017 wikitext editor
[11:52:27] there are a small number of WS people not using any editor toolbar at all, just the 2006 (?) toolbar
[11:52:47] just wondering if it's possible to figure out how many that actually is
[11:53:51] inductiveload: afaik the 2006 editor was removed
[11:54:17] or maybe it's the 2003 one with a fake toolbar then
[11:54:27] (HiveServerHeapUsage) firing: Hive Server JVM Heap usage is above 80% on an-test-coord1001:10100 - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hive/Alerts#Hive_Server_Heap_Usage - https://grafana.wikimedia.org/d/000000379/hive?panelId=7&fullscreen&orgId=1&var-instance=an-test-coord1001:10100 - https://alerts.wikimedia.org
[11:54:30] some wikis have gadgets or user scripts to bring it back
[11:54:46] then there might be ways to see number of people using those, but it highly depends on how that's actually done
[11:54:54] yeah, enWS has a gadget for extra buttons, 2 people use that
[11:55:08] according to the counter thingy
[11:55:15] should be reliable :-)
[11:55:35] but i don't think there are any official ways to see counts for the very old editors
[11:55:42] i wasn't sure if it was possible to find people just not using anything at all
[11:55:43] ok
[11:56:02] are there counters for people who have usebetatoolbar turned off?
[11:56:03] how would we define a user who is "not using anything at all"?
[11:56:22] (there are certainly a lot of users who don't edit at all, but i don't think that's what you asked)
[11:56:37] no WikiEditor, no VisualEditor, no gadget
[11:58:05] and how would that person edit?
[11:58:43] just in the text box
[11:58:49] i see
[11:58:49] you just don't get a toolbar
[11:59:33] i have no idea if anyone is doing that, hence, you know, the question
[11:59:36] since recently, all edits made via wikieditor appear to be tagged with `wikieditor`
[11:59:53] so maybe look at non-bot edits that have no tag at all?
[12:00:03] is that not basically all edits?
[12:00:41] oh it's hidden
[12:00:44] makes sense
[12:00:44] yeah
[12:00:49] otherwise it'd be too distracting
[12:01:10] such a report will have API edits, but if you filter out bots, it might be usable?
[12:01:12] right well that's my answer then
[12:01:20] thanks
[12:01:26] i suggest not to rely on it too much though
[12:01:43] i'm just curious how many people are doing that (might be zero)
[12:01:49] it's trying to answer a question we _don't_ have data for by using data we have
[12:01:56] but for curiosity, it should be sufficient :).
[12:02:13] (I'm not sure exactly since when the wikieditor tag has been applied, but i can look that up)
[12:02:38] i wasn't sure if usebetatoolbar user setting is tracked in aggregate or something
[12:03:01] it's definitely tracked somewhere, but i don't know if any data is released about it
[12:05:23] inductiveload: for context, https://phabricator.wikimedia.org/T249038 is where the wikieditor tag was added. Might be helpful.
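(Editor's sketch of the "non-bot edits that have no tag at all" idea from the exchange above. It is only an illustration: the record shape, with `tags` and `bot` fields, and the function name `untagged_human_edits` are assumptions made here, not something the channel specified, and the sample records are made up. As noted in the discussion, API edits would also end up in this bucket, and the `wikieditor` tag is a recent addition, so the result is a rough signal at best.)

```python
from typing import Iterable


def untagged_human_edits(edits: Iterable[dict]) -> list[dict]:
    """Keep edits that carry no change tags and are not flagged as bot edits.

    Each edit is assumed (hypothetically) to be a dict with a "tags" list
    and a boolean "bot" flag.
    """
    return [
        edit
        for edit in edits
        if not edit.get("tags") and not edit.get("bot", False)
    ]


# Made-up sample records, for illustration only.
sample = [
    {"revid": 1, "user": "A", "bot": False, "tags": ["wikieditor"]},
    {"revid": 2, "user": "B", "bot": False, "tags": []},
    {"revid": 3, "user": "C", "bot": True, "tags": []},
    {"revid": 4, "user": "D", "bot": False, "tags": ["visualeditor-wikitext"]},
]

candidates = untagged_human_edits(sample)
print([edit["revid"] for edit in candidates])
```

(Counting distinct users among the remaining edits would give an upper bound on the people editing in the bare text box, since edits from before the tag existed and API edits are indistinguishable here.)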
[12:06:25] huh that is new
[12:06:36] yep, very recent addition
[12:06:42] i did wonder why there were only 10k changes with it on
[12:12:33] inductiveload: let me/the world know if you find anything interesting ;)
[12:14:22] will do
[12:14:27] (HiveServerHeapUsage) resolved: Hive Server JVM Heap usage is above 80% on an-test-coord1001:10100 - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hive/Alerts#Hive_Server_Heap_Usage - https://grafana.wikimedia.org/d/000000379/hive?panelId=7&fullscreen&orgId=1&var-instance=an-test-coord1001:10100 - https://alerts.wikimedia.org
[13:57:27] (HiveServerHeapUsage) firing: Hive Server JVM Heap usage is above 80% on an-test-coord1001:10100 - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hive/Alerts#Hive_Server_Heap_Usage - https://grafana.wikimedia.org/d/000000379/hive?panelId=7&fullscreen&orgId=1&var-instance=an-test-coord1001:10100 - https://alerts.wikimedia.org
[14:12:27] (HiveServerHeapUsage) resolved: Hive Server JVM Heap usage is above 80% on an-test-coord1001:10100 - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hive/Alerts#Hive_Server_Heap_Usage - https://grafana.wikimedia.org/d/000000379/hive?panelId=7&fullscreen&orgId=1&var-instance=an-test-coord1001:10100 - https://alerts.wikimedia.org
[14:45:26] (PS1) Mforns: Add the rest of the anomaly detection queries in hql/ [analytics/refinery] - https://gerrit.wikimedia.org/r/749541 (https://phabricator.wikimedia.org/T295201)
[14:47:47] (CR) Mforns: [V: +2 C: +2] "Self-merging, given this is a no-op, and necessary for enabling the remaining anomaly detection Airflow jobs. Both queries did already exi" [analytics/refinery] - https://gerrit.wikimedia.org/r/749541 (https://phabricator.wikimedia.org/T295201) (owner: Mforns)
[14:54:32] !log starting refinery deployment for anomaly detection queries
[14:54:34] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[15:55:57] !log finished refinery deployment for anomaly detection queries
[15:55:59] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[15:58:27] (HiveServerHeapUsage) firing: Hive Server JVM Heap usage is above 80% on an-test-coord1001:10100 - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hive/Alerts#Hive_Server_Heap_Usage - https://grafana.wikimedia.org/d/000000379/hive?panelId=7&fullscreen&orgId=1&var-instance=an-test-coord1001:10100 - https://alerts.wikimedia.org
[16:13:27] (HiveServerHeapUsage) resolved: Hive Server JVM Heap usage is above 80% on an-test-coord1001:10100 - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hive/Alerts#Hive_Server_Heap_Usage - https://grafana.wikimedia.org/d/000000379/hive?panelId=7&fullscreen&orgId=1&var-instance=an-test-coord1001:10100 - https://alerts.wikimedia.org
[18:00:27] (HiveServerHeapUsage) firing: Hive Server JVM Heap usage is above 80% on an-test-coord1001:10100 - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hive/Alerts#Hive_Server_Heap_Usage - https://grafana.wikimedia.org/d/000000379/hive?panelId=7&fullscreen&orgId=1&var-instance=an-test-coord1001:10100 - https://alerts.wikimedia.org
[18:10:27] (HiveServerHeapUsage) resolved: Hive Server JVM Heap usage is above 80% on an-test-coord1001:10100 - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hive/Alerts#Hive_Server_Heap_Usage - https://grafana.wikimedia.org/d/000000379/hive?panelId=7&fullscreen&orgId=1&var-instance=an-test-coord1001:10100 - https://alerts.wikimedia.org
[19:02:21] PROBLEM - Check unit status of reportupdater-reference-previews on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit reportupdater-reference-previews https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[19:03:19] PROBLEM - Check unit status of eventlogging_to_druid_navigationtiming_hourly on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit eventlogging_to_druid_navigationtiming_hourly https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[19:04:29] milimetric: I think we got problemsssss
[19:04:48] oh, cave?
[19:04:50] an-launcher1002.eqiad.wmnet's / partition is full 100%
[19:04:52] yes
[19:04:54] oh! I see
[19:05:17] PROBLEM - Check unit status of reportupdater-wmcs on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit reportupdater-wmcs https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[19:05:47] PROBLEM - Check unit status of eventlogging_to_druid_netflow_hourly on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit eventlogging_to_druid_netflow_hourly https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[19:07:33] PROBLEM - Check unit status of eventlogging_to_druid_editattemptstep_hourly on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit eventlogging_to_druid_editattemptstep_hourly https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[19:08:53] PROBLEM - Check unit status of reportupdater-templatedata on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit reportupdater-templatedata https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[19:10:49] PROBLEM - Check unit status of reportupdater-templatewizard on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit reportupdater-templatewizard https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[19:10:57] PROBLEM - Check unit status of reportupdater-browser on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit reportupdater-browser https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[19:12:15] PROBLEM - Check unit status of eventlogging_to_druid_prefupdate_hourly on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit eventlogging_to_druid_prefupdate_hourly https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[19:12:43] PROBLEM - Check unit status of reportupdater-visualeditor on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit reportupdater-visualeditor https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[19:12:53] !log Marcel and I are deleting files from /tmp older than 60 days
[19:12:55] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[19:13:20] !log Additional context on the last delete message: on an-launcher1002 which is filled up
[19:13:22] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[19:14:57] PROBLEM - Check unit status of refine_event_sanitized_main_immediate on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit refine_event_sanitized_main_immediate https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[19:14:57] PROBLEM - Check unit status of refine_event_sanitized_analytics_immediate on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit refine_event_sanitized_analytics_immediate https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[19:18:18] mforns,milimetric o/
[19:18:25] do you need any help?
[19:18:53] there are 14G in /home
[19:19:15] 2.3G mforns
[19:19:16] 2.4G joal
[19:19:16] 8.2G milimetric
[19:19:17] :P
[19:19:59] and 7.1G of refinery logs
[19:20:13] thanks elukey no worries, we got it. We cleaned 5G from /tmp and are looking at the rest now
[19:21:26] ack
[19:21:40] if you clean up the /home dirs it should be fine
[19:24:36] yep, we got 15G clean so far, should be good
[19:24:40] sorry to bother you!
[19:25:47] nono on the contrary, I was passing by and wanted to help :)
[19:26:30] k :) we don't have permissions to see home dirs actually, so that was really helpful, thanks
[19:39:03] RECOVERY - Check unit status of reportupdater-wmcs on an-launcher1002 is OK: OK: Status of the systemd unit reportupdater-wmcs https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[19:39:35] RECOVERY - Check unit status of eventlogging_to_druid_netflow_hourly on an-launcher1002 is OK: OK: Status of the systemd unit eventlogging_to_druid_netflow_hourly https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[19:41:23] RECOVERY - Check unit status of eventlogging_to_druid_editattemptstep_hourly on an-launcher1002 is OK: OK: Status of the systemd unit eventlogging_to_druid_editattemptstep_hourly https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[19:42:39] RECOVERY - Check unit status of reportupdater-templatedata on an-launcher1002 is OK: OK: Status of the systemd unit reportupdater-templatedata https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[19:44:33] RECOVERY - Check unit status of reportupdater-templatewizard on an-launcher1002 is OK: OK: Status of the systemd unit reportupdater-templatewizard https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[19:44:43] RECOVERY - Check unit status of reportupdater-browser on an-launcher1002 is OK: OK: Status of the systemd unit reportupdater-browser https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[19:45:59] RECOVERY - Check unit status of eventlogging_to_druid_prefupdate_hourly on an-launcher1002 is OK: OK: Status of the systemd unit eventlogging_to_druid_prefupdate_hourly https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[19:46:23] RECOVERY - Check unit status of reportupdater-visualeditor on an-launcher1002 is OK: OK: Status of the systemd unit reportupdater-visualeditor https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[19:47:13] RECOVERY - Check unit status of reportupdater-reference-previews on an-launcher1002 is OK: OK: Status of the systemd unit reportupdater-reference-previews https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[19:48:13] RECOVERY - Check unit status of eventlogging_to_druid_navigationtiming_hourly on an-launcher1002 is OK: OK: Status of the systemd unit eventlogging_to_druid_navigationtiming_hourly https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[19:48:35] RECOVERY - Check unit status of refine_event_sanitized_main_immediate on an-launcher1002 is OK: OK: Status of the systemd unit refine_event_sanitized_main_immediate https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[19:48:35] RECOVERY - Check unit status of refine_event_sanitized_analytics_immediate on an-launcher1002 is OK: OK: Status of the systemd unit refine_event_sanitized_analytics_immediate https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[19:50:14] we currently suspect that when we call Spark jobs from Airflow, something copies the refinery-job jar file to something like /tmp/{guid}_resources. This takes up 120 MB every time, and these jobs run frequently. So we estimate we have 5 days' worth of space with what we cleared up. We'll have to figure it out today or tomorrow or turn off the jobs.
[19:57:02] I turned off the jobs. The SparkSQL job was using deploy-mode=client and adding a JAR (hive udf), so the jar was being copied from HDFS to local at every run of the task...
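(Editor's sketch of how one might check for the jar copies discussed here before removing anything. The glob `/tmp/*_resources/refinery-job*.jar` is the pattern quoted in this log; the function names, the dry-run default, and the idea of doing this from Python at all are assumptions for illustration, not what was actually run on an-launcher1002.)

```python
import glob
import os


def find_copied_jars(tmp_root: str = "/tmp") -> list[str]:
    """List refinery-job jar copies under <tmp_root>/*_resources/."""
    pattern = os.path.join(tmp_root, "*_resources", "refinery-job*.jar")
    return sorted(glob.glob(pattern))


def total_size_bytes(paths: list[str]) -> int:
    """Sum the on-disk size of the given files."""
    return sum(os.path.getsize(path) for path in paths)


def remove_jars(paths: list[str], dry_run: bool = True) -> list[str]:
    """Delete the given files unless dry_run is set; return the paths handled."""
    handled = []
    for path in paths:
        if not dry_run:
            os.remove(path)
        handled.append(path)
    return handled


if __name__ == "__main__":
    jars = find_copied_jars()
    size_gb = total_size_bytes(jars) / 1e9
    print(f"{len(jars)} jar copies using {size_gb:.2f} GB")
    # Dry run by default: nothing is deleted until dry_run=False is passed.
    remove_jars(jars, dry_run=True)
```

(Reporting the count and total size first makes it easy to compare against the roughly 120 MB per run mentioned above before deleting anything.)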
[19:57:34] I deleted all the /tmp/*_resources/refinery-job*.jar files, and that freed up 20G more. So we're good for now
[20:01:27] (HiveServerHeapUsage) firing: Hive Server JVM Heap usage is above 80% on an-test-coord1001:10100 - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hive/Alerts#Hive_Server_Heap_Usage - https://grafana.wikimedia.org/d/000000379/hive?panelId=7&fullscreen&orgId=1&var-instance=an-test-coord1001:10100 - https://alerts.wikimedia.org
[20:11:27] (HiveServerHeapUsage) resolved: Hive Server JVM Heap usage is above 80% on an-test-coord1001:10100 - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hive/Alerts#Hive_Server_Heap_Usage - https://grafana.wikimedia.org/d/000000379/hive?panelId=7&fullscreen&orgId=1&var-instance=an-test-coord1001:10100 - https://alerts.wikimedia.org
[22:01:27] (HiveServerHeapUsage) firing: Hive Server JVM Heap usage is above 80% on an-test-coord1001:10100 - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hive/Alerts#Hive_Server_Heap_Usage - https://grafana.wikimedia.org/d/000000379/hive?panelId=7&fullscreen&orgId=1&var-instance=an-test-coord1001:10100 - https://alerts.wikimedia.org
[22:11:27] (HiveServerHeapUsage) resolved: Hive Server JVM Heap usage is above 80% on an-test-coord1001:10100 - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hive/Alerts#Hive_Server_Heap_Usage - https://grafana.wikimedia.org/d/000000379/hive?panelId=7&fullscreen&orgId=1&var-instance=an-test-coord1001:10100 - https://alerts.wikimedia.org
[23:46:15] hi all! Quick question, what is the data size limit for user databases in Hive? thx in advance!