[00:05:34] !log DEPLOYED Refinery at 4e7a2b32 for changes: pageview allowlist 1305158 (+min.wikiquote) 1305162 (+bol.wikipedia), 1305156 (+isv.wikipedia); 1305980 (pv allowlist -api.wikimedia, sqoop +isvwiki); sqoop 1295064 (+globalimagelinks) 1295069 (+filerevision) using scap, then deployed onto HDFS (manual copyToLocal required additionally)
[00:05:35] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[00:12:15] FIRING: HdfsRpcQueueLatency: RPC queue latency on the analytics-hadoop cluster is too high. - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Alerts#HDFS_Namenode_RPC_length_queue/latency - https://grafana.wikimedia.org/d/000000585/hadoop?var-hadoop_cluster=analytics-hadoop&orgId=1&panelId=56&fullscreen - https://alerts.wikimedia.org/?q=alertname%3DHdfsRpcQueueLatency
[00:22:15] RESOLVED: HdfsRpcQueueLatency: RPC queue latency on the analytics-hadoop cluster is too high. - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Alerts#HDFS_Namenode_RPC_length_queue/latency - https://grafana.wikimedia.org/d/000000585/hadoop?var-hadoop_cluster=analytics-hadoop&orgId=1&panelId=56&fullscreen - https://alerts.wikimedia.org/?q=alertname%3DHdfsRpcQueueLatency
[01:48:57] (PS1) Dr0ptp4kt: commonswiki.csv required for globalimagelinks sqoop [analytics/refinery] - https://gerrit.wikimedia.org/r/1306802 (https://phabricator.wikimedia.org/T425385)
[02:10:05] (PS2) Dr0ptp4kt: commonswiki.csv required for globalimagelinks sqoop [analytics/refinery] - https://gerrit.wikimedia.org/r/1306802 (https://phabricator.wikimedia.org/T425385)
[07:14:34] (CR) Aqu: [C:+2] commonswiki.csv required for globalimagelinks sqoop [analytics/refinery] - https://gerrit.wikimedia.org/r/1306802 (https://phabricator.wikimedia.org/T425385) (owner: Dr0ptp4kt)
[07:14:41] (CR) Aqu: [V:+2 C:+2] commonswiki.csv required for globalimagelinks sqoop [analytics/refinery] - https://gerrit.wikimedia.org/r/1306802 (https://phabricator.wikimedia.org/T425385) (owner: Dr0ptp4kt)
[08:03:34] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Essential-Work: Inconsistent wiki list: grouped_wikis.csv extended *after* some sqoop jobs have already started - https://phabricator.wikimedia.org/T425385#12074662 (dr0ptp4kt) I posted a note on IM about this - look for //subject:"[data engineering al...
[09:28:45] Data-Engineering: dbt data tests validate production data - https://phabricator.wikimedia.org/T430786 (GGoncalves-WMF) NEW
[09:29:02] Data-Engineering: dbt data tests validate production data - https://phabricator.wikimedia.org/T430786#12074970 (GGoncalves-WMF)
[09:29:05] Data-Engineering-Roadmap, Epic, OKR-Work (WE1 FY2025-26): dbt DPE work - https://phabricator.wikimedia.org/T416679#12074971 (GGoncalves-WMF)
[09:29:52] Data-Engineering: dbt data tests validate production data - https://phabricator.wikimedia.org/T430786#12074972 (GGoncalves-WMF)
[13:02:11] Data-Engineering: dbt data tests validate production data - https://phabricator.wikimedia.org/T430786#12075868 (GGoncalves-WMF) Oh good point! I used "asynchronous" mostly to mean "not in CI/Gitlab", but you're right that this still leaves two options: 1. Run after the `dbt run` in Airflow, executing only t...
[13:32:57] Data-Engineering (Q4 FS25/26 April 1st - June 30st), DPE-MediaWiki-Incremental-History: Right-size Spark resource config using History Server data - https://phabricator.wikimedia.org/T428966#12075984 (APizzata-WMF) ` +--------------------------------+--------+--------------+-------------------------+----...
[13:44:24] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Essential-Work: Inconsistent wiki list: grouped_wikis.csv extended *after* some sqoop jobs have already started - https://phabricator.wikimedia.org/T425385#12076046 (JAllemandou) Adding to the task: now that we don't rely on the [[ https://github.com/w...
[13:45:20] Data-Engineering, Data-Platform-SRE, Java-Scala-Standardization, Essential-Work, Patch-For-Review: Migrate existing Java packages to deploying to Gitlab, including new version of parent pom, validation that all dependencies are available, and v... - https://phabricator.wikimedia.org/T367405#12076056
[14:30:06] !log Test Kitchen experiment (poll 103281) - adds: none; removes: fy25-26-we-1-7-8-suggestion-mode-beta; fields: none - TK tips at https://w.wiki/_cvdP
[14:30:08] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[16:47:20] Data-Engineering: sanitization re-run request: event_sanitized.mediawiki_page_html_feature_counts_change_v1 - https://phabricator.wikimedia.org/T430752#12077349 (CMyrick-WMF) @AKhatun_WMF thank you for all the information! > So I think we can wait a day to let the sanitization catch up, or manually run sani...
[17:01:22] !log Test Kitchen experiment (poll 104146) - adds: logged-out-retention-round18; removes: none; fields: none - TK tips at https://w.wiki/_cvdP
[17:01:25] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[17:56:14] Data-Engineering: sanitization re-run request: event_sanitized.mediawiki_page_html_feature_counts_change_v1 - https://phabricator.wikimedia.org/T430752#12077822 (nshahquinn-wmf) >>! In T430752#12073889, @AKhatun_WMF wrote: > I did a group by event dt (year/month/day) partitions and row counts match except Ma...
[20:20:20] !log Test Kitchen experiment (poll 105332) - adds: fy25-26-we-1-7-8-suggestion-mode-beta; removes: none; fields: none - TK tips at https://w.wiki/_cvdP
[20:20:22] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[21:48:01] Data-Engineering (Q4 FS25/26 April 1st - June 30st): Attribution Research First Experiment - https://phabricator.wikimedia.org/T416200#12078318 (Ahoelzl) Open→Resolved
[21:48:03] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Event-Platform: Upgrade eventstreams and eventstreams-internal to node24 (or node22) - https://phabricator.wikimedia.org/T420257#12078319 (Ahoelzl) Open→Resolved
[21:48:06] Data-Engineering (Q4 FS25/26 April 1st - June 30st), Data-Engineering-Wikistats, Wikidata, Wikidata-Omega: Wikidata unique devices statistics are obviously wrong - https://phabricator.wikimedia.org/T420210#12078317 (Ahoelzl) Open→Resolved