[06:44:12] Data-Engineering-Planning, Event-Platform Value Stream (Sprint 10): Flink EventStreamCatalog should add watermark - https://phabricator.wikimedia.org/T330441 (tchin) I don't think I fully understand what the options are for. You can set watermarks for tables in the catalog by doing something like `lang...
[08:49:12] Data-Engineering-Planning, Event-Platform Value Stream, Patch-For-Review: Use new PageUndeleteComplete hook to emit mediawiki.page_change undelete event - https://phabricator.wikimedia.org/T328308 (Ottomata) Thank you @OwenRB this is amazing! Code looks great. Have you run it locally and verified t...
[08:58:16] Data-Engineering-Planning, Event-Platform Value Stream, Patch-For-Review: Use new PageUndeleteComplete hook to emit mediawiki.page_change undelete event - https://phabricator.wikimedia.org/T328308 (Ottomata) @OwenRB I'm curious! Are you using EventBus in your own MediaWiki installs (at Miraheze?)
[09:11:27] Analytics-Radar, Data-Engineering, Data-Engineering-Wikistats, Diffusion-Repository-Administrators, and 4 others: Archive analytics/wikistats - https://phabricator.wikimedia.org/T332004 (hashar) What is the replacement to `analytics/wikistats`? If it exists maybe @Nemo_bis can migrate to it rathe...
[10:04:32] Data-Engineering-Planning, Shared-Data-Infrastructure (Shared-Data-Infra Sprint 10): Upgrade db1108 to Bullseye - https://phabricator.wikimedia.org/T304492 (Marostegui) db1108 will need to be replaced by the new hardware (T326669), which will be installed on bullseye, so probably we can just migrate the...
[10:04:34] Data-Engineering-Planning, Shared-Data-Infrastructure (Shared-Data-Infra Sprint 10): Upgrade db1108 to Bullseye - https://phabricator.wikimedia.org/T304492 (Marostegui)
[10:18:36] nfraison_: o/ I left some comments in the spark/kerberos gdoc, really nicely written
[10:18:53] I have only some doubts about Vault (mostly because I am ignorant about it)
[10:19:12] SRE may be interested in Vault if you want to add it to deployment-charts
[10:44:17] Data-Engineering-Planning, Shared-Data-Infrastructure (Shared-Data-Infra Sprint 10): Recreate replica of matomo database on db1108 - https://phabricator.wikimedia.org/T332128 (BTullis)
[10:47:31] Data-Engineering-Planning, Shared-Data-Infrastructure (Shared-Data-Infra Sprint 10): Recreate replica of matomo database on db1108 - https://phabricator.wikimedia.org/T332128 (BTullis) p:Triage→High
[10:51:19] Data-Engineering-Planning, DBA, Shared-Data-Infrastructure (Shared-Data-Infra Sprint 10): Recreate replica of matomo database on db1108 - https://phabricator.wikimedia.org/T332128 (BTullis) Tagging #dba for awareness, but I'm happy to carry out the work myself.
[10:57:09] As per T332128 I've discovered that the matomo replica database on db1108 needs to be recreated, because we currently have no up-to-date backups of matomo. I'll be working on that for some of today.
[10:57:10] T332128: Recreate replica of matomo database on db1108 - https://phabricator.wikimedia.org/T332128
[10:59:22] Also, we missed the deployment train yesterday. I see from the etherpad (https://etherpad.wikimedia.org/p/analytics-weekly-train) that there is a refinery deploy and a special deployment for pageviews.
[11:04:04] I'd appreciate some hand-holding please. I'm supposed to be holding steve_munene's hand to do these deploys, but all the mention of altering and dropping tables is a bit scary to say the least.
[11:08:17] Data-Engineering-Planning, DBA, Shared-Data-Infrastructure (Shared-Data-Infra Sprint 10): Recreate replica of matomo database on db1108 - https://phabricator.wikimedia.org/T332128 (BTullis) Configured 6 hours of downtime for the matomo database checks on db1108 and the check for the correct number of...
[11:15:56] Data-Engineering-Planning, SRE-swift-storage, Event-Platform Value Stream (Sprint 10): Storage request: swift s3 bucket for mediawiki-page-content-change-enrichment checkpointing - https://phabricator.wikimedia.org/T330693 (gmodena) Hi, >>! In T330693#8696219, @Eevans wrote: Regarding the use of Fl...
[11:17:46] Data-Engineering-Planning, DBA, Shared-Data-Infrastructure (Shared-Data-Infra Sprint 10): Recreate replica of matomo database on db1108 - https://phabricator.wikimedia.org/T332128 (BTullis) Size of `/var/log/mysql` on matomo1002 is 2.5 GB Size of the InnoDB tablespace in `/var/lib/mysql/piwik` on ma...
[11:18:04] !log stopping the matomo database replica on db1108
[11:18:05] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[11:19:50] hey folks, qq - anything against deploying golang on stat100x nodes?
[11:20:17] elukey: No objections from me. What are you cooking?
[11:20:56] I'd need to test https://github.com/Shopify/sarama with jumbo
[11:21:31] it is the client used by benthos, we found a weird issue when selecting some topic partitions (instead of all)
[11:23:54] Cool. Yep, no objections from me.
[11:29:11] Data-Engineering-Planning, DBA, Shared-Data-Infrastructure (Shared-Data-Infra Sprint 10): Recreate replica of matomo database on db1108 - https://phabricator.wikimedia.org/T332128 (BTullis) Disabling puppet ` btullis@db1108:/srv$ sudo disable-puppet btullis-T332128 ` Stopping the database service ` b...
[11:30:25] Data-Engineering-Planning, DBA, Shared-Data-Infrastructure (Shared-Data-Infra Sprint 10): Recreate replica of matomo database on db1108 - https://phabricator.wikimedia.org/T332128 (Marostegui) Let's keep coordinating on IRC like we've been doing an hour ago. There's probably no need to use xtrabackup...
[11:49:55] Data-Engineering-Planning, DBA, Shared-Data-Infrastructure (Shared-Data-Infra Sprint 10): Recreate replica of matomo database on db1108 - https://phabricator.wikimedia.org/T332128 (Marostegui) Open→Resolved a:BTullis→Marostegui What I am going to do is basically use mysqldump from the...
[11:53:44] Data-Engineering, DBA, Infrastructure-Foundations, Machine-Learning-Team, and 8 others: eqiad row C switches upgrade - https://phabricator.wikimedia.org/T331882 (Clement_Goubert)
[11:54:36] Data-Engineering-Planning, DBA, Shared-Data-Infrastructure (Shared-Data-Infra Sprint 10): Recreate replica of matomo database on db1108 - https://phabricator.wikimedia.org/T332128 (BTullis) Nice work. Thanks, you're right, that's simpler than using xtrabackup.
[11:58:07] Data-Engineering-Planning, DBA, Shared-Data-Infrastructure (Shared-Data-Infra Sprint 10): Recreate replica of matomo database on db1108 - https://phabricator.wikimedia.org/T332128 (Marostegui) I have re-enabled puppet
[13:10:35] !log rerunning eventlogging_legacy failed job
[13:10:39] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[13:57:45] Data-Engineering-Planning, Event-Platform Value Stream (Sprint 10): Flink EventStreamCatalog should add watermark - https://phabricator.wikimedia.org/T330441 (Ottomata) YES, sorry. not table options. Miswrote that in the description. Whatever EventTableDescriptorBuilder default setup kafka stuff is do...
[14:00:18] Data-Engineering, DBA, Infrastructure-Foundations, Machine-Learning-Team, and 9 others: eqiad row C switches upgrade - https://phabricator.wikimedia.org/T331882 (Marostegui)
[14:17:11] (PS1) Kosta Harlan: helppanel: Document savedTaskType in action_data for trynewtask-impression [schemas/event/secondary] - https://gerrit.wikimedia.org/r/899637 (https://phabricator.wikimedia.org/T330637)
[14:23:19] Data-Engineering-Planning, DBA, Data Pipelines, Infrastructure-Foundations, and 9 others: eqiad row B switches upgrade - https://phabricator.wikimedia.org/T330165 (ssingh)
[14:26:02] Data-Engineering-Planning, DBA, Data Pipelines, Infrastructure-Foundations, and 9 others: eqiad row B switches upgrade - https://phabricator.wikimedia.org/T330165 (herron)
[14:26:36] Data-Engineering-Planning, DBA, Data Pipelines, Infrastructure-Foundations, and 9 others: eqiad row B switches upgrade - https://phabricator.wikimedia.org/T330165 (ssingh)
[15:00:36] Data-Engineering-Planning, DBA, Data Pipelines, Infrastructure-Foundations, and 9 others: eqiad row B switches upgrade - https://phabricator.wikimedia.org/T330165 (bking)
[15:01:56] Data-Engineering-Planning, DBA, Data Pipelines, Infrastructure-Foundations, and 9 others: eqiad row B switches upgrade - https://phabricator.wikimedia.org/T330165 (ayounsi)
[15:02:59] Data-Engineering, AQS 2.0 Roadmap, API Platform (API Platform Roadmap), Epic, and 2 others: Obtain a security review of AQS 2.0 - https://phabricator.wikimedia.org/T288663 (thcipriani)
[15:33:23] Data-Engineering, Equity-Landscape: World Bank Data - https://phabricator.wikimedia.org/T309282 (ntsako) In progress→Resolved
[15:33:25] Data-Engineering, Equity-Landscape: Extract + Transformation Raw Data into Input Metrics - https://phabricator.wikimedia.org/T306625 (ntsako)
[15:39:00] Data-Engineering-Planning, SRE-swift-storage, Event-Platform Value Stream (Sprint 10): Storage request: swift s3 bucket for mediawiki-page-content-change-enrichment checkpointing - https://phabricator.wikimedia.org/T330693 (MatthewVernon) This is a k8s application running on the WMF OpenStack, yes?...
[15:40:51] Data-Engineering-Planning, Data Pipelines, Product-Analytics: Add TikTok's in-app browser to ua-parser library - https://phabricator.wikimedia.org/T325611 (odimitrijevic) Another ping to @Maryana and @MMiller_WMF . Do you have opinions on asking the TikTok team vs parsing the user-agent string as is?
[16:21:19] Data-Engineering, Machine-Learning-Team, Research, Event-Platform Value Stream (Sprint 10): Design event schema for ML scores/recommendations on current page state - https://phabricator.wikimedia.org/T331401 (achou) > If a model had 1000 classes though, maybe doesn't make so much sense to include...
[17:34:11] btullis: I suggest manually adding the missing partitions for the culprit dataset
[17:34:29] so that we can restart the failed task and have it succeed
[17:52:03] Data-Engineering, IP Masking, Product-Analytics: Clarify definitions around anonymous and temporary editors - https://phabricator.wikimedia.org/T332205 (kzimmerman) Adding #data-engineering since we will work with them on the technical details.
[18:05:34] PROBLEM - Check systemd state on an-launcher1002 is CRITICAL: CRITICAL - degraded: The following units failed: produce_canary_events.service https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[18:11:07] Data-Engineering-Planning, DBA, Data Pipelines, Infrastructure-Foundations, and 9 others: eqiad row B switches upgrade - https://phabricator.wikimedia.org/T330165 (Eevans)
[18:16:32] RECOVERY - Check systemd state on an-launcher1002 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[18:30:31] joal: ack. Will do that tomorrow morning.
[18:32:11] btullis: I'm still doing some work now - do you wish me to do it?
[18:32:45] joal: Oh if you would please, that would be great. Many thanks.
[18:32:56] ack btullis - doing this now
[18:37:43] !log Manually creating partitions for event.mediawiki_client_session_tick (datacenter=eqiad/year=2023/month=3/day=7/hour=[10,11,12,13,14])
[18:37:44] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[18:40:18] milimetric: we've found the problem with btullis for the session-tick stuff
[18:40:30] milimetric: do you wish to batcave for an explanation?
[18:41:03] Analytics-Radar, Data-Engineering, Data-Engineering-Wikistats, Diffusion-Repository-Administrators, and 4 others: Archive analytics/wikistats - https://phabricator.wikimedia.org/T332004 (Milimetric) >>! In T332004#8696727, @hashar wrote: > What is the replacement to `analytics/wikistats`? If it e...
[18:41:08] omw batcave joal
[18:47:43] Data-Engineering, Event-Platform Value Stream, WMF-Architecture-Team: Major (API) versioning of Event Platform streams - https://phabricator.wikimedia.org/T332212 (Ottomata)
[18:48:37] Data-Engineering, Event-Platform Value Stream, Product-Analytics, WMF-Architecture-Team: Major (API) versioning of Event Platform streams - https://phabricator.wikimedia.org/T332212 (Ottomata)
[18:52:20] Data-Engineering, Event-Platform Value Stream, Product-Analytics, WMF-Architecture-Team: Major (API) versioning of Event Platform streams - https://phabricator.wikimedia.org/T332212 (Ottomata)
[18:53:11] Data-Engineering, Event-Platform Value Stream, Metrics-Platform-Planning, Product-Analytics, WMF-Architecture-Team: Major (API) versioning of Event Platform streams - https://phabricator.wikimedia.org/T332212 (Ottomata)
[19:25:52] Data-Engineering, Advanced-Search, All-and-every-Wikisource, ArticlePlaceholder, and 72 others: Remove unnecessary targets definitions - https://phabricator.wikimedia.org/T328497 (Jdlrobson)
[19:38:05] Data-Engineering: Airflow skein hook shouldn't fail when not managing to gather yarn logs - https://phabricator.wikimedia.org/T332215 (JAllemandou)
[19:43:37] Data-Engineering: Airflow ArchiveOperator should have a number of retries of 0 - https://phabricator.wikimedia.org/T332216 (JAllemandou)
[19:50:26] (CR) Joal: T330206 - Create Mediacounts Load Hourly HQL (7 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/896334 (owner: Jennifer Ebe)
[20:11:44] Data-Engineering, Event-Platform Value Stream, Metrics-Platform-Planning, Product-Analytics, WMF-Architecture-Team: Major (API) versioning of Event Platform streams - https://phabricator.wikimedia.org/T332212 (Mayakp.wiki) hey @Ottomata could you complete the second con for S2 "Consumers woul...
[20:49:05] (CR) Sergio Gimeno: [C: +1] helppanel: Document savedTaskType in action_data for trynewtask-impression [schemas/event/secondary] - https://gerrit.wikimedia.org/r/899637 (https://phabricator.wikimedia.org/T330637) (owner: Kosta Harlan)
[21:45:53] Data-Engineering-Planning, SRE-swift-storage, Event-Platform Value Stream (Sprint 10): Storage request: swift s3 bucket for mediawiki-page-content-change-enrichment checkpointing - https://phabricator.wikimedia.org/T330693 (Eevans) >>! In T330693#8697098, @gmodena wrote: >>>! In T330693#8696219, @Eev...
[22:37:16] Data-Engineering, MediaWiki-extensions-EventLogging, Metrics-Platform-Planning, Patch-For-Review: Deprecate/delete the mw.eventLog.Schema class - https://phabricator.wikimedia.org/T305491 (Jdlrobson)
[23:06:03] Data-Engineering, IP Masking, Product-Analytics: Clarify definitions around anonymous and temporary editors - https://phabricator.wikimedia.org/T332205 (kzimmerman)