[00:23:50] Data-Engineering, DC-Ops, ops-eqiad, SRE: Q4:rack/setup/install an-conf100[4-6] - https://phabricator.wikimedia.org/T364429#9942789 (Jclark-ctr) @BTullis if you get a chance to update files. These are ready to be imaged and handed over
[02:19:04] FIRING: GobblinKafkaRecordsExtractedNotEqualRecordsExpected: Gobblin job event_default ingested an unexpected number of records for a Kafka topic partition. ...
[02:19:04] - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Gobblin - https://grafana.wikimedia.org/d/pAQaJwEnk/gobblin?orgId=1&var-gobblin_job_name=event_default&var-kafka_topic=codfw.mediawiki.cirrussearch.page_rerender.v1&viewPanel=4 - https://alerts.wikimedia.org/?q=alertname%3DGobblinKafkaRecordsExtractedNotEqualRecordsExpected
[03:19:04] RESOLVED: GobblinKafkaRecordsExtractedNotEqualRecordsExpected: Gobblin job event_default ingested an unexpected number of records for a Kafka topic partition. ...
[03:19:04] - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Gobblin - https://grafana.wikimedia.org/d/pAQaJwEnk/gobblin?orgId=1&var-gobblin_job_name=event_default&var-kafka_topic=codfw.mediawiki.cirrussearch.page_rerender.v1&viewPanel=4 - https://alerts.wikimedia.org/?q=alertname%3DGobblinKafkaRecordsExtractedNotEqualRecordsExpected
[07:53:23] Analytics-Radar, Data-Engineering-Icebox, Data-Services: Discuss labsdb visibility of rev_text_id and ar_comment - https://phabricator.wikimedia.org/T158166#9943298 (Marostegui) Open→Declined All those fields are gone ar_comment {T233135} rev_text_id https://gerrit.wikimedia.org/r/c/media...
[08:10:47] Analytics-Radar, Data-Engineering-Icebox, Data-Services: Discuss labsdb visibility of rev_text_id and ar_comment - https://phabricator.wikimedia.org/T158166#9943352 (Zache) I am still interested for archive comments as it makes possible to for example analyse if there were notability discussion b...
[08:59:21] Analytics-Radar, Data-Engineering-Icebox, Data-Services: Discuss labsdb visibility of rev_text_id and ar_comment - https://phabricator.wikimedia.org/T158166#9943500 (Marostegui) Would you mind creating a task for that field? Just to have a clearer task, as this one is a bit messy and can be confu...
[09:14:52] btullis, brouberol o/ - just checking in for https://phabricator.wikimedia.org/T366555 - any plans for the reboots?
[09:16:59] Hi! the analytics hadoop workers should be taken care of by ryankemper, btullis and I will reboot the rest in the coming days
[09:17:09] sorry for lagging behind
[09:23:51] no problem, take your time, I was just pinging to get the schedule :)
[10:02:07] I'm rebooting the dse-k8s-eqiad workers atm, and then I'll move onto the druid clusters
[10:02:31] some hosts are touchy, as it's non-HA databases, or the stat servers, and they will require coordination
[10:04:18] !log killing stuck gobblin jobs
[10:04:19] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[10:26:16] !log rebooting an-master1003 (current standby namenode and resourcemanager) for T366555
[10:26:18] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[10:53:40] Data-Engineering, Temporary accounts, Data-Platform-SRE (2024.06.17 - 2024.07.07): Generate a list of Superset users affected by changes to IP masking/temp users - https://phabricator.wikimedia.org/T347510#9943811 (kostajh) >>! In T347510#9939924, @lbowmaker wrote: > @BTullis this seems good for now....
[11:06:25] Data-Engineering, Data Products, DBA, Schema-change-in-production: Cleanup revision table schema - https://phabricator.wikimedia.org/T367856#9943882 (Marostegui)
[11:06:50] Data-Engineering, Data Products, DBA, Schema-change-in-production: Cleanup revision table schema - https://phabricator.wikimedia.org/T367856#9943894 (Marostegui)
[11:18:40] Analytics-Data-Problem, Data-Engineering, Data-Engineering-Dashiki, Data Products (Data Products Sprint 15), and 2 others: Investigate surprising "10% Other" portion of Analytics Browsers report - https://phabricator.wikimedia.org/T342267#9943937 (WDoranWMF) a:mforns→Milimetric
[11:21:06] Data-Engineering, Temporary accounts, Data-Platform-SRE (2024.06.17 - 2024.07.07): Generate a list of Superset users affected by changes to IP masking/temp users - https://phabricator.wikimedia.org/T347510#9943953 (BTullis)
[11:36:11] Data-Engineering, Data Products, DBA, Schema-change-in-production: Cleanup revision table schema - https://phabricator.wikimedia.org/T367856#9944006 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=db16489d-f0e8-4ab0-a59a-6aa2880a1bb4) set by marostegui@cumin1002 for 1 day, 0:00:...
[12:15:39] Data-Engineering, Dumps-Generation, SRE, Data Products (Data Products Sprint 15), and 2 others: Dumps generation without prefetch cause disruption to the production environment - https://phabricator.wikimedia.org/T368098#9944101 (ABran-WMF) [[ https://wm-bot.wmflabs.org/libera_logs/%23wikimedia-d...
[14:23:50] Data-Engineering, SRE, SRE-Access-Requests: Grant Access to analytics-privatedata-users for cwylo - https://phabricator.wikimedia.org/T368027#9945028 (Volans)
[14:59:28] Data-Engineering, SRE, SRE-Access-Requests, Patch-For-Review: Grant Access to analytics-privatedata-users for cwylo - https://phabricator.wikimedia.org/T368027#9945273 (Volans) In progress→Resolved @cwylo this is now done, I'm resolving the task. Within 30 minutes the change should be...
[15:35:35] Data-Engineering, Data-Platform-SRE (2024.06.17 - 2024.07.07), Event-Platform: mw-page-content-change-enrich flink app is missing in k8s staging - https://phabricator.wikimedia.org/T367116#9945439 (bking) Open→Resolved Closing per @Ottomata's comment. However, Data Platform SRE can do any...
[15:49:27] !log failing over hadoop namenode from an-master1004 to an-master1003
[15:49:28] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[15:50:45] !log failing over hadoop yarn resourcemanager from an-master1004 to an-master1003
[15:50:47] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[15:59:24] Quarry, Data-Services, cloud-services-team (FY2023/2024-Q3-Q4): Allow Quarry to query ToolsDB public databases - https://phabricator.wikimedia.org/T348407#9945538 (fnegri) In progress→Resolved > I will give people a 2-week notice for this change, and enable access to all _p databases on M...
[16:07:35] Analytics, Data-Engineering-Icebox, MediaWiki-REST-API, Patch-For-Review, Story: System administrator reviews API usage by client - https://phabricator.wikimedia.org/T251812#9945597 (akosiaris) 4 years later, we don't see any data flowing in the kafka topic created back then. This feature app...
[16:20:56] Data-Engineering, SRE-Access-Requests, Patch-For-Review: add approvers to analytics-research-admins - https://phabricator.wikimedia.org/T368435#9945722 (Dzahn)
[16:22:00] Analytics, Data-Engineering-Icebox, MediaWiki-REST-API, Patch-For-Review, Story: System administrator reviews API usage by client - https://phabricator.wikimedia.org/T251812#9945715 (akosiaris) Open→Resolved a:akosiaris I am resolving the task given comments from 4 years ago....
[17:52:37] Data-Engineering, Observability-Logging, Traffic, Patch-For-Review: Upgrade hosts to haproxy 2.8.10 - https://phabricator.wikimedia.org/T367756#9946300 (Fabfur) Open→Resolved All cp hosts has been upgraded to 2.8.10
[17:57:16] Data-Engineering, Dumps-Generation, SRE, Data Products (Data Products Sprint 15), and 2 others: Dumps generation without prefetch cause disruption to the production environment - https://phabricator.wikimedia.org/T368098#9946355 (xcollazo) >>! In T368098#9944101, @ABran-WMF wrote: > [[ https://wm...
[18:20:55] Analytics, AQS2.0, Tech-Docs-Team, Data Products (Epics Timeline), and 2 others: AQS 2.0 user documentation - https://phabricator.wikimedia.org/T288664#9946474 (apaskulin)
[20:30:15] Data-Engineering, Dumps-Generation, SRE, Data Products (Data Products Sprint 15), and 2 others: Dumps generation without prefetch cause disruption to the production environment - https://phabricator.wikimedia.org/T368098#9947164 (xcollazo) `20240701` run update: Most all wikis are now done with...
[22:00:15] Data-Engineering, Web-Team-Backlog, Event-Platform: Deprecate use of desktop- and mobilewebuiactions in Event Platform - https://phabricator.wikimedia.org/T368678#9947431 (KSarabia-WMF) FYI to folks, just to be super careful, we probably won't deploy this until after the data collection for T367871 i...