[00:26:05] Data-Engineering: Don't pollute skein logs. Part II. - https://phabricator.wikimedia.org/T342926 (CodeReviewBot) xcollazo updated https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/468 Don't pollute skein logs with heartbeats.
[00:31:54] Data-Engineering: Don't pollute skein logs. Part II. - https://phabricator.wikimedia.org/T342926 (xcollazo) p:Triage→High a:xcollazo
[00:32:25] Data-Engineering, Data Products (Sprint 0): Don't pollute skein logs. Part II. - https://phabricator.wikimedia.org/T342926 (xcollazo) Open→In progress
[03:03:28] (MediawikiPageContentChangeEnrichAvailability) firing: ...
[03:03:28] Low percentage of enriched events produced by mw_page_content_change_enrich in codfw - TODO - https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=codfw%20prometheus/k8s&var-namespace=mw-page-content-change-enrich&var-helm_release=main&var-operator_name=All&var-flink_job_name=mw_page_content_change_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiPageContentChangeEnrichAvailability
[05:03:29] PROBLEM - Check systemd state on an-launcher1002 is CRITICAL: CRITICAL - degraded: The following units failed: refinery-sqoop-wikifunctions-production.service https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[05:04:44] (SystemdUnitFailed) firing: refinery-sqoop-wikifunctions-production.service Failed on an-launcher1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[05:54:39] Data-Engineering, Data Engineering and Event Platform Team (Sprint 1), Event-Platform, Patch-For-Review: mediawiki page_content_change should generate new meta.id field - https://phabricator.wikimedia.org/T341277 (tchin) a:tchin
[07:03:28] (MediawikiPageContentChangeEnrichAvailability) firing: ...
[07:03:28] Low percentage of enriched events produced by mw_page_content_change_enrich in codfw - TODO - https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=codfw%20prometheus/k8s&var-namespace=mw-page-content-change-enrich&var-helm_release=main&var-operator_name=All&var-flink_job_name=mw_page_content_change_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiPageContentChangeEnrichAvailability
[08:57:16] PROBLEM - mysqld processes on db1108 is CRITICAL: PROCS CRITICAL: 0 processes with command name mysqld https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting
[08:59:44] (SystemdUnitFailed) firing: (2) refinery-sqoop-wikifunctions-production.service Failed on an-launcher1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[09:07:44] Data-Platform-SRE, Data Engineering and Event Platform Team, Data Pipelines, Data-Persistence: Replace db1108 with db1208 - https://phabricator.wikimedia.org/T334055 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=78c3d470-e1af-4646-8758-79672fccf2d6) set by btullis@cumin1001 fo...
[09:44:58] Data-Platform-SRE, SRE, vm-requests, Discovery-Search (Current work): eqiad: 3 VMs requested for Zookeeper - https://phabricator.wikimedia.org/T341705 (Gehel) Open→Resolved
[09:45:03] Data-Engineering, Data Engineering and Event Platform Team, Discovery-Search, serviceops-radar, Event-Platform: [NEEDS GROOMING] Store Flink HA metadata in Zookeeper - https://phabricator.wikimedia.org/T331283 (Gehel)
[09:45:05] Data-Platform-SRE, Discovery-Search (Current work), Patch-For-Review: Provision Zookeeper Cluster for storing Flink HA data - https://phabricator.wikimedia.org/T341792 (Gehel)
[09:51:13] Data-Platform-SRE, Discovery-Search (Current work): Reimage wdqs20[13-22] servers to Bullseye - https://phabricator.wikimedia.org/T328325 (Gehel) Open→Resolved
[09:55:16] Data-Platform-SRE, Discovery-Search (Current work): Investigate WDQS categories update failures on Bullseye hosts - https://phabricator.wikimedia.org/T342060 (Gehel) Open→Resolved a:Gehel
[09:55:19] Data-Platform-SRE, Discovery-Search (Current work): Ensure WDQS stack works on Bullseye - https://phabricator.wikimedia.org/T331300 (Gehel)
[09:55:53] Data-Platform-SRE, Discovery-Search (Current work): Ensure WDQS stack works on Bullseye - https://phabricator.wikimedia.org/T331300 (Gehel) Open→Resolved
[09:56:02] Data-Platform-SRE, Wikidata, Wikidata-Query-Service, Discovery-Search (Current work), Patch-For-Review: Configure new WDQS servers in codfw (wdqs20[13-22]) - https://phabricator.wikimedia.org/T332314 (Gehel)
[09:56:16] Data-Engineering, Data Engineering and Event Platform Team, Discovery-Search (Current work), Event-Platform, MW-1.41-notes (1.41.0-wmf.16; 2023-07-04): Add support for redirects in CirrusSearch - https://phabricator.wikimedia.org/T325315 (Gehel) Open→Resolved
[09:56:18] Data-Engineering, Data Engineering and Event Platform Team, Machine-Learning-Team, Event-Platform: Create new mediawiki.page_links_change stream based on fragment/mediawiki/state/change/page - https://phabricator.wikimedia.org/T331399 (Gehel)
[10:01:06] Data-Platform-SRE: Bump Yarn logs retention period to support debugging long running jobs - https://phabricator.wikimedia.org/T342923 (LSobanski)
[10:59:30] Data-Platform-SRE, Data Engineering and Event Platform Team, GitLab (Project Migration), Patch-For-Review, Release-Engineering-Team (Priority Backlog 📥): Migrate analytics/datahub pipeline to GitLab - https://phabricator.wikimedia.org/T341194 (CodeReviewBot) btullis opened https://gitlab.wiki...
[11:01:05] (PS1) Jforrester: pageview: Add wikidata.org to the allowlist [analytics/refinery] - https://gerrit.wikimedia.org/r/942632
[11:02:23] (PS2) Jforrester: pageview: Add wikidata.org to the allowlist [analytics/refinery] - https://gerrit.wikimedia.org/r/942632 (https://phabricator.wikimedia.org/T342865)
[11:03:28] (MediawikiPageContentChangeEnrichAvailability) firing: ...
[11:03:28] Low percentage of enriched events produced by mw_page_content_change_enrich in codfw - TODO - https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=codfw%20prometheus/k8s&var-namespace=mw-page-content-change-enrich&var-helm_release=main&var-operator_name=All&var-flink_job_name=mw_page_content_change_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiPageContentChangeEnrichAvailability
[11:11:18] Data-Platform-SRE, Data Engineering and Event Platform Team, GitLab (Project Migration), Patch-For-Review, Release-Engineering-Team (Priority Backlog 📥): Migrate analytics/datahub pipeline to GitLab - https://phabricator.wikimedia.org/T341194 (BTullis) I have created [[https://gitlab.wikimedi...
[12:59:44] (SystemdUnitFailed) firing: refinery-sqoop-wikifunctions-production.service Failed on an-launcher1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[13:34:15] (CR) DCausse: Provide internal schema for CirrusSearch update-pipeline updates. (3 comments) [schemas/event/primary] - https://gerrit.wikimedia.org/r/856507 (https://phabricator.wikimedia.org/T317202) (owner: Peter Fischer)
[13:40:02] Data-Engineering, Data Products (Sprint 0), Patch-For-Review: Don't pollute skein logs. Part II. - https://phabricator.wikimedia.org/T342926 (CodeReviewBot) milimetric closed https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/468 Draft: Don't pollute skein logs with he...
[13:40:20] Data-Engineering, Data Products (Sprint 0), Patch-For-Review: Don't pollute skein logs. Part II. - https://phabricator.wikimedia.org/T342926 (CodeReviewBot) milimetric reopened https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/468 Draft: Don't pollute skein logs with...
[13:46:44] (CR) Milimetric: GDI Equity Landscape Tables (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/941911 (owner: Nmaphophe)
[14:13:55] (PS1) Milimetric: Fix typo [analytics/refinery] - https://gerrit.wikimedia.org/r/942666
[14:14:28] (CR) Milimetric: [V: +2 C: +2] Fix typo [analytics/refinery] - https://gerrit.wikimedia.org/r/942666 (owner: Milimetric)
[14:20:49] Data-Platform-SRE: Decide on installation details for new ceph cluster - https://phabricator.wikimedia.org/T326945 (MatthewVernon) I've had fun with `rados bench` in the past; it's at least arguably useful to see how much performance you can squeeze out of it, since it'll give us an idea later of how close w...
[14:34:28] !log deployed a fix for a sqoop typo
[14:34:30] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[15:03:44] (MediawikiPageContentChangeEnrichAvailability) firing: ...
[15:03:44] Low percentage of enriched events produced by mw_page_content_change_enrich in codfw - TODO - https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=codfw%20prometheus/k8s&var-namespace=mw-page-content-change-enrich&var-helm_release=main&var-operator_name=All&var-flink_job_name=mw_page_content_change_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiPageContentChangeEnrichAvailability
[15:09:44] (CR) Milimetric: [V: +2 C: +2] pageview: Add wikidata.org to the allowlist [analytics/refinery] - https://gerrit.wikimedia.org/r/942632 (https://phabricator.wikimedia.org/T342865) (owner: Jforrester)
[15:47:16] Data-Platform-SRE, Data Engineering and Event Platform Team, GitLab (Project Migration), Patch-For-Review, Release-Engineering-Team (Priority Backlog 📥): Migrate analytics/datahub pipeline to GitLab - https://phabricator.wikimedia.org/T341194 (CodeReviewBot) btullis merged https://gitlab.wiki...
[16:13:11] Data-Engineering, Data-Platform-SRE: NEW BUG REPORT remove mysql databases from SQLLab - https://phabricator.wikimedia.org/T337056 (BTullis) Open→Resolved a:BTullis
[16:32:04] Data-Platform-SRE, API Platform, Anti-Harassment, Content-Transform-Team, and 19 others: Migrate PipelineLib repos to GitLab - https://phabricator.wikimedia.org/T332953 (BTullis)
[16:44:17] Data-Platform-SRE, Data Engineering and Event Platform Team, GitLab (Project Migration), Patch-For-Review, Release-Engineering-Team (Priority Backlog 📥): Migrate analytics/datahub pipeline to GitLab - https://phabricator.wikimedia.org/T341194 (BTullis) I'll aim to deploy the newly built DataH...
[16:59:44] (SystemdUnitFailed) firing: refinery-sqoop-wikifunctions-production.service Failed on an-launcher1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[17:56:05] (PS1) Milimetric: Turn on pageviews for wikifunctions [analytics/refinery/source] - https://gerrit.wikimedia.org/r/942699
[18:01:06] Data-Platform-SRE, Data Engineering and Event Platform Team, GitLab (Project Migration), Patch-For-Review, Release-Engineering-Team (Priority Backlog 📥): Migrate analytics/datahub pipeline to GitLab - https://phabricator.wikimedia.org/T341194 (thcipriani) >>! In T341194#9052184, @BTullis wrot...
[18:01:39] (CR) Jforrester: Turn on pageviews for wikifunctions (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/942699 (owner: Milimetric)
[18:38:44] (PS2) Milimetric: Turn on pageviews for wikifunctions [analytics/refinery/source] - https://gerrit.wikimedia.org/r/942699 (https://phabricator.wikimedia.org/T342865)
[18:40:09] (CR) Milimetric: "added a language variant parser and I'm *fairly* sure it's not going to make anything else blow up... still a bit scary as our unit tests " [analytics/refinery/source] - https://gerrit.wikimedia.org/r/942699 (https://phabricator.wikimedia.org/T342865) (owner: Milimetric)
[18:58:30] (CR) Milimetric: [C: +2] Turn on pageviews for wikifunctions [analytics/refinery/source] - https://gerrit.wikimedia.org/r/942699 (https://phabricator.wikimedia.org/T342865) (owner: Milimetric)
[19:00:13] (CR) Milimetric: [V: +2 C: +2] Turn on pageviews for wikifunctions [analytics/refinery/source] - https://gerrit.wikimedia.org/r/942699 (https://phabricator.wikimedia.org/T342865) (owner: Milimetric)
[19:01:34] (PS1) Milimetric: Update changelog for v0.2.19 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/942704
[19:01:46] (CR) Milimetric: [V: +2 C: +2] Update changelog for v0.2.19 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/942704 (owner: Milimetric)
[19:02:12] Starting build #124 for job analytics-refinery-maven-release-docker
[19:08:28] (MediawikiPageContentChangeEnrichAvailability) firing: ...
[19:08:28] Low percentage of enriched events produced by mw_page_content_change_enrich in codfw - TODO - https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=codfw%20prometheus/k8s&var-namespace=mw-page-content-change-enrich&var-helm_release=main&var-operator_name=All&var-flink_job_name=mw_page_content_change_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiPageContentChangeEnrichAvailability
[19:14:47] Project analytics-refinery-maven-release-docker build #124: SUCCESS in 12 min: https://integration.wikimedia.org/ci/job/analytics-refinery-maven-release-docker/124/
[19:20:54] Starting build #83 for job analytics-refinery-update-jars-docker
[19:21:15] (PS1) Maven-release-user: Add refinery-source jars for v0.2.19 to artifacts [analytics/refinery] - https://gerrit.wikimedia.org/r/941923
[19:21:16] Project analytics-refinery-update-jars-docker build #83: SUCCESS in 21 sec: https://integration.wikimedia.org/ci/job/analytics-refinery-update-jars-docker/83/
[19:27:03] (CR) Milimetric: [V: +2 C: +2] Add refinery-source jars for v0.2.19 to artifacts [analytics/refinery] - https://gerrit.wikimedia.org/r/941923 (owner: Maven-release-user)
[19:30:46] Data-Engineering, Data Products (Sprint 0), Patch-For-Review: Don't pollute skein logs. Part II. - https://phabricator.wikimedia.org/T342926 (CodeReviewBot) xcollazo merged https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/468 Don't pollute skein logs with heartbeats.
[19:38:01] !log Deployed T342926 and https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/469 to analytics Airflow instance
[19:38:06] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[19:38:06] T342926: Don't pollute skein logs. Part II. - https://phabricator.wikimedia.org/T342926
[20:22:59] (PS1) Milimetric: Add special=ViewObject to allowed pageviews [analytics/refinery/source] - https://gerrit.wikimedia.org/r/942708 (https://phabricator.wikimedia.org/T342865)
[20:23:09] (CR) Milimetric: [C: +2] Add special=ViewObject to allowed pageviews [analytics/refinery/source] - https://gerrit.wikimedia.org/r/942708 (https://phabricator.wikimedia.org/T342865) (owner: Milimetric)
[20:27:28] Data-Engineering, Data Products (Sprint 0), Patch-For-Review: Don't pollute skein logs. Part II. - https://phabricator.wikimedia.org/T342926 (xcollazo) Patch is WAD.
[20:27:38] Data-Engineering, Data Products (Sprint 0), Patch-For-Review: Don't pollute skein logs. Part II. - https://phabricator.wikimedia.org/T342926 (xcollazo) In progress→Resolved
[20:31:55] (Merged) jenkins-bot: Add special=ViewObject to allowed pageviews [analytics/refinery/source] - https://gerrit.wikimedia.org/r/942708 (https://phabricator.wikimedia.org/T342865) (owner: Milimetric)
[20:59:44] (SystemdUnitFailed) firing: refinery-sqoop-wikifunctions-production.service Failed on an-launcher1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[21:18:48] (PS1) Milimetric: Update changelog for v0.2.20 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/942712
[21:19:05] (CR) Milimetric: [V: +2 C: +2] Update changelog for v0.2.20 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/942712 (owner: Milimetric)
[21:22:03] Starting build #125 for job analytics-refinery-maven-release-docker
[21:34:04] Project analytics-refinery-maven-release-docker build #125: SUCCESS in 12 min: https://integration.wikimedia.org/ci/job/analytics-refinery-maven-release-docker/125/
[21:38:53] Starting build #84 for job analytics-refinery-update-jars-docker
[21:39:14] (PS1) Maven-release-user: Add refinery-source jars for v0.2.20 to artifacts [analytics/refinery] - https://gerrit.wikimedia.org/r/941924
[21:39:14] Project analytics-refinery-update-jars-docker build #84: SUCCESS in 21 sec: https://integration.wikimedia.org/ci/job/analytics-refinery-update-jars-docker/84/
[21:43:54] (CR) Milimetric: [V: +2 C: +2] Add refinery-source jars for v0.2.20 to artifacts [analytics/refinery] - https://gerrit.wikimedia.org/r/941924 (owner: Maven-release-user)
[23:08:28] (MediawikiPageContentChangeEnrichAvailability) firing: ...
[23:08:28] Low percentage of enriched events produced by mw_page_content_change_enrich in codfw - TODO - https://grafana.wikimedia.org/d/K9x0c4aVk/flink-app?orgId=1&var-datasource=codfw%20prometheus/k8s&var-namespace=mw-page-content-change-enrich&var-helm_release=main&var-operator_name=All&var-flink_job_name=mw_page_content_change_enrich - https://alerts.wikimedia.org/?q=alertname%3DMediawikiPageContentChangeEnrichAvailability