[02:28:53] Data-Engineering: Drop GuidedTour* tables - https://phabricator.wikimedia.org/T317460 (phuedx)
[02:29:16] Data-Engineering-Radar, Growth-Team, MediaWiki-extensions-GuidedTour, MW-1.39-notes (1.39.0-wmf.18; 2022-06-27), MW-1.40-notes (1.40.0-wmf.1; 2022-09-12): Finish decommissioning the legacy GuidedTour schemas - https://phabricator.wikimedia.org/T303712 (phuedx)
[02:29:22] Data-Engineering: Drop GuidedTour* tables - https://phabricator.wikimedia.org/T317460 (phuedx)
[09:43:34] Hello, I'm currently working on a project to analyze and identify vandalism on frwiki, and in order to do this I'm trying to set up a Spark cluster with the revision history as a data source. What I currently do is download the latest XML dumps and convert them to Parquet files. This process is very memory- and bandwidth-intensive, so I was
[09:43:35] wondering if maybe one of you here has already worked on this, or maybe the WMF has a Parquet dataset I haven't heard of?
[10:52:54] RECOVERY - MegaRAID on an-worker1079 is OK: OK: optimal, 13 logical, 14 physical, WriteBack policy https://wikitech.wikimedia.org/wiki/MegaCli%23Monitoring
[11:27:06] PROBLEM - MegaRAID on an-worker1079 is CRITICAL: CRITICAL: 13 LD(s) must have write cache policy WriteBack, currently using: WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough, WriteThrough https://wikitech.wikimedia.org/wiki/MegaCli%23Monitoring
[12:10:10] PROBLEM - SSH on analytics1073.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[15:14:20] RECOVERY - SSH on analytics1073.mgmt is OK: SSH OK - OpenSSH_7.4 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
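
The XML-dump-to-Parquet workflow described in the 09:43 question can be sketched in PySpark. This is a minimal, illustrative sketch only: it assumes the spark-xml package is available on the cluster and that a decompressed frwiki pages-meta-history dump sits at a local path; the paths, package coordinates, and rowTag choice are assumptions for illustration, not a WMF-endorsed recipe or an existing WMF dataset.

    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("frwiki-dump-to-parquet")
        # Assumed package coordinates; pin a spark-xml build that matches your Spark/Scala version.
        .config("spark.jars.packages", "com.databricks:spark-xml_2.12:0.16.0")
        .getOrCreate()
    )

    # Each <page> element of the dump becomes one row; its <revision> elements
    # end up as a nested array column that can be exploded later for per-revision analysis.
    pages = (
        spark.read.format("xml")
        .option("rowTag", "page")
        .load("/data/dumps/frwiki-latest-pages-meta-history.xml")  # hypothetical local path
    )

    # Writing Parquet lets downstream vandalism-analysis jobs read only the columns they need
    # (e.g. revision text and timestamps) instead of re-parsing the full XML each time.
    pages.write.mode("overwrite").parquet("/data/parquet/frwiki_revision_history")

The heavy memory and bandwidth cost mentioned in the question comes mostly from parsing the full-history XML; converting once and querying the Parquet output afterwards amortizes that cost across later analysis jobs.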