[00:30:45] RECOVERY - Check unit status of monitor_refine_eventlogging_legacy on an-launcher1002 is OK: OK: Status of the systemd unit monitor_refine_eventlogging_legacy https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [00:43:55] PROBLEM - Check unit status of monitor_refine_eventlogging_legacy on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit monitor_refine_eventlogging_legacy https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [04:13:20] 10Analytics-Radar, 10Fundraising-Backlog, 10Product-Analytics, 10Wikipedia-iOS-App-Backlog, and 2 others: Understand impact of Apple's Relay Service - https://phabricator.wikimedia.org/T289795 (10AntiCompositeNumber) >>! In T289795#7408864, @sgrabarczuk wrote: > We (the Product department, Wikimedia Founda... [09:22:44] 10Analytics-Radar, 10Fundraising-Backlog, 10Product-Analytics, 10Wikipedia-iOS-App-Backlog, and 2 others: Understand impact of Apple's Relay Service - https://phabricator.wikimedia.org/T289795 (10MarioGom) Thanks for the update @AntiCompositeNumber. I have posted some thoughts at https://meta.wikimedia.org... [09:28:19] PROBLEM - Check unit status of produce_canary_events on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit produce_canary_events https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [09:39:23] RECOVERY - Check unit status of produce_canary_events on an-launcher1002 is OK: OK: Status of the systemd unit produce_canary_events https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [15:45:51] PROBLEM - aqs endpoints health on aqs1011 is CRITICAL: /analytics.wikimedia.org/v1/pageviews/top-by-country/{project}/{access}/{year}/{month} (Get top countries by page views) is CRITICAL: Test Get top countries by page views returned the unexpected status 500 (expecting: 200): /analytics.wikimedia.org/v1/mediarequests/top/{referer}/{media_type}/{year}/{month}/{day} (Get top files by mediarequests) is CRITICAL: Test Get top files by mediar [15:45:51] returned the unexpected status 500 (expecting: 200) https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs [15:45:55] PROBLEM - aqs endpoints health on aqs1014 is CRITICAL: /analytics.wikimedia.org/v1/pageviews/top-by-country/{project}/{access}/{year}/{month} (Get top countries by page views) is CRITICAL: Test Get top countries by page views returned the unexpected status 500 (expecting: 200): /analytics.wikimedia.org/v1/mediarequests/top/{referer}/{media_type}/{year}/{month}/{day} (Get top files by mediarequests) is CRITICAL: Test Get top files by mediar [15:45:55] returned the unexpected status 500 (expecting: 200) https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs [15:46:27] PROBLEM - aqs endpoints health on aqs1015 is CRITICAL: /analytics.wikimedia.org/v1/pageviews/top-by-country/{project}/{access}/{year}/{month} (Get top countries by page views) is CRITICAL: Test Get top countries by page views returned the unexpected status 500 (expecting: 200): /analytics.wikimedia.org/v1/mediarequests/top/{referer}/{media_type}/{year}/{month}/{day} (Get top files by mediarequests) is CRITICAL: Test Get top files by mediar [15:46:27] returned the unexpected status 500 (expecting: 200) https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs [15:46:27] PROBLEM - aqs endpoints health on aqs1012 is CRITICAL: /analytics.wikimedia.org/v1/pageviews/top-by-country/{project}/{access}/{year}/{month} (Get top countries by page views) is CRITICAL: Test Get top countries by page views returned the unexpected status 500 (expecting: 200): /analytics.wikimedia.org/v1/mediarequests/top/{referer}/{media_type}/{year}/{month}/{day} (Get top files by mediarequests) is CRITICAL: Test Get top files by mediar [15:46:28] returned the unexpected status 500 (expecting: 200) https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs [15:46:29] PROBLEM - aqs endpoints health on aqs1013 is CRITICAL: /analytics.wikimedia.org/v1/pageviews/top-by-country/{project}/{access}/{year}/{month} (Get top countries by page views) is CRITICAL: Test Get top countries by page views returned the unexpected status 500 (expecting: 200): /analytics.wikimedia.org/v1/mediarequests/top/{referer}/{media_type}/{year}/{month}/{day} (Get top files by mediarequests) is CRITICAL: Test Get top files by mediar [15:46:30] returned the unexpected status 500 (expecting: 200) https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs [15:47:23] PROBLEM - aqs endpoints health on aqs1010 is CRITICAL: /analytics.wikimedia.org/v1/pageviews/top-by-country/{project}/{access}/{year}/{month} (Get top countries by page views) is CRITICAL: Test Get top countries by page views returned the unexpected status 500 (expecting: 200): /analytics.wikimedia.org/v1/mediarequests/top/{referer}/{media_type}/{year}/{month}/{day} (Get top files by mediarequests) is CRITICAL: Test Get top files by mediar [15:47:23] returned the unexpected status 500 (expecting: 200) https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs [16:08:32] hello folks, just downtimed all the new aqs nodes for a day [16:08:41] Cc: btullis, joal --^