[00:01:58] PROBLEM - SSH on wtp1038.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [00:04:01] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P24976 and previous config saved to /var/cache/conftool/dbconfig/20220418-000401-ladsgroup.json [00:04:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [00:05:15] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance [00:05:16] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance [00:05:17] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [00:05:18] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance [00:05:20] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [00:05:22] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [00:05:24] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance [00:05:26] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [00:15:01] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance [00:15:03] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance [00:15:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [00:15:08] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [00:19:07] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to 
https://phabricator.wikimedia.org/P24977 and previous config saved to /var/cache/conftool/dbconfig/20220418-001906-ladsgroup.json [00:19:09] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [00:24:37] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance [00:24:39] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance [00:24:40] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [00:24:43] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [00:24:44] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1180 (T298565)', diff saved to https://phabricator.wikimedia.org/P24978 and previous config saved to /var/cache/conftool/dbconfig/20220418-002443-ladsgroup.json [00:24:47] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [00:24:48] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [00:34:12] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T298565)', diff saved to https://phabricator.wikimedia.org/P24979 and previous config saved to /var/cache/conftool/dbconfig/20220418-003411-ladsgroup.json [00:34:13] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance [00:34:15] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance [00:34:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [00:34:16] T298565: Fix mismatching field type of user table for 
columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [00:34:18] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [00:34:20] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [00:37:13] (KubernetesRsyslogDown) firing: rsyslog on kubernetes1018:9105 is missing kubernetes logs - https://wikitech.wikimedia.org/wiki/Kubernetes/Logging#Common_issues - https://alerts.wikimedia.org/?q=alertname%3DKubernetesRsyslogDown [00:42:54] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance [00:42:56] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance [00:42:56] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [00:42:59] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [00:51:32] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1138.eqiad.wmnet with reason: Maintenance [00:51:34] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1138.eqiad.wmnet with reason: Maintenance [00:51:35] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [00:51:38] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [00:51:39] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1138 (T298565)', diff saved to https://phabricator.wikimedia.org/P24980 and previous config saved to /var/cache/conftool/dbconfig/20220418-005138-ladsgroup.json [00:51:42] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [00:51:43] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, 
user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [01:24:58] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1180 (T298565)', diff saved to https://phabricator.wikimedia.org/P24981 and previous config saved to /var/cache/conftool/dbconfig/20220418-012458-ladsgroup.json [01:25:02] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [01:25:03] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [01:38:45] (JobUnavailable) firing: Reduced availability for job sidekiq in ops@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable [01:40:03] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P24982 and previous config saved to /var/cache/conftool/dbconfig/20220418-014003-ladsgroup.json [01:40:06] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [01:47:54] (NodeTextfileStale) firing: Stale textfile for cloudcontrol2001-dev:9100 - https://wikitech.wikimedia.org/wiki/Prometheus#Stale_file_for_node-exporter_textfile - https://grafana.wikimedia.org/d/knkl4dCWz/node-exporter-textfile - https://alerts.wikimedia.org/?q=alertname%3DNodeTextfileStale [01:48:45] (JobUnavailable) resolved: Reduced availability for job sidekiq in ops@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - 
https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable [01:51:53] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1138 (T298565)', diff saved to https://phabricator.wikimedia.org/P24983 and previous config saved to /var/cache/conftool/dbconfig/20220418-015152-ladsgroup.json [01:51:56] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [01:51:58] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [01:55:08] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P24984 and previous config saved to /var/cache/conftool/dbconfig/20220418-015508-ladsgroup.json [01:55:11] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [02:06:58] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1138', diff saved to https://phabricator.wikimedia.org/P24985 and previous config saved to /var/cache/conftool/dbconfig/20220418-020657-ladsgroup.json [02:07:00] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [02:10:14] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1180 (T298565)', diff saved to https://phabricator.wikimedia.org/P24986 and previous config saved to /var/cache/conftool/dbconfig/20220418-021013-ladsgroup.json [02:10:15] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance [02:10:16] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance [02:10:17] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [02:10:18] T298565: Fix 
mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [02:10:20] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [02:10:22] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1098:3316 (T298565)', diff saved to https://phabricator.wikimedia.org/P24987 and previous config saved to /var/cache/conftool/dbconfig/20220418-021021-ladsgroup.json [02:10:22] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [02:10:25] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [02:22:03] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1138', diff saved to https://phabricator.wikimedia.org/P24988 and previous config saved to /var/cache/conftool/dbconfig/20220418-022202-ladsgroup.json [02:22:05] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [02:32:55] (NodeTextfileStale) firing: (3) Stale textfile for elastic1075:9100 - https://wikitech.wikimedia.org/wiki/Prometheus#Stale_file_for_node-exporter_textfile - https://grafana.wikimedia.org/d/knkl4dCWz/node-exporter-textfile - https://alerts.wikimedia.org/?q=alertname%3DNodeTextfileStale [02:37:08] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1138 (T298565)', diff saved to https://phabricator.wikimedia.org/P24989 and previous config saved to /var/cache/conftool/dbconfig/20220418-023707-ladsgroup.json [02:37:11] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [02:37:13] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf 
wikis - https://phabricator.wikimedia.org/T298565 [02:37:15] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance [02:37:16] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance [02:37:17] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [02:37:19] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [02:46:37] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance [02:46:38] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance [02:46:39] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [02:46:42] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [02:55:09] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance [02:55:10] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance [02:55:11] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [02:55:14] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [02:55:15] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1146:3314 (T298565)', diff saved to https://phabricator.wikimedia.org/P24990 and previous config saved to /var/cache/conftool/dbconfig/20220418-025515-ladsgroup.json [02:55:18] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [02:55:19] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on 
wmf wikis - https://phabricator.wikimedia.org/T298565 [03:01:55] (NodeTextfileStale) firing: Stale textfile for ms-be2067:9100 - https://wikitech.wikimedia.org/wiki/Prometheus#Stale_file_for_node-exporter_textfile - https://grafana.wikimedia.org/d/knkl4dCWz/node-exporter-textfile - https://alerts.wikimedia.org/?q=alertname%3DNodeTextfileStale [03:05:42] RECOVERY - SSH on wtp1038.mgmt is OK: SSH OK - OpenSSH_7.0 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [03:06:11] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T298565)', diff saved to https://phabricator.wikimedia.org/P24991 and previous config saved to /var/cache/conftool/dbconfig/20220418-030610-ladsgroup.json [03:06:14] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [03:06:15] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [03:10:36] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T298565)', diff saved to https://phabricator.wikimedia.org/P24992 and previous config saved to /var/cache/conftool/dbconfig/20220418-031036-ladsgroup.json [03:10:40] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [03:21:16] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P24993 and previous config saved to /var/cache/conftool/dbconfig/20220418-032116-ladsgroup.json [03:21:18] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [03:25:41] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P24994 and 
previous config saved to /var/cache/conftool/dbconfig/20220418-032541-ladsgroup.json [03:25:43] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [03:36:21] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P24995 and previous config saved to /var/cache/conftool/dbconfig/20220418-033621-ladsgroup.json [03:36:23] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [03:40:46] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P24996 and previous config saved to /var/cache/conftool/dbconfig/20220418-034046-ladsgroup.json [03:40:49] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [03:51:26] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T298565)', diff saved to https://phabricator.wikimedia.org/P24997 and previous config saved to /var/cache/conftool/dbconfig/20220418-035126-ladsgroup.json [03:51:28] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance [03:51:29] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance [03:51:29] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [03:51:31] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [03:51:33] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [03:51:34] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1149 (T298565)', diff saved to https://phabricator.wikimedia.org/P24998 and 
previous config saved to /var/cache/conftool/dbconfig/20220418-035134-ladsgroup.json [03:51:35] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [03:51:38] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [03:55:51] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T298565)', diff saved to https://phabricator.wikimedia.org/P24999 and previous config saved to /var/cache/conftool/dbconfig/20220418-035551-ladsgroup.json [03:55:55] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [03:55:59] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance [03:56:01] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance [03:56:01] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [03:56:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [04:02:12] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1149 (T298565)', diff saved to https://phabricator.wikimedia.org/P25000 and previous config saved to /var/cache/conftool/dbconfig/20220418-040211-ladsgroup.json [04:02:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [04:02:16] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [04:05:32] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance [04:05:34] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance [04:05:34] Logged the 
message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [04:05:35] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance [04:05:37] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [04:05:39] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [04:05:41] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance [04:05:43] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [04:15:07] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance [04:15:08] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance [04:15:10] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [04:15:12] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [04:17:17] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P25001 and previous config saved to /var/cache/conftool/dbconfig/20220418-041716-ladsgroup.json [04:17:19] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [04:24:59] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance [04:25:01] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance [04:25:02] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [04:25:05] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [04:25:06] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1131 (T298565)', diff saved to https://phabricator.wikimedia.org/P25002 and previous config saved to 
/var/cache/conftool/dbconfig/20220418-042505-ladsgroup.json [04:25:09] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [04:25:10] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [04:29:26] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1131 (T298565)', diff saved to https://phabricator.wikimedia.org/P25003 and previous config saved to /var/cache/conftool/dbconfig/20220418-042925-ladsgroup.json [04:29:29] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [04:32:22] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P25004 and previous config saved to /var/cache/conftool/dbconfig/20220418-043221-ladsgroup.json [04:32:24] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [04:37:13] (KubernetesRsyslogDown) firing: rsyslog on kubernetes1018:9105 is missing kubernetes logs - https://wikitech.wikimedia.org/wiki/Kubernetes/Logging#Common_issues - https://alerts.wikimedia.org/?q=alertname%3DKubernetesRsyslogDown [04:44:31] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P25005 and previous config saved to /var/cache/conftool/dbconfig/20220418-044430-ladsgroup.json [04:44:33] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [04:45:02] PROBLEM - SSH on aqs1008.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [04:47:27] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1149 (T298565)', diff saved to 
https://phabricator.wikimedia.org/P25006 and previous config saved to /var/cache/conftool/dbconfig/20220418-044726-ladsgroup.json [04:47:29] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance [04:47:30] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance [04:47:30] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [04:47:31] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [04:47:33] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [04:47:35] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1148 (T298565)', diff saved to https://phabricator.wikimedia.org/P25007 and previous config saved to /var/cache/conftool/dbconfig/20220418-044735-ladsgroup.json [04:47:35] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [04:47:38] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [04:59:36] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P25008 and previous config saved to /var/cache/conftool/dbconfig/20220418-045935-ladsgroup.json [04:59:38] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [05:08:06] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1148 (T298565)', diff saved to https://phabricator.wikimedia.org/P25009 and previous config saved to /var/cache/conftool/dbconfig/20220418-050806-ladsgroup.json [05:08:10] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [05:08:11] T298565: Fix mismatching 
field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [05:13:40] RECOVERY - SSH on wtp1045.mgmt is OK: SSH OK - OpenSSH_7.0 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [05:14:41] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1131 (T298565)', diff saved to https://phabricator.wikimedia.org/P25010 and previous config saved to /var/cache/conftool/dbconfig/20220418-051440-ladsgroup.json [05:14:42] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance [05:14:44] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance [05:14:44] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [05:14:46] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [05:14:48] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [05:14:49] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1096:3316 (T298565)', diff saved to https://phabricator.wikimedia.org/P25011 and previous config saved to /var/cache/conftool/dbconfig/20220418-051448-ladsgroup.json [05:14:50] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [05:14:54] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [05:23:11] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to 
https://phabricator.wikimedia.org/P25012 and previous config saved to /var/cache/conftool/dbconfig/20220418-052311-ladsgroup.json [05:23:14] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [05:26:41] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 (T298565)', diff saved to https://phabricator.wikimedia.org/P25013 and previous config saved to /var/cache/conftool/dbconfig/20220418-052641-ladsgroup.json [05:26:46] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [05:26:47] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [05:38:17] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P25014 and previous config saved to /var/cache/conftool/dbconfig/20220418-053816-ladsgroup.json [05:38:19] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [05:41:46] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P25015 and previous config saved to /var/cache/conftool/dbconfig/20220418-054146-ladsgroup.json [05:41:48] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [05:47:54] (NodeTextfileStale) firing: Stale textfile for cloudcontrol2001-dev:9100 - https://wikitech.wikimedia.org/wiki/Prometheus#Stale_file_for_node-exporter_textfile - https://grafana.wikimedia.org/d/knkl4dCWz/node-exporter-textfile - https://alerts.wikimedia.org/?q=alertname%3DNodeTextfileStale [05:53:22] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1148 (T298565)', diff saved to https://phabricator.wikimedia.org/P25016 and 
previous config saved to /var/cache/conftool/dbconfig/20220418-055321-ladsgroup.json [05:53:23] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance [05:53:25] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance [05:53:25] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [05:53:26] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [05:53:28] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [05:53:32] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [05:56:51] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P25017 and previous config saved to /var/cache/conftool/dbconfig/20220418-055651-ladsgroup.json [05:56:54] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [06:02:10] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance [06:02:12] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance [06:02:13] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [06:02:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [06:02:17] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1147 (T298565)', diff saved to https://phabricator.wikimedia.org/P25018 and previous config saved to /var/cache/conftool/dbconfig/20220418-060216-ladsgroup.json [06:02:20] Logged the message at 
https://wikitech.wikimedia.org/wiki/Server_Admin_Log [06:02:21] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [06:11:56] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 (T298565)', diff saved to https://phabricator.wikimedia.org/P25019 and previous config saved to /var/cache/conftool/dbconfig/20220418-061156-ladsgroup.json [06:11:58] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance [06:11:59] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance [06:12:00] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [06:12:01] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [06:12:03] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [06:12:04] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1113:3316 (T298565)', diff saved to https://phabricator.wikimedia.org/P25020 and previous config saved to /var/cache/conftool/dbconfig/20220418-061204-ladsgroup.json [06:12:05] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [06:12:09] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [06:13:58] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1147 (T298565)', diff saved to https://phabricator.wikimedia.org/P25021 and previous config saved to 
/var/cache/conftool/dbconfig/20220418-061358-ladsgroup.json [06:14:02] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [06:15:18] (PS1) Majavah: kubernetes: Fix default resource handling [software/tools-webservice] - https://gerrit.wikimedia.org/r/783663 [06:16:49] (CR) jerkins-bot: [V: -1] kubernetes: Fix default resource handling [software/tools-webservice] - https://gerrit.wikimedia.org/r/783663 (owner: Majavah) [06:20:14] (PS2) Majavah: kubernetes: Fix default resource handling [software/tools-webservice] - https://gerrit.wikimedia.org/r/783663 [06:22:51] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T298565)', diff saved to https://phabricator.wikimedia.org/P25022 and previous config saved to /var/cache/conftool/dbconfig/20220418-062251-ladsgroup.json [06:22:55] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [06:22:56] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [06:29:03] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P25023 and previous config saved to /var/cache/conftool/dbconfig/20220418-062903-ladsgroup.json [06:29:06] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [06:32:55] (NodeTextfileStale) firing: (3) Stale textfile for elastic1075:9100 - https://wikitech.wikimedia.org/wiki/Prometheus#Stale_file_for_node-exporter_textfile - https://grafana.wikimedia.org/d/knkl4dCWz/node-exporter-textfile - https://alerts.wikimedia.org/?q=alertname%3DNodeTextfileStale [06:37:56] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to 
https://phabricator.wikimedia.org/P25024 and previous config saved to /var/cache/conftool/dbconfig/20220418-063756-ladsgroup.json [06:37:59] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [06:44:09] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P25025 and previous config saved to /var/cache/conftool/dbconfig/20220418-064408-ladsgroup.json [06:44:11] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [06:53:01] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P25026 and previous config saved to /var/cache/conftool/dbconfig/20220418-065301-ladsgroup.json [06:53:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [06:59:13] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1147 (T298565)', diff saved to https://phabricator.wikimedia.org/P25027 and previous config saved to /var/cache/conftool/dbconfig/20220418-065913-ladsgroup.json [06:59:15] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance [06:59:17] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance [06:59:17] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [06:59:19] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [06:59:21] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [06:59:22] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1144:3314 (T298565)', diff saved to 
https://phabricator.wikimedia.org/P25028 and previous config saved to /var/cache/conftool/dbconfig/20220418-065921-ladsgroup.json [06:59:23] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [06:59:26] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [07:00:05] Amir1, awight, Urbanecm, and taavi: It is that lovely time of the day again! You are hereby commanded to deploy UTC morning backport window. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20220418T0700). [07:00:05] nn1l2: A patch you scheduled for UTC morning backport window is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker. [07:01:55] (NodeTextfileStale) firing: Stale textfile for ms-be2067:9100 - https://wikitech.wikimedia.org/wiki/Prometheus#Stale_file_for_node-exporter_textfile - https://grafana.wikimedia.org/d/knkl4dCWz/node-exporter-textfile - https://alerts.wikimedia.org/?q=alertname%3DNodeTextfileStale [07:08:07] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T298565)', diff saved to https://phabricator.wikimedia.org/P25029 and previous config saved to /var/cache/conftool/dbconfig/20220418-070806-ladsgroup.json [07:08:08] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance [07:08:10] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance [07:08:10] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [07:08:11] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [07:08:13] Logged the message at 
https://wikitech.wikimedia.org/wiki/Server_Admin_Log [07:08:15] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1168 (T298565)', diff saved to https://phabricator.wikimedia.org/P25030 and previous config saved to /var/cache/conftool/dbconfig/20220418-070814-ladsgroup.json [07:08:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [07:08:18] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [07:10:03] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T298565)', diff saved to https://phabricator.wikimedia.org/P25031 and previous config saved to /var/cache/conftool/dbconfig/20220418-071002-ladsgroup.json [07:10:06] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [07:12:28] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1168 (T298565)', diff saved to https://phabricator.wikimedia.org/P25032 and previous config saved to /var/cache/conftool/dbconfig/20220418-071227-ladsgroup.json [07:12:31] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [07:25:08] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P25033 and previous config saved to /var/cache/conftool/dbconfig/20220418-072508-ladsgroup.json [07:25:10] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [07:27:33] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P25034 and previous config saved to /var/cache/conftool/dbconfig/20220418-072732-ladsgroup.json [07:27:35] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [07:40:13] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P25035 and previous config saved to 
/var/cache/conftool/dbconfig/20220418-074013-ladsgroup.json [07:40:16] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [07:42:38] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P25036 and previous config saved to /var/cache/conftool/dbconfig/20220418-074237-ladsgroup.json [07:42:40] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [07:48:20] RECOVERY - SSH on aqs1008.mgmt is OK: SSH OK - OpenSSH_7.0 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [07:55:18] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T298565)', diff saved to https://phabricator.wikimedia.org/P25037 and previous config saved to /var/cache/conftool/dbconfig/20220418-075518-ladsgroup.json [07:55:20] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance [07:55:21] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance [07:55:22] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [07:55:23] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [07:55:25] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [07:55:26] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1143 (T298565)', diff saved to https://phabricator.wikimedia.org/P25038 and previous config saved to /var/cache/conftool/dbconfig/20220418-075526-ladsgroup.json [07:55:28] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [07:55:31] Logged the 
message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [07:57:43] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1168 (T298565)', diff saved to https://phabricator.wikimedia.org/P25039 and previous config saved to /var/cache/conftool/dbconfig/20220418-075742-ladsgroup.json [07:57:45] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance [07:57:46] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [07:57:46] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance [07:57:47] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance [07:57:48] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [07:57:50] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance [07:57:51] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [07:57:53] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [07:57:55] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1165 (T298565)', diff saved to https://phabricator.wikimedia.org/P25040 and previous config saved to /var/cache/conftool/dbconfig/20220418-075755-ladsgroup.json [07:57:55] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [07:57:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:02:06] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1165 (T298565)', diff saved to https://phabricator.wikimedia.org/P25041 and previous config saved to /var/cache/conftool/dbconfig/20220418-080206-ladsgroup.json [08:02:09] Logged the message at 
https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:02:10] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [08:06:00] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1143 (T298565)', diff saved to https://phabricator.wikimedia.org/P25042 and previous config saved to /var/cache/conftool/dbconfig/20220418-080559-ladsgroup.json [08:06:03] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:17:11] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P25043 and previous config saved to /var/cache/conftool/dbconfig/20220418-081711-ladsgroup.json [08:17:14] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:21:05] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P25044 and previous config saved to /var/cache/conftool/dbconfig/20220418-082104-ladsgroup.json [08:21:07] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:32:16] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P25045 and previous config saved to /var/cache/conftool/dbconfig/20220418-083216-ladsgroup.json [08:32:19] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:32:41] (PS1) Marostegui: Revert "db2072: Disable notifications" [puppet] - https://gerrit.wikimedia.org/r/780639 [08:33:19] (CR) Marostegui: [C: +2] Revert "db2072: Disable notifications" [puppet] - https://gerrit.wikimedia.org/r/780639 (owner: Marostegui) [08:36:10] !log ladsgroup@cumin1001 dbctl 
commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P25046 and previous config saved to /var/cache/conftool/dbconfig/20220418-083609-ladsgroup.json [08:36:12] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:37:13] (KubernetesRsyslogDown) firing: rsyslog on kubernetes1018:9105 is missing kubernetes logs - https://wikitech.wikimedia.org/wiki/Kubernetes/Logging#Common_issues - https://alerts.wikimedia.org/?q=alertname%3DKubernetesRsyslogDown [08:37:34] (PS1) Marostegui: mariadb: Promote db1181 to s7 master [puppet] - https://gerrit.wikimedia.org/r/783819 (https://phabricator.wikimedia.org/T306001) [08:37:56] (CR) Marostegui: [C: -2] "Wait for the failover day" [puppet] - https://gerrit.wikimedia.org/r/783819 (https://phabricator.wikimedia.org/T306001) (owner: Marostegui) [08:39:24] (PS1) Marostegui: wmnet: Update s7-master CNAME [dns] - https://gerrit.wikimedia.org/r/783820 (https://phabricator.wikimedia.org/T306001) [08:39:52] (CR) Marostegui: [C: -2] "Wait for the failover date" [dns] - https://gerrit.wikimedia.org/r/783820 (https://phabricator.wikimedia.org/T306001) (owner: Marostegui) [08:44:21] (PS1) Marostegui: check_private_data_report: Add Amir's email [puppet] - https://gerrit.wikimedia.org/r/783821 [08:44:59] (CR) Marostegui: [C: +2] check_private_data_report: Add Amir's email [puppet] - https://gerrit.wikimedia.org/r/783821 (owner: Marostegui) [08:47:21] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1165 (T298565)', diff saved to https://phabricator.wikimedia.org/P25047 and previous config saved to /var/cache/conftool/dbconfig/20220418-084721-ladsgroup.json [08:47:23] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance [08:47:24] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 
db1180.eqiad.wmnet with reason: Maintenance [08:47:25] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:47:26] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [08:47:28] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:47:30] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1180 (T298565)', diff saved to https://phabricator.wikimedia.org/P25048 and previous config saved to /var/cache/conftool/dbconfig/20220418-084729-ladsgroup.json [08:47:30] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:47:33] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:51:15] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1143 (T298565)', diff saved to https://phabricator.wikimedia.org/P25049 and previous config saved to /var/cache/conftool/dbconfig/20220418-085114-ladsgroup.json [08:51:16] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance [08:51:18] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance [08:51:18] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:51:20] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:51:23] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:51:23] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1142 (T298565)', diff saved to https://phabricator.wikimedia.org/P25050 and previous config saved to /var/cache/conftool/dbconfig/20220418-085122-ladsgroup.json [08:51:26] Logged the message at 
https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:01:59] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1142 (T298565)', diff saved to https://phabricator.wikimedia.org/P25051 and previous config saved to /var/cache/conftool/dbconfig/20220418-090159-ladsgroup.json [09:02:02] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:02:04] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [09:11:51] !log dbmaint s7@eqiad T306269 [09:11:54] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:11:55] T306269: Make primary key ipblocks.ipb_id unsigned on wmf wikis - https://phabricator.wikimedia.org/T306269 [09:14:36] !log dbmaint s8@eqiad T306269 [09:14:40] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:17:04] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P25052 and previous config saved to /var/cache/conftool/dbconfig/20220418-091704-ladsgroup.json [09:17:07] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:17:44] (PS1) Marostegui: change_ipb_id_T306269.py: New schema change [software/schema-changes] - https://gerrit.wikimedia.org/r/783823 (https://phabricator.wikimedia.org/T306269) [09:19:48] !log dbmaint s2@eqiad T306269 [09:19:51] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:19:52] T306269: Make primary key ipblocks.ipb_id unsigned on wmf wikis - https://phabricator.wikimedia.org/T306269 [09:25:11] !log dbmaint s4@eqiad T306269 [09:25:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:25:15] T306269: Make primary key 
ipblocks.ipb_id unsigned on wmf wikis - https://phabricator.wikimedia.org/T306269 [09:29:35] !log dbmaint s5@eqiad T306269 [09:29:38] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:32:09] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P25053 and previous config saved to /var/cache/conftool/dbconfig/20220418-093209-ladsgroup.json [09:32:12] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:34:04] !log dbmaint s8@eqiad T306270 [09:34:06] !log dbmaint s7@eqiad T306270 [09:34:07] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:34:08] T306270: Make primary key ipblocks_restrictions.ir_ipb_id unsigned on wmf wikis - https://phabricator.wikimedia.org/T306270 [09:34:11] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:34:56] !log dbmaint s6@eqiad T306270 [09:34:59] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:36:46] !log dbmaint s2@eqiad T306270 [09:36:49] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:44:46] !log dbmaint s1@eqiad T306270 [09:44:49] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:44:50] T306270: Make primary key ipblocks_restrictions.ir_ipb_id unsigned on wmf wikis - https://phabricator.wikimedia.org/T306270 [09:45:49] !log dbmaint s4@eqiad T306270 [09:45:52] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:47:14] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1142 (T298565)', diff saved to https://phabricator.wikimedia.org/P25054 and previous config saved to /var/cache/conftool/dbconfig/20220418-094714-ladsgroup.json [09:47:16] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance [09:47:17] !log ladsgroup@cumin1001 END (PASS) - 
Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance [09:47:18] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:47:19] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [09:47:21] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:47:23] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1141 (T298565)', diff saved to https://phabricator.wikimedia.org/P25055 and previous config saved to /var/cache/conftool/dbconfig/20220418-094722-ladsgroup.json [09:47:23] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:47:26] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:47:44] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1180 (T298565)', diff saved to https://phabricator.wikimedia.org/P25056 and previous config saved to /var/cache/conftool/dbconfig/20220418-094743-ladsgroup.json [09:47:47] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:47:54] (NodeTextfileStale) firing: Stale textfile for cloudcontrol2001-dev:9100 - https://wikitech.wikimedia.org/wiki/Prometheus#Stale_file_for_node-exporter_textfile - https://grafana.wikimedia.org/d/knkl4dCWz/node-exporter-textfile - https://alerts.wikimedia.org/?q=alertname%3DNodeTextfileStale [09:51:52] !log dbmaint s5@eqiad T306270 [09:51:55] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:51:57] T306270: Make primary key ipblocks_restrictions.ir_ipb_id unsigned on wmf wikis - https://phabricator.wikimedia.org/T306270 [09:57:57] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1141 (T298565)', diff 
saved to https://phabricator.wikimedia.org/P25057 and previous config saved to /var/cache/conftool/dbconfig/20220418-095756-ladsgroup.json [09:58:00] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:58:02] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [10:02:49] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P25058 and previous config saved to /var/cache/conftool/dbconfig/20220418-100249-ladsgroup.json [10:02:52] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:06:24] !log dbmaint s3@eqiad T306270 [10:06:27] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:06:28] T306270: Make primary key ipblocks_restrictions.ir_ipb_id unsigned on wmf wikis - https://phabricator.wikimedia.org/T306270 [10:13:02] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P25059 and previous config saved to /var/cache/conftool/dbconfig/20220418-101301-ladsgroup.json [10:13:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:17:54] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P25060 and previous config saved to /var/cache/conftool/dbconfig/20220418-101754-ladsgroup.json [10:17:57] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:20:29] (KubernetesRsyslogDown) firing: (4) rsyslog on ml-staging-ctrl2001:9105 is missing kubernetes logs - https://wikitech.wikimedia.org/wiki/Kubernetes/Logging#Common_issues - 
https://alerts.wikimedia.org/?q=alertname%3DKubernetesRsyslogDown [10:28:07] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P25061 and previous config saved to /var/cache/conftool/dbconfig/20220418-102806-ladsgroup.json [10:28:09] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:30:23] !log dbmaint s1@eqiad T297189 [10:30:27] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:30:28] T297189: Schema change for dropping ft_title and ft_namespace - https://phabricator.wikimedia.org/T297189 [10:32:55] (NodeTextfileStale) firing: (3) Stale textfile for elastic1075:9100 - https://wikitech.wikimedia.org/wiki/Prometheus#Stale_file_for_node-exporter_textfile - https://grafana.wikimedia.org/d/knkl4dCWz/node-exporter-textfile - https://alerts.wikimedia.org/?q=alertname%3DNodeTextfileStale [10:32:59] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1180 (T298565)', diff saved to https://phabricator.wikimedia.org/P25062 and previous config saved to /var/cache/conftool/dbconfig/20220418-103259-ladsgroup.json [10:33:01] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance [10:33:02] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance [10:33:03] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:33:04] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [10:33:06] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:33:07] !log ladsgroup@cumin1001 dbctl commit 
(dc=all): 'Depooling db1098:3316 (T298565)', diff saved to https://phabricator.wikimedia.org/P25063 and previous config saved to /var/cache/conftool/dbconfig/20220418-103307-ladsgroup.json [10:33:08] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:33:11] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:43:12] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1141 (T298565)', diff saved to https://phabricator.wikimedia.org/P25064 and previous config saved to /var/cache/conftool/dbconfig/20220418-104311-ladsgroup.json [10:43:13] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance [10:43:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:43:15] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance [10:43:16] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [10:43:16] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance [10:43:18] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:43:19] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance [10:43:20] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:43:22] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:43:24] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1121 (T298565)', diff saved to 
https://phabricator.wikimedia.org/P25065 and previous config saved to /var/cache/conftool/dbconfig/20220418-104323-ladsgroup.json [10:43:24] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:43:28] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:51:57] PROBLEM - SSH on aqs1008.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:01:55] (NodeTextfileStale) firing: Stale textfile for ms-be2067:9100 - https://wikitech.wikimedia.org/wiki/Prometheus#Stale_file_for_node-exporter_textfile - https://grafana.wikimedia.org/d/knkl4dCWz/node-exporter-textfile - https://alerts.wikimedia.org/?q=alertname%3DNodeTextfileStale [11:04:33] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1121 (T298565)', diff saved to https://phabricator.wikimedia.org/P25066 and previous config saved to /var/cache/conftool/dbconfig/20220418-110432-ladsgroup.json [11:04:36] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:04:38] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [11:19:38] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P25067 and previous config saved to /var/cache/conftool/dbconfig/20220418-111937-ladsgroup.json [11:19:40] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:33:22] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T298565)', diff saved to https://phabricator.wikimedia.org/P25068 and previous config saved to 
/var/cache/conftool/dbconfig/20220418-113322-ladsgroup.json [11:33:25] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:33:26] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [11:34:43] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P25069 and previous config saved to /var/cache/conftool/dbconfig/20220418-113442-ladsgroup.json [11:34:45] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:48:27] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P25070 and previous config saved to /var/cache/conftool/dbconfig/20220418-114827-ladsgroup.json [11:48:29] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:49:48] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1121 (T298565)', diff saved to https://phabricator.wikimedia.org/P25071 and previous config saved to /var/cache/conftool/dbconfig/20220418-114947-ladsgroup.json [11:49:51] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:49:52] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [11:49:53] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance [11:49:54] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 
db2110.codfw.wmnet with reason: Maintenance [11:49:55] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:49:56] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 12:00:00 on 12 hosts with reason: Maintenance [11:49:57] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:49:59] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:50:04] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 12 hosts with reason: Maintenance [11:50:06] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:59:08] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1138.eqiad.wmnet with reason: Maintenance [11:59:10] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1138.eqiad.wmnet with reason: Maintenance [11:59:11] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:59:13] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:59:15] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1138 (T298565)', diff saved to https://phabricator.wikimedia.org/P25072 and previous config saved to /var/cache/conftool/dbconfig/20220418-115914-ladsgroup.json [11:59:18] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:59:19] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [12:03:32] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P25073 and previous config saved to /var/cache/conftool/dbconfig/20220418-120332-ladsgroup.json [12:03:34] Logged the 
message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:18:37] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T298565)', diff saved to https://phabricator.wikimedia.org/P25074 and previous config saved to /var/cache/conftool/dbconfig/20220418-121837-ladsgroup.json [12:18:41] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:18:42] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [12:18:46] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance [12:18:47] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance [12:18:48] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:18:49] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance [12:18:50] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:18:51] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance [12:18:52] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:18:54] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:18:56] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1165 (T298565)', diff saved to https://phabricator.wikimedia.org/P25075 and previous config saved to /var/cache/conftool/dbconfig/20220418-121856-ladsgroup.json [12:19:00] Logged the message at 
https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:23:11] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1165 (T298565)', diff saved to https://phabricator.wikimedia.org/P25076 and previous config saved to /var/cache/conftool/dbconfig/20220418-122309-ladsgroup.json [12:23:16] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:37:13] (KubernetesRsyslogDown) firing: rsyslog on kubernetes1018:9105 is missing kubernetes logs - https://wikitech.wikimedia.org/wiki/Kubernetes/Logging#Common_issues - https://alerts.wikimedia.org/?q=alertname%3DKubernetesRsyslogDown [12:38:16] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P25077 and previous config saved to /var/cache/conftool/dbconfig/20220418-123816-ladsgroup.json [12:38:19] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:53:21] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P25078 and previous config saved to /var/cache/conftool/dbconfig/20220418-125321-ladsgroup.json [12:53:24] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:54:19] RECOVERY - SSH on aqs1008.mgmt is OK: SSH OK - OpenSSH_7.0 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [12:59:06] (PS1) Zabe: update some composer packages [mediawiki-config] - https://gerrit.wikimedia.org/r/783842 [12:59:29] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1138 (T298565)', diff saved to https://phabricator.wikimedia.org/P25079 and previous config saved to /var/cache/conftool/dbconfig/20220418-125929-ladsgroup.json [12:59:32] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:59:33] T298565: Fix mismatching field type of user table for columns 
user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [12:59:35] (CR) jerkins-bot: [V: -1] update some composer packages [mediawiki-config] - https://gerrit.wikimedia.org/r/783842 (owner: Zabe) [13:00:04] RoanKattouw, Lucas_WMDE, and Urbanecm: How many deployers does it take to do UTC afternoon backport window deploy? (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20220418T1300). [13:00:04] koi and zabe: A patch you scheduled for UTC afternoon backport window is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker. [13:00:14] i can deploy today [13:00:17] hi [13:00:19] hello koi and zabe [13:00:31] hi [13:01:06] actually, I remove my patch from the window [13:01:16] zabe: ack [13:02:33] (CR) Urbanecm: [C: -1] "Please update logos/config.yaml to include the Commons filename (https://github.com/wikimedia/operations-mediawiki-config/blob/master/logo" [mediawiki-config] - https://gerrit.wikimedia.org/r/783417 (https://phabricator.wikimedia.org/T306037) (owner: Stang) [13:03:11] ack, doing [13:03:36] thanks koi [13:03:40] reviewing the second patch in the meantime [13:05:40] (CR) Urbanecm: [C: +2] Increase autoconfirmed threshold to 10 edits on iswiki [mediawiki-config] - https://gerrit.wikimedia.org/r/783445 (https://phabricator.wikimedia.org/T306305) (owner: Stang) [13:06:19] koi: i'll verify this one for you once it is at mwdebug, so you can focus on the other patch :). [13:06:24] (Merged) jenkins-bot: Increase autoconfirmed threshold to 10 edits on iswiki [mediawiki-config] - https://gerrit.wikimedia.org/r/783445 (https://phabricator.wikimedia.org/T306305) (owner: Stang) [13:06:32] thanks! 
[13:07:41] (PS2) Zabe: update some composer packages [mediawiki-config] - https://gerrit.wikimedia.org/r/783842 [13:07:57] testing by checking https://is.wikipedia.org/wiki/Kerfiss%C3%AD%C3%B0a:Notandar%C3%A9ttindi/Martin_Urbanec and https://is.wikipedia.org/wiki/Kerfiss%C3%AD%C3%B0a:Notandar%C3%A9ttindi/Martin_Urbanec_(WMF). Volunteer acc has 17 edits and is still autoconfirmed, WMF one has 1 edit and is no longer autoconfirmed. [13:08:00] calling that a success => syncing [13:08:08] (CR) jerkins-bot: [V: -1] update some composer packages [mediawiki-config] - https://gerrit.wikimedia.org/r/783842 (owner: Zabe) [13:08:14] (PS10) Ssingh: dnsrecursor: refactor module (see detailed commit message) [puppet] - https://gerrit.wikimedia.org/r/779936 (https://phabricator.wikimedia.org/T305589) [13:08:26] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1165 (T298565)', diff saved to https://phabricator.wikimedia.org/P25080 and previous config saved to /var/cache/conftool/dbconfig/20220418-130826-ladsgroup.json [13:08:28] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance [13:08:30] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance [13:08:30] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:08:31] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [13:08:33] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:08:35] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1168 (T298565)', diff saved to https://phabricator.wikimedia.org/P25081 and previous config 
saved to /var/cache/conftool/dbconfig/20220418-130834-ladsgroup.json [13:08:35] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:08:38] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:09:21] !log urbanecm@deploy1002 Synchronized wmf-config/InitialiseSettings.php: c90079a: Increase autoconfirmed threshold to 10 edits on iswiki (T306305) (duration: 00m 53s) [13:09:24] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:09:24] T306305: Change autoconfirmed edits in is.wikipedia - https://phabricator.wikimedia.org/T306305 [13:09:47] koi: iswiki patch is live now. Please let me know once the other one is ready (or if you need any help). [13:10:11] thanks, just wondering why I need to run `tox` command [13:10:55] It downloads the svg and generates the pngs from it and compresses them [13:10:59] koi: the tox command is responsible for maintaining the png files these days (they shouldn't be created manually these days). 
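A step like the one described above, which generates and compresses PNGs, may depend on external command-line tools being installed locally. The following is a minimal sketch of a pre-flight check, assuming (not confirmed by this log) that the build shells out to binaries found on PATH. The function name `missing_tools` and the tool list are hypothetical and only for illustration; they are not part of the mediawiki-config repository.

```python
import shutil
from typing import Iterable, List


def missing_tools(tools: Iterable[str]) -> List[str]:
    """Return the names from `tools` that cannot be found on PATH."""
    # shutil.which returns None when no executable with that name is found.
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    # Hypothetical tool list; substitute whatever the build actually invokes.
    required = ["pngquant"]
    missing = missing_tools(required)
    if missing:
        print("Missing external tools: " + ", ".join(missing))
    else:
        print("All required external tools found.")
```

Running a check like this before the build would surface a missing binary as a readable message rather than as an exception raised partway through the run.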
[13:11:18] (CR) Ssingh: [V: +1] "PCC SUCCESS (DIFF 3): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/34872/console" [puppet] - https://gerrit.wikimedia.org/r/779936 (https://phabricator.wikimedia.org/T305589) (owner: Ssingh) [13:11:54] in this case, it's there to ensure the new config in `config.yaml` will result in the same PNGs as you upload to gerrit [13:12:07] there are some docs: https://gerrit.wikimedia.org/r/plugins/gitiles/operations/mediawiki-config/+/refs/heads/master/logos/README.md [13:12:12] thanks zabe [13:12:49] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1168 (T298565)', diff saved to https://phabricator.wikimedia.org/P25082 and previous config saved to /var/cache/conftool/dbconfig/20220418-131249-ladsgroup.json [13:12:52] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:13:32] hmm, it said `FileNotFoundError: [Errno 2] No such file or directory: 'pngquant'` [13:13:46] it means that you don't have pngquant installed [13:14:11] !log mwdebug-deploy@deploy1002 helmfile [eqiad] START helmfile.d/services/mwdebug: apply [13:14:13] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:14:14] !log mwdebug-deploy@deploy1002 helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [13:14:15] !log mwdebug-deploy@deploy1002 helmfile [codfw] START helmfile.d/services/mwdebug: apply [13:14:16] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:14:19] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:14:19] !log mwdebug-deploy@deploy1002 helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [13:14:20] (on a Debian-like system, `sudo apt install pngquant` should do the trick) [13:14:21] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:14:34] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1138', diff saved to 
https://phabricator.wikimedia.org/P25083 and previous config saved to /var/cache/conftool/dbconfig/20220418-131434-ladsgroup.json [13:14:36] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:16:10] (PS3) Stang: Wikispecies: update logo to prevent being obscured [mediawiki-config] - https://gerrit.wikimedia.org/r/783417 (https://phabricator.wikimedia.org/T306037) [13:17:29] urbanecm ^ [13:17:37] koi: looks good to me. let's try it :) [13:17:40] (CR) Urbanecm: [C: +2] Wikispecies: update logo to prevent being obscured [mediawiki-config] - https://gerrit.wikimedia.org/r/783417 (https://phabricator.wikimedia.org/T306037) (owner: Stang) [13:18:20] (Merged) jenkins-bot: Wikispecies: update logo to prevent being obscured [mediawiki-config] - https://gerrit.wikimedia.org/r/783417 (https://phabricator.wikimedia.org/T306037) (owner: Stang) [13:18:43] koi: your patch is at mwdebug1001. can you check? [13:18:48] ok [13:19:09] lgtm [13:20:02] syncing [13:21:48] !log urbanecm@deploy1002 Synchronized static/images/project-logos/: c927c3a: Wikispecies: update logo to prevent being obscured (T306037; 1/2) (duration: 00m 51s) [13:21:52] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:21:53] T306037: Optimize logo for Wikispecies - https://phabricator.wikimedia.org/T306037 [13:22:16] (CR) Ladsgroup: [C: +1] wmnet: Update s7-master CNAME [dns] - https://gerrit.wikimedia.org/r/783820 (https://phabricator.wikimedia.org/T306001) (owner: Marostegui) [13:22:44] !log urbanecm@deploy1002 Synchronized logos/config.yaml: c927c3a: Wikispecies: update logo to prevent being obscured (T306037; 2/2) (duration: 00m 55s) [13:22:48] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:22:56] (PS2) Bking: elasticsearch: upgrade eqiad to elasticsearch 6.8 [puppet] - https://gerrit.wikimedia.org/r/763484 (https://phabricator.wikimedia.org/T301959) (owner: Gehel) [13:23:33] 
(CR) Bking: [V: +2] elasticsearch: upgrade eqiad to elasticsearch 6.8 [puppet] - https://gerrit.wikimedia.org/r/763484 (https://phabricator.wikimedia.org/T301959) (owner: Gehel) [13:23:38] (CR) Bking: [V: +2 C: +2] elasticsearch: upgrade eqiad to elasticsearch 6.8 [puppet] - https://gerrit.wikimedia.org/r/763484 (https://phabricator.wikimedia.org/T301959) (owner: Gehel) [13:23:57] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance [13:23:58] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance [13:23:59] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:24:00] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance [13:24:02] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:24:03] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance [13:24:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:24:06] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:24:08] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1112 (T298565)', diff saved to https://phabricator.wikimedia.org/P25084 and previous config saved to /var/cache/conftool/dbconfig/20220418-132407-ladsgroup.json [13:24:11] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:24:12] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 
[13:24:27] !log mwdebug-deploy@deploy1002 helmfile [eqiad] START helmfile.d/services/mwdebug: apply [13:24:29] (PS1) Bking: Revert "elasticsearch: upgrade eqiad to elasticsearch 6.8" [puppet] - https://gerrit.wikimedia.org/r/780640 [13:24:30] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:24:31] !log mwdebug-deploy@deploy1002 helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [13:24:32] !log mwdebug-deploy@deploy1002 helmfile [codfw] START helmfile.d/services/mwdebug: apply [13:24:32] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:24:35] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:24:36] !log mwdebug-deploy@deploy1002 helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [13:24:40] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:25:25] (CR) Andrew Bogott: [C: +1] "lgtm." [puppet] - https://gerrit.wikimedia.org/r/781977 (https://phabricator.wikimedia.org/T274666) (owner: Majavah) [13:26:17] (CR) Andrew Bogott: [C: +2] openstack: make wmf_sink authenticate to enc api via keystone [puppet] - https://gerrit.wikimedia.org/r/781977 (https://phabricator.wikimedia.org/T274666) (owner: Majavah) [13:26:35] (CR) Ladsgroup: [C: +1] mariadb: Promote db1181 to s7 master (1 comment) [puppet] - https://gerrit.wikimedia.org/r/783819 (https://phabricator.wikimedia.org/T306001) (owner: Marostegui) [13:27:06] (CR) Ladsgroup: "Thanks. I didn't know I needed to be there." 
[puppet] - https://gerrit.wikimedia.org/r/783821 (owner: Marostegui) [13:27:54] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P25085 and previous config saved to /var/cache/conftool/dbconfig/20220418-132754-ladsgroup.json [13:27:57] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:29:39] !log mwdebug-deploy@deploy1002 helmfile [eqiad] START helmfile.d/services/mwdebug: apply [13:29:39] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1138', diff saved to https://phabricator.wikimedia.org/P25086 and previous config saved to /var/cache/conftool/dbconfig/20220418-132939-ladsgroup.json [13:29:41] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:29:42] !log mwdebug-deploy@deploy1002 helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [13:29:43] !log mwdebug-deploy@deploy1002 helmfile [codfw] START helmfile.d/services/mwdebug: apply [13:29:44] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:29:46] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:29:46] !log mwdebug-deploy@deploy1002 helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [13:29:48] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:29:51] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:39:44] (PS1) Ladsgroup: maintain-views: Drop views on revision_actor_temp [puppet] - https://gerrit.wikimedia.org/r/783845 (https://phabricator.wikimedia.org/T275246) [13:42:25] PROBLEM - Widespread puppet agent failures- no resources reported on alert1001 is CRITICAL: 0.01213 ge 0.01 https://wikitech.wikimedia.org/wiki/Monitoring/Missing_notes_link https://grafana.wikimedia.org/d/yOxVDGvWk/puppet [13:42:30] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1112 (T298565)', diff saved 
to https://phabricator.wikimedia.org/P25087 and previous config saved to /var/cache/conftool/dbconfig/20220418-134229-ladsgroup.json [13:42:33] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:42:34] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [13:42:59] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P25088 and previous config saved to /var/cache/conftool/dbconfig/20220418-134259-ladsgroup.json [13:43:01] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:44:20] (CR) Ladsgroup: "A couple hundred million revisions in s8 is not backfilled yet but I think it'll be done by later this week." [puppet] - https://gerrit.wikimedia.org/r/783845 (https://phabricator.wikimedia.org/T275246) (owner: Ladsgroup) [13:44:44] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1138 (T298565)', diff saved to https://phabricator.wikimedia.org/P25089 and previous config saved to /var/cache/conftool/dbconfig/20220418-134444-ladsgroup.json [13:44:48] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:44:55] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance [13:44:56] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance [13:44:57] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:44:57] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 12:00:00 on 12 hosts with reason: Maintenance [13:44:59] Logged the message at 
https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:45:01] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:45:06] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 12 hosts with reason: Maintenance [13:45:08] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:47:17] (CR) Marostegui: [C: +1] mariadb: Stop special-casing db2093 [puppet] - https://gerrit.wikimedia.org/r/775852 (https://phabricator.wikimedia.org/T301315) (owner: Kormat) [13:47:55] (NodeTextfileStale) firing: Stale textfile for cloudcontrol2001-dev:9100 - https://wikitech.wikimedia.org/wiki/Prometheus#Stale_file_for_node-exporter_textfile - https://grafana.wikimedia.org/d/knkl4dCWz/node-exporter-textfile - https://alerts.wikimedia.org/?q=alertname%3DNodeTextfileStale [13:50:46] (PS2) Alexandros Kosiaris: Add dummy tokens for developer-portal [labs/private] - https://gerrit.wikimedia.org/r/773268 (https://phabricator.wikimedia.org/T297140) (owner: Majavah) [13:52:38] (PS2) Alexandros Kosiaris: Add developer-portal k8s accounts [puppet] - https://gerrit.wikimedia.org/r/773270 (https://phabricator.wikimedia.org/T297140) (owner: Majavah) [13:52:43] (CR) Alexandros Kosiaris: [V: +2 C: +2] Add dummy tokens for developer-portal [labs/private] - https://gerrit.wikimedia.org/r/773268 (https://phabricator.wikimedia.org/T297140) (owner: Majavah) [13:53:56] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance [13:53:58] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance [13:53:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:53:59] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance 
[13:54:02] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance [13:54:03] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:54:05] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:54:07] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1121 (T298565)', diff saved to https://phabricator.wikimedia.org/P25090 and previous config saved to /var/cache/conftool/dbconfig/20220418-135406-ladsgroup.json [13:54:07] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:54:10] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:54:11] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [13:57:35] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P25091 and previous config saved to /var/cache/conftool/dbconfig/20220418-135734-ladsgroup.json [13:57:37] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:58:04] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1168 (T298565)', diff saved to https://phabricator.wikimedia.org/P25092 and previous config saved to /var/cache/conftool/dbconfig/20220418-135804-ladsgroup.json [13:58:06] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance [13:58:07] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance [13:58:08] Logged the message at 
https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:58:10] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:58:13] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1113:3316 (T298565)', diff saved to https://phabricator.wikimedia.org/P25093 and previous config saved to /var/cache/conftool/dbconfig/20220418-135812-ladsgroup.json [13:58:13] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:58:16] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:58:45] (JobUnavailable) firing: (2) Reduced availability for job cloud_dev_pdns in ops@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable [13:59:21] (PS1) KartikMistry: TTMServerAid::getData: Do not swallow TranslationHelperException [extensions/Translate] (wmf/1.39.0-wmf.7) - https://gerrit.wikimedia.org/r/780641 (https://phabricator.wikimedia.org/T306233) [13:59:45] (CR) Marostegui: [C: -2] mariadb: Promote db1181 to s7 master (1 comment) [puppet] - https://gerrit.wikimedia.org/r/783819 (https://phabricator.wikimedia.org/T306001) (owner: Marostegui) [14:04:19] (CR) Abijeet Patro: [C: +1] TTMServerAid::getData: Do not swallow TranslationHelperException [extensions/Translate] (wmf/1.39.0-wmf.7) - https://gerrit.wikimedia.org/r/780641 (https://phabricator.wikimedia.org/T306233) (owner: KartikMistry) [14:04:24] (CR) Ladsgroup: [C: +1] change_ipb_id_T306269.py: New schema change [software/schema-changes] - https://gerrit.wikimedia.org/r/783823 (https://phabricator.wikimedia.org/T306269) (owner: Marostegui) [14:04:31] (CR) Marostegui: [C: +2] change_ipb_id_T306269.py: New schema change [software/schema-changes] - https://gerrit.wikimedia.org/r/783823 (https://phabricator.wikimedia.org/T306269) (owner: Marostegui) [14:04:58] 
(Merged) jenkins-bot: change_ipb_id_T306269.py: New schema change [software/schema-changes] - https://gerrit.wikimedia.org/r/783823 (https://phabricator.wikimedia.org/T306269) (owner: Marostegui) [14:06:48] (PS1) Bking: Elastic: use major version only for 'config_version' [puppet] - https://gerrit.wikimedia.org/r/783848 (https://phabricator.wikimedia.org/T301959) [14:07:07] (CR) Alexandros Kosiaris: [C: +2] Add developer-portal k8s accounts [puppet] - https://gerrit.wikimedia.org/r/773270 (https://phabricator.wikimedia.org/T297140) (owner: Majavah) [14:07:22] (CR) jerkins-bot: [V: -1] Elastic: use major version only for 'config_version' [puppet] - https://gerrit.wikimedia.org/r/783848 (https://phabricator.wikimedia.org/T301959) (owner: Bking) [14:07:40] (PS2) Bking: Elastic: use major version only for 'config_version' [puppet] - https://gerrit.wikimedia.org/r/783848 (https://phabricator.wikimedia.org/T301959) [14:08:12] (CR) jerkins-bot: [V: -1] Elastic: use major version only for 'config_version' [puppet] - https://gerrit.wikimedia.org/r/783848 (https://phabricator.wikimedia.org/T301959) (owner: Bking) [14:09:14] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T298565)', diff saved to https://phabricator.wikimedia.org/P25094 and previous config saved to /var/cache/conftool/dbconfig/20220418-140914-ladsgroup.json [14:09:18] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:09:19] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [14:10:33] (PS3) Bking: Elastic: use major version only for 'config_version' [puppet] - https://gerrit.wikimedia.org/r/783848 (https://phabricator.wikimedia.org/T301959) 
[14:10:51] (PS2) Alexandros Kosiaris: admin: add developer-portal namespace [deployment-charts] - https://gerrit.wikimedia.org/r/773267 (https://phabricator.wikimedia.org/T297140) (owner: Majavah) [14:12:25] (CR) Alexandros Kosiaris: [C: -1] helmfile.d: add developer-portal (1 comment) [deployment-charts] - https://gerrit.wikimedia.org/r/773995 (https://phabricator.wikimedia.org/T297140) (owner: Majavah) [14:12:40] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P25095 and previous config saved to /var/cache/conftool/dbconfig/20220418-141239-ladsgroup.json [14:12:42] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:13:19] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1121 (T298565)', diff saved to https://phabricator.wikimedia.org/P25096 and previous config saved to /var/cache/conftool/dbconfig/20220418-141319-ladsgroup.json [14:13:22] (CR) Bking: "check experimental" [puppet] - https://gerrit.wikimedia.org/r/783848 (https://phabricator.wikimedia.org/T301959) (owner: Bking) [14:13:22] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:13:59] (CR) jerkins-bot: [V: -1] TTMServerAid::getData: Do not swallow TranslationHelperException [extensions/Translate] (wmf/1.39.0-wmf.7) - https://gerrit.wikimedia.org/r/780641 (https://phabricator.wikimedia.org/T306233) (owner: KartikMistry) [14:15:12] (PS3) Majavah: admin: add developer-portal namespace [deployment-charts] - https://gerrit.wikimedia.org/r/773267 (https://phabricator.wikimedia.org/T297140) [14:15:14] (PS2) Majavah: Add developer-portal chart [deployment-charts] - https://gerrit.wikimedia.org/r/773994 (https://phabricator.wikimedia.org/T297140) [14:15:16] (PS2) Majavah: helmfile.d: add developer-portal [deployment-charts] - https://gerrit.wikimedia.org/r/773995 
(https://phabricator.wikimedia.org/T297140) [14:15:57] (03CR) 10Majavah: helmfile.d: add developer-portal (031 comment) [deployment-charts] - 10https://gerrit.wikimedia.org/r/773995 (https://phabricator.wikimedia.org/T297140) (owner: 10Majavah) [14:17:33] (03PS4) 10Majavah: admin: add developer-portal namespace [deployment-charts] - 10https://gerrit.wikimedia.org/r/773267 (https://phabricator.wikimedia.org/T297140) [14:17:34] (03PS3) 10Majavah: Add developer-portal chart [deployment-charts] - 10https://gerrit.wikimedia.org/r/773994 (https://phabricator.wikimedia.org/T297140) [14:17:36] (03PS3) 10Majavah: helmfile.d: add developer-portal [deployment-charts] - 10https://gerrit.wikimedia.org/r/773995 (https://phabricator.wikimedia.org/T297140) [14:18:06] (03CR) 10Bking: [V: 03+1] "PCC SUCCESS (NOOP 1): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/34873/console" [puppet] - 10https://gerrit.wikimedia.org/r/783848 (https://phabricator.wikimedia.org/T301959) (owner: 10Bking) [14:19:06] (03PS2) 10Ladsgroup: TimedMediaHandler: Make videojs the only player on Commons [mediawiki-config] - 10https://gerrit.wikimedia.org/r/780925 (https://phabricator.wikimedia.org/T248418) (owner: 10Jforrester) [14:19:20] jouncebot: nowandnext [14:19:20] No deployments scheduled for the next 1 hour(s) and 10 minute(s) [14:19:20] In 1 hour(s) and 10 minute(s): Wikimedia Portals Update (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20220418T1530) [14:19:27] cooolio [14:19:31] (03CR) 10Ladsgroup: [C: 03+2] TimedMediaHandler: Make videojs the only player on Commons [mediawiki-config] - 10https://gerrit.wikimedia.org/r/780925 (https://phabricator.wikimedia.org/T248418) (owner: 10Jforrester) [14:20:10] (03CR) 10jerkins-bot: [V: 04-1] helmfile.d: add developer-portal [deployment-charts] - 10https://gerrit.wikimedia.org/r/773995 (https://phabricator.wikimedia.org/T297140) (owner: 10Majavah) [14:20:13] (03Merged) 10jenkins-bot: TimedMediaHandler: Make 
videojs the only player on Commons [mediawiki-config] - 10https://gerrit.wikimedia.org/r/780925 (https://phabricator.wikimedia.org/T248418) (owner: 10Jforrester) [14:21:41] !log ladsgroup@deploy1002 Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:780925|TimedMediaHandler: Make videojs the only player on Commons (T248418)]] (duration: 00m 50s) [14:21:45] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:21:46] T248418: Roll out videojs as the only video/audio player on all Wikimedia wikis - https://phabricator.wikimedia.org/T248418 [14:22:44] (KubernetesRsyslogDown) firing: (4) rsyslog on ml-staging-ctrl2001:9105 is missing kubernetes logs - https://wikitech.wikimedia.org/wiki/Kubernetes/Logging#Common_issues - https://alerts.wikimedia.org/?q=alertname%3DKubernetesRsyslogDown [14:24:09] Let's hope there's no stampede. :-) [14:24:22] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P25097 and previous config saved to /var/cache/conftool/dbconfig/20220418-142421-ladsgroup.json [14:24:25] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:24:50] Amir1: Note that my group1 and group2 patches also drop it as a beta feature on wikis, but I couldn't be bothered to add all the other wikis into the existing patch tree. [14:25:02] (03CR) 10Alexandros Kosiaris: [C: 03+2] "Thanks, I was about to remove the empty datahub: {} line myself, beat me to it." 
[deployment-charts] - 10https://gerrit.wikimedia.org/r/773267 (https://phabricator.wikimedia.org/T297140) (owner: 10Majavah) [14:25:12] !log mwdebug-deploy@deploy1002 helmfile [eqiad] START helmfile.d/services/mwdebug: apply [14:25:14] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:25:15] !log mwdebug-deploy@deploy1002 helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [14:25:16] !log mwdebug-deploy@deploy1002 helmfile [codfw] START helmfile.d/services/mwdebug: apply [14:25:17] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:25:20] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:25:20] !log mwdebug-deploy@deploy1002 helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [14:25:22] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:25:23] James_F: hmm, shouldn't we keep it until the PC cache expires? [14:25:47] Amir1: Shouldn't break things I think? I'm trying to remember the config. [14:25:58] Amir1: FWIW, that's what we did for the group0 roll-out. [14:26:19] we changed that because of the cache stampede problem [14:26:38] Not the config but the code inside the extension, ISTR? [14:26:38] basically you still have to get the new thing when you enable the beta option [14:26:44] yup [14:27:10] because with the default changing, you're not guaranteed to get the new thing [14:27:11] So if there's a cache hit, with the new config it will just give you whatever's in the cache. [14:27:17] yup [14:27:22] unless you enabled beta [14:27:33] And if there's a cache miss, with the new config it will default to videojs. [14:27:33] because otherwise it'd be confusing to the user [14:27:40] So where's the stampede? 
[14:27:45] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1112 (T298565)', diff saved to https://phabricator.wikimedia.org/P25098 and previous config saved to /var/cache/conftool/dbconfig/20220418-142744-ladsgroup.json [14:27:46] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance [14:27:48] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance [14:27:48] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:27:49] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [14:27:51] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:27:53] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1175 (T298565)', diff saved to https://phabricator.wikimedia.org/P25099 and previous config saved to /var/cache/conftool/dbconfig/20220418-142752-ladsgroup.json [14:27:54] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:27:57] the stampede was before this change [14:27:57] Oh, we only vary the cache key if the beta is active? 
[14:27:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:28:15] James_F: I think so [14:28:24] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P25100 and previous config saved to /var/cache/conftool/dbconfig/20220418-142824-ladsgroup.json [14:28:27] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:28:30] But https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/612348/5/wmf-config/InitialiseSettings.php should be safe once you're happy to proceed. [14:28:42] (Maybe not today, however. :-)) [14:28:48] (03Merged) 10jenkins-bot: admin: add developer-portal namespace [deployment-charts] - 10https://gerrit.wikimedia.org/r/773267 (https://phabricator.wikimedia.org/T297140) (owner: 10Majavah) [14:29:31] I think it's partially okay as long as the cache has expired, e.g. desktop improvement wikis [14:29:42] Which will be fine. [14:31:01] !log akosiaris@deploy1002 helmfile [eqiad] START helmfile.d/admin 'apply'. [14:31:03] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:31:12] !log akosiaris@deploy1002 helmfile [eqiad] DONE helmfile.d/admin 'apply'. [14:31:14] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:32:55] (NodeTextfileStale) firing: (3) Stale textfile for elastic1075:9100 - https://wikitech.wikimedia.org/wiki/Prometheus#Stale_file_for_node-exporter_textfile - https://grafana.wikimedia.org/d/knkl4dCWz/node-exporter-textfile - https://alerts.wikimedia.org/?q=alertname%3DNodeTextfileStale [14:33:25] !log akosiaris@deploy1002 helmfile [codfw] START helmfile.d/admin 'apply'. [14:33:27] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:34:14] !log akosiaris@deploy1002 helmfile [codfw] DONE helmfile.d/admin 'apply'. 
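Note on the 14:25-14:29 cache discussion above: the behaviour described there (a cache hit returns whatever is stored, a miss falls back to the configured default player, and the cache key varies only when the beta option is active) can be sketched as a toy model. This is only an illustration under stated assumptions, not the TimedMediaHandler or MediaWiki parser cache code; the function names, the key format, and the "legacy-player" label are all hypothetical.

```python
# Toy model only: every name here is hypothetical, not a MediaWiki API.
from typing import Dict


def cache_key(page: str, beta_enabled: bool) -> str:
    """Vary the cache key only for users who enabled the beta option."""
    return f"{page}|beta" if beta_enabled else page


def get_rendered(cache: Dict[str, str], page: str,
                 beta_enabled: bool, default_player: str) -> str:
    """Serve the stored entry on a hit; render with the applicable player on a miss."""
    key = cache_key(page, beta_enabled)
    if key in cache:
        # Cache hit: return whatever was stored, even if the config changed since.
        return cache[key]
    # Cache miss: beta users always get videojs, everyone else the configured default.
    player = "videojs" if beta_enabled else default_player
    cache[key] = f"{page} rendered with {player}"
    return cache[key]


cache: Dict[str, str] = {}
# Before the config change the default is the (hypothetically named) old player.
before = get_rendered(cache, "File:Example.webm", False, "legacy-player")
# After the change, the old entry is still served until it expires.
hit_after = get_rendered(cache, "File:Example.webm", False, "videojs")
# A page that was not cached yet picks up the new default.
miss_after = get_rendered(cache, "File:Other.webm", False, "videojs")
# Beta users have their own key, so they get videojs regardless of the default.
beta = get_rendered(cache, "File:Example.webm", True, "legacy-player")
```

In this model the default change causes no burst of re-rendering: existing entries keep being served and only expire at their normal pace, which matches the "So where's the stampede?" reasoning in the conversation.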
[14:34:16] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:36:39] (03PS1) 10Majavah: add developer.wikimedia.org alias [dns] - 10https://gerrit.wikimedia.org/r/783849 (https://phabricator.wikimedia.org/T297140) [14:39:27] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P25101 and previous config saved to /var/cache/conftool/dbconfig/20220418-143927-ladsgroup.json [14:39:30] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:39:32] (03PS4) 10Majavah: Add developer-portal chart [deployment-charts] - 10https://gerrit.wikimedia.org/r/773994 (https://phabricator.wikimedia.org/T297140) [14:39:34] (03PS4) 10Majavah: helmfile.d: add developer-portal [deployment-charts] - 10https://gerrit.wikimedia.org/r/773995 (https://phabricator.wikimedia.org/T297140) [14:42:25] (03PS2) 10Majavah: add developer.wikimedia.org alias [dns] - 10https://gerrit.wikimedia.org/r/783849 (https://phabricator.wikimedia.org/T287748) [14:43:29] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P25102 and previous config saved to /var/cache/conftool/dbconfig/20220418-144329-ladsgroup.json [14:43:32] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:54:32] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T298565)', diff saved to https://phabricator.wikimedia.org/P25103 and previous config saved to /var/cache/conftool/dbconfig/20220418-145432-ladsgroup.json [14:54:34] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance [14:54:35] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance [14:54:36] Logged the message at 
https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:54:37] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [14:54:39] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:54:40] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1096:3316 (T298565)', diff saved to https://phabricator.wikimedia.org/P25104 and previous config saved to /var/cache/conftool/dbconfig/20220418-145440-ladsgroup.json [14:54:42] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:54:45] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:56:45] (03CR) 10Ebernhardson: [C: 03+1] Elastic: use major version only for 'config_version' [puppet] - 10https://gerrit.wikimedia.org/r/783848 (https://phabricator.wikimedia.org/T301959) (owner: 10Bking) [14:58:34] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1121 (T298565)', diff saved to https://phabricator.wikimedia.org/P25105 and previous config saved to /var/cache/conftool/dbconfig/20220418-145834-ladsgroup.json [14:58:36] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance [14:58:37] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance [14:58:38] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:58:41] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:58:42] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1141 (T298565)', diff saved to https://phabricator.wikimedia.org/P25106 and previous config saved to 
/var/cache/conftool/dbconfig/20220418-145842-ladsgroup.json [14:58:43] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:58:46] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:01:55] (NodeTextfileStale) firing: Stale textfile for ms-be2067:9100 - https://wikitech.wikimedia.org/wiki/Prometheus#Stale_file_for_node-exporter_textfile - https://grafana.wikimedia.org/d/knkl4dCWz/node-exporter-textfile - https://alerts.wikimedia.org/?q=alertname%3DNodeTextfileStale [15:02:59] (03CR) 10Bking: [V: 03+1 C: 03+2] Elastic: use major version only for 'config_version' [puppet] - 10https://gerrit.wikimedia.org/r/783848 (https://phabricator.wikimedia.org/T301959) (owner: 10Bking) [15:09:23] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1141 (T298565)', diff saved to https://phabricator.wikimedia.org/P25107 and previous config saved to /var/cache/conftool/dbconfig/20220418-150923-ladsgroup.json [15:09:26] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:09:27] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [15:09:51] (03PS2) 10MusikAnimal: Add WikiEditor's Realtime Preview to BetaFeatures [mediawiki-config] - 10https://gerrit.wikimedia.org/r/781096 (https://phabricator.wikimedia.org/T304596) [15:12:26] (03CR) 10Jforrester: [C: 03+1] Add WikiEditor's Realtime Preview to BetaFeatures [mediawiki-config] - 10https://gerrit.wikimedia.org/r/781096 (https://phabricator.wikimedia.org/T304596) (owner: 10MusikAnimal) [15:14:00] 10SRE, 10ops-codfw: codfw: Dedicate Rack B1 for cloudX-dev servers - https://phabricator.wikimedia.org/T305469 (10Papaul) [15:15:47] 10SRE, 10ops-codfw: codfw: Dedicate Rack B1 for cloudX-dev servers - 
https://phabricator.wikimedia.org/T305469 (10Papaul) [15:16:40] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 (T298565)', diff saved to https://phabricator.wikimedia.org/P25108 and previous config saved to /var/cache/conftool/dbconfig/20220418-151639-ladsgroup.json [15:16:43] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:16:45] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [15:19:58] PROBLEM - SSH on wtp1038.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:22:36] PROBLEM - Host cloudcephosd2002-dev.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [15:24:29] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P25109 and previous config saved to /var/cache/conftool/dbconfig/20220418-152428-ladsgroup.json [15:24:31] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:25:08] !log andrew@cumin1001 START - Cookbook sre.hosts.reimage for host cloudvirt1020.eqiad.wmnet with OS bullseye [15:25:10] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:30:04] jan_drewniak: How many deployers does it take to do Wikimedia Portals Update deploy? (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20220418T1530). 
[15:30:40] RECOVERY - Host cloudcephosd2002-dev.mgmt is UP: PING OK - Packet loss = 0%, RTA = 34.61 ms [15:31:45] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P25110 and previous config saved to /var/cache/conftool/dbconfig/20220418-153144-ladsgroup.json [15:31:47] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:32:09] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1175 (T298565)', diff saved to https://phabricator.wikimedia.org/P25111 and previous config saved to /var/cache/conftool/dbconfig/20220418-153209-ladsgroup.json [15:32:12] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:32:13] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [15:37:46] !log andrew@cumin1001 START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1020.eqiad.wmnet with reason: host reimage [15:37:48] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:39:34] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P25112 and previous config saved to /var/cache/conftool/dbconfig/20220418-153933-ladsgroup.json [15:39:36] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:40:41] !log andrew@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1020.eqiad.wmnet with reason: host reimage [15:40:43] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:41:21] PROBLEM - Host cloudvirt2002-dev.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [15:41:58] (03CR) 10Krinkle: Enable 
$wgFixDoubleRedirects on officewiki (031 comment) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/780636 (https://phabricator.wikimedia.org/T305782) (owner: 10MarcoAurelio) [15:44:28] (03CR) 10Abijeet Patro: [C: 03+1] "recheck" [extensions/Translate] (wmf/1.39.0-wmf.7) - 10https://gerrit.wikimedia.org/r/780641 (https://phabricator.wikimedia.org/T306233) (owner: 10KartikMistry) [15:46:27] RECOVERY - Host cloudvirt2002-dev.mgmt is UP: PING OK - Packet loss = 0%, RTA = 35.60 ms [15:46:50] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P25113 and previous config saved to /var/cache/conftool/dbconfig/20220418-154650-ladsgroup.json [15:46:52] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:47:14] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P25114 and previous config saved to /var/cache/conftool/dbconfig/20220418-154714-ladsgroup.json [15:47:17] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:54:39] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1141 (T298565)', diff saved to https://phabricator.wikimedia.org/P25115 and previous config saved to /var/cache/conftool/dbconfig/20220418-155438-ladsgroup.json [15:54:40] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance [15:54:42] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance [15:54:42] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:54:43] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, 
user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [15:54:45] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:54:47] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1142 (T298565)', diff saved to https://phabricator.wikimedia.org/P25116 and previous config saved to /var/cache/conftool/dbconfig/20220418-155446-ladsgroup.json [15:54:48] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:54:51] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:01:55] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 (T298565)', diff saved to https://phabricator.wikimedia.org/P25118 and previous config saved to /var/cache/conftool/dbconfig/20220418-160155-ladsgroup.json [16:01:57] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance [16:01:58] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance [16:01:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:02:00] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [16:02:02] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:02:03] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1131 (T298565)', diff saved to https://phabricator.wikimedia.org/P25119 and previous config saved to /var/cache/conftool/dbconfig/20220418-160203-ladsgroup.json [16:02:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:02:07] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log 
[16:02:19] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P25120 and previous config saved to /var/cache/conftool/dbconfig/20220418-160219-ladsgroup.json [16:02:21] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:03:10] 10SRE, 10ops-codfw: codfw: Dedicate Rack B1 for cloudX-dev servers - https://phabricator.wikimedia.org/T305469 (10Papaul) [16:05:30] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1142 (T298565)', diff saved to https://phabricator.wikimedia.org/P25121 and previous config saved to /var/cache/conftool/dbconfig/20220418-160529-ladsgroup.json [16:05:33] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:06:25] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1131 (T298565)', diff saved to https://phabricator.wikimedia.org/P25122 and previous config saved to /var/cache/conftool/dbconfig/20220418-160624-ladsgroup.json [16:06:28] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:11:36] (03PS1) 10Andrew Bogott: raid::hpsa: install ssacli on Bullseye [puppet] - 10https://gerrit.wikimedia.org/r/783861 (https://phabricator.wikimedia.org/T306354) [16:13:26] (03CR) 10Andrew Bogott: [C: 03+2] raid::hpsa: install ssacli on Bullseye [puppet] - 10https://gerrit.wikimedia.org/r/783861 (https://phabricator.wikimedia.org/T306354) (owner: 10Andrew Bogott) [16:15:42] (03PS1) 10Andrew Bogott: raid::hpsa: fix distro logic that was exactly backwards [puppet] - 10https://gerrit.wikimedia.org/r/783862 (https://phabricator.wikimedia.org/T306354) [16:16:44] (03CR) 10Andrew Bogott: [C: 03+2] raid::hpsa: fix distro logic that was exactly backwards [puppet] - 10https://gerrit.wikimedia.org/r/783862 (https://phabricator.wikimedia.org/T306354) (owner: 10Andrew Bogott) [16:17:24] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after 
maintenance db1175 (T298565)', diff saved to https://phabricator.wikimedia.org/P25123 and previous config saved to /var/cache/conftool/dbconfig/20220418-161724-ladsgroup.json [16:17:26] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance [16:17:27] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance [16:17:28] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:17:29] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [16:17:31] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:17:32] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1166 (T298565)', diff saved to https://phabricator.wikimedia.org/P25124 and previous config saved to /var/cache/conftool/dbconfig/20220418-161732-ladsgroup.json [16:17:34] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:17:38] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:19:08] 10SRE, 10ops-codfw: codfw: Dedicate Rack B1 for cloudX-dev servers - https://phabricator.wikimedia.org/T305469 (10Papaul) [16:20:35] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P25125 and previous config saved to /var/cache/conftool/dbconfig/20220418-162034-ladsgroup.json [16:20:38] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:21:11] PROBLEM - SSH on mw2258.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [16:21:30] !log 
ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P25126 and previous config saved to /var/cache/conftool/dbconfig/20220418-162129-ladsgroup.json [16:21:32] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:26:03] !log andrew@cumin1001 END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1020.eqiad.wmnet with OS bullseye [16:26:05] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:29:36] PROBLEM - Host cloudcephosd2003-dev.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [16:33:14] (03PS1) 10Andrew Bogott: raid::hpsa: symlink hpssacli to ssacli [puppet] - 10https://gerrit.wikimedia.org/r/783866 (https://phabricator.wikimedia.org/T306354) [16:33:52] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1166 (T298565)', diff saved to https://phabricator.wikimedia.org/P25127 and previous config saved to /var/cache/conftool/dbconfig/20220418-163351-ladsgroup.json [16:33:55] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:33:57] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [16:35:08] RECOVERY - Host cloudcephosd2003-dev.mgmt is UP: PING OK - Packet loss = 0%, RTA = 33.76 ms [16:35:40] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P25128 and previous config saved to /var/cache/conftool/dbconfig/20220418-163539-ladsgroup.json [16:35:42] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:36:35] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to 
https://phabricator.wikimedia.org/P25129 and previous config saved to /var/cache/conftool/dbconfig/20220418-163634-ladsgroup.json [16:36:37] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:37:13] (KubernetesRsyslogDown) firing: rsyslog on kubernetes1018:9105 is missing kubernetes logs - https://wikitech.wikimedia.org/wiki/Kubernetes/Logging#Common_issues - https://alerts.wikimedia.org/?q=alertname%3DKubernetesRsyslogDown [16:40:03] I'm going to run a schema change on master of all wikis with flaggedrevs enabled, there might be a bit of read-only on ruwiki [16:40:39] done [16:48:57] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P25130 and previous config saved to /var/cache/conftool/dbconfig/20220418-164856-ladsgroup.json [16:49:00] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:50:45] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1142 (T298565)', diff saved to https://phabricator.wikimedia.org/P25131 and previous config saved to /var/cache/conftool/dbconfig/20220418-165044-ladsgroup.json [16:50:46] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance [16:50:48] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance [16:50:49] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:50:50] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [16:50:51] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:50:53] !log ladsgroup@cumin1001 dbctl commit 
(dc=all): 'Depooling db1143 (T298565)', diff saved to https://phabricator.wikimedia.org/P25132 and previous config saved to /var/cache/conftool/dbconfig/20220418-165053-ladsgroup.json [16:50:54] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:50:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:51:40] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1131 (T298565)', diff saved to https://phabricator.wikimedia.org/P25133 and previous config saved to /var/cache/conftool/dbconfig/20220418-165139-ladsgroup.json [16:51:42] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance [16:51:43] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance [16:51:43] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:51:46] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:51:48] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:00:04] ryankemper: I, the Bot under the Fountain, call upon thee, The Deployer, to do Wikidata Query Service weekly deploy deploy. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20220418T1700). 
[17:01:25] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance [17:01:27] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance [17:01:28] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:01:28] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance [17:01:31] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:01:33] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:01:34] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance [17:01:36] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:01:42] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1143 (T298565)', diff saved to https://phabricator.wikimedia.org/P25134 and previous config saved to /var/cache/conftool/dbconfig/20220418-170141-ladsgroup.json [17:01:45] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:01:46] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [17:04:02] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P25135 and previous config saved to /var/cache/conftool/dbconfig/20220418-170401-ladsgroup.json [17:04:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:11:12] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance [17:11:14] 
!log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance [17:11:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:11:17] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:13:45] (JobUnavailable) resolved: (2) Reduced availability for job cloud_dev_pdns in ops@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable [17:16:47] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P25136 and previous config saved to /var/cache/conftool/dbconfig/20220418-171646-ladsgroup.json [17:16:49] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:19:07] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1166 (T298565)', diff saved to https://phabricator.wikimedia.org/P25137 and previous config saved to /var/cache/conftool/dbconfig/20220418-171906-ladsgroup.json [17:19:08] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance [17:19:10] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance [17:19:11] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:19:12] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [17:19:14] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:19:15] !log ladsgroup@cumin1001 dbctl commit 
(dc=all): 'Depooling db1157 (T298565)', diff saved to https://phabricator.wikimedia.org/P25138 and previous config saved to /var/cache/conftool/dbconfig/20220418-171914-ladsgroup.json [17:19:16] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:19:19] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:20:55] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance [17:20:57] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance [17:20:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:21:00] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:21:02] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1180 (T298565)', diff saved to https://phabricator.wikimedia.org/P25139 and previous config saved to /var/cache/conftool/dbconfig/20220418-172101-ladsgroup.json [17:21:05] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:21:59] PROBLEM - Maps HTTPS on maps1008 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Maps/RunBook [17:22:00] PROBLEM - Maps HTTPS on maps1005 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Maps/RunBook [17:22:35] PROBLEM - Maps HTTPS on maps1006 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Maps/RunBook [17:22:53] PROBLEM - PyBal backends health check on lvs1020 is CRITICAL: PYBAL CRITICAL - CRITICAL - tegola-vector-tiles_4105: Servers kubernetes1022.eqiad.wmnet, kubernetes1019.eqiad.wmnet, kubernetes1009.eqiad.wmnet, kubernetes1020.eqiad.wmnet, kubernetes1005.eqiad.wmnet, kubernetes1006.eqiad.wmnet, kubernetes1017.eqiad.wmnet, kubernetes1015.eqiad.wmnet, kubernetes1016.eqiad.wmnet are marked down but pooled 
https://wikitech.wikimedia.org/wiki/PyBa [17:22:53] PROBLEM - LVS kartotherian-ssl eqiad port 443/tcp - Kartotherian- kartotherian.svc.eqiad.wmnet - HTTPS IPv4 on kartotherian.svc.eqiad.wmnet is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/LVS%23Diagnosing_problems [17:23:28] PROBLEM - PyBal backends health check on lvs1019 is CRITICAL: PYBAL CRITICAL - CRITICAL - tegola-vector-tiles_4105: Servers kubernetes1012.eqiad.wmnet, kubernetes1020.eqiad.wmnet, kubernetes1010.eqiad.wmnet, kubernetes1007.eqiad.wmnet, kubernetes1009.eqiad.wmnet, kubernetes1018.eqiad.wmnet, kubernetes1021.eqiad.wmnet, kubernetes1014.eqiad.wmnet, kubernetes1006.eqiad.wmnet are marked down but pooled https://wikitech.wikimedia.org/wiki/PyBa [17:26:27] PROBLEM - LVS kartotherian eqiad port 6533/tcp - Kartotherian- kartotherian.svc.eqiad.wmnet IPv4 on kartotherian.svc.eqiad.wmnet is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/LVS%23Diagnosing_problems [17:30:57] PROBLEM - Maps HTTPS on maps1007 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Maps/RunBook [17:31:52] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P25140 and previous config saved to /var/cache/conftool/dbconfig/20220418-173151-ladsgroup.json [17:31:54] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:43:42] (CR) Tacsipacsi: Add WikiEditor's Realtime Preview to BetaFeatures (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/781096 (https://phabricator.wikimedia.org/T304596) (owner: MusikAnimal) [17:45:13] PROBLEM - Host cloudcephmon2002-dev.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [17:45:53] PROBLEM - Host cloudnet2002-dev.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [17:46:29] PROBLEM - LVS tegola-vector-tiles eqiad port 4105/tcp - Tegola Vector Tiles- 
tegola-vector-tiles.svc.eqiad.wmnet IPv4 on tegola-vector-tiles.svc.eqiad.wmnet is CRITICAL: connect to address 10.2.2.60 and port 4105: Connection refused https://wikitech.wikimedia.org/wiki/LVS%23Diagnosing_problems [17:46:57] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1143 (T298565)', diff saved to https://phabricator.wikimedia.org/P25141 and previous config saved to /var/cache/conftool/dbconfig/20220418-174656-ladsgroup.json [17:46:59] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance [17:47:00] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance [17:47:01] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:47:03] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [17:47:05] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1144:3314 (T298565)', diff saved to https://phabricator.wikimedia.org/P25142 and previous config saved to /var/cache/conftool/dbconfig/20220418-174704-ladsgroup.json [17:47:05] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:47:08] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:47:11] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:47:55] (NodeTextfileStale) firing: Stale textfile for cloudcontrol2001-dev:9100 - https://wikitech.wikimedia.org/wiki/Prometheus#Stale_file_for_node-exporter_textfile - https://grafana.wikimedia.org/d/knkl4dCWz/node-exporter-textfile - https://alerts.wikimedia.org/?q=alertname%3DNodeTextfileStale [17:48:53] PROBLEM - Maps HTTPS on maps1010 is CRITICAL: 
CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Maps/RunBook [17:51:23] RECOVERY - Host cloudcephmon2002-dev.mgmt is UP: PING OK - Packet loss = 0%, RTA = 33.54 ms [17:54:31] (CR) Tacsipacsi: "Thanks for the cherry pick! When do you want to deploy it? The last backport window before the branch cut (https://wikitech.wikimedia.org/" [extensions/Translate] (wmf/1.39.0-wmf.7) - https://gerrit.wikimedia.org/r/780641 (https://phabricator.wikimedia.org/T306233) (owner: KartikMistry) [17:58:02] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T298565)', diff saved to https://phabricator.wikimedia.org/P25143 and previous config saved to /var/cache/conftool/dbconfig/20220418-175802-ladsgroup.json [17:58:06] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:58:07] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [17:58:31] RECOVERY - Host cloudnet2002-dev.mgmt is UP: PING OK - Packet loss = 0%, RTA = 33.46 ms [18:02:14] SRE, ops-codfw: codfw: Dedicate Rack B1 for cloudX-dev servers - https://phabricator.wikimedia.org/T305469 (Papaul) [18:09:34] SRE, ops-eqiad, DC-Ops, serviceops, GitLab (Infrastructure): Q3:(Need By: TBD) rack/setup/install gitlab100[3|4] and gitlab-runner100[2|3|4] - https://phabricator.wikimedia.org/T301177 (Jclark-ctr) Name Rack U Port Cableid gitlab1003... 
[18:10:45] PROBLEM - Host cloudgw2002-dev.mgmt is DOWN: PING CRITICAL - Packet loss = 100% [18:13:07] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P25144 and previous config saved to /var/cache/conftool/dbconfig/20220418-181307-ladsgroup.json [18:13:10] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:16:29] SRE, ops-eqiad, DC-Ops, serviceops, GitLab (Infrastructure): Q3:(Need By: TBD) rack/setup/install gitlab100[3|4] and gitlab-runner100[2|3|4] - https://phabricator.wikimedia.org/T301177 (Jclark-ctr) [18:16:59] RECOVERY - Host cloudgw2002-dev.mgmt is UP: PING OK - Packet loss = 0%, RTA = 33.73 ms [18:18:49] SRE, ops-codfw: codfw: Dedicate Rack B1 for cloudX-dev servers - https://phabricator.wikimedia.org/T305469 (Papaul) [18:19:30] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1157 (T298565)', diff saved to https://phabricator.wikimedia.org/P25145 and previous config saved to /var/cache/conftool/dbconfig/20220418-181929-ladsgroup.json [18:19:33] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:19:34] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [18:20:09] SRE, ops-eqiad, DC-Ops, serviceops, GitLab (Infrastructure): Q3:(Need By: TBD) rack/setup/install gitlab100[3|4] and gitlab-runner100[2|3|4] - https://phabricator.wikimedia.org/T301177 (Jclark-ctr) a:Jclark-ctr→Cmjohnson [18:21:17] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1180 (T298565)', diff saved to https://phabricator.wikimedia.org/P25146 and previous config saved to 
/var/cache/conftool/dbconfig/20220418-182116-ladsgroup.json [18:21:20] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:22:24] SRE, ops-codfw: codfw: Dedicate Rack B1 for cloudX-dev servers - https://phabricator.wikimedia.org/T305469 (Papaul) [18:22:44] (KubernetesRsyslogDown) firing: (4) rsyslog on ml-staging-ctrl2001:9105 is missing kubernetes logs - https://wikitech.wikimedia.org/wiki/Kubernetes/Logging#Common_issues - https://alerts.wikimedia.org/?q=alertname%3DKubernetesRsyslogDown [18:28:12] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P25147 and previous config saved to /var/cache/conftool/dbconfig/20220418-182812-ladsgroup.json [18:28:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:30:09] RECOVERY - SSH on mw2258.mgmt is OK: SSH OK - OpenSSH_7.0 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [18:30:50] SRE, ops-eqiad, Infrastructure-Foundations, netops: 2M 25G DAC testing - https://phabricator.wikimedia.org/T306220 (Jclark-ctr) @cmooney Relocated xe-0/0/1 to port xe-0/0/2 [18:32:55] (NodeTextfileStale) firing: (3) Stale textfile for elastic1075:9100 - https://wikitech.wikimedia.org/wiki/Prometheus#Stale_file_for_node-exporter_textfile - https://grafana.wikimedia.org/d/knkl4dCWz/node-exporter-textfile - https://alerts.wikimedia.org/?q=alertname%3DNodeTextfileStale [18:34:35] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P25148 and previous config saved to /var/cache/conftool/dbconfig/20220418-183434-ladsgroup.json [18:34:37] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:36:22] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P25149 
and previous config saved to /var/cache/conftool/dbconfig/20220418-183621-ladsgroup.json [18:36:24] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:43:17] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T298565)', diff saved to https://phabricator.wikimedia.org/P25150 and previous config saved to /var/cache/conftool/dbconfig/20220418-184317-ladsgroup.json [18:43:19] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance [18:43:20] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance [18:43:21] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:43:22] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [18:43:24] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:43:25] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1147 (T298565)', diff saved to https://phabricator.wikimedia.org/P25151 and previous config saved to /var/cache/conftool/dbconfig/20220418-184325-ladsgroup.json [18:43:26] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:43:30] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:46:24] SRE, SRE-tools, Infrastructure-Foundations, tox-wikimedia, and 2 others: Introduce Python code formatters usage - https://phabricator.wikimedia.org/T211750 (jhathaway) thanks @Volans for the additional detail and I am happy to see that folks have been persistently chipping away at some of these b... 
[18:49:40] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P25152 and previous config saved to /var/cache/conftool/dbconfig/20220418-184939-ladsgroup.json [18:49:42] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:51:16] SRE, ops-eqiad, DC-Ops, serviceops, GitLab (Infrastructure): Q3:(Need By: TBD) rack/setup/install gitlab100[3|4] and gitlab-runner100[2|3|4] - https://phabricator.wikimedia.org/T301177 (Dzahn) It's sufficient if you put the "insetup" role on this and hand it over to us. Let us apply the actua... [18:51:27] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P25153 and previous config saved to /var/cache/conftool/dbconfig/20220418-185126-ladsgroup.json [18:51:29] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:00:34] (CR) Dzahn: [C: +2] role::mediawiki::maintenance: remove reference to cron [puppet] - https://gerrit.wikimedia.org/r/783410 (owner: Zabe) [19:01:55] (NodeTextfileStale) firing: Stale textfile for ms-be2067:9100 - https://wikitech.wikimedia.org/wiki/Prometheus#Stale_file_for_node-exporter_textfile - https://grafana.wikimedia.org/d/knkl4dCWz/node-exporter-textfile - https://alerts.wikimedia.org/?q=alertname%3DNodeTextfileStale [19:03:23] SRE, Horizon, wikitech.wikimedia.org, GitLab (Administration, Settings & Policy), Security: Take some pointers from GitHub security updates - https://phabricator.wikimedia.org/T304231 (brennen) [19:04:44] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1157 (T298565)', diff saved to https://phabricator.wikimedia.org/P25154 and previous config saved to /var/cache/conftool/dbconfig/20220418-190444-ladsgroup.json [19:04:46] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1179.eqiad.wmnet 
with reason: Maintenance [19:04:47] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance [19:04:48] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:04:49] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [19:04:51] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:04:52] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1179 (T298565)', diff saved to https://phabricator.wikimedia.org/P25155 and previous config saved to /var/cache/conftool/dbconfig/20220418-190452-ladsgroup.json [19:04:54] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:04:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:06:32] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1180 (T298565)', diff saved to https://phabricator.wikimedia.org/P25156 and previous config saved to /var/cache/conftool/dbconfig/20220418-190632-ladsgroup.json [19:06:34] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance [19:06:35] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance [19:06:35] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:06:38] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:06:40] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1098:3316 (T298565)', diff saved to https://phabricator.wikimedia.org/P25157 and previous config saved to 
/var/cache/conftool/dbconfig/20220418-190640-ladsgroup.json [19:06:40] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:06:44] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:18:49] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T298565)', diff saved to https://phabricator.wikimedia.org/P25158 and previous config saved to /var/cache/conftool/dbconfig/20220418-191849-ladsgroup.json [19:18:53] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:18:54] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [19:24:16] SRE, SRE-Access-Requests, Data-Engineering, Generated Data Platform: Request to grant cparle and mfossati login to an-airflow1003.eqiad.wmne - https://phabricator.wikimedia.org/T306057 (Dzahn) @ottomata We can use the existing group `airflow-search-admins`. It will give them access to machines wi... 
[19:27:01] RECOVERY - SSH on wtp1038.mgmt is OK: SSH OK - OpenSSH_7.0 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [19:32:30] !log razzi@cumin1001 START - Cookbook sre.hosts.reimage for host clouddb1021.eqiad.wmnet with OS buster [19:32:32] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:33:34] (PS1) Sharvaniharan: Stream configs for newly migrated android schemas [mediawiki-config] - https://gerrit.wikimedia.org/r/783874 [19:33:54] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P25159 and previous config saved to /var/cache/conftool/dbconfig/20220418-193354-ladsgroup.json [19:33:57] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:34:52] (PS2) Sharvaniharan: Stream configs for newly migrated android schemas [mediawiki-config] - https://gerrit.wikimedia.org/r/783874 (https://phabricator.wikimedia.org/T306385) [19:43:16] !log razzi@cumin1001 START - Cookbook sre.hosts.downtime for 2:00:00 on clouddb1021.eqiad.wmnet with reason: host reimage [19:43:18] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:43:41] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1147 (T298565)', diff saved to https://phabricator.wikimedia.org/P25160 and previous config saved to /var/cache/conftool/dbconfig/20220418-194340-ladsgroup.json [19:43:44] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:43:47] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [19:46:42] !log razzi@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 
2:00:00 on clouddb1021.eqiad.wmnet with reason: host reimage [19:46:45] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:48:59] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P25161 and previous config saved to /var/cache/conftool/dbconfig/20220418-194859-ladsgroup.json [19:49:02] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:58:46] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P25162 and previous config saved to /var/cache/conftool/dbconfig/20220418-195845-ladsgroup.json [19:58:48] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:00:05] RoanKattouw and Urbanecm: Dear deployers, time to do the UTC late backport window deploy. Dont look at me like that. You signed up for it. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20220418T2000). [20:00:05] musikanimal: A patch you scheduled for UTC late backport window is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker. [20:00:21] I'm here! [20:00:23] i can deploy today [20:00:26] hello musikanimal ! [20:00:29] o/ [20:02:04] !log razzi@cumin1001 END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host clouddb1021.eqiad.wmnet with OS buster [20:02:07] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:02:49] hmm, I feel like the approval comment in IS.php is out of date. but, a +1 from one of the people listed is on the file, so let's deploy it. [20:03:30] (CR) Urbanecm: [C: +2] "Comment requires approval from James F. / Greg G. James +1'ed, so I assume this is fine. Deploying." 
[mediawiki-config] - https://gerrit.wikimedia.org/r/781096 (https://phabricator.wikimedia.org/T304596) (owner: MusikAnimal) [20:04:04] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T298565)', diff saved to https://phabricator.wikimedia.org/P25163 and previous config saved to /var/cache/conftool/dbconfig/20220418-200404-ladsgroup.json [20:04:08] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:04:09] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [20:04:11] (Merged) jenkins-bot: Add WikiEditor's Realtime Preview to BetaFeatures [mediawiki-config] - https://gerrit.wikimedia.org/r/781096 (https://phabricator.wikimedia.org/T304596) (owner: MusikAnimal) [20:04:13] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance [20:04:14] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance [20:04:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:04:17] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:04:19] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1098:3316 (T298565)', diff saved to https://phabricator.wikimedia.org/P25164 and previous config saved to /var/cache/conftool/dbconfig/20220418-200418-ladsgroup.json [20:04:22] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:04:27] (CR) MusikAnimal: Add WikiEditor's Realtime Preview to BetaFeatures (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/781096 (https://phabricator.wikimedia.org/T304596) (owner: MusikAnimal) 
[20:04:59] musikanimal: pulled to mwdebug1001. can you test this behaves as intended? [20:05:06] doing! [20:05:06] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1179 (T298565)', diff saved to https://phabricator.wikimedia.org/P25165 and previous config saved to /var/cache/conftool/dbconfig/20220418-200506-ladsgroup.json [20:05:10] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:07:21] urbanecm: looks good! the beta feature correctly only appears on testwiki, and is also opt-in (in reference to Tacsipacsi's comment on the patch) [20:07:49] behaves the same for me :). syncing. [20:08:52] !log mwdebug-deploy@deploy1002 helmfile [eqiad] START helmfile.d/services/mwdebug: apply [20:08:54] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:08:55] !log mwdebug-deploy@deploy1002 helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [20:08:56] !log mwdebug-deploy@deploy1002 helmfile [codfw] START helmfile.d/services/mwdebug: apply [20:08:57] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:08:59] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:09:00] !log mwdebug-deploy@deploy1002 helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [20:09:02] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:09:07] !log urbanecm@deploy1002 Synchronized wmf-config/InitialiseSettings.php: 0efb2b2: Add WikiEditor Realtime Preview to BetaFeatures (T304596) (duration: 00m 51s) [20:09:11] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:09:12] T304596: Add Realtime Preview as Beta Feature - https://phabricator.wikimedia.org/T304596 [20:09:12] musikanimal: should be live. anything else? [20:09:28] that'll do it! thank you! 
[20:09:37] any time :) [20:10:18] !log UTC late backport window done [20:10:20] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:13:51] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P25166 and previous config saved to /var/cache/conftool/dbconfig/20220418-201350-ladsgroup.json [20:13:54] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:16:10] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T298565)', diff saved to https://phabricator.wikimedia.org/P25167 and previous config saved to /var/cache/conftool/dbconfig/20220418-201609-ladsgroup.json [20:16:13] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:16:14] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [20:18:51] (CR) MusikAnimal: Add WikiEditor's Realtime Preview to BetaFeatures (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/781096 (https://phabricator.wikimedia.org/T304596) (owner: MusikAnimal) [20:19:17] PROBLEM - SSH on wtp1037.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [20:20:11] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P25168 and previous config saved to /var/cache/conftool/dbconfig/20220418-202011-ladsgroup.json [20:20:14] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:25:56] SRE, SRE-Access-Requests: Requesting access to analytics-privatedata-users for Essex Igyan eigyan - 
https://phabricator.wikimedia.org/T305948 (Dzahn) Open→Stalled [20:26:43] SRE, SRE-Access-Requests: Requesting access to google console for TomekSikora.Monsoon - https://phabricator.wikimedia.org/T304502 (Dzahn) a:TomekSikora.Monsoon [20:27:04] SRE, SRE-Access-Requests: Requesting access to analytics-privatedata-users for Essex Igyan eigyan - https://phabricator.wikimedia.org/T305948 (Dzahn) a:mepps [20:28:56] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1147 (T298565)', diff saved to https://phabricator.wikimedia.org/P25169 and previous config saved to /var/cache/conftool/dbconfig/20220418-202855-ladsgroup.json [20:28:57] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance [20:28:59] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance [20:29:00] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:29:02] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [20:29:03] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:29:06] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:31:15] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P25170 and previous config saved to /var/cache/conftool/dbconfig/20220418-203114-ladsgroup.json [20:31:17] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:31:51] (PS3) Zabe: ci: remove absented gitcache crons [puppet] - https://gerrit.wikimedia.org/r/779041 
(https://phabricator.wikimedia.org/T273673) [20:35:17] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P25171 and previous config saved to /var/cache/conftool/dbconfig/20220418-203516-ladsgroup.json [20:35:19] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:37:13] (KubernetesRsyslogDown) firing: rsyslog on kubernetes1018:9105 is missing kubernetes logs - https://wikitech.wikimedia.org/wiki/Kubernetes/Logging#Common_issues - https://alerts.wikimedia.org/?q=alertname%3DKubernetesRsyslogDown [20:37:49] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance [20:37:50] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance [20:37:51] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:37:54] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:37:55] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1148 (T298565)', diff saved to https://phabricator.wikimedia.org/P25172 and previous config saved to /var/cache/conftool/dbconfig/20220418-203755-ladsgroup.json [20:37:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:37:59] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [20:43:03] (CR) Dzahn: [C: +2] ci: remove absented gitcache crons [puppet] - https://gerrit.wikimedia.org/r/779041 (https://phabricator.wikimedia.org/T273673) (owner: Zabe) [20:46:20] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to 
https://phabricator.wikimedia.org/P25173 and previous config saved to /var/cache/conftool/dbconfig/20220418-204619-ladsgroup.json [20:46:23] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:50:22] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1179 (T298565)', diff saved to https://phabricator.wikimedia.org/P25174 and previous config saved to /var/cache/conftool/dbconfig/20220418-205021-ladsgroup.json [20:50:23] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance [20:50:25] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance [20:50:25] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:50:26] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [20:50:28] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:50:31] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:00:04] Reedy, sbassett, Maryum, and manfredi: #bothumor When your hammer is PHP, everything starts looking like a thumb. Rise for Weekly Security deployment window. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20220418T2100). 
[21:00:48] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1148 (T298565)', diff saved to https://phabricator.wikimedia.org/P25175 and previous config saved to /var/cache/conftool/dbconfig/20220418-210047-ladsgroup.json [21:00:51] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:00:52] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [21:01:25] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T298565)', diff saved to https://phabricator.wikimedia.org/P25176 and previous config saved to /var/cache/conftool/dbconfig/20220418-210124-ladsgroup.json [21:01:26] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance [21:01:28] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance [21:01:29] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:01:32] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:01:34] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:10:58] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance [21:11:00] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance [21:11:00] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:11:01] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance [21:11:04] Logged the message at 
https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:11:07] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance [21:11:08] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:11:10] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:12:16] (PS7) Bking: elastic: Restart masters one at a time after all others [software/spicerack] - https://gerrit.wikimedia.org/r/781009 (https://phabricator.wikimedia.org/T306389) (owner: Ebernhardson) [21:15:53] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P25177 and previous config saved to /var/cache/conftool/dbconfig/20220418-211552-ladsgroup.json [21:15:56] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:16:27] !log mw2382 - iptables -Z INPUT 151 (zero'ing iptables rule for jobrunners, want to confirm for https://gerrit.wikimedia.org/r/c/operations/puppet/+//5/modules/profile/manifests/mediawiki/jobrunner.pp) [21:16:29] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:20:41] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance [21:20:43] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance [21:20:43] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:20:46] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:21:48] (CR) Dzahn: [C: -1] "I went to check this and on mw2382 looked at all iptables rules regarding dest port 9005. 
There were 2 rules that had counters > 0 that ha" [puppet] - https://gerrit.wikimedia.org/r/376024 (owner: Giuseppe Lavagetto) [21:30:31] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance [21:30:32] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance [21:30:33] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:30:36] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:30:37] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1131 (T298565)', diff saved to https://phabricator.wikimedia.org/P25178 and previous config saved to /var/cache/conftool/dbconfig/20220418-213037-ladsgroup.json [21:30:41] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:30:42] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [21:30:58] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P25179 and previous config saved to /var/cache/conftool/dbconfig/20220418-213057-ladsgroup.json [21:31:00] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:31:48] PROBLEM - puppet last run on contint1001 is CRITICAL: CRITICAL: Puppet has been disabled for 604828 seconds, message: reason not specified, last run 7 days ago with 0 failures https://wikitech.wikimedia.org/wiki/Monitoring/puppet_checkpuppetrun [21:34:59] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1131 (T298565)', diff saved to https://phabricator.wikimedia.org/P25180 and previous config saved to 
/var/cache/conftool/dbconfig/20220418-213459-ladsgroup.json [21:35:03] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:44:12] SRE, serviceops: Service Ops SRE support for iOS notifications update - https://phabricator.wikimedia.org/T306397 (RLazarus) [21:44:45] SRE, serviceops: Service Ops SRE support for iOS notifications update - https://phabricator.wikimedia.org/T306397 (RLazarus) p:Triage→Medium [21:45:07] SRE, serviceops: Service Ops SRE support for iOS notifications update - https://phabricator.wikimedia.org/T306397 (RLazarus) [21:46:03] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1148 (T298565)', diff saved to https://phabricator.wikimedia.org/P25181 and previous config saved to /var/cache/conftool/dbconfig/20220418-214602-ladsgroup.json [21:46:04] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance [21:46:06] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance [21:46:06] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:46:09] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [21:46:11] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:46:12] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1149 (T298565)', diff saved to https://phabricator.wikimedia.org/P25182 and previous config saved to /var/cache/conftool/dbconfig/20220418-214610-ladsgroup.json [21:46:14] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:46:17] Logged the message at 
https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:46:43] PROBLEM - MariaDB Replica Lag: s1 on clouddb1021 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 372058.09 seconds https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Depooling_a_replica [21:47:17] PROBLEM - MariaDB Replica Lag: s2 on clouddb1021 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 326304.48 seconds https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Depooling_a_replica [21:47:47] PROBLEM - MariaDB Replica Lag: s3 on clouddb1021 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 264677.37 seconds https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Depooling_a_replica [21:47:55] (NodeTextfileStale) firing: Stale textfile for cloudcontrol2001-dev:9100 - https://wikitech.wikimedia.org/wiki/Prometheus#Stale_file_for_node-exporter_textfile - https://grafana.wikimedia.org/d/knkl4dCWz/node-exporter-textfile - https://alerts.wikimedia.org/?q=alertname%3DNodeTextfileStale [21:48:09] PROBLEM - MariaDB Replica Lag: s4 on clouddb1021 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 337266.19 seconds https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Depooling_a_replica [21:48:21] PROBLEM - MariaDB Replica Lag: s5 on clouddb1021 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 408239.39 seconds https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Depooling_a_replica [21:48:37] PROBLEM - MariaDB Replica Lag: s6 on clouddb1021 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 223735.52 seconds https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Depooling_a_replica [21:48:47] PROBLEM - MariaDB Replica Lag: s7 on clouddb1021 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 345378.11 seconds https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Depooling_a_replica [21:48:51] PROBLEM - MariaDB Replica Lag: s8 on clouddb1021 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 406476.98 seconds 
https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Depooling_a_replica [21:50:05] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P25183 and previous config saved to /var/cache/conftool/dbconfig/20220418-215004-ladsgroup.json [21:50:07] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:54:25] SRE, serviceops: Service Ops SRE support for iOS notifications update - https://phabricator.wikimedia.org/T306397 (Dzahn) [21:56:09] SRE, Infrastructure-Foundations, LDAP-Access-Requests: Grant Access to ldap/wmf for Nathillard - https://phabricator.wikimedia.org/T305978 (Dzahn) Hi @NHillard-WMF let's rule out option 1 now that some time has passed. Do you still get the same "Service access denied due to missing privileges." messa... [21:57:02] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1149 (T298565)', diff saved to https://phabricator.wikimedia.org/P25184 and previous config saved to /var/cache/conftool/dbconfig/20220418-215701-ladsgroup.json [21:57:05] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:57:06] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [22:03:59] PROBLEM - SSH on aqs1008.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [22:05:09] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P25185 and previous config saved to /var/cache/conftool/dbconfig/20220418-220509-ladsgroup.json [22:05:12] Logged the message at 
https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:12:07] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P25186 and previous config saved to /var/cache/conftool/dbconfig/20220418-221206-ladsgroup.json [22:12:09] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:20:15] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1131 (T298565)', diff saved to https://phabricator.wikimedia.org/P25187 and previous config saved to /var/cache/conftool/dbconfig/20220418-222014-ladsgroup.json [22:20:16] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance [22:20:18] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance [22:20:18] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:20:20] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [22:20:22] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:20:23] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1096:3316 (T298565)', diff saved to https://phabricator.wikimedia.org/P25188 and previous config saved to /var/cache/conftool/dbconfig/20220418-222022-ladsgroup.json [22:20:25] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:20:29] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:21:39] RECOVERY - SSH on wtp1037.mgmt is OK: SSH OK - OpenSSH_7.0 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [22:22:44] 
(KubernetesRsyslogDown) firing: (4) rsyslog on ml-staging-ctrl2001:9105 is missing kubernetes logs - https://wikitech.wikimedia.org/wiki/Kubernetes/Logging#Common_issues - https://alerts.wikimedia.org/?q=alertname%3DKubernetesRsyslogDown [22:23:02] !log contint1001 - re-enabling puppet that was disabled a week ago. to prevent more issues when it falls out of puppet DB, hopefully there wasn't a hard reason for this [22:23:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:25:05] PROBLEM - BGP status on cr2-eqiad is CRITICAL: BGP CRITICAL - AS64605/IPv6: Active - Anycast https://wikitech.wikimedia.org/wiki/Network_monitoring%23BGP_status [22:26:32] (PS2) Zabe: scap: remove two absented files [puppet] - https://gerrit.wikimedia.org/r/783461 [22:27:12] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P25189 and previous config saved to /var/cache/conftool/dbconfig/20220418-222712-ladsgroup.json [22:27:14] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:28:03] RECOVERY - puppet last run on contint1001 is OK: OK: Puppet is currently enabled, last run 4 minutes ago with 0 failures https://wikitech.wikimedia.org/wiki/Monitoring/puppet_checkpuppetrun [22:32:27] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 (T298565)', diff saved to https://phabricator.wikimedia.org/P25190 and previous config saved to /var/cache/conftool/dbconfig/20220418-223227-ladsgroup.json [22:32:31] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:32:32] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [22:32:55] (NodeTextfileStale) 
firing: (3) Stale textfile for elastic1075:9100 - https://wikitech.wikimedia.org/wiki/Prometheus#Stale_file_for_node-exporter_textfile - https://grafana.wikimedia.org/d/knkl4dCWz/node-exporter-textfile - https://alerts.wikimedia.org/?q=alertname%3DNodeTextfileStale [22:42:17] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1149 (T298565)', diff saved to https://phabricator.wikimedia.org/P25191 and previous config saved to /var/cache/conftool/dbconfig/20220418-224217-ladsgroup.json [22:42:19] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance [22:42:20] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance [22:42:21] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:42:22] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [22:42:24] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:42:25] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1146:3314 (T298565)', diff saved to https://phabricator.wikimedia.org/P25192 and previous config saved to /var/cache/conftool/dbconfig/20220418-224225-ladsgroup.json [22:42:27] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:42:31] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:47:32] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P25193 and previous config saved to /var/cache/conftool/dbconfig/20220418-224732-ladsgroup.json [22:47:35] Logged the message at 
https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:53:32] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T298565)', diff saved to https://phabricator.wikimedia.org/P25194 and previous config saved to /var/cache/conftool/dbconfig/20220418-225331-ladsgroup.json [22:53:35] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:53:37] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [22:58:51] PROBLEM - BGP status on cr2-eqiad is CRITICAL: BGP CRITICAL - AS64605/IPv6: Active - Anycast https://wikitech.wikimedia.org/wiki/Network_monitoring%23BGP_status [23:01:55] (NodeTextfileStale) firing: Stale textfile for ms-be2067:9100 - https://wikitech.wikimedia.org/wiki/Prometheus#Stale_file_for_node-exporter_textfile - https://grafana.wikimedia.org/d/knkl4dCWz/node-exporter-textfile - https://alerts.wikimedia.org/?q=alertname%3DNodeTextfileStale [23:02:37] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P25195 and previous config saved to /var/cache/conftool/dbconfig/20220418-230237-ladsgroup.json [23:02:40] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:05:09] RECOVERY - SSH on aqs1008.mgmt is OK: SSH OK - OpenSSH_7.0 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [23:08:39] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P25196 and previous config saved to /var/cache/conftool/dbconfig/20220418-230836-ladsgroup.json [23:08:41] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log 
[23:09:46] (CR) Dzahn: [C: +2] "checked with cumin on mw*. files don't exist anywhere" [puppet] - https://gerrit.wikimedia.org/r/783461 (owner: Zabe) [23:17:42] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 (T298565)', diff saved to https://phabricator.wikimedia.org/P25197 and previous config saved to /var/cache/conftool/dbconfig/20220418-231742-ladsgroup.json [23:17:44] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance [23:17:45] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance [23:17:46] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:17:47] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [23:17:50] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:17:50] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1113:3316 (T298565)', diff saved to https://phabricator.wikimedia.org/P25198 and previous config saved to /var/cache/conftool/dbconfig/20220418-231750-ladsgroup.json [23:17:52] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:17:56] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:23:44] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P25199 and previous config saved to /var/cache/conftool/dbconfig/20220418-232343-ladsgroup.json [23:23:46] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:30:47] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after 
maintenance db1113:3316 (T298565)', diff saved to https://phabricator.wikimedia.org/P25200 and previous config saved to /var/cache/conftool/dbconfig/20220418-233047-ladsgroup.json [23:30:50] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:30:52] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [23:38:49] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T298565)', diff saved to https://phabricator.wikimedia.org/P25201 and previous config saved to /var/cache/conftool/dbconfig/20220418-233848-ladsgroup.json [23:38:50] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance [23:38:52] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance [23:38:53] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:38:54] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565 [23:38:56] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:38:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:45:52] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P25202 and previous config saved to /var/cache/conftool/dbconfig/20220418-234552-ladsgroup.json [23:45:54] Logged the message at 
https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:47:41] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance [23:47:42] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance [23:47:43] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:47:44] (CR) Cwhite: [C: +1] "I don't think we have any more hosts running the version of lsblk without that flag. Thanks!" [puppet] - https://gerrit.wikimedia.org/r/780907 (owner: JHathaway) [23:47:46] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:56:28] !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1138.eqiad.wmnet with reason: Maintenance [23:56:30] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1138.eqiad.wmnet with reason: Maintenance [23:56:31] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:56:34] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:56:35] !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1138 (T298565)', diff saved to https://phabricator.wikimedia.org/P25203 and previous config saved to /var/cache/conftool/dbconfig/20220418-235634-ladsgroup.json [23:56:38] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:56:39] T298565: Fix mismatching field type of user table for columns user_email_authenticated, user_email_token, user_email_token_expires, user_newpass_time, user_registration, user_token, user_touched, user_newpassword, user_password, user_email on wmf wikis - https://phabricator.wikimedia.org/T298565