[11:02:18] o/ I've been doing some work on a tool which needs some fairly serious updating.. before doing anything more, I'd like to do a file-level backup of the entire tool directory. Before I run `zip -r tools.refill.zip -9 /data/project/refill`, is there a "better way"?
[11:03:24] I’d probably run that on dev.toolforge.org instead of login.toolforge.org, to reduce NFS load on the primary bastion
[11:03:35] otherwise, not much I can think of
[11:04:05] you could rsync the files to a local system without compression, and then compress them locally instead of on Toolforge, but I don’t think `zip -9` needs enough CPU for that to make a significant difference
[11:04:29] (I do that with database backups, `xz` can use more CPU)
[11:04:46] cool, thank you Lucas_WMDE :) I'm hoping `-9` isn't too resource-heavy
[11:04:47] https://github.com/lucaswerkmeister/home/blob/f184a6445b6624ee4d2923aad0871cc8c5857f80/.local/bin/toolsdb-backup#L27-L31
[11:05:44] * TheresNoTime will run it in a `screen` just in case
[11:05:49] think it'll take a while :P
[11:06:59] normally I'd rely on the fact everything "important" is in git, but it seems the local instance has diverged a lot from what the actual repo holds :')
[11:07:11] :mildpanic:
[11:09:50] that is bumping the load up on `tools-sgebastion-11` a fair bit :/
[12:43:03] !log devtools `os quota set devtools --ram 45056 --cores 22 --instances 9` # T311302
[12:43:07] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Devtools/SAL
[12:43:07] T311302: Request increased quota for devtools Cloud VPS project - https://phabricator.wikimedia.org/T311302
[15:53:17] !log tools.quickcategories deployed 64df38cf96 (README.md update – pulled without webservice restart)
[15:53:19] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.quickcategories/SAL
[17:11:43] what project do people use when they need a vm for a one-off development task?
[17:30:13] ori: we really don't have a shared space for that, but there might be places to send you with a bit more context...
[17:48:04] bd808: I wanted a dev environment similar to the prod varnishes -- that is: debian buster, and the various wmf repos set up in /etc/apt, so i get the same version of varnish that is currently in use
[17:49:59] it seems like docker might be a way but the base buster image has no repos set up in /etc/apt
[19:14:59] Can
[19:15:15] Can't SSH into Toolforge. Am I missing something?
[19:15:21] ori: a docker base that is set up for the typical /etc/apt config would be useful. For now, you might get what you need out of https://openstack-browser.toolforge.org/project/sre-sandbox. See T247517 for the scope of that project and its interesting solution.
[19:15:22] T247517: Request creation of 'sre-sandbox' VPS project - https://phabricator.wikimedia.org/T247517
[19:16:01] Getting "Network error: Connection timed out:"
[19:16:01] Cyberpower678: do you happen to have any more details?
[19:16:17] what hostname are you trying to connect to?
[19:16:27] https://wikitech.wikimedia.org/wiki/Reporting_a_connectivity_issue
[19:18:01] taavi: login-stretch.tools.wmflabs.org
[19:18:29] try login.toolforge.org
[19:18:41] That'll be buster though taavi
[19:18:55] Although stretch is gone ain't it
[19:18:57] RhinosF1: yes, I shut down the stretch bastions earlier today
[19:19:44] taavi, so those hosts don't redirect to the new machines?
[19:20:08] -Stretch was only temp in case you had to
[19:20:15] Until it got shut off
[19:21:24] Cyberpower678: if they did then you would be here asking about why the ssh host keys had changed. I'm not sure what the least disruptive answer is honestly.
[19:22:14] Not why, if. Besides, I'm aware of the Fingerprints page on Wikitech now.
:-)
[19:24:04] bd808: sre-sandbox is perfect, thanks
[19:29:24] !log added self to sre-sandbox
[19:29:25] ori: Unknown project "added"
[19:29:36] !log sre-sandbox added self to project
[19:29:37] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Sre-sandbox/SAL
[19:30:17] taavi: also getting a timeout
[19:30:46] Nevermind. Connecting now
[19:33:01] :-)
[20:31:20] Hi, I'm trying to fix the Telegram @wikilinksbot. it seems a package it relied on has mysteriously disappeared from its virtualenv
[20:32:36] but when i try to do `./my_venv/bin/pip3 install `, it says "ModuleNotFoundError: No module named 'pip'". What am I doing wrong?
[20:33:20] are you running that command directly on a bastion or in a webservice shell?
[20:33:28] directly
[20:33:46] oh, but i'm not *in* the virtual environment, perhaps that's my mistake
[20:34:01] you should try activating the venv too
[20:34:09] but I think it might also be because you’re no longer on a Stretch bastion
[20:34:33] so if the venv was set up using Python 3.5, then it won’t work on the bastion anymore and you’d need to use a webservice shell instead
[20:35:24] hmmmm
[20:35:33] is there a guide?
[20:36:34] @lucaswerkmeister: if it runs on the buster grid, the venv should be set up on a buster bastion / grid host
[20:36:44] @jhsoby: yes, https://wikitech.wikimedia.org/w/index.php?title=News/Toolforge_Stretch_deprecation&mobileaction=toggle_view_desktop#Rebuild_virtualenv_for_python_users
[20:37:01] I don’t even know if it runs on grid or k8s because the tool’s home dir isn’t world-readable
[20:38:17] https://sge-status.toolforge.org/ reveals it's running on the job grid
[20:44:24] Hello team, I get the following error when doing 'scap pull' in a WMCS instance: https://phabricator.wikimedia.org/P30243
[20:44:24] I've checked that the 'rsync' daemon is running as well as that I can ping the instance that 'scap' is trying to connect to (deployment-deploy03.deployment-prep.eqiad1.wikimedia.cloud).
I couldn't find any relevant kernel logs about the 'rsync' daemon.
[20:44:24] My guess is that the 'rsync' daemon is not running on deployment-deploy03.deployment-prep.eqiad1.wikimedia.cloud but my SSH connection to that machine is refused.
[20:46:07] thanks, it worked flawlessly (re @wmtelegram_bot: @jhsoby: yes, https://wikitech.wikimedia.org/w/index.php?title=News/Toolforge_Stretch_deprecation&mobileaction=toggle_vi...)
[20:46:07] denisse: trying if I can ssh to it.. checking on rsync
[20:46:32] denisse: rsync is running. Active: active (running) since Fri 2022-05-27 21:43:40 UTC; 3 weeks 6 days ago
[20:47:00] mutante: Thank you!
[20:47:11] Do you know what else I should look for? :O
[20:47:20] denisse: 301 #wikimedia-releng, who maintain scap
[20:47:20] denisse_: let me see if I can fix SSH for you
[20:48:05] taavi: I'll ping them there, thanks.
[20:48:15] although it seems a bit weird that an instance in the monitoring project tries to use something in deployment-prep by default
[20:48:16] mutante: Thank you!
[20:49:06] denisse_: so.. you need membership in the project "deployment-prep"
[20:49:10] I can see you don't have it
[20:49:22] if your user name is denisse there as well
[20:49:43] mutante: Is the membership to that project required for SSH access?
[20:50:00] can't add people though.
[20:50:05] yes, you need to be a member of a project to ssh into instances in it
[20:50:09] taavi: could you add her as regular user?
[20:50:12] i can.. give me a second
[20:50:17] with admin powers. thanks!
[20:50:33] I meant "with the admin powers you have" heh
[20:51:56] denisse_: your error pastebin mentions "pontoon". what is the project you created that instance in?
[20:52:01] !log deployment-prep added `denisse` as a member
[20:52:04] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Deployment-prep/SAL
[20:52:14] like in horizon.wikimedia.org..
in the dropdown menu at the top
[20:52:24] there is a project selected
[20:52:38] mutante: I created that instance inside the 'monitoring' project. The instance name is 'pontoon-netmon-02'.
[20:52:45] And it's a Debian Bullseye instance.
[20:53:24] denisse_: you should be able to ssh into deployment-prep instances now
[20:53:24] denisse_: the root cause is that it would require instances talking to each other across different projects
[20:53:44] the machine you are on and the deployment server it tries to pull from.. they are not in the same project
[20:53:47] mutante taavi: I confirm access to 'deployment-deploy03.deployment-prep.eqiad1.wikimedia.cloud', thank you! :)
[20:54:07] so depending how you want to look at it .. either it needs security rules to allow that
[20:54:19] or more likely.. you want to override the deployment server it tries to use
[20:54:28] to something within the monitoring project
[20:55:00] it's going to be a setting in Hiera somewhere that makes it try that deployment server
[20:55:14] or the absence of that means it falls back to a default in deployment-prep
[20:55:44] so what exactly are you trying to accomplish? `scap pull` updates the machines local clone of mediawiki, and I'm not sure if that's what you actually want
[20:55:44] mutante: That makes sense, I may be using the wrong server then. I'll check that out. Thank you for your insights. :)
[20:55:50] either way it's not common to deploy with scap from one project from the deployment server in another project
[20:55:58] though it could be allowed with security groups
[20:56:40] taavi: I want to be able to install LibreNMS in Debian Bullseye to put netmon1003 in service. Here's the relevant task: https://phabricator.wikimedia.org/T309074
[20:56:41] denisse_: check what other instance names you see in "monitoring" project context.. is anything called like "deploy" ?
[20:58:07] right..
I don't think librenms needs mediawiki, so you might have accidentally spotted a bigger issue in the puppet code
[20:58:12] either you need a deployment server in "monitoring" (a bit of a pain to set up but some other projects do that too)
[20:58:23] or you need to use deployment-prep for the whole testing
[20:58:33] mutante: I don't think we have one, maybe I should deploy one.
[20:58:39] or some other project that already has a deployment server
[20:59:52] denisse_: you can try to make one.. create an instance in 'monitoring', call it something like deploy1001 .. apply the puppet role::deployment_server
[21:00:08] but prepare for some more hurdles to get it going
[21:00:31] I can tell you at least where to copy stuff from
[21:01:49] that being said.. whether you do your test in the monitoring project or another project might not matter much
[21:01:50] mutante: should netmon actually need scap?
[21:02:02] taavi: I found the issues while running this puppet file on Debian Bullseye. I'm updating the puppet code to add support for Debian Bullseye but the errors I get now are because LibreNMS is not installed. https://github.com/wikimedia/puppet/blob/production/modules/librenms/manifests/init.pp
[21:02:07] depends how much you see this kind of thing happening in the future
[21:02:26] RhinosF1: eh.. good question. let's see if it's a scap::target
[21:02:29] RhinosF1: I think yes because LibreNMS is installed using scap.
[21:03:20] scap::target uses a different feature of scap than what `scap pull` is
[21:03:50] taavi: it's a target for https://github.com/wikimedia/puppet/blob/production/modules/librenms/manifests/init.pp#L29
[21:04:08] What should it be doing then
[21:04:12] so...
end of the day it's just rsync
[21:04:28] you probably can cheat and just rsync it yourself in the same place
[21:04:37] and not worry about doing an actual deployment_server setup
[21:04:48] denisse_: I just want to make sure you don't do stuff that's not needed
[21:05:02] Here's the error I get when running 'sudo run-puppet-agent' https://phabricator.wikimedia.org/P30244
[21:05:20] denisse_: I can't see that
[21:05:37] yea, that makes sense, denisse_
[21:05:42] I'm a mere mortal
[21:05:55] the part that the errors are about "File[/srv/deployment/" means it is because of missing scap pull
[21:06:12] stuff under that dir comes from there instead of puppet
[21:06:28] so yea, you got that right, you need the scap pull to work
[21:06:46] this is the actual scap command that is failing: https://phabricator.wikimedia.org/P30244$41
[21:07:10] mutante: `scap pull` is still a different thing than what's wanted here
[21:07:31] `scap pull` is used to update the local mediawiki installation
[21:07:38] librenms is not mediawiki
[21:09:04] ok. fair enough
[21:09:15] regardless ..the problem is.. can't talk to deployment_server in another project
[21:09:22] and ..has no deployment server in local project
[21:09:38] taavi: if I may ask, any idea why IP Info fails on enwiki-beta but not on metawiki-beta? /me is confused - cfr. https://phabricator.wikimedia.org/T309400#8025778
[21:09:42] so either it's "test netmon in deployment-prep" or "set up deployment server in monitoring" or "manually rsync it, forget about scap"
[21:11:05] taavi: Thanks, you're right. 'scap deploy-local --repo librenms/librenms' is what I need.
[21:12:28] hauskatze: I'm not familiar with ipinfo at all, but that stacktrace suggests that the beta enwiki database has an entry that's invalid by modern standards
[21:13:24] taavi: I'm guessing that if IPInfo code were broken it'd be broken everywhere and not just there?
Just guessing
[21:13:41] so your explanation is convincing but I don't speak Logstash :)
[21:13:50] well depends on how exactly it is broken
[21:14:56] but the stack trace tells me it breaks when it tries to format an individual entry, so if other entries work that's my conclusion
[21:36:06] I'm around if anything in scap needs fixing
[21:37:57] denisse_: ^
[21:41:19] dancy: what she needs is a new deployment_server though :P
[21:41:33] and currently it's like we are waiting for an initial puppet run on a new instance
[21:41:36] to become that
[21:42:39] OK. Sounds like you're taking care of things. Let me know where I can help.
[21:46:16] Thanks dancy, will do. :)
[21:47:35] is the puppetized root authorized_keys file still a thing?
[21:47:53] to get root on instances regardless of project membership
[22:32:50] Hello team, I have an issue with a Cloud instance that is requiring a sudo password but I don't have one.
[22:32:50] When I SSH into it I get a message that says "Puppet does not seem to have run in this machine. Unable to find '/var/lib/puppet/state/last_run_report.yaml'." but I can't run Puppet because I can't escalate privileges.
[22:34:09] Would it be possible for me to get root access in the Cloud instances?