[00:36:42] <roy649>	 O
[00:36:56] <roy649>	 I'm trying to set up a web proxy following the directions at https://wikitech.wikimedia.org/wiki/Help:MediaWiki-Vagrant_in_Cloud_VPS#Setting_up_your_instance_with_MediaWiki-Vagrant
[00:37:00] <roy649>	 I got up to:
[00:37:08] <roy649>	 $ vagrant hiera role::mediawiki::hostname wikimedia-dev.spi-tools.wmflabs.org
[00:37:21] <roy649>	 but the name doesn't seem to have been created.
[00:37:36] <roy649>	 nslookup wikimedia-dev.spi-tools.wmflabs.org
[00:37:36] <roy649>	 Server:		208.80.154.143
[00:37:36] <roy649>	 Address:	208.80.154.143#53
[00:37:36] <roy649>	 ** server can't find wikimedia-dev.spi-tools.wmflabs.org: NXDOMAIN
[00:56:54] <denisse|m>	 Hello team, I've noticed that when using a Cloud VPS instance as Standalone Puppet Master, pushing my changes from my workstation and running '# run-puppet-agent' I get this message after pushing my changes: 'Error: Could not retrieve catalog from remote server: Error 500 on SERVER: Server Error: Could not find class profile::pontoon::base for pontoon-netmon-02.monitoring.eqiad1.wikimedia.cloud on node
[00:56:54] <denisse|m>	 pontoon-netmon-02.monitoring.eqiad1.wikimedia.cloud
[00:56:54] <denisse|m>	 Warning: Not using cache on failed catalog
[00:56:54] <denisse|m>	 Error: Could not retrieve catalog; skipping run'.
[00:58:08] <denisse|m>	 After a while I can run 'run-puppet-agent' again. I'm not sure what could be causing this issue or how could I accelerate that process to test my changes faster. Do you know what could be causing this?
[04:49:49] <andrewbogott>	 denisse|m: <long pause> Probably you're experiencing the fact that your puppetmaster only pulls down merged changes periodically according to a cron.  You can force it pull changes RIGHT NOW by running 'sudo /usr/local/bin/git-sync-upstream' on the puppetmaster
[04:51:04] <andrewbogott>	 and if you want to see what's happening, your puppetmaster has its manifests checked out in /var/lib/git/operations/puppet
[04:51:09] <andrewbogott>	 hope that helps
[05:07:14] <denisse|m>	 Thanks Andrew, I'll give it a try.
[05:26:27] <PhantomTech>	 I'm having a performance issue with an SQL query, if anyone's willing to help me with it or take a look please ping me
[05:26:27] <PhantomTech>	 Quick summary is I have a query that returns 95 rows, I'm trying to reduce those rows to only those where a column = 0 but adding that condition is adding significantly more time to the query than I'd expect
[06:48:51] <dcaro>	 PhantomTech: My sql foo is not great, but I can try to take a look, can you share the query?
[06:53:12] <PhantomTech>	 dcaro: I made a lot of changes while trying to test different things but I think I restored them to how it was when I asked for help. The current query takes about 40 seconds, uncommenting the only comment makes it go for an unknown much longer time.
[06:53:12] <PhantomTech>	 https://quarry.wmcloud.org/query/65688#
[06:57:28] <dcaro>	 👍 will let you know if I get anything xd
[06:58:38] <PhantomTech>	 thanks :)
[07:19:36] <dcaro>	 This is a really poor alternative that is kinda fast (but you don't get the 100 exactly) https://quarry.wmcloud.org/history/65699/667960/648683
[07:19:39] <dcaro>	 PhantomTech: ^
[07:20:44] <PhantomTech>	 dcaro: Thank you :)
[07:22:54] <JJMC89>	 you may want to try using revision_userindex instead of revision  https://wikitech.wikimedia.org/wiki/Help:MySQL_queries#Alternative_views
[07:34:46] <PhantomTech>	 Thanks :)
[07:41:36] <PhantomTech>	 There's no way to use a temporary table in Quarry right?
[14:26:44] <Rook>	 @PhantomTech that's correct, there are no temporary tables in quarry. Though I'm not immediately seeing a reason that they could not be introduced. Feel free to open a ticket for exploration on that front.
[14:52:48] <bd808>	 roy649: you still need to manually create the proxy via horizon. the vagrant settings only configure the local MediaWiki
[14:54:04] <roy649>	 bd808 I'm not sure what I had originally done wrong, but it's working now.
[14:54:42] <bd808>	 Rook: I'd have to dig through old phab tasks, but my recollection is that temp tables have a negative impact on replication and have been banned for that reason.
[14:55:13] <roy649>	 I still don't quite understand the domain structure.  There's wmcloud.org, wmflabs.org, and wikimedia.cloud
[14:55:25] <roy649>	 It's unclear what goes in which domain
[14:56:07] <bd808>	 https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/DNS
[14:56:16] <Rook>	 Oh, I was wondering if it would be possible to use the quarry db for the temp table (I am probably misunderstanding temp tables, and am realizing they probably have to be in the DB that the query came from). The replicas should remain read only
[14:56:43] <dcaro>	 ^ I think they mean on the replicas yes (the DBs they are querying)
[14:56:43] <bd808>	 roy649: wmflabs.org is legacy, replaced by wmcloud.org
[14:57:31] <Rook>	 Ah ok, if temp tables cannot be in a different db, then we cannot support them.
[14:58:37] <roy649>	 It would be good if somebody could go through the wikitech documentation and bring it up to date :-)
[14:58:48] <bd808>	 roy649: it's a wiki ;)
[15:00:05] <roy649>	 well, yeah, I know that, but it really should be fixed by people who actually understand the system.  I can make changes, but I'm mostly guessing at what it's supposed to be, so enshrining my guesses in the docs doesn't actually help anything.
[15:00:09] <bd808>	 If I was going to bring the Vagrant in Cloud docs up to date I would probably start by marking them as outdated as mw-vagrant is functionally a dead project
[15:00:33] <roy649>	 Oh?
[15:01:06] <roy649>	 I just started using it because it seemed like the right thing to be doing.
[15:02:33] <roy649>	 I made one attempt to get docker running on my laptop, but wasn't able to make that work, so I tried Vagrant and had more success.
[15:04:07] <bd808>	 roy649: *nod* It is the best documented system for sure. the main issue is that no one really maintains the software anymore and it is currently using Debian Stretch as a base image.
[15:05:08] <bd808>	 I was the primary maintainer of mw-vagrant for many years, but I have been out of the MediaWiki development game for too long for it to be an interesting hobby for me
[15:05:59] <roy649>	 OK, so let's take a step backwards.  My real goal is to do some work on the checkuser extension.  What's the recommended way to set up a dev environment to do that?
[15:07:21] <bd808>	 I think the "official" system is https://www.mediawiki.org/wiki/MediaWiki-Docker these days (as much as there is anything official)
[15:08:41] <taavi>	 generally Cloud VPS is not intended to replace a local development setup
[15:09:05] <bd808>	 this is also true
[15:10:23] <bd808>	 Cloud VPS is a good place to host a demo server to show off active work, but not ideal for hosting dev environments. This is basically just because we don't have enough compute to give everyone a "laptop in the cloud" for development.
[16:37:30] <urbanecm>	 hi, would someone mind approving https://toolsadmin.wikimedia.org/tools/membership/status/1278? it's for a fellow team member. thanks!
[16:44:47] <urbanecm>	 thanks!
[16:47:18] <dcaro>	 👍 /me is cloud duty! :)
[19:13:24] <roy649>	 So, I'm going down the docker path.  I've gotten to the point where my wiki is running but I get "Whoops! The default skin for your wiki, defined in $wgDefaultSkin as vector, is not available."
[19:13:50] <roy649>	 Following the instructions at https://www.mediawiki.org/wiki/Skin:Vector#Installation, I added "wfLoadSkin( 'Vector' );" to my LocalSettings.php
[19:14:41] <roy649>	 Which gets me "Fatal error: Uncaught Exception: Unable to open file /var/www/html/w/skins/Vector/skin.json....
[19:15:32] <bd808>	 roy649: have you cloned vector?
[19:15:52] <roy649>	 Yup, and symlinked as suggested: Vector -> ../../Vector
[19:18:14] <roy649>	 cat mediawiki/skins/Vector/skin.json  prints the file, so the symlink is right
[19:20:42] <bd808>	 is /var/www/html/w mounting something other than that mediawiki dir?
[19:21:51] <roy649>	 I need to ssh into the container to see that, right?
[19:22:00] <roy649>	 I haven't figured out how to do that yet :-)
[19:22:02] <bd808>	 and if not, is the symlink outside of the mounted file tree?
[19:22:39] <bd808>	 roy649: use https://docs.docker.com/engine/reference/commandline/attach/ rather than ssh
[19:23:04] <roy649>	 ah, yeah, that was it.
[19:23:38] <roy649>	 When I got rid of the symlink and physically moved the Vector directory under mediawiki/skins, it works.
[19:23:41] <roy649>	 thanks
[19:24:19] <bd808>	 np
[19:42:45] <wm-bot>	 !log tools.lexeme-forms <lucaswerkmeister> deployed 6ac757a997 (Igbo verbs + pronouns)
[19:42:48] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.lexeme-forms/SAL
[20:26:50] <Luke081515>	 hi, I have two questions: a) I have upgraded an instance, but it doesn't show up in the livereport / in horizon, does this need some time? b) I can't login into neon.rcm.eqiad1.wikimedia.cloud, it refuses my key, can someone help me with that?
[20:31:17] <Rook>	 Luke081515: looking...
[20:31:26] <bd808>	 Luke081515: what does "upgraded an instance" mean?
[20:31:31] <Rook>	 (Also, yes I've found horizon takes a bit of time to update)
[20:31:59] <Luke081515>	 bd808: I've manually upgraded it from stretch to buster, changing that apt-sources, then apt upgrade, apt dist upgrade, reboot etc)
[20:32:26] <bd808>	 ah... yeah that's never going to show as the new distro version on reports.
[20:32:53] <Rook>	 I can get in as root, let's see project permissions
[20:32:57] <bd808>	 that's an "in-place upgrade". we track the base image the vm was created from
[20:33:30] <urbanecm>	 Rook: for what it is worth, i (a project member) can ssh to oxygen.rcm.eqiad1.wikimedia.cloud, but not neon.rcm.eqiad1.wikimedia.cloud
[20:33:52] <Rook>	 Logs then :)
[20:35:29] <Luke081515>	 bd808: As long as that prevents the instance from beeing deleted, that's enough for me :)
[20:35:37] <Luke081515>	 (I also documented that at that phab task)
[20:36:14] <bd808>	 Luke081515: well it will show up on every report that we run, so I'm not sure that it will prevent confusion at least
[20:36:29] <Rook>	 `Jun 29 20:26:38 neon sshd[304]: input_userauth_request: invalid user urbanecm [preauth]`
[20:36:29] <Rook>	 ldap issue?
[20:36:57] <bd808>	 Rook: sounds like it. maybe reboot before digging much deeper
[20:37:00] <Luke081515>	 (I added him as projectadmin just today, he was only projectmember before, maybe related to that?)
[20:37:05] <Luke081515>	 I already rebooted it twice today
[20:37:13] <Luke081515>	 from horizon. once soft, then hard
[20:37:30] <bd808>	 and I'm guessing this is an instance that you in-place upgraded?
[20:37:39] <taavi>	 (not) being a projectadmin doesn't affect your ability to ssh in
[20:37:56] <Luke081515>	 no, that is oxygen. the instance I also wanted to in-place upgrade, but can't get in is neon
[20:38:23] <taavi>	 stretch->buster changes the ldap auth stack completely, so an in-place upgrade breaking things wouldn't surprise me at all
[20:38:36] <bd808>	 yeah, that was my thinking too
[20:38:37] <Luke081515>	 well, but at that instance it works fine
[20:38:46] <bd808>	 but sounds like this is something else
[20:39:03] <taavi>	 is puppet running fine on the broken instance?
[20:39:37] <Luke081515>	 not sure if the question was directed to me, I don't know since I can't get it :/
[20:39:50] <Luke081515>	 *can't get in
[20:40:00] <Rook>	 puppet is complaining about some deps, including python-pyldap
[20:40:02] <bd808>	 "id: ‘bd808’: no such user" -- ldap is totally busted on neon. Not sure why yet
[20:41:04] <bd808>	 apt is made there and breaking puppet
[20:41:10] <bd808>	 *mad
[20:43:07] <mutante>	 that would be "python-pyldap" -> "python3-pyldap"
[20:43:20] <Rook>	 `/dev/vda3                           19479164 19051408         0 100% /`
[20:43:20] <Rook>	 that might be a problem
[20:43:55] <Rook>	 https://www.irccloud.com/pastebin/6Cws8KaI/
[20:43:59] <Luke081515>	 hm, the instance had that problem earlier, clearing the apt cache helped then (since some images were big)
[20:44:15] <Luke081515>	 *some packages
[20:44:30] <mutante>	 Rook: heh:) that would be it.  "apt-get clean" ?
[20:44:39] <bd808>	 in theory https://packages.debian.org/stretch/python-pyldap exists, but maybe not in our apt mirror
[20:44:45] <Rook>	 heh, let's give it a try
[20:44:56] <Luke081515>	 would be nice if that helps :D
[20:45:11] <mutante>	 it usually give a little breating room
[20:45:24] <Rook>	 That's looking better, down to 75, let's see how puppet runs now
[20:45:32] * bd808 sees that Rook is on this and gets the heck out of the way
[20:47:19] <Rook>	 Puppet seems happier. Though I can't get in as me yet. Can anyone else get in?
[20:47:28] * Luke081515 tries
[20:47:42] <Luke081515>	 no, same error yet :/
[20:47:48] <Luke081515>	 *same error again
[20:48:10] <taavi>	 Jun 29 20:47:29 neon nslcd[707]: [9b500d] <passwd="urbanecm"> ldap_start_tls_s() failed (uri=ldap://ldap-ro.eqiad.wikimedia.org:389): Connect error: (unknown error code)
[20:48:37] * taavi restart that service
[20:55:29] <Luke081515>	 got a connect error now again
[20:56:52] <taavi>	 looks like it has tls certificate validation issues in general: https://phabricator.wikimedia.org/P30630
[20:57:07] <taavi>	 and because of that, it can't connect to ldap to fetch your user info
[20:59:26] <taavi>	 let's see if upgrading libssl fixes it
[20:59:30] <Rook>	 I wonder if that is related to neon getting an invalid signature for the GitLab repo
[20:59:59] <Luke081515>	 if that might be related, you can remove it, I can correct that later
[21:00:38] <Rook>	 Oh I think if it is related it would be remedied along with ldap with what taavi is trying
[21:01:57] <taavi>	 try now?
[21:02:13] <Luke081515>	 nice, works :)
[21:02:17] <Luke081515>	 thank you very much :)
[21:02:20] <taavi>	 great!
[21:02:22] <Rook>	 +1 Thank you taavi !
[21:02:28] <Luke081515>	 thank you both :)
[21:03:59] <urbanecm>	 ssh works for me too. thanks :)
[21:04:10] <Luke081515>	 bd808: Would it be okay if I assign that phab ticket about this two servers and their upgrade to you, mentioning that I did an in-place upgrade there, so that it can be considered? (I'm not sure if I'm the only person doing that?)
[21:05:51] <bd808>	 Luke081515: I'm not running the process this year. Komla is the person you need to sync with. And mostly you are the only person I've heard of doing in-place this year. In the past we have only had 1-2 folks go that route as well. "cattle not pets" is the watchword here :)
[21:06:49] <bd808>	 I do wish we had better processes for y'all to make automation. working in ops/puppet as a volunteer is not very fun
[21:10:28] <hauskatze>	 Do we still need local virtualenvs for python bots to migrate our jobs to toolforge-jobs?
[21:17:40] <urbanecm>	 hauskatze: yes, the image has only a couple of widely needed packages
[21:18:06] <hauskatze>	 :(
[21:18:39] <hauskatze>	 Well, I'll stick with Grid using -release buster then
[21:20:52] <bd808>	 hauskatze: venvs are always going to be needed for kubernetes jobs. the only potential fix for that in the future will be custom docker containers.
[21:22:56] <Luke081515>	 taavi: Might I ask you again? I did some apt upgrade, and at that server, it looks like it forgot that I have sudo. It now asks me for the pwd and writes me that "With great power comes great responsibility" stuff down
[21:23:25] <hauskatze>	 bd808: so --image [my-custom-image-here] ?
[21:23:49] <hauskatze>	 instead of e.g. tf-python39 ?
[21:26:07] <bd808>	 hauskatze: not sure what it will look like honestly, but there will eventually be some system where docker containers can be built which include additional apt packages and other custom changes. Right now the work on it is in a very proof-of-concept state.
[21:28:27] <bd808>	 T194332 is the epic that this all fits under
[21:28:27] <stashbot>	 T194332: [Epic] Make Toolforge a proper platform as a service with push-to-deploy and build packs - https://phabricator.wikimedia.org/T194332
[21:32:05] <hauskatze>	 thanks bd808 :)