Hi folks,
Multi-layered one here. We have x2goserver (4.1.0.4-0.0x2go1.0.git20200228.1815.heuler.el7) available on a CentOS 7 login node of our HPC cluster for folks to submit jobs from. If they start an x2go client session from a Mac to our login node, submit interactive jobs with X11 tunnelling enabled (or even just ssh -X directly to an execution node) and run RStudio Desktop (1.1.463) from within that second host, there is an unusable lag in their session. If they type in the console, it can take up to 10 seconds for a keystroke to register and display. There are no errors and everything works; it just isn't very responsive. Linux x2go clients are fine. Windows x2go clients are fine.
I know that's a lot of layers they're going through but I'm at a loss for identifying which layer is the issue. Any troubleshooting tips? Anyone out there experiencing similar symptoms?
Cheers
On Tue, Sep 22, 2020 at 11:42 PM Scott Wood <woodystrash@hotmail.com> wrote:
Hi folks,
Multi-layered one here. We have x2goserver (4.1.0.4-0.0x2go1.0.git20200228.1815.heuler.el7) available on a CentOS 7 login node of our HPC cluster for folks to submit jobs from. If they start an x2go client session from a Mac to our login node, submit interactive jobs with X11 tunnelling enabled (or even just ssh -X directly to an execution node) and run RStudio Desktop (1.1.463) from within that second host, there is an unusable lag in their session. If they type in the console, it can take up to 10 seconds for a keystroke to register and display. There are no errors and everything works; it just isn't very responsive. Linux x2go clients are fine. Windows x2go clients are fine.
Hmm, so this is tied to MacOS on the client side and everything else is the same?
Do you have the same connection type for all the x2go sessions?
One thing you could try: run x11vnc within the x2go session and connect to the session via VNC. Is it also slow in VNC? Then suspend the x2go session. Does anything change in the VNC session?
Note: for this test to work you must disable the sleep feature on the server. Extend the X2GO_NXOPTIONS variable in /etc/x2go/x2goagent.options like this: X2GO_NXOPTIONS="sleep=0"
Afterwards start a NEW session (not reconnecting to an existing one) and do the vnc test from above.
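For anyone who wants to script that change, here is a minimal Python sketch of the edit. It assumes the file contains a line of the form X2GO_NXOPTIONS="..." and that multiple NX options are separated by commas; both are assumptions about the file layout, not something confirmed in this thread. The function name ensure_nx_option is made up for illustration. It only transforms text, so reading and writing /etc/x2go/x2goagent.options (as root, after taking a backup) is left to you.

```python
import re

# Assumed layout: a line such as X2GO_NXOPTIONS="opt1=a,opt2=b"
_NXOPTIONS_LINE = re.compile(r'^(\s*X2GO_NXOPTIONS=")([^"]*)(")\s*$')


def ensure_nx_option(config_text: str, option: str = "sleep=0") -> str:
    """Return config_text with `option` present in X2GO_NXOPTIONS.

    An existing value for the same key is replaced. If no uncommented
    X2GO_NXOPTIONS line exists, one is appended at the end.
    """
    key = option.split("=", 1)[0]
    lines = config_text.splitlines()
    for i, line in enumerate(lines):
        match = _NXOPTIONS_LINE.match(line)
        if not match:
            continue
        # Keep every other option, drop any earlier setting of the same key.
        current = [o for o in match.group(2).split(",") if o]
        current = [o for o in current if o.split("=", 1)[0] != key]
        current.append(option)
        lines[i] = match.group(1) + ",".join(current) + match.group(3)
        break
    else:
        lines.append(f'X2GO_NXOPTIONS="{option}"')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(ensure_nx_option('X2GO_NXOPTIONS=""\n'), end="")
```

Remember that the change only applies to sessions started after the file was edited, hence the NEW session.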
Uli
Thanks, Ulrich!
Yes. The whole stack is the same. Only the OS on the client end is different. I have tested connecting from CentOS 7, Fedora 32, Windows 10, and Mac OS X 10.11 and 10.15. Windows and both Linux distros play nicely. Both OS X versions present the lag.
I've used a few VNC flavours before but not that specific one so I'll do a bit of reading and testing before dropping it on our production server.
Still, as a troubleshooting step, to get as close as I could to your proposed test, I just fired up a TigerVNC server in the x2go session on the login node and used a VNC client in that same x2go session to hit localhost:#. All good. No lag. While that is a functioning solution, we're trying to present an intuitive and usable solution to new users. X2go provides that (thanks devs!) but telling new users "All you need to do is hook up to the VPN, connect with x2go, then in that x2go, start vncserver, check your display number, start vncviewer in there, to get a second desktop. From that submit an interactive job, and you're good to go!" may be a bit more than they're ready for.
Our hope is that we can establish what is introducing the lag, iron it out, and keep our users in the single x2go solution. Does the VNC test above shed any light on which layer we should be troubleshooting?
Cheers Scott
p.s. Apologies for the double reply. I missed the list on my first
From: Ulrich Sibiller <uli42@gmx.de> Sent: Wednesday, 23 September 2020 8:25 AM To: Scott Wood <woodystrash@hotmail.com> Cc: x2go-user@lists.x2go.org <x2go-user@lists.x2go.org> Subject: Re: [X2Go-User] Lag in RStudio desktop in nested ssh sessions (or qsub -I) MacOS clients
You got me wrong. All I wanted to achieve by that is some diagnosis. If the session is fast in x11vnc but not in X2Go, and if there's a difference between the x2go session being suspended or not, we can probably deduce the cause of the problem. It was in no way meant as a solution or a workaround.
Another thing you could check is the options of your session. In a running session please check /tmp/.x2go-<username>/C-<your-session-id>/options and post the content. Please compare them to the options for a session started from Linux or Windows. You can also send me the session.log in the same directory (but please not via the mailing list).
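If comparing the two options files by eye is tedious, a small Python sketch like the following could do it. It assumes the file holds a single comma-separated list of key=value items, optionally ending in ":<display number>"; that format is an assumption on my part, so check it against your real files. The function names and the sample strings are hypothetical, and the cookie and root entries are ignored because they are expected to differ between sessions anyway.

```python
def parse_nx_options(raw: str) -> dict:
    """Parse a comma-separated option string into a dict (assumed format)."""
    body = raw.strip()
    # Assumption: the string may end with ":<display number>"; strip it.
    head, sep, tail = body.rpartition(":")
    if sep and tail.isdigit():
        body = head
    options = {}
    for item in body.split(","):
        if not item:
            continue
        key, eq, value = item.partition("=")
        options[key] = value if eq else None  # bare items carry no value
    return options


def diff_options(a: dict, b: dict, ignore=("cookie", "root")) -> dict:
    """Return {key: (value_in_a, value_in_b)} for every key that differs."""
    keys = sorted((set(a) | set(b)) - set(ignore))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


if __name__ == "__main__":
    # Hypothetical sample contents, not taken from a real session.
    mac_raw = "nx/nx,link=adsl,pack=16m-jpeg-9,cookie=aaaa:50"
    linux_raw = "nx/nx,link=lan,pack=16m-jpeg-9,cookie=bbbb:51"
    print(diff_options(parse_nx_options(mac_raw), parse_nx_options(linux_raw)))
    # prints {'link': ('adsl', 'lan')}
```

With real sessions you would read each string from the options file of the Mac session and of the Linux or Windows session and pass them in instead of the samples.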
Next thing to check: if you start a session from Linux and then reconnect to that same session from macOS, does it become slow as well?
Uli
On Thu, Sep 24, 2020 at 4:37 AM Scott Wood <woodystrash@hotmail.com> wrote:
On 22.09.20 at 23:40, Scott Wood wrote:
I know that's a lot of layers they're going through but I'm at a loss for identifying which layer is the issue. Any troubleshooting tips? Anyone out there experiencing similar symptoms?
Well, we do have several HPC cluster operators that use X2Go - see our "success stories" list in the Wiki. Maybe it would make sense to contact them directly and see if they've had the same issue?
Also, why can't you run X2GoServer directly on the execution nodes, instead of using ssh -X? If the login node is the only one with a publicly accessible IP, that's not a problem: activate the proxy feature in the X2GoClient session setup and tell X2GoClient to use your login node as an SSH proxy (i.e. "jump host").
Kind Regards, Stefan Baur
-- BAUR-ITCS UG (haftungsbeschränkt) Geschäftsführer: Stefan Baur Eichenäckerweg 10, 89081 Ulm | Registergericht Ulm, HRB 724364 Fon/Fax 0731 40 34 66-36/-35 | USt-IdNr.: DE268653243
Thanks, Stefan!
I'll have a look at the "success stories" section of the page and check the various HPC sites to see if their site-specific docs are available online, as that may answer my question without me needing to pester any busy admins.
As for x2go directly on the execution nodes, that doesn't work so well in the HPC environment. I provided the example of double ssh just to illustrate that I'd ruled out the submission of an interactive session under PBS Professional as a contributor to the problem. When folks work on the execution nodes, it needs to be via a job submitted with qsub, which is either a script that can be left unattended or an interactive session that can have X11 forwarding enabled. Without going through qsub, there is no way to track or limit the resource use based on the requests (for example mem=20gb,ncpus=4).
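To make the interactive case concrete, here is a rough Python sketch that assembles such a submission command. It assumes PBS Professional's qsub takes -I for an interactive job, -X for X11 forwarding and -l for the resource request; the exact form of the -l string varies between sites, so treat this as an illustration rather than our actual configuration. The helper name interactive_qsub_command is made up for this example.

```python
import shlex


def interactive_qsub_command(resources: dict, x11: bool = True) -> list:
    """Build the argument list for an interactive qsub submission.

    Assumes -I (interactive), -X (X11 forwarding) and -l (resources).
    """
    cmd = ["qsub", "-I"]
    if x11:
        cmd.append("-X")
    if resources:
        # Join the requests as key=value pairs, e.g. mem=20gb,ncpus=4
        cmd += ["-l", ",".join(f"{k}={v}" for k, v in resources.items())]
    return cmd


if __name__ == "__main__":
    cmd = interactive_qsub_command({"mem": "20gb", "ncpus": 4})
    print(shlex.join(cmd))  # prints: qsub -I -X -l mem=20gb,ncpus=4
```

The list form can be handed to subprocess.run on the login node; the printed string is only there to show what the user would otherwise type.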
Cheers, Scott
p.s. Apologies for the double reply. I missed the list on my first
From: x2go-user <x2go-user-bounces@lists.x2go.org> on behalf of Stefan Baur <X2Go-ML-1@baur-itcs.de> Sent: Wednesday, 23 September 2020 6:13 PM To: x2go-user@lists.x2go.org <x2go-user@lists.x2go.org> Subject: Re: [X2Go-User] Lag in RStudio desktop in nested ssh sessions (or qsub -I) MacOS clients
x2go-user mailing list x2go-user@lists.x2go.org https://lists.x2go.org/listinfo/x2go-user