On 02/05/2011 05:25 AM, Mike Gabriel wrote:
Hi Gerry,
On Sa 05 Feb 2011 00:09:52 CET Gerry Reno wrote:
What's interesting is that it presents the menu and the process is shown as "running", so then you "suspend" it and "resume" it, and you still get the "remote proxy closed the connection" error and NX proxy shows "aborting", but it does resume the session. Very confusing to users, but at least they get their session back.
From what I read, the issue really occurs around the port forwarding request. I'll try to sketch the mechanism...
o client: sets up a forwarding tunnel to the server (for nxproxy traffic, in the scenario we discussed)
o server: thinks the tunnel is already/still set up (by the former request) and denies the tunnel
  -> here PyHoca-GUI tries to push a cancel_port_forward request to
     the SSH daemon, which sometimes works but sometimes doesn't
o the x2goclient does not notice the failed tunnel (PyHoca-GUI, btw., does and reports an error)
o thus, x2goclient continues as planned
o client: calls the server-side x2goresume-session script (which fails because of the missing port forwarding tunnel)
o client: from the x2goresume-session script the x2goresume script is called (which updates the database and states that the session is running although resuming the session failed)
The problem is that old port forwarding requests on the SSH server do not get properly cancelled by the clients. If a request does not get cancelled, the port stays blocked/busy and another SSH client instance cannot request a port forwarding on this server port.
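The blocked-port effect can be reproduced locally without SSH at all. The following is only a minimal sketch using plain Python sockets: a listening socket stands in for the stale forwarding request, and closing it stands in for a successful cancel. The helper name try_listen is made up for this illustration and is not part of sshd, PyHoca-GUI or x2goclient.

```python
import errno
import socket


def try_listen(port, host="127.0.0.1"):
    """Return a listening socket on (host, port), or None if the port is busy.

    Hypothetical helper: it only mimics a server granting or denying
    a request for a listening port.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen(1)
    except OSError as exc:
        sock.close()
        if exc.errno == errno.EADDRINUSE:
            return None  # port is still held by an earlier request
        raise
    return sock


if __name__ == "__main__":
    first = try_listen(0)  # port 0: let the OS pick a free port
    port = first.getsockname()[1]

    second = try_listen(port)  # same port while the first is still active
    print("second request:", "granted" if second else "denied")

    first.close()  # stands in for a successful cancel of the old request
    third = try_listen(port)
    print("request after cancel:", "granted" if third else "denied")
    if third:
        third.close()
```

As long as the first socket is open, the second request for the same port is refused, which mirrors the server denying a tunnel it believes still exists.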
Hope that sheds more light on the situation... If you can find an SSHd option that could fix this behaviour this would be really great!!!
Greetings, Mike
Right now I'm working only with the browser-client. I'm intentionally crashing the browser and then trying to resume the detached session. So from the client perspective it's a whole new client process that doesn't know anything about an existing tunnel.
What I'm thinking is that when this happens the server is not receiving an ACK and is putting the connection into the TIME_WAIT state, whose interval is defined as 2*MSL, where MSL is the maximum segment lifetime (how long a packet may live in the network). Most networks have MSL set to 30 seconds these days. I think it used to be 120 seconds. So it looks like TIME_WAIT will last from 1 to 4 minutes, depending upon the server network.
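To make the arithmetic explicit, here is a small sketch that computes the TIME_WAIT interval for the two MSL values mentioned above. The function name time_wait_seconds is made up for this illustration, and the MSL values are the ones assumed in this mail, not values measured on any server.

```python
def time_wait_seconds(msl_seconds):
    """TIME_WAIT lasts twice the maximum segment lifetime (2 * MSL)."""
    return 2 * msl_seconds


if __name__ == "__main__":
    # 30 s and 120 s are the MSL values assumed in the discussion.
    for msl in (30, 120):
        tw = time_wait_seconds(msl)
        print(f"MSL={msl}s -> TIME_WAIT={tw}s ({tw / 60:g} min)")
```

With those inputs the interval comes out as 60 and 240 seconds, which is where the 1 to 4 minute range comes from.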
So what happens is if the user just waits long enough then they can resume the detached session. If they keep trying at intervals less than TIME_WAIT then it seems they never get reconnected.
I'm not sure what type of experiments need to be constructed to account for all the scenarios that can happen between the different types of clients and the server. But we need to think about that. And then see if we can manually construct these scenarios just using terminals and try to tweak the settings until we have good behavior under each scenario. Then try to determine how to get the client, nxproxy and the server to implement those behaviors.
Regards, Gerry