[X2go-Dev] 3.99 Testing - Multiple X2goclient instances fail

John A. Sullivan III jsullivan at opensourcedevel.com
Mon Aug 1 20:38:27 CEST 2011


On Mon, 2011-08-01 at 13:30 -0400, John A. Sullivan III wrote:
> On Fri, 2011-07-29 at 10:21 +0200, Mike Gabriel wrote:
> > Hi Daniel,
> > 
> > On Fr 29 Jul 2011 07:24:40 CEST Daniel Lindgren wrote:
> > 
> > > [...]
> > >
> > > I've been trying to think of some way to make x2go automatically only
> > > use free ports, but since there are possibly several clients and
> > > servers involved in multiple incoming and outgoing sessions, the only
> > > safe way I've come up with would be to register used ports (or reserve
> > > port ranges) on a shared resource somewhere, which would make all x2go
> > > clients and servers dependent on that one resource.
> > 
> > What you have been describing above is the old, well known  
> > NXServer/NXClient problem.
> > 
> > In my experience it does not exist with X2go (at least not with  
> > pyhoca-gui ;-) ).
> > 
> > The connecting stuff is a bit tricky...
> > 
> >    - x2goclient connects via SSH and launches either x2gostartagent or
> >      x2goresume-session
> >    - x2gostartagent will try to detect a free server-side port >30000.
> >      This port is put into the session database table. (Make sure you do
> >      not have other server-side stuff running that claims ports between
> >      30000 and 30000++.)
> >    - x2goresume-session will pick the formerly detected port from the session
> >      database table (i.e. SQLite or postgres) and presume it is still unused
> >      (as it should be, unless something other than X2go has taken it:
> >      x2gostartagent avoids claiming ports that are marked in the database
> >      with state S (suspended) or R (running)).
> >    - say we detect a free server-side port, 30015: x2gostartagent will launch
> >      the XNest-like x2goagent XServer, listening for incoming connections on
> >      localhost:30015.
> > 
> >    - x2goclient now tries to set up a forwarding tunnel from
> >      client to server (-L 30000:localhost:30015)
> >    - if the local port 30000 (the first one in the above expression) is already
> >      in use, it simply selects another port. Each selected port is probed
> >      before usage. That is done by x2goclient.
> >    - say we end up with -L 30028:localhost:30015
> > 
> > The problem you observe, "the remote proxy closed the connection while  
> > negotiating …", from my point of view stems from uncleanly closed SSH  
> > connections. Try to restart the SSH daemon when this error occurs and  
> > you should be able to connect again. This error can also be caused by  
> > the sound/sshfs tunnels. Here the same cause applies: unclean shutdown  
> > of SSH connections.
> > 
> > Reasons may be:
> > 
> >    o unclean x2goclient/pyhoca-gui code
> >    o line failures (so the clients do not get the chance to cancel the port
> >      forwarding requests...)
> > 
> > You may do the following to debug:
> > 
> >    o check whether the problem occurs in the same way when you use pyhoca-gui/-cli
> >    o pyhoca-gui (i.e. python-x2go) tries to restore the mal-closure of SSH
> >      port forwardings... You may try to resume a session twice, the second
> >      time you should get a session window
> > 
> > The addressed issue was a big problem for Python X2go before version  
> > 0.1.0.0. It took me quite some sleepless nights to get rid of it by  
> > 90%+.
> <snip>
> I continue to have a devil of a time with this.  It seems like
> 
>    - x2goclient now tries to set up a forwarding tunnel from
>      client to server (-L 30000:localhost:30015)
>    - if the local port 30000 (the first one in the above expression) is already
>      in use, it simply selects another port. Each selected port is probed
>      before usage. That is done by x2goclient.
> 
> isn't working.  In the past, using older clients, we would generate
> error messages about a problem connecting on port 30001 but the sessions
> would all still work.
> 
> We are no longer seeing those errors.  Instead, the sessions are rudely
> dropped.  I've restarted all the X2Go servers and clients after deleting
> all the old session directories on both.  Still no success.  Here are
> the details.
> 
> In our environment, we have a single PostgreSQL database for all the
> X2Go servers. There is one X2Go server for each user (x2go client).
> I connect to the first server and I start a valid session.  I am using
> ports 30001 - 30003 on the server side and 30004 on the client side:
> tcp        0      0 0.0.0.0:30004           0.0.0.0:*               LISTEN
> 
> Server           User              Status  Last Connection           Run Time  Display  Client             GP      SP      FP  Context   Load       RSS      VSZ
> jasiii           jasiii            S       01.08.11 12:45:12       0d  0h  2m       51  74.75.231.235   30001   30002   30003    40016   0.04   434.2Mb    8.4Gb
> 
> I start a second x2goclient to connect to the second server.  The client
> is listening on port 30005:
> tcp        0      0 0.0.0.0:30004           0.0.0.0:*               LISTEN
> tcp        0      0 0.0.0.0:30005           0.0.0.0:*               LISTEN
> 
> The server side is using the same ports as the other x2go server:
> Server           User              Status  Last Connection           Run Time  Display  Client             GP      SP      FP  Context   Load       RSS      VSZ
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------
> cuser-a6         cuser-a6          R       01.08.11 12:45:06       0d  0h  1m       53  74.75.231.235   30001   30002   30003    40046   0.01   315.3Mb    5.9Gb
> jasiii           jasiii            S       01.08.11 12:45:12       0d  0h  2m       51  74.75.231.235   30001   30002   30003    40016   0.04   434.2Mb    8.4Gb
> 
> This is where the original session dies.  If I understand the logic,
> shouldn't there be a mapping from 30004 to localhost:30001 for the first
> session and another from 30005 to localhost:30001 for the second
> session? Why is the first session dropped as if there was no network
> connectivity?
> 
> This is what the server side session.log says:
> Error: Failure reading from the peer proxy.
> Error: Connection with remote peer broken.
> Error: Please check the state of your network and retry.
> Session: Display failure detected at 'Mon Aug  1 12:45:07 2011'.
> 
> Sometimes, it can get really ugly:
> Error: Failure reading from the peer proxy.
> Error: Connection with remote peer broken.
> Error: Please check the state of your network and retry.
> nxagentReconnectFailedFonts: WARNING! Font server tunneling not retrieved.
> *** glibc detected *** /usr/lib/x2go/x2goagent: corrupted double-linked list: 0x0000000001131d10 ***
> ======= Backtrace: =========
> /lib/libc.so.6(+0x71ad6)[0x7f05278fbad6]
> /lib/libc.so.6(+0x71f4d)[0x7f05278fbf4d]
> /lib/libc.so.6(+0x73418)[0x7f05278fd418]
> /lib/libc.so.6(cfree+0x6c)[0x7f052790084c]
> /usr/lib/x2go/libX11.so.6(XFreeFontPath+0x1d)[0x7f05295ff8dd]
> /usr/lib/x2go/x2goagent[0x485af8]
> /usr/lib/x2go/x2goagent[0x4ab421]
> /usr/lib/x2go/x2goagent[0x4ab840]
> /usr/lib/x2go/x2goagent[0x4ab95b]
> /usr/lib/x2go/x2goagent[0x48cb75]
> /usr/lib/x2go/x2goagent[0x450bbb]
> /usr/lib/x2go/x2goagent[0x45c65e]
> /usr/lib/x2go/x2goagent[0x429595]
> /usr/lib/x2go/x2goagent[0x455863]
> /lib/libc.so.6(__libc_start_main+0xfd)[0x7f05278a8c4d]
> /usr/lib/x2go/x2goagent[0x40aa69]
> 
> The client side says this:
> Loop: WARNING! Connected to remote version 3.4.0 with local version 3.5.0.
> Loop: WARNING! Unrecognized session type 'unix-kde-depth_24'. Assuming agent session.
> Proxy: PANIC! Failure reading from the peer proxy on FD#5.
> Loop: PANIC! No shutdown of proxy link performed by remote proxy.
> 
> So what changed? Is this an issue with libssh? Thanks - John
<snip>
I'm not sure where it is coming from but sometimes, not all the time, we
receive a message such as:

channel_open_session failed - Received SSH_MSG_DISCONNECT: Received ieof for nonexistent channel 0

Lots of Internet research didn't produce a lot of helpful information on
the error.

On the server side ssh logs I see:
Aug  1 12:45:07 jasiii sshd[22977]: error: connect_to localhost port 30001: failed.
Aug  1 12:45:07 jasiii sshd[22977]: channel_by_id: 0: bad id: channel free

I do not see anything of interest in the client ssh logs.

Out of curiosity, I tried reproducing this manually.  Both systems where
I can reliably reproduce this problem have a local smtp daemon.  I did a
"ssh -p <some hidden port> -L 40015:localhost:25 user1@machine1" and a
"ssh -p <some hidden port> -L 40016:localhost:25 user2@machine2".  I was
able to telnet to each on port 40015 and 40016 without an issue or one
disconnecting the other. Could there be a problem in the way libssh is
handling multiple channels? - John
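
For reference, the client-side probing step quoted above (start at local
port 30000 and move on to another port when one is already in use) can be
sketched in Python roughly as follows. This is only an illustration, not
x2goclient's actual code: the function name find_free_local_port, its
parameters, and the search limit of 100 ports are assumptions.

```python
import socket


def find_free_local_port(start=30000, limit=100, host="127.0.0.1"):
    """Return the first port in [start, start + limit) that can be bound.

    Hypothetical helper: it mimics "probe each candidate port before use"
    by attempting a bind and releasing the socket again right away.
    """
    for port in range(start, start + limit):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
            try:
                probe.bind((host, port))
            except OSError:
                continue  # port is taken, try the next candidate
            return port
    raise RuntimeError("no free port found in the probed range")


if __name__ == "__main__":
    local_port = find_free_local_port()
    # 30001 is the server-side port from the session listing above.
    print("ssh -L %d:localhost:30001 ..." % local_port)
```

A probe like this is inherently racy: another process can grab the port
between the probe and the moment the tunnel actually binds it, so the
tunnel setup still has to handle a bind failure.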



