[paramiko] Threads getting stuck during SFTP using Paramiko.

Yateen Joshi yateenjoshi at gmail.com
Mon May 28 08:22:28 PDT 2007


Additionally,

If I disable prefetch, the problem is not seen.

Yateen V. Joshi


On 5/26/07, Yateen Joshi <yateenjoshi at gmail.com> wrote:
>
> Hi,
>
>
>
> The simple nature of the problem is as follows –
>
>
>
> I am periodically getting files from a remote machine (which is having
> openSSH) using Paramiko sftp client as a part of my python script.
>
> The files keep on generating at the remote server and my application
> fetches it on a regular basis. It is observed that the application gets
> stuck after some time and it no longer fetches any file.
>
>
>
> Here is the analysis –
>
>
>
> I observed in the paramiko logs that it is stuck after a certain number of
> rekeys. To easily reproduce the problem, I reduced the number of bytes and
> packets after which the rekey happens. Default is 2 power 30, I made them 2
> power 13 and the problem gets reproduced easily.
>
> Just to verify my logic, I disabled this rekeying mechanism (instead of
> incrementing the bytes and packets in read_message, I kept them constant)
> and found the problem   was NOT reproduced.
>
>
>
> Since my application was running on a Solaris (9) machine, I took a gcore
> of the same and found that one of the threads was stuck. I put more logs in
> paramiko code, and found that it got stuck in channel::_feed() method while
> trying to acquire lock.
>
>
>
> Here are the paramiko logs -
>
>
>
> DEB [20070526-19:21:47.001] thr=1   paramiko.transport.1 : releasing lock
> - channel::_feed
>
> DEB [20070526-19:21:47.002] thr=1   paramiko.transport.1 : released lock -
> channel::_feed
>
> DEB [20070526-19:21:47.002] thr=1   paramiko.transport : calling
> send_kex_init in need key
>
> DEB [20070526-19:21:47.002] thr=1   paramiko.transport : acquiring clear
> to send lock - _send_kex_init
>
> DEB [20070526-19:21:47.002] thr=1   paramiko.transport : acquired clear to
> send lock - _send_kex_init
>
> DEB [20070526-19:21:47.003] thr=2   paramiko.transport.1.sftp :
> stat('/records/edr/NC38_RB_05262007190750_000_000057032.gz')
>
> DEB [20070526-19:21:47.003] thr=1   paramiko.transport : releasing lock -
> clear to send lock _send_kex_init
>
> DEB [20070526-19:21:47.004] thr=2   paramiko.transport.1 : acquiring lock
> - channel::send
>
> DEB [20070526-19:21:47.004] thr=1   paramiko.transport : released lock -
> clear to send lock _send_kex_init
>
> DEB [20070526-19:21:47.005] thr=2   paramiko.transport.1 : acquired lock -
> channel::send
>
> DEB [20070526-19:21:47.007] thr=1   paramiko.transport : coming out of
> _send_kex_init
>
> DEB [20070526-19:21:47.008] thr=1   paramiko.transport : called
> send_kex_init in need key
>
> DEB [20070526-19:21:47.011] thr=1   paramiko.transport.1 : acquiring lock
> - channel::_feed
>
>
>
>
>
> Here is the dbx output –
>
>
>
> Reading parser.so
>
> t at 2 (l at 2) terminated by signal 0 (UNKNOWN SIGNAL)
>
> 0xfee354b0: __lwp_park+0x0010:  ta       8
>
> Current function is PyThread_acquire_lock
>
> dbx.11: warning: can't find file "/sol10/SOURCES/S8/Python-2.4.2
> /Python/thread_pthread.h"
>
> dbx.11: warning: see `help finding-files'
>
> (dbx.11)
>
> (dbx.11) where
>
> current thread: t at 2
>
> dbx.11: warning: invalid type for LCSYM/STSYM stab:
> /export/home/edrserver/ess/3rdparty/python/solaris/bin/python2.4:Objects/classobject.c
> stab #10921 docstr:V(49,10)
>
>   [1] __lwp_park(0x0, 0x0, 0x0, 0x1, 0xfee48000, 0x206288), at 0xfee354b0
>
>   [2] cond_wait_queue(0x2035c0, 0xfee48c08, 0x0, 0x0, 0xfed70200,
> 0xfee48000), at 0xfee326b8
>
>   [3] _cond_wait_cancel(0x2035c0, 0x2035d0, 0x7c, 0xfed29138, 0x140000,
> 0x0), at 0xfee32e74
>
>   [4] _pthread_cond_wait(0x2035c0, 0x2035d0, 0xfe8fb08c, 0x0, 0x1b, 0x1),
> at 0xfee32eb0
>
> =>[5] PyThread_acquire_lock(lock = 0x2035b8, waitflag = 2110928), line 416
> in "thread_pthread.h"
>
>   [6] lock_PyThread_acquire_lock(self = 0x17d4c0, args = 0x141030), line
> 63 in "threadmodule.c"
>
>   [7] PyCFunction_Call(func = 0x334198, arg = 0x141030, kw = (nil)), line
> 94 in "methodobject.c"
>
>   [8] call_function(pp_stack = 0xfe8fb290, oparg = 0), line 3558 in "
> ceval.c"
>
>   [9] PyEval_EvalFrame(f = 0x59deb0), line 2163 in "ceval.c"
>
>   [10] PyEval_EvalCodeEx(co = 0x278060, globals = (nil), locals = 0x81da0,
> args = 0x327b54, argcount = 2, kws = (nil), kwcount = 0, defs = (nil),
> defcount = 0, closure = (nil)), line 2736 in " ceval.c"
>
>   [11] function_call(func = 0x27ea70, arg = 0x327b48, kw = (nil)), line
> 548 in "funcobject.c"
>
>   [12] PyObject_Call(func = 0x27ea70, arg = 0x327b48, kw = (nil)), line
> 1756 in "abstract.c"
>
>   [13] instancemethod_call(func = 0x27ea70, arg = 0x327b48, kw = (nil)),
> line 2447 in "classobject.c "
>
>   [14] PyObject_Call(func = 0x27ea70, arg = 0x327b48, kw = (nil)), line
> 1756 in "abstract.c"
>
>   [15] do_call(func = 0x1e5378, pp_stack = 0xfe8fb890, na = 2, nk =
> 3308360), line 3766 in "ceval.c "
>
>   [16] call_function(pp_stack = 0xfe8fb890, oparg = 2), line 3581 in "
> ceval.c"
>
>   [17] PyEval_EvalFrame(f = 0x5f0488), line 2163 in "ceval.c"
>
>   [18] fast_function(func = 0x3320b0, pp_stack = 0x18d420, n = 1, na =
> 6211252, nk = 6227080), line 3640 in " ceval.c"
>
>   [19] call_function(pp_stack = 0xfe8fba38, oparg = 1), line 3579 in "
> ceval.c"
>
>   [20] PyEval_EvalFrame(f = 0x18d2c0), line 2163 in "ceval.c"
>
>   [21] PyEval_EvalCodeEx(co = 0x2238a0, globals = (nil), locals = 0x81da0,
> args = 0x1edd9c, argcount = 1, kws = (nil), kwcount = 0, defs = (nil),
> defcount = 0, closure = (nil)), line 2736 in " ceval.c"
>
>   [22] function_call(func = 0x244430, arg = 0x1edd90, kw = (nil)), line
> 548 in "funcobject.c"
>
>   [23] PyObject_Call(func = 0x244430, arg = 0x1edd90, kw = (nil)), line
> 1756 in "abstract.c"
>
>   [24] instancemethod_call(func = 0x244430, arg = 0x1edd90, kw = (nil)),
> line 2447 in "classobject.c "
>
>   [25] PyObject_Call(func = 0x244430, arg = 0x1edd90, kw = (nil)), line
> 1756 in " abstract.c"
>
>   [26] PyEval_CallObjectWithKeywords(func = 0x31d828, arg = 0x141030, kw =
> (nil)), line 3425 in "ceval.c "
>
>   [27] t_bootstrap(boot_raw = 0x6abfd8), line 434 in "threadmodule.c"
>
> (dbx.11) threads
>
>       t at 1  a  l at 1    ?()   running          in  _poll()
>
>  >    t at 2  b   l at 2   t_bootstrap()   sleep on 0x2035c0  in  __lwp_park()
>
> (dbx.11)
>
> (dbx.11)
>
> (dbx.11)
>
> (dbx.11)
>
> (dbx.11)
>
> (dbx.11)
>
> (dbx.11) thread -blockedby t at 1
>
> Thread t at 1 is not asleep
>
> (dbx.11) thread -blockedby t at 2
>
> Thread t at 2 is blocked by:
>
> 0x002035c0 (0x2035c0): thread  condition variable
>
> (dbx.11) thread -blocks t at 1
>
> Locks held by t at 1:
>
> (dbx.11) thread -blocks t at 2
>
> Locks held by t at 2:
>
> (dbx.11) thread -info t at 1
>
>         Thread t at 1 (0xfed70000) at priority 0
>
>         state: active on    l at 1
>
>         base function: 0x0: 0x00000000() stack: 0xffc00000[8388608]
>
>         flags: (none)
>
>         masked signals: (none)
>
>         Currently active in _poll
>
> (dbx.11) thread -info t at 2
>
>         Thread t at 2 (0xfed70200) at priority 0
>
>         state: asleep on 0x002035c0 (0x2035c0): thread  condition variable
>
>
>         base function: 0xc54f8: t_bootstrap() stack: 0xfe8fc000[1032192]
>
>         flags: BOUND|DETACHED|SUSPENDED
>
>         masked signals: (none)
>
>         Currently inactive in __lwp_park
>
> (dbx.11)
>
>
>
>
>
> As can be seen from the dbx and paramiko logs,  thread  is
> channel::_feed() and thread2 is channel::send().
>
> The thread 2 (send) is waiting in an infinite while loop (please see code
> below) to get a variable set (clear_to_send), however, that will be set
> after some data is received. The other thread that might receive the data is
> now stuck   in channel::_feed().
>
> transport.py -
>     def _send_user_message(self, data):
>         """
>         send a message, but block if we're in key negotiation.  this is
> used
>         for user-initiated requests.
>         """
>         while True:
>             self.clear_to_send.wait(0.1)
>             if not self.active:
>                 self._log(DEBUG, 'Dropping user packet because connection
> is dea
> d.')
>                 return
>             self.clear_to_send_lock.acquire()
>             if self.clear_to_send.isSet():
>                 break
>             self.clear_to_send_lock.release()
>         try:
>             self._send_message(data)
>         finally:
>             self.clear_to_send_lock.release()
>
>
> I tried to use pdb (python debugger), however, did not have any luck. Can
> you give any pointers as to why this is happening and reason for the thread
> getting stuck? Also, any pointers why this is related to rekeying?
>
>
>
> Appreciating your response in advance,
>
>
>
> Best regards,
>
>
>
>
>
> Yateen Vishnu Joshi
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.lag.net/pipermail/paramiko/attachments/20070528/49cf7d25/attachment.htm 


More information about the paramiko mailing list