Notes on a Deadlock Case

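The symptom was that communication between two processes failed even though the IPC message had in fact been sent. The gdb backtraces below were captured on the affected system.
pthread_join waits for a thread to terminate; it is a thread synchronization operation.
Thread 3 is tearing down the process. It holds an internal GLIBC pthread mutex and calls pthread_join() to wait for another thread (Thread 2) to exit. Before Thread 2 can exit, it needs the GLIBC pthread mutex that Thread 3 already holds: its backtrace shows it blocked in __pthread_mutex_lock, reached from cm_log_information through tls_get_addr_tail. Thread 2 and Thread 3 therefore block each other, and the resulting deadlock breaks IPC.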
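
The lock cycle described above can be reproduced in miniature. The following is only a sketch, not the code from the affected system: `g_lock`, `worker`, and `join_while_holding_lock` are hypothetical names, and an ordinary mutex stands in for the GLIBC-internal one. The worker uses pthread_mutex_timedlock so that the sketch reports the stall instead of hanging; with a plain pthread_mutex_lock the two threads would wait on each other forever.

```cpp
#include <pthread.h>
#include <time.h>
#include <cassert>
#include <cerrno>

// Hypothetical stand-in for the GLIBC-internal mutex seen in the backtrace.
static pthread_mutex_t g_lock = PTHREAD_MUTEX_INITIALIZER;

// Plays the role of Thread 2: it needs g_lock before it can finish.
// The lock attempt gives up after about 200 ms so the demo terminates.
static void* worker(void* arg) {
    int* result = static_cast<int*>(arg);
    timespec deadline;
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_nsec += 200L * 1000L * 1000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }
    *result = pthread_mutex_timedlock(&g_lock, &deadline);
    if (*result == 0) {
        pthread_mutex_unlock(&g_lock);
    }
    return nullptr;
}

// Plays the role of Thread 3: holds g_lock and joins the worker.
// Returns the worker's lock result; ETIMEDOUT means the worker could not
// get the lock while the joining thread was holding it.
int join_while_holding_lock() {
    int worker_result = 0;
    pthread_t tid;
    pthread_mutex_lock(&g_lock);
    int rc = pthread_create(&tid, nullptr, worker, &worker_result);
    assert(rc == 0);
    (void)rc;
    pthread_join(tid, nullptr);  // waits for the worker while still holding g_lock
    pthread_mutex_unlock(&g_lock);
    return worker_result;
}
```

Calling `join_while_holding_lock()` from `main` is expected to return ETIMEDOUT, because the joining thread never releases the lock while it waits.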

Thread 3 (Thread 0x2b8424e30700 (LWP 1389)):
#0  0x00002b84151be49d in pthread_join (threadid=47846560868816, thread_return=0x0) at pthread_join.c:90
#1  0x00002b8425037cb4 in AlarmReporterHaObserver::~AlarmReporterHaObserver (this=<optimized out>, __in_chrg=<optimized out>)
    at sw/se/xc/bsd/plat/sf/common/oam/fm/lib/alarm_reporter/alarm_reporter_api.cc:111
#2  0x0000003723e37dcf in __cxa_finalize (d=0x2b84252396c0) at cxa_finalize.c:56
#3  0x00002b8425037a61 in __do_global_dtors_aux () from /opt/ipos/lib/libalarm_reporter.so.0.0
#4  0x0000000000000000 in ?? ()

Thread 2 (Thread 0x2b842543a700 (LWP 1390)):
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00002b84151bf922 in __GI___pthread_mutex_lock (mutex=0x3723c22948 <_rtld_local+2312>) at ../nptl/pthread_mutex_lock.c:115
#2  0x0000003723a1170d in tls_get_addr_tail (ti=<optimized out>, dtv=<optimized out>, the_map=<optimized out>) at dl-tls.c:765
#3  0x00002b84146da145 in cm_log_information (event=3293053056, formatstring=<optimized out>) at sw/se/xc/bsd/plat/sf/common/oam/lib/cm_log/ipos_log/iposlog.cc:62
#4  0x00002b8425037b50 in AlarmReporterHaObserverRun (arg=<optimized out>) at sw/se/xc/bsd/plat/sf/common/oam/fm/lib/alarm_reporter/alarm_reporter_api.cc:53
#5  0x00002b84151bd2b4 in start_thread (arg=0x2b842543a700) at pthread_create.c:336
#6  0x0000003723ee819d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

(gdb) 
(gdb) print *(pthread_mutex_t *)0x3723c22948         
$7 = {__data = {__lock = 2, __count = 1, __owner = 1389, __nusers = 1, __kind = 1, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, 
  __size = "\002\000\000\000\001\000\000\000m\005\000\000\001\000\000\000\001", '\000' <repeats 22 times>, __align = 4294967298}
(gdb)

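Printing the pthread_mutex_t shows __lock = 2 (in glibc, locked with at least one waiter) and __owner = 1389, which is the LWP of Thread 3. The mutex that Thread 2 is waiting on is held by the very thread that is joining it, which confirms the deadlock.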
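
To get a feel for the fields gdb printed, the same members can be read from a mutex in a small test program. This is a sketch that assumes glibc on Linux: `__data.__lock`, `__data.__count`, `__data.__owner`, and `__data.__kind` are glibc implementation details, not a public API, and may differ on other C libraries. `MutexSnapshot`, `snapshot_locked_recursive_mutex`, and `current_lwp` are hypothetical names introduced for illustration.

```cpp
#include <pthread.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <cassert>

// Copy of the glibc-internal fields that gdb showed for the stuck mutex.
struct MutexSnapshot {
    int lock;
    unsigned int count;
    int owner;
    int kind;
};

// Kernel thread ID (the "LWP" number in gdb's thread list).
long current_lwp() {
    return syscall(SYS_gettid);
}

// Locks a recursive mutex once and records its internal state while held.
MutexSnapshot snapshot_locked_recursive_mutex() {
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);

    pthread_mutex_t m;
    int rc = pthread_mutex_init(&m, &attr);
    assert(rc == 0);
    (void)rc;
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(&m);
    // glibc-specific: read the private members while the lock is held.
    MutexSnapshot s{m.__data.__lock, m.__data.__count,
                    m.__data.__owner, m.__data.__kind};
    pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);
    return s;
}
```

On glibc, the snapshot's `owner` should equal `current_lwp()`, in the same way that `__owner = 1389` in the dump matches the LWP of Thread 3, and `kind` should be 1 for a recursive mutex, as in the dump.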
