Reentrancy and thread safety: how they relate and how they differ

coredump posted on 23-12-2009 17:35:51

Wikipedia's definition of thread safety: Thread safety is a computer programming concept applicable in the context of multi-threaded programs. A piece of code is thread-safe if it functions correctly during simultaneous execution by multiple threads.
In other words, if a piece of code (usually a single function) is thread-safe, it must always execute "correctly" in a multi-threaded environment.

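The definition of reentrancy:
A computer program or routine is described as reentrant if the routine can be re-entered while it is already running (i.e. it can be safely executed concurrently).
A function is reentrant if it can be invoked again while an earlier invocation is still running. Concurrent execution by multiple threads is obviously one such situation.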

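So are reentrancy and thread safety the same thing? After all, their definitions and the scenarios they describe look very similar. And if they are not the same, how easy is it to name one or two examples of code that is thread-safe but not reentrant, or reentrant but not thread-safe?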

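Let's first analyze the definition of thread safety. Two points matter most:
1. Multiple threads run concurrently, which means the code may be executing "at the same time" in more than one thread.
2. It always executes correctly.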

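So when does code go wrong under multithreading? That is actually hard to answer, and even very experienced programmers often get it wrong. Broadly speaking, though, you should suspect that code is unsafe in a multi-threaded environment if it does any of the following.
From Wikipedia:
The following are signs that code may not be thread-safe:
[*] accessing global variables or the heap // accesses global variables or heap memory
[*] allocating/reallocating/freeing resources that have global scope (files, sub-processes, etc.) // allocates, reallocates, or frees global resources (most commonly I/O resources)
[*] indirect accesses through handles or pointers // accesses resources indirectly through pointers
[*] any visible side-effect (e.g., access to volatile variables in the C programming language) // runs any code with side effects; most of these are also I/O
The following conditions make code thread-safe:
[*] the only variables it uses are from the stack // uses only stack memory
[*] execution depends only on the arguments passed in, and // the code's behavior depends only on its arguments
[*] the only subroutines it calls have the same properties. // any functions it calls also produce results that depend only on their arguments
Such a sub-routine is sometimes called a "pure function", and is much like a mathematical function. // this is the idea behind functional programming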
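To make the two lists concrete, here is a minimal sketch in standard C++. The names `g_total`, `add_to_total`, `sum_squares`, and `run_pure_concurrently` are made up for illustration and are not from the Wikipedia article. The first function touches a global and would race if called from several threads; the second uses only its arguments and stack variables, so it can be called concurrently without any locking.

```cpp
#include <cassert>
#include <thread>
#include <vector>

// Not thread-safe: every call reads and writes the same global variable,
// so unsynchronized concurrent calls would race on it.
int g_total = 0;

int add_to_total(int x) {
    g_total += x;
    return g_total;
}

// Thread-safe: uses only its arguments and a local (stack) variable.
int sum_squares(int a, int b) {
    int result = a * a + b * b;
    return result;
}

// Calls the pure function from several threads at once. Each thread writes
// only to its own element of the results vector, so nothing is shared.
std::vector<int> run_pure_concurrently(int num_threads) {
    std::vector<int> results(num_threads, 0);
    std::vector<std::thread> workers;
    for (int i = 0; i < num_threads; ++i) {
        workers.emplace_back([&results, i] { results[i] = sum_squares(i, i + 1); });
    }
    for (std::thread& w : workers) w.join();
    return results;
}
```

Calling `run_pure_concurrently(4)` gives the same values every time, whatever order the threads happen to run in. `add_to_total` is only safe when a single thread calls it.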

If you cannot avoid the thread-unsafe cases above and still need thread safety, you have to use locks, thread-local storage, or similar techniques. The main options are:

Mutual exclusion // use locks to serialize concurrent access to critical resources
Access to shared data is serialized using mechanisms that ensure only one thread reads or writes the shared data at any time. Great care is required if a piece of code accesses multiple shared pieces of data; problems include race conditions, deadlocks, livelocks, starvation, and various other ills enumerated in many operating systems textbooks.

Thread-local storage // a way of getting around global variables
Variables are localized so that each thread has its own private copy. These variables retain their values across subroutine and other code boundaries, and are thread-safe since they are local to each thread, even though the code which accesses them might be reentrant.

Atomic operations // an operation such as i++ is generally not atomic. An atomic operation is one that cannot be interrupted by a thread switch. Modern instruction sets usually provide instructions for atomic add, subtract, and similar operations
Shared data are accessed by using atomic operations which cannot be interrupted by other threads. This usually requires using special machine language instructions, which might be available in a runtime library. Since the operations are atomic, the shared data are always kept in a valid state, no matter what other threads access it. Atomic operations form the basis of many thread locking mechanisms.
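The three techniques can be sketched side by side with the C++11 standard library. This is only an illustration under my own assumptions: the counter names and the `run_workers` helper are hypothetical, and real code would normally pick one technique rather than all three.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// 1. Mutual exclusion: a lock serializes access to the shared counter.
int locked_count = 0;
std::mutex count_mutex;

void locked_increment() {
    std::lock_guard<std::mutex> guard(count_mutex);
    ++locked_count;
}

int locked_value() {
    std::lock_guard<std::mutex> guard(count_mutex);
    return locked_count;
}

// 2. Thread-local storage: every thread gets its own private copy.
thread_local int per_thread_count = 0;

int tls_increment() { return ++per_thread_count; }

// 3. Atomic operation: the increment is a single indivisible step.
std::atomic<int> atomic_count{0};

void atomic_increment() { atomic_count.fetch_add(1); }

struct Totals {
    int locked;      // increments recorded by the mutex-protected counter
    int atomic;      // increments recorded by the atomic counter
    int caller_tls;  // change in the calling thread's thread-local copy
};

Totals run_workers(int num_threads, int iterations) {
    const int locked_before = locked_value();
    const int atomic_before = atomic_count.load();
    const int tls_before = per_thread_count;

    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([iterations] {
            for (int k = 0; k < iterations; ++k) {
                locked_increment();
                tls_increment();
                atomic_increment();
            }
            // A new thread's copy starts at 0 and is touched by nobody else.
            assert(per_thread_count == iterations);
        });
    }
    for (std::thread& w : workers) w.join();

    return Totals{locked_value() - locked_before,
                  atomic_count.load() - atomic_before,
                  per_thread_count - tls_before};
}
```

With 4 threads and 1000 iterations each, the locked and atomic counters both advance by 4000, while the calling thread's thread-local copy does not change at all, because the workers only ever touched their own copies.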

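For reentrancy: To be reentrant, a computer program or routine:
[*] Must hold no static (or global) non-constant data. // no static or global variables
[*] Must not return the address to static (or global) non-constant data. // must not return a pointer to a static or global variable (read-only data is fine)
[*] Must work only on the data provided to it by the caller. // may depend only on the arguments passed in
[*] Must not rely on locks to singleton resources. // must not rely on locks or singleton resources
[*] Must not modify its own code. (unless executing in its own unique thread storage) // must not modify its own code (or data) except in thread-local storage; many languages can modify their own code at run time, and C/C++ can too, though it is tricky
[*] Must not call non-reentrant computer programs or routines. // must not call non-reentrant functions (a typical recursive definition in computing)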
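A small sketch of the first three rules, using hypothetical helpers (`shout_static`, `shout`, `demo_shout`) that are not from any library. The first version keeps its result in a static buffer and returns its address, so a second call overwrites what the first call returned. The second version works only on storage supplied by the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Upper-cases one ASCII character without calling into any library.
char upper_ascii(char c) {
    return (c >= 'a' && c <= 'z') ? static_cast<char>(c - 'a' + 'A') : c;
}

// Not reentrant: holds static non-constant data and returns its address.
const char* shout_static(const char* text) {
    static char buffer[64];
    std::size_t i = 0;
    for (; text[i] != '\0' && i + 1 < sizeof buffer; ++i) {
        buffer[i] = upper_ascii(text[i]);
    }
    buffer[i] = '\0';
    return buffer;
}

// Reentrant: works only on data and storage provided by the caller.
void shout(const char* text, char* out, std::size_t out_size) {
    if (out_size == 0) return;
    std::size_t i = 0;
    for (; text[i] != '\0' && i + 1 < out_size; ++i) {
        out[i] = upper_ascii(text[i]);
    }
    out[i] = '\0';
}

void demo_shout() {
    const char* first = shout_static("ab");
    const char* second = shout_static("cd");
    assert(first == second);                // both point at the same buffer
    assert(std::strcmp(first, "CD") == 0);  // the earlier "AB" was overwritten

    char a[8];
    char b[8];
    shout("ab", a, sizeof a);
    shout("cd", b, sizeof b);
    assert(std::strcmp(a, "AB") == 0);      // each caller keeps its own result
    assert(std::strcmp(b, "CD") == 0);
}
```

The overwrite shown in `demo_shout` happens even in a single thread, which is why the static version fails the checklist regardless of whether threads are involved.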

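So:
Reentrancy and thread safety both concern how a function handles resources, but they differ. Reentrancy affects a function's external interface, whereas thread safety concerns only its implementation.
[*] In most cases, making a non-reentrant function reentrant requires changing its interface so that all data is supplied by the caller.
[*] Making a non-thread-safe function thread-safe only requires changing its implementation, usually by adding synchronization to protect shared resources from simultaneous access by several threads.
Therefore, compared with thread safety, reentrancy is the more fundamental property, and it guarantees thread safety: under these definitions, all reentrant functions are thread-safe, but not all thread-safe functions are reentrant.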
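The interface versus implementation point can be sketched with a toy ID generator. All three functions below are hypothetical names of my own. The thread-safe version keeps the original signature and only changes the body, while the reentrant version has to change the signature so that the caller owns the state.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Original: hidden static state, neither thread-safe nor reentrant.
int next_id_unsafe() {
    static int last = 0;
    return ++last;
}

// Thread-safe: same signature as before, only the body changed (lock added).
// It still relies on a lock and on static data, so it is not reentrant.
int next_id_locked() {
    static std::mutex m;
    static int last = 0;
    std::lock_guard<std::mutex> guard(m);
    return ++last;
}

// Reentrant: the signature changed, and the caller now supplies the state.
int next_id(int& last) {
    return ++last;
}

// Counts how many IDs the worker threads drew from next_id_locked().
int ids_issued_by_workers(int num_threads, int calls_per_thread) {
    const int before = next_id_locked();
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([calls_per_thread] {
            for (int k = 0; k < calls_per_thread; ++k) next_id_locked();
        });
    }
    for (std::thread& w : workers) w.join();
    const int after = next_id_locked();
    return after - before - 1;  // IDs handed out between the two probes
}
```

Existing callers of `next_id_unsafe()` can switch to `next_id_locked()` without touching their own code, whereas moving to `next_id(int&)` forces every call site to create and pass its own counter.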

However, the above generally applies to functions. For C++ classes, the relationship is reversed:
From: http://doc.qt.nokia.com/4.6/threads-reentrancy.html#reentrant
A thread-safe function can be called simultaneously from multiple threads, even when the invocations use shared data, because all references to the shared data are serialized. A reentrant function can also be called simultaneously from multiple threads, but only if each invocation uses its own data.

Hence, a thread-safe function is always reentrant, but a reentrant function is not always thread-safe.
C++ classes are often reentrant, simply because they only access their own member data. Any thread can call a member function on an instance of a reentrant class, as long as no other thread can call a member function on the same instance of the class at the same time.

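For example, this is a reentrant C++ class:
class Counter
{
public:
    Counter() { n = 0; }

    void increment() { ++n; }
    void decrement() { --n; }
    int value() const { return n; }

private:
    int n;  // per-instance state: each Counter object has its own n
};
If n here were changed to a static or global variable, the class would no longer be reentrant.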
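As a rough illustration of what reentrant means for this class, the sketch below gives every thread its own Counter instance. It uses `std::thread` rather than Qt threads, and the helper `count_per_thread` is a made-up name. No locking is needed because no two threads ever touch the same instance.

```cpp
#include <cassert>
#include <thread>
#include <vector>

// Same shape as the Counter above: all state lives in the instance.
class Counter {
public:
    void increment() { ++n; }
    void decrement() { --n; }
    int value() const { return n; }

private:
    int n = 0;
};

// Each thread increments only its own Counter, so no locking is required.
std::vector<int> count_per_thread(int num_threads, int increments) {
    std::vector<Counter> counters(num_threads);  // one instance per thread
    std::vector<std::thread> workers;
    for (int i = 0; i < num_threads; ++i) {
        workers.emplace_back([&counters, i, increments] {
            for (int k = 0; k < increments; ++k) counters[i].increment();
        });
    }
    for (std::thread& w : workers) w.join();

    std::vector<int> values;
    for (const Counter& c : counters) values.push_back(c.value());
    return values;
}
```

If two of the threads were handed the same index instead, the unsynchronized `++n` would be a data race, which is exactly the situation the thread-safe variant addresses.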

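If it needs to be thread-safe:
#include <QMutex>
#include <QMutexLocker>

class Counter
{
public:
    Counter() { n = 0; }

    // Each member function holds the mutex for the duration of the call.
    void increment() { QMutexLocker locker(&mutex); ++n; }
    void decrement() { QMutexLocker locker(&mutex); --n; }
    int value() const { QMutexLocker locker(&mutex); return n; }

private:
    mutable QMutex mutex;  // mutable so that the const value() can lock it
    int n;
};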
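For readers without Qt at hand, roughly the same idea can be written with the standard library. This is an analogue I am assuming is equivalent in spirit, with `std::mutex` and `std::lock_guard` standing in for `QMutex` and `QMutexLocker`. `SafeCounter` and `count_shared` are illustrative names.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Standard-library analogue of the thread-safe Counter above.
class SafeCounter {
public:
    void increment() { std::lock_guard<std::mutex> lock(mutex_); ++n_; }
    void decrement() { std::lock_guard<std::mutex> lock(mutex_); --n_; }
    int value() const { std::lock_guard<std::mutex> lock(mutex_); return n_; }

private:
    mutable std::mutex mutex_;  // mutable so that the const value() can lock it
    int n_ = 0;
};

// All threads share ONE instance; the mutex serializes every access to n_.
int count_shared(int num_threads, int increments) {
    SafeCounter counter;
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counter, increments] {
            for (int k = 0; k < increments; ++k) counter.increment();
        });
    }
    for (std::thread& w : workers) w.join();
    return counter.value();
}
```

`count_shared(4, 1000)` returns 4000 on every run. Without the locks the same test could lose increments, since `++n_` is a read, an add, and a write.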

someonehappy posted on 23-12-2009 19:37:03

So "reentrant" here means being entered several times at once, not being entered again afterwards.

key posted on 23-12-2009 21:55:28

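I haven't read this closely, but I think "reentrant" is close in meaning to "stateless" (I haven't worked out whether they are exact synonyms).
Thread safety, on the other hand, is about whether critical resources are properly protected. If there are no critical resources, the code is of course thread-safe; if there are, they need to be protected.
Whether having no critical resources is the same as being "stateless", I'm not entirely sure either...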

There probably aren't many people at my poor level who still jump in and embarrass themselves, so please bear with me.

woodheadz posted on 24-12-2009 03:08:28

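There's one case I'm not sure how to classify:
If a function takes a pointer to a resource as a parameter and modifies that resource inside the function, it doesn't seem to violate any of the "reentrant" criteria listed above, does it?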

[ Last edited by woodheadz on 24-12-2009 03:37 ]

key posted on 24-12-2009 09:03:07

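I think that if the resource itself is passed in as a parameter, the function should count as "reentrant".
If reading and writing that resource has thread-safety problems, then the function cannot be considered thread-safe.

So from this angle, reentrant functions are not a subset of thread-safe functions.
This is just my personal view.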
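The case under discussion can be sketched as follows. `deposit` and the two helper functions are hypothetical names of my own. `deposit` satisfies the checklist above, since it has no static data and works only on what the caller passes in, yet whether a call is safe depends entirely on whether the callers share the pointed-to resource.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// No static or global state: it works only on the data the caller provides.
void deposit(int* balance, int amount) {
    *balance += amount;
}

// Safe without locks: every thread passes a pointer to its own balance.
std::vector<int> separate_deposits(int num_threads, int per_thread) {
    std::vector<int> balances(num_threads, 0);
    std::vector<std::thread> workers;
    for (int i = 0; i < num_threads; ++i) {
        workers.emplace_back([&balances, i, per_thread] {
            for (int k = 0; k < per_thread; ++k) deposit(&balances[i], 1);
        });
    }
    for (std::thread& w : workers) w.join();
    return balances;
}

// All threads pass the SAME pointer. Calling deposit() directly here would be
// a data race, so the callers serialize their calls with a mutex.
int shared_deposits(int num_threads, int per_thread) {
    int balance = 0;
    std::mutex balance_mutex;
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&balance, &balance_mutex, per_thread] {
            for (int k = 0; k < per_thread; ++k) {
                std::lock_guard<std::mutex> guard(balance_mutex);
                deposit(&balance, 1);
            }
        });
    }
    for (std::thread& w : workers) w.join();
    return balance;
}
```

In this sketch the responsibility for protecting the shared resource sits with the callers, not with `deposit` itself, which matches the Qt wording that a reentrant function is safe only if each invocation uses its own data.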

coredump posted on 24-12-2009 11:07:02

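Originally posted by woodheadz on 24-12-2009 02:08:
There's one case I'm not sure how to classify:
If a function takes a pointer to a resource as a parameter and modifies that resource inside the function, it doesn't seem to violate any of the "reentrant" criteria listed above, does it?

This case is specifically covered:
Must not rely on locks to singleton resources. // must not rely on locks or singleton resources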

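That is, if the resource is a singleton resource, meaning it can only be accessed serially, or if the function that accesses the resource is non-reentrant for some other reason, then this function is not reentrant.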

GPS posted on 24-12-2009 22:12:30

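Here is how I understand it.
With thread-safe and reentrant, you first have to ask what they are being applied to. In a procedural language such as C, they apply to functions. In class-based languages, they apply to classes.
A reentrant function is one whose body can run in two threads at the same time. A thread-safe function is one that can be called from different threads at the same time. The difference is subtle.
A function can fail to be reentrant in two ways. In the first, the language allows it to run concurrently, but because it operates on shared resources (global or static variables in C) the result is indeterminate; such a function is not thread-safe either. In the second, it cannot actually run concurrently, for example because it uses a mutex to lock a shared resource: the function can be called concurrently, but the calls execute serially. That function is thread-safe, which is the case coredump described above.
So in procedural languages, the thread-safe functions include the reentrant ones.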

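In class-based languages, the terms apply to classes. In Qt, reentrant refers to whether member functions can be called at the same time on different instances of a class from different threads. Since member variables normally belong to individual instances, a class is reentrant as long as it does not touch global or static variables. coredump's class example is exactly this: when judging whether Counter is reentrant, the point is that each instance has its own copy of n.
In Qt, thread-safe refers to whether member functions of the same instance can be called at the same time from different threads. Because there is only one instance, ordinary member variables also become shared resources and cannot be operated on concurrently. That is, within a single Counter instance, n has to be protected. In Qt, reentrancy and thread safety of classes overlap. This is probably what coredump described at the start of the thread.
I don't know how other class-based languages define these terms.
Here is an additional note from the Qt documentation.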

Reentrancy and Thread-Safety

Throughout the Qt documentation, the terms reentrant and thread-safe are used to specify how a function can be used in multithreaded applications:

A reentrant function can be called simultaneously by multiple threads provided that each invocation of the function references unique data.
A thread-safe function can be called simultaneously by multiple threads when each invocation references shared data. All access to the shared data is serialized.
By extension, a class is said to be reentrant if each and every one of its functions can be called simultaneously by multiple threads on different instances of the class. Similarly, the class is said to be thread-safe if the functions can be called by different threads on the same instance.

Classes in the documentation will be documented as thread-safe only if they are intended to be used by multiple threads.

Note that the terminology in this domain isn't entirely standardized. POSIX uses a somewhat different definition of reentrancy and thread-safety for its C APIs. When dealing with an object-oriented C++ class library such as Qt, the definitions must be adapted.

Most C++ classes are inherently reentrant, since they typically only reference member data. Any thread can call such a member function on an instance of the class, as long as no other thread is calling a member function on the same instance.

[ Last edited by GPS on 24-12-2009 22:15 ]

FreeOZ Forum's Archiver
