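Why do we need multithreading?
Many people would answer "for high performance". I think that is a misunderstanding: multithreading does not equal high performance, and from the point of view of a (single-core) CPU, a single thread is what delivers the highest performance.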
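For a purely computational task, one thread running straight through from start to finish is the fastest option, because scheduling, communication, and resource sharing between threads all carry extra overhead. In the early days of computing, operating systems had no concept of threads. In effect the whole system ran a single thread of execution that contained both operating-system code and application code. The drawback was obvious: once an application misbehaved, the only remedy was a reset. Still, I believe multithreading came about because CPUs got too fast. Since computers were invented, CPU speed has improved extremely rapidly. That does not mean we split work across threads simply because the CPU is fast. Running a computer is never the CPU's job alone; data also has to be stored and transferred (programs are data plus algorithms: the algorithms depend on the CPU, the data on storage). Data access fell far behind the CPU, and when the Internet arrived later, network transfer speeds were even less of a match for it.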
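So my view is that multithreading is an inevitable result of technological progress. Imagine an application that has computed its result but cannot store it or send it out, and how frustrating that wait would be. (Multithreading brings other benefits too, such as isolation between applications.) In practice, where multithreading is used, and where it must be used, follows a clear pattern: most operations that are bound to block end up using threads. Why do they block? These operations are usually disk reads or network request handling; they are constrained by their environment and have to block, and we cannot just sit there waiting.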
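To make the point about blocking concrete, here is a minimal Python sketch. It is only an illustration under assumptions: `fake_blocking_io` is a hypothetical stand-in that sleeps instead of touching a real disk or network, and the delays are made-up numbers. It compares waiting for several blocking operations one after another with waiting for them on separate threads.

```python
import threading
import time


def fake_blocking_io(seconds):
    """Hypothetical stand-in for a disk read or network request: it only waits."""
    time.sleep(seconds)


def run_sequentially(delays):
    """Wait for each blocking call in turn and return the elapsed seconds."""
    start = time.perf_counter()
    for delay in delays:
        fake_blocking_io(delay)
    return time.perf_counter() - start


def run_in_threads(delays):
    """Give each blocking call its own thread so the waits overlap."""
    start = time.perf_counter()
    threads = [threading.Thread(target=fake_blocking_io, args=(d,)) for d in delays]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    delays = [0.1] * 5  # five pretend I/O operations of 0.1 s each
    print(f"sequential: {run_sequentially(delays):.2f}s")
    print(f"threaded:   {run_in_threads(delays):.2f}s")
```

The threaded version does no more useful computation than the sequential one. It finishes sooner only because the waiting overlaps, which is the case for threads made in this post.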
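As for threads and performance, here is my brief view. Multithreading must never be equated with high performance, yet its extra overhead is also not large enough for users to notice. Many authoritative books stress that thread switching incurs a very large performance cost, while practical experience suggests the cost is very hard to perceive. I suspect the books start from a different perspective: for the CPU itself, and given that Windows today routinely runs two or three thousand threads, it is something that should not be ignored.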
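Let's start with the overhead of a thread
These are the things every thread has to have (on Windows), which means creating a thread requires creating all of them. When I said thread switching is hard to notice, I did not mean thread creation: creating a large number of threads really does take a lot of time. This is why higher-level frameworks provide a thread pool or something similar, to soften the cost of frequently creating and destroying threads.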
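As a rough sketch of the thread-pool idea, the Python snippet below runs the same tiny job two ways: once with a brand-new thread per task, and once on a small pool of reused threads from the standard library's `concurrent.futures.ThreadPoolExecutor`. The task (`tiny_task`), the task count, and the pool size of 4 are assumptions chosen for illustration, and the timings it prints will vary by machine and say nothing specific about Windows.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def tiny_task(n):
    """A deliberately small job so that thread setup cost dominates."""
    return n * n


def run_with_new_threads(values):
    """Create, run, and tear down one thread per task."""
    results = [None] * len(values)

    def worker(index, value):
        results[index] = tiny_task(value)

    for index, value in enumerate(values):
        t = threading.Thread(target=worker, args=(index, value))
        t.start()
        t.join()  # wait right away so each task pays the full create/destroy cost
    return results


def run_with_pool(values, workers=4):
    """Reuse a fixed set of worker threads for every task."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(tiny_task, values))


def timed(fn, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    values = list(range(2000))
    _, per_task = timed(run_with_new_threads, values)
    _, pooled = timed(run_with_pool, values)
    print(f"one thread per task: {per_task:.3f}s")
    print(f"thread pool:         {pooled:.3f}s")
```

Both functions return the same results; the only difference is how many threads get created along the way.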
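Now for thread switching
At any given moment the operating system (Windows) assigns only one thread to a CPU (in other words, a CPU can only ever run one thread at a time). A thread is allowed to run for one "time slice", and once the time slice expires, Windows performs a switch. (Note that Windows may switch at any point during a time slice, and that when a time slice expires it may also choose the same thread again, which is a special kind of switch.)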
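Let's see what has to be done during a switch
A time slice is normally about 30 milliseconds. That is what the book says, but my own tests show a much smaller figure; perhaps the book is simply dated. In general the time slice is adjusted automatically according to the load on the operating system.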
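The time spent inside a time slice is consumed by the application itself, so that cost is meaningful. A thread switch, on the other hand, is carried out entirely by the system and the CPU, and from the point of view of the application's own work it is pure waste.
Fortunately, as mentioned above, the actual cost is hard to notice, and it is usually difficult even to measure it precisely, because the time involved cannot be expressed in the milliseconds or even microseconds we are used to. All my testing can show is that on a personal computer a single switch costs on the order of nanoseconds.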
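One common way to get a feel for switching cost is a "ping-pong" test, where two threads hand control back and forth and the total time is divided by the number of handoffs. The Python sketch below is not the test I ran, only an illustration of the idea; `ping_pong` is a hypothetical helper. The number it prints is dominated by interpreter and synchronization overhead, so treat it as a very loose upper bound rather than the kernel's real switch cost, which is exactly why a precise figure is so hard to obtain.

```python
import threading
import time


def ping_pong(round_trips):
    """Bounce control between two threads; return average seconds per handoff."""
    ping = threading.Event()
    pong = threading.Event()

    def partner():
        for _ in range(round_trips):
            ping.wait()   # sleep until the main thread signals
            ping.clear()
            pong.set()    # hand control back

    t = threading.Thread(target=partner)
    t.start()

    start = time.perf_counter()
    for _ in range(round_trips):
        ping.set()        # wake the partner thread
        pong.wait()       # sleep until it answers
        pong.clear()
    elapsed = time.perf_counter() - start

    t.join()
    # Each round trip contains two handoffs: main -> partner and partner -> main.
    return elapsed / (round_trips * 2)


if __name__ == "__main__":
    per_handoff = ping_pong(2000)
    print(f"about {per_handoff * 1e6:.1f} microseconds per handoff (loose upper bound)")
```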
Context Switch Definition
A context switch (also sometimes referred to as a process switch or a task switch) is the switching of the CPU (central processing unit) from one process or thread to another.
A process (also sometimes referred to as a task) is an executing (i.e., running) instance of a program. In Linux, threads are lightweight processes that can run in parallel and share an address space (i.e., a range of memory locations) and other resources with their parent processes (i.e., the processes that created them).
A context is the contents of a CPU's registers and program counter at any point in time. A register is a small amount of very fast memory inside of a CPU (as opposed to the slower RAM main memory outside of the CPU) that is used to speed the execution of computer programs by providing quick access to commonly used values, generally those in the midst of a calculation. A program counter is a specialized register that indicates the position of the CPU in its instruction sequence and which holds either the address of the instruction being executed or the address of the next instruction to be executed, depending on the specific system.
Context switching can be described in slightly more detail as the kernel (i.e., the core of the operating system) performing the following activities with regard to processes (including threads) on the CPU: (1) suspending the progression of one process and storing the CPU's state (i.e., the context) for that process somewhere in memory, (2) retrieving the context of the next process from memory and restoring it in the CPU's registers and (3) returning to the location indicated by the program counter (i.e., returning to the line of code at which the process was interrupted) in order to resume the process.
A context switch is sometimes described as the kernel suspending execution of one process on the CPU and resuming execution of some other process that had previously been suspended. Although this wording can help clarify the concept, it can be confusing in itself because a process is, by definition, an executing instance of a program. Thus the wording suspending progression of a process might be preferable.
Context Switches and Mode Switches
Context switches can occur only in kernel mode. Kernel mode is a privileged mode of the CPU in which only the kernel runs and which provides access to all memory locations and all other system resources. Other programs, including applications, initially operate in user mode, but they can run portions of the kernel code via system calls. A system call is a request in a Unix-like operating system by an active process (i.e., a process currently progressing in the CPU) for a service performed by the kernel, such as input/output (I/O) or process creation (i.e., creation of a new process). I/O can be defined as any movement of information to or from the combination of the CPU and main memory (i.e. RAM), that is, communication between this combination and the computer's users (e.g., via the keyboard or mouse), its storage devices (e.g., disk or tape drives), or other computers.
The existence of these two modes in Unix-like operating systems means that a similar, but simpler, operation is necessary when a system call causes the CPU to shift to kernel mode. This is referred to as a mode switch rather than a context switch, because it does not change the current process.
Context switching is an essential feature of multitasking operating systems. A multitasking operating system is one in which multiple processes execute on a single CPU seemingly simultaneously and without interfering with each other. This illusion of concurrency is achieved by means of context switches that are occurring in rapid succession (tens or hundreds of times per second). These context switches occur as a result of processes voluntarily relinquishing their time in the CPU or as a result of the scheduler making the switch when a process has used up its CPU time slice.
A context switch can also occur as a result of a hardware interrupt, which is a signal from a hardware device (such as a keyboard, mouse, modem or system clock) to the kernel that an event (e.g., a key press, mouse movement or arrival of data from a network connection) has occurred.
Intel 80386 and higher CPUs contain hardware support for context switches. However, most modern operating systems perform software context switching, which can be used on any CPU, rather than hardware context switching in an attempt to obtain improved performance. Software context switching was first implemented in Linux for Intel-compatible processors with the 2.4 kernel.
One major advantage claimed for software context switching is that, whereas the hardware mechanism saves almost all of the CPU state, software can be more selective and save only that portion that actually needs to be saved and reloaded. However, there is some question as to how important this really is in increasing the efficiency of context switching. Its advocates also claim that software context switching allows for the possibility of improving the switching code, thereby further enhancing efficiency, and that it permits better control over the validity of the data that is being loaded.
The Cost of Context Switching
Context switching is generally computationally intensive. That is, it requires considerable processor time, which can be on the order of nanoseconds for each of the tens or hundreds of switches per second. Thus, context switching represents a substantial cost to the system in terms of CPU time and can, in fact, be the most costly operation on an operating system.
Consequently, a major focus in the design of operating systems has been to avoid unnecessary context switching to the extent possible. However, this has not been easy to accomplish in practice. In fact, although the cost of context switching has been declining when measured in terms of the absolute amount of CPU time consumed, this appears to be due mainly to increases in CPU clock speeds rather than to improvements in the efficiency of context switching itself.
One of the many advantages claimed for Linux as compared with other operating systems, including some other Unix-like systems, is its extremely low cost of context switching and mode switching.
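The passage above is offered here as a fairly authoritative description.
It covers context switching, that is, switching between threads. It is about Linux, but threads are handled in much the same way everywhere.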
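The key sentence is "which can be on the order of nanoseconds for each of the tens or hundreds of switches per second". My English is not great, but roughly it says that switches happen tens or hundreds of times per second and that each one takes only a few nanoseconds. Of course, the number of switches on today's Windows is far larger.
With a monitoring tool we can see that the switch counts are staggering: the live threads of the QQ process alone had switched more than 300 million times in total. A cost at that scale probably should not be dismissed outright.
Even so, practical experience shows that the performance loss, or the drop in throughput, that comes with multithreading is not caused by these normal switches. The direct cause is the misuse of threads, and such misuse is widespread: using multithreading where it is not appropriate, creating and destroying threads frequently, using locks incorrectly, having threads access shared resources constantly, and so on. Next to the resources and performance these practices consume, the cost of the system's switching is hardly worth mentioning.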
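To illustrate one of these misuses, here is a small Python sketch of threads hitting a shared resource too often. Both functions compute the same total, but the first takes a shared lock on every single increment while the second lets each thread count privately and touch the shared total once. The function names, thread count, and iteration count are all assumptions for the example, and the timings will differ from machine to machine.

```python
import threading
import time


def run_workers(worker, n_threads):
    """Start n_threads threads running worker and wait for all of them."""
    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def count_with_lock_per_item(n_threads, per_thread):
    """Every increment takes the shared lock, so threads constantly contend."""
    total = 0
    lock = threading.Lock()

    def worker():
        nonlocal total
        for _ in range(per_thread):
            with lock:
                total += 1

    run_workers(worker, n_threads)
    return total


def count_with_local_batches(n_threads, per_thread):
    """Each thread counts privately and takes the shared lock only once."""
    total = 0
    lock = threading.Lock()

    def worker():
        nonlocal total
        local = 0
        for _ in range(per_thread):
            local += 1
        with lock:
            total += local

    run_workers(worker, n_threads)
    return total


if __name__ == "__main__":
    for fn in (count_with_lock_per_item, count_with_local_batches):
        start = time.perf_counter()
        result = fn(4, 100_000)
        print(f"{fn.__name__}: total={result}, {time.perf_counter() - start:.3f}s")
```

The work done is identical in both versions. Whatever difference shows up in the timings comes from how the threads are used, not from the scheduler's ordinary switching.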