Java Multithreading Series 2: JMM (Java Memory Model)
What are the memory semantics of the synchronized keyword in Java?
If two or more threads share an object, and more than one thread updates variables in that shared object, race conditions may occur.
To solve this problem you can use a Java synchronized block. A synchronized block guarantees that only one thread can enter a given critical section of the code at any given time. Synchronized blocks also guarantee that all variables accessed inside the synchronized block will be read in from main memory, and when the thread exits the synchronized block, all updated variables will be flushed back to main memory again, regardless of whether the variable is declared volatile or not.
The Java programming language provides multiple mechanisms for communicating between threads. The most basic of these methods is synchronization, which is implemented using monitors. Each object in Java is associated with a monitor, which a thread can lock or unlock. Only one thread at a time may hold a lock on a monitor. Any other threads attempting to lock that monitor are blocked until they can obtain a lock on that monitor. A thread t may lock a particular monitor multiple times; each unlock reverses the effect of one lock operation.
The synchronized statement computes a reference to an object; it then attempts to perform a lock action on that object's monitor and does not proceed further until the lock action has successfully completed. After the lock action has been performed, the body of the synchronized statement is executed. If execution of the body is ever completed, either normally or abruptly, an unlock action is automatically performed on that same monitor.
A synchronized method automatically performs a lock action when it is invoked; its body is not executed until the lock action has successfully completed. If the method is an instance method, it locks the monitor associated with the instance for which it was invoked (that is, the object that will be known as this during execution of the body of the method). If the method is static, it locks the monitor associated with the Class object that represents the class in which the method is defined. If execution of the method's body is ever completed, either normally or abruptly, an unlock action is automatically performed on that same monitor. Note: if a class has several methods marked synchronized, then for the same instance these methods are also mutually exclusive with one another, because they all use that instance's lock.
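As a minimal sketch of the three lock targets described above (the class and method names are hypothetical):

```java
public class Counter {
    private static int classCount = 0;
    private int count = 0;

    // Synchronized block: locks the monitor of the referenced object.
    public void incrementWithBlock() {
        synchronized (this) {
            count++;
        }
    }

    // Synchronized instance method: locks the monitor of 'this'.
    public synchronized void increment() {
        count++;
    }

    // Synchronized static method: locks the monitor of Counter.class.
    public static synchronized void incrementClassCount() {
        classCount++;
    }
}
```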
The memory semantics of synchronized
- When a thread releases a lock, the JMM flushes the shared variables in that thread's local (working) memory to main memory
- When a thread acquires a lock, the JMM invalidates that thread's local memory, so the critical-section code protected by the monitor must read shared variables from main memory
- Lock release/acquire therefore has the same memory semantics as a volatile write/read; volatile can be seen as a lightweight lock
The steps a thread goes through when executing mutually exclusive code:
- Acquire the monitor lock
- Clear the working memory
- Copy the latest values of the shared variables from main memory into the working memory
- Execute the code
- Flush the updated values of the shared variables back to main memory
- Release the monitor lock
If a task is inside a call to a method marked synchronized, then until that thread returns from the method, all other threads attempting to call any synchronized method of the class on the same instance will be blocked, as the sketch below illustrates.
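A minimal sketch of this mutual exclusion (names are hypothetical): both methods lock the monitor of this, so while one thread is inside writeState, another thread calling readState on the same instance must wait, and by the lock rules above it then reads the freshly flushed value.

```java
public class SharedState {
    private int value;

    // Locks the monitor of 'this'; the updated value is flushed
    // to main memory when the lock is released.
    public synchronized void writeState(int v) {
        value = v;
    }

    // Also locks the monitor of 'this', so it is mutually exclusive
    // with writeState on the same instance, and re-reads the shared
    // variable from main memory after acquiring the lock.
    public synchronized int readState() {
        return value;
    }
}
```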
What are the memory semantics of the volatile keyword in Java?
The volatile keyword makes sure that a given variable is read directly from main memory, and always written back to main memory when updated.
volatile is implemented by inserting memory barriers and forbidding instruction reordering:
- When a volatile variable is written, a store barrier is inserted after the write, which flushes the data cached by the processor to main memory
- When a volatile variable is read, a load barrier is inserted before the read, so the variable is loaded from main memory
- When the second of two operations is a volatile write, no reordering is allowed, whatever the first operation is. This rule ensures that operations before a volatile write are not reordered by the compiler to after it
- When the first of two operations is a volatile read, no reordering is allowed, whatever the second operation is. This rule ensures that operations after a volatile read are not reordered by the compiler to before it
- When the first operation is a volatile write and the second is a volatile read, no reordering is allowed
In short, every time a thread accesses a volatile variable it is forced to re-read the value from main memory, and whenever the variable changes, the thread is forced to flush the latest value back to main memory; as a result, at any moment different threads always see the latest value of the variable.
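A minimal sketch of this visibility guarantee (the class name is hypothetical): without volatile on the flag, the worker thread could keep reading a stale cached value and spin forever.

```java
public class StopFlagDemo {
    // Each read of 'running' goes to main memory; each write is
    // flushed to main memory immediately.
    private static volatile boolean running = true;

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            while (running) {
                // busy-wait; every iteration re-reads the flag
            }
            System.out.println("worker saw running == false, exiting");
        });
        worker.start();
        Thread.sleep(100);
        running = false;  // volatile write, visible to the worker
        worker.join();
    }
}
```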
- The steps for a thread writing a volatile variable:
    - Change the value of the variable's copy in the thread's working memory
    - Flush the changed copy from the working memory to main memory
- The steps for a thread reading a volatile variable:
    - Read the latest value of the volatile variable from main memory into the thread's working memory
    - Read the variable's copy from the working memory
volatile also has limitations: it cannot be used to build atomic compound operations, so it cannot be used when a variable's new value depends on its old value. In embedded devices, for instance, a volatile variable's value may be changed by the hardware while it is in use, so a read-modify-write based on the old value is unsafe.
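A minimal sketch of this limitation (names hypothetical): count++ is a read-modify-write compound operation, so increments can be lost even though count is volatile; an atomic class such as java.util.concurrent.atomic.AtomicInteger is the usual alternative.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class VolatileNotAtomic {
    private static volatile int count = 0;                    // visibility only
    private static final AtomicInteger atomicCount = new AtomicInteger();

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> {
            for (int i = 0; i < 10_000; i++) {
                count++;                       // not atomic: lost updates possible
                atomicCount.incrementAndGet(); // atomic read-modify-write
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join(); t2.join();
        // count is usually less than 20000; atomicCount is always 20000
        System.out.println("volatile: " + count + ", atomic: " + atomicCount.get());
    }
}
```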
How do JDK versions differ in their handling of volatile?
The handling of volatile before JDK 5 differs from JDK 5 onward:
- In JDK 1.4 and earlier, reads and writes of a volatile variable could be reordered with reads and writes of other variables during compiler optimization
- From JDK 5 on, it is guaranteed that reads and writes occurring before an access to a volatile variable are not moved after it. To implement the volatile memory semantics, the JMM restricts both compiler reordering and processor reordering
To implement the volatile memory semantics, the compiler inserts memory barriers into the instruction sequence when generating bytecode, in order to forbid specific kinds of processor reordering.
The ordering guarantee from JDK 5 onward (the happens-before guarantee): nothing after a volatile read is moved earlier, and nothing before a volatile write is moved later.
- If a read or write of some variable occurs, in the code, before a write to a volatile variable, then after compilation that read/write is guaranteed not to be moved after the volatile write. Note that this only guarantees that operations before the volatile write are not moved after it; it does not guarantee that operations after the volatile write are not moved before it.
- If a read or write of some variable occurs, in the code, after a read of a volatile variable, then after compilation that read/write is guaranteed not to be moved before the volatile read. Note that, likewise, this does not guarantee that operations before the volatile read are not moved after it. A sketch combining the two rules follows this list.
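A minimal sketch of how these two rules combine (names hypothetical): the ordinary write to data cannot be moved after the volatile write to ready, and the read of data cannot be moved before the volatile read of ready, so a reader that observes ready == true is guaranteed to observe data == 42.

```java
public class SafePublication {
    private static int data = 0;
    private static volatile boolean ready = false;

    static void writer() {             // runs on one thread
        data = 42;                     // ordinary write: not moved below the volatile write
        ready = true;                  // volatile write
    }

    static void reader() {             // runs on another thread
        if (ready) {                   // volatile read: later reads not moved above it
            System.out.println(data);  // guaranteed to print 42
        }
    }
}
```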
This change in JDK 5 was also made to address the double-checked locking problem.
The double-checked locking problem
Double-checked locking is an idiom for lazy initialization of a singleton; the code looks like this:
```java
// double-checked-locking - don't do this!
private static Something instance = null;

public Something getInstance() {
    if (instance == null) {
        synchronized (this) {
            if (instance == null)
                instance = new Something();
        }
    }
    return instance;
}
```
This looks awfully clever -- the synchronization is avoided on the common code path. There's only one problem with it -- it doesn't work.
Why not? The most obvious reason is that the writes which initialize instance and the write to the instance field can be reordered by the compiler or the cache, which would have the effect of returning what appears to be a partially constructed Something. The result would be that we read an uninitialized object. There are lots of other reasons why this is wrong, and why algorithmic corrections to it are wrong. There is no way to fix it using the old Java memory model.
Many people assumed that the use of the volatile keyword would eliminate the problems that arise when trying to use the double-checked-locking pattern. In JVMs prior to 1.5, volatile would not ensure that it worked (your mileage may vary). Under the new memory model, making the instance field volatile will "fix" the problems with double-checked locking, because then there will be a happens-before relationship between the initialization of the Something by the constructing thread and the return of its value by the thread that reads it.
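Under the new memory model, the fix is therefore to declare the field volatile. A minimal sketch of the corrected version (shown here with a static method and the class-level lock, a more consistent form than the snippet above):

```java
class Something {
    // volatile establishes a happens-before edge between the
    // constructor finishing and the reference becoming visible.
    private static volatile Something instance = null;

    public static Something getInstance() {
        if (instance == null) {              // first check without locking
            synchronized (Something.class) {
                if (instance == null) {      // second check under the lock
                    instance = new Something();
                }
            }
        }
        return instance;
    }
}
```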
What is false sharing, why does it occur, and how can it be avoided?
Memory is stored within the cache system in units known as cache lines. Cache lines are a power-of-2 number of contiguous bytes, typically 32-256 bytes in size. The most common cache line size is 64 bytes. False sharing is a term which applies when threads unwittingly impact the performance of each other while modifying independent variables sharing the same cache line. Write contention on cache lines is the single most limiting factor on achieving scalability for parallel threads of execution in an SMP system. I’ve heard false sharing described as the silent performance killer because it is far from obvious when looking at code.
To achieve linear scalability with number of threads, we must ensure no two threads write to the same variable or cache line. Two threads writing to the same variable can be tracked down at a code level. To be able to know if independent variables share the same cache line we need to know the memory layout, or we can get a tool to tell us. Intel VTune is such a profiling tool. In this article I’ll explain how memory is laid out for Java objects and how we can pad out our cache lines to avoid false sharing.
To discuss this question, we first need some background:
- Each core of a multi-core CPU has its own cache
- When a core accesses data, it first tries to read it from its cache; if the data is not there, it reads it from main memory
- A core loads data from memory into its cache in blocks called cache lines, typically 64 bytes in size
Now suppose a program defines two adjacent long variables, var0 and var1, and the following happens:
- Core 0 and core 1 are running different threads; var0, used by core 0, and var1, used by core 1, happen to be stored in the same cache line
- Core 0 modifies var0. This single modification requires the data of that cache line to be synchronized back to memory, and core 1's copy of the cache line to be invalidated; this is enforced by the CPU's cache coherence protocol (MESI)
- When core 1 next needs to read var1, it finds its cache line invalidated and must reload it from main memory
In this example the cache not only failed to speed up the access, it made it slower. The problem is that modifying var0 caused the cached access to var1 to miss: two variables that are unrelated in software became coupled in hardware.
So false sharing can be described as follows: several pieces of data that logically do not belong to the same memory unit are loaded by the CPU into the same cache line; when different cores execute different threads in a multithreaded environment, the cache line keeps being invalidated and the cache hit rate drops.
Under frequent access this causes a significant performance penalty. The fix is to keep the two variables out of the same cache line: since a cache line is typically 64 bytes, it suffices to pad 7 long fields after each of var0 and var1, as in the sketch below.
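A minimal sketch of manual padding (field names hypothetical). Note this is a sketch only: a JIT compiler may eliminate fields it can prove are unused, which is why practical implementations either touch the padding fields somewhere or, from JDK 8 on, use the @sun.misc.Contended annotation (enabled for user classes with -XX:-RestrictContended) instead.

```java
public class PaddedVariables {
    // 7 longs (56 bytes) of padding after each hot field push var0
    // and var1 onto different 64-byte cache lines.
    public volatile long var0;
    long p01, p02, p03, p04, p05, p06, p07;  // padding, never read
    public volatile long var1;
    long p11, p12, p13, p14, p15, p16, p17;  // padding, never read
}
```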