# synchronized、monitor lock加锁流程

synchronized关键字是Java提供的内置的锁，也可以成为monitor lock。通过synchronzied加锁，实现可见性、原子性、防止重排序等并发场景下的线程间同步协调机制。

在Java语言层面，通过对一个对象加synchronized形成加锁代码块或给一个方法增加synchronized标识，都能使得对应的代码块或方法在执行前后进行加锁、释放锁的操作。

synchronized(lock) {
    // 
}

1
2
3

public synchronized void doSomething() {
    //
}

1
2
3

# 实现

从字节码层面来看，synchornized(lock)这样的代码块，在前后会有一个monitorenter和monitorexit指令

比如如下Java代码

private Object lock = new Object();
public void test() {
    synchronized (lock) {
        System.out.println("test");
    }
}

1
2
3
4
5
6

编译后字节码中包含两个monitorexit，一个是代码块正常执行完，另一个用于exception table实现异常情况也能执行monitorexit保证锁释放。

 0 aload_0
 1 getfield #3 &lt;com/lzy/SynchronizedTest.lock : Ljava/lang/Object;>
 4 dup
 5 astore_1
 6 monitorenter
 7 getstatic #4 &lt;java/lang/System.out : Ljava/io/PrintStream;>
10 ldc #5 &lt;test>
12 invokevirtual #6 &lt;java/io/PrintStream.println : (Ljava/lang/String;)V>
15 aload_1
16 monitorexit
17 goto 25 (+8)
20 astore_2
21 aload_1
22 monitorexit
23 aload_2
24 athrow
25 return

Exception table:
from    to  target type
 7    17    20   any
20    23    20   any

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22

而synchronized方法，该方法的access flag中的ACC_SYNCHRONIZED标记位会被设置为1。

public synchronized void testSync();
    descriptor: ()V
    flags: (0x0021) ACC_PUBLIC, ACC_SYNCHRONIZED

1
2
3

# monitorenter的实现

从模版解释器开始看monitor的处理。在遇到monitorenter指令时，jvm会在当前的线程栈上创建一个monitor block的内存区域，这块内存用于存放BasicObjectLock对象，该对象会用于后续的轻量级锁加锁。

下面以templateInterpreter字节码解释执行时的实现进行分析。

rax寄存器是加锁的对象（即synchronized(obj)中的obj）
__ movptr(Address(rmon, BasicObjectLock::obj_offset_in_bytes()), rax);将被锁对象地址设置到BasicObjectLock中的obj字段。
__ lock_object(rmon);：具体的加锁过程。

//-----------------------------------------------------------------------------
// Synchronization
//
// Note: monitorenter &amp; exit are symmetric routines; which is reflected
//       in the assembly code structure as well
//
// Stack layout:
//
// [expressions  ] &lt;--- rsp               = expression stack top
// ..
// [expressions  ]
// [monitor entry] &lt;--- monitor block top = expression stack bot
// ..
// [monitor entry]
// [frame data   ] &lt;--- monitor block bot
// ...
// [saved rbp    ] &lt;--- rbp
void TemplateTable::monitorenter() {
  transition(atos, vtos);

  // check for NULL object
  __ null_check(rax);

  const Address monitor_block_top(
        rbp, frame::interpreter_frame_monitor_block_top_offset * wordSize);
  const Address monitor_block_bot(
        rbp, frame::interpreter_frame_initial_sp_offset * wordSize);
  const int entry_size = frame::interpreter_frame_monitor_size() * wordSize;

  Label allocated;

  Register rtop = LP64_ONLY(c_rarg3) NOT_LP64(rcx);
  Register rbot = LP64_ONLY(c_rarg2) NOT_LP64(rbx);
  Register rmon = LP64_ONLY(c_rarg1) NOT_LP64(rdx);

  // initialize entry pointer
  __ xorl(rmon, rmon); // points to free slot or NULL

  // find a free slot in the monitor block (result in rmon)
  {
    Label entry, loop, exit;
    __ movptr(rtop, monitor_block_top); // points to current entry,
                                        // starting with top-most entry
    __ lea(rbot, monitor_block_bot);    // points to word before bottom
                                        // of monitor block
    __ jmpb(entry);

    __ bind(loop);
    // check if current entry is used
    __ cmpptr(Address(rtop, BasicObjectLock::obj_offset_in_bytes()), (int32_t) NULL_WORD);
    // if not used then remember entry in rmon
    __ cmovptr(Assembler::equal, rmon, rtop);   // cmov => cmovptr
    // check if current entry is for same object
    __ cmpptr(rax, Address(rtop, BasicObjectLock::obj_offset_in_bytes()));
    // if same object then stop searching
    __ jccb(Assembler::equal, exit);
    // otherwise advance to next entry
    __ addptr(rtop, entry_size);
    __ bind(entry);
    // check if bottom reached
    __ cmpptr(rtop, rbot);
    // if not at bottom then check this entry
    __ jcc(Assembler::notEqual, loop);
    __ bind(exit);
  }

  __ testptr(rmon, rmon); // check if a slot has been found
  __ jcc(Assembler::notZero, allocated); // if found, continue with that one

  // allocate one if there's no free slot
  {
    Label entry, loop;
    // 1. compute new pointers          // rsp: old expression stack top
    __ movptr(rmon, monitor_block_bot); // rmon: old expression stack bottom
    __ subptr(rsp, entry_size);         // move expression stack top
    __ subptr(rmon, entry_size);        // move expression stack bottom
    __ mov(rtop, rsp);                  // set start value for copy loop
    __ movptr(monitor_block_bot, rmon); // set new monitor block bottom
    __ jmp(entry);
    // 2. move expression stack contents
    __ bind(loop);
    __ movptr(rbot, Address(rtop, entry_size)); // load expression stack
                                                // word from old location
    __ movptr(Address(rtop, 0), rbot);          // and store it at new location
    __ addptr(rtop, wordSize);                  // advance to next word
    __ bind(entry);
    __ cmpptr(rtop, rmon);                      // check if bottom reached
    __ jcc(Assembler::notEqual, loop);          // if not at bottom then
                                                // copy next word
  }

  // call run-time routine
  // rmon: points to monitor entry
  __ bind(allocated);

  // Increment bcp to point to the next bytecode, so exception
  // handling for async. exceptions work correctly.
  // The object has already been poped from the stack, so the
  // expression stack looks correct.
  __ increment(rbcp);

  // store object
  __ movptr(Address(rmon, BasicObjectLock::obj_offset_in_bytes()), rax);
  __ lock_object(rmon);

  // check to make sure this monitor doesn't cause stack overflow after locking
  __ save_bcp();  // in case of exception
  __ generate_stack_overflow_check(0);

  // The bcp has already been incremented. Just need to dispatch to
  // next instruction.
  __ dispatch_next(vtos);
}

# lock_object

BasicObjectLock包含两个字段，一个BasicLock _lock，一个oop _obj。 _lock对象是内嵌到BasicObjectLock对象内存中的，BasicLock中目前只有一个markWord类型的_displaced_header字段。

class BasicObjectLock {
  friend class VMStructs;
 private:
  BasicLock _lock;                                    // the lock, must be double word aligned
  oop       _obj;                                     // object holds the lock;

1
2
3
4
5

BasicLock类中有一个volatile生命的markWork类型的_displaced_header用于保存_obj之前的对象头（因为加锁会涉及到cas替换被加锁对象的对象头，加锁完成后需要恢复）

class BasicLock {
  friend class VMStructs;
  friend class JVMCIVMStructs;
 private:
  volatile markWord _displaced_header;

1
2
3
4
5

参数传过来的是BasicObjectLock的地址（在线程栈中）
UseHeavyMonitors是全局配置，默认false，会使用轻量级锁。
orptr(swap_reg, Address(obj_reg, oopDesc::mark_offset_in_bytes()));: Load (object->mark() | 1) into swap_reg %rax，设置要cas的目标值，markWord的标记为设置为01，即
movptr(Address(lock_reg, mark_offset), swap_reg);：将锁对象的markWord保存到BasicLock的displaced header。
cmpxchgptr(lock_reg, Address(obj_reg, oopDesc::mark_offset_in_bytes())): cas替换锁对象的对象头为BasicLock地址。如果cas成功，则markWord的标记为会替换为00，表示轻量级加锁状态，因为BasicObjectLock的地址是以一个byte为单位的，后三位是000。
jcc(Assembler::zero, done)：如果cmpxchgptr成功，会设置结果为0，则直接走到done label完成了加锁。
call_VM(noreg, CAST_FROM_FN_PTR(address, InterpreterRuntime::monitorenter), lock_reg);：否则调用InterpreterRuntime::monitorenter

// Lock object
//
// Args:
//      rdx, c_rarg1: BasicObjectLock to be used for locking
//
// Kills:
//      rax, rbx
void InterpreterMacroAssembler::lock_object(Register lock_reg) {
  assert(lock_reg == LP64_ONLY(c_rarg1) NOT_LP64(rdx),
         "The argument is only for looks. It must be c_rarg1");

  if (UseHeavyMonitors) {
    call_VM(noreg,
            CAST_FROM_FN_PTR(address, InterpreterRuntime::monitorenter),
            lock_reg);
  } else {
    Label done;

    const Register swap_reg = rax; // Must use rax for cmpxchg instruction
    const Register tmp_reg = rbx; // Will be passed to biased_locking_enter to avoid a
                                  // problematic case where tmp_reg = no_reg.
    const Register obj_reg = LP64_ONLY(c_rarg3) NOT_LP64(rcx); // Will contain the oop
    const Register rklass_decode_tmp = LP64_ONLY(rscratch1) NOT_LP64(noreg);

    const int obj_offset = BasicObjectLock::obj_offset_in_bytes();
    const int lock_offset = BasicObjectLock::lock_offset_in_bytes ();
    const int mark_offset = lock_offset +
                            BasicLock::displaced_header_offset_in_bytes();

    Label slow_case;

    // Load object pointer into obj_reg
    movptr(obj_reg, Address(lock_reg, obj_offset));

    if (DiagnoseSyncOnValueBasedClasses != 0) {
      load_klass(tmp_reg, obj_reg, rklass_decode_tmp);
      movl(tmp_reg, Address(tmp_reg, Klass::access_flags_offset()));
      testl(tmp_reg, JVM_ACC_IS_VALUE_BASED_CLASS);
      jcc(Assembler::notZero, slow_case);
    }

    if (UseBiasedLocking) {
      biased_locking_enter(lock_reg, obj_reg, swap_reg, tmp_reg, rklass_decode_tmp, false, done, &amp;slow_case);
    }

    // Load immediate 1 into swap_reg %rax
    movl(swap_reg, (int32_t)1);

    // Load (object->mark() | 1) into swap_reg %rax
    orptr(swap_reg, Address(obj_reg, oopDesc::mark_offset_in_bytes()));

    // Save (object->mark() | 1) into BasicLock's displaced header
    movptr(Address(lock_reg, mark_offset), swap_reg);

    assert(lock_offset == 0,
           "displaced header must be first word in BasicObjectLock");

    lock();
    cmpxchgptr(lock_reg, Address(obj_reg, oopDesc::mark_offset_in_bytes()));
    if (PrintBiasedLockingStatistics) {
      cond_inc32(Assembler::zero,
                 ExternalAddress((address) BiasedLocking::fast_path_entry_count_addr()));
    }
    jcc(Assembler::zero, done);

    const int zero_bits = LP64_ONLY(7) NOT_LP64(3);

    subptr(swap_reg, rsp);
    andptr(swap_reg, zero_bits - os::vm_page_size());

    // Save the test result, for recursive case, the result is zero
    movptr(Address(lock_reg, mark_offset), swap_reg);

    if (PrintBiasedLockingStatistics) {
      cond_inc32(Assembler::zero,
                 ExternalAddress((address) BiasedLocking::fast_path_entry_count_addr()));
    }
    jcc(Assembler::zero, done);

    bind(slow_case);

    // Call the runtime routine for slow case
    call_VM(noreg,
            CAST_FROM_FN_PTR(address, InterpreterRuntime::monitorenter),
            lock_reg);

    bind(done);
  }
}

# interpreter runtime中的monitor enter

前面模版解释器中通过cas加轻量级锁失败后，说明加锁存在冲突，会回退到slow path。

主要是调用 ObjectSynchronizer::enter(h_obj, elem->lock(), current);

//------------------------------------------------------------------------------------------------------------------------
// Synchronization
//
// The interpreter's synchronization code is factored out so that it can
// be shared by method invocation and synchronized blocks.
//%note synchronization_3

//%note monitor_1
JRT_ENTRY_NO_ASYNC(void, InterpreterRuntime::monitorenter(JavaThread* current, BasicObjectLock* elem))
#ifdef ASSERT
  current->last_frame().interpreter_frame_verify_monitor(elem);
#endif
  if (PrintBiasedLockingStatistics) {
    Atomic::inc(BiasedLocking::slow_path_entry_count_addr());
  }
  Handle h_obj(current, elem->obj());
  assert(Universe::heap()->is_in_or_null(h_obj()),
         "must be NULL or an object");
  ObjectSynchronizer::enter(h_obj, elem->lock(), current);
  assert(Universe::heap()->is_in_or_null(elem->obj()),
         "must be NULL or an object");
#ifdef ASSERT
  current->last_frame().interpreter_frame_verify_monitor(elem);
#endif
JRT_END

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25

enter方法

先通过markWord mark = obj->mark()拿到markWord
先判断mark.is_neutral()：is_neutral判断markWord标记为是否是01即未加锁状态，则再通过cas尝试轻量级加锁，如果成功返回。不成功则走到后面的锁膨胀流程。
如果不是is_neutral，则判断mark.has_locker()&& current->is_lock_owned((address)mark.locker())：说明当前线程在前面已经加了这个对象的锁，直接返回。
进入锁膨胀再重量级加锁流程，在while(true)循环中不断尝试inflate进行锁膨胀，然后用膨胀后的重量级锁ObjectMonitor调用enter方法进行加锁。

// -----------------------------------------------------------------------------
// Monitor Enter/Exit
// The interpreter and compiler assembly code tries to lock using the fast path
// of this algorithm. Make sure to update that code if the following function is
// changed. The implementation is extremely sensitive to race condition. Be careful.

void ObjectSynchronizer::enter(Handle obj, BasicLock* lock, JavaThread* current) {
  if (obj->klass()->is_value_based()) {
    handle_sync_on_value_based_class(obj, current);
  }

  if (UseBiasedLocking) {
    BiasedLocking::revoke(current, obj);
  }

  markWord mark = obj->mark();
  assert(!mark.has_bias_pattern(), "should not see bias pattern here");

  if (mark.is_neutral()) {
    // Anticipate successful CAS -- the ST of the displaced mark must
    // be visible &lt;= the ST performed by the CAS.
    lock->set_displaced_header(mark);
    if (mark == obj()->cas_set_mark(markWord::from_pointer(lock), mark)) {
      return;
    }
    // Fall through to inflate() ...
  } else if (mark.has_locker() &amp;&amp;
             current->is_lock_owned((address)mark.locker())) {
    assert(lock != mark.locker(), "must not re-lock the same lock");
    assert(lock != (BasicLock*)obj->mark().value(), "don't relock with same BasicLock");
    lock->set_displaced_header(markWord::from_pointer(NULL));
    return;
  }

  // The object header will never be displaced to this lock,
  // so it does not matter what the value is, except that it
  // must be non-zero to avoid looking like a re-entrant lock,
  // and must not look locked either.
  lock->set_displaced_header(markWord::unused_mark());
  // An async deflation can race after the inflate() call and before
  // enter() can make the ObjectMonitor busy. enter() returns false if
  // we have lost the race to async deflation and we simply try again.
  while (true) {
    ObjectMonitor* monitor = inflate(current, obj(), inflate_cause_monitor_enter);
    if (monitor->enter(current)) {
      return;
    }
  }
}

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49

# 锁膨胀inflation

判断mark如果已经是has_monitor说明有其他线程完成了所膨胀，直接从mark中获取monitor指针返回。
如果mark状态是INFLATING膨胀中（全是0），说明有其他线程在进行锁膨胀，则continue继续下一轮循环判断
如果mark状态是has_locker，说明有其他线程加了轻量级锁，则通过object->cas_set_mark(markWord::INFLATING(), mark)cas修改锁对象头设置状态为膨胀中
- 如果cas失败则continue继续循环
- 通过displaced_mark_helper方法获取锁对象原本的markWord，设置到ObjectMonitor中（ObjectMonitor对象在膨胀锁之前创建）。
- 设置ObjectMonitor的_owner、mark，然后返回
如果mark处于neutral状态，说明没有其他线程在加轻量级锁，则创建ObjectMonitor对象，然后cas修改锁对象头，成功返回。

ObjectMonitor* ObjectSynchronizer::inflate(Thread* current, oop object,
                                           const InflateCause cause) {
  EventJavaMonitorInflate event;

  for (;;) {
    const markWord mark = object->mark();
    assert(!mark.has_bias_pattern(), "invariant");

    // The mark can be in one of the following states:
    // *  Inflated     - just return
    // *  Stack-locked - coerce it to inflated
    // *  INFLATING    - busy wait for conversion to complete
    // *  Neutral      - aggressively inflate the object.
    // *  BIASED       - Illegal.  We should never see this

    // CASE: inflated
    if (mark.has_monitor()) {
      ObjectMonitor* inf = mark.monitor();
      markWord dmw = inf->header();
      assert(dmw.is_neutral(), "invariant: header=" INTPTR_FORMAT, dmw.value());
      return inf;
    }

    // CASE: inflation in progress - inflating over a stack-lock.
    // Some other thread is converting from stack-locked to inflated.
    // Only that thread can complete inflation -- other threads must wait.
    // The INFLATING value is transient.
    // Currently, we spin/yield/park and poll the markword, waiting for inflation to finish.
    // We could always eliminate polling by parking the thread on some auxiliary list.
    if (mark == markWord::INFLATING()) {
      read_stable_mark(object);
      continue;
    }

    // CASE: stack-locked
    // Could be stack-locked either by this thread or by some other thread.
    //
    // Note that we allocate the ObjectMonitor speculatively, _before_ attempting
    // to install INFLATING into the mark word.  We originally installed INFLATING,
    // allocated the ObjectMonitor, and then finally STed the address of the
    // ObjectMonitor into the mark.  This was correct, but artificially lengthened
    // the interval in which INFLATING appeared in the mark, thus increasing
    // the odds of inflation contention.

    LogStreamHandle(Trace, monitorinflation) lsh;

    if (mark.has_locker()) {
      ObjectMonitor* m = new ObjectMonitor(object);
      // Optimistically prepare the ObjectMonitor - anticipate successful CAS
      // We do this before the CAS in order to minimize the length of time
      // in which INFLATING appears in the mark.

      markWord cmp = object->cas_set_mark(markWord::INFLATING(), mark);
      if (cmp != mark) {
        delete m;
        continue;       // Interference -- just retry
      }

      // We've successfully installed INFLATING (0) into the mark-word.
      // This is the only case where 0 will appear in a mark-word.
      // Only the singular thread that successfully swings the mark-word
      // to 0 can perform (or more precisely, complete) inflation.
      //
      // Why do we CAS a 0 into the mark-word instead of just CASing the
      // mark-word from the stack-locked value directly to the new inflated state?
      // Consider what happens when a thread unlocks a stack-locked object.
      // It attempts to use CAS to swing the displaced header value from the
      // on-stack BasicLock back into the object header.  Recall also that the
      // header value (hash code, etc) can reside in (a) the object header, or
      // (b) a displaced header associated with the stack-lock, or (c) a displaced
      // header in an ObjectMonitor.  The inflate() routine must copy the header
      // value from the BasicLock on the owner's stack to the ObjectMonitor, all
      // the while preserving the hashCode stability invariants.  If the owner
      // decides to release the lock while the value is 0, the unlock will fail
      // and control will eventually pass from slow_exit() to inflate.  The owner
      // will then spin, waiting for the 0 value to disappear.   Put another way,
      // the 0 causes the owner to stall if the owner happens to try to
      // drop the lock (restoring the header from the BasicLock to the object)
      // while inflation is in-progress.  This protocol avoids races that might
      // would otherwise permit hashCode values to change or "flicker" for an object.
      // Critically, while object->mark is 0 mark.displaced_mark_helper() is stable.
      // 0 serves as a "BUSY" inflate-in-progress indicator.


      // fetch the displaced mark from the owner's stack.
      // The owner can't die or unwind past the lock while our INFLATING
      // object is in the mark.  Furthermore the owner can't complete
      // an unlock on the object, either.
      markWord dmw = mark.displaced_mark_helper();
      // Catch if the object's header is not neutral (not locked and
      // not marked is what we care about here).
      assert(dmw.is_neutral(), "invariant: header=" INTPTR_FORMAT, dmw.value());

      // Setup monitor fields to proper values -- prepare the monitor
      m->set_header(dmw);

      // Optimization: if the mark.locker stack address is associated
      // with this thread we could simply set m->_owner = current.
      // Note that a thread can inflate an object
      // that it has stack-locked -- as might happen in wait() -- directly
      // with CAS.  That is, we can avoid the xchg-NULL .... ST idiom.
      m->set_owner_from(NULL, mark.locker());
      // TODO-FIXME: assert BasicLock->dhw != 0.

      // Must preserve store ordering. The monitor state must
      // be stable at the time of publishing the monitor address.
      guarantee(object->mark() == markWord::INFLATING(), "invariant");
      // Release semantics so that above set_object() is seen first.
      object->release_set_mark(markWord::encode(m));

      // Once ObjectMonitor is configured and the object is associated
      // with the ObjectMonitor, it is safe to allow async deflation:
      _in_use_list.add(m);

      // Hopefully the performance counters are allocated on distinct cache lines
      // to avoid false sharing on MP systems ...
      OM_PERFDATA_OP(Inflations, inc());
      if (log_is_enabled(Trace, monitorinflation)) {
        ResourceMark rm(current);
        lsh.print_cr("inflate(has_locker): object=" INTPTR_FORMAT ", mark="
                     INTPTR_FORMAT ", type='%s'", p2i(object),
                     object->mark().value(), object->klass()->external_name());
      }
      if (event.should_commit()) {
        post_monitor_inflate_event(&amp;event, object, cause);
      }
      return m;
    }

    // CASE: neutral
    // TODO-FIXME: for entry we currently inflate and then try to CAS _owner.
    // If we know we're inflating for entry it's better to inflate by swinging a
    // pre-locked ObjectMonitor pointer into the object header.   A successful
    // CAS inflates the object *and* confers ownership to the inflating thread.
    // In the current implementation we use a 2-step mechanism where we CAS()
    // to inflate and then CAS() again to try to swing _owner from NULL to current.
    // An inflateTry() method that we could call from enter() would be useful.

    // Catch if the object's header is not neutral (not locked and
    // not marked is what we care about here).
    assert(mark.is_neutral(), "invariant: header=" INTPTR_FORMAT, mark.value());
    ObjectMonitor* m = new ObjectMonitor(object);
    // prepare m for installation - set monitor to initial state
    m->set_header(mark);

    if (object->cas_set_mark(markWord::encode(m), mark) != mark) {
      delete m;
      m = NULL;
      continue;
      // interference - the markword changed - just retry.
      // The state-transitions are one-way, so there's no chance of
      // live-lock -- "Inflated" is an absorbing state.
    }

    // Once the ObjectMonitor is configured and object is associated
    // with the ObjectMonitor, it is safe to allow async deflation:
    _in_use_list.add(m);

    // Hopefully the performance counters are allocated on distinct
    // cache lines to avoid false sharing on MP systems ...
    OM_PERFDATA_OP(Inflations, inc());
    if (log_is_enabled(Trace, monitorinflation)) {
      ResourceMark rm(current);
      lsh.print_cr("inflate(neutral): object=" INTPTR_FORMAT ", mark="
                   INTPTR_FORMAT ", type='%s'", p2i(object),
                   object->mark().value(), object->klass()->external_name());
    }
    if (event.should_commit()) {
      post_monitor_inflate_event(&amp;event, object, cause);
    }
    return m;
  }
}

# 重量级锁的enter加锁

通过cas修改ObjectMonitor的_owner，从null cas到当前线程，如果cas成功说明重量级锁加锁成功，返回。
判断锁重入情况，如果重入（锁的owner和当前线程想通）则增加_recursions并返回成功
如果当前线程已经轻量级加锁，说明当前线程在轻量级锁加锁完成后再对相同锁对象加锁，此时被其他线程完成了锁膨胀，所以会将owner从BasicLock设置为当前线程，并设置recursion为1表示第一次重入，并返回
TrySpin，TrySpin是cas设置owner，会经过多次自适应自旋，如果TrySpin成功说明加锁成功也会返回
for(;;)中不断尝试加锁
- ThreadBlockInVMPreprocess<ExitOnSuspend> tbivs(current, eos): 创建一个ThreadBlockInVMPreprocess对象，在方法块结束前析构方法会检查是否处于safepoint过程中，如果是则会等待safepoint完成再返回。
- EnterI：是重量级锁的入队、park阻塞实现

// -----------------------------------------------------------------------------
// Enter support

bool ObjectMonitor::enter(JavaThread* current) {
  // The following code is ordered to check the most common cases first
  // and to reduce RTS->RTO cache line upgrades on SPARC and IA32 processors.

  void* cur = try_set_owner_from(NULL, current);
  if (cur == NULL) {
    assert(_recursions == 0, "invariant");
    return true;
  }

  if (cur == current) {
    // TODO-FIXME: check for integer overflow!  BUGID 6557169.
    _recursions++;
    return true;
  }

  if (current->is_lock_owned((address)cur)) {
    assert(_recursions == 0, "internal state error");
    _recursions = 1;
    set_owner_from_BasicLock(cur, current);  // Convert from BasicLock* to Thread*.
    return true;
  }

  // We've encountered genuine contention.
  assert(current->_Stalled == 0, "invariant");
  current->_Stalled = intptr_t(this);

  // Try one round of spinning *before* enqueueing current
  // and before going through the awkward and expensive state
  // transitions.  The following spin is strictly optional ...
  // Note that if we acquire the monitor from an initial spin
  // we forgo posting JVMTI events and firing DTRACE probes.
  if (TrySpin(current) > 0) {
    assert(owner_raw() == current, "must be current: owner=" INTPTR_FORMAT, p2i(owner_raw()));
    assert(_recursions == 0, "must be 0: recursions=" INTX_FORMAT, _recursions);
    assert(object()->mark() == markWord::encode(this),
           "object mark must match encoded this: mark=" INTPTR_FORMAT
           ", encoded this=" INTPTR_FORMAT, object()->mark().value(),
           markWord::encode(this).value());
    current->_Stalled = 0;
    return true;
  }

  assert(owner_raw() != current, "invariant");
  assert(_succ != current, "invariant");
  assert(!SafepointSynchronize::is_at_safepoint(), "invariant");
  assert(current->thread_state() != _thread_blocked, "invariant");

  // Keep track of contention for JVM/TI and M&amp;M queries.
  add_to_contentions(1);
  if (is_being_async_deflated()) {
    // Async deflation is in progress and our contentions increment
    // above lost the race to async deflation. Undo the work and
    // force the caller to retry.
    const oop l_object = object();
    if (l_object != NULL) {
      // Attempt to restore the header/dmw to the object's header so that
      // we only retry once if the deflater thread happens to be slow.
      install_displaced_markword_in_object(l_object);
    }
    current->_Stalled = 0;
    add_to_contentions(-1);
    return false;
  }

  { // Change java thread status to indicate blocked on monitor enter.
    JavaThreadBlockedOnMonitorEnterState jtbmes(current, this);

    assert(current->current_pending_monitor() == NULL, "invariant");
    current->set_current_pending_monitor(this);

    DTRACE_MONITOR_PROBE(contended__enter, this, object(), current);
    if (JvmtiExport::should_post_monitor_contended_enter()) {
      JvmtiExport::post_monitor_contended_enter(current, this);

      // The current thread does not yet own the monitor and does not
      // yet appear on any queues that would get it made the successor.
      // This means that the JVMTI_EVENT_MONITOR_CONTENDED_ENTER event
      // handler cannot accidentally consume an unpark() meant for the
      // ParkEvent associated with this ObjectMonitor.
    }

    OSThreadContendState osts(current->osthread());

    assert(current->thread_state() == _thread_in_vm, "invariant");

    for (;;) {
      ExitOnSuspend eos(this);
      {
        ThreadBlockInVMPreprocess&lt;ExitOnSuspend> tbivs(current, eos);
        EnterI(current);
        current->set_current_pending_monitor(NULL);
        // We can go to a safepoint at the end of this block. If we
        // do a thread dump during that safepoint, then this thread will show
        // as having "-locked" the monitor, but the OS and java.lang.Thread
        // states will still report that the thread is blocked trying to
        // acquire it.
        // If there is a suspend request, ExitOnSuspend will exit the OM
        // and set the OM as pending.
      }
      if (!eos.exited()) {
        // ExitOnSuspend did not exit the OM
        assert(owner_raw() == current, "invariant");
        break;
      }
    }

    // We've just gotten past the enter-check-for-suspend dance and we now own
    // the monitor free and clear.
  }

  add_to_contentions(-1);
  assert(contentions() >= 0, "must not be negative: contentions=%d", contentions());
  current->_Stalled = 0;

  // Must either set _recursions = 0 or ASSERT _recursions == 0.
  assert(_recursions == 0, "invariant");
  assert(owner_raw() == current, "invariant");
  assert(_succ != current, "invariant");
  assert(object()->mark() == markWord::encode(this), "invariant");

  return true;
}

# EnterI

再尝试TryLock，包含在后面的步骤中还会穿插尝试TryLock
创建ObjectWaiter等待节点，然后通过cas设置为新的_cxq头节点，cas成功完成入队。
如果ObjectWaiter的next为NULL并且_EntryList也是NULL，需要将自己设置成_Responsible，当前这种情况，已经持有锁的线程在释放锁时可能走快速锁释放流程导致等待锁的线程不能被唤醒，所以_Responsible会定期检查锁状态，即park阻塞会设置超时时间不断检查。
入队完成后，for循环内不断
- TryLock尝试加锁
- 加锁不成功current->_ParkEvent->park()，阻塞即挂起当前线程，内核可以调度运行其他线程。park恢复后（可能是锁释放了），会继续尝试加锁。

void ObjectMonitor::EnterI(JavaThread* current) {
  assert(current->thread_state() == _thread_blocked, "invariant");

  // Try the lock - TATAS
  if (TryLock (current) > 0) {
    assert(_succ != current, "invariant");
    assert(owner_raw() == current, "invariant");
    assert(_Responsible != current, "invariant");
    return;
  }

  if (try_set_owner_from(DEFLATER_MARKER, current) == DEFLATER_MARKER) {
    // Cancelled the in-progress async deflation by changing owner from
    // DEFLATER_MARKER to current. As part of the contended enter protocol,
    // contentions was incremented to a positive value before EnterI()
    // was called and that prevents the deflater thread from winning the
    // last part of the 2-part async deflation protocol. After EnterI()
    // returns to enter(), contentions is decremented because the caller
    // now owns the monitor. We bump contentions an extra time here to
    // prevent the deflater thread from winning the last part of the
    // 2-part async deflation protocol after the regular decrement
    // occurs in enter(). The deflater thread will decrement contentions
    // after it recognizes that the async deflation was cancelled.
    add_to_contentions(1);
    assert(_succ != current, "invariant");
    assert(_Responsible != current, "invariant");
    return;
  }

  assert(InitDone, "Unexpectedly not initialized");

  // We try one round of spinning *before* enqueueing current.
  //
  // If the _owner is ready but OFFPROC we could use a YieldTo()
  // operation to donate the remainder of this thread's quantum
  // to the owner.  This has subtle but beneficial affinity
  // effects.

  if (TrySpin(current) > 0) {
    assert(owner_raw() == current, "invariant");
    assert(_succ != current, "invariant");
    assert(_Responsible != current, "invariant");
    return;
  }

  // The Spin failed -- Enqueue and park the thread ...
  assert(_succ != current, "invariant");
  assert(owner_raw() != current, "invariant");
  assert(_Responsible != current, "invariant");

  // Enqueue "current" on ObjectMonitor's _cxq.
  //
  // Node acts as a proxy for current.
  // As an aside, if were to ever rewrite the synchronization code mostly
  // in Java, WaitNodes, ObjectMonitors, and Events would become 1st-class
  // Java objects.  This would avoid awkward lifecycle and liveness issues,
  // as well as eliminate a subset of ABA issues.
  // TODO: eliminate ObjectWaiter and enqueue either Threads or Events.

  ObjectWaiter node(current);
  current->_ParkEvent->reset();
  node._prev   = (ObjectWaiter*) 0xBAD;
  node.TState  = ObjectWaiter::TS_CXQ;

  // Push "current" onto the front of the _cxq.
  // Once on cxq/EntryList, current stays on-queue until it acquires the lock.
  // Note that spinning tends to reduce the rate at which threads
  // enqueue and dequeue on EntryList|cxq.
  ObjectWaiter* nxt;
  for (;;) {
    node._next = nxt = _cxq;
    if (Atomic::cmpxchg(&amp;_cxq, nxt, &amp;node) == nxt) break;

    // Interference - the CAS failed because _cxq changed.  Just retry.
    // As an optional optimization we retry the lock.
    if (TryLock (current) > 0) {
      assert(_succ != current, "invariant");
      assert(owner_raw() == current, "invariant");
      assert(_Responsible != current, "invariant");
      return;
    }
  }

  // Check for cxq|EntryList edge transition to non-null.  This indicates
  // the onset of contention.  While contention persists exiting threads
  // will use a ST:MEMBAR:LD 1-1 exit protocol.  When contention abates exit
  // operations revert to the faster 1-0 mode.  This enter operation may interleave
  // (race) a concurrent 1-0 exit operation, resulting in stranding, so we
  // arrange for one of the contending thread to use a timed park() operations
  // to detect and recover from the race.  (Stranding is form of progress failure
  // where the monitor is unlocked but all the contending threads remain parked).
  // That is, at least one of the contended threads will periodically poll _owner.
  // One of the contending threads will become the designated "Responsible" thread.
  // The Responsible thread uses a timed park instead of a normal indefinite park
  // operation -- it periodically wakes and checks for and recovers from potential
  // strandings admitted by 1-0 exit operations.   We need at most one Responsible
  // thread per-monitor at any given moment.  Only threads on cxq|EntryList may
  // be responsible for a monitor.
  //
  // Currently, one of the contended threads takes on the added role of "Responsible".
  // A viable alternative would be to use a dedicated "stranding checker" thread
  // that periodically iterated over all the threads (or active monitors) and unparked
  // successors where there was risk of stranding.  This would help eliminate the
  // timer scalability issues we see on some platforms as we'd only have one thread
  // -- the checker -- parked on a timer.

  if (nxt == NULL &amp;&amp; _EntryList == NULL) {
    // Try to assume the role of responsible thread for the monitor.
    // CONSIDER:  ST vs CAS vs { if (Responsible==null) Responsible=current }
    Atomic::replace_if_null(&amp;_Responsible, current);
  }

  // The lock might have been released while this thread was occupied queueing
  // itself onto _cxq.  To close the race and avoid "stranding" and
  // progress-liveness failure we must resample-retry _owner before parking.
  // Note the Dekker/Lamport duality: ST cxq; MEMBAR; LD Owner.
  // In this case the ST-MEMBAR is accomplished with CAS().
  //
  // TODO: Defer all thread state transitions until park-time.
  // Since state transitions are heavy and inefficient we'd like
  // to defer the state transitions until absolutely necessary,
  // and in doing so avoid some transitions ...

  int nWakeups = 0;
  int recheckInterval = 1;

  for (;;) {

    if (TryLock(current) > 0) break;
    assert(owner_raw() != current, "invariant");

    // park self
    if (_Responsible == current) {
      current->_ParkEvent->park((jlong) recheckInterval);
      // Increase the recheckInterval, but clamp the value.
      recheckInterval *= 8;
      if (recheckInterval > MAX_RECHECK_INTERVAL) {
        recheckInterval = MAX_RECHECK_INTERVAL;
      }
    } else {
      current->_ParkEvent->park();
    }

    if (TryLock(current) > 0) break;

    if (try_set_owner_from(DEFLATER_MARKER, current) == DEFLATER_MARKER) {
      // Cancelled the in-progress async deflation by changing owner from
      // DEFLATER_MARKER to current. As part of the contended enter protocol,
      // contentions was incremented to a positive value before EnterI()
      // was called and that prevents the deflater thread from winning the
      // last part of the 2-part async deflation protocol. After EnterI()
      // returns to enter(), contentions is decremented because the caller
      // now owns the monitor. We bump contentions an extra time here to
      // prevent the deflater thread from winning the last part of the
      // 2-part async deflation protocol after the regular decrement
      // occurs in enter(). The deflater thread will decrement contentions
      // after it recognizes that the async deflation was cancelled.
      add_to_contentions(1);
      break;
    }

    // The lock is still contested.
    // Keep a tally of the # of futile wakeups.
    // Note that the counter is not protected by a lock or updated by atomics.
    // That is by design - we trade "lossy" counters which are exposed to
    // races during updates for a lower probe effect.

    // This PerfData object can be used in parallel with a safepoint.
    // See the work around in PerfDataManager::destroy().
    OM_PERFDATA_OP(FutileWakeups, inc());
    ++nWakeups;

    // Assuming this is not a spurious wakeup we'll normally find _succ == current.
    // We can defer clearing _succ until after the spin completes
    // TrySpin() must tolerate being called with _succ == current.
    // Try yet another round of adaptive spinning.
    if (TrySpin(current) > 0) break;

    // We can find that we were unpark()ed and redesignated _succ while
    // we were spinning.  That's harmless.  If we iterate and call park(),
    // park() will consume the event and return immediately and we'll
    // just spin again.  This pattern can repeat, leaving _succ to simply
    // spin on a CPU.

    if (_succ == current) _succ = NULL;

    // Invariant: after clearing _succ a thread *must* retry _owner before parking.
    OrderAccess::fence();
  }

  // Egress :
  // current has acquired the lock -- Unlink current from the cxq or EntryList.
  // Normally we'll find current on the EntryList .
  // From the perspective of the lock owner (this thread), the
  // EntryList is stable and cxq is prepend-only.
  // The head of cxq is volatile but the interior is stable.
  // In addition, current.TState is stable.

  assert(owner_raw() == current, "invariant");

  UnlinkAfterAcquire(current, &amp;node);
  if (_succ == current) _succ = NULL;

  assert(_succ != current, "invariant");
  if (_Responsible == current) {
    _Responsible = NULL;
    OrderAccess::fence(); // Dekker pivot-point

    // We may leave threads on cxq|EntryList without a designated
    // "Responsible" thread.  This is benign.  When this thread subsequently
    // exits the monitor it can "see" such preexisting "old" threads --
    // threads that arrived on the cxq|EntryList before the fence, above --
    // by LDing cxq|EntryList.  Newly arrived threads -- that is, threads
    // that arrive on cxq after the ST:MEMBAR, above -- will set Responsible
    // non-null and elect a new "Responsible" timer thread.
    //
    // This thread executes:
    //    ST Responsible=null; MEMBAR    (in enter epilogue - here)
    //    LD cxq|EntryList               (in subsequent exit)
    //
    // Entering threads in the slow/contended path execute:
    //    ST cxq=nonnull; MEMBAR; LD Responsible (in enter prolog)
    //    The (ST cxq; MEMBAR) is accomplished with CAS().
    //
    // The MEMBAR, above, prevents the LD of cxq|EntryList in the subsequent
    // exit operation from floating above the ST Responsible=null.
  }

  // We've acquired ownership with CAS().
  // CAS is serializing -- it has MEMBAR/FENCE-equivalent semantics.
  // But since the CAS() this thread may have also stored into _succ,
  // EntryList, cxq or Responsible.  These meta-data updates must be
  // visible __before this thread subsequently drops the lock.
  // Consider what could occur if we didn't enforce this constraint --
  // STs to monitor meta-data and user-data could reorder with (become
  // visible after) the ST in exit that drops ownership of the lock.
  // Some other thread could then acquire the lock, but observe inconsistent
  // or old monitor meta-data and heap data.  That violates the JMM.
  // To that end, the 1-0 exit() operation must have at least STST|LDST
  // "release" barrier semantics.  Specifically, there must be at least a
  // STST|LDST barrier in exit() before the ST of null into _owner that drops
  // the lock.   The barrier ensures that changes to monitor meta-data and data
  // protected by the lock will be visible before we release the lock, and
  // therefore before some other thread (CPU) has a chance to acquire the lock.
  // See also: http://gee.cs.oswego.edu/dl/jmm/cookbook.html.
  //
  // Critically, any prior STs to _succ or EntryList must be visible before
  // the ST of null into _owner in the *subsequent* (following) corresponding
  // monitorexit.  Recall too, that in 1-0 mode monitorexit does not necessarily
  // execute a serializing instruction.

  return;
}

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253

# exit释放锁

# ObjectMonitor的exit

为了提高性能，ObjectMonitor的exit实现会做很多优化，比如如果锁冲突比较激烈， exit会倾向于只设置owner为NULL然后返回，因为在spin过程中的线程会通过cas获取到锁，不需要由释放锁的线程去唤醒。节省了unpark的唤醒开销。如果出现罕见的等待线程没被唤醒的情况，通过前面EnterI中的_Responsible机制保证等待线程能够获取到锁。

_succ字段（successor的意思），是用于减少多余的warkup调用量。

release_clear_owner(current)：设置owner为NULL
OrderAccess::storeload()：storeload保证之后的其他线程能够读取到owner的新值（即NULL）
(intptr_t(_EntryList)|intptr_t(_cxq)) == 0 || _succ != NULL)
- 说明要么两个队列都是空的，不需要唤醒，如果有并发情况（其他线程cas失败再入队，这里没有读取到），可以通过_Responsible机制保证等待线程不会无限等待。
- 如果队列不为空，并且_succ不为空，_succ不为空说明有其他线程是spin，不需要主动唤醒，可以直接返回。
try_set_owner_from(NULL, current) != NULL：说明修改_owner为NULL之后有其他线程获取了锁，可以退出。
判断_EntryList不为空，则ExitEpilog(current, w)设置_succ为EntryList头结点，并调用unpark唤醒。
如果_EntryList为空，但是_cxq不为空，则将_cxq队列元素转移到_EntryList，在通过ExitEpilog(current, w)唤醒EntryList等待的头结点。

  for (;;) {
    assert(current == owner_raw(), "invariant");

    // Drop the lock.
    // release semantics: prior loads and stores from within the critical section
    // must not float (reorder) past the following store that drops the lock.
    // Uses a storeload to separate release_store(owner) from the
    // successor check. The try_set_owner() below uses cmpxchg() so
    // we get the fence down there.
    release_clear_owner(current);
    OrderAccess::storeload();

    if ((intptr_t(_EntryList)|intptr_t(_cxq)) == 0 || _succ != NULL) {
      return;
    }
    // Other threads are blocked trying to acquire the lock.

    // Normally the exiting thread is responsible for ensuring succession,
    // but if other successors are ready or other entering threads are spinning
    // then this thread can simply store NULL into _owner and exit without
    // waking a successor.  The existence of spinners or ready successors
    // guarantees proper succession (liveness).  Responsibility passes to the
    // ready or running successors.  The exiting thread delegates the duty.
    // More precisely, if a successor already exists this thread is absolved
    // of the responsibility of waking (unparking) one.
    //
    // The _succ variable is critical to reducing futile wakeup frequency.
    // _succ identifies the "heir presumptive" thread that has been made
    // ready (unparked) but that has not yet run.  We need only one such
    // successor thread to guarantee progress.
    // See http://www.usenix.org/events/jvm01/full_papers/dice/dice.pdf
    // section 3.3 "Futile Wakeup Throttling" for details.
    //
    // Note that spinners in Enter() also set _succ non-null.
    // In the current implementation spinners opportunistically set
    // _succ so that exiting threads might avoid waking a successor.
    // Another less appealing alternative would be for the exiting thread
    // to drop the lock and then spin briefly to see if a spinner managed
    // to acquire the lock.  If so, the exiting thread could exit
    // immediately without waking a successor, otherwise the exiting
    // thread would need to dequeue and wake a successor.
    // (Note that we'd need to make the post-drop spin short, but no
    // shorter than the worst-case round-trip cache-line migration time.
    // The dropped lock needs to become visible to the spinner, and then
    // the acquisition of the lock by the spinner must become visible to
    // the exiting thread).

    // It appears that an heir-presumptive (successor) must be made ready.
    // Only the current lock owner can manipulate the EntryList or
    // drain _cxq, so we need to reacquire the lock.  If we fail
    // to reacquire the lock the responsibility for ensuring succession
    // falls to the new owner.
    //
    if (try_set_owner_from(NULL, current) != NULL) {
      return;
    }

    guarantee(owner_raw() == current, "invariant");

    ObjectWaiter* w = NULL;

    w = _EntryList;
    if (w != NULL) {
      // I'd like to write: guarantee (w->_thread != current).
      // But in practice an exiting thread may find itself on the EntryList.
      // Let's say thread T1 calls O.wait().  Wait() enqueues T1 on O's waitset and
      // then calls exit().  Exit release the lock by setting O._owner to NULL.
      // Let's say T1 then stalls.  T2 acquires O and calls O.notify().  The
      // notify() operation moves T1 from O's waitset to O's EntryList. T2 then
      // release the lock "O".  T2 resumes immediately after the ST of null into
      // _owner, above.  T2 notices that the EntryList is populated, so it
      // reacquires the lock and then finds itself on the EntryList.
      // Given all that, we have to tolerate the circumstance where "w" is
      // associated with current.
      assert(w->TState == ObjectWaiter::TS_ENTER, "invariant");
      ExitEpilog(current, w);
      return;
    }

    // If we find that both _cxq and EntryList are null then just
    // re-run the exit protocol from the top.
    w = _cxq;
    if (w == NULL) continue;

    // Drain _cxq into EntryList - bulk transfer.
    // First, detach _cxq.
    // The following loop is tantamount to: w = swap(&amp;cxq, NULL)
    for (;;) {
      assert(w != NULL, "Invariant");
      ObjectWaiter* u = Atomic::cmpxchg(&amp;_cxq, w, (ObjectWaiter*)NULL);
      if (u == w) break;
      w = u;
    }

    assert(w != NULL, "invariant");
    assert(_EntryList == NULL, "invariant");

    // Convert the LIFO SLL anchored by _cxq into a DLL.
    // The list reorganization step operates in O(LENGTH(w)) time.
    // It's critical that this step operate quickly as
    // "current" still holds the outer-lock, restricting parallelism
    // and effectively lengthening the critical section.
    // Invariant: s chases t chases u.
    // TODO-FIXME: consider changing EntryList from a DLL to a CDLL so
    // we have faster access to the tail.

    _EntryList = w;
    ObjectWaiter* q = NULL;
    ObjectWaiter* p;
    for (p = w; p != NULL; p = p->_next) {
      guarantee(p->TState == ObjectWaiter::TS_CXQ, "Invariant");
      p->TState = ObjectWaiter::TS_ENTER;
      p->_prev = q;
      q = p;
    }

    // In 1-0 mode we need: ST EntryList; MEMBAR #storestore; ST _owner = NULL
    // The MEMBAR is satisfied by the release_store() operation in ExitEpilog().

    // See if we can abdicate to a spinner instead of waking a thread.
    // A primary goal of the implementation is to reduce the
    // context-switch rate.
    if (_succ != NULL) continue;

    w = _EntryList;
    if (w != NULL) {
      guarantee(w->TState == ObjectWaiter::TS_ENTER, "invariant");
      ExitEpilog(current, w);
      return;
    }
  }

← synchronzier JDK Virtual Thread的实现 →