# synchronzier

# 术语

  • fast path: 通过cas等方式避免使用内核的重量级等待唤醒的操作(pthread_cond_wait)
  • slow path: cas自旋不成功,入队等待。
  • inflate/deflate: 膨胀、反膨胀
  • 轻量级锁: 通过cas目标对象的对象头为当前线程中的锁对象实现
  • 重量级锁: 当轻量级锁出现锁冲突时,会进入到重量级锁,即通过一个ObjectMonitor对象来维护锁数据,加锁过程可能涉及到系统调用挂起线程。

# 核心数据结构

# Java对象

每个Java对象,都是有对象头、内存数据、内存对齐(padding)构成的。(如果是数组会多一个4byte的数组长度)。 对象头中包含MarkWord和metadata。 MarkWord是标记字段,metadata指向class数据。

MarkWord作为Java对象的辅助字段,用于加锁、保存对象identity hashCode、在gc时标记存活对象。

MarkWord大小为一个Word,当内存大小不够时,会在其他内存位置保存MarkWord。比如MarkWord指向重量级锁时,JVM会先将MarkWord的内容copy到重量级锁ObjectMonitor中, 然后在将ObjectMonitor的地址设置到MarkWord,使用完再复制回来。

既然MarkWord有几种可能的用途,如何判断当前是哪一种呢? 是通过最后的两个bit来判断的。

  • [ptr | 00]:locked状态, ptr为指向线程栈中的LockRecord,原有的mark保存到了线程栈中的LockRecord中。
  • [header | 0 | 01]: unlocked状态,这是通常情况下的对象头未加锁状态
  • [ptr | 10]: monitored,ptr指向膨胀锁地址, 原有的mark被保存到了膨胀锁中。
  • [ptr | 11]: marked, GC过程中对象被标记为活跃对象、ptr为forward的地址
  • [0.....0| 00]: inflating, 正在膨胀过程中

这里ptr能够存储下是因为JVM内存分配使用8字节对齐。所以指针的最后3个bit为0

# ObjectMonitor

锁对象数据,其中核心的数据有

  • _owner: 当前锁的持有者、线程
  • _EntryList:等待队列
  • _cxq:最近到达的线程的等待队列,新的等待线程会加入到_cxq队列中,在monitor exit时,如果EntryList为空并且_cxq不为空则会批量的将_cxq中的线程转移到EntryList
  • _succ:下一个要唤醒的ObjectWaiter
  • _Responsible:为了解决极端情况下可能出现线程不能被唤醒的race condition,会有一个_Responsible负责轮训锁状态。

# 入口

在C1编译器编译后的入口为

JRT_BLOCK_ENTRY(void, Runtime1::monitorenter(JavaThread* current, oopDesc* obj, BasicObjectLock* lock))
  NOT_PRODUCT(_monitorenter_slowcase_cnt++;)
  if (!UseFastLocking) {
    lock->set_obj(obj);
  }
  assert(obj == lock->obj(), "must match");
  SharedRuntime::monitor_enter_helper(obj, lock->lock(), current);
JRT_END


JRT_LEAF(void, Runtime1::monitorexit(JavaThread* current, BasicObjectLock* lock))
  NOT_PRODUCT(_monitorexit_slowcase_cnt++;)
  assert(current->last_Java_sp(), "last_Java_sp must be set");
  oop obj = lock->obj();
  assert(oopDesc::is_oop(obj), "must be NULL or an object");
  SharedRuntime::monitor_exit_helper(obj, lock->lock(), current);
JRT_END

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18

如果当前没有在进入safepoint过程中,则会先调用ObjectSynchronizer::quick_enter尝试快速加锁。 因为quick_enter过程中不会检查safepoint状态,这样能让进入safepoint更快。

quick_enter会obj是否处于膨胀锁状态,如果是则会判断重入、尝试cas,成功返回,失败则会再调用ObjectSynchronizer::enter进行锁膨胀(inflation)加锁。

void SharedRuntime::monitor_enter_helper(oopDesc* obj, BasicLock* lock, JavaThread* current) {
  if (!SafepointSynchronize::is_synchronizing()) {
    // Only try quick_enter() if we're not trying to reach a safepoint
    // so that the calling thread reaches the safepoint more quickly.
    if (ObjectSynchronizer::quick_enter(obj, current, lock)) return;
  }
  // NO_ASYNC required because an async exception on the state transition destructor
  // would leave you with the lock held and it would never be released.
  // The normal monitorenter NullPointerException is thrown without acquiring a lock
  // and the model is that an exception implies the method failed.
  JRT_BLOCK_NO_ASYNC
  if (PrintBiasedLockingStatistics) {
    Atomic::inc(BiasedLocking::slow_path_entry_count_addr());
  }
  Handle h_obj(THREAD, obj);
  ObjectSynchronizer::enter(h_obj, lock, current);
  assert(!HAS_PENDING_EXCEPTION, "Should have no exception here");
  JRT_BLOCK_END
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19

quick_enter会判断重入、尝试cas修改object monitor的owner,如果成功说明加锁完成直接返回。

bool ObjectSynchronizer::quick_enter(oop obj, JavaThread* current,
                                     BasicLock * lock) {
  assert(current->thread_state() == _thread_in_Java, "invariant");
  NoSafepointVerifier nsv;
  if (obj == NULL) return false;       // Need to throw NPE

  if (obj->klass()->is_value_based()) {
    return false;
  }

  const markWord mark = obj->mark();

  if (mark.has_monitor()) {
    ObjectMonitor* const m = mark.monitor();
    // An async deflation or GC can race us before we manage to make
    // the ObjectMonitor busy by setting the owner below. If we detect
    // that race we just bail out to the slow-path here.
    if (m->object_peek() == NULL) {
      return false;
    }
    JavaThread* const owner = (JavaThread*) m->owner_raw();

    // Lock contention and Transactional Lock Elision (TLE) diagnostics
    // and observability
    // Case: light contention possibly amenable to TLE
    // Case: TLE inimical operations such as nested/recursive synchronization

    if (owner == current) {
      m->_recursions++;
      return true;
    }

    // This Java Monitor is inflated so obj's header will never be
    // displaced to this thread's BasicLock. Make the displaced header
    // non-NULL so this BasicLock is not seen as recursive nor as
    // being locked. We do this unconditionally so that this thread's
    // BasicLock cannot be mis-interpreted by any stack walkers. For
    // performance reasons, stack walkers generally first check for
    // Biased Locking in the object's header, the second check is for
    // stack-locking in the object's header, the third check is for
    // recursive stack-locking in the displaced header in the BasicLock,
    // and last are the inflated Java Monitor (ObjectMonitor) checks.
    lock->set_displaced_header(markWord::unused_mark());

    if (owner == NULL && m->try_set_owner_from(NULL, current) == NULL) {
      assert(m->_recursions == 0, "invariant");
      return true;
    }
  }

  // Note that we could inflate in quick_enter.
  // This is likely a useful optimization
  // Critically, in quick_enter() we must not:
  // -- perform bias revocation, or
  // -- block indefinitely, or
  // -- reach a safepoint

  return false;        // revert to slow-path
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59

# monitor enter

加锁流程: ObjectSynchronizer::enter

void ObjectSynchronizer::enter(Handle obj, BasicLock* lock, JavaThread* current)
1
// 获取monitor对象的markWord对象头中的mark标记
markWord mark = obj->mark();
1
2
// is_neutral表示当前没有加锁,mark bit中是未加锁状态
// 则会尝试轻量级锁加锁,即cas设置对象头为当前线程的BasicLock
if (mark.is_neutral()) {
    // Anticipate successful CAS -- the ST of the displaced mark must
    // be visible <= the ST performed by the CAS.
    // lock先保存monitor的对象头
    lock->set_displaced_header(mark);
    // 通过cas修改monitor对象的对象头为当前的lock,因为对象是8字节对齐,所以markWord最后两个bit肯定是00,是加锁状态。
    // mark == cas结果说明cas成功加锁完成直接返回。如果没cas成功,则会进入到inflate流程即锁膨胀流程
    if (mark == obj()->cas_set_mark(markWord::from_pointer(lock), mark)) {
      return;
    }
    // Fall through to inflate() ...
  } else if (mark.has_locker() &&
current->is_lock_owned((address)mark.locker())) {
    // 如果已经加锁,并且是当前线程加锁,返回。
    assert(lock != mark.locker(), "must not re-lock the same lock");
    assert(lock != (BasicLock*)obj->mark().value(), "don't relock with same BasicLock");
lock->set_displaced_header(markWord::from_pointer(NULL));
    return;
    }
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21

cas和判断重入都失败的情况,进入到最后的inflate即锁膨胀。 调用inflate创建ObjectMonitor对象,然后调用enter对象加锁。 直到enter返回true返回说明加锁成功。

// The object header will never be displaced to this lock,
// so it does not matter what the value is, except that it
// must be non-zero to avoid looking like a re-entrant lock,
// and must not look locked either.
lock->set_displaced_header(markWord::unused_mark());
// An async deflation can race after the inflate() call and before
// enter() can make the ObjectMonitor busy. enter() returns false if
// we have lost the race to async deflation and we simply try again.
while (true) {
    ObjectMonitor* monitor = inflate(current, obj(), inflate_cause_monitor_enter);
    if (monitor->enter(current)) {
      return;
    }
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14

再看一下ObjectMonitor的enter实现

先通过cas设置owner为当前线程对象,如果成功,返回true,表示加锁成功。

void* cur = try_set_owner_from(NULL, current);
  if (cur == NULL) {
    assert(_recursions == 0, "invariant");
    return true;
  }
1
2
3
4
5

cas失败还可能是重入,增加_recursions重入计数

if (cur == current) {
    // TODO-FIXME: check for integer overflow!  BUGID 6557169.
    _recursions++;
    return true;
  }
1
2
3
4
5
if (current->is_lock_owned((address)cur)) {
    assert(_recursions == 0, "internal state error");
    _recursions = 1;
    set_owner_from_BasicLock(cur, current);  // Convert from BasicLock* to Thread*.
    return true;
  }
1
2
3
4
5
6

尝试一轮自适应spin自旋,如果成功,说明加锁成功返回true。自旋的操作也是cas修改ObjectMonitor的owner为当前线程。

// Try one round of spinning *before* enqueueing current
  // and before going through the awkward and expensive state
  // transitions.  The following spin is strictly optional ...
  // Note that if we acquire the monitor from an initial spin
  // we forgo posting JVMTI events and firing DTRACE probes.
  if (TrySpin(current) > 0) {
    assert(owner_raw() == current, "must be current: owner=" INTPTR_FORMAT, p2i(owner_raw()));
    assert(_recursions == 0, "must be 0: recursions=" INTX_FORMAT, _recursions);
    assert(object()->mark() == markWord::encode(this),
           "object mark must match encoded this: mark=" INTPTR_FORMAT
           ", encoded this=" INTPTR_FORMAT, object()->mark().value(),
           markWord::encode(this).value());
    current->_Stalled = 0;
    return true;
  }
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15

FIXME TrySpin实现

接下来在循环中,进行线程阻塞等待。 ThreadBlockInVMPreprocess对象在析构时会检查当前safepoint状态,如果在safepoint中,线程会等待 safepoint结束再返回。

for (;;) {
  ExitOnSuspend eos(this);
  {
    ThreadBlockInVMPreprocess&lt;ExitOnSuspend> tbivs(current, eos);
    EnterI(current);
    current->set_current_pending_monitor(NULL);
    // We can go to a safepoint at the end of this block. If we
    // do a thread dump during that safepoint, then this thread will show
    // as having "-locked" the monitor, but the OS and java.lang.Thread
    // states will still report that the thread is blocked trying to
    // acquire it.
    // If there is a suspend request, ExitOnSuspend will exit the OM
    // and set the OM as pending.
  }
  if (!eos.exited()) {
    // ExitOnSuspend did not exit the OM
    assert(owner_raw() == current, "invariant");
    break;
  }
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20

下面进入到EnterI的实现。

void ObjectMonitor::EnterI(JavaThread* current) {
1

执行一次TryLock,如果成功返回。

if (TryLock (current) > 0) {
    assert(_succ != current, "invariant");
    assert(owner_raw() == current, "invariant");
    assert(_Responsible != current, "invariant");
    return;
  }
1
2
3
4
5
6

再进行一轮TrySpin

// We try one round of spinning *before* enqueueing current.
  //
  // If the _owner is ready but OFFPROC we could use a YieldTo()
  // operation to donate the remainder of this thread's quantum
  // to the owner.  This has subtle but beneficial affinity
  // effects.

  if (TrySpin(current) > 0) {
    assert(owner_raw() == current, "invariant");
    assert(_succ != current, "invariant");
    assert(_Responsible != current, "invariant");
    return;
  }
1
2
3
4
5
6
7
8
9
10
11
12
13

接下来,创建ObjectWaiter节点,加入到等待队列中,并park线程。

// Enqueue "current" on ObjectMonitor's _cxq.
  //
  // Node acts as a proxy for current.
  // As an aside, if were to ever rewrite the synchronization code mostly
  // in Java, WaitNodes, ObjectMonitors, and Events would become 1st-class
  // Java objects.  This would avoid awkward lifecycle and liveness issues,
  // as well as eliminate a subset of ABA issues.
  // TODO: eliminate ObjectWaiter and enqueue either Threads or Events.
  // 创建节点。
  ObjectWaiter node(current);
  current->_ParkEvent->reset();
  node._prev   = (ObjectWaiter*) 0xBAD;
  node.TState  = ObjectWaiter::TS_CXQ;

  // Push "current" onto the front of the _cxq.
  // Once on cxq/EntryList, current stays on-queue until it acquires the lock.
  // Note that spinning tends to reduce the rate at which threads
  // enqueue and dequeue on EntryList|cxq.
  // 通过cas 头结点入队。
  ObjectWaiter* nxt;
  for (;;) {
    node._next = nxt = _cxq;
    if (Atomic::cmpxchg(&_cxq, nxt, &node) == nxt) break;
    
          // cas失败再增加一次TryLock,目的是减少cas减少开销、提高加锁速度
    // Interference - the CAS failed because _cxq changed.  Just retry.
    // As an optional optimization we retry the lock.
    if (TryLock (current) > 0) {
      assert(_succ != current, "invariant");
      assert(owner_raw() == current, "invariant");
      assert(_Responsible != current, "invariant");
      return;
    }
  }
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34

接下里一段注释,表示enter和exit可能出现非常罕见的race condition并发问题,可能导致等待的线程不被唤醒, 这是因为有一些使用fast_unlock的地方,fast_unlock中的实现没有通过MEMBAR和CAS保证可见性,导致exit的线程可能看不到等待线程。 所以等待队列中的一个_Responsible线程使用的是一个不断timed park,也就是定时的park,万一出现并发问题有一个兜底能够自己唤醒检查能否加锁。

// Check for cxq|EntryList edge transition to non-null.  This indicates
  // the onset of contention.  While contention persists exiting threads
  // will use a ST:MEMBAR:LD 1-1 exit protocol.  When contention abates exit
  // operations revert to the faster 1-0 mode.  This enter operation may interleave
  // (race) a concurrent 1-0 exit operation, resulting in stranding, so we
  // arrange for one of the contending thread to use a timed park() operations
  // to detect and recover from the race.  (Stranding is form of progress failure
  // where the monitor is unlocked but all the contending threads remain parked).
  // That is, at least one of the contended threads will periodically poll _owner.
  // One of the contending threads will become the designated "Responsible" thread.
  // The Responsible thread uses a timed park instead of a normal indefinite park
  // operation -- it periodically wakes and checks for and recovers from potential
  // strandings admitted by 1-0 exit operations.   We need at most one Responsible
  // thread per-monitor at any given moment.  Only threads on cxq|EntryList may
  // be responsible for a monitor.
  //
  // Currently, one of the contended threads takes on the added role of "Responsible".
  // A viable alternative would be to use a dedicated "stranding checker" thread
  // that periodically iterated over all the threads (or active monitors) and unparked
  // successors where there was risk of stranding.  This would help eliminate the
  // timer scalability issues we see on some platforms as we'd only have one thread
  // -- the checker -- parked on a timer.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22

如果等待队列为空,则通过cas设置_Responsible为自己。

if (nxt == NULL && _EntryList == NULL) {
    // Try to assume the role of responsible thread for the monitor.
    // CONSIDER:  ST vs CAS vs { if (Responsible==null) Responsible=current }
    Atomic::replace_if_null(&_Responsible, current);
  }
1
2
3
4
5
 for (;;) {
    // 再掺杂一次TryLock,成功则break跳出循环
    if (TryLock(current) > 0) break;
    assert(owner_raw() != current, "invariant");

    // park self
    // 如果当前线程是_Responsible,则需要进行定时的park,超时时间从1ms开始,如果再重试则乘以8,直到最大的1s。
    if (_Responsible == current) {
      current->_ParkEvent->park((jlong) recheckInterval);
      // Increase the recheckInterval, but clamp the value.
      recheckInterval *= 8;
      if (recheckInterval > MAX_RECHECK_INTERVAL) {
        recheckInterval = MAX_RECHECK_INTERVAL;
      }
    } else {
        // 不是_Responsible节点是使用无超时时间的park
      current->_ParkEvent->park();
    }

    // 被唤醒后,再次调用TryLock尝试加锁。
    if (TryLock(current) > 0) break;

    if (try_set_owner_from(DEFLATER_MARKER, current) == DEFLATER_MARKER) {
      // Cancelled the in-progress async deflation by changing owner from
      // DEFLATER_MARKER to current. As part of the contended enter protocol,
      // contentions was incremented to a positive value before EnterI()
      // was called and that prevents the deflater thread from winning the
      // last part of the 2-part async deflation protocol. After EnterI()
      // returns to enter(), contentions is decremented because the caller
      // now owns the monitor. We bump contentions an extra time here to
      // prevent the deflater thread from winning the last part of the
      // 2-part async deflation protocol after the regular decrement
      // occurs in enter(). The deflater thread will decrement contentions
      // after it recognizes that the async deflation was cancelled.
      add_to_contentions(1);
      break;
    }

    // The lock is still contested.
    // Keep a tally of the # of futile wakeups.
    // Note that the counter is not protected by a lock or updated by atomics.
    // That is by design - we trade "lossy" counters which are exposed to
    // races during updates for a lower probe effect.

    // This PerfData object can be used in parallel with a safepoint.
    // See the work around in PerfDataManager::destroy().
    OM_PERFDATA_OP(FutileWakeups, inc());
    ++nWakeups;

    // Assuming this is not a spurious wakeup we'll normally find _succ == current.
    // We can defer clearing _succ until after the spin completes
    // TrySpin() must tolerate being called with _succ == current.
    // Try yet another round of adaptive spinning.
    // 尝试一轮自使用Spin自旋
    if (TrySpin(current) > 0) break;

    // We can find that we were unpark()ed and redesignated _succ while
    // we were spinning.  That's harmless.  If we iterate and call park(),
    // park() will consume the event and return immediately and we'll
    // just spin again.  This pattern can repeat, leaving _succ to simply
    // spin on a CPU.

    if (_succ == current) _succ = NULL;

    // Invariant: after clearing _succ a thread *must* retry _owner before parking.
    OrderAccess::fence();
  }
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67

执行到这里说明已经加锁成功,需要从等待队列中出队。

UnlinkAfterAcquire(current, &node);
  if (_succ == current) _succ = NULL;

  assert(_succ != current, "invariant");
  if (_Responsible == current) {
    _Responsible = NULL;
    OrderAccess::fence(); // Dekker pivot-point

    // We may leave threads on cxq|EntryList without a designated
    // "Responsible" thread.  This is benign.  When this thread subsequently
    // exits the monitor it can "see" such preexisting "old" threads --
    // threads that arrived on the cxq|EntryList before the fence, above --
    // by LDing cxq|EntryList.  Newly arrived threads -- that is, threads
    // that arrive on cxq after the ST:MEMBAR, above -- will set Responsible
    // non-null and elect a new "Responsible" timer thread.
    //
    // This thread executes:
    //    ST Responsible=null; MEMBAR    (in enter epilogue - here)
    //    LD cxq|EntryList               (in subsequent exit)
    //
    // Entering threads in the slow/contended path execute:
    //    ST cxq=nonnull; MEMBAR; LD Responsible (in enter prolog)
    //    The (ST cxq; MEMBAR) is accomplished with CAS().
    //
    // The MEMBAR, above, prevents the LD of cxq|EntryList in the subsequent
    // exit operation from floating above the ST Responsible=null.
  }

  // We've acquired ownership with CAS().
  // CAS is serializing -- it has MEMBAR/FENCE-equivalent semantics.
  // But since the CAS() this thread may have also stored into _succ,
  // EntryList, cxq or Responsible.  These meta-data updates must be
  // visible __before this thread subsequently drops the lock.
  // Consider what could occur if we didn't enforce this constraint --
  // STs to monitor meta-data and user-data could reorder with (become
  // visible after) the ST in exit that drops ownership of the lock.
  // Some other thread could then acquire the lock, but observe inconsistent
  // or old monitor meta-data and heap data.  That violates the JMM.
  // To that end, the 1-0 exit() operation must have at least STST|LDST
  // "release" barrier semantics.  Specifically, there must be at least a
  // STST|LDST barrier in exit() before the ST of null into _owner that drops
  // the lock.   The barrier ensures that changes to monitor meta-data and data
  // protected by the lock will be visible before we release the lock, and
  // therefore before some other thread (CPU) has a chance to acquire the lock.
  // See also: http://gee.cs.oswego.edu/dl/jmm/cookbook.html.
  //
  // Critically, any prior STs to _succ or EntryList must be visible before
  // the ST of null into _owner in the *subsequent* (following) corresponding
  // monitorexit.  Recall too, that in 1-0 mode monitorexit does not necessarily
  // execute a serializing instruction.

  return
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52

# monitor exit

exit也就是释放锁这里采用了一定的优化,减少cas/membar的使用,但是代价是存在非常罕见(extreme rare)的并发问题可能导致等待线程不被唤醒, 所以enter过程中使用了一个_Responsible机制定期检查。

重入情况的返回,如果_recursions不是0说明还是当前线程持有锁。

 if (_recursions != 0) {
    _recursions--;        // this is simple recursive enter
    return;
  }
1
2
3
4

设置_Responsible为NULL

// Invariant: after setting Responsible=null an thread must execute
  // a MEMBAR or other serializing instruction before fetching EntryList|cxq.
  _Responsible = NULL;
1
2
3
for (;;) {
    assert(current == owner_raw(), "invariant");

    // Drop the lock.
    // release semantics: prior loads and stores from within the critical section
    // must not float (reorder) past the following store that drops the lock.
    // Uses a storeload to separate release_store(owner) from the
    // successor check. The try_set_owner() below uses cmpxchg() so
    // we get the fence down there.
    // 设置owner为NULL
    release_clear_owner(current);
    
    // 防止重排序、保证可见性
    OrderAccess::storeload();

    if ((intptr_t(_EntryList)|intptr_t(_cxq)) == 0 || _succ != NULL) {
      return;
    }
    // Other threads are blocked trying to acquire the lock.

    // Normally the exiting thread is responsible for ensuring succession,
    // but if other successors are ready or other entering threads are spinning
    // then this thread can simply store NULL into _owner and exit without
    // waking a successor.  The existence of spinners or ready successors
    // guarantees proper succession (liveness).  Responsibility passes to the
    // ready or running successors.  The exiting thread delegates the duty.
    // More precisely, if a successor already exists this thread is absolved
    // of the responsibility of waking (unparking) one.
    //
    // The _succ variable is critical to reducing futile wakeup frequency.
    // _succ identifies the "heir presumptive" thread that has been made
    // ready (unparked) but that has not yet run.  We need only one such
    // successor thread to guarantee progress.
    // See http://www.usenix.org/events/jvm01/full_papers/dice/dice.pdf
    // section 3.3 "Futile Wakeup Throttling" for details.
    //
    // Note that spinners in Enter() also set _succ non-null.
    // In the current implementation spinners opportunistically set
    // _succ so that exiting threads might avoid waking a successor.
    // Another less appealing alternative would be for the exiting thread
    // to drop the lock and then spin briefly to see if a spinner managed
    // to acquire the lock.  If so, the exiting thread could exit
    // immediately without waking a successor, otherwise the exiting
    // thread would need to dequeue and wake a successor.
    // (Note that we'd need to make the post-drop spin short, but no
    // shorter than the worst-case round-trip cache-line migration time.
    // The dropped lock needs to become visible to the spinner, and then
    // the acquisition of the lock by the spinner must become visible to
    // the exiting thread).

    // It appears that an heir-presumptive (successor) must be made ready.
    // Only the current lock owner can manipulate the EntryList or
    // drain _cxq, so we need to reacquire the lock.  If we fail
    // to reacquire the lock the responsibility for ensuring succession
    // falls to the new owner.
    //
    // 尝试进行一次cas,如果失败,说明其他线程获得了锁,当前线程可以返回,不用负责进行唤醒。
    if (try_set_owner_from(NULL, current) != NULL) {
      return;
    }

    guarantee(owner_raw() == current, "invariant");

    ObjectWaiter* w = NULL;

    // 先读取EntryList列表
    w = _EntryList;
    if (w != NULL) {
      // I'd like to write: guarantee (w->_thread != current).
      // But in practice an exiting thread may find itself on the EntryList.
      // Let's say thread T1 calls O.wait().  Wait() enqueues T1 on O's waitset and
      // then calls exit().  Exit release the lock by setting O._owner to NULL.
      // Let's say T1 then stalls.  T2 acquires O and calls O.notify().  The
      // notify() operation moves T1 from O's waitset to O's EntryList. T2 then
      // release the lock "O".  T2 resumes immediately after the ST of null into
      // _owner, above.  T2 notices that the EntryList is populated, so it
      // reacquires the lock and then finds itself on the EntryList.
      // Given all that, we have to tolerate the circumstance where "w" is
      // associated with current.
      assert(w->TState == ObjectWaiter::TS_ENTER, "invariant");
      // EntryList不为空,则在ExitEpilog中通过unpark唤醒头结点。
      ExitEpilog(current, w);
      return;
    }

    // 如果EntryList为空,则尝试_cxq
    // If we find that both _cxq and EntryList are null then just
    // re-run the exit protocol from the top.
    w = _cxq;
    if (w == NULL) continue;
    
    // 如果_cxq不为空,则将_cxq的节点drain到EntryList中。
    // Drain _cxq into EntryList - bulk transfer.
    // First, detach _cxq.
    // The following loop is tantamount to: w = swap(&cxq, NULL)
    for (;;) {
      assert(w != NULL, "Invariant");
      ObjectWaiter* u = Atomic::cmpxchg(&_cxq, w, (ObjectWaiter*)NULL);
      if (u == w) break;
      w = u;
    }

    assert(w != NULL, "invariant");
    assert(_EntryList == NULL, "invariant");

    // Convert the LIFO SLL anchored by _cxq into a DLL.
    // The list reorganization step operates in O(LENGTH(w)) time.
    // It's critical that this step operate quickly as
    // "current" still holds the outer-lock, restricting parallelism
    // and effectively lengthening the critical section.
    // Invariant: s chases t chases u.
    // TODO-FIXME: consider changing EntryList from a DLL to a CDLL so
    // we have faster access to the tail.

    _EntryList = w;
    ObjectWaiter* q = NULL;
    ObjectWaiter* p;
    for (p = w; p != NULL; p = p->_next) {
      guarantee(p->TState == ObjectWaiter::TS_CXQ, "Invariant");
      p->TState = ObjectWaiter::TS_ENTER;
      p->_prev = q;
      q = p;
    }

    // In 1-0 mode we need: ST EntryList; MEMBAR #storestore; ST _owner = NULL
    // The MEMBAR is satisfied by the release_store() operation in ExitEpilog().

    // See if we can abdicate to a spinner instead of waking a thread.
    // A primary goal of the implementation is to reduce the
    // context-switch rate.
    if (_succ != NULL) continue;

    w = _EntryList;
    if (w != NULL) {
      guarantee(w->TState == ObjectWaiter::TS_ENTER, "invariant");
      ExitEpilog(current, w);
      return;
    }
  }
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139

再看一下ExitEpilog的实现

  • 设置_succ设置为被唤醒的线程,目的是减少大量无用的unpark
  • 设置owner为NULL,并通过OrderAccess::fence()防止重排序
  • 唤醒wakee
void ObjectMonitor::ExitEpilog(JavaThread* current, ObjectWaiter* Wakee) {

    // Exit protocol:
    // 1. ST _succ = wakee
    // 2. membar #loadstore|#storestore;
    // 2. ST _owner = NULL
    // 3. unpark(wakee)

    _succ = Wakee->_thread;
    ParkEvent * Trigger = Wakee->_event;

    // Hygiene -- once we've set _owner = NULL we can't safely dereference Wakee again.
    // The thread associated with Wakee may have grabbed the lock and "Wakee" may be
    // out-of-scope (non-extant).
    Wakee  = NULL;

    // Drop the lock.
    // Uses a fence to separate release_store(owner) from the LD in unpark().
    release_clear_owner(current);
    OrderAccess::fence();

    DTRACE_MONITOR_PROBE(contended__exit, this, object(), current);
    Trigger->unpark();

    // Maintain stats and report events to JVMTI
    OM_PERFDATA_OP(Parks, inc());
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27