HashMap源码笔记及分析

HashMap作为kv形式非常常用及方便,所以秉着知其所以然的态度,了解一下内部流程。

数据结构

HashMap的内部是数组+单链表,又叫哈希桶、哈希表、散列表。

看一眼结构图。

cmd-markdown-logo

图转自: https://www.jianshu.com/p/7dcff1fd05ad

结构图分三块,一一进行拆分:

  • 黄色区域:哈希桶中的数组
  • 绿色区域:哈希桶中的单链表
  • 黑红区域:哈希桶中的红黑树(JDK1.8之后增加) //先不看这个

源码常量

数据结构瞧一眼之后,记下,数组+单链表,后续咱们再把它带入到代码中。

下面介绍一下一些定义:

cmd-markdown-logo

一共有6个常量,目前只看这三个常量:

  • 初始长度: 理解为初始化的时候数组的length == 16
  • 最大长度: 桶不能无限大,不然服务炸了
  • 扩容比例:每次扩容75%后续

流程

现在按照一个流程来解读hashMap源码, 流程不搞复杂,简简单单的:

  1. Map map = new HashMap();
  2. map.put(0, 1);
  3. map.put(null, 2);

特意挑了两个特殊的key,用这三步来追踪代码。

Map map = new HashMap();
1
2
3
4
5
6
7
8
9
10
/**
* Constructs an empty <tt>HashMap</tt> with the default initial capacity
* (16) and the default load factor (0.75).
*
* 可以看出来,这个new Hash() 很简单(不像Spring 源码一样,new 个对象跳半天)
* 就是初始化了长度,初始化了扩容比例
*/
public HashMap() {
this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
}
map.put(0, 1);

对象已经初始化了,现在追踪put方法。

1
2
3
public V put(K key, V value) {
return putVal(hash(key), key, value, false, true);
}

调用了.putValue()方法,putValue()方法里hash(key), 追:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
/**
* Computes key.hashCode() and spreads (XORs) higher bits of hash
* to lower. Because the table uses power-of-two masking, sets of
* hashes that vary only in bits above the current mask will
* always collide. (Among known examples are sets of Float keys
* holding consecutive whole numbers in small tables.) So we
* apply a transform that spreads the impact of higher bits
* downward. There is a tradeoff between speed, utility, and
* quality of bit-spreading. Because many common sets of hashes
* are already reasonably distributed (so dont benefit from
* spreading), and because we use trees to handle large sets of
* collisions in bins, we just XOR some shifted bits in the
* cheapest possible way to reduce systematic lossage, as well as
* to incorporate impact of the highest bits that would otherwise
* never be used in index calculations because of table bounds.
*/

static final int hash(Object key) {
int h;
return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}

分析之前补充一下语法:

^异或运算符: 规则为1^0 = 1 , 1^1 = 0 , 0^1 = 1 , 0^0 = 0, 相同为0 不同为1

>>> 无符号右移运算符: 15 >>> 2 15是00001111 >>> 2 = 00000011; 15 >>> 3 = 00000001

继续,

看hash这个方法, 如果key == null 则赋值0, 所以hashMap,key 可以为空;

所以这个方法的意思是key 进行hash之后 再与 哈希之后的key右移16位, 在进行异或运算。

不深究,这一步是让hash更均匀。

继续,

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
/**
* Implements Map.put and related methods
*
* @param hash hash for key
* @param key the key
* @param value the value to put
* @param onlyIfAbsent if true, dont change existing value
* @param evict if false, the table is in creation mode.
* @return previous value, or null if none
*/
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
boolean evict) {

Node<K,V>[] tab; Node<K,V> p; int n, i;
if ((tab = table) == null || (n = tab.length) == 0)
n = (tab = resize()).length;
if ((p = tab[i = (n - 1) & hash]) == null)
tab[i] = newNode(hash, key, value, null);
else {
Node<K,V> e; K k;
if (p.hash == hash &&
((k = p.key) == key || (key != null && key.equals(k))))
e = p;
else if (p instanceof TreeNode)
e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
else {
for (int binCount = 0; ; ++binCount) {
if ((e = p.next) == null) {
p.next = newNode(hash, key, value, null);
if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
treeifyBin(tab, hash);
break;
}
if (e.hash == hash &&
((k = e.key) == key || (key != null && key.equals(k))))
break;
p = e;
}
}
if (e != null) { // existing mapping for key
V oldValue = e.value;
if (!onlyIfAbsent || oldValue == null)
e.value = value;
afterNodeAccess(e);
return oldValue;
}
}
++modCount;
if (++size > threshold)
resize();
afterNodeInsertion(evict);
return null;
}