/**
 * Constructs an empty <tt>HashMap</tt> with the specified initial
 * capacity and load factor.
 *
 * @param initialCapacity the initial capacity
 * @param loadFactor      the load factor
 * @throws IllegalArgumentException if the initial capacity is negative
 *         or the load factor is nonpositive
 */
public HashMap(int initialCapacity, float loadFactor) {
    // Reject a negative initial capacity
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal initial capacity: " +
                                           initialCapacity);
    // Clamp to the maximum capacity 2^30 to prevent overflow
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    // Reject a nonpositive or NaN load factor
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal load factor: " +
                                           loadFactor);
    this.loadFactor = loadFactor;
    // threshold is the resize threshold; note that it changes later.
    // Here it temporarily holds the initial table capacity.
    this.threshold = tableSizeFor(initialCapacity);
}

/**
 * Returns a power of two size for the given target capacity.
 */
static final int tableSizeFor(int cap) {
    int n = cap - 1;
    n |= n >>> 1;
    n |= n >>> 2;
    n |= n >>> 4;
    n |= n >>> 8;
    n |= n >>> 16;
    return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
}
Example: HashMap<String, Object> hashmap = new HashMap<>(17);
cap = 17, so int n = cap - 1 = 16.

Step 1: unsigned right shift by 1
00000000000000000000000000010000    n = 16
00000000000000000000000000001000    n >>> 1
------------------------------------------------------
00000000000000000000000000011000    n = 24

Step 2: unsigned right shift by 2
00000000000000000000000000011000    n = 24
00000000000000000000000000000110    n >>> 2
------------------------------------------------------
00000000000000000000000000011110    n = 30

Step 3: unsigned right shift by 4
00000000000000000000000000011110    n = 30
00000000000000000000000000000001    n >>> 4
------------------------------------------------------
00000000000000000000000000011111    n = 31

Step 4: unsigned right shift by 8
00000000000000000000000000011111    n = 31
00000000000000000000000000000000    n >>> 8
------------------------------------------------------
00000000000000000000000000011111    n = 31

The final shift by 16 also leaves n = 31, so the method returns n + 1 = 32, the smallest power of two >= 17.

For the maximum case, cap = MAXIMUM_CAPACITY = 2^30:
00100000000000000000000000000000    cap = 2^30
int n = cap - 1;
00111111111111111111111111111111    n = 2^30 - 1
00011111111111111111111111111111    n >>> 1
-------------------------------------------
00111111111111111111111111111111    n is still 2^30 - 1

Every bit below the top bit is already set, so the remaining shifts change nothing and the method returns n + 1 = 2^30 = MAXIMUM_CAPACITY.
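The traces above can be checked in isolation. The sketch below copies the OR-and-shift cascade from HashMap.tableSizeFor into a standalone class (TableSizeForDemo is my own wrapper name, not JDK code):

```java
public class TableSizeForDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Copy of HashMap.tableSizeFor: after the five shifts every bit
    // below the highest set bit of (cap - 1) is set, so n + 1 is the
    // smallest power of two >= cap.
    static int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        System.out.println(tableSizeFor(17));      // 32, as traced above
        System.out.println(tableSizeFor(16));      // 16: cap - 1 keeps exact powers of two
        System.out.println(tableSizeFor(1 << 30)); // clamped to MAXIMUM_CAPACITY
    }
}
```

Note the `cap - 1` on the first line: without it, an input that is already a power of two (such as 16) would be doubled to the next power.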
/**
 * The bin count threshold for using a tree rather than list for a
 * bin. Bins are converted to trees when adding an element to a
 * bin with at least this many nodes. The value must be greater
 * than 2 and should be at least 8 to mesh with assumptions in
 * tree removal about conversion back to plain bins upon
 * shrinkage.
 */
static final int TREEIFY_THRESHOLD = 8;
 * Because TreeNodes are about twice the size of regular nodes, we
 * use them only when bins contain enough nodes to warrant use
 * (see TREEIFY_THRESHOLD). And when they become too small (due to
 * removal or resizing) they are converted back to plain bins. In
 * usages with well-distributed user hashCodes, tree bins are
 * rarely used. Ideally, under random hashCodes, the frequency of
 * nodes in bins follows a Poisson distribution
 * (http://en.wikipedia.org/wiki/Poisson_distribution) with a
 * parameter of about 0.5 on average for the default resizing
 * threshold of 0.75, although with a large variance because of
 * resizing granularity. Ignoring variance, the expected
 * occurrences of list size k are (exp(-0.5) * pow(0.5, k) /
 * factorial(k)). The first values are:
 *
 * 0: 0.60653066
 * 1: 0.30326533
 * 2: 0.07581633
 * 3: 0.01263606
 * 4: 0.00157952
 * 5: 0.00015795
 * 6: 0.00001316
 * 7: 0.00000094
 * 8: 0.00000006
 * more: less than 1 in ten million
 *
 * The root of a tree bin is normally its first node. However,
 * sometimes (currently only upon Iterator.remove), the root might
 * be elsewhere, but can be recovered following parent links
 * (method TreeNode.root()).
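The probabilities quoted in the comment are easy to reproduce. A minimal sketch (PoissonDemo is my own helper, not JDK code) evaluating exp(-0.5) * pow(0.5, k) / factorial(k):

```java
public class PoissonDemo {
    // P(a bin holds k nodes) under the comment's Poisson(0.5) model
    static double poisson(int k) {
        double factorial = 1.0;
        for (int i = 2; i <= k; i++)
            factorial *= i;
        return Math.exp(-0.5) * Math.pow(0.5, k) / factorial;
    }

    public static void main(String[] args) {
        for (int k = 0; k <= 8; k++)
            System.out.printf("%d: %.8f%n", k, poisson(k));
        // k = 8 comes out around 6e-8 (roughly 1 in 16 million),
        // which is why TREEIFY_THRESHOLD = 8 is almost never reached
        // when user hashCodes are well distributed.
    }
}
```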
/**
 * The bin count threshold for untreeifying a (split) bin during a
 * resize operation. Should be less than TREEIFY_THRESHOLD, and at
 * most 6 to mesh with shrinkage detection under removal.
 * When the number of nodes in a bucket drops to this value, the tree
 * is converted back to a linked list.
 */
static final int UNTREEIFY_THRESHOLD = 6;
1.6 Minimum table length for converting a linked list to a red-black tree
/**
 * The smallest table capacity for which bins may be treeified.
 * (Otherwise the table is resized if too many nodes in a bin.)
 * Should be at least 4 * TREEIFY_THRESHOLD to avoid conflicts
 * between resizing and treeification thresholds.
 */
static final int MIN_TREEIFY_CAPACITY = 64;
1.7 table: the storage array, initialized on first use
The table length must always be a power of two
/**
 * The table, initialized on first use, and resized as
 * necessary. When allocated, length is always a power of two.
 * (We also tolerate length zero in some operations to allow
 * bootstrapping mechanics that are currently not needed.)
 */
transient Node<K,V>[] table;
/**
 * Holds cached entrySet(). Note that AbstractMap fields are used
 * for keySet() and values().
 */
transient Set<Map.Entry<K,V>> entrySet;
1.9 The number of elements in the HashMap
/**
 * The number of key-value mappings contained in this map.
 */
transient int size;
1.10 Records the number of structural modifications to the HashMap
/**
 * The number of times this HashMap has been structurally modified.
 * Structural modifications are those that change the number of mappings in
 * the HashMap or otherwise modify its internal structure (e.g.,
 * rehash). This field is used to make iterators on Collection-views of
 * the HashMap fail-fast. (See ConcurrentModificationException).
 */
transient int modCount;
1.11 The next size value at which to resize (capacity * load factor)
/**
 * The next size value at which to resize (capacity * load factor).
 * When the number of mappings exceeds this threshold, the table is
 * resized (expanded).
 *
 * @serial
 */
// (The javadoc description is true upon serialization.
// Additionally, if the table array has not been allocated, this
// field holds the initial array capacity, or zero signifying
// DEFAULT_INITIAL_CAPACITY.)
int threshold;
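With the default capacity and load factor, the threshold works out as follows (the constant values 16 and 0.75f are from the HashMap source; the class name is mine):

```java
public class ThresholdDemo {
    public static void main(String[] args) {
        int capacity = 16;        // DEFAULT_INITIAL_CAPACITY
        float loadFactor = 0.75f; // DEFAULT_LOAD_FACTOR
        int threshold = (int) (capacity * loadFactor);
        // 12: the put that grows size past 12 triggers resize(),
        // doubling the table to 32 and the threshold to 24
        System.out.println(threshold);
    }
}
```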
/**
 * Basic hash bin node, used for most entries. (See below for
 * TreeNode subclass, and in LinkedHashMap for its Entry subclass.)
 */
static class Node<K,V> implements Map.Entry<K,V> {
    final int hash;
    final K key;
    V value;
    Node<K,V> next;
/**
 * Entry for Tree bins. Extends LinkedHashMap.Entry (which in turn
 * extends Node) so can be used as extension of either regular or
 * linked node.
 */
static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V> {
    TreeNode<K,V> parent;  // red-black tree links
    TreeNode<K,V> left;
    TreeNode<K,V> right;
    TreeNode<K,V> prev;    // needed to unlink next upon deletion
    boolean red;
    TreeNode(int hash, K key, V val, Node<K,V> next) {
        super(hash, key, val, next);
    }
/**
 * Returns root of tree containing this node.
 */
final TreeNode<K,V> root() {
    for (TreeNode<K,V> r = this, p;;) {
        if ((p = r.parent) == null)
            return r;
        r = p;
    }
}
/**
 * The method that computes the hash value.
 * Computes key.hashCode() and spreads (XORs) higher bits of hash
 * to lower. Because the table uses power-of-two masking, sets of
 * hashes that vary only in bits above the current mask will
 * always collide. (Among known examples are sets of Float keys
 * holding consecutive whole numbers in small tables.) So we
 * apply a transform that spreads the impact of higher bits
 * downward. There is a tradeoff between speed, utility, and
 * quality of bit-spreading. Because many common sets of hashes
 * are already reasonably distributed (so don't benefit from
 * spreading), and because we use trees to handle large sets of
 * collisions in bins, we just XOR some shifted bits in the
 * cheapest possible way to reduce systematic lossage, as well as
 * to incorporate impact of the highest bits that would otherwise
 * never be used in index calculations because of table bounds.
 */
static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
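The effect of the XOR spread shows up with hash codes that differ only in their high bits. In the sketch below, hash() is copied from the source; the table length of 16 and the two sample hash codes are my own choices for illustration:

```java
public class HashSpreadDemo {
    // Copy of HashMap.hash: fold the high 16 bits into the low 16
    static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    public static void main(String[] args) {
        int n = 16;       // table length, always a power of two
        int h1 = 0x10000; // two hash codes that differ only above bit 4
        int h2 = 0x20000;
        // Without spreading, the power-of-two mask discards the high
        // bits, so both keys collide in bucket 0:
        System.out.println((n - 1) & h1); // 0
        System.out.println((n - 1) & h2); // 0
        // After the XOR spread, the high bits reach the index:
        System.out.println((n - 1) & (h1 ^ (h1 >>> 16))); // 1
        System.out.println((n - 1) & (h2 ^ (h2 >>> 16))); // 2
    }
}
```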
/**
 * Associates the specified value with the specified key in this map.
 * If the map previously contained a mapping for the key, the old
 * value is replaced.
 *
 * @param key key with which the specified value is to be associated
 * @param value value to be associated with the specified key
 * @return the previous value associated with <tt>key</tt>, or
 *         <tt>null</tt> if there was no mapping for <tt>key</tt>.
 *         (A <tt>null</tt> return can also indicate that the map
 *         previously associated <tt>null</tt> with <tt>key</tt>.)
 */
public V put(K key, V value) {
    return putVal(hash(key), key, value, false, true);
}
/**
 * Implements Map.put and related methods.
 *
 * @param hash hash for key
 * @param key the key
 * @param value the value to put
 * @param onlyIfAbsent if true, don't change existing value
 * @param evict if false, the table is in creation mode
 * @return previous value, or null if none
 */
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    // p is the node at the target bucket; n is the table length
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    // The table is initialized lazily: the first put lands here
    // and allocates the array via resize()
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    // The index is (n - 1) & hash, a bitwise AND of the spread hash
    // with the table mask; if that slot is empty, store the node directly
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null);
    else {
        // Hash collision: the slot already holds a node
        Node<K,V> e; K k;
        // p was assigned in tab[i = (n - 1) & hash] above even though
        // that if-body did not run; check the first node by hash and key
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        // If the bin is already a red-black tree, insert via the tree path
        else if (p instanceof TreeNode)
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        else {
            // Traverse the list and append at the tail (JDK 8 uses tail
            // insertion, unlike JDK 7's head insertion). The list must be
            // walked anyway, both to count nodes against the treeify
            // threshold and to compare keys for an existing mapping, so
            // the new element is inserted during the same pass.
            for (int binCount = 0; ; ++binCount) {
                // Reached the tail: append the new node
                if ((e = p.next) == null) {
                    p.next = newNode(hash, key, value, null);
                    // If the bin now holds at least TREEIFY_THRESHOLD (8)
                    // nodes, convert the list to a red-black tree
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash);
                    break;
                }
                // Otherwise compare hash and key with the next node
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                p = e;
            }
        }
        // The key already exists: replace the old value with the new one
        if (e != null) { // existing mapping for key
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount;
    // If the number of mappings exceeds the resize threshold, grow the table
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    return null;
}
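The return-value semantics of putVal, replacing an existing mapping and handing back the old value, can be observed directly:

```java
import java.util.HashMap;

public class PutDemo {
    public static void main(String[] args) {
        HashMap<String, Integer> map = new HashMap<>();
        // First insert: no existing mapping, so put returns null
        System.out.println(map.put("a", 1)); // null
        // Same key again: the e != null branch in putVal replaces
        // the value and returns the old one; size stays at 1
        System.out.println(map.put("a", 2)); // 1
        System.out.println(map.get("a"));    // 2
        System.out.println(map.size());      // 1
    }
}
```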
"This is a classic symptom of an incorrectly synchronized use of HashMap. Clearly, the submitters need to use a thread-safe HashMap. If they upgraded to Java 5, they could just use ConcurrentHashMap. If they can't do this yet, they can use either the pre-JSR166 version, or better, the unofficial backport as mentioned by Martin. If they can't do any of these, they can use Hashtable or synchronizedMap wrappers, and live with poorer performance. In any case, it's not a JDK or JVM bug."
I agree that the presence of a corrupted data structure alone does not indicate a bug in the JDK.