开发者

Maximum size of HashSet, Vector, LinkedList

开发者 https://www.devze.com 2023-04-09 22:01 出处:网络
What is the maximum size of HashSet, Vector, LinkedList? I know that ArrayList can store more than 3277000 numbers.

What is the maximum size of HashSet, Vector, LinkedList? I know that ArrayList can store more than 3277000 numbers.

However the size of list depends on the memory (heap) size. If it reaches m开发者_StackOverflow社区aximum the JDK throws an OutOfMemoryError.

But I don't know the limit for the number of elements in HashSet, Vector and LinkedList.


There is no specified maximum size of these structures.

The actual practical size limit is probably somewhere in the region of Integer.MAX_VALUE (i.e. 2147483647, roughly 2 billion elements), as that's the maximum size of an array in Java.

  • A HashSet uses a HashMap internally, so it has the same maximum size as that
    • A HashMap uses an array which always has a size that is a power of two, so it can be at most 230 = 1073741824 elements big (since the next power of two is bigger than Integer.MAX_VALUE).
    • Normally the number of elements is at most the number of buckets multiplied by the load factor (0.75 by default). However, when the HashMap stops resizing, then it will still allow you to add elements, exploiting the fact that each bucket is managed via a linked list. Therefore the only limit for elements in a HashMap/HashSet is memory.
  • A Vector uses an array internally which has a maximum size of exactly Integer.MAX_VALUE, so it can't support more than that many elements
  • A LinkedList doesn't use an array as the underlying storage, so that doesn't limit the size. It uses a classical doubly linked list structure with no inherent limit, so its size is only bounded by the available memory. Note that a LinkedList will report the size wrongly if it is bigger than Integer.MAX_VALUE, because it uses a int field to store the size and the return type of size() is int as well.

Note that while the Collection API does define how a Collection with more than Integer.MAX_VALUE elements should behave. Most importantly it states this the size() documentation:

If this collection contains more than Integer.MAX_VALUE elements, returns Integer.MAX_VALUE.

Note that while HashMap, HashSet and LinkedList seem to support more than Integer.MAX_VALUE elements, none of those implement the size() method in this way (i.e. they simply let the internal size field overflow).

This leads me to believe that other operations also aren't well-defined in this condition.

So I'd say it's safe to use those general-purpose collections with up to Integer.MAX_VLAUE elements. If you know that you'll need to store more than that, then you should switch to dedicated collection implementations that actually support this.


In all cases, you're likely to be limited by the JVM heap size rather than anything else. Eventually you'll always get down to arrays so I very much doubt that any of them will manage more than 231 - 1 elements, but you're very, very likely to run out of heap before then anyway.


It very much depends on the implementation details.

A HashSet uses an array as an underlying store which by default it attempt to grow when the collection is 75% full. This means it will fail if you try to add more than about 750,000,000 entries. (It cannot grow the array from 2^30 to 2^31 entries)

Increasing the load factor increases the maximum size of the collection. e.g. a load factor of 10 allows 10 billion elements. (It is worth noting that HashSet is relatively inefficient past 100 million elements as the distribution of the 32-bit hashcode starts to look less random, and the number of collisions increases)

A Vector doubles its capacity and starts at 10. This means it will fail to grow above approx 1.34 billion. Changing the initial size to 2^n-1 gives you slightly more head room.

BTW: Use ArrayList rather than Vector if you can.

A LinkedList has no inherent limit and can grow beyond 2.1 billion. At this point size() could return Integer.MAX_VALUE, however some functions such as toArray will fail as it cannot put all objects into an array, in will instead give you the first Integer.MAX_VALUE rather than throw an exception.

As @Joachim Sauer notes, the current OpenJDK could return an incorrect result for sizes above Integer.MAX_VALUE. e.g. it could be a negative number.


The maximum size depends on the memory settings of the JVM and of course the available system memory. Specific size of memory consumption per list entry also differs between platforms, so the easiest way might be to run simple tests.


As stated in other answers, an array cannot reach 2^31 entries. Other data types are limited either by this or they will likely misreport their size() eventually. However, these theoretical limits cannot be reached on some systems:

On a 32 bit system, the number of bytes available never exceeds 2^32 exactly. And that is assuming that you have no operating system taking up memory. A 32 bit pointer is 4 bytes. Anything which does not rely on arrays must include at least one pointer per entry: this means that the maximum number of entries is 2^32/4 or 2^30 for things that do not utilize arrays.

A plain array can achieve it's theoretical limit, but only a byte array, a short array of length 2^31-1 would use up about 2^32+38 bytes.

Some java VMs have introduced a new memory model that uses compressed pointers. By adjusting pointer alignment, slightly more than 2^32 bytes may be referenced with 32 byte pointers. Around four times more. This is enough to cause a LinkedList size() to become negative, but not enough to allow it to wrap around to zero.

A sixty four bit system has sixty four bit pointers, making all pointers twice as big, making non array lists a bunch fatter. This also means that the maximum capacity supported jumps to 2^64 bytes exactly. This is enough for a 2D array to reach its theoretical maximum. byte[0x7fffffff][0x7fffffff] uses memory apporximately equal to 40+40*(2^31-1)+(2^31-1)(2^31-1)=40+40(2^31-1)+(2^62-2^32+1)

0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号