
What are the disadvantages to hashmaps? [closed]

Developer https://www.devze.com 2023-03-25 02:59 Source: Internet
Closed. This question needs to be more focused. It is not currently accepting answers.

Want to improve this question? Update the question so it focuses on one problem only by editing this post.

Closed 3 years ago.


Whatever language I use, I always aim to use the equivalent of a hashmap. However, I was going through some practice interview questions, and one of them asked what the limitations of a hashmap are.

The only reason I could think of is limited main memory, but then that wouldn't be limited only to hashmaps, but also ArrayLists etc etc.


  1. Whilst hash-tables have constant time insertion on average, a hash-table will occasionally need to grow its internal structure and re-bucket its entries. This operation has a cost proportional to the current size of the hash-table. As a result, insertion time is not always consistent: insertion is amortized O(1), but occasionally you will notice a linear, O(n), delay as the table is grown. (This behaviour has led some to suggest favouring a tree over a hash-table in the default/naïve case.)
  2. You need to make sure the hashing algorithm of the items you are adding is sound. This means that, for an arbitrary set of elements, the resultant hash-codes are spread well across the range of the hash-code type (in Java and C# this is int). If many items share the same hash-code (zero, anyone?) then your hash-table will degrade into an elaborate linked-list and performance will dramatically decrease.
  3. You need to ensure that the hash-code of your items does not change over time and that the equality method (Java's equals() or .NET's Equals()) is implemented to compare the same set of fields used for the hash-code. Ideally this would mean the objects you add to the table are immutable. Alternatively, you may make sure that any mutable fields have no bearing on the hash-code calculation and the equals method, but that is a risky strategy. With changing hash-codes the table will not be able to find the entries you have already added to it when you later come to retrieve them.
  4. Hash-tables do not, generally, preserve ordering -- be it natural ordering or order of insertion. (Those that do typically employ a parallel structure to maintain the ordering, or else perform a relatively expensive sort at time of iteration.)
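
To make point 2 concrete, here is a minimal Java sketch. BadKey is a made-up key type (not from any library) whose hashCode() always returns 0, so every instance lands in the same bucket. The map still returns correct answers; what suffers is the lookup cost, because equals() now has to do the work the hash was supposed to do.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class CollidingKeyDemo {
    /** Hypothetical key type: every instance reports the same hash code. */
    static final class BadKey {
        final int id;

        BadKey(int id) {
            this.id = id;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id == id;
        }

        @Override
        public int hashCode() {
            return 0; // legal, but every key collides
        }
    }

    /** Counts how many distinct hash codes n keys produce. */
    static int distinctHashCodes(int n, boolean bad) {
        Set<Integer> codes = new HashSet<>();
        for (int i = 0; i < n; i++) {
            Object key = bad ? new BadKey(i) : Integer.valueOf(i);
            codes.add(key.hashCode());
        }
        return codes.size();
    }

    /** The map stays correct with colliding keys; only the lookup cost grows. */
    static int lookup(int n, int id) {
        Map<BadKey, Integer> map = new HashMap<>();
        for (int i = 0; i < n; i++) {
            map.put(new BadKey(i), i * 10);
        }
        return map.get(new BadKey(id));
    }

    public static void main(String[] args) {
        System.out.println(distinctHashCodes(1000, true));  // 1
        System.out.println(distinctHashCodes(1000, false)); // 1000
        System.out.println(lookup(1000, 7));                // 70
    }
}
```

Java's HashMap (since Java 8) softens this by converting long buckets into trees, but lookups are still slower than with well-spread hash codes.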

See also:

  • http://web.archive.org/web/20090430172748/http://enfranchisedmind.com/blog/posts/problems-with-hash-tables/
  • http://en.wikipedia.org/wiki/Hash_table#Drawbacks


Use the right data structure for the right job. If you don't need access by a key, don't use a Map.

In terms of HashMap limitations, I guess it can suffer if items have a bad hashing algorithm, but that's about it.


Chained hash tables also inherit the disadvantages of linked lists. When storing small keys and values, the space overhead of the next pointer in each entry record can be significant. An additional disadvantage is that traversing a linked list has poor cache performance, making the processor cache ineffective.

from Wikipedia - Hash Tables


One (very important) limitation is that you shouldn't use them with types that have unstable (mutable) hashcodes. Here's Eric Lippert on the subject.
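
A small Java sketch of what goes wrong, using an ArrayList as the key purely for illustration (its hashCode() depends on its contents). The class and method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MutableKeyDemo {
    /** Builds a map with one entry, then mutates the key object in place. */
    static Map<List<String>, String> mapWithMutatedKey(List<String> key) {
        Map<List<String>, String> map = new HashMap<>();
        key.add("a");
        map.put(key, "value"); // stored under the hash of ["a"]
        key.add("b");          // key is now ["a", "b"]; its hash code changed
        return map;
    }

    /** Looking up with the very same (mutated) key object fails. */
    static boolean foundBySameObject() {
        List<String> key = new ArrayList<>();
        Map<List<String>, String> map = mapWithMutatedKey(key);
        return map.containsKey(key);
    }

    /** Looking up with a fresh list equal to the original contents fails too. */
    static boolean foundByOriginalContents() {
        Map<List<String>, String> map = mapWithMutatedKey(new ArrayList<>());
        return map.containsKey(Arrays.asList("a"));
    }

    /** The entry is still in the map; it just cannot be reached by key. */
    static int sizeAfterMutation() {
        return mapWithMutatedKey(new ArrayList<>()).size();
    }

    public static void main(String[] args) {
        System.out.println(foundBySameObject());       // false
        System.out.println(foundByOriginalContents()); // false
        System.out.println(sizeAfterMutation());       // 1
    }
}
```

The entry is effectively orphaned: it still takes up space and shows up during iteration, but no lookup by key finds it.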


Two things I can think of. One is that you can't guarantee ordering (stable or otherwise) when iterating through a hashmap. The other is that they have the possibility of thrashing your cache when you iterate over them.


Hash map usage is situational.

If your hash key is not chosen well, your hash map will run at a speed equivalent to that of a list, with the added issue of being a huge memory hog.

In general, hashmaps are a bad choice when you are going to perform iterative tasks on your data.


There's also the potential for collisions. The cost of writing and/or executing the hashing-function could be high if the requirement for collision avoidance is strict, or if you have a small hash-space.


They mean that the order of elements is not preserved in HashMap. The next question is "how to solve this problem." And the answer is: use LinkedHashMap to be able to get elements in the same order they were inserted and TreeMap with appropriate comparator to control the order by any criteria you want.
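
A quick Java sketch of the difference, assuming three arbitrary sample keys; the class and method names are invented for the example.

```java
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

class MapOrderDemo {
    /** Inserts the same three keys into the given map and reports the iteration order. */
    static String keysAfterInserts(Map<String, Integer> map) {
        map.put("pear", 1);
        map.put("apple", 2);
        map.put("fig", 3);
        return String.join(",", map.keySet());
    }

    public static void main(String[] args) {
        // Insertion order is preserved.
        System.out.println(keysAfterInserts(new LinkedHashMap<>()));
        // prints pear,apple,fig

        // Natural (alphabetical) ordering of the keys.
        System.out.println(keysAfterInserts(new TreeMap<>()));
        // prints apple,fig,pear

        // Custom ordering through a comparator.
        System.out.println(keysAfterInserts(new TreeMap<>(Comparator.<String>reverseOrder())));
        // prints pear,fig,apple
    }
}
```

With a plain HashMap the iteration order for the same inserts is unspecified and may change between Java versions, so code should not rely on it.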


The typical alternative to hash tables is a binary search tree. While hash tables are typically faster, their contents are not in any meaningful order; with binary search trees the contents are kept sorted.


A disadvantage of HashMap in Java is that it is not synchronized. If multiple threads access a hash map concurrently, and at least one of the threads modifies the map structurally, it must be synchronized externally, for example by wrapping it with Collections.synchronizedMap.
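
A minimal sketch of the wrapping approach, assuming a made-up counter workload where several threads bump the same key. The class and method names are illustrative only.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class SynchronizedMapDemo {
    /** Several threads increment one shared counter through a synchronized wrapper. */
    static int countWithThreads(int threads, int incrementsPerThread) {
        Map<String, Integer> counts = Collections.synchronizedMap(new HashMap<>());
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    // merge() is a single call on the wrapper, so the whole
                    // read-modify-write happens while holding the wrapper's lock.
                    counts.merge("hits", 1, Integer::sum);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return counts.get("hits");
    }

    public static void main(String[] args) {
        System.out.println(countWithThreads(4, 10_000)); // 40000
    }
}
```

Note that a separate get() followed by put() on the wrapper would not be atomic as a pair; compound updates need either a single method such as merge() or an explicit synchronized block on the map. java.util.concurrent.ConcurrentHashMap is the usual alternative when contention is high.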


Map may be persistent

The only reason I could think of is limited main memory, but then that wouldn't be limited only to hashmaps, but also ArrayLists etc etc.

A map need not be limited to memory.

Some databases provide a persistent key-value store such as hstore in Postgres, or MVStore in H2 Database Engine. That second one uses the same Map interface defined in Java as do the in-memory implementations.

A key-value map may also be distributed across a network of computers, persisting parts of the map. There are several such products available.

Considerations such as concurrency, nulls, and iteration order

Characteristics vary among different implementations of a key-value store, commonly called a map or dictionary. You mentioned HashMap but that is only one way to do a map. There are skip list maps, and there are maps that track objects by reference (pointer) rather than by the content of the key as a typical hashmap does. In Java, an EnumMap is highly optimized for the case where all keys come from a single enum type: it is represented internally as an array indexed by each constant's position in the enum, yielding very fast execution and taking very little memory. Some implementations may be more highly concurrent than others depending on the amount of data, such as ConcurrentSkipListMap in Java.
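
As a small illustration of the EnumMap case, here is a sketch with a made-up Day enum. Besides being compact, an EnumMap iterates its keys in the order the enum constants are declared, regardless of insertion order.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.stream.Collectors;

class EnumMapDemo {
    /** Hypothetical enum used only for this example. */
    enum Day { MON, TUE, WED }

    /** Inserts WED before MON and reports the order the keys come back in. */
    static String keysInIterationOrder() {
        Map<Day, String> plan = new EnumMap<>(Day.class);
        plan.put(Day.WED, "review");
        plan.put(Day.MON, "plan");
        return plan.keySet().stream()
                .map(Day::name)
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        System.out.println(keysInIterationOrder()); // MON,WED
    }
}
```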

Some maps may accept or forbid nulls in the key and/or the value. This may assist or violate the needs of your business rules.
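
For example, in Java the standard implementations differ on null keys. The helper below is a made-up name for this sketch; it just tries a put and reports whether the map accepted it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;

class NullKeyDemo {
    /** Returns true if the given map accepts a null key, false if it rejects it. */
    static boolean acceptsNullKey(Map<String, String> map) {
        try {
            map.put(null, "value");
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(acceptsNullKey(new HashMap<>()));           // true
        System.out.println(acceptsNullKey(new TreeMap<>()));           // false (natural ordering)
        System.out.println(acceptsNullKey(new ConcurrentHashMap<>())); // false
    }
}
```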

In some cases you may want to maintain a sort order or an original insertion order among your keys.

Here is a list I made of the 10 Map implementations provided with Java 11. You can compare the various aspects as pros and cons depending on your needs.
