CodingBison

As we know the basic property of a Set is that, it does not allow duplicate elements. More details about Set interface is available here. One of the two classes that implements the interface Set, is an HashSet. The LinkedHashSet class extends HashSet. It uses an "Hash Table and a Linked List" as its underlying data structures.



Figure: Set Interface and Class Hierarchy

Before we go into the details of the LinkedHashSet class, let us take a detour and understand the concepts Hash tables, Load Factor, Collisions, Hashing etc.

Things To Remember
The LinkedHashSet uses two underlying data-structures: HashTable and LinkedList. Like a HashSet, LinkedHashSet does not allow duplicates. But, unlike HashSet, LinkedHashSet preserves the insertion order.

A hash table is a popular data structure for providing a faster search -- a well-designed hash table can provide an average lookup complexity of O(1). A hash table data structure uses a hashing function that maps a given key to one of the indexes (also referred to as slots) in the hash-table. For a hash-table, indices are identified using an integer and the hashing function converts the passed key (which can be string, number, etc) into an integer that points to an index in the hash table. In other words, if the hash-table has size S and if the first index is indexed as 0, then we can find the table index, y, such that y would be greater than or equal to 0 but but less than S. More details about Hash Table/Hash Functions is available in our data structures modules understanding Hash Tables.

For hash tables, an import consideration is that it should be able to accommodate collisions. A collision happens when we have two or more elements that are indexed to the same slot. Collisions can potentially increase the search complexity of hash tables and so a good hashing function must be designed to reduce these collisions. The average lookup complexity of O(1) is not possible when there are collisions in the Hashtable.

It is easy to see that if the size of the hash table (S) is smaller than the number of entries (N), then collisions are bound to happen. For an ideal hashing function, if S where to be same as N, then each index would have one entry. In reality, this may not be true. Even if N is smaller than S, collisions can still occur. The ratio of N and S does provide an insight into an average number of collisions. This ratio is often known as load factor of the hash table. Load factor is an important consideration for estimating the amount of collisions in a hash table. In an ideal world, the load factor should be always less than 1.0. In fact, its recommended upper limit is around 0.7. For Instance, if an applications intends to store 700 entries, it is recommended to use an Hash table of size/capacity equal to 1000. so that the load factor will be equal to 700/1000 = 0.70 or 70%.

Things To Remember
The initial default-size/capacity of an LinkedHashSet is 16 and the load factor is 0.75.

LinkedHashSet like HashSet does not allow duplicate entries. But, unlike HashSet, LinkedHashSet preserves the insertion order. So, if the requirement of an application is such that, it needs to perform a large number of lookup requests and those lookups has to be faster and also the insertion order needs to be maintained. then, LinkedHashSet is the way to go!

LinkedHashSet uses HashTable for the quick insertion/removal of elements from the Set. It uses LinkedList to preserve the order of insertions. So, how does it work internally? Well, whenever an entry is added to an Hashtable, the same entry is also added to the end of the LinkedList. Since the LinkedList used is a circular double linked List, the insertion/removal operation from a double linked list happens in constant time O(1). When it comes to performance, LinkedHashSet takes slightly more time (for insertions/removals) than a HashSet for the very obvious reason, it has the overhead of maintaining a linked list. But, when it comes to walking through the Set, LinkedHashSet performs better.

LinkedHashSet Class Constructors

LinkedHashList class provides four Constructors and, here is the table that lists them.


Table: Constructors provided by ArrayList Class
ConstructorDescription
LinkedHashSet()Creates an empty set with a default size of 16 and the load factor equal to 0.75.
LinkedHashSet(Collection c)Creates a new set with with the elements of Collection c. Even if the collection c has any number of duplicate elements, the newly created set will maintain the property of a Set "no Duplicate elements allowed".
LinkedHashSet(int initialSize)Creates an empty set with a size equal to "initialSize" and load factor equal to 0.75(default)
LinkedHashSet(int initialSize, float loadFactor)Creates an empty set with a size equal to "initialSize" load factor equal to loadFactor.

Next, let us see an example program, that demonstrates the usage of the above mentioned operations.

 import java.util.ArrayList;
 import java.util.LinkedHashSet;

 public class LinkedHashSetDemo {
 public static void main(String[] args) {
     LinkedHashSet lhsetObj1 = new LinkedHashSet();
         lhsetObj1.add(10);
         lhsetObj1.add("abc");
         lhsetObj1.add(null);
         lhsetObj1.add(null);
         System.out.println(" Linked Hash Set 1: " + lhsetObj1);

         ArrayList alistObj1 = new ArrayList();
         alistObj1.add("abc");
         alistObj1.add("abc");
         alistObj1.add(1);
         System.out.println(" Array List: " + alistObj1);
         LinkedHashSet lhsetObj2 = new LinkedHashSet(alistObj1);
         System.out.println(" Linked Hash Set 2: " + lhsetObj2);

         LinkedHashSet lhsetObj3 = new LinkedHashSet(30);
         LinkedHashSet lhsetObj4 = new LinkedHashSet(30, (float) 0.80);
     }
 }
Things To Remember
LinkedHashSet can store heterogeneous elements, including NULL.

The above example program has created four new Sets using different constructors. Notice the elements of the LinkedHashSet, represented by the object lhsetObj1. We will come to know about some of the properties of a Set. First, a Set can store heterogeneous elements, Second, NULL element is allowed, Third, by noticing the output (below) you would know that a LinkedHashSet is an ordered collection. The order of the elements in a Set is preserved and guaranteed. LinkedHashSet object, lhsetObj3, and lhsetObj4 are created with a capacity and load factor different from the default values.

Let us Compile/run the program, Here is the output:

  Linked Hash Set 1: [10, abc, null]
  Array List: [abc, abc, 1]
  Linked Hash Set 2: [abc, 1]

LinkedHashSet Class Methods

As we already know, HashSet is one of the Classes that implements the Set interface. So, HashSet class implements all the methods that are defined in the Set interface. LinkedHashSet extends the HashSet class and overrides the methods but does not define any specific methods. The method list is similar to the one defined in the table Set Interface Methods List.

Here is an example, that demonstrates the usage of the methods.

 import java.util.ArrayList;
 import java.util.LinkedHashSet;

 public class LinkedHashSetDemo {
     public static void main(String[] args) {
         LinkedHashSet lhsetObj1 = new LinkedHashSet();
         lhsetObj1.add(10);
         lhsetObj1.add("abc");
         lhsetObj1.add(null);
         lhsetObj1.add(null);
         System.out.println(" Linked Hash Set 1: " + lhsetObj1);

         ArrayList alistObj1 = new ArrayList();
         alistObj1.add("abc");
         alistObj1.add("abc");
         alistObj1.add(1);
         System.out.println(" Array List: " + alistObj1);
         LinkedHashSet lhsetObj2 = new LinkedHashSet(alistObj1);
         System.out.println(" Linked Hash Set 2: " + lhsetObj2);

         lhsetObj2.remove(1);
         System.out.println(" Linked Hash Set 2: " + lhsetObj2);
         lhsetObj2.clear();
         System.out.println(" After Clear, Linked Hash Set 2: " + lhsetObj2);
     }
 }

Compile/Run the program, Here is the output:

  Linked Hash Set 1: [10, abc, null]
  Array List: [abc, abc, 1]
  Linked Hash Set 2: [abc, 1]
  Linked Hash Set 2: [abc]
  After Clear, Linked Hash Set 2: []




comments powered by Disqus