List Cycle Detection
Linked list cycle detection problem is a very instructive and fun problem to reason about. This article will state the problem first and then explains how we can solve it efficiently while giving some insight on the underlying math.
A list can gets corrupted and some node can be linked by more than one node, as in the following figure. This could lead to never ending traversal of the list. So it make sense to solve the following problem:
List Cycle Detection - Problem Statement
- Given a linked list, detect if the list is circular i.e. contains a cycle
- Find the starting node of the cycle (the node with two inward arrows in the figure)
The problem is easily solvable in time and space considering that we can visit the list from the head and store in a separate list the visited nodes. As the visit continues we check if the node we are examining was previously visited (for each node we visit we asks the following question: is that node contained in the support list already?). If yes the list is circular and that node is the start point of the cycle. If we reach the tail of the list then the list is not circular.
We can lower the complexity of this approach down to time using a more efficient support (set) data structure (like a tree set). But we can do much better, and the rest of the article will show how to obtain time and space complexity.
This algorithm uses the fact that, like clock's hands, things iterating on a cycle at different speed will eventually meet at some point in the future. Consider two runner with velocities respectively, starting running from the same point in a circular stadium. They will meet again when the slower runner reach the starting point for the second time. Why? By the time the slower one has completed half circle the faster has completed a complete cycle and by the time the slower finishes his run, arriving at the starting point again, the faster has completed a second entire cycle.
Things are a bit more complicated in the list cycle detection problem because the iterators (the two runners) do not necessarily start they race from the circular part of the list
Consider two iterators with velocities respectively. Suppose the cycle has length and that it starts at node number . When the slower iterator reaches the faster is at location . How many iteration it will take before they meet? And at which node?
The situation is described by the following congruence:
which has solution . This means that they will meet after iteration of the slower iterator. This means that they will meet at nodes before the beginning of the cycle and we can use this fact to count nodes from the beginning of the list to deduce the starting point of the cycle. Once the iterators meet in the cycle, we can move the fast iterator back to the beginning of the list and iterate forward one node per step with both iterators until they match again. When we move the fast iterator back at the head of the list, both iterators are nodes away from the beginning of the cycle. Because of this when we move both of them by one, they will eventually meet exactly at that node .
Let's consider now the case when . This means that by the time the slower iterator reaches the beginning of the cycle the faster one has completed more that a cycle. What will be the starting point for the faster one? We argue that once reaches , is at node but since , this means that it will be at position . We can now use similar argument to the previous case and write:
which has solution . This means that the meeting point is nodes before the beginning of the cycle. If we do the same operations as the previous case, , we obtain the same result. Iterators will meet at the beginning of the cycle. Why? Well advancing makes cycles possibly several times ( remember that ) and it will clearly stops at .
In other words the slower pointer is at first at node number . We can write where . After advancing steps it will be at location Since the result follows.
As an example consider a list with a cycle of length starting at node number . The first part of the algorithm tells us that the nodes will meet at node . Moving the fast pointer back to the head of the list and iterating one node per time both iterators will lead the slower point to node:
- 12 again after advancing of 4 nodes
- 12 again after advancing of 4 nodes
- 10 advancing of the remaining 2 nodes.
The following illustration depict how the algorithm would work on a list of 8 nodes with a cycle of length 4 starting at node number 4. After 5 steps the slow (blue) and fast (red) iterator points to the same node i.e. node number 6.
After that the fast pointer is moved to the head of the list and both iterator are incremented by 1 until they meet again. When they do, they will meet at the beginning of the cycle.
Execution of the Floyd's algorithm on a list of 8 nodes with a cycle of length 4 starting at node 4.