
Problem statement
In this article we will discuss one of the most common coding interview questions, asked in many Google interviews, especially during the first stages. The problem statement goes as follows:
Given an array N of integers and an integer K, print on the standard output the number of triples (N[i], N[j], N[l]) whose sum is equal to K. In other words, how many triples (i, j, l) with i < j < l s.t. N[i] + N[j] + N[l] = K are there in N?
Example
- Input: N = {0, -1, 2, -3, 1}, K = 0. Output: 2. Two triples have sum 0: (0, -1, 1) and (2, -3, 1).
- Input: N = {1, -2, 1, 0, 5}, K = 0. Output: 1. Only one triple has sum 0: (1, -2, 1).
Brute Force Approach
As usual, we should be able to come up with a brute force approach quickly. For this problem it is especially easy: we simply search for the triples described in the problem statement.
```cpp
unsigned long triplets_sum_k(const vector<int>& N, const long K) {
  const size_t size = N.size();
  unsigned long ans = 0;
  for (size_t i = 0; i < size; i++)
    for (size_t j = i + 1; j < size; j++)
      for (size_t l = j + 1; l < size; l++)
        if (N[i] + N[j] + N[l] == K)
          ++ans;
  return ans;
}
```
The complexity for the code above is:
- Time $\Theta(n^3)$ i.e. cubic
- Memory $\Theta(1)$ i.e. constant
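To make this concrete, here is a minimal sketch of a driver that checks `triplets_sum_k` against the expected outputs from the examples above; the `main` function and the includes are my own scaffolding, not part of the original snippet:

```cpp
#include <cassert>
#include <vector>
using namespace std;

// triplets_sum_k as defined above is assumed to be visible here.

int main() {
  const vector<int> a = {0, -1, 2, -3, 1};
  const vector<int> b = {1, -2, 1, 0, 5};
  assert(triplets_sum_k(a, 0) == 2); // triples (0, -1, 1) and (2, -3, 1)
  assert(triplets_sum_k(b, 0) == 1); // triple (1, -2, 1)
  return 0;
}
```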
HashMap approach
We can substantially improve the time complexity of the previous solution if we search for triples in the following way:
- $\forall \: (i < j)$ we can add one to the count of valid triples if
- $\exists \: l < i$ s.t. $N[l] = K - (N[i] + N[j])$
In other words, when we consider a pair of indexes $(i, j)$, the only way to complete it into a valid triple is to find, among the numbers we have already processed, an element equal to $K - (N[i] + N[j])$ (if that value appears multiple times, each occurrence yields a distinct triple). The key insight is that we can trade memory for time complexity by storing all the numbers considered so far, so that we can quickly query whether an element with index $l < i$ s.t. $N[l] = K - (N[i] + N[j])$ exists.
Using a hash-map to store the elements processed so far lets us answer this query in average constant time.
```cpp
unsigned long triples_sum_k_1(const vector<int>& N, const long K) {
  if (N.size() < 3)
    return 0;
  const size_t size = N.size();
  // value -> #times it appeared among the elements processed so far
  unordered_map<long, size_t> C;
  unsigned long ans = 0;
  C[N.front()] = 1;
  // We can start from i=1 because we need to find l < i
  // and there is no such l when i=0.
  for (size_t i = 1; i < size; i++) {
    for (size_t j = i + 1; j < size; j++) {
      const long target = K - (N[i] + N[j]);
      const auto it = C.find(target);
      if (it != C.end())
        ans += it->second;
    }
    // We finished processing element i: add it to the map.
    C[N[i]]++;
  }
  return ans;
}
```
The complexity for the code above is:
- Time $O(n^2)$ on average, i.e. quadratic
- Memory $\Theta(n)$ i.e. linear
Sorting Approach
If we add an additional constraint to the problem, i.e. that there is no duplicate element in the array, we can easily get rid of the linear space complexity paid by the hash-map solution by sorting the elements. Note that reordering the elements does not have any effect on the number of valid triples, because the sum is commutative and associative.
Consider the initial array $N = \{0, -1, 2, -3, 1\}$. We have two valid triples for $K = 0$:
- 2 triples, at indexes (0, 1, 4) and (2, 3, 4).
If we order the array in increasing fashion we end up with $N = \{-3, -1, 0, 1, 2\}$ and the valid triples become:
- 2 triples, at indexes (0, 3, 4) and (1, 2, 3).
The idea is that for each element of index i we can loop over the remaining elements, with indexes in [i+1, n-1], and linearly count how many pairs among them have sum equal to K - N[i].
We can do this by using a two pointer technique.
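Before looking at the full solution, here is a minimal sketch of the two pointer counting subroutine in isolation. The helper name `count_pairs_with_sum` is mine and does not appear in the original code, and it assumes a sorted, duplicate-free input:

```cpp
#include <vector>
using namespace std;

// Hypothetical helper: assuming N is sorted in increasing order and contains
// no duplicate elements, count the pairs (l, r) with lo <= l < r < N.size()
// and N[l] + N[r] == target.
size_t count_pairs_with_sum(const vector<int>& N, size_t lo, long target) {
  if (N.size() < 2 || lo + 1 >= N.size())
    return 0;
  size_t l = lo;
  size_t r = N.size() - 1;
  size_t count = 0;
  while (l < r) {
    const long sum = static_cast<long>(N[l]) + N[r];
    if (sum == target) {
      ++count; // found a pair: both pointers can safely move inwards
      ++l;
      --r;
    } else if (sum < target) {
      ++l; // sum too small: pick a larger left element
    } else {
      --r; // sum too large: pick a smaller right element
    }
  }
  return count;
}
```

The full solution below is essentially this routine invoked once per index $i$, with `lo = i + 1` and `target = K - N[i]`, fused into a single function.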
Let's now dig into the full code and discuss it:
```cpp
unsigned long triples_sum_k_2(vector<int>& N, const long K) {
  if (N.size() < 3)
    return 0;
  sort(begin(N), end(N));
  const size_t size = N.size();
  unsigned long ans = 0;
  for (size_t i = 0; i < size; i++) {
    size_t l = i + 1;    // left pointer
    size_t r = size - 1; // right pointer
    while (l < r) {
      const long sum = N[i] + N[l] + N[r];
      if (sum == K) {
        ans += 1;
        l++;
        r--;
      } else if (sum < K) {
        l++;
      } else {
        r--;
      }
    }
  }
  return ans;
}
```
Core idea
The core of the idea is in the while loop. If we find that the sum is greater than our target, the only sensible thing we can do is to decrease the right pointer: by doing so we are sure that the sum will be smaller the next time around (the elements are sorted, remember?).
The same idea applies when sum < K: we can increase the left pointer because we are sure that the sum will be larger the next time around.
If we are lucky and we find a triple whose sum is exactly K, then we can move both pointers: if we only moved one of them, the sum would be guaranteed to be either lower or higher the next time around, since there are no duplicates in the array.
The complexity for the code above is:
- Time $O(n^2)$ i.e. quadratic (the initial $O(n \log n)$ sort is dominated by the two nested scans)
- Memory $\Theta(1)$ additional, i.e. constant
Compared to the hash-map solution we kept the same quadratic time complexity while getting rid of the linear memory usage.
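As a closing sanity check, here is a minimal sketch of a test driver that cross-checks the two optimized implementations against the brute force one on the example inputs; the `main` function and the includes are my own scaffolding, not part of the original article:

```cpp
#include <cassert>
#include <vector>
using namespace std;

// triplets_sum_k, triples_sum_k_1 and triples_sum_k_2 as defined above
// are assumed to be visible here.

int main() {
  const vector<int> a = {0, -1, 2, -3, 1}; // distinct elements
  const vector<int> b = {1, -2, 1, 0, 5};  // contains a duplicate value

  // The hash-map version must agree with brute force on both inputs.
  assert(triples_sum_k_1(a, 0) == triplets_sum_k(a, 0)); // both 2
  assert(triples_sum_k_1(b, 0) == triplets_sum_k(b, 0)); // both 1

  // The sorting version assumes distinct elements and sorts in place,
  // so we check it on a mutable copy of the duplicate-free input only.
  vector<int> a_copy = a;
  assert(triples_sum_k_2(a_copy, 0) == triplets_sum_k(a, 0)); // both 2

  return 0;
}
```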
In this article we discussed a couple of approaches for solving the 3SUM problem, a common coding interview question asked at Google and many other tech companies. This article is the first of a long series. More articles to come soon.