Challenge - Two Sum

Find two numbers in an array that sum up to a given target value.

Challenge

I had never used LeetCode before. Over the last few years I have spent some time on other, similar platforms such as HackerRank and Codewars, but never LeetCode. It crossed my path the other day in some random post on Reddit, where I read something like “everything starts with Two Sum”. So I decided to take a look. At first glance the Two Sum problem seemed quite simple and reminded me of some other basic challenges I had encountered on other platforms. However, as I delved deeper, I realized there was more to this problem than met the eye.

Given an array of integers nums and an integer target, return indices of the two 
numbers such that they add up to target.

You may assume that each input would have exactly one solution, and you may not 
use the same element twice.

You can return the answer in any order.

To visualize this, let's take one of the examples: if the input is nums = \([15, 11, 2, 7]\) and target = 9, the output should be \([2, 3]\), since the numbers at indices 2 and 3 (values 2 and 7, respectively) add up to the target value of 9.

This problem is significant not only because it’s a common interview question but also because it serves as an accessible introduction to key concepts in computer science, including hashing, array manipulation, and the trade-offs between time and space complexity. Its simplicity allows for a focus on algorithmic thinking and optimization strategies without the overhead of complex data structures or algorithms. Moreover, the problem’s applicability to real-world scenarios, such as database queries or financial transactions, highlights the practical importance of efficient data processing and retrieval techniques, making it a relevant and engaging challenge for programmers at all levels.

Initial approach

In a problem like this, the straightforward approach, and I think the one that comes to most people's minds, is brute force: a double loop that iterates through the array, comparing each pair of numbers to see if they add up to the target. This method is simple enough, but since it examines up to \(\frac{n(n-1)}{2}\) pairs in the worst case it has a time complexity of \(O(n^2)\), which is far from optimal for larger arrays. Here is this approach in C++, Java and Python; its poor scaling was confirmed when I submitted the code and found it was faster than only 30% of the other users' submissions.

C++

class Solution {
public:
    std::vector<int> twoSum(std::vector<int>& nums, int target) {
        int n = nums.size();
        
        for (int i = 0; i < n-1 ; i++) 
            for (int j = i + 1; j < n ; j++) 
                if (nums[i] + nums[j] == target) 
                    return {i, j};
                        
        return {};
    }
};

Java

class Solution {
    public int[] twoSum(int[] nums, int target) {
        int n = nums.length;

        for(int i = 0 ; i < n-1 ; i ++)
            for(int j = i+1 ; j < n ; j++)
                if(nums[i]+nums[j]==target)
                    return new int[]{i,j};
      
        return null;
    }
}

Python

class Solution:
    def twoSum(self, nums: List[int], target: int) -> List[int]:
        
        n = len(nums)
        for i in range(n - 1):
            for j in range(i + 1, n):
                if nums[i] + nums[j] == target:
                    return [i, j]
        
        return []

Time-efficient approach with a Hash Map

A striking difference between LeetCode and the other platforms I had used before is the plethora of discussions, solutions and even videos. I always like to see how a problem can be tackled in multiple ways, each with its own trade-offs in terms of time and space complexity. So I revisited the problem, this time armed with new insights and a hash table. The idea was to store each element's value and its index as I iterated through the array, checking whether the complement of the current element (target - current element) was already in the hash table. If it was, I had found the two numbers.

In a hash table, values are stored by associating each key with a corresponding value, and a hashing function is used to store and retrieve them. As a result, the average time complexity of searching, inserting and deleting data in a hash table is \(O(1)\). This approach significantly reduced the overall time complexity to \(O(n)\), a vast improvement over my initial brute-force method. But, unlike the brute-force method, which has a space complexity of \(O(1)\), the space complexity is now \(O(n)\), since in the worst case we need to store every element of the array in the hash map.

C++

class Solution {
public:
    vector<int> twoSum(vector<int>& nums, int target) {
        //initialize an unordered_map to store the array values and their indices
        unordered_map<int, int> numMap;

        //iterate through the array
        for(int i  = 0 ; i < nums.size() ; i++){
            //calculate the complement by subtracting the current value from the target
            int complement = target - nums[i];
            //check if the element exists in the map
            if(numMap.find(complement) != numMap.end())
                //if found, return the indices of the current value and its complement
                return{numMap[complement], i};
            //store the current value and its index in the map
            numMap[nums[i]]=i;
        }   

        return {};
    }
};

Java

class Solution {
    public int[] twoSum(int[] nums, int target) {
        //create a HashMap to store the array values and their indices
        Map<Integer, Integer> map =  new HashMap<>();
        
        //iterate through the array
        for(int i = 0 ; i < nums.length ; i ++){
            //calculate the complement by subtracting the current value from the target
            int complement = target - nums[i];
            //check if the element exists in the map
            if(map.containsKey(complement))
                //if found, return the indices of the current value and its complement
                return new int[]{map.get(complement),i};
            //store the current value and its index in the map
            map.put(nums[i], i);
        }
        return null;
    }
}

Python

class Solution:
    def twoSum(self, nums: List[int], target: int) -> List[int]:
        # initialize a dictionary to store the array values and their indices
        num_map = {}
        
        # iterate through the array
        for i, num in enumerate(nums):
            # calculate the complement by subtracting the current value from the target
            complement = target - num
            # check if the complement exists in the dictionary
            if complement in num_map:
                # if found, return the indices of the complement and the current value
                return [num_map[complement], i]
            # store the current value and its index in the dictionary
            num_map[num] = i
        
        return []

This approach catapulted me to being faster than 80% of other submissions in C++, and faster than 98% in Java and Python.

Alternative approach

While the hash table approach significantly improves time efficiency, it does introduce a higher space complexity of \(O(n)\). This got me wondering whether there's a middle ground that could offer a better balance between time and space efficiency.

A potential approach is the two-pointer technique, applicable if the array is sorted or if sorting doesn't affect the solution's correctness. This method involves placing one pointer at the beginning of the array and another at the end, then moving them towards each other until the target sum is found. Since Two Sum asks for the original indices, we first pair each value with its index before sorting. The time complexity is then \(O(n \log n)\), dominated by the sort, and the space complexity is \(O(n)\), since we keep a sorted copy of the values together with their original indices.

C++

class Solution {
public:
    vector<int> twoSum(vector<int>& nums, int target) {
        
        //create a vector to store pairs of numbers and their original indices
        vector<pair<int, int>> pairs;
        
        //populate the vector with the pairs
        for(int i = 0 ; i < nums.size() ; i++)
            pairs.push_back(make_pair(nums[i], i));

        //sort the vector based on the numeric value
        sort(pairs.begin(), pairs.end(),
            [](const pair<int, int>& a, const pair<int,int>& b){
                return a.first < b.first;
            });
        
        //initialize the pointers
        int left = 0 , right = nums.size()-1;
        
        while(left < right){
            //calculate the sum of the two pointed values
            int sum = pairs[left].first + pairs[right].first;
            
            //if the sum matches the target, we return the original indices of the two
            if(sum == target)
                return { pairs[left].second, pairs[right].second};
            
            //if the sum is less than the target, move the left pointer to the right
            //that is, we increase the left value
            if(sum < target){
                left++;
                continue;
            }
            
            //if the sum is greater than the target, we move the right pointer to the left
            right--;
            
        }
 
        return {};
    }
};

Java

class Solution {
    public int[] twoSum(int[] nums, int target) {
        //create an array with the numbers and their indices
        int[][] numsWithIndices = IntStream.range(0, nums.length)
                                           .mapToObj(i -> new int[]{nums[i], i})
                                           .toArray(int[][]::new);
        
        //sort the array based on the numeric values
        Arrays.sort(numsWithIndices, Comparator.comparingInt(a -> a[0]));

        //initialize the two pointers
        int left = 0, right = nums.length - 1;
        

        //iterate through the array using the two pointers
        while(left < right){
            //calculate the sum of the two pointed values
            int sum = numsWithIndices[left][0]+numsWithIndices[right][0];
            
            //if the sum matches the target, we return the original indices of the two
            if(sum==target)
                return new int[]{numsWithIndices[left][1], numsWithIndices[right][1]};
            
            //if the sum is less than the target, move the left pointer to the right
            //that is, we increase the left value
            if(sum < target){
                left++;
                continue;
            }
            
            //if the sum is greater than the target, we move the right pointer to the left
            right--;
        }
        
        return null;
    }
}

Python

class Solution:
    def twoSum(self, nums: List[int], target: int) -> List[int]:
        # create an array of tuples of the numbers and the original indices, sorted by the number
        numsWithIndices = sorted((num, i) for i, num in enumerate(nums))
        
        # initialize the pointers
        left, right = 0, len(nums)-1
        
        # iterate through the array using the two pointers
        while left < right:
            # calculate the sum of the two pointed values
            pair_sum = numsWithIndices[left][0] + numsWithIndices[right][0]

            # if the sum matches the target, we return the original indices of the two
            if pair_sum == target:
                return [numsWithIndices[left][1], numsWithIndices[right][1]]
            
            # if the sum is less than the target, move the left pointer to the right
            if pair_sum < target:
                left += 1
                continue
            
            #if the sum is greater than the target, we move the right pointer to the left
            right -= 1
        
        return []
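
For anyone who prefers to run these locally rather than on LeetCode, a minimal harness like the sketch below can sanity-check a solution against the example from the beginning of the post. It embeds the hash-map version as a stand-in, but any of the three Python classes above can be pasted in its place; the only check performed is that the returned indices are distinct and that their values add up to the target, since LeetCode accepts the indices in any order.

Python

from typing import List

# stand-in: any of the Solution classes above can replace this one
class Solution:
    def twoSum(self, nums: List[int], target: int) -> List[int]:
        num_map = {}
        for i, num in enumerate(nums):
            complement = target - num
            if complement in num_map:
                return [num_map[complement], i]
            num_map[num] = i
        return []

if __name__ == "__main__":
    nums, target = [15, 11, 2, 7], 9
    i, j = Solution().twoSum(nums, target)
    # the indices must be distinct and their values must sum to the target
    assert i != j and nums[i] + nums[j] == target
    print(i, j)  # expected output: 2 3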

Reflection

The journey through solving the “Two Sum” problem on LeetCode has been enlightening. It showcased the importance of understanding both time and space complexity and how different approaches can significantly impact performance. From a brute-force method to a time-efficient hash table solution, and exploring a space-efficient two-pointer technique, each method provided valuable insights.

The brute-force method, while straightforward, highlighted the importance of considering time complexity from the outset. The hash map solution, on the other hand, was a practical lesson in how space-time trade-offs work in algorithm design. Finally, the sorting and two-pointer technique provided a clear example of how sometimes a preliminary step (like sorting) can lead to more efficient solutions, even if it seems counterintuitive at first.

Comparison of Approaches

A comparison of these methods can offer valuable insights. Understanding the nuances of each approach allows developers to make more informed decisions when faced with similar problems.

Approach       Time Complexity    Space Complexity    Readability
Brute-Force    \(O(n^2)\)         \(O(1)\)            High
Hash Map       \(O(n)\)           \(O(n)\)            Moderate
Two-Pointer    \(O(n \log n)\)    \(O(n)\)            Moderate

Choosing the right approach depends on several factors, including the size of the input data, the computational resources available, and the specific requirements of the application. By carefully considering these aspects, developers can select the most appropriate solution, optimizing for performance, readability, and resource usage as needed.

Brute force also has its advantages: it is the most readable solution and the simplest to implement. For small datasets, when simplicity and readability matter more than efficiency, or when the system has limited memory, it may be the ideal choice. For large datasets in performance-sensitive applications, the hash map's constant-time lookups reduce execution time enough to make it the best option. The two-pointer technique, on the other hand, is well suited to medium and large datasets where a balance between time and space efficiency is desired; it is particularly useful when the input array can be sorted without affecting the solution's correctness, offering a good compromise between the brute-force and hash map approaches.

We can also analyze the performance in terms of execution time and memory usage on the LeetCode platform for each programming language.

Approach       Language   Best Time (ms)   Best Memory (MB)
Brute-Force    Java       44               44.6
               C++        9                14.2
               Python3    1681             17.3
Hash Map       Java       2                45.3
               C++        8                14.1
               Python3    49               17.8
Two-Pointers   Java       9                44.1
               C++        3                13.5
               Python3    60               18.0

The Hash Map approach shows the best performance in Java, significantly reducing the execution time to 2 ms, which is a substantial improvement over the Brute-Force and Two-Pointers methods. This demonstrates the efficiency of hash maps in optimizing lookup operations, making it the preferred choice for large datasets where time complexity is a critical factor.

C++ implementations consistently show low execution times across all approaches, with the Hash Map and Two-Pointers methods being particularly efficient. This underscores C++’s performance advantages, especially in memory usage, where it outperforms the other languages.

Python, while not matching the execution speeds of Java and C++, benefits significantly from the Hash Map approach, reducing its execution time to 49 ms from the much slower brute-force implementation. This illustrates Python’s effective use of dynamic data structures like dictionaries for optimizing time complexity, albeit with a higher space complexity.

The performance results for each approach in different programming languages also highlight the impact of language-specific optimizations and data structure implementations on algorithm efficiency. For instance, Java's significant time reduction using hash maps can be attributed to its highly optimized HashMap class, while C++'s lower memory usage reflects the language's close-to-hardware design, allowing for more controlled memory management. Python's slower execution times, compared to C++ and Java, underscore the trade-offs of using a dynamically typed, interpreted language, which prioritizes development speed and ease of use over raw performance. These variations reinforce the importance of choosing the right tool for the job, considering both the problem at hand and the execution environment.
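
The absolute numbers above come from LeetCode's own runner and will not reproduce exactly anywhere else, but the relative gap between the brute-force and hash-map approaches is easy to confirm locally. The rough sketch below uses Python's timeit module; the function names, input size and worst-case construction (the only matching pair sits at the very end of the array) are illustrative choices of this sketch, not part of the LeetCode benchmark, and it is only meant to show the \(O(n^2)\) versus \(O(n)\) behaviour in practice.

Python

import timeit
from typing import List

def two_sum_brute(nums: List[int], target: int) -> List[int]:
    # O(n^2): compare every pair of indices
    n = len(nums)
    for i in range(n - 1):
        for j in range(i + 1, n):
            if nums[i] + nums[j] == target:
                return [i, j]
    return []

def two_sum_hash(nums: List[int], target: int) -> List[int]:
    # O(n): single pass with a dictionary of value -> index
    num_map = {}
    for i, num in enumerate(nums):
        complement = target - num
        if complement in num_map:
            return [num_map[complement], i]
        num_map[num] = i
    return []

if __name__ == "__main__":
    # worst case: the only valid pair is formed by the last two elements
    n = 5000
    nums = list(range(n))
    target = nums[-1] + nums[-2]

    for fn in (two_sum_brute, two_sum_hash):
        elapsed = timeit.timeit(lambda: fn(nums, target), number=10)
        print(f"{fn.__name__}: {elapsed:.3f}s for 10 runs")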

Conclusion

The “Two Sum” problem, in its simplicity and complexity, serves as a microcosm of the broader journey in computer science—a journey of continuous learning, problem-solving, and growth. The exploration of different solutions not only deepens our understanding of algorithmic efficiency but also equips us with a versatile toolkit for approaching future coding challenges. Recognizing when to use a brute-force method for its simplicity, a hash map for its time efficiency, or a two-pointer technique for a balanced approach can drastically improve our problem-solving strategies. These insights encourage a more nuanced consideration of trade-offs in algorithm design, preparing us to tackle a wide range of problems with informed decision-making about when and how to optimize for time, space, and code readability.

This exploration has not only enhanced my problem-solving toolkit but also sharpened my decision-making skills, teaching me to weigh the trade-offs between different approaches based on the problem’s requirements. As I move forward, these insights will be invaluable, guiding me through future challenges with a deeper understanding of when and how to apply each method effectively.