In order to win the prize for most cookies sold, my friend Alice and I are going to merge our Girl Scout Cookies orders and enter as one unit.
Each order is represented by an "order id" (an integer).
We have our lists of orders sorted numerically already, in vectors. Write a function to merge our vectors of orders into one sorted vector.
For example:
We can do this in time and space.
If you're running a built-in sorting function, your algorithm probably takes time for that sort.
Think about edge cases! What happens when we've merged in all of the elements from one of our vectors but we still have elements to merge in from our other vector?
We could simply concatenate (join together) the two vectors into one, then sort the result:
What would the time cost be?
, where n is the total length of our output vector (the sum of the lengths of our inputs).
We can do better. With this algorithm, we're not really taking advantage of the fact that the input vectors are themselves already sorted. How can we save time by using this fact?
A good general strategy for thinking about an algorithm is to try writing out a sample input and performing the operation by hand. If you're stuck, try that!
Since our vectors are sorted, we know they each have their smallest item in the 0th index. So the smallest item overall is in the 0th index of one of our input vectors!
Which 0th element is it? Whichever is smaller!
To start, let's just write a function that chooses the 0th element for our sorted vector.
Okay, good start! That works for finding the 0th element. Now how do we choose the next element?
Let's look at a sample input:
To start we took the 0th element from alicesVector and put it in the 0th slot in the output vector:
We need to make sure we don't try to put that 1 in mergedVector again. We should mark it as "already merged" somehow. For now, we can just cross it out:
Or we could even imagine it's removed from the vector:
Now to get our next element we can use the same approach we used to get the 0th element—it's the smallest of the earliest unmerged elements in either vector! In other words, it's the smaller of the leftmost elements in either vector, assuming we've removed the elements we've already merged in.
So in general we could say something like:
Can you implement this in code?
Okay, this algorithm makes sense. To wrap up, we should think about edge cases and check for bugs. What edge cases should we worry about?
Here are some edge cases:
Actually, (3) will always happen. In the process of merging our vectors, we'll certainly exhaust one before we exhaust the other.
Does our function handle these cases correctly?
If both vectors are empty, we're fine. But for all other edge cases, we'll get a segfault.
How can we fix this?
We can probably solve these cases at the same time. They're not so different—they just have to do with indexing past the end of vectors.
To start, we could treat each of our vectors being out of elements as a separate case to handle, in addition to the 2 cases we already have. So we have 4 cases total. Can you code that up?
Be sure you check the cases in the right order!
Cool. This'll work, but it's a bit repetitive. We have these two lines twice:
Same for these two lines:
That's not DRY. Maybe we can avoid repeating ourselves by bringing our code back down to just 2 cases.
See if you can do this in just one "if else" by combining the conditionals.
You might try to simply squish the middle cases together:
But what happens when myVector is exhausted?
We'll get a segfault when we try to access myVector[currentIndexMine]!
How can we fix this?
First, we allocate our answer vector, getting its size by adding the size of myVector and alicesVector.
We keep track of a current index in myVector, a current index in alicesVector, and a current index in mergedVector. So at each step, there's a "current item" in alicesVector and in myVector. The smaller of those is the next one we add to the mergedVector!
But careful: we also need to account for the case where we exhaust one of our vectors and there are still elements in the other. To handle this, we say that the current item in myVector is the next item to add to mergedVector only if myVector is not exhausted AND, either:
The if statement is carefully constructed to avoid undefined behavior from indexing past the end of the vector. We take advantage of C++ short circuit evaluation and check first if the vectors are exhausted.
time and additional space, where n is the number of items in the merged vector.
The added space comes from allocating the mergedVector. There's no way to do this "in place" because neither of our input vectors are necessarily big enough to hold the merged vector.
But if our inputs were linked lists, we could avoid allocating a new structure and do the merge by simply adjusting the next pointers in the list nodes!
In our implementation above, we could avoid tracking currentIndexMerged and just compute it on the fly by adding currentIndexMine and currentIndexAlices. This would only save us one integer of space though, which is hardly anything. It's probably not worth the added code complexity.
What if we wanted to merge several sorted vectors? Write a function that takes as an input a vector of sorted vectors and outputs a single sorted vector with all the items from each vector.
Do we absolutely have to allocate a new vector to use for the merged output? Where else could we store our merged vector? How would our function need to change?
We spent a lot of time figuring out how to cleanly handle edge cases.
Sometimes it's easy to lose steam at the end of a coding interview when you're debugging. But keep sprinting through to the finish! Think about edge cases. Look for off-by-one errors.
Wanna review this one again later? Or do you feel like you got it all?
Mark as done Pin for review laterReset editor
Powered by qualified.io