MathJax

Thursday, June 6, 2019

Time Complexity, Part 2: Notation Station

Continuation from Time Complexity, Part 1: Intro to Big-O

Last time we spoke about Time Complexity, we broached the topic of Big-O analysis.  We said that Big-O analysis helps us judge how our algorithms' speed is effected as data-sets become large. We also implied that Big-O analysis is done assuming the least optimal conditions possible for our algorithms.  In upcoming articles, we will conduct proper Big-O analysis, but first we need the 'vocabulary' necessary to talk about our code. 

In the same way that a Ferrari on a long stretch of open road will perform better than a beat up ice cream truck: some algorithms' speed will scale fabulously and others poorly as our data-sets get larger.  This article is about how we can express those differences.

'Constant Time' also known as $O(1)$.

 

Writing an algorithm that scales in "constant time" is like inventing a car that can travel around the earth in the same time as it takes to travel down the block.  It implies that your algorithm doesn't slow down at all, regardless of the data size.  In most programming languages, arithmetic operators, inequalities, and most hash-map operations (when keys are already known) occur in constant time.  If you have solved a technical interview question with an algorithm that scales in constant time, chances are that you're in great shape.  Consider that let x = 0; x += 5; executes at the same speed as let y = 0; y += 1000000; even though the integer 1000000 requires more storage space than the number 5. If we were to graph this behavior, it would look like this:


'Linear Time' also known as $O(n)$. 

 

Brief, but necessary, tangent.

The first time I saw the term $O(n)$, the knee-jerk question that came to mind was, "What does $n$ mean?"  The variable $n$ is a convention.  It assumes your algorithm is processing one data-set and refers to the size of that data-set.  The notation $O(\ldots )$ refers to how the runtime of your algorithm will scale.  In my car analogies, $n$ refers to the distance the car has to travel while $O(\ldots )$ describes the time required to reach the destination.

In the real world, it's a rookie mistake to only think in terms of $n$, since there may be multiple data-sets that your algorithm is processing.  To illustrate that point, here is an excerpt from Cracking the Coding Interview, 6th Edition:
Suppose we had an algorithm that took in an array of strings, sorted each string, and then sorted the full array. What would the runtime be?

Many candidates will reason the following: sorting each string is $O(n\cdot \log n)$ and we have to do this for each string, so that's $O(n\cdot n\cdot \log n)$. We also have to sort this array, so that's an additional $O(n\cdot \log n)$ work.  Therefore,the total runtime is $O(n^2\cdot \log n + n\cdot \log n)$, which is just $O(n^2\cdot \log n)$.

This is completely incorrect. Did you catch the error?

The problem is that we used $n$ in two different ways. In one case, it's the length of the string (which string?). And in another case, it's the length of the array.
Don't worry if you didn't understand all the notation that was included in the excerpt, just focus on the take home message.  In Big-O notation, different data-sets get their own variable.  You'll see examples of this in upcoming articles.

From this point onward, I'll be using a code examples to illustrate my point.  While not strictly required, it may be useful for you to type out the code for yourself and execute it.  The concepts that I'm discussing can be generalized to any programming language; I will be using Javascript with Node.  You can use this online Node IDE to follow along, if you'd like.

Back to 'Linear Time.'

Writing an algorithm that operates in linear time is like inventing a Toyota Carolla.  The more road between you and your destination, the longer your commute (in proportion to the distance).  Similarly, in a linear time algorithm, if the data-set gets bigger, the runtime is increased by a steady amount (in proportion to $n$). Your algorithms will scale in linear time if they contain a loop of $n$ cycles optionally with respect to an iterable data-type containing $n$ items.  This is because we have to "touch" each item in the data-set at least once. The last few sentences were a mouthful! To illustrate them, consider this:

const data = [1, 2, 3];
const size = data.length;

let count = 0;

for (let i = 0; i < size; i++) {
  count = count + 1;
}

console.log(size, count);

  • In the above code, size represents the size of the data-set, which is the same as $n$.
  • The count variable represents the number of operations that were were conducted. Since the operation we are performing occurs in constant time and we are performing the same operation each time, we can use the count variable to approximate the behavior of $O(n)$, which describes the runtime of the code.
  • In comparing the size variable to the count variable, we are actually comparing how our data size is effecting our runtime.  Let's execute our code to see what it prints:

$node main.js
3 3

It doesn't matter how many items are in the data variable, size will always equal count so long as the rest of the code remains unaltered.  In other words, our runtime is proportional to the size of our data.  If we were going to graph this behavior, it would look something like this:


Addition in Linear Time

 

If you've spent any time coding, you've probably had to loop through your data multiple times.  Let's consider the following code:

const data = [1, 2, 3];
const size = data.length;

let count = 0;

for (let i = 0; i < size; i++) {
  count = count + 1;
}

for (let i = 0; i < size; i++) {
  count = count + 1;
}

console.log(size, count);

How would we express the time complexity of that code in Big-O notation? Well, we looped through our entire data-set twice, so it would be $O(n) + O(n) = O(2n)$.  If we were to execute the above code, it prints:

$node main.js
3 6

Since we're looping through the data twice, we are having to do twice as many operations. That is why count is now double the value of size. In the real world, the ability to optimize an algorithm from $O(2\cdot n)$ to $O(n)$ is likely a valuable optimization. That said, in Big-O analysis, we typically don't care about the difference between $O(2n)$ and $O(n)$ because the type of scaling is still linear. If we were to graph the runtime of the above code as it scales with larger data-sets, it would look like this:



In the next article, I'll discuss why Big-O analysis emphasizes the type of scaling instead of the actual scaling.  For now, we need to talk about one more operation.

Multiplication in Linear Time.

 

We have seen addition, but what about multiplication? Ask yourself, "What is the relationship between addition and multiplication?" When we say $3\times 3$, what do we mean? Well, that's $3 + 3 + 3$. We are essentially looping over the "plus 3" operation 3 times. When we multiply $O(n)\times O(n)$, we're looping over your data to perform some operation "data" times. Consider this code:

const data = [1, 2, 3];
const size = data.length;

let count = 0;

for (let i = 0; i < size; i++) {
  for (let j = 0; j < size; j++) {
    count = count + 1;
  }
}

console.log(size, count);

It's very important to contrast the multiplication example with the addition example. In both instances, there are two loops.  However, the multiplication contains a loop within a loop; also called a "nested loop."  This is a very common point of confusion and its important to distinguish the two because multiplication results in another type of scaling.  When we execute the code above, this is printed:

$node main.js
3 9

This result may not look too bad, but that's because our example data-set is small.  Try adding a few more elements to data and see what happens.  You will see that the scaling looks like this:



'Quadratic Time' also known as $O(n^2)$

 

If you've written an algorithm in quadratic time, you've just invented a car that slows down as the road becomes longer. This means that as your data-set gets larger, your algorithm will have to perform exponentially more operations.  Quadratic time complexity results when your code contains a nested loop.  All of this should be ringing a bell.  We discussed this in the "Multiplication in Linear Time" section above.  After all, $O(n)\times O(n) = O(n^2)$.

Wrapping Up

 

Lets summarize what we've talked about:
  • $n$ means the size of the data that your processing (assuming only one data-set).
  • $O(\ldots )$ describes the worst-case runtime of your algorithm.
  • 'Constant time' or $O(1)$ means that your algorithm's speed is independent of your data-size.
  • 'Linear time' or $O(n)$ means that your algorithm's speed is proportional to your data-size.
    • Linear addition occurs when you loop through your data-set sequentially.
    • Linear multiplication occurs when you loop through your data within a loop through your data - or a nested loop. 
  • 'Quadratic time' or $O(n^2)$ means that your algorithm's speed slows down as your data-set gets larger.  It is the result of linear multiplication.
Now that we have some vocabulary under our belt, in the next article we're going discuss the "rules" of Big-O analysis and why they exist.  We'll also analyze our first bit of code using Big-O analysis.

P.S. - You may be wondering about other types of scalings, such $O(\log n)$, $O(n\log n)$, $O(2^n)$, $O(n!)$, or $O(\infty )$.  Those will all be covered in a future article.  Also, I'm well aware that you don't necessarily have to "loop" through your data to achieve an $O(n)$ complexity, but it's a good place to begin learning the concept.  In future articles, I'll be going through tons of examples which will hopefully add a more nuanced understanding to the concepts I've described here.