Binary Search at the Doctor's Office

Linear or binary search? Explore the concepts through an engaging analogy. I talk about the efficiency of each method with real-world and programming examples.

Imagine this—

You're at the doctor's office, and you ask the receptionist to sign you in.

"Sure thing," she says. "Name, please?"

"Julio Martinez," you respond.

"Alright, have a seat, and I'll call you up momentarily."

A big shelf of folders behind her lists all the last names of registered patients, A-Z. One by one, she goes through all the files inside folder A. Is a Martinez in here? No? Next file. After a while, it's folder B, then C, then D. It takes a while, but sooner or later the surname "Martinez" should pop up, right?

She's on folder G now. You see dusk settling in from the windows. You glance at your watch. Dinner was an hour ago.

Is this the best approach? Why doesn't she just go directly to the middle and see where the letter M is, you wonder. Your last name is Martinez, after all.

Linear Search

What we just witnessed might not have been the best approach for finding a patient's records.

But this method is popular in the algorithm world, looking through a dataset one by one. It's called linear search, also known as sequential search.

Linear search involves consecutively checking each element of a list for a target value. If we find it, we return the index. Otherwise, we keep searching until the end of the list.

Below is a small example of linear search using JavaScript code:

const lastNames = ['Austen', 'Plath', 'Morrison', 'Woolf', 'Shelley', 'Angelou', 'Brontë'];

const linearSearch = (arr, item) => {
  for (let i = 0; i < arr.length; i++) {
    if (arr[i] === item) {
      return i; // index where it was found
    }
  }
  return -1; // not found
}

linearSearch(lastNames, 'Morrison'); // 2

Here's a note though, just because it's slower, doesn't mean it's bad (at least in the computer world). It's simple and works. It's important to know the distinction that computers are not like human beings. A computer doing linear search is much MUCH faster than a human. What may take our receptionist hours to do, can take a computer only a few milliseconds!

We do linear search all the time when programming, whether it's using for loops to find something, or JavaScript methods like indexOf() and includes(). They all go through a dataset in a linear fashion. So don't throw it off your table completely.

Linear Search Time Complexity

In the world of Big O, the time complexity of linear search is O(n), with n being the number of items we're searching. So, you can imagine that if there was another for loop inside of the above code, searching for specific letters in last names, it would be O(n^2). In other words, if our receptionist at the doctor's office not only looked for last names but also had to scan the files to verify another piece of information, then the process would entail more time complexity.

Binary Search

Now let's take a look at another way we can search: binary search, a 'dividing and conquering' way to find something within a small number of steps.

But first, let's revisit our analogy of the receptionist and patient. What if our receptionist did in fact just start at the middle of the shelf of folders looking for 'Martinez'?

Let's say she did, and her hands touched the "H" folder.

The receptionist goes, "Nope. That's not M. Well hey, since H comes before M, I can discard all the last name folders before H, and just worry about this section after it."

Cool. Now she rummages through the middle of the stack and pulls out the "P" folder. Not there yet, but she does the same as before and discards whatever comes after P since M is before P.

Now she has a smaller area to work with, and she does this until... voilà! Folder M.

"Mr. Martinez! I found your file! Mr. Marti—? Hm... must've gone for a walk. Oh well."

So what happened? 🤔

With linear search, the receptionist was going through so many folders to find Martinez. But with binary search, she found him in just a few steps.

There was already a whole shelf of sorted last names. She knew that Martinez, 'M', was somewhere around the middle, so she started her search there.

If the folder she picks up from the middle is M, great, she found it! If the folder is a letter before M, she discards the folders on the left, and keeps searching right. If the folder is a letter after M, she discards everything on the right and searches left. In a nutshell, that's how binary search works behind the scenes.

Let's take a look at some binary search code:

const sortedLastNames = ["Angelou", "Austen", "Brontë", "Morrison", "Plath", "Shelley", "Woolf"];

const binarySearch = (sortedArr, item) => {
  // everything from start to finish in our data set
  let start = 0;
  let end = sortedArr.length - 1;

  while (start <= end) {
    let middle = Math.floor((start + end) / 2);

    if (sortedArr[middle] === item) {
      return middle; // found the item
    } else if (sortedArr[middle] < item) {
      start = middle + 1; // keep searching right
    } else { // keep searching left
      end = middle - 1;
    }
  }
  return -1; // not found
};

binarySearch(sortedLastNames, 'Brontë'); // 2

At first glance, it looks more complex and verbose than our linear search function. But fear not, it's not as bad as it seems.

Let's walk through it together.

Firstly, for binary search to work, you need to have a sorted array, unlike linear search. The reason for this is because we're comparing our middle item to what's on the left and right.

Our start and end variables indicate the size of the dataset we're dealing with, e.g., the whole shelf of patient records from A-Z, or an entire phonebook, a dictionary, etc.

Then, we divide the sum of the start and the end values by 2 (rounding down to the nearest whole number in case of an even-numbered list), so that we can find a middle point. We pull the middle out and check if it's the item we're looking for.

Each time we go through our loop, we check the middle element. If the middle value is too low, we update our start (left) by making our new start the middle value plus one. We loop again with a new middle value. If our middle is too high, we update the end (right) by setting it to the middle minus one. This is because we want to minimize our search area. We're essentially narrowing our dataset to a smaller segment each time so that we can pinpoint our target value without having to sift through irrelevant values. This process continues until our item is either found or not found.

If this concept sounds confusing, a visualization might help from mathwarehouse.com:

Just a note—it makes sense to use binary search if your dataset is organized and sorted, and you plan to search through it multiple times. For instance, a patient records database in a doctor's office. But sometimes, you might not want to use this method if you're going to search through something only once. It really depends on your use cases.

Binary Search Time Complexity

Binary search has a time complexity of O(log n), which is a lot better than linear search's O(n). For a visual reference, see below how both time complexities fare next to each other. This significant difference is why, for example, if you had 240,000 patient records, searching for a name using linear search would be inefficient. You might have to go through tens of thousands of records first to reach your target. This inefficiency is what makes linear search O(n). With binary search, you'd only have to go through approximately 18 steps to check if your target is there, instead of 240,000 steps. It's a monumental difference.

Last Few Thoughts...

If you're to dive a little deeper than my cliche analogy, here's a lower level explanation in a video by ComputerPhile:

I initially wanted to do a binary search analogy with a phonebook or a dictionary. But I went with the receptionist and patient analogy because of a recent negative and humorous experience I had at the doctor's office. 😁

At the time of writing this, I was a web development fellow at The Knowledge House where I learned how important data structures and algorithms are. They're not just essential for technical interviews; they help us solve problems and think outside the box.