3 Variations on sequential search

We are going to look at a simple problem. Things might look complicated at points along the way. Be patient. Some important lessons will emerge at the end. If I am successful, the algorithms will appear simple to you again at the end.

Suppose that you are looking at a calendar. The page for July is open. You or someone else has previously written the highest temperature of the day next to each of the 31 dates on the page. You know how hot it was in your city on July 1, on July 2, and so on up to July 31.

Now you want to know the highest temperature for the whole month.

Do you have a method? Do you need a method? Can you not just look at the page and spot the largest number?

You might well be able to pick out the largest value in a set of numbers in a single glance. However, you can only do this if the set is small. What if your instructor or employer asks you to find the highest temperature in the last century? Now you must examine thousands of temperatures. If you are working alone, you might still be able to accomplish the task without spelling out a procedure. You “just know” how to find the biggest number. You do not have to tell anyone else how you do it. You do not even have to list the steps in your own head.

However, if your instructor or employer assigns you the task of identifying the largest value in a set of one million values, then you will probably see automation as a necessity and not just an attractive option.

At the lowest level, a computer can respond to only a small variety of very elementary instructions. To make a computer do something, we must give the computer explicit instructions. We must provide a series of steps that leads to the solution to a problem.

Here is our challenge. We want to accomplish a task with the help of a computer. In cases where the size of our problem is small, we can easily accomplish the task without the use of a computer. When we are solving the problem manually, we can proceed without thinking very much about our methods. However, we need a computer to solve large instances of the problem. Then we do not to think about our methods. We must be able to describe our method in detail and with precision.

There are many problems like this one. A child can solve the problem, but we find it challenging to describe step by step the method that produces the solution. Sorting a list of numbers or names is another such problem.

Finding the largest number in a list of numbers

Here is an algorithm. Can you translate these words into executable code?

  • Begin with a list of numbers. Do not assume that the list is sorted.
  • Assume that the first number in the list is the largest. Store its value in a location that you label BestGuessSoFar.
  • Examine the next number in the list. Compare its value to the value of BestGuessSoFar. If necessary, update the value of BestGuessSoFar.
  • Examine the third number in the list. Again, compare and then update if necessary.
  • Continue for each successive number in the list: examine the value of that next number, compare its value to that of BestGuessSoFar, and update the value of BestGuessSoFar if the value examined is greater than the value of BestGuessSoFar.
  • At the end of this procedure, BestGuessSoFar will hold the value of the largest number in the list.

This search for the largest value looks for a “greater than” relationship at each step. Replacing “greater than” with “less than” will yield a program that finds the smallest number in the list.

This algorithm looks for the largest number in a list of numbers. What if the list contains something other than numbers? The list might, for example, contain words. Will the algorithm still be useful?

Faced with a new problem, look for similar problems that you have already solved. You might be able to solve the new problem by adapting the solution to a familiar problem.

Finding the index of the largest number in a list of numbers

Again, suppose that you have list of 31 temperatures. These are the highest temperatures measured on each of the 31 days in July in a particular year at a particular location.

This time you do not want to find the highest temperature in the list. Instead, you want to find the date on which the temperature was highest.

  • Begin with a list of numbers. Do not assume that the list is sorted. Each number has an index. The index is the position of the number within the list.
  • Assume that the first number in the list is the largest. Store the index of this value in a location that you label BestGuessSoFar. We count from zero in computer science, so the index of the first value a list is zero. However, calendars number the days of a month beginning with one. There is no July 0 on the calendar. There is a July 1. Beware! Here is a common source of errors. Computer scientists call these “Off By One Errors.” (Fans of Star Wars might call these “Obi-Wan” errors!)
  • Increment the index. The index is now one. (The index of the second element of a list is one.) Examine the number at that position in the list. Compare the value at that position to the value of the element of the list whose index is found in BestGuessSoFar. If necessary, update the value of BestGuessSoFar with the current index.
  • Increment the index. The index is now two. (This is the index of the third element of the list.) Again, compare and then update if necessary.
  • Continue for each successive index in the list: examine the value of the number at that next position in the list, compare its value to that of of number found at the index stored in BestGuessSoFar, and update the value of BestGuessSoFar if the value to which the current index points is greater than the value to which BestGuessSoFar points.
  • At the end of this procedure, BestGuessSoFar will hold the index (that is, the position) of the largest number in the list. The number of the day will be one more than this index. If the program finds the highest temperature in element 30 of the list, it should report that the highest temperature was on July 31.

Although this version of the problem might appear to be harder than the first version, small changes to the solution of the first problem are enough to produce a solution to this problem.

  • The initial value of BestGuessSoFar will be 0, not the value of the first number in the list.
  • The comparision looks like data[k] > BestGuessSoFar rather than data[k] > data[BestGuessSoFar].
  • Updates to the value of BestGuessSoFar look like BestGuessSoFar = k rather than BestGuessSoFar = data[k].

Finding largest number’s index in the last part of a list

Again, suppose that you have a list of the high temperature on each of the 31 days of July. Now the goal is to find the date during the last week of July on which the temperature was highest.

How to solve this problem? Easy! Take the solution to the previous problem and replaces the initial value of BestGuessSoFar with the index of the first element in that part of the list in which you wish to search for the largest value. To discover which day in the last week of July saw the highest temperature, begin the search on day 25.

Lessons to take from this exercise

  • Know what conditions hold before your code begins its work. In this case, we have a list of numbers. We have said nothing else about the list. Would it matter if the list were sorted? Would our algorithm work if the list contained some repeated values? Maybe we should put aside any consideration of those special cases for now.
  • If you cannot solve the whole problem, solve part of the problem. In this case, you might begin by writing a program that simply prints the values of the elements of a list. Then you might write a program that prints the index and value of each element.
  • If you cannot solve the assigned problem, look for a similar problem that you can solve. In this case, if you cannot find which day was hottest, look for which temperature was highest.
  • Label the pieces of your program in a way that helps readers understand your methods. A person with no knowledge of computer science or programming can look at variable named BestGuessSoFar and see what you are trying to do.
  • Even simple programs can contain subtle complexities. The element of the list whose index is 0 contains the high temperature for July 1, not the temperature for the non-existent July 0.
  • Know what conditions hold after our code completes its work. Our list remains unchanged. For one kind of problem, we will have found a highest temperature. That highest temperature could be a positive or negative number. (Maybe our thermometer is at the South Pole. July comes in the middle of winter at the South Pole!) For another kind of problem, we will have found the index of the element that holds the highest temperature. An index will never be negative and will never be greater than one less than the size of the list. The indices of a list that contains the highest temperatures on each of the 31 days in July will range from 0 to 30.