Problem Solving

This is the half of programming that makes the biggest difference between a mediocre programmer and a really good one. It is also the part that is rarely taught. Most programming courses deal exclusively with the coding half of programming because that is the half which has set rules that must be followed. While the exact rules and syntax differ between programming languages, everyone using a particular language needs to apply the same rules and syntax with their coding.

The other half of programming is completely different. There are no set rules to follow with respect to the problem solving half of programming. It is quite possible for each person to take an entirely different approach to solving a given problem and for each of those approaches to be valid. It is only once each has worked out how they propose solving the problem that their solutions then get converted to code.

Given all of the different approaches to problem solving, how can it be taught? That question is probably the reason why so many programming courses omit this part of how to program. While this is correct to some extent there are still some parts of problem solving that can be taught. While each person's approach to problem solving will be different there are a number of techniques that cam be used to simplify problems and so make them easier to solve.

The only book I have ever come across that deals specifically with this half of programming is the book ""Think Like A Programmer" by V.Anton Spraul. In the first chapter on problem solving he lists the following general problem solving techniques:

Now these are some of the approaches that you can use for problem solving. There are many more. Here are a few others that occurred to me while writing this article:

Perhaps the most important thing is to never lose sight of just exactly what it is you are trying to achieve. In each case the program you are writing needs to be able to take the inputs provided and produce a given result subject to a number of constraints. Now some of the constraints will be specified for you in the problem you are trying to solve while others will be implied. If you write code that produces a solution that ignores one or more of the specified or implied constraints then you haven't really solved the problem. Assuming constraints that don't really exist may make solving the problem impossible.

Another consideration regarding problem solving is that not all of your attempts to solve a given problem will be successful. When you reach the point where you discover that what you have done so far is not going to help you solve the problem then you have to be prepared to discard that particular approach and try something different. The time you spent on the unworkable approach isn't wasted as you have eliminated possibilities for solving the current problem and the work you have done may turn out to be something that will help you solve a completely different problem in the future.

Before we move on to look at specific problem solving techniques, let's look at a simple problem and how implied constraints may make the most obvious solutions unworkable.

For this example the problem we are looking to solve is to add up all of the consecutive integers between two specified integers. For example if the two specified integers are 2 and 7 then we need to calculate 2+3+4+5+6+7 and the result we need would return is 27. Now if those were always going to be the values at the end of the range then simply returning the answer 27 every time would resolve the problem. However the ends of the range are not constrained to being specifically 2 and 7, they can be any integers. Obviously we are going to need some code to actually add the integers in the range together, or do we? The most obvious generic solution is to use a loop to add up the numbers. In JavaScript we might write:

function sumRange(x, y) {
var result = 0;
for (var i = x; i <= y; i++) {result += i;}
return result;

While the exact syntax used for other languages would be slightly different, what the code does would be the same. Our code now works fine for producing the correct result when we test it with a few simple test cases.


Did you remember to consider that last test case? There is nothing in the constraints that say that the integers have to be positive. Come to that, there's nothing in the constraints limiting how big the integers can be (well there are limits to integer sizes within the computer but our test cases so far are a long way from those limits). Is sumRange(-1000000000,1000000000) a valid test case? Can you point to anything that says that this is excluded by the constraints? You can't because there isn't anything specifying a limit on the integers at either end of the range. The above solution will produce the correct answer for this test case but in order to get the answer it will need to perform over six trillion calculations. You might need to wait a while for the computer to add all these integers together before it can finally tell you that the answer is 0. I can't tell you how long it takes a computer to do that calculation because I have never run the above code. In this particular case I can tell that the answer is zero just by looking at the two numbers supplied as the ends of the range.

If I can tell the answer just by looking at the end numbers then why should a computer have to spend ages adding all of the numbers up in order to achieve the same end result. We could consider time to be an implied constraint of the question and if the computer is going to take longer to get the answer than we need to work it out for ourselves then there's something wrong with the computer solution.

While it is obvious that having the ends of the range an equal distance either side of zero will always add up to 0 (that is obvious to you isn't it?) it should also be obvious that some sort of pattern exists that would allow the answer to be determined directly from the integers at either end of the range without needing to add up all the intervening numbers. Just what that pattern is needs to be determined but if we can work out the formula for the pattern and substitute that into the function then we'll have a function that scales much better when the ends of the range are larger numbers.

Did it occur to you that there is a pattern to the result of adding a series of sequential integers together that is dependent only on the integers at either end of the range? If you didn't then you need to start looking for patterns or your code will always be as inefficient as the above loop performing trillions of operations to produce an obvious result.

There is no constraint in the question that forces us to actually add up all the integer. We need to determine what the sum of all the integers is and return that result but how we actually calculate the result is up to us. By recognising that a pattern exists between the end points of a range of integers and the result of adding all the integers in that range together we are half way toward obtaining a far superior solution to the problem. Now all we have to do is to work out what the pattern is.

How you go about this is up to you. Any of the techniques listed above and many others may work to help you work out this formula. The approach I chose was to experiment. This pattern seemed so simple to me that I thought that I should be able to easily work it out using a few simple test cases and a few minutes of trial and error. This actually worked for me because I have been solving patterns like this for many years and so had quite a good idea of what sorts of calculations might need to be done. This approach probably wouldn't work for most programmers because the particular calculations I chose to try first would be lost amongst many others.

Within a few minutes I came up with:

sumRange = function(x, y) {
return (y-x+1)/2*(x+y);

Now there's nothing in that calculation that I would consider to be obvious. It just happened to give the right answer in each of the test cases I was using for my trial and error approach and so was the first candidate for a solution. I then confirmed that the formula is the correct one by trying it out on a much larger selection of test cases including several that looked like they would represent boundary conditions for that particular formula. Each of those also produced the correct result.

Now this hasn't proved that the formula is correct for all possible ranges but given that I know that there is a pattern and that this particular formula returned the correct answer for every case I tested and that it is similar to the formula for other patterns that I have dealt with before, I am confident enough that this formula will always return the correct result that I would use this code in preference to the loop. That way no matter how big the range of integers to be added the computer only needs to perform five calculations to return the result.


This article written by Stephen Chapman, Felgall Pty Ltd.

go to top

FaceBook Follow
Twitter Follow