17 Exception Handling – 1
Bhushan Trivedi
Introduction
Programs can handle errors in a fashion specified by the programmer. C provides a few ways for programmers to report and respond to errors. However, the process is quite untidy and not standardized. Most other languages, especially Java and Python, provide very systematic exception handling mechanisms. Exceptions are special conditions, including errors. When the program is given an unacceptable input, a function is passed an unacceptable parameter, the file specified is not found, or the disk is full, the programmer must provide some mechanism for the user to get this message and the freedom to take some action corresponding to that situation. Language designers help programmers handle such conditions by providing a systematic reporting method known as exception handling. Programmers write the code to generate exceptions when something unusual and unexpected happens, as in the cases above, and the program cannot continue functioning in a normal fashion. Exceptions include errors but can also be used when some permanent control transfer with a message is needed. During the course of this module, however, we will only consider errors as the cause for generating exceptions and nothing else.
In this module, we will look at the shortcomings of the traditional error handling process, see how a more systematic exception handling process is a better option in some cases, and study the various ways in which exceptions are handled.
Traditional ways to handle errors
There are three major ways to handle errors conventionally. We will look at each one in a little more detail in this section.
Error number
The main function in C is designed to return a single int value. Conventionally, a return value of 0 means no error, while any other value means an error, and the value indicates which error it is. Many other system functions are designed along the same lines: they return 0 for success or an error number which indicates the type of error. The downside of this approach is that the numbering scheme is the programmer's choice. One programmer may decide to use 1 for file not found and 2 for disk full, while another may choose 1 for disk full and 2 for an attempt to write to a read-only file. Unless there is a consensus, it is hard to work with such a scheme in a general manner. It works for a single programmer or a small group of programmers closely linked with each other, who can agree on the exact meaning of each error number. It cannot work in general, as two programmers unknown to each other cannot communicate using this method. Programmer 1 may decide to return 5 for file not found in his function X. Programmer 2, who is using this function X, may understand 5 to mean disk full, and eventually he will not be able to resolve the error. Some consensus or standard is necessary.
Global Flag
Sometimes a function sets a global flag to indicate the status of the system. When the error has global significance, for example when the file asked for is not available or the expected hardware is not present, this method is better. In C, there is a variable known as errno which library functions set when they fail, indicating the type of error. The standard library function perror can be used to print the message corresponding to the current value of errno. Unlike user-specified error numbers, these system-defined error numbers are standardized: the symbolic names are shared across most operating systems, although the numeric values behind them may differ. Another advantage is that a standard error message is associated with each of these values, which makes this approach more user-friendly. However, this mechanism is meant for system-related errors, and it offers no standard way for a user to describe errors specific to his own functions.
In short, errno is a global variable in C which holds the most recent error value reported by the library. When a library call fails, it stores in errno a value which indicates exactly what went wrong. A successful call does not reset errno to zero, so the value is meaningful only after a call has reported failure; a program that wants to test errno directly should set it to zero before making the call.
The downside is that the global flag must be checked immediately after the operation whose status we want to study. For example, when we open a file and would like to know the status, we must place perror immediately after the open statement (or otherwise deploy our own mechanism to react to the global error flag). If we fail to do so, perror may not show the right status, because any later library call is free to change errno. That means if we open the file, then do some printf and only then call perror, the message may describe the printf call rather than the failed open. The status of the file open operation can be overwritten by the next operation.
Exit or Abort to terminate abnormally
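A popular method in C programs is to call exit() or abort(), with or without an error value. When abort() is called, the program terminates there and then without doing any cleanup. exit() does a little more (it flushes and closes the standard I/O streams and runs functions registered with atexit()), but in C++ it does not call the destructors of local objects, so the program is not restored to whatever extent it could be. The exit() function is still used in C++ whenever there is no alternative but to quit the program immediately.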
The problem with this approach is that we have no built-in mechanism to salvage whatever is possible. For example, suppose we have opened a network connection, opened a few files and connected to a database, and then encounter an error which does not let us proceed. If we call exit(), the open network connections, open files and database connections are left in limbo and introduce strange errors that are hard to handle. A programmer can always write code stating exactly what should be done on abnormal termination and have the system execute it whenever exit() is called. Most programmers are not equipped with such skills, and they may introduce other errors if they attempt such operations. Ideally, the system should automatically find such resources and close them on its own.
Exception handling
If you have read about these three approaches and their shortcomings, you know why most current languages, including C++, provide an exception handling mechanism. None of these approaches is appropriate in general, and we need a standardized and better solution, at least for cases where the error is generated and handled at different places and programmers have little idea about who is using the code and how. The solution demands using exceptions in the system and a mechanism to systematically manage them.
Exceptions are special conditions including errors. When the program is provided an unacceptable input or a function is passed with an unacceptable parameter or the file specified is not found, or the disk is full, the programmer who is coding must provide some mechanism for the user to get this message and provide him the freedom to take some action corresponding to that situation. These exceptional situations which demand further action are known as exceptions.
Exception handling (EH) is a standardized way to report and handle anomalies and errors across multiple functions. Programmers write the code to generate exceptions when something unusual and unexpected happens, as above, and the program cannot continue functioning in a normal fashion. Programmers also write the code that directs the system about what to do when such abnormal conditions occur. Another definition: EH is a mechanism of communication between a library designer and the library user about anomalies or abnormal situations, which gives the programmer a way to handle such situations.
Here are some important reasons for wanting a more systematic approach than what C offers.
Designer and user of library
The first reason for having such a mechanism is that we would like a clear demarcation between who generates the error and who handles it. In real programming, the person who writes the code which might generate an error is normally not the person who writes the code which acts on that error. In real-life situations, a group of core programmers normally designs libraries, and other programmers put these libraries to use based on their needs. For example, we all use functions such as printf, scanf, atoi and sqrt without really knowing who coded them and how. The same applies to user-defined libraries. Let us take the example of a graphics library. A graphics designer designs website graphics components which website developers use to decorate their pages. The designer knows the component he is designing and provides operations to manipulate it. For example, the user can enlarge or shrink a component using the functions enlarge() and shrink() that the designer provides with his design. The designer also knows that shrinking or enlarging beyond a point is not desirable, so whenever the arguments passed to either function are inappropriate, the designer knows that an error has occurred. Can the designer decide what to do with that error? Not really. The programmer who creates web pages and uses these graphics knows exactly what to do. He can display a message that the logo cannot be reduced to such a small size because it would not be seen properly, or that enlarging the symbol further would overlap the text area, and so on. As this programmer is working with the exact situation, he knows what to do with the error.
One more way to look at this issue is to see the context under which the program is working. When the function is designed, the designer can make decisions based on his library context only. The designer has no idea about the local context under which this function is put to use. That contextual information is only available to the user and he is the best judge on deciding what to do with that error; which action to take, which message to display etc. Thus it is important to note that the context where the code for error generation is written is different than the context where the code for error handling is written. We need a solution which connects these two different realms. Not only that, we must also understand the limitations of working in one context and the impact of that work being in the other context.
In short, the library designer is a programmer who develops library functions, while the library user is a programmer who develops code for end users with the help of the library functions written by the library designer.
The situation mentioned above is not a one-off. It is commonly observed in real-life situations. The programmers who use a library know how to use it and have a clear idea of what exactly should be done when an error occurs. Libraries, on the other hand, are designed for multiple situations, and many programmers, each in their unique situation, may put the library to use in a very different way from others. The library designer cannot predetermine the manner in which the library is going to be used. Another point is that the library designer decides when to report errors and the library user has to act on them. This mechanism has to be standardized (unlike arbitrary numbers assigned to different errors) so that any library user can use it without ambiguity. Anybody who designs a library can throw exceptions which are standardized and have a clear and unambiguous meaning. Unlike error numbers, exceptions are objects with well-defined unique names, so they are easier for humans to understand.
Another critical point is that methods like exit() or abort() do not give programmers any control while the program is being closed. Ideally, the library designer should design the system in a way that allows the programmer to decide what to do when the error has occurred: close files, terminate network connections, inform users and the server that the program is closing, empty the buffers allocated for the job and return them to the OS, and so on. At the very least, the system should destroy all objects which have been defined and are still alive. Calling the destructors of all objects defined before the exception is thrown is a normal process that a user would like to have control over.
A graceful exit is about exiting the program in a controlled manner, in which the user can execute cleanup actions like closing open files before exiting. The point is, one needs a solution with a graceful exit option rather than abrupt and uncontrolled termination.
In a way, we need a mechanism where we can separate error reporting and error handling process.
Error reporting happens when a library function realizes that the data or the process is going in the wrong direction and raises an error. This process is also called exception throwing.
Error handling happens when the program calls a library function which reports an error. The program must have code to deal with that error, and the process carried out by this code is known as error handling.
The library designer takes care of the error reporting process. He finds all possible cases where the function he has designed runs into errors and provides a mechanism to report errors in all those cases. The error handling mechanism, on the other hand, is in the hands of the library user (the programmer who designs programs using those libraries), who writes the code for all possible errors reported by the designer. For example, if the designer has provided an error related to "bigger than required dimension", a library user who is designing an aeroplane layout might display "the left wing is going out of screen", while a library user who is designing a basketball game might display "the left side goal post is drawn out of screen", each based on his own processing of the error.
The library user takes a central role in this process and decides the flow of processing when the error occurs. He might not terminate the program at all, but take an action to mitigate the situation. For example, in the case of an error like file not found, the program might create a new file at a location preferred by the user and use it instead of the earlier file, or prompt the user to choose another file. This control is with the library user.
The library user faces one typical problem known as the object destroy problem. Let us discuss it now. This problem reinforces the need for a standardized exception handling system.
Object destroy problem
We have seen constructors with dynamic memory allocations. We have also seen that whenever a programmer allocates memory, it becomes his own responsibility to deallocate that memory (see footnote 1). Normally, a programmer is careful enough to write a destructor so that when the object goes out of scope it deallocates the memory. When the program has to terminate abruptly, all the objects defined during the execution should still be destroyed gracefully by calling their destructors. An object might have acquired resources like network connections, file handles, buffers etc. during its existence. Unfortunately, exit() and abort() do not call the destructors of such local objects and thus leave an inconsistent state. If, instead, the object-destroy process is carried out, these resources are smoothly returned and connections are closed before the object is actually destroyed.
In other words, whenever an exception is thrown, the function has to leave its job in the middle and cannot complete it. That is why all the objects constructed so far for that job are to be destroyed, placing the program back in the stable state it was in before the function that throws the exception was called. This problem is called the object destroy problem.
It is clear from the above discussion that one needs a systematic exception handling system. In the subsequent section, we will discuss the components of exception handling system provided by C++ and how those components together provide solutions to the problems we have discussed above.
1 Since ANSI C++ 2011, the standard library provides smart pointers (such as unique_ptr and shared_ptr) which can help programmers in this regard by releasing memory automatically, but there is a huge amount of legacy code still running across millions of machines which still needs manual deallocation.
Components of EH system
The EH system of C++ contains three different components. The actual error is reported using a throw statement. The throw is the first component of the EH system. It is used by the functions written by the library designer as and when they find a situation leading to an error.
The library user uses the other two components. He identifies the part of the code which might run into an error (because it calls library functions which throw exceptions in error-like situations) and encloses that part in a try block. The keyword try, the second component, heralds the beginning of the try block, which is enclosed in curly braces {}. The incoming exceptions are handled by the third component, which is rightly called a catch.
In short, there are three components of the EH system. The throw statements are part of the library design and are used to report an error. When the library user calls library functions which might run into errors, he encloses the calls in a try block. The error is handled at the library user level using the third component, the catch block, which immediately follows the try block. Whenever a throw statement is executed, the control is permanently transferred to the catch block and the statements after the throw statement are not executed.
In other words, a try block contains at least one throw statement or a call to a function which in turn contains such a statement. A throw statement throws the exception from that point and permanently transfers the control to the appropriate catch block. The catch block catches that exception and takes appropriate action at the library user's end.
Let us take an example to illustrate the point.
Program 18.1
#include <iostream>
#include <string>
using namespace std;

// A simple exception class carrying an error number and a message
class ExNew
{
public:
    int ExNo;
    string ExMsg;
};

int main()
{
    try
    {
        cout << "Inside Try now\n";
        cout << "throwing exception as error assumed to occur\n";
        ExNew Error1;
        Error1.ExNo = 20;
        Error1.ExMsg = "Error at this point";
        throw Error1;
        cout << "Control Won't come here"; // never executed
    }
    catch (ExNew Except)
    {
        cout << "\n Inside Catch now \n";
        cout << "Exception Number is " << Except.ExNo;
        cout << "\n" << Except.ExMsg;
    }
    return 0;
}
Try, throw and catch
Look at the program 18.1 carefully. It contains all the three components we have discussed in the previous section.
The try block is the first thing in the body of main() and contains a single throw statement. This throw statement throws an object of class ExNew. You can also find the catch block immediately following the try block. Look at the syntax of the try block first. It consists of the single word try followed by the code enclosed in a pair of curly braces. It contains a single throw statement with Error1 as the operand. Error1 is the object of type ExNew which is being thrown. The catch block looks like a function definition with a single parameter, but it is not called like a function. The parameter defines the type of exception it catches and receives the value thrown, which the block uses in processing the exception. You can now see how these three parts are interrelated.
try
{
…
throw Error1;
}
catch (ExNew Except)
{
}
Interestingly, you cannot see the library designer and user problem from this simple, unrealistic example. A little better example which showcases the need for library user and designer will be introduced soon.
Let us try to elaborate on the code. We have not done any serious processing; we just assumed an error and executed a throw using the statement throw Error1;
In an actual case, some situation demands that an exception be thrown, for example when a file needed for further processing is not found at the location specified by the user, the disk where the file content is to be written is full, or the dimensions of the object being drawn go beyond the screen. Whenever such conditions are checked and found to be such that the programmer must be informed about them, the exception is thrown using the throw statement. The try block contains a few other statements, but they are of little significance. In a real case, it would contain steps to open files, write to them and so on, and any of those statements might introduce an error. We have simplified the situation by just executing throw.
Once the throw statement is executed, the control is permanently transferred to the nearest matching catch block. If no appropriate catch block is found, the system calls a function named terminate(). By default, the terminate function closes the program abnormally by calling abort(). However, it is possible to modify that behavior. Using the library function set_terminate(), we can replace the built-in handler with our own function, which can do other things first and must then end the program, typically by calling abort(). This two-step process, calling terminate which in turn calls abort, is a good idea as it gives the programmer control over the case where an unexpected exception is thrown. He can write the code to close files and connections, if nothing else, before calling abort().
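It is also important to note that catch matching is more restrictive than argument matching in a normal function call, because the implicit conversions between built-in types are not applied.
For example, catch(unsigned int) won't catch a thrown int value.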
Anything written after the throw statement, for example the following statement, will not be executed.
cout << "Control Won't come here";
The throw statement throws an object of class ExNew which contains only two items, ExNo and ExMsg. The catch block receives the object if the type specified inside the parentheses matches. In our case, the catch is designed to receive objects of the type ExNew, as specified in the following statement.
catch (ExNew Except)
The variable Except, of type ExNew, now receives the object that was thrown. The catch block can use the object and its attributes for whatever purpose it deems fit. In our case, we used two cout statements to display both the attributes.
Unlike a function call, the transfer of control to a catch block does not return. Once the control is transferred to a catch block, it won't come back and execute the remaining statements after the throw. That is why we have called this control transfer permanent. Even when no matching catch is found, the control won't come back.
In short, when a function encounters an exception, the control is transferred either to a corresponding catch or to the terminate function and will not come back. This one-way transfer is thus called permanent control transfer.
Challenges in exception handling
The EH approach is not without problems. There are quite a few additional challenges the C++ designers have to face while providing this solution. The readers who aspire to be good programmers must learn about them. The behavior of the program changes substantially under this model and if the programmer is unaware of it, he might not be able to deliver an efficient solution. One must understand all subsequent effects of throwing exceptions before doing so.
This approach demands coding with a proper design. One must know the types of exceptions which could be thrown from a function before calling it, so that an appropriate catch block is provided for all such cases. Everywhere a function which can throw an exception is called, the code must be enclosed in a try block, and the user must provide a proper catch for each possible exception. For example, the operator new is used to allocate memory. The designer of new has no idea in which situation you are calling it and does not provide any specific message. It only throws a bad_alloc exception if it runs out of memory. A user who calls new without enclosing it in a try block, and without providing an appropriate catch block, runs into abnormal termination of the program when the allocation fails. A C program which calls malloc() and does not check the returned pointer usually ends abnormally in the same situation, so we lose nothing if we only want to mimic that behavior. However, with exceptions we can use catch (bad_alloc) in our calling function and take some action: for example, find a solution which does not consume additional memory, find an alternate source of memory, or at least close files and open connections before terminating.
The point is, whenever we call third-party functions or even system functions, we must be aware of the type and nature of the exceptions they might throw and must write an appropriate handler for them. We can always write code to improve on the default behavior of abnormal termination.
Another point: EH is a mechanism of communication between a library designer and the library user about anomalies or abnormal situations, which gives the programmer a way to handle such situations. The programmer must be capable of utilizing the power of control provided to him.
Modules 20, 21 and 22 deal with inheritance, the ability of one class to inherit from another. When one class inherits from another, the inheriting class is called a derived class and the class which is inherited from is called the base class. It is possible to have a base class pointer, make it point to an object of any of the derived classes and manipulate that object. Objects of derived classes which can be pointed to by a base class pointer are known as polymorphic objects. Dealing with those objects is complicated in every way, and finding a proper handler for them is equally complicated. In fact, it is possible to throw a derived class exception and let it be caught by a catch statement which specifies the base class. When to do so and how to manage polymorphic objects will be covered during our discussion of polymorphism.
Whenever a throw statement is encountered, the control travels back towards the beginning of the try block and, on the way, destroys all local objects defined since then one after another, calling their destructors and deallocating memory if need be. This process is known as stack unwinding. Conceptually, when the C++ EH system enters a try block, it starts placing every object defined after that point on a stack; when an exception is thrown, the objects are picked up from the stack and destroyed. Thus the object defined last is destroyed first, and so on until the object defined immediately after the beginning of the try block.
This process is quite logical and important for the smooth termination of the program; however, it is not without a serious problem. If a destructor throws an exception during stack unwinding, the program is in a confused state, as one exception is still being handled and another is being thrown. By design, C++ does not allow a second exception to escape from a destructor while the first one is still unwinding the stack, and such a situation leads to abnormal termination through terminate(). So, as far as possible, one must avoid throwing exceptions from a destructor (see footnote 2).
Propagation of control
It is interesting to learn that the propagation of the control after the throw is always upwards, from the deepest function to the outer function where the exception is handled. Consider a case where Function A calls Function B, which in turn calls Function C. What if the exception is thrown in Function C? Function C may not have a try block, and so there is no catch for the exception there. The control immediately leaves Function C and returns to the calling function, i.e. Function B. However, Function B also does not have a try or catch, so the exception is passed up to Function A after leaving Function B. Fortunately, Function A makes the call to Function B inside a try block. The exception is therefore handled here, and the catch block immediately following the try in Function A is executed. In all three functions there are some statements after the throw or the function call; none of them will be executed (unlike normal function calls, where they are executed after the call returns).
2 There is one solution, which we will learn about in the next module. In any case, it is always a good idea not to have code which can throw exceptions in a destructor.
A | B | C |
{ | { | { |
… | … | … |
try | C() | throw SomeException |
{ | Some other statements-B | Some other statements-C |
B() | } | } |
some other statements-A | | |
} | | |
catch() | | |
{ … | | |
} | | |
} | | |
There are two queries. First, what if there is no try-catch even in Function A? If Function A is main and there is no other calling function for it, the exception handling mechanism calls terminate(), and if terminate() is not redefined by the user, it calls abort() and thus the program is terminated in an abnormal fashion.
The second query is, what if the function B contains a try-catch block and calls the function C inside? After the throw of the SomeException from Function C, the control immediately transfers to the catch statement of the function B (it will not execute statements after the call to Function C is made). However, once that part is done, the exception is said to be handled properly and the call to Function B terminates in a normal fashion. Thus, remaining statements (Some other statements-A) of function A will be executed in a normal fashion exactly like a normal function return of Function B.
In other words, whenever the exception is thrown, it travels outside the function and up in the hierarchy until a try and an appropriate catch statement is found or it exits out of main. This is how the EH control progresses.
Another important point is to learn how the objects are destroyed in such a case. The EH system systematically destructs objects in reverse order, first in Function C, then Function B and finish off with function A. We will look at one program which implements this case in program 18.2 in the next section.
Throwing within a called function
Let us take one example to see what happens when a called function returns with an exception which is thrown from within.
Closely observe program 18.2. It has a structure similar to what we have seen in program 18.1. Main calls ExGen, and ExGen calls ThrowEx. ThrowEx throws the exception but does not handle it; the exception is handled in ExGen. You can see that this program and the previous program are not much different. Instead of throwing the exception directly inside the try block, ExGen calls ThrowEx, which in turn throws the exception; that is the only difference. The exception is still handled in the function which contains the try block. We are simulating a case where ThrowEx is like a library function defined by a library designer, which throws an exception. The library user has written ExGen, which wraps the code which might throw an exception (including the call to ThrowEx) in a try block. It also provides a catch block immediately after that to handle the exception. This is how one calls library functions which might throw an exception.
Program 18.2
#include <iostream>
#include <string>
using namespace std;

// Exception class with an error number and a message
class NewEx
{
private:
    int ExNo;
    string ExMsg;
public:
    NewEx(int ErrNum, string t_msg)
    {
        ExNo = ErrNum;
        ExMsg = t_msg;
    }
    void DisplayEx()
    {
        cout << "Error Number is " << ExNo << "\n";
        cout << ExMsg << "\n";
    }
};

// Plays the role of a library function: throws but does not handle
void ThrowEx()
{
    cout << "second level\n";
    cout << "Assuming error and throwing exception now\n";
    NewEx Error1(20, "Test Error");
    throw Error1;
    cout << "The control will not come here";
}

// Plays the role of the library user: calls ThrowEx and handles the exception
void ExGen()
{
    try
    {
        ThrowEx(); // calling this function generates exception
        cout << "This will not be printed";
    }
    catch (NewEx Except)
    {
        Except.DisplayEx();
    }
}

int main()
{
    ExGen();
    cout << "This will be printed\n";
    return 0;
}
The output is
second level
Assuming error and throwing exception now
Error Number is 20
Test Error
This will be printed
You can see that the exception has climbed up from the second level (ThrowEx) to ExGen and is handled there. Once the exception is handled, ExGen returns normally, and that is why the cout statement of main is executed properly.
Catching Exception
The exception is thrown from one function and handled in some other function which calls that function. It is imperative that the developer who writes the handler function knows exactly what to catch. Let us gather some more information on how catch works.
A catch statement matches the following:
1. The exact type of the exception which is thrown. For example, when Error1 is thrown, our catch specifies the type NewEx, exactly the type of Error1.
2. A base class of the thrown type. That is, if NewEx were derived from some other class, a catch specifying that base class would also match a thrown NewEx object (NewEx is not a derived class in this case). You may refer to Reference-1 for an example and a detailed explanation.
3. If a pointer is thrown, it can always match the special pointer type (void *). This enables the handler to manage all types of pointers together.
It is also possible to have one try block inside another. In that case, an exception thrown from the inner try block and not handled there is managed by the outer try block.
Catch all
One prerequisite to manage library function exceptions is to make sure that all exceptions thrown by the library functions are caught. How does a caller know about all such exceptions? What if he does not know? There are multiple ways to handle this but one of the simplest solutions is to use a catch which can catch any exception. The syntax of catch-all is simple as shown below.
catch (int) { /* handle an int exception */ }
catch (char) { /* handle a char exception */ }
catch (NewEx) { /* handle a NewEx type exception */ }
catch (...) //three dots indicate all, catches any other exception
{
//do whatever you want
}
The catch (...) statement shown above is always written as the last catch statement in a sequence of catch blocks, after all the other catch statements. It prevents the program from crashing when some unknown exception is thrown. Catch-all is quite similar to the default case in a switch statement. That means, whenever the special argument with three dots (...) is given to catch, it is known as a catch-all. This catch block can catch any exception thrown from the preceding try block.
Summary
We have looked at the needs of the case where a library designer designs a function or a library of functions and another programmer uses that library. EH is a mechanism to pass anomaly-related information from the designer to the user in a standardized form, which is different from and more powerful than the C-like error handling approach. In this module, we have seen the EH components provided by the C++ language (the throw, try and catch keywords) and their usage, different ways of throwing an exception, and how the catch block is used to catch the exceptions which are thrown.
References
- Programming with ANSI C++, Bhushan Trivedi, Oxford University Press
- www.stroustrup.com, homepage of Bjarne Stroustrup, the creator of C++