14 Arrays and Strings

Bhushan Trivedi

epgp books

Introduction

 

C++ is an extension of C and carries much of C's legacy. One part of that legacy is the array: a simple structure that holds a collection of homogeneous items as a single unit. A great strength of arrays is that the programmer can process the complete collection using an index value and a loop. Another important legacy of C is the string. A string in C is an array of characters terminated by a sentinel value called the null character ('\0'). The C string library contains many functions for dealing with C type strings, for example copying one string into another, comparing two strings, and finding out whether a given substring is part of a string.

Arrays are part of the C++ design and are used as they are in C. C type strings, however, are not preferred in C++, because the language designers provide a better form of string. The C++ string is an object of a class called string which is available to C++ programmers. C++ contains a library called the Standard Template Library, or STL, which contains many other classes apart from string; string is one of its most popular classes. The string class has many advantages over C type strings, and this module throws some light on them. We will explore two important things in this module. First, we will see how arrays are extended in C++ to have objects as their elements. Second, we will learn how C++ string objects are defined and how one can program using them and their member functions.

 

 

Before we embark on the discussion of string objects, let us be clear that the C language array structure and strings are assumed to be known. If you have no idea about C arrays and strings, it is strongly recommended that you study them before attempting to learn the content of this module.

 

Arrays of objects

 

Arrays in C++ can contain everything a C array can; additionally, they can contain objects as their elements. An array of objects is simply an array whose elements are objects. Let us look at an example to illustrate the point. Program 15.1 defines a class called Player with three data members and two member functions. Every player has a jersey number, a name and an address. One member function takes the details of the player (the values of all three data members) and the other displays those values. Functions of this kind are an integral part of most of the classes that we have defined. When we study operator overloading, we will see how to simplify this process by overloading the operators << and >>. The class definition contains private and public sections exactly like the class definitions in previous modules, and we access members defined in the public section using the dot operator, as before. The new parts are the definition of the array and the use of array elements as objects. Here is the program for your perusal. Notice how the objects which are part of the array are accessed and used.

 

// Program 15.1
// ObjectArray

#include <iostream>
#include <string>
using namespace std;

class Player
{
private:
    int JerseyNo;
    string pNm;
    string pAdd;

public:
    // Stores the three details of a player
    void Get(int Jersey, string t_pNm, string t_pAdd)
    {
        JerseyNo = Jersey;
        pNm = t_pNm;
        pAdd = t_pAdd;
    }

    // Displays the three details of a player
    void pDisplay()
    {
        cout << "Jersey Number is " << JerseyNo << "\n";
        cout << "Player name is " << pNm << "\n";
        cout << "Player Address is " << pAdd << "\n";
    }
};

int main()
{
    int i;
    Player pArray[4];   // an array of four Player objects

    pArray[0].Get(1, "Virat Kohli", "India");
    pArray[1].Get(2, "Chris Gayle", "West Indies");
    pArray[2].Get(3, "A B Deviliers", "South Africa");
    pArray[3].Get(4, "Steve Smith", "Australia");

    for (i = 0; i < 4; i++)
    {
        pArray[i].pDisplay();
    }
    return 0;
}

 

Description

 

Let us look at program 15.1 from the point of view of our understanding of arrays. At the beginning we have defined a class called Player with three data members and two member functions. An important statement appears at the start of main.

Player pArray[4];

This statement is quite similar to statements which define an array in C, except that the elements of the array are of type Player. The definition itself has nothing new. pArray is now an array in which every element is an object of type Player. That means pArray[0] is a Player, pArray[1] is a Player and so on. We need to treat those elements as objects of type Player, which we do by calling member functions in the following fashion.

 

pArray[0].Get(1, "Virat Kohli", "India");

 

pArray[i].pDisplay();

 

Look at both statements. In the first, an int constant is used as the index, and in the second, an int variable is used. In both cases we access an array element and then use dot notation (as we did earlier) to call a member function.
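The two kinds of indexing can be sketched together in a small search routine. This is only an illustration: the class Member and the functions FindName and Demo are hypothetical names introduced here, and Member is a cut-down stand-in for Player, not the class from program 15.1.

```cpp
#include <string>
using std::string;

// Hypothetical, simplified class used only for this sketch.
class Member {
private:
    int number;
    string name;
public:
    void Set(int n, const string& nm) { number = n; name = nm; }
    int Number() const { return number; }
    string Name() const { return name; }
};

// Returns the name of the first element whose number matches,
// or an empty string when no element matches.
string FindName(const Member arr[], int count, int wanted) {
    for (int i = 0; i < count; i++) {
        if (arr[i].Number() == wanted)   // variable index, then the dot operator
            return arr[i].Name();
    }
    return "";
}

// Fills a small array using constant indexes and searches it.
string Demo(int wanted) {
    Member team[3];                      // array of three objects
    team[0].Set(1, "Virat Kohli");
    team[1].Set(2, "Chris Gayle");
    team[2].Set(3, "Steve Smith");
    return FindName(team, 3, wanted);
}
```

Calling Demo(2) from main returns "Chris Gayle", while Demo(9) returns an empty string because no element carries that number.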

 

Thus, we can use arrays as we did in C; in addition to all valid C data types, objects can also be elements of a C++ array. Array elements can be used like normal objects, and we can use dot notation to access their public data and function members. One more point: you have probably noticed that we used the following statements to define two data members of the Player class.

 

string pNm;

 

string pAdd;

 

Which type of string are these? They are not character arrays as in C, nor do they rely on a null character to indicate the end of the string. They are objects of type string. The string is a class from the STL (Standard Template Library) of C++. We will study the STL at a later stage, in modules 16, 17 and 18, but let us now see how we can define and use string objects in C++.

 

Need for the string objects

 

C type strings are character arrays with the null character marking the end of the string. C uses this mechanism for string variables and provides many functions to manipulate strings of this type. To repeat, C type strings are still available in C++; the string object is an additional construct at our disposal. These string objects are also called C++ strings, and they are objects of the STL¹ class called string.

As the ensuing discussion will show, C++ strings are a better option than C type strings, which is why most developers, if not all, choose them. Before we look at the shortcomings of representing strings as character arrays, note that a program must include the string header to use C++ strings. Here is what we have done at the beginning of our program.

 

#include <string>

 

The statement above makes the declarations of the string class, which live in the std namespace, available to our program. A discussion of what the std namespace is and how this works is beyond the scope of this course. You may refer to reference 1 for more information on namespaces².

 

Limitations of string representation as character arrays

If we need to define a string in C, we can do it as follows.

 

char PNm[30];

 

char PAdd[60];

 

So we actually define an array of characters. We can use these names (PNm and PAdd) as strings and use them in normal operations. However, there are a few limitations of this approach. Let us try to see.

 

Non-availability of == operator

 

We cannot compare the contents of C type strings the way we compare normal variables. For example, the following statement does not do what it appears to do.

if (PNm == PAdd) ….

Applied to two character arrays, the == operator compares their addresses, not their characters, so the test above is false even when both arrays hold the same text. The solution is to use the function strcmp, which is not as straightforward as the == operator. One typical problem with strcmp is that it returns zero when the strings match, so we need to negate the result to check whether the strings are the same. For example, the following is a common statement in C.

if (!strcmp(PNm, PAdd)) ….

 

¹ STL is an acronym for Standard Template Library, a library of templates whose classes and functions can be used with collections of different types of C++ objects.

² A namespace is a named scope in which functions, classes and variables are defined in C++. The standard namespace, std, is the most commonly used one.

The ! operator is needed here. The problem with this construct is that it is not a natural way to state that the strings match. It is hard for somebody who is new to C to understand what the statement does; a novice is quite likely to assume that the check is for the strings being different rather than the same. A better form is clearly desirable.
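The difference between the two styles of comparison can be sketched as follows. The function names SameCString and SameString are hypothetical helpers introduced only for this illustration.

```cpp
#include <cstring>
#include <string>
using std::string;

// C type strings: the contents must be compared with strcmp,
// and a result of zero means "equal".
bool SameCString(const char* a, const char* b) {
    return std::strcmp(a, b) == 0;
}

// C++ strings: == compares the characters themselves.
bool SameString(const string& a, const string& b) {
    return a == b;
}
```

Writing strcmp(a, b) == 0 instead of !strcmp(a, b) is one way to make the intent of the C version easier to read, although the string version remains the more natural of the two.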

 

To repeat, there are many functions, such as strcmp and strcpy, which manipulate C type strings. They are known as C type string functions and are different from the C++ string functions, which are member functions of the string class and are called on C++ string objects. The C++ string functions allow manipulation of string objects in a much simpler and more user-friendly fashion.

 

Non-availability of assignment

 

When we have two strings and want to assign the value of one to the other, we cannot use normal assignment as we do with other variables. For example, look at the following (invalid) statement.

Player1.PNm = Player2.PNm;

This statement is not valid if both PNm members are C type strings, because an array cannot be assigned as a whole. We need the following construct for the same operation.

strcpy(Player1.PNm, Player2.PNm);

The strcpy function invites incorrect assignment. The destination is the first argument and the source is the second, so exchanging the arguments by mistake copies string X into string Y instead of string Y into string X. The call is also less readable than an assignment.
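A minimal sketch of the two forms side by side is given below. The struct names CPlayer and CppPlayer and the function CopyName are assumptions made for this illustration; they do not appear in the programs of this module.

```cpp
#include <cstring>
#include <string>
using std::string;

struct CPlayer   { char   PNm[30]; };   // hypothetical record with a C type string
struct CppPlayer { string PNm;     };   // hypothetical record with a string object

// C type string: strcpy is required, and the destination comes first.
void CopyName(CPlayer& to, const CPlayer& from) {
    std::strcpy(to.PNm, from.PNm);      // both arrays have the same size, so the copy fits
}

// C++ string: plain assignment, and the direction is obvious.
void CopyName(CppPlayer& to, const CppPlayer& from) {
    to.PNm = from.PNm;
}
```

With string objects the assignment also takes care of the storage needed for the characters, whereas with strcpy the programmer must make sure the destination array is large enough.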

 

Initialization is not straightforward

 

Another problem with C type strings is that it is not easy to initialize one string from another. Initialization defines a new string with the value of an already defined string, and it is possible for other types of variables. The following statements show initialization with string objects, where NewPNm is the name of the new string; nothing equivalent is available for C type strings.

string NewPNm(Player2.PNm);

string NewPNm = Player2.PNm;

One may incorrectly assume that these statements are equivalent to the following statement using a C type string, but they are not.

char *NewPNm = Player2.PNm;

This statement initializes the pointer NewPNm to point at Player2.PNm but does not define a new string. In other words, we still have a single string, now reachable through two names. What we really need is a new string with the value of the old one. That is not possible for C type strings by initialization as described above; we need to define both strings and then copy one into the other separately. One can understand the difference in one more way.

 

A C type string cannot be initialized from another C type string, whereas an object can be initialized from another object. Although both objects contain the same value after initialization, they are independent: changing one object does not change the other.
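The difference between sharing and copying can be sketched with two small functions. PointerSharesStorage and CopyIsIndependent are hypothetical names chosen for this illustration.

```cpp
#include <string>
using std::string;

// Pointer initialization: both names refer to the same characters.
bool PointerSharesStorage() {
    char original[] = "Virat";
    char* alias = original;     // no new string is created
    alias[0] = 'K';             // the change is visible through 'original' as well
    return original[0] == 'K';
}

// Object initialization: the new string is an independent copy.
bool CopyIsIndependent() {
    string original = "Virat";
    string copy(original);      // a new string with the same value
    copy[0] = 'K';              // only the copy changes
    return original == "Virat" && copy == "Kirat";
}
```

Both functions return true: in the first, a change made through the pointer shows up in the original array, and in the second, the original string keeps its value after the copy is modified.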

 

String functions are not member functions

 

The strcpy and strcmp functions are not truly string functions. They are library functions which take strings as arguments; more precisely, they are char array functions which assume a null character at the end. When that null character is not present in a char array, the functions cannot work properly. For example, the function strlen counts characters until it reaches the null character. Suppose the array length is 100 but only the first 70 elements hold the characters of the string. The 71st element is the null character, and thus strlen correctly reports the length of the string as 70. The strcpy function is written in the same way: only the characters up to and including the null character are copied into the destination. If strlen or strcpy is called with a character array that has no null character, it reads beyond the end of the array and cannot work as expected.
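The case of the 100-element array can be sketched as follows. The function name LengthOfSeventy is hypothetical, and the filler character 'x' merely stands in for real string characters.

```cpp
#include <cstddef>
#include <cstring>

// Fills a 100-element array with 70 characters followed by the
// null character, then lets strlen count up to the null character.
std::size_t LengthOfSeventy() {
    char buffer[100];
    std::memset(buffer, 'x', 70);   // elements 0..69 hold the string characters
    buffer[70] = '\0';              // the 71st element is the null character
    return std::strlen(buffer);     // counts characters before the null
}
```

The function returns 70 even though the array has 100 elements. If the line which stores the null character were removed, strlen would keep reading past the 70 characters and the result would be unpredictable.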

 

Readability is compromised

 

When a string is represented as a character array, it works, but not in the same way as the built-in types. Using such strings demands mastering a different syntax, and one must learn to deal with errors related to the placement of the null character and to the array needing one more element than the number of characters in the string (to accommodate the null character). A solution which provides string operations like those of other data types is more readable and more user-friendly.

 

C++ string objects are designed to address all these limitations and thus provide better and more readable ways of working with strings. Subsequent sections of this module deal with how C++ string objects can be defined and used. We will also see how they overcome the limitations of C type strings.

 

String Objects

 

As we have mentioned already, the STL provides a class called string. The objects of this class are known as string objects, and they are the constructs used to represent strings in place of C type strings (character arrays).

 

The string class is defined in a manner that string objects work like natural strings. It contains many functions that a normal user expects from a string object. If you have looked at how we have defined and used the string objects in program 15.1, you can vouch for that yourself.

 

If you look back at the code from previous modules, you can see that string objects are indeed user-friendly. We have defined and used string objects without explaining anything about them, treating them as if they were a normal data type. One does not really need a special introduction to strings to use them in a C++ program. Those examples showcase how user-friendly and natural string objects are.

 

Having said that, it is important to understand that the string class does not only provide the user-friendly operations for creating and using strings that we have relied on so far. It also provides quite a few member functions which make it much more powerful than it appears at first glance. Let us delve deeper to see that.

 

Defining strings

 

One can define strings in the same way as other user-defined and built-in variables in C++. There are three different ways to define strings in C++; here they are.

 

Normal way

Examples that we have seen earlier are the normal way to define strings. Here is another example

 

string PNm;

string PAdd;

 

One just defines both variables as strings, that is all. We have seen quite a few examples of this type of definition so far.

 

Initialization

 

Initialization is another way to define a string. In this case, the string is defined and given a value from another C++ string object, a C type string or a string constant. Here are examples.

 

string PNm(SomeOtherPNm);       // using another string object
string PNm = SomeOtherPNm;      // same as above
const char *Address = "Some Address";
string PAdd(Address);           // using a C type string
string PNm("Virat Kohli");      // using a string constant
string PNm = "Virat Kohli";     // same as above

Using constructor

 

All the examples in the above section invoke a constructor of the string class. The designers have defined several string constructors, and the one matching the supplied arguments is invoked. A constructor which can be called with a single argument can also act as a conversion function; this is how a C type string or a string constant is converted into a string object.
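A few of the other constructors, and the conversion just mentioned, can be sketched as below. The function names NameLength, Dashes, FirstWord and Converted are hypothetical; the constructors themselves are standard members of the string class.

```cpp
#include <string>
using std::string;

// A function that expects a string object.
string::size_type NameLength(const string& name) {
    return name.length();
}

// Constructor taking a count and a character: five copies of '-'.
string Dashes() {
    return string(5, '-');
}

// Constructor taking another string, a start position and a length.
string FirstWord() {
    string full("Virat Kohli");
    return string(full, 0, 5);
}

// The string constant is converted to a string object by the
// one-argument constructor before NameLength is called.
string::size_type Converted() {
    return NameLength("Steve Smith");
}
```

Dashes() yields "-----", FirstWord() yields "Virat", and Converted() yields 11, the number of characters in "Steve Smith".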

 

Substring related member functions

 

Now that we have seen how strings can be defined, let us look at ways to manipulate strings, obtain substrings and use them in a program. There are two different ways to manipulate STL objects, including strings. The first is to use member functions, an approach which is common across many other libraries. The second is to use generic algorithms. We will look at some of the member functions in this module and take a detailed look at the generic algorithms at a much later stage of this course, when we study the STL. The examples here show how a programmer can put the member functions to use. Before we discuss these functions, let us be clear about one thing: most of the functions shown in the subsequent sections have other overloaded versions for different types and numbers of arguments. We show only the most commonly used versions. One should consult the documentation of one's compiler to learn about the other versions.

 

Finding a character or a substring

 

The generic algorithm find() (a non-member function) is useful with many STL objects, including strings. However, string also has a member function called find(). The member function find() is used to find the location of a character or a substring in a given string. It is called on the string to be searched and takes the substring or character as its first argument. It returns the position at which that character or substring first occurs in the string. Positions are numbered from zero, as in arrays. That means when a human reader counts a character as the 10th in the string, find() reports it at position 9.

 

When the substring or character provided as the argument is really part of the invoking string object, the position where it occurs is returned, as mentioned above. However, if the character or substring is not part of the string, the response may look strange at first: find() returns the special value string::npos, which is the largest value the position type can hold (a very big number). Let us check the following statements.

 

string String1 = "Vishwanathan is a Chess Player";
unsigned long position = String1.find("Chess");
cout << "Position is " << position << "\n";
unsigned long Otherposition = String1.find('P');
cout << "Position is " << Otherposition << "\n";

 

output: –

Position is 18
Position is 24

 

We have the string "Vishwanathan is a Chess Player" at our disposal, and we tried to find out where the word Chess and the character P occur in it. As expected, find() returns the position of the substring in the first case (18) and of the character in the second case (24).

 

When find () is supplied with a string or a character not present in the string

 

Look at the following code as well as the output. Both, the substring and the character are not really present in the string.

string String1 = "Chirag throw a ball";
string String2("Cat jumps over");
cout << String1.find("Nowhere") << endl; // substring not present
cout << String1.find('B') << endl;       // char not present

 

output: –

 

18446744073709551615

18446744073709551615

 

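The large value 18446744073709551615 is string::npos, the special value find() returns when nothing is found. It is the largest value the position type can hold on the machine where this code snippet was run, not a valid position in the string.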
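In practice the result of find() is compared with string::npos rather than printed. A minimal sketch follows; the helper names Contains and PositionOrMinusOne are assumptions made for this illustration.

```cpp
#include <string>
using std::string;

// Returns true when 'part' occurs somewhere in 'text'.
bool Contains(const string& text, const string& part) {
    return text.find(part) != string::npos;
}

// Returns the position as an int, or -1 when 'part' is absent.
int PositionOrMinusOne(const string& text, const string& part) {
    string::size_type pos = text.find(part);
    if (pos == string::npos)
        return -1;                      // not found
    return static_cast<int>(pos);       // found: positions start at zero
}
```

For the string "Vishwanathan is a Chess Player", PositionOrMinusOne returns 18 for "Chess" and -1 for "Nowhere". Storing the result in string::size_type avoids assumptions about how large positions can be on a particular machine.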

 

Let us take a few more examples to illustrate some other important characteristics of the string object and the functions that are used with it.

 

Using function at()

 

The function at() takes an unsigned integer value indicating a position and returns the character at that position. This is the opposite of find(): find() takes a character and returns its position, whereas at() takes a position and returns the character there. In the following code snippet, the variable i indicates the position in the string. The for loop starts with i at 0, continues up to the length of the string, and displays the character at the ith position. As we loop through every position of the string, the loop outputs the complete string.

 

for (unsigned int i = 0; i < String1.length(); i++)
    cout << String1.at(i);
cout << endl;

Output: –

 

Chirag throw a ball
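One property of at() worth knowing is that it checks the position it is given: when the position is not smaller than the length of the string, it throws an exception of type std::out_of_range. The sketch below shows this; the function name CharAtOrQuestion and the choice of '?' as a fallback are assumptions made only for illustration.

```cpp
#include <stdexcept>
#include <string>
using std::string;

// Returns the character at 'pos', or '?' when 'pos' lies outside the string.
char CharAtOrQuestion(const string& text, string::size_type pos) {
    try {
        return text.at(pos);                // at() validates the position
    } catch (const std::out_of_range&) {
        return '?';                         // reached when pos >= text.length()
    }
}
```

For the string "Chirag", position 0 yields 'C' and position 6, which is one past the last character, yields '?'.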

 

Using function insert ()

The function insert() takes two arguments, a position and a substring. It inserts the substring before the character currently at that position. Here is an example.

String1.insert(14, " red");
cout << String1 << endl;

 

output: –

Chirag throw a red ball

 

Using function append ()

The function append() takes a substring as an argument and appends it to the end of the string. Here is an example.

String2.append(" the table!");
cout << String2 << endl;

output: –

 

Cat jumps over the table!

 

 

Using function replace ()

 

The replace() function takes three arguments: the position from which to start, the number of characters to replace from that position, and the substring which takes their place. The following example illustrates how one can replace the table with a chair in String2.

String2.replace(15, 10, "the chair"); // "the table!" becomes "the chair"
cout << String2 << endl;

 

output: –

 

Cat jumps over the chair

 

Using function erase()

 

The erase() function works like the replace() function, except that it replaces the content with nothing. It has two arguments, the position to start from and the number of characters to remove from that position. The function removes those characters from the string object. Here is an example which asks for 10 characters from position 15 to be removed; only 9 characters remain from that position, so erase() removes everything up to the end of the string.

String2.erase(15, 10); // now removing "the chair"
cout << String2 << endl;

 

output: –

 

Cat jumps over
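The examples above use hard-coded positions such as 14 and 15. A sketch that combines find() with replace() and erase(), so that the positions are computed instead of counted by hand, is given below. ReplaceFirst and RemoveFirst are hypothetical helper names, not functions of the string class.

```cpp
#include <string>
using std::string;

// Replaces the first occurrence of 'from' in 'text' with 'to'.
// The text is returned unchanged when 'from' is not present.
string ReplaceFirst(string text, const string& from, const string& to) {
    string::size_type pos = text.find(from);
    if (pos != string::npos)
        text.replace(pos, from.length(), to);
    return text;
}

// Removes the first occurrence of 'part' from 'text', if there is one.
string RemoveFirst(string text, const string& part) {
    string::size_type pos = text.find(part);
    if (pos != string::npos)
        text.erase(pos, part.length());
    return text;
}
```

For example, ReplaceFirst("Cat jumps over the table!", "the table!", "the chair") gives "Cat jumps over the chair", and RemoveFirst applied to that result with " the chair" gives "Cat jumps over". Both helpers take the text by value, so the caller's string is left untouched.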

Dealing with multiple strings

 

There are many cases where multiple strings are manipulated in a program, for example when one string is compared with another or when one string is assigned to another. The C++ string class contains many operators and member functions for operations involving multiple strings. Let us see how they can be used in a program.

 

Using operators

 

+ operator for concatenation

The first operator function provided for string object is operator +() which allows a programmer to use + to concatenate two strings. Here is an example.

string String1 = "Chirag throw a ball";
string String2("Cat jumps over");
string String3 = String1 + String2;
cout << String3;

 

output: –

 

Chirag throw a ballCat jumps over

 

= operator for assignment

 

The = operator can be used to assign one string object to another. Here is an example.

string String1 = "Chirag throw a ball";
string String3;
String3 = String1;
cout << String3;

output: –

 

Chirag throw a ball

 

== operator for comparison

 

The == operator checks whether two string objects hold the same characters. Here is an example.

string String1 = "Chirag throw a ball";
string String3;
String3 = String1;
if (String3 == String1) cout << "both are equal";

 

output: –

 

both are equal

> operator for comparison

 

We can use > to check whether the string on the left-hand side of the operator is lexicographically greater than the string on the right-hand side. Here is an example.

cout << "\nas per lexicographic order ";
if (String1 > String2)
    cout << "String1 is greater than String2\n";
else
    cout << "String1 is not greater than String2\n";

output: –

as per lexicographic order String1 is greater than String2

 

!= operator for checking inequality

Like the == operator, the != operator compares the two strings on either side of it; it is true when they are not equal. Here is an example.

if (String1 != String2)
    cout << "Both strings are not equal\n";

 

output: –

 

Both strings are not equal

 

[] operator for array-like behavior

 

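The [] operator gives access to the character at a given position, exactly as with an array. Unlike at(), it does not check that the position is valid. Here is an example.

for (unsigned int i = 0; i < String1.length(); i++)
    cout << String1[i];
cout << endl;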

 

output: –

 

Chirag throw a ball
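Two of the operators from this section can be sketched in slightly more practical form: + can be chained to place a separator between two strings, and [] can be used to change a character as well as to read it. JoinWithSpace and Capitalize are hypothetical names used only for this illustration.

```cpp
#include <cctype>
#include <string>
using std::string;

// Joins two strings with a single space between them using +.
string JoinWithSpace(const string& a, const string& b) {
    return a + " " + b;
}

// Changes the first character to upper case using [].
// An empty string is returned unchanged.
string Capitalize(string text) {
    if (!text.empty())
        text[0] = static_cast<char>(
            std::toupper(static_cast<unsigned char>(text[0])));
    return text;
}
```

JoinWithSpace("Chirag throw a ball", "Cat jumps over") gives "Chirag throw a ball Cat jumps over", and Capitalize("ball") gives "Ball". The check for an empty string matters because [] does not validate the position.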

Using functions

 

Now we will see a few functions which can help us deal with multiple strings.

 

Function compare ()

 

C++ does not use the strcmp function for string objects; instead, the member function compare() provides similar functionality. Exactly like strcmp, it distinguishes three cases: both strings being equal, the first string being lexicographically larger (lexicographic order is the alphabetical ordering used in a dictionary), or the second string being larger. compare() is called on the first string with the second string as its argument. When it returns zero, both strings are equal; when it returns a value less than 0, the first string is lexicographically smaller; and when it returns a value greater than 0, the first string is lexicographically larger than the second. The following code covers these three cases.

 

int val = String1.compare(String2);

if (val == 0)
    cout << "Both the strings are equal\n";
else if (val > 0)
    cout << "First string is lexicographically greater\n";
else
    cout << "First string is lexicographically lesser\n";

output: –

First string is lexicographically greater
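The three cases can also be wrapped in a small function, which makes it easy to check each of them separately. The function name Relation and the wording of its results are assumptions made for this sketch.

```cpp
#include <string>
using std::string;

// Describes how 'first' relates to 'second' in lexicographic order.
string Relation(const string& first, const string& second) {
    int val = first.compare(second);
    if (val == 0)
        return "equal";
    if (val > 0)
        return "first is greater";      // positive: first comes after second
    return "first is lesser";           // negative: first comes before second
}
```

Relation("Chirag throw a ball", "Cat jumps over") gives "first is greater", because the two strings agree on 'C' and then 'h' comes after 'a'.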

 

Function swap ()

 

The function swap is also a useful function. It takes one string argument and swaps the calling string object value with the string object value of the argument.

 

cout << "Strings before swapping \n";
cout << "First string is " << String1 << endl;
cout << "Second string is " << String2 << endl;
String1.swap(String2); // String2.swap(String1) will have same effect
cout << "Strings after swapping \n";
cout << "First string is " << String1 << endl;
cout << "Second string is " << String2 << endl;

 

output: –

 

Strings before swapping

 

First string is Vishwanathan is a Chess Player

Second string is England Won the worldcup

Strings after swapping

First string is England Won the worldcup

Second string is Vishwanathan is a Chess Player

Yielding Characteristics of the string

 

In this section, we will be dealing with a few functions which help us deal with the characteristics of the string object. We will look at four functions which are used the most.

Function empty ()

 

If we want to see if the string is empty, this function is handy. This function does not take any argument and returns a bool value which is true if the string is empty. Otherwise, it returns false.

 

Function size() and max_size ()

 

The function size() returns the number of characters in the string. Another function, length(), returns the same value. The function max_size() returns the maximum size a string can have for a given compiler on a given machine. Here is an example.

 

cout << boolalpha; // display bool values as true or false
cout << "\nCurrently the size of the string is " << tString.size();
cout << "\nCurrently the string is empty: " << tString.empty();
cout << "\nThe maximum size of string is " << tString.max_size();

output: –

Currently the size of the string is 30
Currently the string is empty: false
The maximum size of string is 18446744073709551599

 

Function resize ()

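The function resize() takes the new size as its argument and changes the size of the string to that value. If the new size is smaller, the string is truncated; if it is larger, the string is extended with null characters, which are not visible when the string is displayed.

// tString is passed by value, so the caller's string is not changed
void Resize(string tString, int NewSize)
{
    cout << "The original string is: " << tString << endl;
    cout << "Original size of the string is: " << tString.size() << endl;
    tString.resize(NewSize);
    cout << "Now the string size is: " << tString.size() << endl;
    cout << "The resized string is: " << tString << endl;
}

Resize(String1, 100);
Resize(String2, 10);

output: –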

The original string is: England Won the worldcup

Original size of the string is: 24

Now the string size is: 100

The resized string is: England Won the worldcup

The original string is: Vishwanathan is a Chess Player

Original size of the string is: 30

Now the string size is: 10

The resized string is: Vishwanath
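resize() also has an overloaded version which takes a second argument, the character to be used when the string grows. A minimal sketch follows; the function name FitToWidth and the choice of '.' as the padding character are assumptions made for this illustration.

```cpp
#include <string>
using std::string;

// Pads 'text' on the right with '.' up to 'width',
// or truncates it when it is longer than 'width'.
string FitToWidth(string text, string::size_type width) {
    text.resize(width, '.');    // second argument: character used for padding
    return text;
}
```

FitToWidth("England", 10) gives "England...", while FitToWidth("Vishwanathan is a Chess Player", 10) gives "Vishwanath", the same truncation as in the output above.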

 

Summary

 

At the beginning of this module, we saw how arrays can be defined and used with objects as their elements. After that, we learned to define and use string objects. Through numerous examples, we saw how string objects follow the idea that user-defined types should work as much like built-in types as possible. We have seen the operators as well as a few member functions of the string class.

 


References

  1. Programming with ANSI C++, Bhushan Trivedi, Oxford University Press
  2. www.stroustrup.com, homepage of Bjarne Stroustrup, the creator of C++