29 Binary Files in C++

Jyotika Doshi

Introduction

 

As explained in the ‘File I/O’ module, files are used for persistent data storage. File I/O transfers data between secondary memory (such as a hard disk) and main memory.

 

In C++, there are two types of files: text files and binary files. Text files are discussed at length, with examples, in the ‘File I/O’ module. Text files are comfortable to work with because they contain only ASCII or Unicode characters, which are human readable.

 

In this module, we will look at working with binary files in detail.

 

Binary File

 

A binary file is a file in which data is stored in raw format, the way it is stored in memory.

For example, numbers are stored in binary in memory. They are not converted to text (ASCII characters) when written to a binary file.

Data in binary format is not human readable and cannot be meaningfully read or modified using text editors.

 

Advantages of using binary file over text file

 

Text file I/O is formatted, whereas binary file I/O is unformatted and works with raw data. Thus, conversion between binary format and ASCII text format does not take place in the case of binary files.

See the following example to understand this.

 

Consider the integer value 65536. When read from the keyboard, it is entered using the keys ‘6’, ‘5’, ‘5’, ‘3’, ‘6’ as ASCII characters. But in memory, the data is stored in binary format as 0x00010000.

Integer 65536 stored as text (using 5 ASCII characters):

Character ‘6’ ‘5’ ‘5’ ‘3’ ‘6’
ASCII in Binary 00110110 00110101 00110101 00110011 00110110
ASCII in Hexa 36 35 35 33 36

Integer 65536 stored as binary in memory (assuming long int, using 4 bytes, shown most significant byte first):

In Binary 00000000 00000001 00000000 00000000
In Hexa 00 01 00 00

When writing the output on screen, the data stored in memory must be converted to text.

Thus, 0x00010000 is converted to the 5 characters ‘6’, ‘5’, ‘5’, ‘3’, ‘6’.

Thus, working with text files requires conversion from binary to text or vice versa.

When data is stored in a binary file, no such conversion is required. Integer 65536 is stored as 0x00010000 using 4 bytes without any conversion.
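The difference between the two representations can be checked with a small sketch. The function names textSize and binaryBytesHex are illustrative, not part of any library, and a 32-bit integer type is assumed so that the binary form always has 4 bytes. The bytes are printed most significant byte first; the actual order in memory depends on the machine.

```cpp
#include <cstddef>
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>

// Number of bytes needed to store the value as decimal text.
std::size_t textSize(std::int32_t value) {
    return std::to_string(value).size();
}

// The four bytes of a 32-bit integer, most significant byte first,
// formatted in hexadecimal (independent of the machine's byte order).
std::string binaryBytesHex(std::int32_t value) {
    std::ostringstream out;
    const std::uint32_t bits = static_cast<std::uint32_t>(value);
    for (int shift = 24; shift >= 0; shift -= 8) {
        out << std::hex << std::uppercase << std::setw(2) << std::setfill('0')
            << ((bits >> shift) & 0xFFu);
        if (shift > 0) out << ' ';
    }
    return out.str();
}
```

Calling textSize(65536) gives 5, while binaryBytesHex(65536) gives the fixed 4-byte pattern 00 01 00 00.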

 

  • Binary read/write is faster, as data interpretation or conversion is not required
  • Binary file uses fixed size of memory space for specific type of objects

In a text file, the number 65536 requires 5 ASCII characters (5 bytes), the number 123456789 requires 9 bytes and the number 2 requires 1 byte of storage space. In a binary file, each of these numbers requires a fixed 4 bytes, assuming the long int data type.

 

For larger numbers, binary file uses less space as compared to text files.

  • Binary file helps in protecting data up to some extent

Data stored in binary format is not easy for humans to read. Some may consider this a drawback, but it helps in protecting data to some extent. Text files can be easily edited using simple text editors. Binary files cannot be meaningfully read using text editors.

If information such as marks of students or the balance in a bank account is stored in text files, it can be easily tampered with by someone having access rights. If such information is stored in binary files, the data cannot be edited using simple text editors.

As explained above, binary files are more efficient in terms of speed and storage space. Moreover, they provide data protection up to a certain level.

 

When to use binary files?

 

Even though most programs use ASCII text files, there are certain occasions where binary files are very useful.

 

Binary files should be preferred in applications like

 

  • Number crunching scientific applications requiring large amount of data to be processed
  • Data sensitive applications involving financial data, personal data or scores in exam etc.

 

Working with binary files

 

Working with files in C++ requires the use of file-oriented streams based on the classes ifstream, ofstream and fstream.

  • ifstream (input file stream) for reading only
  • ofstream (output file stream) for writing only
  • fstream (file stream) for reading as well as writing

All these classes are defined in the header <fstream>. Thus, to perform file operations in C++, the header <fstream> must be included in the program; <iostream> is also needed when the program uses cin or cout.

 

Like text files, working with binary files also requires following operations:

  • open file
  • process: read/write using file, check errors
  • close file

 

Before any operation can take place on a file, it must be opened. When work with the file is finished, it should be closed to avoid loss of data.

Opening a binary file requires the open mode ios::binary to be specified. Binary input/output operations can be performed using the read() and write() methods.

Error checks are performed in the same way as with text files.

 

Opening Binary file

 

Opening a binary file is similar to opening a text file, but requires specifying ios::binary as an additional open mode. A file stream object can be opened in one of two ways.

First, supply a file name along with an i/o mode parameter to the constructor when declaring an object:

Syntax: FileStream FileObject (const char *filename[, ios::openmode mode]);

Ex. ifstream myFile ("data.bin", ios::in | ios::binary);

Second, call the open() method after declaring a file stream object.

Syntax of open(): void open(const char *filename[, ios::openmode mode]);

Ex. ofstream myFile; … myFile.open ("data2.bin", ios::out | ios::binary);

Ex. fstream myFile ("data.bin", ios::in | ios::out | ios::binary);

Either approach works equally well with an ifstream, ofstream or fstream object. In the syntax given here, [] indicates that the mode argument is optional. (Some pre-standard compilers also accepted a third, optional prot argument; it is not part of standard C++.) The default mode is ios::in for ifstream, ios::out for ofstream and ios::in | ios::out for fstream.

For binary I/O, opening with the mode flag ios::binary suppresses the conversions (such as newline translation) that are applied in text mode.

Note: When working with text files, one may omit the second parameter (the i/o mode parameter) to use the default mode. To work with binary files, however, ios::binary must be specified, and it is good practice to state ios::in and/or ios::out along with it. For an fstream object, ios::in | ios::out must be given explicitly once a mode argument is supplied.
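The two ways of opening a binary file can be sketched as follows. The function names and the idea of returning the result of is_open() are illustrative choices, not part of the module's programs.

```cpp
#include <fstream>
#include <string>

// Creates (or truncates) a binary file by passing the name and mode
// to the constructor. Returns true if the file was opened.
bool createWithConstructor(const std::string& path) {
    std::ofstream out(path, std::ios::out | std::ios::binary);
    return out.is_open();
}   // the destructor closes the file

// Opens an existing binary file by calling open() on a declared object.
bool openWithOpen(const std::string& path) {
    std::ifstream in;
    in.open(path, std::ios::in | std::ios::binary);
    return in.is_open();
}
```

Opening for input fails when the file does not exist, so openWithOpen() returns false in that case, whereas opening for output creates the file.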

Manipulating binary file

 

As binary format does not require type conversion and formatting, the conventional text-oriented << and >> operators are not normally used with binary files.

File streams include the member functions read() and write(), specifically designed to read and write binary data.

One may use the methods get() and put() to read/write a single character in binary. But binary i/o of numbers and of other complex data types like struct or class requires the methods read() and write().

Function write() is a member function of ostream (inherited by ofstream), and read() is a member function of istream (inherited by ifstream). Objects of class fstream have both these member functions available, as fstream is derived from iostream, which in turn is derived from both istream and ostream.

 

Syntax:

istream& istream::read(char* buf, streamsize size)

ostream& ostream::write(const char* buf, streamsize size)

As seen in the syntax, both read() and write() take the same kind of parameters.

Parameter buf is of type char * (pointer to char; const char * for write()) and specifies the starting address of an array of bytes where the data read is to be stored, or from where the data is to be written. Note that the first parameter should be a C-type array of characters and not a C++ string object.

The size parameter specifies the number of characters (bytes) to be read from or written to the file. Its type, streamsize, is a signed integer type.

 

1.1.  Writing data to binary file

 

Method write() can be used to write data to a binary file associated with an ofstream or fstream object.

The write() method writes size bytes from the given memory location buf to the binary file on the given stream and moves the file pointer size bytes ahead.

The following example writes two numbers to a file. The first is of type int and the second is of type double. Note that the address must be typecast to (char *) as per the syntax. To know the size of an item to be written, one may use the sizeof operator.

Example code segment:

int i = 1234;
double d = 12.34;
ofstream fout("data.bin", ios::out | ios::binary);
fout.write((char *)&i, sizeof(int));
fout.write((char *)&d, sizeof(double));

  • Writing in a file starts at the position of the “put” pointer.

If the put pointer is currently at the end of the file, the file is extended. If the put pointer points into the middle of the file, characters in the file are overwritten with the new data.

The position of the “put” pointer can be obtained using the tellp() method and can be changed using the seekp() method.

  • The bytes that are written are not interpreted or formatted.
  • The write() method does not add a carriage return after writing the data.
  • The write() method does not assume a null terminator at the end of the bytes in buf being written.

It writes size bytes from the starting address specified in buf, irrespective of any null character. A null terminator may be absent, or may occur before size bytes; it has no effect.

  • If an error occurs while writing (for example, out of disk space), the stream is placed in an error state. Such errors are not common and are often not checked, but it is the programmer’s responsibility to check for them. Once an error state bit is set, it remains in effect for the following file operations. So, the programmer should remember to clear the error state of the file stream before the next operation. The error state can be reset using the clear() method.
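The behaviour of the “put” pointer described above can be tried with the following sketch. It assumes an fstream opened for both input and output so that the result can be read back; the function name overwriteMiddle and the sample values are illustrative.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Writes 10, 20, 30, then moves the put pointer back to the second
// value and overwrites it with 99. Returns the final file contents.
std::vector<int> overwriteMiddle(const std::string& path) {
    std::fstream file(path, std::ios::in | std::ios::out |
                            std::ios::binary | std::ios::trunc);
    std::vector<int> result;
    if (!file) return result;

    const int values[3] = {10, 20, 30};
    file.write(reinterpret_cast<const char*>(values), sizeof values);
    // tellp() now reports the end of the data just written.
    const std::streamoff endPos = file.tellp();

    // Move the put pointer to the start of the second int and overwrite it.
    const int replacement = 99;
    file.seekp(static_cast<std::streamoff>(sizeof(int)), std::ios::beg);
    file.write(reinterpret_cast<const char*>(&replacement), sizeof replacement);

    // The data was overwritten in place, so the file length is unchanged.
    file.seekp(0, std::ios::end);
    if (static_cast<std::streamoff>(file.tellp()) != endPos) return result;

    // Read everything back from the beginning.
    file.seekg(0, std::ios::beg);
    int v;
    while (file.read(reinterpret_cast<char*>(&v), sizeof v)) result.push_back(v);
    return result;
}
```

The returned vector holds 10, 99, 30: the second value was overwritten and the file was not extended.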

 

Note: The data written to a binary file using write( ) can be read accurately using read( ) only.

 

1.2.  Reading data from binary file

 

Use the read() method to read from a binary file associated with an fstream or ifstream object.

Function read() extracts a given number of bytes from the specified stream and stores them in the memory pointed to by the first parameter, which is a C-type array of characters.

The following example reads two numbers (int and double) from a binary file. Note the typecast (char *) used to convert the address of an int or double to a pointer to char.

Example code segment:

int i;
double d;
ifstream fin("data.bin", ios::in | ios::binary);
fin.read((char *)&i, sizeof(int));
fin.read((char *)&d, sizeof(double));
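Putting the write and read segments together, a round trip might look like the sketch below. The function names writePair and readPair are illustrative; the key assumption is that the values are read back in the same order and with the same types as they were written, on the same kind of machine.

```cpp
#include <fstream>
#include <string>

// Writes an int followed by a double in raw binary form.
bool writePair(const std::string& path, int i, double d) {
    std::ofstream fout(path, std::ios::out | std::ios::binary);
    if (!fout) return false;
    fout.write(reinterpret_cast<const char*>(&i), sizeof i);
    fout.write(reinterpret_cast<const char*>(&d), sizeof d);
    return fout.good();
}

// Reads the two values back in the same order and with the same types.
bool readPair(const std::string& path, int& i, double& d) {
    std::ifstream fin(path, std::ios::in | std::ios::binary);
    if (!fin) return false;
    fin.read(reinterpret_cast<char*>(&i), sizeof i);
    fin.read(reinterpret_cast<char*>(&d), sizeof d);
    return !fin.fail();
}
```

Because no conversion to text takes place, the double comes back bit for bit identical to the value written.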

When reading data from a file, there are a couple of things to watch out for.

  • It is the responsibility of the programmer to ensure that the buffer exists and is large enough to hold the number of bytes being read.

The following code segment has undefined behavior and may well crash. Reading 7 bytes into the memory location of x overwrites the locations that follow its 2 or 4 bytes (depending on the size of int).

int x;
ifstream infile;
infile.open("data.bin", ios::binary | ios::in);
infile.read((char *) &x, 7); // reads 7 bytes into a cell that is either 2 or 4 bytes long

  • The bytes that are read are not interpreted, simply stored in memory as read from file.
  • The method does not assume anything about line endings.

The method reads size bytes unless the end of file is reached first. If an end of line happens to occur before size bytes, it has no effect.

  • The read method does not place a null terminator at the end of the bytes that are read in.
  • If an error occurs while reading (for example, reading past the end of a file), the stream is placed in an error state. In such a case, one can use the gcount() method to find out the number of characters actually read.

Once a stream goes into an error state, all subsequent read operations will fail. Do not forget to use the clear() method to reset the stream to a usable state.
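The last two points can be observed with a short sketch. It assumes a 3-byte file created just for the demonstration; the function name shortRead and its recovered parameter are illustrative.

```cpp
#include <fstream>
#include <string>

// Creates a 3-byte file, then asks read() for 8 bytes.
// Returns the number of bytes actually delivered, as reported by gcount().
std::streamsize shortRead(const std::string& path, bool& recovered) {
    {
        std::ofstream out(path, std::ios::out | std::ios::binary);
        out.write("abc", 3);
    }   // the file is closed here

    std::ifstream in(path, std::ios::in | std::ios::binary);
    char buf[8] = {};
    in.read(buf, sizeof buf);   // only 3 bytes exist: eofbit and failbit are set
    const std::streamsize got = in.gcount();

    // Further reads fail until the error state is cleared.
    in.clear();
    in.seekg(0, std::ios::beg);
    in.read(buf, 3);
    recovered = !in.fail();
    return got;
}
```

Here gcount() reports 3 even though the read itself failed, and after clear() the stream can be repositioned and read again.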

 

Checking error in file operation

 

In C++, a failed file operation does not stop the program by default. If an error occurs and is not checked, the program continues to run unreliably. The programmer is responsible for checking i/o errors and handling them.

To test for errors, either check the error state bits or use methods like good(), fail() etc.

A special method is_open() can be used to check whether opening a file was successful. Method is_open() returns true if the open was successful.

For all other i/o operations, errors can be checked using one of the following:

  • Method good() returns true if the operation is successful, i.e. none of the bits/flags is set in the error state.
  • Method fail() returns true when the operation is unsuccessful, i.e. if failbit or badbit is set.
  • Method bad() returns true when the operation is unsuccessful, i.e. if badbit is set.
  • Method eof() returns true when end of file is reached, i.e. if eofbit is set.
  • Operator ! returns true when the operation is unsuccessful, like fail().
  • The stream object itself evaluates to false in a condition when the operation is unsuccessful.

Examples:

  • if (fobj.is_open()) { /* file opened, process */ } else { /* error */ }
  • if (fobj.good()) { /* successful operation, process */ } else { /* error */ }
  • if (fobj.fail()) { /* error */ } else { /* successful operation, process */ }
  • if (fobj.bad()) { /* error */ } else { /* successful operation, process */ }
  • if (fobj.eof()) { /* error, end of file */ } else { /* process */ }
  • if (!fobj) { /* error */ } else { /* successful operation, process */ }
  • if (fobj) { /* successful operation, process */ } else { /* error */ }

Remember that once an error bit of the error state is set, it remains in effect. Subsequent operations on the associated stream then fail without being attempted, so a later check still reports an error. So, do not forget to reset the error state associated with the stream after an error is encountered. One may use the clear() method for this.
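The state-checking methods listed above can be combined into a small helper. This is only a sketch: describeState and stateSequence are illustrative names, and an in-memory istringstream stands in for a file stream, since both share the same error-state interface.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Returns a short description of the stream's error state.
std::string describeState(const std::ios& s) {
    if (s.good()) return "good";
    std::string text;
    if (s.eof())  text += "eof ";
    if (s.bad())  text += "bad ";
    if (s.fail()) text += "fail ";   // fail() is true for failbit or badbit
    return text;
}

// Tries to read more bytes than the stream holds and records the state
// before the read, after the read, and after clear().
std::string stateSequence() {
    std::istringstream in("ab");
    char buf[4];
    std::string log = describeState(in) + "|";
    in.read(buf, 4);                 // only 2 bytes are available
    log += describeState(in) + "|";
    in.clear();                      // reset the error state
    log += describeState(in);
    return log;
}
```

The sequence produced is good, then eof and fail together, then good again once clear() has been called.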

 

Examples of working with binary files

 

1.3.  Program to copy contents of one file to another file using get() and put() methods

 

// copy one file to another using get(), put()
// character by character copy
#include <iostream>
#include <fstream>
#include <cstdlib>   // exit()
using namespace std;

int main()
{
    ifstream fin;
    fin.open("file1.pdf", ios::in | ios::binary);

    ofstream fout;
    fout.open("file2.pdf", ios::out | ios::binary);

    if (!(fin.is_open() && fout.is_open()))
    { cout << "Error opening file…"; exit(1); }

    char ch;
    // get(ch) fails at end of file, so no extra character is written
    while (fin.get(ch))
    { fout.put(ch); }

    fin.close();    fout.close();
    return 0;
}

Before executing this code, make sure that the input file is in the directory from which the program is run (the current working directory). Otherwise, use the full path.

 

 

1.4. Program to copy entire content of one file at a time to another file using read() and write()

 

If a file is small enough to fit in memory, a binary copy can be performed by reading and writing the entire file at once, using a single read and a single write operation.

In the program of section 1.3, make the modifications shown below and then execute it.

  • replace statement

fin.open("file1.pdf", ios::in | ios::binary); with

fin.open("file1.pdf", ios::in | ios::binary | ios::ate); // file pointer at end of file

  • replace the character-by-character copy code (the declaration of ch and the while loop) with the following code

streamoff size;
char * buf;

size = fin.tellg(); // file pointer is at end, so this is the size of file
buf = new (nothrow) char [size]; // allocate size bytes; needs #include <new>
if (buf) // memory allocation successful
{
    fin.seekg (0, ios::beg); // pointer at beginning of file
    fin.read (buf, size);
    fout.write (buf, size);
    delete [] buf;
}
else {cout << "error allocating memory";}

Here, the input file is opened with ios::ate mode to place the file pointer at the end. This makes it possible to obtain the size of the file using tellg(), and then a memory buffer of that size can be allocated. The nothrow form of new is used because plain new reports failure by throwing an exception rather than by returning a null pointer.

Note that the program code has become longer, but execution will be faster, because the whole file is transferred with one read and one write instead of one call per character.
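The same idea can be written as a self-contained function. This is a sketch under a few assumptions: the name copyWhole is illustrative, a std::vector<char> is used as the buffer so that the memory is released automatically, and the file is assumed to fit in memory.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Copies src to dst with a single read() and a single write().
bool copyWhole(const std::string& src, const std::string& dst) {
    std::ifstream fin(src, std::ios::in | std::ios::binary | std::ios::ate);
    std::ofstream fout(dst, std::ios::out | std::ios::binary);
    if (!fin || !fout) return false;

    // The file was opened at its end, so tellg() gives the file size.
    const std::streamoff size = fin.tellg();
    if (size < 0) return false;
    std::vector<char> buf(static_cast<std::size_t>(size));

    fin.seekg(0, std::ios::beg);
    if (size > 0) {
        fin.read(buf.data(), size);
        fout.write(buf.data(), fin.gcount());
    }
    return !fin.fail() && fout.good();
}
```

A call such as copyWhole("file1.pdf", "file2.pdf") returns true when both files could be opened and the whole content was transferred.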

 

1.5. Program to copy one file to another file with one block at a time using read() and write()

If a file is very large, it may not be possible to allocate enough memory for it. It is safer to copy block by block. The block size should be large enough to keep the number of read/write operations low, but small enough for memory allocation to succeed.

Modify the program of section 1.3 as follows:

replace the character-by-character copy code (the declaration of ch and the while loop) with the following code

//copy block by block
const streamsize size = 4096;
char buf [4096];
// the last block may be shorter than size; gcount() gives the bytes actually read
while (fin.read (buf, size) || fin.gcount() > 0)
{ fout.write (buf, fin.gcount()); }
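For reference, the block-by-block copy can also be packaged as a function. The name copyBlocks and the choice to return the number of bytes copied are illustrative assumptions.

```cpp
#include <fstream>
#include <string>

// Copies src to dst in 4096-byte blocks. Returns the number of bytes
// copied, or -1 if a file could not be opened or a write failed.
long long copyBlocks(const std::string& src, const std::string& dst) {
    std::ifstream fin(src, std::ios::in | std::ios::binary);
    std::ofstream fout(dst, std::ios::out | std::ios::binary);
    if (!fin || !fout) return -1;

    char buf[4096];
    long long total = 0;
    // read() fails on the last, partial block, but gcount() still
    // reports how many bytes were placed in buf.
    while (fin.read(buf, sizeof buf) || fin.gcount() > 0) {
        fout.write(buf, fin.gcount());
        if (!fout) return -1;
        total += fin.gcount();
    }
    return total;
}
```

Using the result of read() as the loop condition, rather than eof(), also ends the loop when a read fails for a reason other than end of file.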

 

1.6.  Program to write and read objects in binary file using write() and read() methods

 

For examples of objects in a binary file, consider the class Student shown below:

class Student
{
    char name[30];
    int marks[7]; // marks in 7 subjects
public:
    void GetStudentData()  // input student data
    {
        cout << "Enter student’s name: ";
        cin.getline(name,30);
        cout << "Enter marks in 7 subjects:";
        for (int i=0; i<7; i++) cin >> marks[i];
    }

    void ShowStudentData() // display student data
    {
        cout << "Name: " << name << endl;
        cout << "Marks: ";
        for (int i = 0; i<7; i++) cout << marks[i] << "  ";
        cout << endl;
    }

    char * getname() { return name; }

    void getMarks ()
    {
        cout << "Enter marks in 7 subjects:";
        for (int i=0; i<7; i++) cin >> marks[i];
    }
}; // end class Student

 

Refer to the following code to write objects in a binary file, read all objects and display them. While writing, objects are added at the end of the file, so the file is opened with ios::app mode.

Student sobj;

void writeStudents()
{
    ofstream outfile;
    outfile.open("student.dat", ios::out | ios::binary | ios::app);
    if (!outfile.is_open())
    { cout << "Error opening output file…"; exit(1); }

    int i,n;
    cout << "How many students to add? ";
    cin >> n;
    for ( i=0; i<n; i++)
    {
        cin.get(); // extract enter key
        // read student data
        sobj.GetStudentData();
        // write student data in binary file
        outfile.write ( (char*)&sobj, sizeof(sobj));
    }
    outfile.close();
} // end writeStudents

void readStudents()
{
    ifstream infile("student.dat", ios::in | ios::binary);
    if (!infile )
    { cout << "Error opening input file…"; exit(2); }

    // read from binary file
    infile.read( (char*)&sobj, sizeof(sobj));
    while (infile) // true only if the last read was successful
    {
        sobj.ShowStudentData();
        infile.read( (char*)&sobj, sizeof(sobj));
    }
    infile.close();
}

int main()
{
    writeStudents();
    readStudents();
    cout << "Press any key ";
    cin.get(); cin.get();
    return 0;
}

 

Sample output of program is shown below.

 

How many students to add? 2

Enter student’s name: Tina

Enter marks in 7 subjects:45 40 38 47 42 48 39

Enter student’s name: Ali

Enter marks in 7 subjects:48 43 40 37 36 47 45

Name: Tina

 

Marks: 45  40  38  47  42  48  39

Name: Ali

Marks: 48  43  40  37  36  47  45
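Writing an object with write() copies its bytes as they are, so the technique is only safe for classes that hold plain data (no pointers, no std::string members, no virtual functions). The following sketch checks this at compile time. It assumes a hypothetical plain struct Record with the same data members as Student; the helper names appendRecord, loadRecords and makeRecord are illustrative.

```cpp
#include <cstring>
#include <fstream>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical plain record with the same data members as Student.
struct Record {
    char name[30];
    int  marks[7];
};

// Raw read()/write() of an object is only meaningful for such types.
static_assert(std::is_trivially_copyable<Record>::value,
              "Record must be trivially copyable for raw binary I/O");

// Appends one record at the end of the file.
bool appendRecord(const std::string& path, const Record& r) {
    std::ofstream out(path, std::ios::out | std::ios::binary | std::ios::app);
    if (!out) return false;
    out.write(reinterpret_cast<const char*>(&r), sizeof r);
    return out.good();
}

// Reads every complete record. The read itself is the loop condition,
// so a failed or partial read never reaches the vector.
std::vector<Record> loadRecords(const std::string& path) {
    std::vector<Record> all;
    std::ifstream in(path, std::ios::in | std::ios::binary);
    Record r;
    while (in.read(reinterpret_cast<char*>(&r), sizeof r)) all.push_back(r);
    return all;
}

// Illustrative helper that fills a Record with consecutive marks.
Record makeRecord(const char* name, int firstMark) {
    Record r{};   // zero-initialised, so name stays null terminated
    std::strncpy(r.name, name, sizeof r.name - 1);
    for (int i = 0; i < 7; ++i) r.marks[i] = firstMark + i;
    return r;
}
```

A file written this way should be read back by a program compiled with the same record layout, since the size and padding of the structure can differ between compilers and machines.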

 

 

1.6.1. Modify object stored in binary file

 

Add function to modify marks of a specified student and write updated record back in binary file as shown here.

 

void modifyMarks()
{
    //file for read/write purpose, so use fstream class
    fstream file("student.dat", ios::in | ios::out | ios::binary);
    if (!file )
    { cout << "Error opening input file…"; exit(3); }

    char name [40];
    cout << "Enter name of student to modify marks: "; cin >> name;

    file.read( (char*)&sobj, sizeof(sobj));
    while (!file.eof())
    {
        if (strcmp(name, sobj.getname())==0) // object found; strcmp() needs <cstring>
        {
            sobj.getMarks();
            long pos = -1 * (long) sizeof(sobj); //for backward move
            file.seekp (pos, ios::cur); //back from current
            file.write ((char*)&sobj, sizeof(sobj));
            break;
        }
        file.read( (char*)&sobj, sizeof(sobj)); //read next
    }
    file.close();
}

Calling modifyMarks() and readStudents() in the main function will result in the following output:

 

Enter name of student to modify marks: Tina

Enter marks in 7 subjects:45 45 46 46 47 47 40

 

after modification …

Name: Tina

Marks: 45  45  46  46  47  47  40

Name: Ali

Marks: 48  43  40  37  36  47  45

 

Here, observe following points in the code:

 

  • Use of fstream with input as well as output open mode

 

Modifying the marks of a specified student requires reading a student object, modifying the marks if the name matches, and then writing the object back at the same position. Hence, one should use an fstream class object as the i/o stream.

 

  • Positioning “put” pointer at the beginning of record before writing

 

After reading an object, the file position has moved sizeof(sobj) bytes ahead. Before writing the updated object back, the “put” pointer should be moved sizeof(sobj) bytes in the backward direction.
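The read, step back and overwrite sequence can be isolated in a small function. This sketch assumes a hypothetical plain struct Record with the same data members as Student, and takes the new marks as a parameter instead of reading them from the keyboard; the name updateMarks is illustrative.

```cpp
#include <cstring>
#include <fstream>
#include <string>

// Hypothetical plain record with the same data members as Student.
struct Record {
    char name[30];
    int  marks[7];
};

// Finds the first record with the given name and overwrites its marks.
// Returns true if a record was updated.
bool updateMarks(const std::string& path, const char* name,
                 const int (&newMarks)[7]) {
    std::fstream file(path, std::ios::in | std::ios::out | std::ios::binary);
    if (!file) return false;

    Record r;
    while (file.read(reinterpret_cast<char*>(&r), sizeof r)) {
        if (std::strcmp(r.name, name) != 0) continue;
        for (int i = 0; i < 7; ++i) r.marks[i] = newMarks[i];
        // Step back over the record just read, then overwrite it.
        file.seekp(-static_cast<std::streamoff>(sizeof r), std::ios::cur);
        file.write(reinterpret_cast<const char*>(&r), sizeof r);
        return file.good();
    }
    return false;   // no record with that name
}
```

The size is converted to a signed type before negation because sizeof yields an unsigned value, and negating an unsigned value does not give a negative offset.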

 

1.6.2.    Delete an object from binary file

 

Deletion of record from a file can be implemented using logical deletion or physical deletion.

 

For logical deletion of an object, add one data member, say delFlag, to the class to mark the object as deleted. Deletion then amounts to modifying the object by setting delFlag to true. Care should be taken while processing, for example by showing only non-deleted objects.
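Logical deletion might be sketched as follows. The struct FlaggedRecord is a hypothetical plain record that adds the delFlag member suggested above; markDeleted and countActive are illustrative names.

```cpp
#include <cstring>
#include <fstream>
#include <string>

// Hypothetical record with an extra flag marking it as deleted.
struct FlaggedRecord {
    char name[30];
    int  marks[7];
    bool delFlag;
};

// Sets delFlag on the first active record with a matching name, in place.
bool markDeleted(const std::string& path, const char* name) {
    std::fstream file(path, std::ios::in | std::ios::out | std::ios::binary);
    FlaggedRecord r;
    while (file.read(reinterpret_cast<char*>(&r), sizeof r)) {
        if (r.delFlag || std::strcmp(r.name, name) != 0) continue;
        r.delFlag = true;
        file.seekp(-static_cast<std::streamoff>(sizeof r), std::ios::cur);
        file.write(reinterpret_cast<const char*>(&r), sizeof r);
        return file.good();
    }
    return false;   // not found, or the file could not be opened
}

// Counts the records that are still active, skipping deleted ones.
int countActive(const std::string& path) {
    std::ifstream in(path, std::ios::in | std::ios::binary);
    FlaggedRecord r;
    int active = 0;
    while (in.read(reinterpret_cast<char*>(&r), sizeof r))
        if (!r.delFlag) ++active;
    return active;
}
```

The file does not shrink with this approach; deleted records keep occupying space until the file is rewritten.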

With physical deletion, an object record is removed from the file. Due to the sequential nature of streams, the remaining records have to be shifted, which requires setting the “get” and “put” pointers for all remaining records. A simpler solution is to write the remaining records to a temporary file and then let the temporary file take the place of the original file. This can be performed using the following steps:

(i) Create temporary file

(ii) Read an object from data file

(iii) If it matches with the specified object to be deleted, do not write in temporary file; otherwise write in temporary file

(iv) Repeat steps (ii) and (iii) till end of data file

(v) Close both files

(vi) Delete original data file

(vii) Rename temporary file as original data file, for example: rename("temp.dat", "marks.dat");

Refer to the following code, which uses physical deletion of records from a binary file.

 

void deleteStudent()
{
    ifstream infile("student.dat", ios::in | ios::binary);
    if (!infile )
    { cout << "Error opening input file…"; exit(2); }

    ofstream tmpfile;
    tmpfile.open("temp.dat", ios::out | ios::binary); // copy objects in temp file
    if (!tmpfile.is_open())
    { cout << "Error opening output file…"; exit(4); }

    char name [40];
    cout << "Enter name of student to be deleted: "; cin >> name;

    infile.read( (char*)&sobj, sizeof(sobj));
    while (!infile.eof())
    {
        if (strcmp(name, sobj.getname())!=0) // object not to be deleted
        {
            tmpfile.write ((char*)&sobj, sizeof(sobj));
        }
        infile.read( (char*)&sobj, sizeof(sobj)); //read next
    }
    infile.close(); tmpfile.close();

    // remove() and rename() are declared in <cstdio>
    remove("student.dat"); // remove old student.dat file
    rename("temp.dat", "student.dat"); // rename temp.dat to student.dat
}

Write the following lines in main() and execute the program.

deleteStudent();
cout << "\nafter deletion … \n";
readStudents();

Observe the output:

 

Enter name of student to be deleted: Tina

 

after deletion …

Name: Ali

Marks: 48  43  40  37  36  47  45

 

1.6.3. Random access: Reading nth object from binary file

 

To access nth object from file without reading previous records, simply position “get” file pointer at (n-1)*sizeof(obj) bytes from beginning of file and read the record as follows.

 

ifstream file;
file.open("student.dat", ios::in | ios::binary);
int n;
cin >> n; // input n, the record number counting from 1
long pos = (n-1) * (long) sizeof(sobj); //for moving pos bytes ahead from beginning
file.seekg (pos);  // from beginning
file.read ((char*)&sobj, sizeof(sobj));
if (file) sobj.ShowStudentData(); // show only if the read succeeded
file.close();
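A request for a record beyond the end of the file should be rejected rather than left to fail inside read(). The sketch below adds such a check by comparing the computed position with the file size. It assumes a hypothetical plain struct Record with the same data members as Student; the name readNth is illustrative.

```cpp
#include <fstream>
#include <string>

// Hypothetical plain record with the same data members as Student.
struct Record {
    char name[30];
    int  marks[7];
};

// Reads the nth record (counting from 1) directly, without reading
// the records before it. Returns false if n is out of range.
bool readNth(const std::string& path, long n, Record& out) {
    std::ifstream file(path, std::ios::in | std::ios::binary | std::ios::ate);
    if (!file || n < 1) return false;

    // Opened at the end, so tellg() gives the file size in bytes.
    const std::streamoff fileSize = file.tellg();
    const std::streamoff recSize = static_cast<std::streamoff>(sizeof(Record));
    const std::streamoff pos = (n - 1) * recSize;
    if (pos + recSize > fileSize) return false;   // record n is not in the file

    file.seekg(pos, std::ios::beg);
    file.read(reinterpret_cast<char*>(&out), sizeof out);
    return !file.fail();
}
```

Direct positioning like this works only because every record occupies the same number of bytes in the file.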

 

 

Summary

  • Binary files are used for persistent storage of data in raw format
    o No type conversion or formatting takes place
    o Data is stored in the same format as it is stored in main memory
  • Advantages of binary files over text files
    o Faster input/output operations
    o Storage space efficient
    o Not editable using text editors
  • While opening, ios::binary open mode must be specified along with other modes
  • Input/output operations can be performed using
    o Methods read() and write() for any type of data: numeric, character, string, struct, object etc.
    o Additionally, methods get() and put() to read and write character data
  • Random access and file error handling are the same as with text files
