Learning C++ part 3

Interface in C++

In C++, an interface is not a built-in language construct, but rather a design pattern implemented using a class that contains only pure virtual functions. A pure virtual function is declared using = 0, which makes the class non-instantiable and forces derived classes to provide implementations.

By convention, an interface class typically: avoids storing state (no data members), exposes only behavior, and declares a virtual destructor to ensure proper cleanup through base-class pointers. Although C++ does not strictly forbid constructors or implemented methods in an interface-style class, idiomatic usage keeps it minimal and purely behavioral.

class IDisplayable { // 'I' prefix by convention
public:
    virtual void display() = 0; // Pure virtual function
    virtual ~IDisplayable() = default; // Always make base destructors virtual
};

The key idea is that an interface defines a contract. Any class inheriting from IDisplayable must implement display(), guaranteeing consistent behavior across unrelated types.

A more general abstract class, on the other hand, is simply any class that contains at least one pure virtual function. Unlike a strict interface-style class, an abstract class may: provide both pure virtual and fully implemented methods, contain data members (shared state), and define constructors to initialize that shared state.

Abstract classes are particularly useful when modeling a hierarchy of closely related types that share common behavior. For example, a base Animal class might implement a concrete eat() function while leaving makeSound() as a pure virtual function. This allows derived classes to reuse shared logic while still customizing specific behaviors.
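
As a rough sketch of the Animal example described above, the following shows a base class that mixes shared state, a concrete eat() and a pure virtual makeSound(). The Dog and Cat classes, the name_ member and the returned strings are illustrative assumptions, not part of any real library.

```cpp
#include <string>
#include <utility>

// Hypothetical abstract base: shared state and behavior plus one pure virtual.
class Animal {
public:
    explicit Animal(std::string name) : name_(std::move(name)) {}
    virtual ~Animal() = default;

    // Concrete behavior shared by every derived class.
    std::string eat() const { return name_ + " is eating"; }

    // Pure virtual: each derived class must supply its own sound.
    virtual std::string makeSound() const = 0;

private:
    std::string name_;
};

class Dog : public Animal {
public:
    Dog() : Animal("Dog") {}
    std::string makeSound() const override { return "Woof"; }
};

class Cat : public Animal {
public:
    Cat() : Animal("Cat") {}
    std::string makeSound() const override { return "Meow"; }
};
```

Animal itself cannot be instantiated, but a Dog or Cat can be used through a const Animal& and still reuse eat() unchanged.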

virtual keyword

The virtual keyword enables dynamic dispatch, also known as runtime polymorphism. When a function is declared as virtual in a base class, the function that gets executed is determined at runtime based on the actual type of the object, not the type of the pointer or reference.

The most critical place where virtual matters is the destructor. If you delete a derived object through a base class pointer, the base class destructor must be virtual. Otherwise the behavior is undefined; in practice, typically only the base class destructor runs and the derived class's destructor is skipped, which leaks any resources owned by the derived part.

class Base {
public:
    virtual ~Base() = default; // Must be virtual for polymorphic deletion
};

class Derived : public Base {
public:
    ~Derived() {
        // Cleanup specific to Derived
    }
};

Without a virtual destructor, deleting a Derived object through a Base* is undefined behavior; in practice it typically calls only ~Base() and skips ~Derived().

As a best practice, if a class contains any virtual functions or is intended to be used polymorphically (i.e., accessed through a base pointer or reference), its destructor should always be declared virtual. This ensures correct object destruction and safe polymorphic behavior.
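
To make the destruction order visible, here is a minimal sketch that records each destructor call in a string. The names LoggedBase, LoggedDerived, destructionLog and destroyThroughBasePointer are made up for this illustration.

```cpp
#include <memory>
#include <string>

// Hypothetical log so the destruction order can be inspected afterwards.
inline std::string& destructionLog() {
    static std::string log;
    return log;
}

struct LoggedBase {
    virtual ~LoggedBase() { destructionLog() += "~LoggedBase "; }
};

struct LoggedDerived : LoggedBase {
    ~LoggedDerived() override { destructionLog() += "~LoggedDerived "; }
};

// Destroys a derived object through a base-class pointer and returns the log.
inline std::string destroyThroughBasePointer() {
    destructionLog().clear();
    {
        std::unique_ptr<LoggedBase> p = std::make_unique<LoggedDerived>();
    } // p goes out of scope here: delete is applied to a LoggedBase*
    return destructionLog();
}
```

Because the base destructor is virtual, the derived destructor runs first and the base destructor second, even though the deletion happens through a base pointer.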

#include <iostream>

class Base {
public:
    int x; // 4 bytes
    virtual void func() {}
};

class Derived : public Base {
public:
    int y;
    int z;
};

int main() {
    std::cout << sizeof(Base) << std::endl;
    // Typical 64-bit result: vptr (8 bytes) + x (4 bytes) + padding (4 bytes) = 16

    std::cout << sizeof(Derived) << std::endl;
    // Base part: vptr (8) + x (4)
    // Derived members: y (4) + z (4)
    // Padding added to maintain 8-byte alignment
    // Total = 24 bytes

    return 0;
}

When a class contains a virtual function, the compiler adds a hidden pointer called the vptr. This pointer references a vtable, which stores the addresses of the virtual functions. On most 64-bit systems, the vptr occupies 8 bytes.

Because the vptr is a pointer (8 bytes) and is the most strictly aligned member, the compiler rounds the object size up to a multiple of 8 bytes. Padding is added where necessary.

Memory layout of Base:

Base
----------------------------------------------------------
| vptr (8 bytes) | x (4 bytes) | padding (4 bytes) |
----------------------------------------------------------
Total: 16 bytes

Memory layout of Derived:

The Derived class inherits the memory layout of Base. Many compilers reuse the padding at the end of the base class to store derived members when possible.

Derived
---------------------------------------------------------------------------------
| vptr (8 bytes) | x (4 bytes) | y (4 bytes) | z (4 bytes) | padding (4 bytes) |
---------------------------------------------------------------------------------
Total: 24 bytes

In summary, each object only stores a single vptr, not the entire vtable. The vtable itself is shared by all objects of the same type. Alignment and padding ensure that objects satisfy the platform's memory alignment requirements, which is why the final sizes become 16 bytes for Base and 24 bytes for Derived.

override keyword

In C++, a derived class can override a base class function as long as the base function is declared virtual and the function signatures match exactly. The override keyword is not required for overriding to work; it is a compile-time safety feature.

Consider the following base class:

class Shape {
public:
    virtual void printDetails() {
        std::cout << "Shape\n";
    }
};

And a derived class that does not use override:

class Rectangle : public Shape {
public:
    void printDetails() {
        std::cout << "Rectangle\n";
    }
};

This still overrides correctly. Dynamic dispatch works as expected:

Shape* s = new Rectangle();
s->printDetails();   // prints "Rectangle"

The override happens because the base function is virtual and the function signatures match exactly.

So why does override exist? Without it, the compiler does not verify that you are actually overriding a base class function. If you accidentally change the function signature, the derived function becomes a completely new function instead of an override.

class Rectangle : public Shape {
public:
    void printDetails() const {  // added const
        std::cout << "Rectangle\n";
    }
};

Now the signatures differ:

virtual void printDetails();      // Base
void printDetails() const;        // Derived (different signature)

Shape* s = new Rectangle();
s->printDetails();                // calls Shape::printDetails

Because the signatures do not match exactly, the derived function does not override the base version. Instead, it hides it. This can lead to subtle bugs.

If we add override, the compiler will detect the mistake:

void printDetails() const override;
// error: marked 'override' but does not override

Using override is considered best practice. It provides compile-time guarantees that the function is truly overriding a base class virtual function, making polymorphic code safer and easier to maintain.

Object slicing

Object slicing occurs when a derived class object is copied into a base class object by value. During this copy, only the base portion of the derived object is preserved. Any data members or behavior specific to the derived class are discarded.

class Base {
public:
    int foo;
};

class Derived : public Base {
public:
    int bar;
};

Derived d;
Base b = d;   // Object slicing occurs here; 'bar' is lost.

After the assignment, b is a completely independent Base object. The derived-specific member bar no longer exists in the copied object.

Object slicing commonly happens in three situations. First, when passing a derived object to a function that takes a base class parameter by value. The function receives a sliced copy. Second, when returning a derived object by value from a function whose return type is the base class. Third, when storing derived objects inside containers such as std::vector<Base>, which store elements by value.

To avoid object slicing, pass and store polymorphic objects using references (e.g., const Base&) or pointers (e.g., Base*, std::unique_ptr<Base>). This preserves the full derived object and enables correct runtime behavior.
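
The difference between passing by value and passing by reference can be sketched as follows. The Shape and Circle types and the three helper functions are hypothetical names chosen for this example only.

```cpp
#include <memory>
#include <string>
#include <vector>

struct Shape {
    virtual ~Shape() = default;
    virtual std::string name() const { return "Shape"; }
};

struct Circle : Shape {
    std::string name() const override { return "Circle"; }
};

// By value: the argument is copied into a plain Shape, so the Circle part is sliced off.
inline std::string nameByValue(Shape s) { return s.name(); }

// By reference: no copy is made, so the call dispatches on the dynamic type.
inline std::string nameByReference(const Shape& s) { return s.name(); }

// A container of owning pointers keeps each element's full derived type.
inline std::string firstNameInVector() {
    std::vector<std::unique_ptr<Shape>> shapes;
    shapes.push_back(std::make_unique<Circle>());
    return shapes.front()->name();
}
```

Passing a Circle to nameByValue yields "Shape", while nameByReference and the pointer-based vector both yield "Circle".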

Stack vs Heap

In C++, memory is broadly divided into two primary regions: the stack and the heap.

The stack is automatically managed by the program and typically stores local variables inside functions. Objects placed on the stack have a well-defined lifetime — they are created when execution enters a scope and destroyed automatically when the scope exits.

The heap, on the other hand, is used for dynamic memory allocation. Memory on the heap must be explicitly managed by the programmer using new / delete, or indirectly through abstractions such as standard library containers and smart pointers.

Traditional C-style arrays and std::array are typically allocated on the stack when declared as local variables with a compile-time constant size. Their lifetime follows the scope in which they are defined, meaning they are automatically destroyed when the function returns.

std::vector behaves differently. Even if the std::vector object itself is created on the stack, the elements it stores are allocated on the heap. Internally, the vector manages a dynamically allocated contiguous buffer that can grow or shrink at runtime.
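
One way to observe this difference, offered as a sketch rather than a definitive test, is to compare element addresses before and after a move. The function names are invented for this illustration.

```cpp
#include <array>
#include <utility>
#include <vector>

// Moving a vector hands over its heap buffer, so the element address is unchanged.
inline bool vectorMoveKeepsBuffer() {
    std::vector<int> source(1000, 7);
    const int* before = source.data();
    std::vector<int> target = std::move(source);
    return target.data() == before;
}

// A std::array holds its elements inside the object itself, so a "move"
// still copies them into the new object's own storage.
inline bool arrayMoveKeepsBuffer() {
    std::array<int, 4> source{1, 2, 3, 4};
    const int* before = source.data();
    std::array<int, 4> target = std::move(source);
    return target.data() == before;
}
```

On mainstream standard library implementations the vector version returns true and the array version returns false, which reflects where the elements actually live.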

In general, the stack is best suited for small, short-lived, and predictable data, while the heap is used for data whose size is unknown at compile time or must outlive the current scope.

Stack allocation is usually faster than heap allocation due to the simplicity of its memory management and better CPU cache locality.

Fast allocation: The stack operates using a Last-In-First-Out (LIFO) model. Allocating memory usually only requires adjusting the stack pointer register by a fixed offset. Deallocation is equally fast and happens automatically when a function returns.

Heap management overhead: The heap must support dynamic allocation and deallocation in arbitrary order. Memory allocators need to track free blocks, maintain metadata, and sometimes request memory from the operating system. This bookkeeping introduces additional overhead, making heap allocation slower than stack allocation.

Cache locality: Stack memory tends to have excellent spatial and temporal locality because local variables are accessed frequently during function execution. This increases the likelihood that stack data remains in the CPU's L1 cache. Heap allocations, however, can become scattered due to fragmentation, which can lead to more cache misses and slower memory access.

Thread behavior: Each thread has its own independent stack, so accessing stack variables usually requires no synchronization. The heap is typically shared across all threads in a process, so memory allocators often need synchronization mechanisms to prevent data corruption. This additional coordination can introduce further performance overhead in multi-threaded programs.

static_cast vs dynamic_cast vs reinterpret_cast

C++ provides several explicit casting operators that allow programmers to convert values between types in a controlled and readable way. The most commonly used ones are static_cast, dynamic_cast, and reinterpret_cast. Each serves a different purpose and provides a different level of safety.

static_cast performs conversions that can be verified at compile time. It is commonly used for standard conversions such as converting numeric types (for example int to double), invoking explicit constructors, or performing casts within an inheritance hierarchy.

Within class hierarchies, static_cast is commonly used for upcasting, such as converting a derived class pointer to a base class pointer. It can also perform downcasting, but it does not perform runtime type checking. This means the programmer must guarantee the cast is correct. If the object is not actually of the expected derived type, the behavior is undefined.
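
A small sketch of both uses follows. Vehicle, Truck, half and payloadOf are hypothetical names, and the hierarchy is deliberately non-polymorphic to show that static_cast needs no virtual functions.

```cpp
struct Vehicle {
    int wheels = 4;
};

struct Truck : Vehicle {
    int payloadKg = 1000;
};

// Numeric conversion: without the cast, 7 / 2 would be integer division.
inline double half(int value) {
    return static_cast<double>(value) / 2;
}

// Downcast with no runtime check: the caller must guarantee that 'v'
// really refers to a Truck, otherwise the behavior is undefined.
inline int payloadOf(Vehicle& v) {
    Truck& t = static_cast<Truck&>(v);
    return t.payloadKg;
}
```

Calling payloadOf with a reference to a plain Vehicle would compile without complaint, which is exactly why the correctness burden sits with the caller.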

static_cast is restricted to conversions between related types or conversions to and from void*. It cannot be used to convert between unrelated pointer types. For example, you cannot convert an int* to a char* or long* using static_cast. In situations where such conversions are required, reinterpret_cast must be used instead, although it is significantly less safe.

Another common use of static_cast is converting pointers to and from void*. In such cases the underlying address remains unchanged.

int* a = new int();
void* b = static_cast<void*>(a);
int* c = static_cast<int*>(b);

dynamic_cast is designed for safe casting within polymorphic class hierarchies. Unlike static_cast, it performs runtime type checking using runtime type information (RTTI).

A key requirement is that the base class must be polymorphic, meaning it must contain at least one virtual function. This allows the runtime system to determine the actual type of the object.

When casting pointers, a failed dynamic_cast returns nullptr. When casting references, it throws std::bad_cast. This makes dynamic_cast safer for downcasting when the exact runtime type is uncertain.
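
Both failure modes can be sketched with a small hierarchy. Widget, Button, Slider and the two helper functions are illustrative names, not part of any library.

```cpp
#include <typeinfo>   // std::bad_cast

struct Widget {
    virtual ~Widget() = default;   // makes the hierarchy polymorphic
};
struct Button : Widget {};
struct Slider : Widget {};

// Pointer form: a failed cast yields nullptr.
inline bool isButton(Widget* w) {
    return dynamic_cast<Button*>(w) != nullptr;
}

// Reference form: a failed cast throws std::bad_cast.
inline bool referenceCastThrows(Widget& w) {
    try {
        Button& b = dynamic_cast<Button&>(w);
        (void)b;   // only the success or failure of the cast matters here
        return false;
    } catch (const std::bad_cast&) {
        return true;
    }
}
```

The pointer form suits cases where a mismatch is an expected outcome, while the reference form fits code that treats a mismatch as an error.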

reinterpret_cast is a much lower-level operation. It instructs the compiler to treat the raw bit pattern of a value as a different, often unrelated, type. This form of cast performs no type checking and should generally be avoided unless absolutely necessary.

Typical uses of reinterpret_cast include converting between unrelated pointer types, converting pointers to integers (and vice versa), working with hardware interfaces, or examining raw memory layouts.

One important property is that if a pointer is converted to another type using reinterpret_cast and then converted back to its original type, the original pointer value is preserved. However, dereferencing the intermediate casted pointer may lead to undefined behavior if alignment requirements or strict aliasing rules are violated.

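#include <cstdint>

// Treating a float's bits as an integer
float f = 10.5f;
std::uint32_t* p = reinterpret_cast<std::uint32_t*>(&f);
// 'p' holds the same address, but reading *p violates the strict aliasing
// rules and is undefined behavior; copy the bytes out instead of dereferencing.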
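
When the goal is to read the bits rather than merely hold the pointer, a well-defined alternative is to copy the bytes with std::memcpy (C++20 also offers std::bit_cast). The sketch below assumes a 32-bit IEEE-754 float; floatBits is a name invented for this example.

```cpp
#include <cstdint>
#include <cstring>

// Copies the object representation of a float into a same-sized integer.
inline std::uint32_t floatBits(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "sketch assumes a 32-bit float");
    std::uint32_t bits = 0;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

Under the IEEE-754 assumption, floatBits(10.5f) yields 0x41280000: sign 0, biased exponent 130, and mantissa bits for 1.3125.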

In general, static_cast should be used for well-defined conversions known to be safe at compile time. dynamic_cast should be used when performing downcasts in polymorphic class hierarchies where runtime verification is required. reinterpret_cast should only be used for low-level operations where raw memory reinterpretation is unavoidable.

C++ string

A std::string is a high-level container that manages a dynamically sized sequence of characters. The string object itself is typically a small, fixed-size object (often stored on the stack when declared locally). Internally, it maintains metadata such as a pointer to the character buffer, the current size, and the capacity.

For longer strings, the actual character buffer is allocated on the heap. When the string grows beyond its current capacity, it typically allocates a new, larger buffer (often using a growth strategy such as doubling the capacity), copies the existing characters, and then deallocates the old buffer. This allows std::string to resize dynamically.
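
The effect of the growth strategy can be observed by counting how often the capacity changes while appending one character at a time. This is a sketch; countReallocations is a hypothetical helper and the exact count depends on the standard library in use.

```cpp
#include <cstddef>
#include <string>

// Appends 'count' characters one at a time and counts how often capacity changes.
inline std::size_t countReallocations(std::size_t count) {
    std::string s;
    std::size_t reallocations = 0;
    std::size_t lastCapacity = s.capacity();
    for (std::size_t i = 0; i < count; ++i) {
        s.push_back('x');
        if (s.capacity() != lastCapacity) {
            ++reallocations;
            lastCapacity = s.capacity();
        }
    }
    return reallocations;
}
```

With geometric growth, appending 10,000 characters triggers only a few dozen capacity changes at most, rather than one per character.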

Stack:
[ std::string object ]
    - pointer
    - size
    - capacity

Heap:
[ 'h','e','l','l','o','\0' ]

Most modern implementations apply Small String Optimization (SSO). With SSO, short strings are stored directly inside the std::string object itself, avoiding heap allocation entirely. On many 64-bit systems, strings up to roughly 15 characters can fit within this internal buffer, though the exact size is implementation-dependent.

Conceptually, without SSO: the std::string object (pointer, size, capacity) resides on the stack, while the character data lives on the heap.

With SSO enabled: the characters themselves are stored directly inside the string object, eliminating heap allocation for small strings and improving performance.
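
A rough way to check where the characters live is to test whether the buffer address falls inside the string object itself. This is only a sketch: storedInline is a made-up helper, and comparing addresses as integers assumes a flat address space.

```cpp
#include <cstdint>
#include <string>

// Returns true if the character buffer lies inside the string object itself.
inline bool storedInline(const std::string& s) {
    const auto data  = reinterpret_cast<std::uintptr_t>(s.data());
    const auto begin = reinterpret_cast<std::uintptr_t>(&s);
    return data >= begin && data < begin + sizeof(std::string);
}
```

A 100-character string cannot fit inside the object, so storedInline returns false for it. For a short string such as "hi" the result is usually true on implementations that use SSO, but the standard does not guarantee it.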

Memory management is handled automatically through RAII (Resource Acquisition Is Initialization). When a std::string object goes out of scope, its destructor releases any dynamically allocated memory, ensuring safe and automatic cleanup.

Operator new Function

In C++, the difference between the new operator and the operator new function is the distinction between a high-level language construct and a low-level memory allocation function.

The new operator is a C++ language keyword used to dynamically create an object (or an array of objects) on the heap. It performs two distinct steps. First, it calls the appropriate operator new function to allocate raw memory. Second, it invokes the object's constructor to initialize that memory.

MyClass* p = new MyClass(arguments); // Uses the new operator

In contrast, operator new is a function whose sole responsibility is to allocate a specified number of bytes of raw, uninitialized memory. It does not construct objects. Conceptually, it is similar to malloc in C.

The function takes the size (in bytes) as an argument and returns a void* pointer to the allocated memory block.

void* raw_memory = operator new(sizeof(MyClass)); // Calls operator new directly

Since this memory is uninitialized, the object's constructor has not been called yet. To construct an object inside this pre-allocated memory, you must use placement new.

MyClass* p = new (raw_memory) MyClass(arguments); // Placement new constructs the object in raw_memory

The scope resolution operator :: can be used to explicitly call the global operator new, even if the class defines its own overloaded version:

void* raw_memory = ::operator new(sizeof(MyClass));
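
Putting the pieces together, the steps performed by a new expression and its matching delete can be written out by hand. This is a minimal sketch; Message and manualLifetime are hypothetical names, and the code omits the cleanup that would be needed if the constructor threw.

```cpp
#include <new>
#include <string>
#include <utility>

struct Message {
    std::string text;
    explicit Message(std::string t) : text(std::move(t)) {}
};

// Walks through the steps a new expression and a delete expression perform.
inline std::string manualLifetime() {
    void* raw = ::operator new(sizeof(Message));   // 1. allocate raw bytes
    Message* m = new (raw) Message("hello");       // 2. construct in place
    std::string copy = m->text;
    m->~Message();                                 // 3. destroy the object
    ::operator delete(raw);                        // 4. release the bytes
    return copy;
}
```

Steps 1 and 2 together correspond to new Message("hello"), and steps 3 and 4 together correspond to delete m.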

Placement new Function

Placement new is a specialized form of new in C++ that allows you to construct an object at a pre-allocated memory address. Unlike the standard new operator, which both allocates memory and invokes the constructor, placement new skips the allocation step and only executes the constructor.

It is commonly used in performance-critical systems, custom memory managers, container implementations (e.g., std::vector), and embedded environments where avoiding dynamic allocation improves performance and cache locality.

To use placement new, you must include the <new> header. The syntax is: new (pointer) Type(args), where pointer is a void* pointing to raw, pre-allocated memory.

#include <iostream>
#include <new>       // Required for placement new
#include <string>

struct Person {
    std::string name;
    int age;

    Person(const std::string& n, int a) : name(n), age(a) {
        std::cout << "Person constructed: " << name << ", " << age << "\n";
    }

    ~Person() {
        std::cout << "Person destroyed: " << name << "\n";
    }
};

int main() {
    // Allocate properly aligned raw storage on the stack
    alignas(Person) char buffer[sizeof(Person)];

    // Construct a Person object in the buffer
    Person* p = new (buffer) Person("Alice", 30);

    std::cout << p->name << " is " << p->age << " years old.\n";

    // IMPORTANT: Manually call the destructor
    p->~Person();

    // Do NOT call delete p; (memory was not heap-allocated)
}

Because placement new does not allocate memory, there is no corresponding delete call. You must manually invoke the destructor using ptr->~Type(). Failing to do so will leak any resources managed by the object.

Proper alignment is critical. The memory must be correctly aligned for the type being constructed. Using a plain char buffer without alignas(Type) can lead to undefined behavior on some architectures. Placement new separates memory allocation from object construction.

RAII (Resource Acquisition Is Initialization)

class Person {
private:
    std::string name;  // value-type member
    int age;           // primitive type
};

In this example, both name and age are value-type member variables (not raw pointers). They are stored directly inside the Person object's memory layout.

No manual delete is needed because neither member directly owns raw heap memory. The int age is a primitive type. It lives entirely within the object and requires no cleanup when the object is destroyed.

The std::string name may internally allocate heap memory to store its character data. However, std::string itself follows RAII. Its destructor automatically releases any owned memory.

When a Person object is destroyed (for example, when it goes out of scope or is deleted), the following happens:

First, the body of the Person destructor runs (even if it is compiler-generated). Then, the compiler automatically calls the destructors of all member variables in reverse order of declaration: age first, then name. The int age is a trivial type and requires no destruction logic, while the std::string destructor (~basic_string()) runs for name and frees its internal memory.

Manual cleanup would only be required if the class directly managed resources such as raw pointers allocated with new, file handles, sockets, or mutexes.

RAII (Resource Acquisition Is Initialization) is the principle that resource management is tied to object lifetime. Resources are acquired during construction and released automatically in the destructor. This guarantees deterministic cleanup and prevents leaks, even in the presence of exceptions.

From RAII comes the Rule of Zero: if all member variables are RAII types (such as std::string, std::vector, or smart pointers), you do not need to manually define a destructor, copy constructor, move constructor, or assignment operators. The compiler-generated defaults are correct and sufficient.
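
The exception-safety claim can be illustrated with a small guard class. In this sketch the "resource" is just a counter of open handles; HandleGuard, openHandles and handlesAfterException are names made up for the example.

```cpp
#include <stdexcept>

// Hypothetical resource: a counter of currently "open" handles.
inline int& openHandles() {
    static int count = 0;
    return count;
}

// RAII wrapper: acquires in the constructor, releases in the destructor.
class HandleGuard {
public:
    HandleGuard() { ++openHandles(); }
    ~HandleGuard() { --openHandles(); }
    HandleGuard(const HandleGuard&) = delete;            // one owner per handle
    HandleGuard& operator=(const HandleGuard&) = delete;
};

// Throws while a guard is alive; stack unwinding still runs the destructor.
inline int handlesAfterException() {
    try {
        HandleGuard guard;
        throw std::runtime_error("simulated failure");
    } catch (const std::runtime_error&) {
        // guard has already been destroyed by the time control arrives here
    }
    return openHandles();
}
```

A class that holds a HandleGuard as a member would need no destructor of its own, which is the Rule of Zero in practice.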

<cstring> Functions

std::memcpy

The std::memcpy function is a C++ standard library routine used to copy a specified number of bytes from a source memory location to a destination memory location, regardless of the underlying data types.

It is declared in the <cstring> header:

void* memcpy(void* dest, const void* src, std::size_t count);

std::memcpy performs a raw, byte-by-byte copy as if the memory were an array of unsigned char. It does not call constructors, assignment operators, or perform type-aware copying. It should only be used with trivially copyable types.

It is typically the fastest standard routine for memory-to-memory copies, but assumes that the source and destination memory regions do not overlap. If overlapping memory must be handled, std::memmove should be used instead.
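
A short sketch of both routines follows. Point, copyPoint and shiftRight are illustrative names; the static_assert documents the trivially-copyable requirement mentioned above.

```cpp
#include <cstring>
#include <type_traits>

struct Point {
    int x;
    int y;
};
static_assert(std::is_trivially_copyable<Point>::value,
              "memcpy is only valid for trivially copyable types");

// Non-overlapping copy of one object's bytes into another.
inline Point copyPoint(const Point& src) {
    Point dst;
    std::memcpy(&dst, &src, sizeof(Point));
    return dst;
}

// Overlapping ranges: shift the first four characters right by one.
// memmove handles the overlap; memcpy would be undefined behavior here.
inline void shiftRight(char (&buf)[6]) {
    std::memmove(buf + 1, buf, 4);
}
```

Starting from a buffer holding "abcd", shiftRight leaves it holding "aabcd".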

std::strlen

std::size_t strlen(const char* str);

The std::strlen function returns the length of a null-terminated byte string. It counts the number of characters starting from str up to, but not including, the first null character ('\0').

The behavior is undefined if the character array pointed to by str does not contain a null terminator. Unlike std::string, C-style strings do not store their length explicitly.
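
The contrast with std::string can be sketched using a literal that contains an embedded null character. The function names cLength and stringLength are hypothetical.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// strlen stops at the first '\0', wherever it appears.
inline std::size_t cLength() {
    const char text[] = "hi\0there";   // 9 chars: embedded '\0' plus the terminator
    return std::strlen(text);
}

// std::string stores its size, so an embedded '\0' is just another character.
inline std::size_t stringLength() {
    const char text[] = "hi\0there";
    return std::string(text, sizeof(text) - 1).size();
}
```

Here cLength() returns 2, while stringLength() returns 8 because the explicit length is passed to the std::string constructor.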

std::memset

void* memset(void* dest, int ch, std::size_t count);

The std::memset function fills a block of memory with a single byte value. It converts ch to unsigned char and writes that value into each of the first count bytes of dest.

Because std::memset operates at the byte level, it should be used carefully with non-trivial types. For example, zeroing an object that has virtual functions (and therefore a vptr) or non-trivial members such as std::string results in undefined behavior. It is safest to use with raw buffers or trivially copyable types.
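
The byte-level behavior leads to a common pitfall, sketched below with two hypothetical helpers, zeroFilled and oneFilled.

```cpp
#include <cstdint>
#include <cstring>

// Zeroing a raw buffer: every byte becomes 0, so every element reads as 0.
inline std::uint32_t zeroFilled() {
    std::uint32_t values[4];
    std::memset(values, 0, sizeof values);
    return values[0];
}

// Pitfall: memset writes bytes, not elements. Filling with 1 sets each
// byte to 0x01, so each 32-bit element becomes 0x01010101, not 1.
inline std::uint32_t oneFilled() {
    std::uint32_t values[4];
    std::memset(values, 1, sizeof values);
    return values[0];
}
```

To set every element to a non-zero value, std::fill from the algorithm header is usually the better tool.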

Destructors

Automatic Storage (Stack Objects)

void foo() {
    Person p("Alice", 30);
}

When foo() returns, p's destructor is called automatically. The memory is reclaimed automatically as the stack frame unwinds.

You must not manually call the destructor:

p.~Person();  // ❌ Wrong

Doing so causes double destruction, which results in undefined behavior.

Rule: Objects with automatic storage duration have their destructors invoked automatically when they leave scope.

Heap Objects (Allocated with new)

Person* p = new Person("Alice", 30);

Here, memory is allocated on the heap and then the constructor runs. To properly destroy the object:

delete p;

The delete keyword performs two operations:

1. Calls the destructor.
2. Frees the heap memory.

You must not manually call the destructor before delete:

p->~Person();  // ❌ Only calls destructor
delete p;      // ❌ Double destruction

That would destroy the object twice, leading to undefined behavior.

Placement New

alignas(Person) char buffer[sizeof(Person)];
Person* p = new (buffer) Person("Alice", 30);

In this case, memory was not allocated using new. Only the constructor was executed.

Therefore, you must manually call the destructor:

p->~Person();  // ✅ Required

You must not call:

delete p;  // ❌ Not heap allocated

This is because the memory was never allocated on the heap. Explicit destructor calls are only required when you manually control object lifetime, such as with placement new, custom containers, std::optional-like implementations, unions with non-trivial types, or memory pools.

Member Destruction Rules

class Person {
private:
    std::string name;
    int age;

public:
    ~Person() {
        // empty
    }
};

When a Person object is destroyed, C++ automatically:

1. Executes the body of ~Person().
2. Destroys member variables in reverse declaration order.
3. Calls base class destructors.

In this example, destruction order is:

1. age (trivial type, nothing happens).
2. name (its destructor runs automatically).
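
The full sequence, including a base class, can be sketched with types that log their own destruction. Noisy, Parent, Child, orderLog and destroyChild are all hypothetical names used only here.

```cpp
#include <string>

inline std::string& orderLog() {
    static std::string log;
    return log;
}

// Hypothetical member type that records its own destruction.
struct Noisy {
    const char* label;
    explicit Noisy(const char* l) : label(l) {}
    ~Noisy() { orderLog() += label; orderLog() += ' '; }
};

struct Parent {
    ~Parent() { orderLog() += "~Parent "; }
};

struct Child : Parent {
    Noisy first{"first"};
    Noisy second{"second"};
    ~Child() { orderLog() += "~Child "; }
};

// Creates and destroys a Child, then returns the recorded order.
inline std::string destroyChild() {
    orderLog().clear();
    { Child c; }
    return orderLog();
}
```

The log reads "~Child second first ~Parent ": destructor body first, members in reverse declaration order next, base class last.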

You must never manually destroy value-type members:

~Person() {
    name.~basic_string();  // ❌ Wrong
}

If you do this, the destructor for name runs once manually. After your destructor finishes, the compiler automatically destroys name again. This double destruction is undefined behavior and commonly shows up as double free errors, heap corruption, or crashes.