Learning C++

Allocating and deallocating memory in C++
Functors
Most vexing parse problem
Copy constructors
Copy Assignment Operators
Move constructors
Move assignment
Reference vs Pointers
Introduction to smart pointers
lvalue vs rvalue
std::move in C++
Capturing state in lambda functions

Allocating and deallocating memory in C++

The free store in C++ is a pool of memory that allows programmers to dynamically allocate and deallocate storage for objects during program execution. Dynamic memory allocation is handled using the new and delete operators, providing flexibility for scenarios where the size or lifetime of objects is not determined at compile time.

Unlike compiler-allocated memory, which is automatically managed for variables like int a or char str[10], dynamically allocated memory must be explicitly deallocated by the programmer: delete for a single object created with new, and delete[] for an array such as int *p = new int[10]. Failure to do so results in a memory leak, where the allocated memory stays reserved but unreachable until the program terminates.

One major use case of dynamic memory allocation is allocating variable-sized memory, which is not feasible with compiler-allocated memory (except for features like variable-length arrays in certain compilers). This capability enables the creation of complex data structures, such as linked lists and trees, where memory requirements can vary during runtime.

When the new operator is used, it requests memory on the free store. If sufficient memory is available, it constructs the object there and returns its address, which is typically stored in a pointer variable; if the request cannot be satisfied, new throws std::bad_alloc. The delete operator is then used to destroy the object and release the memory when it is no longer needed, which prevents leaks.

#include <iostream>

int main() {
    int *p = new int;  // allocate a single int on the free store
    *p = 10;
    std::cout << *p;
    delete p;          // release the memory once it is no longer needed
    return 0;
}
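
As a minimal sketch of the variable-sized case described above, the hypothetical helper sum_of_squares (not part of the original notes) allocates an array whose length is only known at run time, uses it, and releases it with delete[], the form that pairs with new[].

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: the array length n is only known at run time.
long sum_of_squares(std::size_t n) {
    long* squares = new long[n];          // allocate n elements on the free store
    for (std::size_t i = 0; i < n; ++i) {
        squares[i] = static_cast<long>(i * i);
    }
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += squares[i];
    }
    delete[] squares;                     // new[] must be paired with delete[]
    return total;
}
```

Calling sum_of_squares(4) sums 0, 1, 4 and 9. In everyday code a std::vector would usually replace the manual new[]/delete[] pair; the raw form is shown only to illustrate the mechanism.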

Functors

A functor is essentially a class that defines the operator(). This allows you to create objects that behave like functions.

#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// this is a functor
struct add_x {
  add_x(int val) : x(val) {}  // Constructor
  int operator()(int y) const { return x + y; }

private:
  int x;
};

int main() {
  add_x add42(42); // create an instance of the functor class
  int i = add42(8); // and "call" it
  assert(i == 50);

  std::vector<int> in = {1, 2, 3}; // sample input values
  std::vector<int> out(in.size());
  // Pass a functor to std::transform, which calls the functor on every element
  // in the input sequence, and stores the result to the output sequence
  std::transform(in.begin(), in.end(), out.begin(), add_x(1));
  for (std::size_t k = 0; k < in.size(); ++k) {
    assert(out[k] == in[k] + 1); // holds for every element
  }
  return 0;
}

Functors have several advantages. One significant benefit is that, unlike regular functions, they can maintain state. For example, the code above creates a functor that adds 42 to any given value. This value (42) is not hardcoded—it is specified as a constructor argument when creating the functor instance. This makes functors highly customizable. You could create another instance of the functor, say, one that adds 27, by passing a different value to the constructor.

As shown in the last lines, functors are often passed as arguments to other functions, such as std::transform or other standard library algorithms. While you can achieve similar functionality with regular function pointers, functors are more flexible because they can encapsulate state. For instance, instead of writing a function that adds a specific value (e.g., exactly 1), a functor allows you to create a general solution that works with any value you specify during initialization.

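Functors are also potentially more efficient. In the example above, the compiler knows at compile time exactly which function std::transform should call: add_x::operator(). Because the call target is part of the functor's type, the compiler can inline it, whereas a call through a function pointer is generally harder to inline.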
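
To illustrate state chosen at construction time with a different algorithm, here is a small sketch using std::count_if. The names greater_than and count_above are made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical predicate functor: the threshold is state chosen at construction.
struct greater_than {
    explicit greater_than(int limit) : limit_(limit) {}
    bool operator()(int value) const { return value > limit_; }

private:
    int limit_;
};

// Counts the elements above `limit` by handing the functor to std::count_if.
long count_above(const std::vector<int>& values, int limit) {
    return std::count_if(values.begin(), values.end(), greater_than(limit));
}
```

The same greater_than type serves any threshold, which a plain function with a hardcoded limit could not do.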

Most vexing parse problem

The "most vexing parse" is a specific form of syntactic ambiguity in the C++ programming language. It states that anything that can be interpreted as a function declaration will be interpreted as a function declaration.

A a( A() );

The above code could be interpreted in two ways: as a definition of a variable a of class A, initialized from an anonymous instance of class A, or as a declaration of a function a that returns an object of type A and takes a single unnamed parameter, which is a (pointer to a) function returning type A and taking no arguments. Most programmers expect the first interpretation, but the C++ standard requires the second.

#include <thread>

void do_something();       // defined elsewhere
void do_something_else();  // defined elsewhere

class background_task
{
public:
    void operator()() const
    {
        do_something();
        do_something_else();
    }
};

// Three alternative ways to write the declaration (use only one):
background_task f;
std::thread my_thread(f);                   // first form: named functor object

std::thread my_thread(background_task());   // second form: most vexing parse
std::thread my_thread{background_task()};   // third form: brace initialization

In this example, we create a thread by passing a functor instance to it. However, in the second form, std::thread my_thread(background_task()), a parsing ambiguity causes my_thread to be interpreted as a function declaration instead of a thread object. Specifically, the compiler treats my_thread as a function that takes a single parameter (a pointer to a function returning a background_task object) and returns a std::thread object. As a result, no thread is launched.

To avoid this issue, use braces {} as shown in the third form. This syntax ensures that my_thread is correctly interpreted as a thread instance, launching a new thread as intended.
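
A smaller sketch of the same ambiguity, without threads, shows two common ways around it. The types A and B and the function vexing_parse_demo are illustrative only.

```cpp
#include <cassert>

struct A { int id = 7; };
struct B {
    explicit B(A a) : id(a.id) {}
    int id;
};

int vexing_parse_demo() {
    // B b(A());      // most vexing parse: declares a function named b
    B braces{A()};    // braces cannot start a parameter list, so this is an object
    B parens((A()));  // extra parentheses force A() to be read as an expression
    return braces.id + parens.id;
}
```

Both braces and parens are real objects here, so their members can be read. Uncommenting the first line would not fail on that line itself, but any later use such as b.id would, because b would name a function.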

Copy Constructors

A copy constructor is a special constructor in C++ that is called when a new object is created as a copy of an existing object. It initializes the new object with the same values as the original one.

If you do not declare one, the C++ compiler generates a copy constructor that copies each member in turn, which amounts to a shallow copy. This means that if an object holds a pointer to dynamically allocated memory, only the address is copied, not the resource it points to. As a result, multiple objects may end up pointing to the same memory location, leading to issues like dangling pointers and double deletion.

To avoid these problems, we need to define a custom copy constructor that performs a deep copy. A deep copy ensures that each copied object gets its own separate copy of dynamically allocated resources, preventing unintended side effects.
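
The sharing caused by a shallow copy can be observed directly. In this sketch the hypothetical struct Shallow holds a non-owning pointer, so the effect is visible without risking a double delete.

```cpp
#include <cassert>

// Hypothetical type with no user-defined copy constructor.
struct Shallow {
    int* value;   // non-owning here, so the example stays free of double deletes
};

bool default_copy_shares_pointer() {
    int shared = 1;
    Shallow a{&shared};
    Shallow b = a;      // implicitly generated copy: copies the pointer itself
    *b.value = 99;      // writing through b is visible through a
    return a.value == b.value && *a.value == 99;
}
```

If Shallow owned its pointer and deleted it in a destructor, a and b would both try to free the same memory. That is the situation a deep-copying copy constructor prevents.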

Below is the syntax of a copy constructor in C++:

type(const type& other) { ... }

Now, let's look at a practical example of a copy constructor in C++:

#include <iostream>
#include <cstring>
using namespace std;

class Person {
    char* name;

public:
    Person(const char* nameInput) {
        cout << "Constructor called" << endl;
        name = new char[strlen(nameInput) + 1];
        strcpy(name, nameInput);
    }

    Person(const Person& other) { // copy constructor
        cout << "Copy constructor called" << endl;
        char* copiedName = other.name;
        name = new char[strlen(copiedName) + 1];
        strcpy(name, copiedName);
    }

    ~Person() { // define destructor
        delete[] name; // free up memory
    }

    void printName() {
        cout << this->name << endl;
    }

};

int main() {
    Person a("Kenneth");
    Person b = a; // copy constructor called
    a.printName();
    b.printName();
    return 0;
}

Constructor called
Copy constructor called
Kenneth
Kenneth

Copy Assignment Operators

Unlike a copy constructor, which is called when a new object is created from an existing object, the copy assignment operator is called when an already initialized object is assigned a new value from another existing object. Below are some key differences between a copy constructor and an assignment operator:

| Copy Constructor | Assignment Operator |
| --- | --- |
| It is called when a new object is created from an existing object, as a copy of the existing object. | This operator is called when an already initialized object is assigned a new value from another existing object. |
| It creates a separate memory block for the new object. | It does not automatically create a separate memory block or new memory space. However, if the class involves dynamic memory management, the assignment operator must first release the existing memory on the left-hand side and then allocate new memory as needed to copy the data from the right-hand side. |
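
To make the first row of the comparison concrete, the following sketch counts which operation runs for each statement. Tracer, Counts and count_copies are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>

struct Counts {
    int copy_constructed = 0;
    int copy_assigned = 0;
};

// Hypothetical type that records which copy operation ran.
struct Tracer {
    explicit Tracer(Counts* counts) : counts_(counts) {}
    Tracer(const Tracer& other) : counts_(other.counts_) { ++counts_->copy_constructed; }
    Tracer& operator=(const Tracer& other) {
        counts_ = other.counts_;
        ++counts_->copy_assigned;
        return *this;
    }
    Counts* counts_;
};

Counts count_copies() {
    Counts counts;
    Tracer a(&counts);
    Tracer b = a;        // b is being created: copy constructor
    Tracer c(&counts);
    c = b;               // c already exists: copy assignment operator
    return counts;
}
```

Note that Tracer b = a; uses the = sign but is still an initialization, so it invokes the copy constructor rather than the assignment operator.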

Below is the syntax of a copy assignment operator in C++:

type& operator=(const type& other) { ... return *this; }

Now, let's look at an example demonstrating how to overload the copy assignment operator.

#include <iostream>
#include <cstring>
using namespace std;

class Person {
    char* name;

public:
    Person(const char* nameInput) {
        cout << "Constructor called" << endl;
        name = new char[strlen(nameInput) + 1];
        strcpy(name, nameInput); // assign nameInput to name
    }

    Person(const Person& other) { // copy constructor
        cout << "Copy constructor called" << endl;
        char* copiedName = other.name;
        name = new char[strlen(copiedName) + 1];
        strcpy(name, copiedName);
    }

    ~Person() { // define destructor
        delete[] name; // free up memory
    }

    Person& operator = (const Person& other) {
        cout << "copy assignment operator called" << endl;
        if(this != &other) {
            delete[] name;
            char* personName = other.name;
            name = new char[strlen(personName) + 1];
            strcpy(name, personName);
        }
        return *this;
    }

    void printName() {
        cout << this->name << endl;
    }

};

int main() {
    Person a("Kenneth");
    Person b = a; // copy constructor called
    Person c("ShouldNotSeeName");
    c = a;        // copy assignment operator called
    a.printName();
    b.printName();
    c.printName();
    return 0;
}

Constructor called
Copy constructor called
Constructor called
copy assignment operator called
Kenneth
Kenneth
Kenneth

Move Constructors

A move constructor is a special constructor in C++ that is invoked when a new object is created from an r-value, such as a temporary object or the result of a function returning by value. Instead of copying data, the move constructor transfers ownership of the internal resources from the source object to the newly created one.

The source object relinquishes ownership, typically by transferring its pointer values to the destination and then nullifying its own pointers. This avoids expensive deep copies and ensures that the source does not attempt to free the same resources later, which would lead to double-deallocation.

After a move operation, the source object remains in a valid but unspecified state. The destination object, however, now fully owns the transferred resources and operates as if they were allocated for it directly.

Move constructors are particularly important when working with objects that manage expensive resources such as heap memory, file handles, or sockets. They allow temporary or intermediate values to be passed around efficiently: they serve as the cheap fallback when the compiler cannot apply return value optimization (RVO), and they make container operations like `push_back` efficient in modern C++.

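#include <utility> // for std::move

class MyClass {
public:
    int* data;

    MyClass() : data(new int[4]{}) {}  // acquire a resource
    ~MyClass() { delete[] data; }      // safe even when data is nullptr

    // Move Constructor
    MyClass(MyClass&& other) noexcept : data(other.data) {
        other.data = nullptr; // Steal resources and nullify source
    }

    // ... other members
};

MyClass createObject() {
    return MyClass(); // Returns a temporary; the move is usually elided (always since C++17)
}

int main() {
    MyClass obj1 = createObject();   // Typically constructed in place, no move needed
    MyClass obj2 = std::move(obj1);  // Move constructor called; obj1.data is now nullptr
    // ...
}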
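
The pointer transfer can be checked directly. This sketch uses a hypothetical Buffer class (not from the text above) and compares the addresses before and after the move.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical owning buffer used only to observe what a move does.
class Buffer {
public:
    explicit Buffer(std::size_t n) : size_(n), data_(new int[n]{}) {}
    ~Buffer() { delete[] data_; }

    Buffer(Buffer&& other) noexcept : size_(other.size_), data_(other.data_) {
        other.size_ = 0;        // leave the source empty but valid
        other.data_ = nullptr;
    }

    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;

    const int* data() const { return data_; }
    std::size_t size() const { return size_; }

private:
    std::size_t size_;
    int* data_;
};

bool move_steals_pointer() {
    Buffer source(8);
    const int* before = source.data();
    Buffer dest = std::move(source);   // move constructor: no elements are copied
    return dest.data() == before && source.data() == nullptr && source.size() == 0;
}
```

The destination ends up holding the very same address the source had, which is why a move costs a few pointer assignments regardless of the buffer size.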

Move Assignment

The move assignment operator is used when an already existing object is assigned the value of an r-value. Just like a move constructor, it transfers ownership of resources from the source object to the target object. However, because the target object was previously constructed and may already hold resources, the move assignment operator must perform additional work to manage those existing resources correctly.

A proper move assignment operator typically needs to carry out several steps. First, it must safely release any resources currently owned by the target object to avoid memory leaks. Next, it takes ownership of the source object's resources, usually by copying pointer values. Then, it sets the source object's resource pointers to nullptr to prevent double-deallocation. Finally, it must guard against self-assignment to ensure that an object being assigned to itself does not accidentally delete or corrupt its own data.

#include <utility> // for std::move

class MyClass {
public:
    int* data = nullptr; // start empty so that delete[] is always safe

    // Move Assignment Operator
    MyClass& operator=(MyClass&& other) noexcept {
        if (this != &other) { // Handle self-assignment
            delete[] data;        // Release existing resources
            data = other.data;    // Steal resources
            other.data = nullptr; // Nullify source
        }
        return *this;
    }
    // ... other members
};

int main() {
    MyClass obj1;
    MyClass obj2;
    obj1 = std::move(obj2); // Move assignment operator called
    // ...
}
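
The steps listed above (release, take over, nullify, guard against self-assignment) can also be written a little more compactly with std::exchange. The Buffer class below is a hypothetical stand-in for any resource-owning type.

```cpp
#include <cassert>
#include <utility>

// Hypothetical owning type holding a single heap-allocated int array.
class Buffer {
public:
    explicit Buffer(int first) : data_(new int[1]{first}) {}
    ~Buffer() { delete[] data_; }

    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;

    Buffer& operator=(Buffer&& other) noexcept {
        if (this != &other) {                             // guard against self-assignment
            delete[] data_;                               // release what we currently own
            data_ = std::exchange(other.data_, nullptr);  // take over and nullify source
        }
        return *this;
    }

    bool empty() const { return data_ == nullptr; }
    int first() const { return data_[0]; }

private:
    int* data_;
};
```

std::exchange(other.data_, nullptr) returns the old pointer and sets the source to nullptr in one expression, so the take-over and nullify steps cannot be accidentally separated.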

The key distinction between move construction and move assignment is the state of the target object. A move constructor operates only when a new object is being created, while a move assignment operator applies to an object that already exists. This mirrors the difference between copy construction and copy assignment.

a2 = std::move(a1);        // Move assignment: object already exists

A a2 = std::move(a1);     // Move construction: new object created

Reference vs Pointers

A pointer is a variable that holds the memory address of another variable. To access the value at that address, the pointer must be dereferenced using the * operator. A reference, on the other hand, is an alias for an existing variable: another name that can be used as though it were the original. Compilers typically implement references as addresses, but the dereferencing is automatic, and a reference must be bound when it is declared, cannot be null, and cannot be rebound afterwards. In this sense, a reference can be thought of as a constant pointer with automatic indirection.

#include <iostream>
using namespace std;

void increaseByReference(int& ref) {
    ref += 10;
}

void increaseByPointer(int* ptr) {
    if (ptr != nullptr) {
        *ptr += 10;
    }
}

int main() {
    int num1 = 5;
    int num2 = 5;

    cout << "Original num1: " << num1 << endl;
    cout << "Original num2: " << num2 << endl;

    increaseByReference(num1);
    cout << "num1 after increaseByReference: " << num1 << endl;

    increaseByPointer(&num2);
    cout << "num2 after increaseByPointer: " << num2 << endl;

    return 0;
}

Original num1: 5
Original num2: 5
num1 after increaseByReference: 15
num2 after increaseByPointer: 15
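
One practical consequence of the "constant pointer" view is that a pointer can be pointed at a different object, while assigning to a reference writes through to the object it already names. A minimal sketch, with reseat_demo and ReseatResult as made-up names:

```cpp
#include <cassert>

struct ReseatResult {
    int a;
    int b;
    bool ref_still_names_a;
};

ReseatResult reseat_demo() {
    int a = 1;
    int b = 2;

    int* ptr = &a;
    ptr = &b;       // a pointer can be reseated to another object
    *ptr = 20;      // writes to b

    int& ref = a;   // a reference is bound when it is declared
    ref = b;        // not a rebind: copies b's value (20) into a

    return ReseatResult{a, b, &ref == &a};
}
```

After the call both a and b hold 20, and ref still refers to a.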

Introduction to smart pointers

In C++, objects can be allocated either on the stack or the heap. When you write Person person("test");, the `person` object is created on the stack and is automatically destroyed when it goes out of scope. The compiler inserts the destructor call for you, so no manual memory management is needed. Heap allocation is used when the lifetime of an object should not be tied to the scope of the current function. Raw pointers are commonly used for this, but they come with risks such as memory leaks and dangling pointers. This is where smart pointers come in: they provide automatic memory management and reduce common errors in dynamic memory handling.

A smart pointer in C++ automatically manages memory allocation on the heap, ensuring that resources are properly deallocated. One commonly used smart pointer is std::unique_ptr, which helps convert non-RAII types into RAII types. It ensures that allocated resources are automatically deleted when the smart pointer goes out of scope.
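
A short sketch of std::unique_ptr in action follows. The Tracked type and the function unique_ptr_demo are hypothetical and exist only to count destructor calls.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical resource that records when it is destroyed.
struct Tracked {
    explicit Tracked(int* destroyed) : destroyed_(destroyed) {}
    ~Tracked() { ++*destroyed_; }
    int* destroyed_;
};

// Returns how many Tracked objects were destroyed by the time the scope ends.
int unique_ptr_demo() {
    int destroyed = 0;
    {
        auto owner = std::make_unique<Tracked>(&destroyed);
        std::unique_ptr<Tracked> new_owner = std::move(owner); // transfer ownership
        assert(owner == nullptr);   // the moved-from unique_ptr is empty
        assert(destroyed == 0);     // the object is still alive
    }                               // new_owner goes out of scope: delete runs here
    return destroyed;
}
```

There is no explicit delete anywhere, and the object is destroyed exactly once. A std::unique_ptr cannot be copied, only moved, which is what keeps ownership unique.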

Memory management issues can arise when dynamically allocating resources. Some common problems include:

Double free: A resource is freed multiple times, causing undefined behavior.

Use after free: A thread attempts to access a resource after it has been freed.

Data race: No proper synchronization exists, leading to race conditions when accessing or freeing memory.

Below is an example demonstrating a memory leak, where dynamically allocated memory is not freed:

#include <chrono>
#include <iostream>
#include <string>
#include <thread>

// `bar` is a barrier shared by the four threads below; it is defined elsewhere.

void trade_nvda()
{
    int* foo = new int { 0 };
    // TODO: Insert delete at appropriate locations

    // Predictor threads
    auto predictor_fn = [](int* foo, int prediction) {
        bar.arrive_and_wait();
        *foo = prediction;
    };
    std::thread { predictor_fn, foo, 13300 }.detach();
    std::thread { predictor_fn, foo, 13000 }.detach();

    // Trader threads
    auto trader_fn = [](int* foo) {
        bar.arrive_and_wait();
        while(*foo == 0)
        {
        }
        std::cout << "B NVDA 2 " + std::to_string(*foo) + '\n' << std::flush;
    };
    std::thread { trader_fn, foo }.detach();
    std::thread { trader_fn, foo }.detach();
}

int main()
{
    trade_nvda();
    std::this_thread::sleep_for(std::chrono::milliseconds { 1 });
    trade_nvda();
    std::this_thread::sleep_for(std::chrono::milliseconds { 1 });
    // ...
}

To avoid these issues, we can use std::shared_ptr, a reference-counted smart pointer that allows multiple owners to share a resource safely.

How do shared pointers work?

Each instance of std::shared_ptr maintains a reference count to the managed object. When a shared pointer is copied, the reference count increases. When a shared pointer is destroyed, the reference count decreases. Once the count reaches zero, the managed object is automatically deleted. The use_count() method can be used to check the number of references to a resource.
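
The counting behaviour can be observed in a single-threaded sketch. UseCounts and use_count_demo are names invented for this illustration.

```cpp
#include <cassert>
#include <memory>

struct UseCounts {
    long initial;
    long while_copied;
    long after_copy_destroyed;
};

UseCounts use_count_demo() {
    UseCounts counts{};
    auto first = std::make_shared<int>(0);
    counts.initial = first.use_count();            // one owner
    {
        std::shared_ptr<int> second = first;       // copying adds an owner
        counts.while_copied = first.use_count();
    }                                              // second is destroyed: count drops
    counts.after_copy_destroyed = first.use_count();
    return counts;
}
```

The three recorded values are 1, 2 and 1. In multithreaded code use_count() is only a snapshot, so it is best treated as a debugging aid rather than a basis for program logic.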

#include <iostream>
#include <memory>
#include <string>
#include <thread>

// `bar` is a barrier shared by the four threads below; it is defined elsewhere.

void trade_nvda()
{
    auto foo = std::make_shared<int>(0);

    // Predictor threads
    auto predictor_fn = [](std::shared_ptr<int> foo, int prediction) {
        bar.arrive_and_wait();
        *foo = prediction;
    };
    std::thread { predictor_fn, foo, 13300 }.detach();
    std::thread { predictor_fn, foo, 13000 }.detach();

    // Trader threads
    auto trader_fn = [](std::shared_ptr<int> foo) {
        bar.arrive_and_wait();
        while(*foo == 0)
        {
        }
        std::cout << "B NVDA 2 " + std::to_string(*foo) + '\n' << std::flush;
    };
    std::thread { trader_fn, foo }.detach();
    std::thread { trader_fn, foo }.detach();
}

By using std::shared_ptr, we eliminate the memory leak from the previous example, because the int is deleted when the last owner goes away. However, this approach still has a problem: data races. The reference count itself is updated atomically, but the managed int is not protected, so multiple threads reading and writing it without synchronization is undefined behavior.
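
One possible way to remove the remaining data race, shown here only as a sketch and not as the fix the original exercise intends, is to make the shared value a std::atomic<int>. The function trade_once is hypothetical. It uses one predictor and one trader, and joins the threads instead of detaching them so that the result can be returned.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <thread>

// Sketch: one predictor thread publishes a price, one trader thread waits for it.
// `prediction` must be non-zero, because zero means "no price yet".
int trade_once(int prediction) {
    auto price = std::make_shared<std::atomic<int>>(0);
    int observed = 0;

    std::thread predictor([price, prediction] {
        price->store(prediction);              // atomic write: no data race
    });
    std::thread trader([price, &observed] {
        int seen = 0;
        while ((seen = price->load()) == 0) {  // atomic read until a price arrives
            std::this_thread::yield();
        }
        observed = seen;
    });

    predictor.join();   // joining (rather than detaching) keeps `observed` alive
    trader.join();
    return observed;
}
```

The shared_ptr still manages the lifetime, while the atomic makes the concurrent reads and writes of the value itself well defined. A std::mutex would be an alternative when more than a single integer has to be protected.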

Dereferencing a Shared Pointer

If you have a shared pointer a, using *a gives you a reference to the managed object (e.g., int&). This means you can read or modify the value just like with a regular pointer.

#include <iostream>
#include <memory>

int main() {
    std::shared_ptr<int> a = std::make_shared<int>(42);

    int value = *a;  // Dereference shared_ptr to read value
    std::cout << value << std::endl;  // Output: 42

    *a = 100;  // Modify the value through dereferencing
    std::cout << *a << std::endl;  // Output: 100

    return 0;
}

Accessing the Raw Pointer

Using shared_ptr.get() returns the underlying raw pointer (T*) without affecting the ownership. You can use it if a function requires a raw pointer, but make sure not to delete it manually.

#include <iostream>
#include <memory>

struct MyStruct {
    void sayHi() {
        std::cout << "Hello from MyStruct!" << std::endl;
    }
};

int main() {
    std::shared_ptr<MyStruct> a = std::make_shared<MyStruct>();

    (*a).sayHi();        // Call method using dereference
    a->sayHi();          // Preferred and concise
    a.get()->sayHi();    // Equivalent raw pointer access

    MyStruct* raw = a.get(); // Obtain raw pointer
    raw->sayHi();            // Use raw pointer

    return 0;
}

While .get() can be useful, always be cautious when mixing raw pointers with smart pointers to avoid double deletions or dangling pointers.

lvalue vs rvalue

In C++, an lvalue (locator value) refers to an object that has a persistent memory address. It can appear on the left-hand side of an assignment. An rvalue (read value), on the other hand, is a temporary object that doesn’t have a fixed memory location—it typically represents a value that’s about to expire.

You can think of lvalues as containers (variables with memory), while rvalues are the contents (like literals or temporary results). Once detached from a container, rvalues typically vanish quickly.

int x = 666;   // ok

In this example, 666 is an rvalue—it's a literal constant without a permanent address, stored in some temporary register while the program is running. The variable x is an lvalue because it refers to a location in memory. Assigning an rvalue to an lvalue is legal and very common.

int x = 10;   // x is an lvalue
x = 20;       // ok: assigning to an lvalue

int* p = &x;  // ok: can take address of lvalue

Here I'm grabbing the memory address of x and putting it into p, through the address-of operator &. It takes an lvalue argument and produces an rvalue. This is another perfectly legal operation: on the left side of the assignment we have an lvalue (a variable), on the right side an rvalue produced by the address-of operator.

int x = 5 + 10;  // 5 + 10 is an rvalue
x = 15;          // 15 is an rvalue

int* p = &(5 + 10); // ❌ error: cannot take address of an rvalue

Expressions like 5 + 10 yield rvalues, which are short-lived and can't be directly referenced. That's why you can't take the address of 5 + 10.

Common examples of rvalues include numeric literals like 42, results of expressions such as a + b, temporaries such as std::string("hi"), and the results of functions that return by value.

With C++11, rvalue references (using Type&&) and std::move allow efficient use of temporary objects.

void takeByValue(std::string s);       // makes a copy
void takeByLvalueRef(std::string& s);  // modifies the original
void takeByRvalueRef(std::string&& s); // can move from it
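
When both a const lvalue reference overload and an rvalue reference overload exist, the value category of the argument decides which one is chosen. The following sketch makes that selection visible; category and make_name are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical overload pair that reports which reference type was selected.
const char* category(const std::string&) { return "lvalue"; }
const char* category(std::string&&)      { return "rvalue"; }

// Returns by value, so a call expression such as make_name() is an rvalue.
std::string make_name() { return "temp"; }
```

Given std::string name = "Kenneth";, the call category(name) selects the first overload, while category(make_name()) and category(std::move(name)) select the second.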

Here's an example showing how std::move enables moving a resource rather than copying:

#include <iostream>
#include <string>

void takeByRvalueRef(std::string&& s) {
    std::cout << "Moved string: " << s << std::endl;
}

int main() {
    std::string name = "Kenneth";
    takeByRvalueRef(std::move(name));
    std::cout << "Original string after move: " << name << std::endl;
}

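In this example, std::move casts the variable name to an rvalue, allowing it to bind to the function's rvalue reference parameter. No copy is made, and the function is now permitted to move from the string. Because takeByRvalueRef only prints s and never actually moves from it, name still holds "Kenneth" afterwards; had the function moved from s, name would remain valid but its content would be unspecified.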
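
For contrast, here is a sketch of a function that really does take the contents of its argument. The name take_ownership is made up for this example.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical sink: inside the function the parameter `s` is a named variable,
// and therefore an lvalue, so a second std::move is needed to really move it.
std::string take_ownership(std::string&& s) {
    std::string stored = std::move(s);   // the move constructor runs here
    return stored;
}
```

After std::string kept = take_ownership(std::move(name));, kept holds the original text. The content of name is unspecified at that point, so the safe options are to assign it a new value or simply let it be destroyed.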

std::move in C++

In C++, std::move is a utility function that enables move semantics — a feature introduced in C++11 to efficiently transfer ownership of resources (such as dynamically allocated memory, file handles, or other system resources) from one object to another without performing a deep copy. This mechanism allows for significant performance improvements in situations where copying large or complex objects would be expensive.

Instead of duplicating data, move semantics transfers the underlying resources of one object to another. After the move, the original object remains in a valid but unspecified state, meaning it can be safely destroyed or reassigned but should not be relied upon for meaningful data.

Why is std::move needed?

By default, C++ uses copy semantics when assigning one object to another. Even if an object could be safely moved, the compiler does not automatically treat it as a temporary (rvalue) unless you explicitly indicate that intention. This is where std::move comes in — it performs a cast to an rvalue reference, signaling that the object's resources can be transferred instead of copied.

How std::move works

It's important to understand that std::move does not actually move anything by itself. Instead, it simply casts its argument into an rvalue reference — a type that can bind to temporary objects. Once this cast is done, the compiler is allowed to invoke a move constructor or move assignment operator, if the object type defines them.
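
The "cast only" nature of std::move can be checked with a short sketch. The function name cast_alone_leaves_source_intact is hypothetical.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// std::move only changes the type of the expression to an rvalue reference.
static_assert(std::is_same<decltype(std::move(std::declval<std::string&>())),
                           std::string&&>::value,
              "std::move yields an rvalue reference");

bool cast_alone_leaves_source_intact() {
    std::string source = "payload";
    std::string&& ref = std::move(source);  // a cast only: nothing is moved yet
    bool untouched = (source == "payload");
    std::string dest = std::move(ref);      // the move constructor does the transfer
    return untouched && dest == "payload";
}
```

Binding the result of std::move to a reference leaves source unchanged. The transfer happens only when a move constructor or move assignment operator actually receives the rvalue.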

For example, consider std::vector. Without move semantics, assigning one vector to another would require allocating new memory and copying each element individually. With std::move, the target vector can simply take ownership of the source vector's internal memory buffer, avoiding unnecessary data duplication.

#include <iostream>
#include <vector>
#include <utility> // For std::move

int main() {
    // Create a vector and add some elements
    std::vector<int> original = {1, 2, 3, 4, 5};

    // Move the original vector to a new one
    std::vector<int> moved = std::move(original);

    // 'original' is now in a valid but unspecified state
    std::cout << "Moved vector: ";
    for (int val : moved) {
        std::cout << val << " "; // Output: 1 2 3 4 5
    }

    // 'original' is now empty, as its contents were moved
    std::cout << "\nOriginal vector after move: ";
    for (int val : original) {
        std::cout << val << " "; // Output: Nothing, original is empty
    }

    return 0;
}

In the example above, the moved vector takes ownership of the memory previously held by original. The original vector remains valid and destructible, but it no longer owns any data. This transfer avoids the cost of copying elements, demonstrating how std::move helps improve performance in modern C++ applications.

Capturing state in lambda functions

In C++, lambda functions can access variables from the scope in which they are defined. This process is called capturing. You can capture variables either by value (making a copy) or by reference (linking directly to the original). The choice determines whether changes to the original variable are visible inside the lambda.

Capturing by value

Capture by value ([=] or [variable_name]) means the lambda receives its own copy of the variable. Any changes to the variable outside the lambda do not affect the captured copy. This is useful when you want the lambda to work with a snapshot of a variable's state at the moment the lambda was created.

int x = 10;
auto lambda_val = [x]() {
    // x is a copy of the original
    std::cout << x << std::endl; // Prints 10
};
x = 20; // Change original x
lambda_val(); // Still prints 10

Capturing by reference

Capture by reference ([&] or [&variable_name]) means the lambda holds a reference to the original variable. Any changes made inside the lambda will modify the original variable, and vice versa. This is useful when you want the lambda to always reflect the current state of the variable.

int y = 10;
auto lambda_ref = [&y]() {
    // y is a reference to the original
    y++;
    std::cout << y << std::endl; // Prints 11
};
lambda_ref();
std::cout << y << std::endl; // Also prints 11

In summary, capture by value gives you a safe, unchanging copy of a variable, while capture by reference allows the lambda to read and modify the original. The choice depends on whether you need isolation (value) or shared access (reference).
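
The two capture modes can be put side by side in a small sketch. The helper names make_counter and sum_by_reference are assumptions for illustration. The mutable keyword, not covered above, lets a lambda modify its own by-value copies.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// By value: the lambda owns a private copy of `start`; `mutable` lets it modify
// that copy, which stays usable after make_counter has returned.
std::function<int()> make_counter(int start) {
    return [start]() mutable { return start++; };
}

// By reference: the lambda updates `total` in the enclosing scope directly.
int sum_by_reference(const std::vector<int>& values) {
    int total = 0;
    std::for_each(values.begin(), values.end(), [&total](int v) { total += v; });
    return total;
}
```

Returning a lambda that captures a local variable by reference would leave it with a dangling reference once the function returns, so make_counter captures by value. sum_by_reference is safe because the lambda never outlives total.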