manus.im - game engine guide
March 28, 2025
Game Engine Guide
Research Phase
-
[x] Research game engine architecture
- [x] Explore procedural and data-oriented design principles
- [x] Compare with OOP approaches and identify performance benefits
- [x] Understand basic game engine components and their interactions
- [x] Research game loop implementations
-
[x] Research core components
- [x] 2D Rendering systems and techniques
- [x] Input handling mechanisms
- [x] Game loop implementations (fixed vs variable time steps)
- [x] Basic physics (AABB collision detection and responses)
- [x] Audio integration (sound effects and background music)
- [x] Resource management techniques
-
[x] Collect resources and references
- [x] Find articles and blog posts on non-OOP game engine design
- [x] Identify video tutorials for practical implementation
- [x] Locate open-source projects using non-OOP approaches
- [ ] Compile beginner-friendly project suggestions
-
[x] Research performance optimization techniques
- [x] Data-oriented design principles and examples
- [x] Efficient memory management strategies
- [x] Multithreading approaches for game engines
- [x] Profiling tools and techniques
- [x] Practical optimization examples
Content Creation Phase
-
[x] Create guide structure
- [x] Outline main sections and subsections
- [x] Plan progression of concepts (from basic to advanced)
- [x] Determine appropriate code examples
-
[ ] Write guide sections
- [ ] Introduction to game engine architecture
- [ ] Core components detailed explanations
- [ ] Resources and references compilation
- [ ] Project suggestions for hands-on learning
- [ ] Performance optimization techniques
-
[ ] Compile final document
- [ ] Integrate all sections into a cohesive guide
- [ ] Format document for readability
- [ ] Add diagrams or illustrations if needed
- [ ] Include code snippets and examples
Review and Delivery Phase
- [ ] Review the complete guide
- [ ] Check for technical accuracy
- [ ] Ensure beginner-friendliness
- [ ] Verify all resources are accessible
- [ ] Confirm all required topics are covered
A Comprehensive Beginner's Guide to Writing a Performant Game Engine Without OOP
Table of Contents
- Introduction to Game Engine Architecture
- Game Loop
- Rendering System
- Input Handling
- Physics System
- Audio System
- Resource Management
- Performance Optimization Techniques
- Recommended Resources
Introduction to Game Engine Architecture: A Non-OOP Approach
What is a Game Engine?
A game engine is a software framework designed to create and develop video games. It provides developers with a collection of tools, libraries, and systems that handle common game development tasks, allowing them to focus on the unique aspects of their games rather than reinventing fundamental technologies.
At its core, a game engine typically includes systems for:
- Rendering graphics
- Processing input
- Playing audio
- Simulating physics
- Managing game resources
- Controlling game logic and flow
While commercial engines like Unity, Unreal Engine, and Godot offer comprehensive solutions for game development, understanding how to build a game engine from scratch provides invaluable insights into game architecture, performance optimization, and low-level programming concepts.
Traditional OOP vs. Procedural/Data-Oriented Approaches
The Limitations of Object-Oriented Programming for Game Engines
Object-oriented programming (OOP) has been the dominant paradigm in software development for decades, including game development. OOP organizes code around "objects" that combine data and behavior, emphasizing concepts like encapsulation, inheritance, and polymorphism. While this approach offers benefits for code organization and reusability, it introduces several challenges for performance-critical applications like game engines:
-
Cache Inefficiency: OOP tends to scatter related data across memory due to its focus on encapsulating data within individual objects. This leads to poor cache utilization, as modern CPUs work most efficiently when processing contiguous blocks of memory.
-
Indirection Overhead: Virtual function calls, a cornerstone of polymorphism in OOP, require additional pointer dereferencing and can cause branch mispredictions in the CPU pipeline.
-
Inheritance Complexity: Deep inheritance hierarchies can lead to complex, difficult-to-understand code with non-obvious execution paths and performance characteristics.
-
Memory Fragmentation: Dynamic allocation of many small objects can lead to memory fragmentation, causing additional overhead for memory management.
-
Rigid Composition: Traditional OOP inheritance can make it difficult to compose behaviors flexibly, often leading to either too-specific base classes or the "diamond problem" in multiple inheritance scenarios.
Consider this typical OOP approach to game entities:
class GameObject {
public:
virtual void update(float deltaTime) = 0;
virtual void render() = 0;
};
class Player : public GameObject {
private:
Vector2 position;
Vector2 velocity;
Sprite sprite;
public:
void update(float deltaTime) override {
// Process input
// Update position based on velocity
position += velocity * deltaTime;
}
void render() override {
// Draw the player sprite
sprite.draw(position);
}
};
// Game loop
for (auto& gameObject : gameObjects) {
gameObject->update(deltaTime);
}
for (auto& gameObject : gameObjects) {
gameObject->render();
}
While this code is organized and intuitive, it can lead to performance issues in larger games due to cache misses, virtual function call overhead, and interleaved data access patterns.
Benefits of Data-Oriented Design
Data-Oriented Design (DOD) offers an alternative approach that prioritizes efficient data layout and processing over object hierarchies. The core principle is simple: organize your code around the data and how it flows through the system, rather than around objects and their behaviors.
Key benefits of DOD for game engines include:
-
Improved Cache Utilization: By storing similar data together in contiguous arrays, DOD maximizes cache hits and minimizes memory latency.
-
Predictable Performance: With fewer indirections and virtual calls, code behavior becomes more predictable and easier to optimize.
-
Better Parallelization: Arrays of data can be processed in parallel more easily, taking advantage of multi-core processors and SIMD instructions.
-
Reduced Memory Fragmentation: Bulk allocation of arrays reduces memory fragmentation compared to allocating many small objects.
-
Flexible Composition: Component-based approaches allow for more flexible composition of game entities without deep inheritance hierarchies.
Here's how the previous example might look using a data-oriented approach:
// Data structures
struct Position {
float x, y;
};
struct Velocity {
float x, y;
};
struct Sprite {
// Sprite data
TextureID textureId;
float width, height;
};
// Systems
class MovementSystem {
public:
void update(std::vector<Position>& positions,
const std::vector<Velocity>& velocities,
float deltaTime) {
for (size_t i = 0; i < positions.size(); ++i) {
positions[i].x += velocities[i].x * deltaTime;
positions[i].y += velocities[i].y * deltaTime;
}
}
};
class RenderSystem {
public:
void render(const std::vector<Position>& positions,
const std::vector<Sprite>& sprites) {
for (size_t i = 0; i < positions.size(); ++i) {
drawSprite(sprites[i].textureId, positions[i].x, positions[i].y,
sprites[i].width, sprites[i].height);
}
}
};
// Game loop
movementSystem.update(positions, velocities, deltaTime);
renderSystem.render(positions, sprites);
In this approach, data is organized by type rather than by entity, allowing for more efficient memory access patterns and better utilization of the CPU cache.
Key Components of a Game Engine
A game engine, whether using OOP or DOD principles, typically consists of several core components that work together to create a functioning game:
-
Game Loop: The central control mechanism that drives the game forward, handling timing, updates, and rendering.
-
Rendering System: Responsible for drawing graphics to the screen, managing visual assets, and handling transformations.
-
Input System: Processes user input from keyboards, mice, controllers, and other devices.
-
Physics System: Simulates physical interactions, including collision detection and response.
-
Audio System: Handles sound effects and music playback.
-
Resource Management: Loads, caches, and manages game assets like textures, sounds, and models.
-
Entity Management: Organizes game objects and their components (particularly important in data-oriented designs).
-
Memory Management: Controls allocation and deallocation of memory, often with custom allocators for performance.
How Components Interact in a Non-OOP Architecture
In a non-OOP architecture, components interact differently than in traditional object-oriented designs. Instead of objects calling methods on other objects, systems operate on data, with clear separation between data and behavior.
Entity Component System (ECS)
One popular non-OOP architecture for game engines is the Entity Component System (ECS), which consists of:
- Entities: Simple IDs that represent game objects
- Components: Pure data containers attached to entities
- Systems: Logic that processes components of specific types
In an ECS architecture, systems query for entities with specific component combinations and process them in bulk. This approach aligns well with data-oriented design principles and can lead to highly performant game engines.
Here's a simplified example of how components might interact in an ECS:
// Entity is just an ID
using Entity = uint32_t;
// Components are pure data
struct Position { float x, y; };
struct Velocity { float x, y; };
struct Renderable { TextureID texture; };
// Systems operate on components
class MovementSystem {
public:
void update(Registry& registry, float deltaTime) {
// Get all entities with both Position and Velocity components
auto view = registry.view<Position, Velocity>();
// Process them together
for (auto entity : view) {
auto& position = view.get<Position>(entity);
const auto& velocity = view.get<Velocity>(entity);
position.x += velocity.x * deltaTime;
position.y += velocity.y * deltaTime;
}
}
};
class RenderSystem {
public:
void render(Registry& registry) {
// Get all entities with both Position and Renderable components
auto view = registry.view<Position, Renderable>();
// Process them together
for (auto entity : view) {
const auto& position = view.get<Position>(entity);
const auto& renderable = view.get<Renderable>(entity);
drawTexture(renderable.texture, position.x, position.y);
}
}
};
// Game loop
movementSystem.update(registry, deltaTime);
renderSystem.render(registry);
Data Flow Architecture
Another approach is to organize the engine around data flow, where systems are arranged in a pipeline, each processing data and passing it to the next system. This can be particularly effective for rendering pipelines and other sequential processes.
// Data structures for each stage
struct InputData { /* ... */ };
struct PhysicsData { /* ... */ };
struct RenderData { /* ... */ };
// Pipeline stages
class InputSystem {
public:
PhysicsData process(const InputData& input, float deltaTime) {
PhysicsData output;
// Transform input data into physics data
return output;
}
};
class PhysicsSystem {
public:
RenderData process(const PhysicsData& physics, float deltaTime) {
RenderData output;
// Process physics and prepare render data
return output;
}
};
class RenderSystem {
public:
void process(const RenderData& renderData) {
// Render the scene
}
};
// Game loop
InputData inputData = gatherInput();
PhysicsData physicsData = inputSystem.process(inputData, deltaTime);
RenderData renderData = physicsSystem.process(physicsData, deltaTime);
renderSystem.process(renderData);
Conclusion
Understanding the architecture of a game engine is crucial for building performant games. While object-oriented programming has been the dominant paradigm, data-oriented design offers significant performance advantages by focusing on efficient data layout and processing.
In the following sections, we'll dive deeper into each core component of a game engine, exploring how to implement them using non-OOP approaches for maximum performance. We'll start with the game loop, the central mechanism that drives everything forward in a game engine.
Remember that the goal isn't to completely avoid OOP concepts—encapsulation and abstraction are still valuable—but rather to be mindful of data organization and processing efficiency, especially in performance-critical systems.
Game Loop
Understanding the Game Loop: The Heart of Your Engine
The game loop is the central control mechanism of any game engine. It's responsible for advancing the game state, processing input, updating game logic, and rendering the scene to the screen. A well-designed game loop ensures your game runs smoothly and consistently across different hardware.
At its most basic level, a game loop follows this pattern:
while (gameIsRunning) {
processInput();
updateGame();
renderScene();
}
This simple structure drives everything in your game, executing continuously until the game ends. However, implementing an efficient and robust game loop involves addressing several important considerations, particularly when aiming for a performant non-OOP design.
Fixed vs. Variable Time Steps
One of the most important decisions when implementing a game loop is how to handle time. There are two primary approaches:
Variable Time Step
With a variable time step, the game updates based on how much real time has passed since the last frame:
float lastFrameTime = getCurrentTime();
while (gameIsRunning) {
float currentTime = getCurrentTime();
float deltaTime = currentTime - lastFrameTime;
lastFrameTime = currentTime;
processInput();
updateGame(deltaTime);
renderScene();
}
Advantages:
- Adapts naturally to different hardware capabilities
- Fully utilizes available processing power
- Simple to implement
Disadvantages:
- Physics and gameplay can behave inconsistently at different frame rates
- Can lead to hard-to-reproduce bugs
- May cause temporal aliasing (jittering) in motion
Fixed Time Step
With a fixed time step, the game updates at consistent time intervals, regardless of how quickly frames are rendered:
float accumulator = 0.0f;
float fixedTimeStep = 1.0f / 60.0f; // 60 updates per second
float lastFrameTime = getCurrentTime();
while (gameIsRunning) {
float currentTime = getCurrentTime();
float frameTime = currentTime - lastFrameTime;
lastFrameTime = currentTime;
// Cap the frame time to avoid spiral of death
if (frameTime > 0.25f) {
frameTime = 0.25f;
}
accumulator += frameTime;
processInput();
// Update in fixed time steps
while (accumulator >= fixedTimeStep) {
updateGame(fixedTimeStep);
accumulator -= fixedTimeStep;
}
// Render with interpolation if needed
float alpha = accumulator / fixedTimeStep;
renderScene(alpha);
}
Advantages:
- Consistent and deterministic physics and gameplay
- Easier to debug and reproduce issues
- Can decouple update rate from render rate
Disadvantages:
- More complex to implement
- May require interpolation for smooth rendering
- Can waste CPU cycles on slower hardware
Semi-Fixed Time Step
A common compromise is the semi-fixed time step, which updates at a fixed rate but can skip updates if the system can't keep up:
float targetFrameTime = 1.0f / 60.0f; // Target 60 FPS
float lastFrameTime = getCurrentTime();
while (gameIsRunning) {
float currentTime = getCurrentTime();
float deltaTime = currentTime - lastFrameTime;
lastFrameTime = currentTime;
// Cap delta time to avoid spiral of death
if (deltaTime > 0.25f) {
deltaTime = 0.25f;
}
processInput();
updateGame(deltaTime);
renderScene();
// Sleep if we're running too fast
float frameEndTime = getCurrentTime();
float frameProcessTime = frameEndTime - currentTime;
if (frameProcessTime < targetFrameTime) {
sleep(targetFrameTime - frameProcessTime);
}
}
This approach balances consistency with adaptability, making it suitable for many games.
Implementation Approaches for Non-OOP Game Loops
When implementing a game loop in a non-OOP, data-oriented manner, we focus on efficient data flow and minimal overhead. Here are some approaches:
Function-Based Game Loop
A simple procedural approach uses plain functions to process game state:
struct GameState {
std::vector<Entity> entities;
InputState input;
float deltaTime;
// Other game state data
};
void processInput(GameState& state) {
// Process input and update state.input
}
void updateEntities(GameState& state) {
for (size_t i = 0; i < state.entities.size(); ++i) {
updateEntity(state.entities[i], state.input, state.deltaTime);
}
}
void renderEntities(const GameState& state) {
for (size_t i = 0; i < state.entities.size(); ++i) {
renderEntity(state.entities[i]);
}
}
int main() {
GameState state;
initializeGame(state);
float lastFrameTime = getCurrentTime();
while (!state.quit) {
float currentTime = getCurrentTime();
state.deltaTime = currentTime - lastFrameTime;
lastFrameTime = currentTime;
processInput(state);
updateEntities(state);
renderEntities(state);
}
cleanupGame(state);
return 0;
}
This approach is simple and avoids the overhead of virtual function calls, but it can become unwieldy as the game grows more complex.
System-Based Game Loop
A more structured approach organizes the game into systems that operate on data:
struct InputData { /* ... */ };
struct PhysicsData { /* ... */ };
struct RenderData { /* ... */ };
class InputSystem {
public:
void process(InputData& data) {
// Process input
}
};
class PhysicsSystem {
public:
void process(PhysicsData& data, float deltaTime) {
// Update physics
}
};
class RenderSystem {
public:
void process(const RenderData& data) {
// Render scene
}
};
int main() {
InputData inputData;
PhysicsData physicsData;
RenderData renderData;
InputSystem inputSystem;
PhysicsSystem physicsSystem;
RenderSystem renderSystem;
float lastFrameTime = getCurrentTime();
bool running = true;
while (running) {
float currentTime = getCurrentTime();
float deltaTime = currentTime - lastFrameTime;
lastFrameTime = currentTime;
inputSystem.process(inputData);
running = !inputData.quitRequested;
physicsSystem.process(physicsData, deltaTime);
// Copy necessary data from physics to render
updateRenderData(renderData, physicsData);
renderSystem.process(renderData);
}
return 0;
}
This approach provides better organization while maintaining data-oriented principles.
ECS-Based Game Loop
For a full Entity Component System (ECS), the game loop orchestrates system execution:
int main() {
Registry registry;
// Create systems
MovementSystem movementSystem;
CollisionSystem collisionSystem;
RenderSystem renderSystem;
// Initialize entities and components
initializeWorld(registry);
float lastFrameTime = getCurrentTime();
bool running = true;
while (running) {
float currentTime = getCurrentTime();
float deltaTime = currentTime - lastFrameTime;
lastFrameTime = currentTime;
// Process input
InputState input = processInput();
running = !input.quitRequested;
// Update systems
movementSystem.update(registry, input, deltaTime);
collisionSystem.update(registry, deltaTime);
// Render
renderSystem.render(registry);
}
return 0;
}
This approach scales well for complex games and maintains good data locality for performance.
Handling Frame Timing
Accurate timing is crucial for a smooth game experience. Here are some considerations:
High-Resolution Timers
Use the highest resolution timer available on your platform:
#include <chrono>
double getCurrentTime() {
static auto startTime = std::chrono::high_resolution_clock::now();
auto currentTime = std::chrono::high_resolution_clock::now();
return std::chrono::duration<double>(currentTime - startTime).count();
}
Frame Rate Limiting
To avoid wasting resources and battery life, implement frame rate limiting:
void limitFrameRate(float targetFPS) {
static double lastFrameTime = getCurrentTime();
double currentTime = getCurrentTime();
double frameTime = currentTime - lastFrameTime;
double targetFrameTime = 1.0 / targetFPS;
if (frameTime < targetFrameTime) {
double sleepTime = targetFrameTime - frameTime;
std::this_thread::sleep_for(std::chrono::duration<double>(sleepTime));
}
lastFrameTime = getCurrentTime();
}
Frame Time Smoothing
For more stable frame rates, consider smoothing frame times:
float getSmoothedDeltaTime() {
static const int FRAME_SAMPLES = 10;
static float frameTimes[FRAME_SAMPLES] = {0};
static int currentFrame = 0;
static float lastFrameTime = getCurrentTime();
float currentTime = getCurrentTime();
float deltaTime = currentTime - lastFrameTime;
lastFrameTime = currentTime;
// Store frame time
frameTimes[currentFrame] = deltaTime;
currentFrame = (currentFrame + 1) % FRAME_SAMPLES;
// Average frame times
float averageDeltaTime = 0;
for (int i = 0; i < FRAME_SAMPLES; ++i) {
averageDeltaTime += frameTimes[i];
}
averageDeltaTime /= FRAME_SAMPLES;
return averageDeltaTime;
}
Complete Game Loop Example
Here's a complete example of a fixed time step game loop with frame rate limiting, suitable for a data-oriented game engine:
#include <chrono>
#include <thread>
#include <vector>
#include <iostream>
// Time management
double getCurrentTime() {
static auto startTime = std::chrono::high_resolution_clock::now();
auto currentTime = std::chrono::high_resolution_clock::now();
return std::chrono::duration<double>(currentTime - startTime).count();
}
// Game state structures
struct Position { float x, y; };
struct Velocity { float x, y; };
struct Input { bool up, down, left, right, quit; };
// Game systems
void processInput(Input& input) {
// In a real game, you'd read from keyboard/gamepad here
// This is just a placeholder
input.quit = false;
}
void updatePositions(std::vector<Position>& positions,
const std::vector<Velocity>& velocities,
const Input& input,
float deltaTime) {
// Update all positions based on velocities
for (size_t i = 0; i < positions.size(); ++i) {
positions[i].x += velocities[i].x * deltaTime;
positions[i].y += velocities[i].y * deltaTime;
}
// Respond to input (in a real game, this would affect the player)
if (input.up) {
// Move player up
}
// Handle other input...
}
void render(const std::vector<Position>& positions) {
// In a real game, you'd render sprites at these positions
// This is just a placeholder
std::cout << "Rendering " << positions.size() << " entities\n";
}
int main() {
// Game state
std::vector<Position> positions = {{0, 0}, {10, 10}, {20, 20}};
std::vector<Velocity> velocities = {{1, 2}, {-1, 0}, {0, 1}};
Input input = {};
// Timing variables
const float fixedTimeStep = 1.0f / 60.0f; // 60 updates per second
const float maxFrameTime = 0.25f; // Maximum frame time to avoid spiral of death
float accumulator = 0.0f;
float lastFrameTime = getCurrentTime();
// Main game loop
bool running = true;
while (running) {
// Calculate frame time
float currentTime = getCurrentTime();
float frameTime = currentTime - lastFrameTime;
lastFrameTime = currentTime;
// Cap frame time
if (frameTime > maxFrameTime) {
frameTime = maxFrameTime;
}
// Add to accumulator
accumulator += frameTime;
// Process input
processInput(input);
running = !input.quit;
// Fixed update steps
while (accumulator >= fixedTimeStep) {
updatePositions(positions, velocities, input, fixedTimeStep);
accumulator -= fixedTimeStep;
}
// Render (could interpolate between states for smoother rendering)
render(positions);
// Limit frame rate if needed
float targetFrameTime = 1.0f / 60.0f;
float processTime = getCurrentTime() - currentTime;
if (processTime < targetFrameTime) {
std::this_thread::sleep_for(std::chrono::duration<double>(targetFrameTime - processTime));
}
}
return 0;
}
Performance Considerations
When implementing a game loop for a performant engine, keep these considerations in mind:
-
Minimize Allocations: Avoid allocating memory during the game loop. Pre-allocate all necessary memory during initialization.
-
Data Locality: Keep related data together to improve cache utilization. Use structures of arrays rather than arrays of structures when appropriate.
-
Batch Processing: Process similar operations together to maximize cache hits and enable SIMD optimizations.
-
Avoid Branching: Minimize conditional statements in hot loops to reduce branch mispredictions.
-
Profile Regularly: Measure the performance of your game loop to identify bottlenecks and optimize accordingly.
Conclusion
The game loop is the foundation of your game engine, driving everything from input processing to rendering. By implementing a well-designed loop with appropriate time step handling, you can ensure your game runs smoothly and consistently across different hardware.
In the next section, we'll explore the rendering system, which is responsible for drawing your game world to the screen.
Rendering System
Overview of 2D Rendering: Bringing Your Game to Life
The rendering system is responsible for displaying your game's visual elements on screen. For a 2D game engine, this involves drawing sprites, handling transformations, managing layers, and optimizing the drawing process for performance.
While 3D rendering involves complex pipelines with vertex and fragment shaders, 2D rendering is more straightforward but still requires careful design for performance. In this section, we'll focus on building a performant 2D rendering system using non-OOP approaches.
Graphics Pipeline Basics
Even in a 2D engine, understanding the basics of the graphics pipeline helps in designing an efficient renderer:
- Vertex Processing: Transforming vertex positions (corners of your sprites)
- Rasterization: Converting vector data to pixels
- Fragment Processing: Determining the color of each pixel
- Output Merging: Combining fragments with existing framebuffer data
Modern graphics APIs like OpenGL, DirectX, Vulkan, or Metal handle these steps, but you need to organize your rendering code to use them efficiently.
Sprite Rendering
Sprites are 2D images that represent game objects. A performant sprite rendering system needs to handle:
- Loading and storing textures
- Positioning sprites in the game world
- Applying transformations (translation, rotation, scaling)
- Batching similar sprites for efficient rendering
Texture Management
In a data-oriented approach, textures should be managed separately from the entities that use them:
struct Texture {
uint32_t id; // GPU texture ID
int width, height; // Dimensions
// Other texture properties
};
// Store textures in a central repository
std::vector<Texture> textures;
Sprite Data Structure
For a data-oriented design, separate the sprite data from rendering logic:
struct Sprite {
uint32_t textureIndex; // Index into the textures array
float u1, v1, u2, v2; // Texture coordinates (for sprite sheets)
};
struct Transform {
float x, y; // Position
float scaleX, scaleY; // Scale
float rotation; // Rotation in radians
float pivotX, pivotY; // Pivot point for rotation
};
// Store sprite and transform data in parallel arrays
std::vector<Sprite> sprites;
std::vector<Transform> transforms;
This organization keeps similar data together, improving cache coherency during rendering.
Batching Techniques
Batching is crucial for performance, as it reduces the number of draw calls sent to the GPU. Each draw call has overhead, so minimizing them is essential.
Sprite Batching
Group sprites that share the same texture and can be drawn with a single draw call:
void renderSprites(const std::vector<Sprite>& sprites,
const std::vector<Transform>& transforms,
const std::vector<Texture>& textures) {
// Sort sprites by texture to minimize texture switches
std::vector<size_t> drawOrder(sprites.size());
for (size_t i = 0; i < sprites.size(); ++i) {
drawOrder[i] = i;
}
std::sort(drawOrder.begin(), drawOrder.end(),
[&sprites](size_t a, size_t b) {
return sprites[a].textureIndex < sprites[b].textureIndex;
});
// Prepare batch data
std::vector<Vertex> vertices;
vertices.reserve(sprites.size() * 6); // 6 vertices per sprite (2 triangles)
uint32_t currentTexture = UINT32_MAX;
// Process sprites in texture-sorted order
for (size_t i = 0; i < drawOrder.size(); ++i) {
size_t spriteIndex = drawOrder[i];
const Sprite& sprite = sprites[spriteIndex];
const Transform& transform = transforms[spriteIndex];
// If texture changed, flush the batch
if (currentTexture != sprite.textureIndex && !vertices.empty()) {
flushBatch(vertices, textures[currentTexture]);
vertices.clear();
}
currentTexture = sprite.textureIndex;
// Add sprite vertices to batch
addSpriteToVertexBatch(vertices, sprite, transform);
}
// Flush any remaining vertices
if (!vertices.empty()) {
flushBatch(vertices, textures[currentTexture]);
}
}
Texture Atlases
Texture atlases combine multiple smaller textures into one large texture, further reducing texture switches:
struct AtlasRegion {
float u1, v1, u2, v2; // Normalized texture coordinates
int width, height; // Original dimensions
};
struct TextureAtlas {
uint32_t textureId;
std::vector<AtlasRegion> regions;
};
// Using an atlas
void renderWithAtlas(const TextureAtlas& atlas,
const std::vector<int>& regionIndices,
const std::vector<Transform>& transforms) {
std::vector<Vertex> vertices;
vertices.reserve(regionIndices.size() * 6);
// Bind atlas texture once
bindTexture(atlas.textureId);
// Add all sprites to a single batch
for (size_t i = 0; i < regionIndices.size(); ++i) {
const AtlasRegion& region = atlas.regions[regionIndices[i]];
const Transform& transform = transforms[i];
addRegionToVertexBatch(vertices, region, transform);
}
// Draw everything in one call
drawVertices(vertices);
}
Handling Transformations
Transformations position and orient sprites in the game world. In a 2D engine, we typically need to handle translation, rotation, and scaling.
Matrix Transformations
While matrices might seem object-oriented, they're actually just mathematical constructs that can be implemented in a data-oriented way:
struct Matrix3x3 {
float m[9]; // Row-major order
};
Matrix3x3 createTransformMatrix(const Transform& transform) {
// Create translation, rotation, and scale matrices
// Multiply them together to get the final transform
// Return the combined matrix
}
void applyTransform(std::vector<Vertex>& vertices,
const Matrix3x3& matrix,
size_t startIndex,
size_t count) {
for (size_t i = startIndex; i < startIndex + count; ++i) {
// Apply matrix to vertex position
float x = vertices[i].x;
float y = vertices[i].y;
vertices[i].x = matrix.m[0] * x + matrix.m[1] * y + matrix.m[2];
vertices[i].y = matrix.m[3] * x + matrix.m[4] * y + matrix.m[5];
}
}
SIMD Acceleration
For better performance, use SIMD (Single Instruction, Multiple Data) instructions to process multiple vertices at once:
#include <immintrin.h> // For SSE/AVX intrinsics
void applyTransformSIMD(std::vector<Vertex>& vertices,
const Matrix3x3& matrix,
size_t startIndex,
size_t count) {
// Load matrix values into SIMD registers
__m128 row0 = _mm_set_ps(0, matrix.m[2], matrix.m[1], matrix.m[0]);
__m128 row1 = _mm_set_ps(0, matrix.m[5], matrix.m[4], matrix.m[3]);
for (size_t i = startIndex; i < startIndex + count; i += 4) {
// Process 4 vertices at once
// Load x coordinates
__m128 x = _mm_set_ps(
vertices[i+3].x,
vertices[i+2].x,
vertices[i+1].x,
vertices[i].x
);
// Load y coordinates
__m128 y = _mm_set_ps(
vertices[i+3].y,
vertices[i+2].y,
vertices[i+1].y,
vertices[i].y
);
// Calculate new x coordinates
__m128 newX = _mm_add_ps(
_mm_add_ps(
_mm_mul_ps(_mm_shuffle_ps(row0, row0, _MM_SHUFFLE(0, 0, 0, 0)), x),
_mm_mul_ps(_mm_shuffle_ps(row0, row0, _MM_SHUFFLE(1, 1, 1, 1)), y)
),
_mm_shuffle_ps(row0, row0, _MM_SHUFFLE(2, 2, 2, 2))
);
// Calculate new y coordinates
__m128 newY = _mm_add_ps(
_mm_add_ps(
_mm_mul_ps(_mm_shuffle_ps(row1, row1, _MM_SHUFFLE(0, 0, 0, 0)), x),
_mm_mul_ps(_mm_shuffle_ps(row1, row1, _MM_SHUFFLE(1, 1, 1, 1)), y)
),
_mm_shuffle_ps(row1, row1, _MM_SHUFFLE(2, 2, 2, 2))
);
// Store results back
float xResults[4], yResults[4];
_mm_store_ps(xResults, newX);
_mm_store_ps(yResults, newY);
for (int j = 0; j < 4 && i+j < startIndex + count; ++j) {
vertices[i+j].x = xResults[j];
vertices[i+j].y = yResults[j];
}
}
}
Camera System
A camera defines what portion of the game world is visible on screen. In a 2D engine, this typically involves a view matrix that transforms world coordinates to screen coordinates.
Camera Data Structure
struct Camera {
float x, y; // Position
float width, height; // Viewport dimensions
float zoom; // Zoom level
float rotation; // Camera rotation
};
Matrix3x3 createViewMatrix(const Camera& camera) {
// Create a matrix that transforms from world to screen coordinates
// This is essentially the inverse of a transform matrix
Matrix3x3 result;
// Apply translation, rotation, and scale
// ...
return result;
}
Culling
To improve performance, only render sprites that are visible within the camera's view:
bool isVisible(const Transform& transform,
float spriteWidth,
float spriteHeight,
const Camera& camera) {
// Calculate sprite bounds in world space
float left = transform.x - transform.pivotX * transform.scaleX;
float right = left + spriteWidth * transform.scaleX;
float top = transform.y - transform.pivotY * transform.scaleY;
float bottom = top + spriteHeight * transform.scaleY;
// Calculate camera bounds in world space
float camLeft = camera.x - camera.width / 2.0f / camera.zoom;
float camRight = camera.x + camera.width / 2.0f / camera.zoom;
float camTop = camera.y - camera.height / 2.0f / camera.zoom;
float camBottom = camera.y + camera.height / 2.0f / camera.zoom;
// Check for overlap
return !(right < camLeft || left > camRight ||
bottom < camTop || top > camBottom);
}
Layering and Depth
In 2D games, layering determines which sprites appear in front of others. There are several approaches to handling layers:
Z-Order
Assign each sprite a z-order value and sort before rendering:
struct RenderableSprite {
size_t spriteIndex;
size_t transformIndex;
int zOrder;
};
void renderWithZOrder(const std::vector<Sprite>& sprites,
const std::vector<Transform>& transforms,
const std::vector<RenderableSprite>& renderables) {
// Sort renderables by z-order
std::vector<RenderableSprite> sortedRenderables = renderables;
std::sort(sortedRenderables.begin(), sortedRenderables.end(),
[](const RenderableSprite& a, const RenderableSprite& b) {
return a.zOrder < b.zOrder;
});
// Render in sorted order
for (const auto& renderable : sortedRenderables) {
renderSprite(sprites[renderable.spriteIndex],
transforms[renderable.transformIndex]);
}
}
Layer-Based Rendering
Group sprites by layer and render each layer separately:
struct Layer {
std::vector<size_t> spriteIndices;
int zOrder;
};
void renderLayers(const std::vector<Sprite>& sprites,
const std::vector<Transform>& transforms,
const std::vector<Layer>& layers) {
// Sort layers by z-order
std::vector<size_t> layerOrder(layers.size());
for (size_t i = 0; i < layers.size(); ++i) {
layerOrder[i] = i;
}
std::sort(layerOrder.begin(), layerOrder.end(),
[&layers](size_t a, size_t b) {
return layers[a].zOrder < layers[b].zOrder;
});
// Render each layer
for (size_t layerIndex : layerOrder) {
const Layer& layer = layers[layerIndex];
// Batch render all sprites in this layer
renderSpriteBatch(sprites, transforms, layer.spriteIndices);
}
}
Optimizing Rendering Performance
Spatial Partitioning
For large worlds with many sprites, use spatial partitioning to quickly determine which sprites are visible:
struct QuadTreeNode {
float x, y, width, height;
std::vector<size_t> spriteIndices;
std::array<std::unique_ptr<QuadTreeNode>, 4> children;
};
void insertIntoQuadTree(QuadTreeNode& node,
size_t spriteIndex,
const Transform& transform,
float spriteWidth,
float spriteHeight) {
// Insert sprite into appropriate quadtree node
// ...
}
void queryVisibleSprites(const QuadTreeNode& node,
const Camera& camera,
std::vector<size_t>& visibleSprites) {
// If node is not visible to camera, return
if (!isNodeVisible(node, camera)) {
return;
}
// Add sprites from this node
visibleSprites.insert(visibleSprites.end(),
node.spriteIndices.begin(),
node.spriteIndices.end());
// Recursively check children
for (const auto& child : node.children) {
if (child) {
queryVisibleSprites(*child, camera, visibleSprites);
}
}
}
Instanced Rendering
For sprites that share the same texture and properties, use instanced rendering:
void renderInstanced(const Sprite& sprite,
const std::vector<Transform>& transforms) {
// Set up sprite vertices once
// ...
// Create instance data (positions, scales, rotations)
std::vector<InstanceData> instances(transforms.size());
for (size_t i = 0; i < transforms.size(); ++i) {
instances[i].x = transforms[i].x;
instances[i].y = transforms[i].y;
instances[i].scaleX = transforms[i].scaleX;
instances[i].scaleY = transforms[i].scaleY;
instances[i].rotation = transforms[i].rotation;
}
// Upload instance data to GPU
// ...
// Draw all instances in a single call
drawInstanced(instances.size());
}
Render State Sorting
Minimize state changes by sorting render operations:
enum class RenderStateType {
Texture,
BlendMode,
Shader
// Other states
};
struct RenderState {
RenderStateType type;
uint32_t value;
};
struct RenderCommand {
std::vector<RenderState> states;
std::vector<size_t> spriteIndices;
};
void optimizeRenderCommands(std::vector<RenderCommand>& commands) {
// Sort commands to minimize state changes
std::sort(commands.begin(), commands.end(),
[](const RenderCommand& a, const RenderCommand& b) {
// Compare states in order of importance
// Texture changes are usually most expensive
for (size_t i = 0; i < std::min(a.states.size(), b.states.size()); ++i) {
if (a.states[i].type != b.states[i].type) {
return a.states[i].type < b.states[i].type;
}
if (a.states[i].value != b.states[i].value) {
return a.states[i].value < b.states[i].value;
}
}
return a.states.size() < b.states.size();
});
}
Complete 2D Renderer Example
Here's a simplified example of a data-oriented 2D renderer:
#include <vector>
#include <algorithm>
#include <cmath>
// Basic structures
struct Vertex {
float x, y; // Position
float u, v; // Texture coordinates
float r, g, b, a; // Color
};
struct Texture {
uint32_t id;
int width, height;
};
struct Sprite {
uint32_t textureIndex;
float u1, v1, u2, v2;
float width, height;
};
struct Transform {
float x, y;
float scaleX, scaleY;
float rotation;
float pivotX, pivotY;
};
struct Camera {
float x, y;
float width, height;
float zoom;
};
// Renderer class
class Renderer {
private:
std::vector<Texture> textures;
std::vector<Sprite> sprites;
std::vector<Transform> transforms;
std::vector<int> layers;
Camera camera;
// OpenGL/DirectX/etc. specific variables would go here
public:
// Initialize renderer
void initialize(int screenWidth, int screenHeight) {
// Set up graphics API
// ...
// Initialize camera
camera.x = 0.0f;
camera.y = 0.0f;
camera.width = static_cast<float>(screenWidth);
camera.height = static_cast<float>(screenHeight);
camera.zoom = 1.0f;
}
// Create a texture
uint32_t createTexture(const char* filename) {
Texture texture;
// Load texture from file
// ...
textures.push_back(texture);
return textures.size() - 1;
}
// Create a sprite
uint32_t createSprite(uint32_t textureIndex,
float u1 = 0.0f, float v1 = 0.0f,
float u2 = 1.0f, float v2 = 1.0f) {
Sprite sprite;
sprite.textureIndex = textureIndex;
sprite.u1 = u1;
sprite.v1 = v1;
sprite.u2 = u2;
sprite.v2 = v2;
sprite.width = (u2 - u1) * textures[textureIndex].width;
sprite.height = (v2 - v1) * textures[textureIndex].height;
sprites.push_back(sprite);
// Create default transform
Transform transform;
transform.x = 0.0f;
transform.y = 0.0f;
transform.scaleX = 1.0f;
transform.scaleY = 1.0f;
transform.rotation = 0.0f;
transform.pivotX = sprite.width / 2.0f;
transform.pivotY = sprite.height / 2.0f;
transforms.push_back(transform);
// Default layer
layers.push_back(0);
return sprites.size() - 1;
}
// Update sprite transform
void setTransform(uint32_t spriteIndex,
float x, float y,
float scaleX = 1.0f, float scaleY = 1.0f,
float rotation = 0.0f) {
if (spriteIndex >= transforms.size()) return;
transforms[spriteIndex].x = x;
transforms[spriteIndex].y = y;
transforms[spriteIndex].scaleX = scaleX;
transforms[spriteIndex].scaleY = scaleY;
transforms[spriteIndex].rotation = rotation;
}
// Set sprite layer
void setLayer(uint32_t spriteIndex, int layer) {
if (spriteIndex >= layers.size()) return;
layers[spriteIndex] = layer;
}
// Update camera
void setCamera(float x, float y, float zoom = 1.0f) {
camera.x = x;
camera.y = y;
camera.zoom = zoom;
}
// Check if sprite is visible
bool isVisible(uint32_t spriteIndex) {
if (spriteIndex >= sprites.size()) return false;
const Sprite& sprite = sprites[spriteIndex];
const Transform& transform = transforms[spriteIndex];
// Calculate sprite bounds
float left = transform.x - transform.pivotX * transform.scaleX;
float right = left + sprite.width * transform.scaleX;
float top = transform.y - transform.pivotY * transform.scaleY;
float bottom = top + sprite.height * transform.scaleY;
// Calculate camera bounds
float camLeft = camera.x - camera.width / 2.0f / camera.zoom;
float camRight = camera.x + camera.width / 2.0f / camera.zoom;
float camTop = camera.y - camera.height / 2.0f / camera.zoom;
float camBottom = camera.y + camera.height / 2.0f / camera.zoom;
// Check for overlap
return !(right < camLeft || left > camRight ||
bottom < camTop || top > camBottom);
}
// Render all visible sprites
void render() {
// Collect visible sprites
std::vector<uint32_t> visibleSprites;
for (uint32_t i = 0; i < sprites.size(); ++i) {
if (isVisible(i)) {
visibleSprites.push_back(i);
}
}
// Sort by layer and then by texture
std::sort(visibleSprites.begin(), visibleSprites.end(),
[this](uint32_t a, uint32_t b) {
if (layers[a] != layers[b]) {
return layers[a] < layers[b];
}
return sprites[a].textureIndex < sprites[b].textureIndex;
});
// Set up view matrix
// ...
// Batch rendering
uint32_t currentTexture = UINT32_MAX;
std::vector<Vertex> vertices;
for (uint32_t spriteIndex : visibleSprites) {
const Sprite& sprite = sprites[spriteIndex];
// If texture changed, flush batch
if (currentTexture != sprite.textureIndex && !vertices.empty()) {
// Bind texture and draw vertices
// ...
vertices.clear();
}
currentTexture = sprite.textureIndex;
// Add sprite vertices to batch
const Transform& transform = transforms[spriteIndex];
// Calculate vertex positions with transformation
// ...
// Add vertices to batch
// ...
}
// Flush any remaining vertices
if (!vertices.empty()) {
// Bind texture and draw vertices
// ...
}
}
};
Conclusion
A well-designed rendering system is crucial for a performant game engine. By organizing your renderer in a data-oriented way, you can achieve excellent performance while maintaining flexibility.
Key takeaways:
- Keep similar data together for better cache coherency
- Batch similar draw calls to reduce API overhead
- Use spatial partitioning to quickly identify visible sprites
- Sort render operations to minimize state changes
- Consider SIMD optimizations for transform calculations
In the next section, we'll explore input handling, which allows players to interact with your game.
Input Handling
Introduction to Input Systems: Connecting Players to Your Game
An input system is the bridge between the player and your game. It captures user actions from devices like keyboards, mice, and game controllers, then translates them into meaningful game events. A well-designed input system should be responsive, flexible, and efficient.
In a non-OOP approach, we'll focus on separating the input data from the processing logic, creating a system that's both performant and easy to extend.
Polling vs. Event-Based Input
There are two primary approaches to handling input in game engines:
Polling
Polling involves checking the state of input devices at regular intervals, typically once per frame. This approach is straightforward and works well for many games:
struct InputState {
// Keyboard state
bool keys[256] = {false};
// Mouse state
int mouseX = 0;
int mouseY = 0;
bool mouseButtons[3] = {false};
// Gamepad state
bool gamepadButtons[16] = {false};
float gamepadAxes[6] = {0.0f};
};
void updateInput(InputState& inputState) {
// Platform-specific code to update input state
// For example, using SDL:
SDL_Event event;
while (SDL_PollEvent(&event)) {
// Handle window events
if (event.type == SDL_QUIT) {
// Set quit flag
}
}
// Get keyboard state
const Uint8* keyboardState = SDL_GetKeyboardState(NULL);
for (int i = 0; i < 256; ++i) {
inputState.keys[i] = keyboardState[i] != 0;
}
// Get mouse state
int mouseButtons = SDL_GetMouseState(&inputState.mouseX, &inputState.mouseY);
inputState.mouseButtons[0] = (mouseButtons & SDL_BUTTON(1)) != 0;
inputState.mouseButtons[1] = (mouseButtons & SDL_BUTTON(2)) != 0;
inputState.mouseButtons[2] = (mouseButtons & SDL_BUTTON(3)) != 0;
// Get gamepad state
// ...
}
Advantages of Polling:
- Simple to implement and understand
- Consistent with the game loop's frame-based approach
- Easy to check for combinations of inputs
Disadvantages of Polling:
- May miss brief inputs that occur between polls
- Not ideal for text input or complex gestures
- Can be less efficient for inputs that rarely change
Event-Based
Event-based input responds to input events as they occur, using callbacks or an event queue:
struct InputEvent {
enum class Type {
KeyDown,
KeyUp,
MouseMove,
MouseButtonDown,
MouseButtonUp,
GamepadButtonDown,
GamepadButtonUp,
GamepadAxisMove
};
Type type;
union {
struct {
int keyCode;
} key;
struct {
int x, y;
int deltaX, deltaY;
} mouseMove;
struct {
int button;
int x, y;
} mouseButton;
struct {
int button;
} gamepadButton;
struct {
int axis;
float value;
} gamepadAxis;
};
};
class InputSystem {
private:
std::vector<InputEvent> eventQueue;
public:
void processEvents() {
// Platform-specific code to gather events
// For example, using SDL:
SDL_Event event;
while (SDL_PollEvent(&event)) {
InputEvent inputEvent;
switch (event.type) {
case SDL_KEYDOWN:
inputEvent.type = InputEvent::Type::KeyDown;
inputEvent.key.keyCode = event.key.keysym.scancode;
eventQueue.push_back(inputEvent);
break;
case SDL_KEYUP:
inputEvent.type = InputEvent::Type::KeyUp;
inputEvent.key.keyCode = event.key.keysym.scancode;
eventQueue.push_back(inputEvent);
break;
case SDL_MOUSEMOTION:
inputEvent.type = InputEvent::Type::MouseMove;
inputEvent.mouseMove.x = event.motion.x;
inputEvent.mouseMove.y = event.motion.y;
inputEvent.mouseMove.deltaX = event.motion.xrel;
inputEvent.mouseMove.deltaY = event.motion.yrel;
eventQueue.push_back(inputEvent);
break;
// Handle other event types...
}
}
}
bool pollEvent(InputEvent& event) {
if (eventQueue.empty()) {
return false;
}
event = eventQueue.front();
eventQueue.erase(eventQueue.begin());
return true;
}
};
Advantages of Event-Based:
- Captures all input events, even brief ones
- Better for text input and complex sequences
- Can be more efficient for rarely changing inputs
Disadvantages of Event-Based:
- More complex to implement
- Requires careful event queue management
- May need additional state tracking for continuous inputs
Hybrid Approach
Many game engines use a hybrid approach, combining polling for continuous inputs (like movement) with events for discrete actions (like button presses):
class InputSystem {
private:
InputState currentState;
InputState previousState;
std::vector<InputEvent> eventQueue;
public:
void update() {
// Save previous state
previousState = currentState;
// Clear event queue
eventQueue.clear();
// Process platform events
// ...
// Update current state
// ...
}
bool isKeyDown(int keyCode) const {
return currentState.keys[keyCode];
}
bool isKeyPressed(int keyCode) const {
return currentState.keys[keyCode] && !previousState.keys[keyCode];
}
bool isKeyReleased(int keyCode) const {
return !currentState.keys[keyCode] && previousState.keys[keyCode];
}
// Similar methods for mouse and gamepad
bool pollEvent(InputEvent& event) {
if (eventQueue.empty()) {
return false;
}
event = eventQueue.front();
eventQueue.erase(eventQueue.begin());
return true;
}
};
This approach gives you the best of both worlds: reliable state checking and detailed event information.
Mapping Inputs to Actions
In a well-designed input system, you want to abstract physical inputs (like key presses) into logical game actions (like "jump" or "shoot"). This makes your game more flexible and easier to maintain.
Action Mapping
Define a set of game actions and map input combinations to them:
enum class GameAction {
MoveLeft,
MoveRight,
MoveUp,
MoveDown,
Jump,
Shoot,
Pause,
// Add more actions as needed
};
struct InputBinding {
enum class Type {
Keyboard,
MouseButton,
MouseAxis,
GamepadButton,
GamepadAxis
};
Type type;
int code;
float scale = 1.0f;
// For axis inputs
bool isPositive = true;
float deadZone = 0.1f;
};
class InputMapper {
private:
std::unordered_map<GameAction, std::vector<InputBinding>> actionBindings;
public:
void bindAction(GameAction action, const InputBinding& binding) {
actionBindings[action].push_back(binding);
}
void clearBindings(GameAction action) {
actionBindings[action].clear();
}
bool isActionActive(GameAction action, const InputState& state) const {
const auto it = actionBindings.find(action);
if (it == actionBindings.end()) {
return false;
}
for (const auto& binding : it->second) {
switch (binding.type) {
case InputBinding::Type::Keyboard:
if (state.keys[binding.code]) {
return true;
}
break;
case InputBinding::Type::MouseButton:
if (state.mouseButtons[binding.code]) {
return true;
}
break;
case InputBinding::Type::GamepadButton:
if (state.gamepadButtons[binding.code]) {
return true;
}
break;
case InputBinding::Type::GamepadAxis:
{
float value = state.gamepadAxes[binding.code] * binding.scale;
if (binding.isPositive) {
if (value > binding.deadZone) {
return true;
}
} else {
if (value < -binding.deadZone) {
return true;
}
}
}
break;
// Handle other binding types...
}
}
return false;
}
float getActionValue(GameAction action, const InputState& state) const {
const auto it = actionBindings.find(action);
if (it == actionBindings.end()) {
return 0.0f;
}
float maxValue = 0.0f;
for (const auto& binding : it->second) {
float value = 0.0f;
switch (binding.type) {
case InputBinding::Type::Keyboard:
value = state.keys[binding.code] ? 1.0f : 0.0f;
break;
case InputBinding::Type::MouseButton:
value = state.mouseButtons[binding.code] ? 1.0f : 0.0f;
break;
case InputBinding::Type::GamepadButton:
value = state.gamepadButtons[binding.code] ? 1.0f : 0.0f;
break;
case InputBinding::Type::GamepadAxis:
{
value = state.gamepadAxes[binding.code] * binding.scale;
if (binding.isPositive) {
value = std::max(0.0f, value);
if (value < binding.deadZone) {
value = 0.0f;
}
} else {
value = std::min(0.0f, value);
if (value > -binding.deadZone) {
value = 0.0f;
} else {
value = -value; // Make positive
}
}
}
break;
// Handle other binding types...
}
maxValue = std::max(maxValue, value);
}
return maxValue;
}
};
Using the Input Mapper
With this system, you can set up bindings and check for actions in your game:
// Set up input bindings
InputMapper inputMapper;
// Bind keyboard controls
InputBinding leftBinding = {InputBinding::Type::Keyboard, SDL_SCANCODE_A};
InputBinding rightBinding = {InputBinding::Type::Keyboard, SDL_SCANCODE_D};
InputBinding jumpBinding = {InputBinding::Type::Keyboard, SDL_SCANCODE_SPACE};
inputMapper.bindAction(GameAction::MoveLeft, leftBinding);
inputMapper.bindAction(GameAction::MoveRight, rightBinding);
inputMapper.bindAction(GameAction::Jump, jumpBinding);
// Bind gamepad controls
InputBinding gamepadLeftBinding = {InputBinding::Type::GamepadAxis, 0, -1.0f, false};
InputBinding gamepadRightBinding = {InputBinding::Type::GamepadAxis, 0, 1.0f, true};
InputBinding gamepadJumpBinding = {InputBinding::Type::GamepadButton, 0};
inputMapper.bindAction(GameAction::MoveLeft, gamepadLeftBinding);
inputMapper.bindAction(GameAction::MoveRight, gamepadRightBinding);
inputMapper.bindAction(GameAction::Jump, gamepadJumpBinding);
// In the game loop
void update(float deltaTime) {
// Update input state
updateInput(inputState);
// Check for actions
float moveX = 0.0f;
if (inputMapper.isActionActive(GameAction::MoveLeft, inputState)) {
moveX -= inputMapper.getActionValue(GameAction::MoveLeft, inputState);
}
if (inputMapper.isActionActive(GameAction::MoveRight, inputState)) {
moveX += inputMapper.getActionValue(GameAction::MoveRight, inputState);
}
bool jump = inputMapper.isActionActive(GameAction::Jump, inputState);
// Use these values to update game state
player.velocity.x = moveX * player.speed;
if (jump && player.isOnGround) {
player.velocity.y = player.jumpForce;
}
}
This approach decouples the physical inputs from the game logic, making it easier to support different input devices and allow for control remapping.
Input State Management
Managing input state efficiently is crucial for a performant input system. Here are some techniques:
Double Buffering
Use double buffering to track both current and previous input states:
struct InputSystem {
InputState current;
InputState previous;
void update() {
// Swap states
previous = current;
// Update current state
// ...
}
bool isKeyPressed(int keyCode) const {
return current.keys[keyCode] && !previous.keys[keyCode];
}
bool isKeyReleased(int keyCode) const {
return !current.keys[keyCode] && previous.keys[keyCode];
}
};
Bit Sets for Key States
For keyboard state, use bit sets instead of boolean arrays for more efficient storage and operations:
#include <bitset>
struct InputState {
std::bitset<256> keys;
// Other input state...
};
// Check if a key is down
bool isKeyDown(const InputState& state, int keyCode) {
return state.keys[keyCode];
}
// Check if any key in a set is down
bool isAnyKeyDown(const InputState& state, const std::vector<int>& keyCodes) {
std::bitset<256> keyMask;
for (int keyCode : keyCodes) {
keyMask.set(keyCode);
}
return (state.keys & keyMask).any();
}
// Check if all keys in a set are down
bool areAllKeysDown(const InputState& state, const std::vector<int>& keyCodes) {
std::bitset<256> keyMask;
for (int keyCode : keyCodes) {
keyMask.set(keyCode);
}
return (state.keys & keyMask) == keyMask;
}
Input Contexts
For complex games, implement input contexts to handle different game states (e.g., gameplay, menu, inventory):
enum class InputContext {
Gameplay,
Menu,
Inventory,
Dialog,
// Add more contexts as needed
};
class InputSystem {
private:
std::unordered_map<InputContext, InputMapper> contextMappers;
InputContext activeContext = InputContext::Gameplay;
public:
void setActiveContext(InputContext context) {
activeContext = context;
}
InputContext getActiveContext() const {
return activeContext;
}
void bindAction(InputContext context, GameAction action, const InputBinding& binding) {
contextMappers[context].bindAction(action, binding);
}
bool isActionActive(GameAction action, const InputState& state) const {
const auto it = contextMappers.find(activeContext);
if (it == contextMappers.end()) {
return false;
}
return it->second.isActionActive(action, state);
}
float getActionValue(GameAction action, const InputState& state) const {
const auto it = contextMappers.find(activeContext);
if (it == contextMappers.end()) {
return 0.0f;
}
return it->second.getActionValue(action, state);
}
};
Handling Complex Input Scenarios
Text Input
For games that need text input (e.g., for character names or chat), use platform-specific text input methods:
struct TextInputState {
bool active = false;
std::string text;
int cursorPosition = 0;
};
void processTextInput(TextInputState& textInput, const InputEvent& event) {
if (!textInput.active) {
return;
}
if (event.type == InputEvent::Type::TextInput) {
// Insert text at cursor position
textInput.text.insert(textInput.cursorPosition, event.text.text);
textInput.cursorPosition += strlen(event.text.text);
} else if (event.type == InputEvent::Type::KeyDown) {
switch (event.key.keyCode) {
case SDL_SCANCODE_BACKSPACE:
if (textInput.cursorPosition > 0) {
textInput.text.erase(textInput.cursorPosition - 1, 1);
textInput.cursorPosition--;
}
break;
case SDL_SCANCODE_DELETE:
if (textInput.cursorPosition < textInput.text.length()) {
textInput.text.erase(textInput.cursorPosition, 1);
}
break;
case SDL_SCANCODE_LEFT:
if (textInput.cursorPosition > 0) {
textInput.cursorPosition--;
}
break;
case SDL_SCANCODE_RIGHT:
if (textInput.cursorPosition < textInput.text.length()) {
textInput.cursorPosition++;
}
break;
// Handle other special keys...
}
}
}
Gestures and Touch Input
For mobile games or touch interfaces, implement gesture recognition:
struct TouchPoint {
int id;
float x, y;
float startX, startY;
float deltaX, deltaY;
float time;
};
struct GestureState {
std::vector<TouchPoint> touches;
bool isPinching = false;
float pinchDistance = 0.0f;
float pinchScale = 1.0f;
bool isDragging = false;
float dragX = 0.0f;
float dragY = 0.0f;
bool isTapping = false;
int tapCount = 0;
};
void updateGestures(GestureState& gestures, const std::vector<InputEvent>& events) {
// Process touch events
for (const auto& event : events) {
if (event.type == InputEvent::Type::TouchDown) {
TouchPoint touch;
touch.id = event.touch.id;
touch.x = touch.startX = event.touch.x;
touch.y = touch.startY = event.touch.y;
touch.deltaX = touch.deltaY = 0.0f;
touch.time = getCurrentTime();
gestures.touches.push_back(touch);
} else if (event.type == InputEvent::Type::TouchMove) {
for (auto& touch : gestures.touches) {
if (touch.id == event.touch.id) {
touch.deltaX = event.touch.x - touch.x;
touch.deltaY = event.touch.y - touch.y;
touch.x = event.touch.x;
touch.y = event.touch.y;
break;
}
}
} else if (event.type == InputEvent::Type::TouchUp) {
for (auto it = gestures.touches.begin(); it != gestures.touches.end(); ++it) {
if (it->id == event.touch.id) {
// Check for tap
float touchTime = getCurrentTime() - it->time;
float distance = std::sqrt(std::pow(it->x - it->startX, 2) +
std::pow(it->y - it->startY, 2));
if (touchTime < 0.3f && distance < 10.0f) {
gestures.isTapping = true;
gestures.tapCount++;
}
gestures.touches.erase(it);
break;
}
}
}
}
// Update gesture states
if (gestures.touches.size() >= 2) {
// Calculate pinch
const auto& t1 = gestures.touches[0];
const auto& t2 = gestures.touches[1];
float currentDistance = std::sqrt(std::pow(t2.x - t1.x, 2) +
std::pow(t2.y - t1.y, 2));
if (!gestures.isPinching) {
gestures.isPinching = true;
gestures.pinchDistance = currentDistance;
gestures.pinchScale = 1.0f;
} else {
gestures.pinchScale = currentDistance / gestures.pinchDistance;
}
} else {
gestures.isPinching = false;
}
// Update drag state
if (gestures.touches.size() == 1) {
const auto& touch = gestures.touches[0];
if (!gestures.isDragging && (std::abs(touch.deltaX) > 5.0f ||
std::abs(touch.deltaY) > 5.0f)) {
gestures.isDragging = true;
}
if (gestures.isDragging) {
gestures.dragX = touch.deltaX;
gestures.dragY = touch.deltaY;
}
} else {
gestures.isDragging = false;
gestures.dragX = gestures.dragY = 0.0f;
}
// Reset tap state after one frame
if (gestures.isTapping) {
gestures.isTapping = false;
} else {
gestures.tapCount = 0;
}
}
Input Recording and Playback
For debugging or replay features, implement input recording and playback:
struct InputRecord {
float timestamp;
InputState state;
};
class InputRecorder {
private:
std::vector<InputRecord> records;
float startTime;
bool recording = false;
bool playing = false;
size_t playbackIndex = 0;
public:
void startRecording() {
records.clear();
startTime = getCurrentTime();
recording = true;
playing = false;
}
void stopRecording() {
recording = false;
}
void recordInput(const InputState& state) {
if (!recording) {
return;
}
InputRecord record;
record.timestamp = getCurrentTime() - startTime;
record.state = state;
records.push_back(record);
}
void startPlayback() {
if (records.empty()) {
return;
}
startTime = getCurrentTime();
playbackIndex = 0;
playing = true;
recording = false;
}
void stopPlayback() {
playing = false;
}
bool getPlaybackInput(InputState& state) {
if (!playing || records.empty()) {
return false;
}
float currentTime = getCurrentTime() - startTime;
// Find the appropriate record for the current time
while (playbackIndex < records.size() - 1 &&
records[playbackIndex + 1].timestamp <= currentTime) {
playbackIndex++;
}
if (playbackIndex >= records.size()) {
playing = false;
return false;
}
state = records[playbackIndex].state;
return true;
}
void saveToFile(const std::string& filename) {
// Save records to file
// ...
}
void loadFromFile(const std::string& filename) {
// Load records from file
// ...
}
};
Performance Optimization
Minimizing Input Overhead
To keep your input system performant:
-
Avoid Polling Too Frequently: For most games, updating input once per frame is sufficient.
-
Batch Input Processing: Process all input events at once rather than handling them individually.
-
Use Efficient Data Structures: Choose appropriate containers for your input state and event queue.
-
Minimize Allocations: Pre-allocate memory for event queues and avoid dynamic allocations during input processing.
-
Profile Input Processing: Measure the performance of your input system to identify bottlenecks.
SIMD for Input Processing
For games with many input sources, consider using SIMD instructions to process multiple inputs simultaneously:
#include <immintrin.h> // For SSE/AVX intrinsics
void updateGamepadAxes(float* axes, float* deadZones, float* outputs, int count) {
// Process 4 axes at once using SSE
for (int i = 0; i < count; i += 4) {
// Load 4 axes and deadzone values
__m128 axesVec = _mm_loadu_ps(&axes[i]);
__m128 deadZonesVec = _mm_loadu_ps(&deadZones[i]);
// Apply deadzone
__m128 absAxesVec = _mm_andnot_ps(_mm_set1_ps(-0.0f), axesVec); // Absolute value
__m128 maskVec = _mm_cmpgt_ps(absAxesVec, deadZonesVec);
__m128 resultVec = _mm_and_ps(axesVec, maskVec);
// Store results
_mm_storeu_ps(&outputs[i], resultVec);
}
}
Complete Input System Example
Here's a complete example of a data-oriented input system:
#include <vector>
#include <unordered_map>
#include <bitset>
#include <string>
#include <algorithm>
// Input state structures
struct InputState {
std::bitset<256> keys;
std::bitset<8> mouseButtons;
int mouseX = 0, mouseY = 0;
int mouseDeltaX = 0, mouseDeltaY = 0;
float gamepadAxes[6] = {0.0f};
std::bitset<16> gamepadButtons;
};
// Input event structure
struct InputEvent {
enum class Type {
KeyDown,
KeyUp,
MouseMove,
MouseButtonDown,
MouseButtonUp,
GamepadButtonDown,
GamepadButtonUp,
GamepadAxisMove,
TextInput
};
Type type;
float timestamp;
union {
struct {
int keyCode;
} key;
struct {
int x, y;
int deltaX, deltaY;
} mouseMove;
struct {
int button;
int x, y;
} mouseButton;
struct {
int button;
} gamepadButton;
struct {
int axis;
float value;
} gamepadAxis;
struct {
char text[32];
} text;
};
};
// Game action enum
enum class GameAction {
MoveLeft,
MoveRight,
MoveUp,
MoveDown,
Jump,
Shoot,
Interact,
Pause,
// Add more actions as needed
};
// Input binding structure
struct InputBinding {
enum class Type {
Keyboard,
MouseButton,
MouseAxis,
GamepadButton,
GamepadAxis
};
Type type;
int code;
float scale = 1.0f;
bool isPositive = true;
float deadZone = 0.1f;
};
// Input context enum
enum class InputContext {
Gameplay,
Menu,
Dialog,
// Add more contexts as needed
};
// Input system class
class InputSystem {
private:
// Input states
InputState currentState;
InputState previousState;
// Event queue
std::vector<InputEvent> eventQueue;
// Action mappings for different contexts
std::unordered_map<InputContext,
std::unordered_map<GameAction,
std::vector<InputBinding>>> contextBindings;
// Active context
InputContext activeContext = InputContext::Gameplay;
// Text input state
bool textInputActive = false;
std::string textInputBuffer;
// Recording/playback
std::vector<std::pair<float, InputState>> recordedStates;
bool recording = false;
bool playing = false;
float recordStartTime = 0.0f;
size_t playbackIndex = 0;
public:
// Initialize the input system
void initialize() {
// Platform-specific initialization
// ...
}
// Update input state
void update() {
// Save previous state
previousState = currentState;
// Clear event queue
eventQueue.clear();
if (playing) {
// Use recorded input if playing back
float currentTime = getCurrentTime() - recordStartTime;
while (playbackIndex < recordedStates.size() &&
recordedStates[playbackIndex].first <= currentTime) {
currentState = recordedStates[playbackIndex].second;
playbackIndex++;
}
if (playbackIndex >= recordedStates.size()) {
playing = false;
}
} else {
// Process platform events
processEvents();
// Record input if recording
if (recording) {
float currentTime = getCurrentTime() - recordStartTime;
recordedStates.push_back({currentTime, currentState});
}
}
}
// Process platform-specific events
void processEvents() {
// Platform-specific code to gather events
// For example, using SDL:
// ...
// Update current state based on events
// ...
}
// Set active input context
void setContext(InputContext context) {
activeContext = context;
}
// Bind an action to an input
void bindAction(InputContext context, GameAction action, const InputBinding& binding) {
contextBindings[context][action].push_back(binding);
}
// Clear all bindings for an action
void clearBindings(InputContext context, GameAction action) {
auto& contextMap = contextBindings[context];
auto it = contextMap.find(action);
if (it != contextMap.end()) {
it->second.clear();
}
}
// Check if an action is active
bool isActionActive(GameAction action) const {
auto contextIt = contextBindings.find(activeContext);
if (contextIt == contextBindings.end()) {
return false;
}
auto actionIt = contextIt->second.find(action);
if (actionIt == contextIt->second.end()) {
return false;
}
for (const auto& binding : actionIt->second) {
switch (binding.type) {
case InputBinding::Type::Keyboard:
if (currentState.keys[binding.code]) {
return true;
}
break;
case InputBinding::Type::MouseButton:
if (currentState.mouseButtons[binding.code]) {
return true;
}
break;
case InputBinding::Type::GamepadButton:
if (currentState.gamepadButtons[binding.code]) {
return true;
}
break;
case InputBinding::Type::GamepadAxis:
{
float value = currentState.gamepadAxes[binding.code] * binding.scale;
if (binding.isPositive) {
if (value > binding.deadZone) {
return true;
}
} else {
if (value < -binding.deadZone) {
return true;
}
}
}
break;
// Handle other binding types...
}
}
return false;
}
// Check if an action was just activated this frame
bool isActionJustActivated(GameAction action) const {
return isActionActive(action) && !wasActionActive(action);
}
// Check if an action was active in the previous frame
bool wasActionActive(GameAction action) const {
// Similar to isActionActive but using previousState
// ...
return false; // Placeholder
}
// Get the value of an action (for analog inputs)
float getActionValue(GameAction action) const {
auto contextIt = contextBindings.find(activeContext);
if (contextIt == contextBindings.end()) {
return 0.0f;
}
auto actionIt = contextIt->second.find(action);
if (actionIt == contextIt->second.end()) {
return 0.0f;
}
float maxValue = 0.0f;
for (const auto& binding : actionIt->second) {
float value = 0.0f;
switch (binding.type) {
case InputBinding::Type::Keyboard:
value = currentState.keys[binding.code] ? 1.0f : 0.0f;
break;
case InputBinding::Type::MouseButton:
value = currentState.mouseButtons[binding.code] ? 1.0f : 0.0f;
break;
case InputBinding::Type::GamepadButton:
value = currentState.gamepadButtons[binding.code] ? 1.0f : 0.0f;
break;
case InputBinding::Type::GamepadAxis:
{
value = currentState.gamepadAxes[binding.code] * binding.scale;
if (binding.isPositive) {
value = std::max(0.0f, value);
if (value < binding.deadZone) {
value = 0.0f;
} else {
// Normalize value after deadzone
value = (value - binding.deadZone) / (1.0f - binding.deadZone);
}
} else {
value = std::min(0.0f, value);
if (value > -binding.deadZone) {
value = 0.0f;
} else {
// Normalize value after deadzone and make positive
value = (-value - binding.deadZone) / (1.0f - binding.deadZone);
}
}
}
break;
// Handle other binding types...
}
maxValue = std::max(maxValue, value);
}
return maxValue;
}
// Get mouse position
void getMousePosition(int& x, int& y) const {
x = currentState.mouseX;
y = currentState.mouseY;
}
// Get mouse delta
void getMouseDelta(int& deltaX, int& deltaY) const {
deltaX = currentState.mouseDeltaX;
deltaY = currentState.mouseDeltaY;
}
// Start text input
void startTextInput() {
textInputActive = true;
textInputBuffer.clear();
// Platform-specific text input start
// ...
}
// Stop text input
void stopTextInput() {
textInputActive = false;
// Platform-specific text input stop
// ...
}
// Get text input buffer
const std::string& getTextInput() const {
return textInputBuffer;
}
// Start input recording
void startRecording() {
recordedStates.clear();
recordStartTime = getCurrentTime();
recording = true;
playing = false;
}
// Stop recording
void stopRecording() {
recording = false;
}
// Start playback
void startPlayback() {
if (recordedStates.empty()) {
return;
}
recordStartTime = getCurrentTime();
playbackIndex = 0;
playing = true;
recording = false;
}
// Stop playback
void stopPlayback() {
playing = false;
}
// Save recorded input to file
void saveRecording(const std::string& filename) {
// Save recordedStates to file
// ...
}
// Load recorded input from file
void loadRecording(const std::string& filename) {
// Load recordedStates from file
// ...
}
private:
// Helper function to get current time
float getCurrentTime() const {
// Platform-specific time function
// ...
return 0.0f; // Placeholder
}
};
Conclusion
A well-designed input system is essential for creating responsive and flexible games. By separating input data from processing logic and using a data-oriented approach, you can create an input system that's both performant and easy to extend.
Key takeaways:
- Choose between polling, event-based, or hybrid approaches based on your game's needs
- Abstract physical inputs into logical game actions for flexibility
- Use efficient data structures like bitsets for input state
- Implement input contexts for different game states
- Optimize for performance by minimizing allocations and batching operations
In the next section, we'll explore basic physics systems, focusing on collision detection and response in a non-OOP manner.
Basic Physics
Introduction to Game Physics: Collision Detection and Response
Physics systems are essential components of game engines, responsible for simulating physical interactions between game objects. For a beginner-friendly 2D game engine, we'll focus on the most fundamental aspects: collision detection and response.
Unlike 3D physics which can be complex and computationally expensive, 2D physics is more approachable while still providing engaging gameplay. In this section, we'll implement a performant, non-OOP physics system that handles basic collisions efficiently.
Axis-Aligned Bounding Box (AABB) Collision Detection
The simplest and most efficient collision detection method for 2D games is the Axis-Aligned Bounding Box (AABB). As the name suggests, these are rectangles whose edges are aligned with the coordinate axes.
AABB Data Structure
In a data-oriented approach, we'll store the AABB data separately from the entities they represent:
struct AABB {
float minX, minY;
float maxX, maxY;
};
// Store AABBs in a contiguous array
std::vector<AABB> colliders;
AABB Collision Detection
Detecting collisions between AABBs is straightforward:
bool checkAABBCollision(const AABB& a, const AABB& b) {
// Check for overlap on both axes
return (a.maxX >= b.minX && a.minX <= b.maxX &&
a.maxY >= b.minY && a.minY <= b.maxY);
}
This simple check is extremely efficient and can be further optimized with SIMD instructions for batch processing:
#include <immintrin.h> // For SSE/AVX intrinsics
bool checkAABBCollisionSIMD(const AABB& a, const AABB& b) {
// Load AABB values into SIMD registers
__m128 aBox = _mm_set_ps(a.maxY, a.minY, a.maxX, a.minX);
__m128 bBox = _mm_set_ps(b.maxY, b.minY, b.maxX, b.minX);
// Shuffle bBox to get (b.minX, b.maxX, b.minY, b.maxY)
__m128 bBoxShuffled = _mm_shuffle_ps(bBox, bBox, _MM_SHUFFLE(3, 2, 1, 0));
// Compare a.maxX >= b.minX, a.minX <= b.maxX, a.maxY >= b.minY, a.minY <= b.maxY
__m128 cmp1 = _mm_cmpge_ps(_mm_shuffle_ps(aBox, aBox, _MM_SHUFFLE(1, 1, 3, 3)),
_mm_shuffle_ps(bBoxShuffled, bBoxShuffled, _MM_SHUFFLE(0, 0, 0, 0)));
__m128 cmp2 = _mm_cmple_ps(_mm_shuffle_ps(aBox, aBox, _MM_SHUFFLE(0, 0, 0, 0)),
_mm_shuffle_ps(bBoxShuffled, bBoxShuffled, _MM_SHUFFLE(1, 1, 1, 1)));
__m128 cmp3 = _mm_cmpge_ps(_mm_shuffle_ps(aBox, aBox, _MM_SHUFFLE(3, 3, 3, 3)),
_mm_shuffle_ps(bBoxShuffled, bBoxShuffled, _MM_SHUFFLE(2, 2, 2, 2)));
__m128 cmp4 = _mm_cmple_ps(_mm_shuffle_ps(aBox, aBox, _MM_SHUFFLE(2, 2, 2, 2)),
_mm_shuffle_ps(bBoxShuffled, bBoxShuffled, _MM_SHUFFLE(3, 3, 3, 3)));
// Combine results with logical AND
__m128 result = _mm_and_ps(_mm_and_ps(cmp1, cmp2), _mm_and_ps(cmp3, cmp4));
// Extract result
return _mm_movemask_ps(result) & 0x1;
}
Updating AABBs
AABBs need to be updated when entities move:
void updateAABB(AABB& aabb, float x, float y, float width, float height) {
aabb.minX = x - width / 2.0f;
aabb.minY = y - height / 2.0f;
aabb.maxX = x + width / 2.0f;
aabb.maxY = y + height / 2.0f;
}
For batch updates:
void updateAABBs(std::vector<AABB>& aabbs,
const std::vector<float>& positionsX,
const std::vector<float>& positionsY,
const std::vector<float>& widths,
const std::vector<float>& heights) {
for (size_t i = 0; i < aabbs.size(); ++i) {
updateAABB(aabbs[i], positionsX[i], positionsY[i], widths[i], heights[i]);
}
}
Collision Response
Once a collision is detected, the next step is to respond appropriately. Collision responses vary depending on the game, but common approaches include:
Separation (Resolving Overlaps)
When two objects collide, we often want to move them apart to prevent overlapping:
void resolveAABBCollision(AABB& a, AABB& b,
float& aVelX, float& aVelY,
float& bVelX, float& bVelY,
float aInvMass, float bInvMass) {
// Calculate overlap on each axis
float overlapX = std::min(a.maxX - b.minX, b.maxX - a.minX);
float overlapY = std::min(a.maxY - b.minY, b.maxY - a.minY);
// Determine which axis has the smaller overlap
if (overlapX < overlapY) {
// Resolve on X axis
float totalMass = aInvMass + bInvMass;
if (totalMass > 0) {
float aRatio = aInvMass / totalMass;
float bRatio = bInvMass / totalMass;
// Determine direction
float direction = (a.minX + (a.maxX - a.minX) / 2.0f) <
(b.minX + (b.maxX - b.minX) / 2.0f) ? -1.0f : 1.0f;
// Move objects apart
float aMoveX = direction * overlapX * aRatio;
float bMoveX = -direction * overlapX * bRatio;
a.minX += aMoveX;
a.maxX += aMoveX;
b.minX += bMoveX;
b.maxX += bMoveX;
// Bounce velocities
float relativeVelX = aVelX - bVelX;
float impulse = relativeVelX * 1.5f; // 1.5 = elasticity + 1
aVelX -= impulse * bRatio;
bVelX += impulse * aRatio;
}
} else {
// Resolve on Y axis (similar to X axis)
float totalMass = aInvMass + bInvMass;
if (totalMass > 0) {
float aRatio = aInvMass / totalMass;
float bRatio = bInvMass / totalMass;
// Determine direction
float direction = (a.minY + (a.maxY - a.minY) / 2.0f) <
(b.minY + (b.maxY - b.minY) / 2.0f) ? -1.0f : 1.0f;
// Move objects apart
float aMoveY = direction * overlapY * aRatio;
float bMoveY = -direction * overlapY * bRatio;
a.minY += aMoveY;
a.maxY += aMoveY;
b.minY += bMoveY;
b.maxY += bMoveY;
// Bounce velocities
float relativeVelY = aVelY - bVelY;
float impulse = relativeVelY * 1.5f; // 1.5 = elasticity + 1
aVelY -= impulse * bRatio;
bVelY += impulse * aRatio;
}
}
}
Collision Events
For gameplay purposes, we often want to trigger events when collisions occur:
struct CollisionEvent {
size_t entityA;
size_t entityB;
float impulse;
};
std::vector<CollisionEvent> collisionEvents;
void processCollisionEvents(const std::vector<CollisionEvent>& events) {
for (const auto& event : events) {
// Handle each collision event based on entity types
// For example, damage calculation, sound effects, etc.
}
}
Spatial Partitioning
As the number of entities increases, checking every possible pair for collisions becomes inefficient (O(n²) complexity). Spatial partitioning techniques divide the game world into regions to reduce the number of checks.
Grid-Based Partitioning
A simple approach is to divide the world into a grid:
struct SpatialGrid {
float cellSize;
int gridWidth, gridHeight;
std::vector<std::vector<size_t>> cells; // Each cell contains entity indices
SpatialGrid(float worldWidth, float worldHeight, float cellSize)
: cellSize(cellSize) {
gridWidth = static_cast<int>(std::ceil(worldWidth / cellSize));
gridHeight = static_cast<int>(std::ceil(worldHeight / cellSize));
cells.resize(gridWidth * gridHeight);
}
void clear() {
for (auto& cell : cells) {
cell.clear();
}
}
int getCellIndex(float x, float y) const {
int cellX = static_cast<int>(x / cellSize);
int cellY = static_cast<int>(y / cellSize);
// Clamp to grid bounds
cellX = std::max(0, std::min(cellX, gridWidth - 1));
cellY = std::max(0, std::min(cellY, gridHeight - 1));
return cellY * gridWidth + cellX;
}
void insertEntity(size_t entityIndex, const AABB& aabb) {
// Get cell indices for the AABB corners
int minCellX = static_cast<int>(aabb.minX / cellSize);
int minCellY = static_cast<int>(aabb.minY / cellSize);
int maxCellX = static_cast<int>(aabb.maxX / cellSize);
int maxCellY = static_cast<int>(aabb.maxY / cellSize);
// Clamp to grid bounds
minCellX = std::max(0, std::min(minCellX, gridWidth - 1));
minCellY = std::max(0, std::min(minCellY, gridHeight - 1));
maxCellX = std::max(0, std::min(maxCellX, gridWidth - 1));
maxCellY = std::max(0, std::min(maxCellY, gridHeight - 1));
// Add entity to all cells it overlaps
for (int y = minCellY; y <= maxCellY; ++y) {
for (int x = minCellX; x <= maxCellX; ++x) {
int cellIndex = y * gridWidth + x;
cells[cellIndex].push_back(entityIndex);
}
}
}
void getPotentialCollisions(std::vector<std::pair<size_t, size_t>>& collisionPairs) {
// For each cell
for (const auto& cell : cells) {
// Check all entity pairs in this cell
for (size_t i = 0; i < cell.size(); ++i) {
for (size_t j = i + 1; j < cell.size(); ++j) {
// Add pair to collision check list
// (avoid duplicates if entities span multiple cells)
if (cell[i] < cell[j]) {
collisionPairs.emplace_back(cell[i], cell[j]);
} else {
collisionPairs.emplace_back(cell[j], cell[i]);
}
}
}
}
// Remove duplicates
std::sort(collisionPairs.begin(), collisionPairs.end());
collisionPairs.erase(std::unique(collisionPairs.begin(), collisionPairs.end()),
collisionPairs.end());
}
};
Quadtree Partitioning
For more dynamic scenes, a quadtree can be more efficient:
struct QuadTreeNode {
AABB bounds;
std::vector<size_t> entities;
std::array<std::unique_ptr<QuadTreeNode>, 4> children;
static const size_t MAX_ENTITIES = 8;
static const int MAX_DEPTH = 5;
QuadTreeNode(const AABB& bounds) : bounds(bounds) {}
bool isLeaf() const {
return !children[0];
}
void insert(size_t entityIndex, const AABB& entityBounds, int depth = 0) {
// If entity doesn't fit in this node, ignore it
if (!checkAABBCollision(bounds, entityBounds)) {
return;
}
// If this is a leaf node with space or max depth reached, add entity here
if (isLeaf() && (entities.size() < MAX_ENTITIES || depth >= MAX_DEPTH)) {
entities.push_back(entityIndex);
return;
}
// If this is a leaf node that's full, split it
if (isLeaf()) {
split();
// Redistribute existing entities
for (size_t oldEntity : entities) {
insertToChildren(oldEntity, depth + 1);
}
entities.clear();
}
// Insert to children
insertToChildren(entityIndex, depth + 1);
}
void split() {
float midX = (bounds.minX + bounds.maxX) / 2.0f;
float midY = (bounds.minY + bounds.maxY) / 2.0f;
// Create four child nodes
children[0] = std::make_unique<QuadTreeNode>(
AABB{bounds.minX, bounds.minY, midX, midY}); // Top-left
children[1] = std::make_unique<QuadTreeNode>(
AABB{midX, bounds.minY, bounds.maxX, midY}); // Top-right
children[2] = std::make_unique<QuadTreeNode>(
AABB{bounds.minX, midY, midX, bounds.maxY}); // Bottom-left
children[3] = std::make_unique<QuadTreeNode>(
AABB{midX, midY, bounds.maxX, bounds.maxY}); // Bottom-right
}
void insertToChildren(size_t entityIndex, int depth) {
for (auto& child : children) {
child->insert(entityIndex, entityBounds[entityIndex], depth);
}
}
void getPotentialCollisions(std::vector<std::pair<size_t, size_t>>& collisionPairs) {
// Check for collisions within this node
for (size_t i = 0; i < entities.size(); ++i) {
for (size_t j = i + 1; j < entities.size(); ++j) {
collisionPairs.emplace_back(entities[i], entities[j]);
}
}
// If not a leaf, check children
if (!isLeaf()) {
for (auto& child : children) {
child->getPotentialCollisions(collisionPairs);
}
}
}
};
Simple Physics Simulation
Now let's combine these components into a simple physics system:
class PhysicsSystem {
private:
std::vector<AABB> colliders;
std::vector<float> positionsX;
std::vector<float> positionsY;
std::vector<float> velocitiesX;
std::vector<float> velocitiesY;
std::vector<float> masses; // 0 = static object
std::vector<bool> isStatic;
SpatialGrid grid;
std::vector<CollisionEvent> collisionEvents;
public:
PhysicsSystem(float worldWidth, float worldHeight, float cellSize)
: grid(worldWidth, worldHeight, cellSize) {}
size_t createEntity(float x, float y, float width, float height,
float mass = 1.0f, float velX = 0.0f, float velY = 0.0f) {
AABB aabb;
updateAABB(aabb, x, y, width, height);
colliders.push_back(aabb);
positionsX.push_back(x);
positionsY.push_back(y);
velocitiesX.push_back(velX);
velocitiesY.push_back(velY);
if (mass <= 0.0f) {
masses.push_back(0.0f);
isStatic.push_back(true);
} else {
masses.push_back(mass);
isStatic.push_back(false);
}
return colliders.size() - 1;
}
void update(float deltaTime) {
// Apply gravity and update positions
for (size_t i = 0; i < colliders.size(); ++i) {
if (isStatic[i]) continue;
// Apply gravity
velocitiesY[i] += 9.8f * deltaTime;
// Update position
positionsX[i] += velocitiesX[i] * deltaTime;
positionsY[i] += velocitiesY[i] * deltaTime;
// Update AABB
float halfWidth = (colliders[i].maxX - colliders[i].minX) / 2.0f;
float halfHeight = (colliders[i].maxY - colliders[i].minY) / 2.0f;
updateAABB(colliders[i], positionsX[i], positionsY[i], halfWidth * 2.0f, halfHeight * 2.0f);
}
// Clear collision events
collisionEvents.clear();
// Update spatial grid
grid.clear();
for (size_t i = 0; i < colliders.size(); ++i) {
grid.insertEntity(i, colliders[i]);
}
// Get potential collisions
std::vector<std::pair<size_t, size_t>> potentialCollisions;
grid.getPotentialCollisions(potentialCollisions);
// Check and resolve collisions
for (const auto& pair : potentialCollisions) {
size_t a = pair.first;
size_t b = pair.second;
if (checkAABBCollision(colliders[a], colliders[b])) {
// Calculate inverse masses (0 for static objects)
float invMassA = isStatic[a] ? 0.0f : 1.0f / masses[a];
float invMassB = isStatic[b] ? 0.0f : 1.0f / masses[b];
// Skip if both objects are static
if (invMassA == 0.0f && invMassB == 0.0f) continue;
// Resolve collision
resolveAABBCollision(colliders[a], colliders[b],
velocitiesX[a], velocitiesY[a],
velocitiesX[b], velocitiesY[b],
invMassA, invMassB);
// Update positions based on new AABB positions
if (!isStatic[a]) {
positionsX[a] = (colliders[a].minX + colliders[a].maxX) / 2.0f;
positionsY[a] = (colliders[a].minY + colliders[a].maxY) / 2.0f;
}
if (!isStatic[b]) {
positionsX[b] = (colliders[b].minX + colliders[b].maxX) / 2.0f;
positionsY[b] = (colliders[b].minY + colliders[b].maxY) / 2.0f;
}
// Create collision event
float relVelX = velocitiesX[a] - velocitiesX[b];
float relVelY = velocitiesY[a] - velocitiesY[b];
float impulse = std::sqrt(relVelX * relVelX + relVelY * relVelY);
collisionEvents.push_back({a, b, impulse});
}
}
}
const std::vector<CollisionEvent>& getCollisionEvents() const {
return collisionEvents;
}
// Getters and setters for entity properties
void setPosition(size_t entity, float x, float y) {
if (entity >= positionsX.size()) return;
positionsX[entity] = x;
positionsY[entity] = y;
float halfWidth = (colliders[entity].maxX - colliders[entity].minX) / 2.0f;
float halfHeight = (colliders[entity].maxY - colliders[entity].minY) / 2.0f;
updateAABB(colliders[entity], x, y, halfWidth * 2.0f, halfHeight * 2.0f);
}
void setVelocity(size_t entity, float vx, float vy) {
if (entity >= velocitiesX.size() || isStatic[entity]) return;
velocitiesX[entity] = vx;
velocitiesY[entity] = vy;
}
void getPosition(size_t entity, float& x, float& y) const {
if (entity >= positionsX.size()) return;
x = positionsX[entity];
y = positionsY[entity];
}
void getVelocity(size_t entity, float& vx, float& vy) const {
if (entity >= velocitiesX.size()) return;
vx = velocitiesX[entity];
vy = velocitiesY[entity];
}
};
Ray Casting
Ray casting is useful for line-of-sight checks, bullet physics, and more:
struct Ray {
float originX, originY;
float directionX, directionY;
};
struct RaycastHit {
bool hit;
float distance;
size_t entityIndex;
float normalX, normalY;
};
RaycastHit raycast(const Ray& ray, const std::vector<AABB>& colliders, float maxDistance) {
RaycastHit result = {false, maxDistance, 0, 0.0f, 0.0f};
// Normalize direction
float dirLength = std::sqrt(ray.directionX * ray.directionX + ray.directionY * ray.directionY);
float invDirLength = 1.0f / dirLength;
float normalizedDirX = ray.directionX * invDirLength;
float normalizedDirY = ray.directionY * invDirLength;
// Check each collider
for (size_t i = 0; i < colliders.size(); ++i) {
const AABB& aabb = colliders[i];
// Calculate intersection with AABB
float tMinX = (aabb.minX - ray.originX) / normalizedDirX;
float tMaxX = (aabb.maxX - ray.originX) / normalizedDirX;
if (tMinX > tMaxX) std::swap(tMinX, tMaxX);
float tMinY = (aabb.minY - ray.originY) / normalizedDirY;
float tMaxY = (aabb.maxY - ray.originY) / normalizedDirY;
if (tMinY > tMaxY) std::swap(tMinY, tMaxY);
// Find intersection interval
float tMin = std::max(tMinX, tMinY);
float tMax = std::min(tMaxX, tMaxY);
// Check if there's a valid intersection
if (tMax >= 0 && tMin <= tMax && tMin < result.distance) {
result.hit = true;
result.distance = tMin;
result.entityIndex = i;
// Calculate normal
if (tMinX > tMinY) {
result.normalX = normalizedDirX < 0 ? 1.0f : -1.0f;
result.normalY = 0.0f;
} else {
result.normalX = 0.0f;
result.normalY = normalizedDirY < 0 ? 1.0f : -1.0f;
}
}
}
return result;
}
Performance Optimization
Broad and Narrow Phase Collision Detection
For better performance, split collision detection into two phases:
- Broad Phase: Use spatial partitioning to find potential collisions quickly
- Narrow Phase: Perform precise collision checks only on potential collisions
void updatePhysics(float deltaTime) {
// Update positions and AABBs
// ...
// Broad phase (using spatial grid)
std::vector<std::pair<size_t, size_t>> potentialCollisions;
grid.getPotentialCollisions(potentialCollisions);
// Narrow phase
for (const auto& pair : potentialCollisions) {
if (checkAABBCollision(colliders[pair.first], colliders[pair.second])) {
// Resolve collision
// ...
}
}
}
SIMD Batch Processing
Process multiple collisions at once using SIMD:
void checkCollisionsBatch(const std::vector<AABB>& aabbs,
const std::vector<std::pair<size_t, size_t>>& pairs,
std::vector<bool>& results) {
results.resize(pairs.size());
for (size_t i = 0; i < pairs.size(); i += 4) {
// Load 4 pairs of AABBs
__m128 minX1, minY1, maxX1, maxY1;
__m128 minX2, minY2, maxX2, maxY2;
// Fill SIMD registers with AABB data
for (size_t j = 0; j < 4 && i + j < pairs.size(); ++j) {
size_t idx1 = pairs[i + j].first;
size_t idx2 = pairs[i + j].second;
// Set values for this slot
reinterpret_cast<float*>(&minX1)[j] = aabbs[idx1].minX;
reinterpret_cast<float*>(&minY1)[j] = aabbs[idx1].minY;
reinterpret_cast<float*>(&maxX1)[j] = aabbs[idx1].maxX;
reinterpret_cast<float*>(&maxY1)[j] = aabbs[idx1].maxY;
reinterpret_cast<float*>(&minX2)[j] = aabbs[idx2].minX;
reinterpret_cast<float*>(&minY2)[j] = aabbs[idx2].minY;
reinterpret_cast<float*>(&maxX2)[j] = aabbs[idx2].maxX;
reinterpret_cast<float*>(&maxY2)[j] = aabbs[idx2].maxY;
}
// Check collisions for 4 pairs at once
__m128 overlapX = _mm_and_ps(
_mm_cmpge_ps(maxX1, minX2),
_mm_cmpge_ps(maxX2, minX1)
);
__m128 overlapY = _mm_and_ps(
_mm_cmpge_ps(maxY1, minY2),
_mm_cmpge_ps(maxY2, minY1)
);
__m128 collision = _mm_and_ps(overlapX, overlapY);
// Store results
int mask = _mm_movemask_ps(collision);
for (size_t j = 0; j < 4 && i + j < pairs.size(); ++j) {
results[i + j] = (mask & (1 << j)) != 0;
}
}
}
Sleep States
To save processing time, put inactive objects to sleep:
struct PhysicsBody {
// Other properties...
bool awake = true;
float sleepTimer = 0.0f;
static const float SLEEP_THRESHOLD = 0.1f; // Velocity threshold
static const float SLEEP_TIME = 0.5f; // Time below threshold before sleeping
};
void updateSleepStates(std::vector<PhysicsBody>& bodies, float deltaTime) {
for (auto& body : bodies) {
if (!body.awake) continue;
float velocitySquared = body.velocityX * body.velocityX +
body.velocityY * body.velocityY;
if (velocitySquared < SLEEP_THRESHOLD * SLEEP_THRESHOLD) {
body.sleepTimer += deltaTime;
if (body.sleepTimer > SLEEP_TIME) {
body.awake = false;
body.velocityX = 0.0f;
body.velocityY = 0.0f;
}
} else {
body.sleepTimer = 0.0f;
}
}
}
void wakeBody(PhysicsBody& body) {
body.awake = true;
body.sleepTimer = 0.0f;
}
void wakeBodyAndNeighbors(size_t bodyIndex,
std::vector<PhysicsBody>& bodies,
const std::vector<CollisionEvent>& collisions) {
// Wake the body
wakeBody(bodies[bodyIndex]);
// Wake any bodies it's colliding with
for (const auto& collision : collisions) {
if (collision.entityA == bodyIndex) {
wakeBody(bodies[collision.entityB]);
} else if (collision.entityB == bodyIndex) {
wakeBody(bodies[collision.entityA]);
}
}
}
Complete Physics System Example
Here's a complete example of a data-oriented 2D physics system:
#include <vector>
#include <array>
#include <algorithm>
#include <cmath>
#include <memory>
// Basic structures
struct AABB {
float minX, minY;
float maxX, maxY;
};
struct PhysicsBody {
float posX, posY;
float velX, velY;
float width, height;
float mass;
bool isStatic;
bool awake;
float sleepTimer;
};
struct CollisionEvent {
size_t entityA;
size_t entityB;
float impulse;
float normalX, normalY;
};
// Spatial partitioning grid
class SpatialGrid {
private:
float cellSize;
int gridWidth, gridHeight;
std::vector<std::vector<size_t>> cells;
public:
SpatialGrid(float worldWidth, float worldHeight, float cellSize)
: cellSize(cellSize) {
gridWidth = static_cast<int>(std::ceil(worldWidth / cellSize));
gridHeight = static_cast<int>(std::ceil(worldHeight / cellSize));
cells.resize(gridWidth * gridHeight);
}
void clear() {
for (auto& cell : cells) {
cell.clear();
}
}
void insertEntity(size_t entityIndex, const AABB& aabb) {
// Get cell indices for the AABB corners
int minCellX = static_cast<int>(aabb.minX / cellSize);
int minCellY = static_cast<int>(aabb.minY / cellSize);
int maxCellX = static_cast<int>(aabb.maxX / cellSize);
int maxCellY = static_cast<int>(aabb.maxY / cellSize);
// Clamp to grid bounds
minCellX = std::max(0, std::min(minCellX, gridWidth - 1));
minCellY = std::max(0, std::min(minCellY, gridHeight - 1));
maxCellX = std::max(0, std::min(maxCellX, gridWidth - 1));
maxCellY = std::max(0, std::min(maxCellY, gridHeight - 1));
// Add entity to all cells it overlaps
for (int y = minCellY; y <= maxCellY; ++y) {
for (int x = minCellX; x <= maxCellX; ++x) {
int cellIndex = y * gridWidth + x;
cells[cellIndex].push_back(entityIndex);
}
}
}
void getPotentialCollisions(std::vector<std::pair<size_t, size_t>>& collisionPairs) {
// For each cell
for (const auto& cell : cells) {
// Check all entity pairs in this cell
for (size_t i = 0; i < cell.size(); ++i) {
for (size_t j = i + 1; j < cell.size(); ++j) {
// Add pair to collision check list
if (cell[i] < cell[j]) {
collisionPairs.emplace_back(cell[i], cell[j]);
} else {
collisionPairs.emplace_back(cell[j], cell[i]);
}
}
}
}
// Remove duplicates
std::sort(collisionPairs.begin(), collisionPairs.end());
collisionPairs.erase(std::unique(collisionPairs.begin(), collisionPairs.end()),
collisionPairs.end());
}
};
// Physics system
class PhysicsSystem {
private:
// Entity data
std::vector<PhysicsBody> bodies;
std::vector<AABB> colliders;
// Spatial partitioning
SpatialGrid grid;
// Collision events
std::vector<CollisionEvent> collisionEvents;
// Constants
static constexpr float GRAVITY = 9.8f;
static constexpr float SLEEP_THRESHOLD = 0.1f;
static constexpr float SLEEP_TIME = 0.5f;
public:
PhysicsSystem(float worldWidth, float worldHeight, float cellSize)
: grid(worldWidth, worldHeight, cellSize) {}
size_t createBody(float x, float y, float width, float height,
float mass = 1.0f, bool isStatic = false) {
PhysicsBody body;
body.posX = x;
body.posY = y;
body.velX = 0.0f;
body.velY = 0.0f;
body.width = width;
body.height = height;
body.mass = mass;
body.isStatic = isStatic;
body.awake = true;
body.sleepTimer = 0.0f;
AABB aabb;
aabb.minX = x - width / 2.0f;
aabb.minY = y - height / 2.0f;
aabb.maxX = x + width / 2.0f;
aabb.maxY = y + height / 2.0f;
bodies.push_back(body);
colliders.push_back(aabb);
return bodies.size() - 1;
}
void update(float deltaTime) {
// Clear collision events
collisionEvents.clear();
// Apply gravity and update positions
for (size_t i = 0; i < bodies.size(); ++i) {
auto& body = bodies[i];
if (body.isStatic || !body.awake) continue;
// Apply gravity
body.velY += GRAVITY * deltaTime;
// Update position
body.posX += body.velX * deltaTime;
body.posY += body.velY * deltaTime;
// Update AABB
updateAABB(i);
// Update sleep state
float velocitySquared = body.velX * body.velX + body.velY * body.velY;
if (velocitySquared < SLEEP_THRESHOLD * SLEEP_THRESHOLD) {
body.sleepTimer += deltaTime;
if (body.sleepTimer > SLEEP_TIME) {
body.awake = false;
body.velX = 0.0f;
body.velY = 0.0f;
}
} else {
body.sleepTimer = 0.0f;
}
}
// Update spatial grid
grid.clear();
for (size_t i = 0; i < bodies.size(); ++i) {
grid.insertEntity(i, colliders[i]);
}
// Get potential collisions (broad phase)
std::vector<std::pair<size_t, size_t>> potentialCollisions;
grid.getPotentialCollisions(potentialCollisions);
// Check and resolve collisions (narrow phase)
for (const auto& pair : potentialCollisions) {
size_t a = pair.first;
size_t b = pair.second;
// Skip if both bodies are static or asleep
if ((bodies[a].isStatic && bodies[b].isStatic) ||
(!bodies[a].awake && !bodies[b].awake)) {
continue;
}
if (checkCollision(a, b)) {
// Wake up sleeping bodies
if (!bodies[a].awake) {
bodies[a].awake = true;
bodies[a].sleepTimer = 0.0f;
}
if (!bodies[b].awake) {
bodies[b].awake = true;
bodies[b].sleepTimer = 0.0f;
}
// Resolve collision
resolveCollision(a, b);
}
}
}
bool checkCollision(size_t a, size_t b) {
const AABB& aabbA = colliders[a];
const AABB& aabbB = colliders[b];
return (aabbA.maxX >= aabbB.minX && aabbA.minX <= aabbB.maxX &&
aabbA.maxY >= aabbB.minY && aabbA.minY <= aabbB.maxY);
}
void resolveCollision(size_t a, size_t b) {
AABB& aabbA = colliders[a];
AABB& aabbB = colliders[b];
PhysicsBody& bodyA = bodies[a];
PhysicsBody& bodyB = bodies[b];
// Calculate overlap on each axis
float overlapX = std::min(aabbA.maxX - aabbB.minX, aabbB.maxX - aabbA.minX);
float overlapY = std::min(aabbA.maxY - aabbB.minY, aabbB.maxY - aabbA.minY);
// Calculate collision normal
float normalX = 0.0f;
float normalY = 0.0f;
if (overlapX < overlapY) {
// Resolve on X axis
normalX = (aabbA.minX + aabbA.maxX) / 2.0f < (aabbB.minX + aabbB.maxX) / 2.0f ? -1.0f : 1.0f;
normalY = 0.0f;
} else {
// Resolve on Y axis
normalX = 0.0f;
normalY = (aabbA.minY + aabbA.maxY) / 2.0f < (aabbB.minY + aabbB.maxY) / 2.0f ? -1.0f : 1.0f;
}
// Calculate inverse masses (0 for static objects)
float invMassA = bodyA.isStatic ? 0.0f : 1.0f / bodyA.mass;
float invMassB = bodyB.isStatic ? 0.0f : 1.0f / bodyB.mass;
float invMassSum = invMassA + invMassB;
if (invMassSum == 0.0f) return;
// Calculate separation amount
float separation = (overlapX < overlapY ? overlapX : overlapY) * 1.01f; // Add a small buffer
float separationA = separation * invMassA / invMassSum;
float separationB = separation * invMassB / invMassSum;
// Separate the objects
if (!bodyA.isStatic) {
bodyA.posX -= normalX * separationA;
bodyA.posY -= normalY * separationA;
updateAABB(a);
}
if (!bodyB.isStatic) {
bodyB.posX += normalX * separationB;
bodyB.posY += normalY * separationB;
updateAABB(b);
}
// Calculate relative velocity
float relVelX = bodyB.velX - bodyA.velX;
float relVelY = bodyB.velY - bodyA.velY;
// Calculate relative velocity along the normal
float relVelDotNormal = relVelX * normalX + relVelY * normalY;
// If objects are moving apart, don't apply impulse
if (relVelDotNormal > 0) return;
// Calculate restitution (bounciness)
float restitution = 0.2f; // A value between 0 and 1
// Calculate impulse scalar
float impulseScalar = -(1.0f + restitution) * relVelDotNormal / invMassSum;
// Apply impulse
float impulseX = normalX * impulseScalar;
float impulseY = normalY * impulseScalar;
if (!bodyA.isStatic) {
bodyA.velX -= impulseX * invMassA;
bodyA.velY -= impulseY * invMassA;
}
if (!bodyB.isStatic) {
bodyB.velX += impulseX * invMassB;
bodyB.velY += impulseY * invMassB;
}
// Create collision event
float impulse = std::abs(impulseScalar);
collisionEvents.push_back({a, b, impulse, normalX, normalY});
}
void updateAABB(size_t index) {
const auto& body = bodies[index];
auto& aabb = colliders[index];
aabb.minX = body.posX - body.width / 2.0f;
aabb.minY = body.posY - body.height / 2.0f;
aabb.maxX = body.posX + body.width / 2.0f;
aabb.maxY = body.posY + body.height / 2.0f;
}
// Getters and setters
void setPosition(size_t index, float x, float y) {
if (index >= bodies.size()) return;
bodies[index].posX = x;
bodies[index].posY = y;
updateAABB(index);
// Wake up the body
bodies[index].awake = true;
bodies[index].sleepTimer = 0.0f;
}
void setVelocity(size_t index, float vx, float vy) {
if (index >= bodies.size() || bodies[index].isStatic) return;
bodies[index].velX = vx;
bodies[index].velY = vy;
// Wake up the body
bodies[index].awake = true;
bodies[index].sleepTimer = 0.0f;
}
void applyForce(size_t index, float forceX, float forceY, float deltaTime) {
if (index >= bodies.size() || bodies[index].isStatic) return;
auto& body = bodies[index];
// F = ma, so a = F/m
float accelerationX = forceX / body.mass;
float accelerationY = forceY / body.mass;
// v = v0 + at
body.velX += accelerationX * deltaTime;
body.velY += accelerationY * deltaTime;
// Wake up the body
body.awake = true;
body.sleepTimer = 0.0f;
}
void getPosition(size_t index, float& x, float& y) const {
if (index >= bodies.size()) return;
x = bodies[index].posX;
y = bodies[index].posY;
}
void getVelocity(size_t index, float& vx, float& vy) const {
if (index >= bodies.size()) return;
vx = bodies[index].velX;
vy = bodies[index].velY;
}
const std::vector<CollisionEvent>& getCollisionEvents() const {
return collisionEvents;
}
};
Conclusion
A basic physics system with AABB collision detection and response provides a solid foundation for 2D games. By using a data-oriented approach with contiguous arrays and spatial partitioning, you can achieve excellent performance even with hundreds of entities.
Key takeaways:
- Use AABB collision detection for simple and efficient 2D physics
- Implement spatial partitioning to reduce collision checks
- Separate collision detection into broad and narrow phases
- Use sleep states to avoid processing inactive objects
- Consider SIMD optimizations for batch processing
In the next section, we'll explore audio systems, which bring your game to life with sound effects and music.
Audio System
Introduction to Game Audio: Adding Sound to Your Game
Audio is a crucial component of game development that significantly enhances player experience. A well-designed audio system allows your game to play sound effects and music at appropriate times, manage audio resources efficiently, and provide spatial audio capabilities when needed.
In this section, we'll build a data-oriented audio system that's both performant and flexible, focusing on non-OOP approaches that align with our engine's design philosophy.
Audio System Architecture
A game audio system typically handles several key responsibilities:
- Loading and managing audio resources
- Playing and stopping sounds
- Controlling volume, pitch, and other properties
- Mixing multiple audio sources
- Spatial audio for positional sounds
- Streaming for longer audio like music
For our non-OOP approach, we'll separate the data from the behavior and focus on efficient memory layouts and processing.
Audio Libraries
Rather than implementing low-level audio playback from scratch, we'll use an existing audio library. Some popular options include:
- OpenAL: Open Audio Library, cross-platform 3D audio API
- FMOD: Commercial audio engine with extensive features
- SDL_mixer: Simple audio mixing library, part of SDL
- SoLoud: Free, portable audio engine
- miniaudio: Single-file audio playback and capture library
For our examples, we'll use a simplified interface that could be implemented with any of these libraries.
Basic Audio Data Structures
Let's define the core data structures for our audio system:
// Audio sample data
struct AudioData {
uint32_t id;
void* buffer;
uint32_t size;
uint32_t channels;
uint32_t sampleRate;
bool streaming;
};
// Audio source (instance of a playing sound)
struct AudioSource {
uint32_t id;
uint32_t dataId;
float volume;
float pitch;
float pan;
bool looping;
bool playing;
uint32_t position; // Current playback position in samples
};
// Spatial audio properties
struct AudioSpatialData {
float positionX, positionY;
float velocityX, velocityY;
float minDistance;
float maxDistance;
float rolloffFactor;
};
Loading and Managing Audio Resources
In a data-oriented approach, we'll manage audio resources in contiguous arrays:
class AudioSystem {
private:
std::vector<AudioData> audioData;
std::unordered_map<std::string, uint32_t> audioNameMap;
public:
uint32_t loadSound(const std::string& filename, bool streaming = false) {
// Check if already loaded
auto it = audioNameMap.find(filename);
if (it != audioNameMap.end()) {
return it->second;
}
AudioData data;
data.id = audioData.size();
data.streaming = streaming;
// Load audio file
if (streaming) {
// Set up streaming (implementation depends on audio library)
// ...
} else {
// Load entire file into memory
// ...
}
// Store in our collections
audioNameMap[filename] = data.id;
audioData.push_back(data);
return data.id;
}
void unloadSound(uint32_t soundId) {
if (soundId >= audioData.size()) return;
// Free resources
if (audioData[soundId].buffer) {
free(audioData[soundId].buffer);
audioData[soundId].buffer = nullptr;
}
// Remove from name map
for (auto it = audioNameMap.begin(); it != audioNameMap.end(); ++it) {
if (it->second == soundId) {
audioNameMap.erase(it);
break;
}
}
// Note: We don't remove from the vector to avoid invalidating IDs
// In a real implementation, you might want to recycle IDs
}
// Other resource management methods...
};
Playing and Controlling Sounds
Now let's implement sound playback functionality:
class AudioSystem {
private:
std::vector<AudioData> audioData;
std::unordered_map<std::string, uint32_t> audioNameMap;
std::vector<AudioSource> sources;
std::vector<AudioSpatialData> spatialData;
uint32_t nextSourceId = 1;
public:
// Resource management methods...
uint32_t playSound(uint32_t soundId, float volume = 1.0f, float pitch = 1.0f, bool looping = false) {
if (soundId >= audioData.size()) return 0;
// Create a new audio source
AudioSource source;
source.id = nextSourceId++;
source.dataId = soundId;
source.volume = volume;
source.pitch = pitch;
source.pan = 0.0f;
source.looping = looping;
source.playing = true;
source.position = 0;
// Start playback (implementation depends on audio library)
// ...
// Store the source
sources.push_back(source);
return source.id;
}
uint32_t playSoundSpatial(uint32_t soundId, float x, float y,
float minDist = 1.0f, float maxDist = 100.0f,
float volume = 1.0f, float pitch = 1.0f, bool looping = false) {
uint32_t sourceId = playSound(soundId, volume, pitch, looping);
if (sourceId == 0) return 0;
// Add spatial data
AudioSpatialData spatial;
spatial.positionX = x;
spatial.positionY = y;
spatial.velocityX = 0.0f;
spatial.velocityY = 0.0f;
spatial.minDistance = minDist;
spatial.maxDistance = maxDist;
spatial.rolloffFactor = 1.0f;
// Store spatial data
spatialData.push_back(spatial);
return sourceId;
}
void stopSound(uint32_t sourceId) {
for (size_t i = 0; i < sources.size(); ++i) {
if (sources[i].id == sourceId) {
sources[i].playing = false;
// Stop playback (implementation depends on audio library)
// ...
// Remove the source
sources.erase(sources.begin() + i);
// Remove spatial data if it exists
for (size_t j = 0; j < spatialData.size(); ++j) {
if (spatialData[j].id == sourceId) {
spatialData.erase(spatialData.begin() + j);
break;
}
}
break;
}
}
}
void pauseSound(uint32_t sourceId) {
for (auto& source : sources) {
if (source.id == sourceId) {
source.playing = false;
// Pause playback (implementation depends on audio library)
// ...
break;
}
}
}
void resumeSound(uint32_t sourceId) {
for (auto& source : sources) {
if (source.id == sourceId) {
source.playing = true;
// Resume playback (implementation depends on audio library)
// ...
break;
}
}
}
void setVolume(uint32_t sourceId, float volume) {
for (auto& source : sources) {
if (source.id == sourceId) {
source.volume = volume;
// Update volume (implementation depends on audio library)
// ...
break;
}
}
}
void setPitch(uint32_t sourceId, float pitch) {
for (auto& source : sources) {
if (source.id == sourceId) {
source.pitch = pitch;
// Update pitch (implementation depends on audio library)
// ...
break;
}
}
}
void setPan(uint32_t sourceId, float pan) {
for (auto& source : sources) {
if (source.id == sourceId) {
source.pan = pan;
// Update pan (implementation depends on audio library)
// ...
break;
}
}
}
// Other control methods...
};
Updating Spatial Audio
For games with positional audio, we need to update the audio sources based on listener and source positions:
class AudioSystem {
private:
// Previous members...
float listenerX = 0.0f;
float listenerY = 0.0f;
float listenerDirectionX = 0.0f;
float listenerDirectionY = 1.0f;
public:
// Previous methods...
void setListenerPosition(float x, float y) {
listenerX = x;
listenerY = y;
// Update listener position (implementation depends on audio library)
// ...
}
void setListenerDirection(float dirX, float dirY) {
// Normalize direction
float length = std::sqrt(dirX * dirX + dirY * dirY);
if (length > 0.0f) {
listenerDirectionX = dirX / length;
listenerDirectionY = dirY / length;
// Update listener direction (implementation depends on audio library)
// ...
}
}
void setSoundPosition(uint32_t sourceId, float x, float y) {
for (size_t i = 0; i < sources.size(); ++i) {
if (sources[i].id == sourceId) {
// Find corresponding spatial data
for (auto& spatial : spatialData) {
if (spatial.id == sourceId) {
spatial.positionX = x;
spatial.positionY = y;
// Update position (implementation depends on audio library)
// ...
break;
}
}
break;
}
}
}
void updateSpatialAudio() {
// For each spatial sound source
for (size_t i = 0; i < sources.size(); ++i) {
uint32_t sourceId = sources[i].id;
// Find corresponding spatial data
for (size_t j = 0; j < spatialData.size(); ++j) {
if (spatialData[j].id == sourceId) {
// Calculate distance from listener
float dx = spatialData[j].positionX - listenerX;
float dy = spatialData[j].positionY - listenerY;
float distance = std::sqrt(dx * dx + dy * dy);
// Calculate attenuation based on distance
float attenuation = 1.0f;
if (distance > spatialData[j].minDistance) {
if (distance > spatialData[j].maxDistance) {
attenuation = 0.0f;
} else {
float range = spatialData[j].maxDistance - spatialData[j].minDistance;
attenuation = 1.0f - (distance - spatialData[j].minDistance) / range;
attenuation = std::pow(attenuation, spatialData[j].rolloffFactor);
}
}
// Calculate pan based on position relative to listener direction
float dotProduct = dx * listenerDirectionX + dy * listenerDirectionY;
float crossProduct = dx * listenerDirectionY - dy * listenerDirectionX;
float angle = std::atan2(crossProduct, dotProduct);
float pan = std::sin(angle);
// Apply attenuation and pan
float baseVolume = sources[i].volume;
float effectiveVolume = baseVolume * attenuation;
// Update volume and pan (implementation depends on audio library)
// ...
break;
}
}
}
}
};
Audio Categories and Mixing
For better control over your game's audio, implement a category system:
enum class AudioCategory {
Master,
Music,
SoundEffects,
Voices,
Ambient,
UI
// Add more categories as needed
};
class AudioSystem {
private:
// Previous members...
std::unordered_map<AudioCategory, float> categoryVolumes;
std::vector<AudioCategory> sourceCategories;
public:
AudioSystem() {
// Initialize category volumes
categoryVolumes[AudioCategory::Master] = 1.0f;
categoryVolumes[AudioCategory::Music] = 1.0f;
categoryVolumes[AudioCategory::SoundEffects] = 1.0f;
categoryVolumes[AudioCategory::Voices] = 1.0f;
categoryVolumes[AudioCategory::Ambient] = 1.0f;
categoryVolumes[AudioCategory::UI] = 1.0f;
}
// Previous methods...
uint32_t playSound(uint32_t soundId, AudioCategory category,
float volume = 1.0f, float pitch = 1.0f, bool looping = false) {
uint32_t sourceId = playSound(soundId, volume, pitch, looping);
if (sourceId == 0) return 0;
// Store category
while (sourceCategories.size() <= sourceId) {
sourceCategories.push_back(AudioCategory::SoundEffects); // Default
}
sourceCategories[sourceId] = category;
return sourceId;
}
void setCategoryVolume(AudioCategory category, float volume) {
categoryVolumes[category] = volume;
// Update all sources in this category
for (size_t i = 0; i < sources.size(); ++i) {
uint32_t sourceId = sources[i].id;
if (sourceId < sourceCategories.size() && sourceCategories[sourceId] == category) {
// Calculate effective volume
float effectiveVolume = sources[i].volume *
categoryVolumes[category] *
categoryVolumes[AudioCategory::Master];
// Update volume (implementation depends on audio library)
// ...
}
}
}
float getCategoryVolume(AudioCategory category) const {
auto it = categoryVolumes.find(category);
if (it != categoryVolumes.end()) {
return it->second;
}
return 1.0f;
}
float getEffectiveVolume(uint32_t sourceId) const {
for (const auto& source : sources) {
if (source.id == sourceId) {
AudioCategory category = AudioCategory::SoundEffects; // Default
if (sourceId < sourceCategories.size()) {
category = sourceCategories[sourceId];
}
return source.volume *
categoryVolumes.at(category) *
categoryVolumes.at(AudioCategory::Master);
}
}
return 0.0f;
}
};
Streaming Audio for Music
For longer audio files like music, streaming is more memory-efficient than loading the entire file:
class AudioSystem {
private:
// Previous members...
static const size_t STREAMING_BUFFER_SIZE = 32768; // 32 KB
static const size_t NUM_STREAMING_BUFFERS = 3;
struct StreamingState {
uint32_t sourceId;
FILE* file;
uint32_t totalSize;
uint32_t position;
uint32_t buffers[NUM_STREAMING_BUFFERS];
bool bufferFilled[NUM_STREAMING_BUFFERS];
uint8_t currentBuffer;
};
std::vector<StreamingState> streamingStates;
public:
// Previous methods...
uint32_t streamMusic(const std::string& filename, float volume = 1.0f, bool looping = true) {
// Load audio file info but not the entire data
uint32_t dataId = loadSound(filename, true);
if (dataId == 0) return 0;
// Create source
uint32_t sourceId = playSound(dataId, AudioCategory::Music, volume, 1.0f, looping);
if (sourceId == 0) return 0;
// Set up streaming state
StreamingState state;
state.sourceId = sourceId;
state.file = fopen(filename.c_str(), "rb");
state.totalSize = audioData[dataId].size;
state.position = 0;
state.currentBuffer = 0;
// Create buffers
for (size_t i = 0; i < NUM_STREAMING_BUFFERS; ++i) {
// Create buffer (implementation depends on audio library)
// ...
state.bufferFilled[i] = false;
}
// Fill initial buffers
for (size_t i = 0; i < NUM_STREAMING_BUFFERS; ++i) {
fillStreamingBuffer(state, i);
}
// Queue initial buffers
for (size_t i = 0; i < NUM_STREAMING_BUFFERS; ++i) {
if (state.bufferFilled[i]) {
// Queue buffer (implementation depends on audio library)
// ...
}
}
// Store streaming state
streamingStates.push_back(state);
return sourceId;
}
void updateStreaming() {
for (auto& state : streamingStates) {
// Check if any buffers have been processed
for (size_t i = 0; i < NUM_STREAMING_BUFFERS; ++i) {
bool processed = false;
// Check if buffer is processed (implementation depends on audio library)
// ...
if (processed) {
// Refill and requeue the buffer
fillStreamingBuffer(state, i);
if (state.bufferFilled[i]) {
// Queue buffer (implementation depends on audio library)
// ...
}
}
}
}
}
private:
void fillStreamingBuffer(StreamingState& state, size_t bufferIndex) {
if (!state.file) {
state.bufferFilled[bufferIndex] = false;
return;
}
// Seek to current position
fseek(state.file, state.position, SEEK_SET);
// Read data
uint8_t data[STREAMING_BUFFER_SIZE];
size_t bytesRead = fread(data, 1, STREAMING_BUFFER_SIZE, state.file);
if (bytesRead > 0) {
// Fill buffer (implementation depends on audio library)
// ...
state.position += bytesRead;
state.bufferFilled[bufferIndex] = true;
// Handle looping
if (bytesRead < STREAMING_BUFFER_SIZE) {
for (const auto& source : sources) {
if (source.id == state.sourceId && source.looping) {
// Reset position for looping
state.position = 0;
// Read remaining data
fseek(state.file, state.position, SEEK_SET);
size_t remainingBytes = STREAMING_BUFFER_SIZE - bytesRead;
size_t additionalBytesRead = fread(data + bytesRead, 1, remainingBytes, state.file);
if (additionalBytesRead > 0) {
// Update buffer with additional data
// ...
state.position += additionalBytesRead;
}
break;
}
}
}
} else {
state.bufferFilled[bufferIndex] = false;
// Check if we should loop
for (const auto& source : sources) {
if (source.id == state.sourceId && source.looping) {
// Reset position for looping
state.position = 0;
fillStreamingBuffer(state, bufferIndex);
break;
}
}
}
}
};
Sound Effects and Variations
To add variety to your game's audio, implement sound variations:
class AudioSystem {
private:
// Previous members...
struct SoundGroup {
std::string name;
std::vector<uint32_t> soundIds;
float minPitch;
float maxPitch;
float minVolume;
float maxVolume;
};
std::vector<SoundGroup> soundGroups;
std::unordered_map<std::string, size_t> groupNameMap;
public:
// Previous methods...
size_t createSoundGroup(const std::string& name) {
// Check if already exists
auto it = groupNameMap.find(name);
if (it != groupNameMap.end()) {
return it->second;
}
SoundGroup group;
group.name = name;
group.minPitch = 1.0f;
group.maxPitch = 1.0f;
group.minVolume = 1.0f;
group.maxVolume = 1.0f;
size_t groupId = soundGroups.size();
soundGroups.push_back(group);
groupNameMap[name] = groupId;
return groupId;
}
void addSoundToGroup(size_t groupId, uint32_t soundId) {
if (groupId >= soundGroups.size() || soundId >= audioData.size()) return;
soundGroups[groupId].soundIds.push_back(soundId);
}
void setSoundGroupPitchRange(size_t groupId, float minPitch, float maxPitch) {
if (groupId >= soundGroups.size()) return;
soundGroups[groupId].minPitch = minPitch;
soundGroups[groupId].maxPitch = maxPitch;
}
void setSoundGroupVolumeRange(size_t groupId, float minVolume, float maxVolume) {
if (groupId >= soundGroups.size()) return;
soundGroups[groupId].minVolume = minVolume;
soundGroups[groupId].maxVolume = maxVolume;
}
uint32_t playRandomSound(size_t groupId, AudioCategory category = AudioCategory::SoundEffects, bool looping = false) {
if (groupId >= soundGroups.size() || soundGroups[groupId].soundIds.empty()) return 0;
// Select random sound
size_t index = rand() % soundGroups[groupId].soundIds.size();
uint32_t soundId = soundGroups[groupId].soundIds[index];
// Select random pitch and volume
float pitch = soundGroups[groupId].minPitch;
if (soundGroups[groupId].maxPitch > soundGroups[groupId].minPitch) {
float t = static_cast<float>(rand()) / RAND_MAX;
pitch = soundGroups[groupId].minPitch + t * (soundGroups[groupId].maxPitch - soundGroups[groupId].minPitch);
}
float volume = soundGroups[groupId].minVolume;
if (soundGroups[groupId].maxVolume > soundGroups[groupId].minVolume) {
float t = static_cast<float>(rand()) / RAND_MAX;
volume = soundGroups[groupId].minVolume + t * (soundGroups[groupId].maxVolume - soundGroups[groupId].minVolume);
}
// Play the sound
return playSound(soundId, category, volume, pitch, looping);
}
};
Audio System Update Loop
The audio system needs to be updated regularly to handle streaming, spatial audio, and other time-dependent operations:
class AudioSystem {
private:
// Previous members...
public:
// Previous methods...
void update(float deltaTime) {
// Update streaming audio
updateStreaming();
// Update spatial audio
updateSpatialAudio();
// Clean up finished sounds
for (size_t i = 0; i < sources.size(); ) {
bool isPlaying = true;
// Check if source is still playing (implementation depends on audio library)
// ...
if (!isPlaying) {
uint32_t sourceId = sources[i].id;
// Remove source
sources.erase(sources.begin() + i);
// Remove spatial data if it exists
for (size_t j = 0; j < spatialData.size(); ++j) {
if (spatialData[j].id == sourceId) {
spatialData.erase(spatialData.begin() + j);
break;
}
}
// Remove streaming state if it exists
for (size_t j = 0; j < streamingStates.size(); ++j) {
if (streamingStates[j].sourceId == sourceId) {
// Clean up streaming resources
if (streamingStates[j].file) {
fclose(streamingStates[j].file);
}
streamingStates.erase(streamingStates.begin() + j);
break;
}
}
} else {
++i;
}
}
}
};
Complete Audio System Example
Here's a complete example of a data-oriented audio system using OpenAL:
#include <vector>
#include <unordered_map>
#include <string>
#include <cmath>
#include <cstdlib>
#include <cstdio>
#include <memory>
#include <algorithm>
// Include OpenAL headers
#include <AL/al.h>
#include <AL/alc.h>
// Audio categories
enum class AudioCategory {
Master,
Music,
SoundEffects,
Voices,
Ambient,
UI
};
// Audio system class
class AudioSystem {
private:
// OpenAL device and context
ALCdevice* device;
ALCcontext* context;
// Audio data
struct AudioData {
uint32_t id;
ALuint buffer;
uint32_t size;
uint32_t channels;
uint32_t sampleRate;
bool streaming;
};
// Audio source
struct AudioSource {
uint32_t id;
uint32_t dataId;
ALuint source;
float volume;
float pitch;
float pan;
bool looping;
bool playing;
};
// Spatial audio data
struct AudioSpatialData {
uint32_t id;
float positionX, positionY;
float velocityX, velocityY;
float minDistance;
float maxDistance;
float rolloffFactor;
};
// Streaming state
struct StreamingState {
uint32_t sourceId;
FILE* file;
uint32_t totalSize;
uint32_t position;
ALuint buffers[3];
bool bufferFilled[3];
uint8_t currentBuffer;
};
// Sound group
struct SoundGroup {
std::string name;
std::vector<uint32_t> soundIds;
float minPitch;
float maxPitch;
float minVolume;
float maxVolume;
};
// Collections
std::vector<AudioData> audioData;
std::unordered_map<std::string, uint32_t> audioNameMap;
std::vector<AudioSource> sources;
std::vector<AudioSpatialData> spatialData;
std::vector<StreamingState> streamingStates;
std::vector<SoundGroup> soundGroups;
std::unordered_map<std::string, size_t> groupNameMap;
std::unordered_map<AudioCategory, float> categoryVolumes;
std::vector<AudioCategory> sourceCategories;
// Listener data
float listenerX = 0.0f;
float listenerY = 0.0f;
float listenerDirectionX = 0.0f;
float listenerDirectionY = 1.0f;
// ID counters
uint32_t nextSourceId = 1;
// Constants
static const size_t STREAMING_BUFFER_SIZE = 32768; // 32 KB
public:
AudioSystem() {
// Initialize OpenAL
device = alcOpenDevice(nullptr);
if (!device) {
// Handle error
return;
}
context = alcCreateContext(device, nullptr);
if (!context) {
alcCloseDevice(device);
// Handle error
return;
}
alcMakeContextCurrent(context);
// Initialize category volumes
categoryVolumes[AudioCategory::Master] = 1.0f;
categoryVolumes[AudioCategory::Music] = 1.0f;
categoryVolumes[AudioCategory::SoundEffects] = 1.0f;
categoryVolumes[AudioCategory::Voices] = 1.0f;
categoryVolumes[AudioCategory::Ambient] = 1.0f;
categoryVolumes[AudioCategory::UI] = 1.0f;
}
~AudioSystem() {
// Clean up resources
for (auto& data : audioData) {
if (data.buffer) {
alDeleteBuffers(1, &data.buffer);
}
}
for (auto& source : sources) {
alDeleteSources(1, &source.source);
}
for (auto& state : streamingStates) {
if (state.file) {
fclose(state.file);
}
alDeleteBuffers(3, state.buffers);
}
alcMakeContextCurrent(nullptr);
alcDestroyContext(context);
alcCloseDevice(device);
}
// Resource management
uint32_t loadSound(const std::string& filename, bool streaming = false) {
// Check if already loaded
auto it = audioNameMap.find(filename);
if (it != audioNameMap.end()) {
return it->second;
}
AudioData data;
data.id = audioData.size();
data.streaming = streaming;
if (streaming) {
// For streaming, we just store the file info
data.buffer = 0;
data.size = 0; // Will be determined when streaming starts
data.channels = 0;
data.sampleRate = 0;
} else {
// Load WAV file (simplified, real implementation would handle different formats)
FILE* file = fopen(filename.c_str(), "rb");
if (!file) {
return 0;
}
// Parse WAV header (simplified)
// ...
// Read audio data
// ...
// Create OpenAL buffer
alGenBuffers(1, &data.buffer);
alBufferData(data.buffer, data.channels == 1 ? AL_FORMAT_MONO16 : AL_FORMAT_STEREO16,
/* audio data */, data.size, data.sampleRate);
fclose(file);
}
// Store in our collections
audioNameMap[filename] = data.id;
audioData.push_back(data);
return data.id;
}
void unloadSound(uint32_t soundId) {
if (soundId >= audioData.size()) return;
// Stop any sources using this sound
for (size_t i = 0; i < sources.size(); ) {
if (sources[i].dataId == soundId) {
stopSound(sources[i].id);
} else {
++i;
}
}
// Free resources
if (audioData[soundId].buffer) {
alDeleteBuffers(1, &audioData[soundId].buffer);
audioData[soundId].buffer = 0;
}
// Remove from name map
for (auto it = audioNameMap.begin(); it != audioNameMap.end(); ++it) {
if (it->second == soundId) {
audioNameMap.erase(it);
break;
}
}
}
// Sound playback
uint32_t playSound(uint32_t soundId, AudioCategory category = AudioCategory::SoundEffects,
float volume = 1.0f, float pitch = 1.0f, bool looping = false) {
if (soundId >= audioData.size() || audioData[soundId].streaming) return 0;
// Create OpenAL source
ALuint alSource;
alGenSources(1, &alSource);
// Set source properties
alSourcei(alSource, AL_BUFFER, audioData[soundId].buffer);
alSourcef(alSource, AL_PITCH, pitch);
// Calculate effective volume based on category
float effectiveVolume = volume *
categoryVolumes[category] *
categoryVolumes[AudioCategory::Master];
alSourcef(alSource, AL_GAIN, effectiveVolume);
alSourcei(alSource, AL_LOOPING, looping ? AL_TRUE : AL_FALSE);
// Create our source object
AudioSource source;
source.id = nextSourceId++;
source.dataId = soundId;
source.source = alSource;
source.volume = volume;
source.pitch = pitch;
source.pan = 0.0f;
source.looping = looping;
source.playing = true;
// Start playback
alSourcePlay(alSource);
// Store the source
sources.push_back(source);
// Store category
while (sourceCategories.size() <= source.id) {
sourceCategories.push_back(AudioCategory::SoundEffects);
}
sourceCategories[source.id] = category;
return source.id;
}
uint32_t playSoundSpatial(uint32_t soundId, float x, float y,
float minDist = 1.0f, float maxDist = 100.0f,
AudioCategory category = AudioCategory::SoundEffects,
float volume = 1.0f, float pitch = 1.0f, bool looping = false) {
uint32_t sourceId = playSound(soundId, category, volume, pitch, looping);
if (sourceId == 0) return 0;
// Find the source
for (auto& source : sources) {
if (source.id == sourceId) {
// Set up spatial properties
alSource3f(source.source, AL_POSITION, x, y, 0.0f);
alSource3f(source.source, AL_VELOCITY, 0.0f, 0.0f, 0.0f);
alSourcef(source.source, AL_REFERENCE_DISTANCE, minDist);
alSourcef(source.source, AL_MAX_DISTANCE, maxDist);
alSourcef(source.source, AL_ROLLOFF_FACTOR, 1.0f);
// Add spatial data
AudioSpatialData spatial;
spatial.id = sourceId;
spatial.positionX = x;
spatial.positionY = y;
spatial.velocityX = 0.0f;
spatial.velocityY = 0.0f;
spatial.minDistance = minDist;
spatial.maxDistance = maxDist;
spatial.rolloffFactor = 1.0f;
spatialData.push_back(spatial);
break;
}
}
return sourceId;
}
void stopSound(uint32_t sourceId) {
for (size_t i = 0; i < sources.size(); ++i) {
if (sources[i].id == sourceId) {
// Stop playback
alSourceStop(sources[i].source);
alDeleteSources(1, &sources[i].source);
// Remove the source
sources.erase(sources.begin() + i);
// Remove spatial data if it exists
for (size_t j = 0; j < spatialData.size(); ++j) {
if (spatialData[j].id == sourceId) {
spatialData.erase(spatialData.begin() + j);
break;
}
}
// Remove streaming state if it exists
for (size_t j = 0; j < streamingStates.size(); ++j) {
if (streamingStates[j].sourceId == sourceId) {
// Clean up streaming resources
if (streamingStates[j].file) {
fclose(streamingStates[j].file);
}
alDeleteBuffers(3, streamingStates[j].buffers);
streamingStates.erase(streamingStates.begin() + j);
break;
}
}
break;
}
}
}
void pauseSound(uint32_t sourceId) {
for (auto& source : sources) {
if (source.id == sourceId) {
source.playing = false;
alSourcePause(source.source);
break;
}
}
}
void resumeSound(uint32_t sourceId) {
for (auto& source : sources) {
if (source.id == sourceId) {
source.playing = true;
alSourcePlay(source.source);
break;
}
}
}
void setVolume(uint32_t sourceId, float volume) {
for (auto& source : sources) {
if (source.id == sourceId) {
source.volume = volume;
// Calculate effective volume
AudioCategory category = AudioCategory::SoundEffects;
if (sourceId < sourceCategories.size()) {
category = sourceCategories[sourceId];
}
float effectiveVolume = volume *
categoryVolumes[category] *
categoryVolumes[AudioCategory::Master];
alSourcef(source.source, AL_GAIN, effectiveVolume);
break;
}
}
}
void setPitch(uint32_t sourceId, float pitch) {
for (auto& source : sources) {
if (source.id == sourceId) {
source.pitch = pitch;
alSourcef(source.source, AL_PITCH, pitch);
break;
}
}
}
void setPan(uint32_t sourceId, float pan) {
for (auto& source : sources) {
if (source.id == sourceId) {
source.pan = pan;
// OpenAL doesn't have a direct pan control, so we use position
alSource3f(source.source, AL_POSITION, pan, 0.0f, 0.0f);
break;
}
}
}
// Listener control
void setListenerPosition(float x, float y) {
listenerX = x;
listenerY = y;
alListener3f(AL_POSITION, x, y, 0.0f);
}
void setListenerDirection(float dirX, float dirY) {
// Normalize direction
float length = std::sqrt(dirX * dirX + dirY * dirY);
if (length > 0.0f) {
listenerDirectionX = dirX / length;
listenerDirectionY = dirY / length;
// OpenAL uses a "look at" vector and an "up" vector
float orientation[6] = {
listenerDirectionX, listenerDirectionY, 0.0f, // Look at
0.0f, 0.0f, 1.0f // Up
};
alListenerfv(AL_ORIENTATION, orientation);
}
}
// Sound position control
void setSoundPosition(uint32_t sourceId, float x, float y) {
for (auto& source : sources) {
if (source.id == sourceId) {
// Update position in OpenAL
alSource3f(source.source, AL_POSITION, x, y, 0.0f);
// Update our spatial data
for (auto& spatial : spatialData) {
if (spatial.id == sourceId) {
spatial.positionX = x;
spatial.positionY = y;
break;
}
}
break;
}
}
}
void setSoundVelocity(uint32_t sourceId, float vx, float vy) {
for (auto& source : sources) {
if (source.id == sourceId) {
// Update velocity in OpenAL
alSource3f(source.source, AL_VELOCITY, vx, vy, 0.0f);
// Update our spatial data
for (auto& spatial : spatialData) {
if (spatial.id == sourceId) {
spatial.velocityX = vx;
spatial.velocityY = vy;
break;
}
}
break;
}
}
}
// Category volume control
void setCategoryVolume(AudioCategory category, float volume) {
categoryVolumes[category] = volume;
// Update all sources in this category
for (auto& source : sources) {
uint32_t sourceId = source.id;
if (sourceId < sourceCategories.size() &&
(sourceCategories[sourceId] == category || category == AudioCategory::Master)) {
// Calculate effective volume
AudioCategory sourceCategory = sourceCategories[sourceId];
float effectiveVolume = source.volume *
categoryVolumes[sourceCategory] *
categoryVolumes[AudioCategory::Master];
alSourcef(source.source, AL_GAIN, effectiveVolume);
}
}
}
float getCategoryVolume(AudioCategory category) const {
auto it = categoryVolumes.find(category);
if (it != categoryVolumes.end()) {
return it->second;
}
return 1.0f;
}
// Sound groups
size_t createSoundGroup(const std::string& name) {
// Check if already exists
auto it = groupNameMap.find(name);
if (it != groupNameMap.end()) {
return it->second;
}
SoundGroup group;
group.name = name;
group.minPitch = 1.0f;
group.maxPitch = 1.0f;
group.minVolume = 1.0f;
group.maxVolume = 1.0f;
size_t groupId = soundGroups.size();
soundGroups.push_back(group);
groupNameMap[name] = groupId;
return groupId;
}
void addSoundToGroup(size_t groupId, uint32_t soundId) {
if (groupId >= soundGroups.size() || soundId >= audioData.size()) return;
soundGroups[groupId].soundIds.push_back(soundId);
}
void setSoundGroupPitchRange(size_t groupId, float minPitch, float maxPitch) {
if (groupId >= soundGroups.size()) return;
soundGroups[groupId].minPitch = minPitch;
soundGroups[groupId].maxPitch = maxPitch;
}
void setSoundGroupVolumeRange(size_t groupId, float minVolume, float maxVolume) {
if (groupId >= soundGroups.size()) return;
soundGroups[groupId].minVolume = minVolume;
soundGroups[groupId].maxVolume = maxVolume;
}
uint32_t playRandomSound(size_t groupId, AudioCategory category = AudioCategory::SoundEffects, bool looping = false) {
if (groupId >= soundGroups.size() || soundGroups[groupId].soundIds.empty()) return 0;
// Select random sound
size_t index = rand() % soundGroups[groupId].soundIds.size();
uint32_t soundId = soundGroups[groupId].soundIds[index];
// Select random pitch and volume
float pitch = soundGroups[groupId].minPitch;
if (soundGroups[groupId].maxPitch > soundGroups[groupId].minPitch) {
float t = static_cast<float>(rand()) / RAND_MAX;
pitch = soundGroups[groupId].minPitch + t * (soundGroups[groupId].maxPitch - soundGroups[groupId].minPitch);
}
float volume = soundGroups[groupId].minVolume;
if (soundGroups[groupId].maxVolume > soundGroups[groupId].minVolume) {
float t = static_cast<float>(rand()) / RAND_MAX;
volume = soundGroups[groupId].minVolume + t * (soundGroups[groupId].maxVolume - soundGroups[groupId].minVolume);
}
// Play the sound
return playSound(soundId, category, volume, pitch, looping);
}
// Music streaming
uint32_t streamMusic(const std::string& filename, float volume = 1.0f, bool looping = true) {
// Load audio file info
uint32_t dataId = loadSound(filename, true);
if (dataId == 0) return 0;
// Create OpenAL source
ALuint alSource;
alGenSources(1, &alSource);
// Set source properties
alSourcef(alSource, AL_PITCH, 1.0f);
// Calculate effective volume based on category
float effectiveVolume = volume *
categoryVolumes[AudioCategory::Music] *
categoryVolumes[AudioCategory::Master];
alSourcef(alSource, AL_GAIN, effectiveVolume);
alSourcei(alSource, AL_LOOPING, AL_FALSE); // We handle looping manually for streaming
// Create our source object
AudioSource source;
source.id = nextSourceId++;
source.dataId = dataId;
source.source = alSource;
source.volume = volume;
source.pitch = 1.0f;
source.pan = 0.0f;
source.looping = looping;
source.playing = true;
// Store the source
sources.push_back(source);
// Store category
while (sourceCategories.size() <= source.id) {
sourceCategories.push_back(AudioCategory::SoundEffects);
}
sourceCategories[source.id] = AudioCategory::Music;
// Set up streaming state
StreamingState state;
state.sourceId = source.id;
state.file = fopen(filename.c_str(), "rb");
// Skip WAV header (simplified)
// ...
state.totalSize = 0; // Determined from file
state.position = 0;
state.currentBuffer = 0;
// Create buffers
alGenBuffers(3, state.buffers);
for (int i = 0; i < 3; ++i) {
state.bufferFilled[i] = false;
}
// Fill initial buffers
for (int i = 0; i < 3; ++i) {
fillStreamingBuffer(state, i);
}
// Queue initial buffers
for (int i = 0; i < 3; ++i) {
if (state.bufferFilled[i]) {
alSourceQueueBuffers(source.source, 1, &state.buffers[i]);
}
}
// Start playback
alSourcePlay(alSource);
// Store streaming state
streamingStates.push_back(state);
return source.id;
}
// Update function
void update(float deltaTime) {
// Update streaming audio
for (auto& state : streamingStates) {
// Find the source
ALuint alSource = 0;
for (const auto& source : sources) {
if (source.id == state.sourceId) {
alSource = source.source;
break;
}
}
if (alSource == 0) continue;
// Check if any buffers have been processed
ALint processed = 0;
alGetSourcei(alSource, AL_BUFFERS_PROCESSED, &processed);
while (processed > 0) {
// Unqueue the processed buffer
ALuint buffer;
alSourceUnqueueBuffers(alSource, 1, &buffer);
// Find which buffer was processed
int bufferIndex = -1;
for (int i = 0; i < 3; ++i) {
if (state.buffers[i] == buffer) {
bufferIndex = i;
break;
}
}
if (bufferIndex != -1) {
// Refill the buffer
fillStreamingBuffer(state, bufferIndex);
// Requeue the buffer if it was filled
if (state.bufferFilled[bufferIndex]) {
alSourceQueueBuffers(alSource, 1, &state.buffers[bufferIndex]);
}
}
processed--;
}
// Check if playback has stopped due to buffer underrun
ALint state;
alGetSourcei(alSource, AL_SOURCE_STATE, &state);
if (state != AL_PLAYING) {
// Restart playback if there are queued buffers
ALint queued;
alGetSourcei(alSource, AL_BUFFERS_QUEUED, &queued);
if (queued > 0) {
alSourcePlay(alSource);
}
}
}
// Clean up finished sounds
for (size_t i = 0; i < sources.size(); ) {
ALint state;
alGetSourcei(sources[i].source, AL_SOURCE_STATE, &state);
if (state != AL_PLAYING && state != AL_PAUSED) {
uint32_t sourceId = sources[i].id;
// Clean up OpenAL source
alDeleteSources(1, &sources[i].source);
// Remove source
sources.erase(sources.begin() + i);
// Remove spatial data if it exists
for (size_t j = 0; j < spatialData.size(); ++j) {
if (spatialData[j].id == sourceId) {
spatialData.erase(spatialData.begin() + j);
break;
}
}
// Remove streaming state if it exists
for (size_t j = 0; j < streamingStates.size(); ++j) {
if (streamingStates[j].sourceId == sourceId) {
// Clean up streaming resources
if (streamingStates[j].file) {
fclose(streamingStates[j].file);
}
alDeleteBuffers(3, streamingStates[j].buffers);
streamingStates.erase(streamingStates.begin() + j);
break;
}
}
} else {
++i;
}
}
}
private:
void fillStreamingBuffer(StreamingState& state, int bufferIndex) {
if (!state.file) {
state.bufferFilled[bufferIndex] = false;
return;
}
// Seek to current position
fseek(state.file, state.position, SEEK_SET);
// Read data
uint8_t data[STREAMING_BUFFER_SIZE];
size_t bytesRead = fread(data, 1, STREAMING_BUFFER_SIZE, state.file);
if (bytesRead > 0) {
// Fill OpenAL buffer (assuming 16-bit stereo data)
alBufferData(state.buffers[bufferIndex], AL_FORMAT_STEREO16,
data, bytesRead, 44100); // Sample rate from file
state.position += bytesRead;
state.bufferFilled[bufferIndex] = true;
// Handle looping
if (bytesRead < STREAMING_BUFFER_SIZE) {
// Check if we should loop
bool shouldLoop = false;
for (const auto& source : sources) {
if (source.id == state.sourceId && source.looping) {
shouldLoop = true;
break;
}
}
if (shouldLoop) {
// Reset position for looping
state.position = 0; // Skip header
// Read remaining data
fseek(state.file, state.position, SEEK_SET);
size_t remainingBytes = STREAMING_BUFFER_SIZE - bytesRead;
size_t additionalBytesRead = fread(data + bytesRead, 1, remainingBytes, state.file);
if (additionalBytesRead > 0) {
// Update buffer with additional data
alBufferData(state.buffers[bufferIndex], AL_FORMAT_STEREO16,
data, bytesRead + additionalBytesRead, 44100);
state.position += additionalBytesRead;
}
}
}
} else {
state.bufferFilled[bufferIndex] = false;
// Check if we should loop
bool shouldLoop = false;
for (const auto& source : sources) {
if (source.id == state.sourceId && source.looping) {
shouldLoop = true;
break;
}
}
if (shouldLoop) {
// Reset position for looping
state.position = 0; // Skip header
fillStreamingBuffer(state, bufferIndex);
}
}
}
};
Using the Audio System
Here's how you might use this audio system in your game:
int main() {
// Initialize audio system
AudioSystem audio;
// Load sounds
uint32_t jumpSound = audio.loadSound("jump.wav");
uint32_t footstepSound = audio.loadSound("footstep.wav");
uint32_t explosionSound = audio.loadSound("explosion.wav");
// Create sound groups
size_t footstepGroup = audio.createSoundGroup("Footsteps");
audio.addSoundToGroup(footstepGroup, footstepSound);
audio.setSoundGroupPitchRange(footstepGroup, 0.9f, 1.1f);
audio.setSoundGroupVolumeRange(footstepGroup, 0.8f, 1.0f);
// Start background music
uint32_t musicId = audio.streamMusic("music.wav", 0.7f, true);
// Game loop
bool running = true;
float lastTime = getCurrentTime();
while (running) {
float currentTime = getCurrentTime();
float deltaTime = currentTime - lastTime;
lastTime = currentTime;
// Update audio system
audio.update(deltaTime);
// Update listener position based on player position
audio.setListenerPosition(playerX, playerY);
audio.setListenerDirection(playerDirX, playerDirY);
// Play sounds based on game events
if (playerJumped) {
audio.playSound(jumpSound, AudioCategory::SoundEffects, 1.0f);
}
if (playerWalking && footstepTimer <= 0) {
audio.playRandomSound(footstepGroup);
footstepTimer = 0.3f; // Play footstep every 0.3 seconds
} else {
footstepTimer -= deltaTime;
}
if (explosion) {
// Play positional explosion sound
audio.playSoundSpatial(explosionSound, explosionX, explosionY, 10.0f, 100.0f,
AudioCategory::SoundEffects, 1.0f);
}
// Handle volume controls
if (volumeChanged) {
audio.setCategoryVolume(AudioCategory::Master, masterVolume);
audio.setCategoryVolume(AudioCategory::Music, musicVolume);
audio.setCategoryVolume(AudioCategory::SoundEffects, sfxVolume);
}
// Other game logic...
}
return 0;
}
Conclusion
A well-designed audio system is essential for creating an immersive game experience. By using a data-oriented approach, we've created a flexible and performant system that can handle sound effects, music streaming, spatial audio, and more.
Key takeaways:
- Separate audio data from playback behavior for better organization
- Use streaming for longer audio files like music
- Implement sound categories for flexible volume control
- Add variety with randomized sound groups
- Use spatial audio to create immersive environments
In the next section, we'll explore resource management, which is crucial for efficiently loading and handling game assets like textures, sounds, and other data.
Resource Management
Introduction to Resource Management: Efficiently Handling Game Assets
Resource management is a critical component of any game engine, responsible for loading, storing, and providing access to various game assets such as textures, models, sounds, and other data. An efficient resource management system ensures that:
- Assets are loaded when needed and unloaded when no longer required
- Memory usage is optimized
- Loading times are minimized
- Asset dependencies are properly handled
- Resources can be hot-reloaded during development
In this section, we'll build a data-oriented resource management system that aligns with our non-OOP approach, focusing on performance and flexibility.
Resource Types
A game engine typically needs to handle various types of resources:
- Textures: Images used for rendering
- Audio: Sound effects and music
- Meshes: 3D model data
- Shaders: Programs that run on the GPU
- Fonts: For text rendering
- Data files: JSON, XML, or custom formats for game data
Let's define a common interface for these different resource types:
enum class ResourceType {
Texture,
Audio,
Mesh,
Shader,
Font,
Data,
// Add more as needed
};
struct ResourceHandle {
uint32_t id;
ResourceType type;
};
struct ResourceData {
uint32_t id;
ResourceType type;
std::string path;
void* data;
size_t size;
bool loaded;
uint32_t referenceCount;
uint64_t lastModifiedTime;
};
Basic Resource Manager
Let's start with a simple resource manager that can load and unload resources:
class ResourceManager {
private:
std::vector<ResourceData> resources;
std::unordered_map<std::string, uint32_t> pathToIdMap;
uint32_t nextResourceId = 1;
public:
ResourceHandle loadResource(const std::string& path, ResourceType type) {
// Check if already loaded
auto it = pathToIdMap.find(path);
if (it != pathToIdMap.end()) {
uint32_t id = it->second;
resources[id].referenceCount++;
return {id, type};
}
// Create new resource
ResourceData resource;
resource.id = nextResourceId++;
resource.type = type;
resource.path = path;
resource.data = nullptr;
resource.size = 0;
resource.loaded = false;
resource.referenceCount = 1;
resource.lastModifiedTime = getFileModificationTime(path);
// Load the resource based on type
bool loadSuccess = false;
switch (type) {
case ResourceType::Texture:
loadSuccess = loadTexture(resource);
break;
case ResourceType::Audio:
loadSuccess = loadAudio(resource);
break;
// Handle other resource types...
default:
break;
}
if (!loadSuccess) {
// Handle loading failure
return {0, type};
}
// Store the resource
uint32_t index = resources.size();
resources.push_back(resource);
pathToIdMap[path] = index;
return {index, type};
}
void releaseResource(ResourceHandle handle) {
if (handle.id >= resources.size()) return;
ResourceData& resource = resources[handle.id];
resource.referenceCount--;
if (resource.referenceCount == 0) {
// Unload the resource
unloadResource(resource);
}
}
void* getResourceData(ResourceHandle handle) {
if (handle.id >= resources.size()) return nullptr;
return resources[handle.id].data;
}
private:
bool loadTexture(ResourceData& resource) {
// Load texture from file
// ...
return true;
}
bool loadAudio(ResourceData& resource) {
// Load audio from file
// ...
return true;
}
void unloadResource(ResourceData& resource) {
if (!resource.loaded || !resource.data) return;
switch (resource.type) {
case ResourceType::Texture:
// Free texture data
// ...
break;
case ResourceType::Audio:
// Free audio data
// ...
break;
// Handle other resource types...
default:
break;
}
resource.data = nullptr;
resource.loaded = false;
}
uint64_t getFileModificationTime(const std::string& path) {
// Get file modification time using platform-specific code
// ...
return 0;
}
};
Resource Caching Strategies
To optimize memory usage and loading times, we can implement various caching strategies:
Reference Counting
The basic approach we've already implemented, where resources are kept in memory as long as they're being used:
ResourceHandle getResource(const std::string& path, ResourceType type) {
// Check if already loaded
auto it = pathToIdMap.find(path);
if (it != pathToIdMap.end()) {
uint32_t id = it->second;
resources[id].referenceCount++;
return {id, type};
}
// Load new resource...
}
void releaseResource(ResourceHandle handle) {
if (handle.id >= resources.size()) return;
ResourceData& resource = resources[handle.id];
resource.referenceCount--;
if (resource.referenceCount == 0) {
// Unload the resource
unloadResource(resource);
}
}
LRU (Least Recently Used) Cache
Keep a fixed number of resources in memory, discarding the least recently used ones when the cache is full:
class ResourceManager {
private:
// Previous members...
std::list<uint32_t> lruList; // Most recently used at front
std::unordered_map<uint32_t, std::list<uint32_t>::iterator> lruMap;
size_t maxCacheSize = 100;
public:
void* accessResource(ResourceHandle handle) {
if (handle.id >= resources.size() || !resources[handle.id].loaded) return nullptr;
// Update LRU list
updateLRU(handle.id);
return resources[handle.id].data;
}
private:
void updateLRU(uint32_t resourceId) {
// If already in LRU list, remove it
auto it = lruMap.find(resourceId);
if (it != lruMap.end()) {
lruList.erase(it->second);
}
// Add to front of LRU list
lruList.push_front(resourceId);
lruMap[resourceId] = lruList.begin();
// If cache is full, remove least recently used
if (lruList.size() > maxCacheSize) {
uint32_t lruResourceId = lruList.back();
// Only unload if reference count is 0
if (resources[lruResourceId].referenceCount == 0) {
unloadResource(resources[lruResourceId]);
lruMap.erase(lruResourceId);
lruList.pop_back();
}
}
}
};
Resource Groups
Group related resources together for batch loading and unloading:
class ResourceManager {
private:
// Previous members...
std::unordered_map<std::string, std::vector<uint32_t>> resourceGroups;
public:
void createResourceGroup(const std::string& groupName) {
resourceGroups[groupName] = std::vector<uint32_t>();
}
void addToResourceGroup(const std::string& groupName, ResourceHandle handle) {
if (handle.id >= resources.size()) return;
auto it = resourceGroups.find(groupName);
if (it != resourceGroups.end()) {
it->second.push_back(handle.id);
}
}
void loadResourceGroup(const std::string& groupName) {
auto it = resourceGroups.find(groupName);
if (it == resourceGroups.end()) return;
for (uint32_t resourceId : it->second) {
if (resourceId < resources.size() && !resources[resourceId].loaded) {
// Load the resource
switch (resources[resourceId].type) {
case ResourceType::Texture:
loadTexture(resources[resourceId]);
break;
case ResourceType::Audio:
loadAudio(resources[resourceId]);
break;
// Handle other resource types...
default:
break;
}
}
}
}
void unloadResourceGroup(const std::string& groupName) {
auto it = resourceGroups.find(groupName);
if (it == resourceGroups.end()) return;
for (uint32_t resourceId : it->second) {
if (resourceId < resources.size() && resources[resourceId].loaded) {
// Only unload if reference count is 0
if (resources[resourceId].referenceCount == 0) {
unloadResource(resources[resourceId]);
}
}
}
}
};
Asynchronous Resource Loading
To prevent loading times from blocking the game, we can implement asynchronous resource loading:
class ResourceManager {
private:
// Previous members...
struct LoadRequest {
uint32_t resourceId;
std::future<bool> future;
};
std::vector<LoadRequest> loadRequests;
std::mutex resourceMutex;
public:
ResourceHandle loadResourceAsync(const std::string& path, ResourceType type) {
std::lock_guard<std::mutex> lock(resourceMutex);
// Check if already loaded or loading
auto it = pathToIdMap.find(path);
if (it != pathToIdMap.end()) {
uint32_t id = it->second;
resources[id].referenceCount++;
return {id, type};
}
// Create new resource
ResourceData resource;
resource.id = nextResourceId++;
resource.type = type;
resource.path = path;
resource.data = nullptr;
resource.size = 0;
resource.loaded = false;
resource.referenceCount = 1;
resource.lastModifiedTime = getFileModificationTime(path);
// Store the resource
uint32_t index = resources.size();
resources.push_back(resource);
pathToIdMap[path] = index;
// Start asynchronous loading
std::future<bool> future = std::async(std::launch::async, [this, index, type]() {
// Load the resource based on type
switch (type) {
case ResourceType::Texture:
return loadTexture(resources[index]);
case ResourceType::Audio:
return loadAudio(resources[index]);
// Handle other resource types...
default:
return false;
}
});
loadRequests.push_back({index, std::move(future)});
return {index, type};
}
void update() {
std::lock_guard<std::mutex> lock(resourceMutex);
// Check for completed load requests
for (auto it = loadRequests.begin(); it != loadRequests.end(); ) {
if (it->future.wait_for(std::chrono::seconds(0)) == std::future_status::ready) {
bool success = it->future.get();
if (!success) {
// Handle loading failure
// ...
}
it = loadRequests.erase(it);
} else {
++it;
}
}
}
bool isResourceLoaded(ResourceHandle handle) {
std::lock_guard<std::mutex> lock(resourceMutex);
if (handle.id >= resources.size()) return false;
return resources[handle.id].loaded;
}
float getLoadingProgress() {
std::lock_guard<std::mutex> lock(resourceMutex);
if (loadRequests.empty()) return 1.0f;
// Count completed requests
int completed = 0;
for (const auto& request : loadRequests) {
if (request.future.wait_for(std::chrono::seconds(0)) == std::future_status::ready) {
completed++;
}
}
return static_cast<float>(completed) / loadRequests.size();
}
};
Resource Hot-Reloading
During development, it's useful to be able to update resources without restarting the game:
class ResourceManager {
private:
// Previous members...
bool hotReloadEnabled = false;
std::chrono::steady_clock::time_point lastHotReloadCheck;
public:
void enableHotReload(bool enable) {
hotReloadEnabled = enable;
lastHotReloadCheck = std::chrono::steady_clock::now();
}
void checkForHotReload() {
if (!hotReloadEnabled) return;
auto now = std::chrono::steady_clock::now();
auto elapsed = std::chrono::duration_cast<std::chrono::seconds>(now - lastHotReloadCheck).count();
// Only check every few seconds to avoid excessive file system access
if (elapsed < 2) return;
lastHotReloadCheck = now;
for (auto& resource : resources) {
if (!resource.loaded || resource.path.empty()) continue;
uint64_t currentModTime = getFileModificationTime(resource.path);
if (currentModTime > resource.lastModifiedTime) {
// File has been modified, reload it
unloadResource(resource);
switch (resource.type) {
case ResourceType::Texture:
loadTexture(resource);
break;
case ResourceType::Audio:
loadAudio(resource);
break;
// Handle other resource types...
default:
break;
}
resource.lastModifiedTime = currentModTime;
// Notify systems that resource has been reloaded
// ...
}
}
}
void update() {
// Update async loading
// ...
// Check for hot reload
checkForHotReload();
}
};
Resource Packing and Streaming
For larger games, we can implement resource packing and streaming:
Resource Packing
Combine multiple small resources into a single file to reduce file system overhead:
struct PackedResourceInfo {
std::string name;
ResourceType type;
size_t offset;
size_t size;
};
class ResourcePack {
private:
std::string packFilePath;
std::vector<PackedResourceInfo> resourceInfos;
std::unordered_map<std::string, size_t> nameToIndexMap;
FILE* packFile = nullptr;
public:
ResourcePack(const std::string& path) : packFilePath(path) {
// Open pack file
packFile = fopen(path.c_str(), "rb");
if (!packFile) return;
// Read header
uint32_t numResources;
fread(&numResources, sizeof(uint32_t), 1, packFile);
// Read resource infos
for (uint32_t i = 0; i < numResources; ++i) {
PackedResourceInfo info;
// Read name length
uint32_t nameLength;
fread(&nameLength, sizeof(uint32_t), 1, packFile);
// Read name
char nameBuffer[256];
fread(nameBuffer, 1, nameLength, packFile);
nameBuffer[nameLength] = '\0';
info.name = nameBuffer;
// Read type
int typeValue;
fread(&typeValue, sizeof(int), 1, packFile);
info.type = static_cast<ResourceType>(typeValue);
// Read offset and size
fread(&info.offset, sizeof(size_t), 1, packFile);
fread(&info.size, sizeof(size_t), 1, packFile);
// Store info
nameToIndexMap[info.name] = resourceInfos.size();
resourceInfos.push_back(info);
}
}
~ResourcePack() {
if (packFile) {
fclose(packFile);
}
}
bool hasResource(const std::string& name) const {
return nameToIndexMap.find(name) != nameToIndexMap.end();
}
bool loadResource(const std::string& name, void* buffer, size_t bufferSize) {
auto it = nameToIndexMap.find(name);
if (it == nameToIndexMap.end() || !packFile) return false;
const PackedResourceInfo& info = resourceInfos[it->second];
if (bufferSize < info.size) return false;
// Seek to resource data
fseek(packFile, info.offset, SEEK_SET);
// Read resource data
size_t bytesRead = fread(buffer, 1, info.size, packFile);
return bytesRead == info.size;
}
const PackedResourceInfo* getResourceInfo(const std::string& name) const {
auto it = nameToIndexMap.find(name);
if (it == nameToIndexMap.end()) return nullptr;
return &resourceInfos[it->second];
}
};
Resource Streaming
Load large resources in chunks as needed:
class StreamingManager {
private:
struct StreamingRequest {
std::string path;
size_t offset;
size_t size;
void* buffer;
std::function<void(bool)> callback;
std::future<bool> future;
};
std::vector<StreamingRequest> activeRequests;
std::mutex requestMutex;
public:
void streamData(const std::string& path, size_t offset, size_t size, void* buffer,
std::function<void(bool)> callback) {
std::lock_guard<std::mutex> lock(requestMutex);
// Create streaming request
StreamingRequest request;
request.path = path;
request.offset = offset;
request.size = size;
request.buffer = buffer;
request.callback = callback;
// Start asynchronous loading
request.future = std::async(std::launch::async, [request]() {
FILE* file = fopen(request.path.c_str(), "rb");
if (!file) return false;
// Seek to offset
fseek(file, request.offset, SEEK_SET);
// Read data
size_t bytesRead = fread(request.buffer, 1, request.size, file);
fclose(file);
return bytesRead == request.size;
});
activeRequests.push_back(std::move(request));
}
void update() {
std::lock_guard<std::mutex> lock(requestMutex);
// Check for completed requests
for (auto it = activeRequests.begin(); it != activeRequests.end(); ) {
if (it->future.wait_for(std::chrono::seconds(0)) == std::future_status::ready) {
bool success = it->future.get();
// Call callback
if (it->callback) {
it->callback(success);
}
it = activeRequests.erase(it);
} else {
++it;
}
}
}
};
Memory Management for Resources
Efficient memory management is crucial for resource handling:
Custom Allocators
Use custom allocators for different types of resources:
class MemoryAllocator {
public:
virtual void* allocate(size_t size) = 0;
virtual void deallocate(void* ptr) = 0;
virtual ~MemoryAllocator() {}
};
class LinearAllocator : public MemoryAllocator {
private:
void* buffer;
size_t bufferSize;
size_t offset;
public:
LinearAllocator(size_t size) : bufferSize(size), offset(0) {
buffer = malloc(size);
}
~LinearAllocator() {
free(buffer);
}
void* allocate(size_t size) override {
if (offset + size > bufferSize) return nullptr;
void* ptr = static_cast<char*>(buffer) + offset;
offset += size;
return ptr;
}
void deallocate(void* ptr) override {
// Linear allocator doesn't support individual deallocation
}
void reset() {
offset = 0;
}
};
class PoolAllocator : public MemoryAllocator {
private:
void* buffer;
size_t bufferSize;
size_t blockSize;
std::vector<bool> blockUsed;
public:
PoolAllocator(size_t totalSize, size_t blockSize)
: bufferSize(totalSize), blockSize(blockSize) {
buffer = malloc(totalSize);
blockUsed.resize(totalSize / blockSize, false);
}
~PoolAllocator() {
free(buffer);
}
void* allocate(size_t size) override {
if (size > blockSize) return nullptr;
// Find free block
for (size_t i = 0; i < blockUsed.size(); ++i) {
if (!blockUsed[i]) {
blockUsed[i] = true;
return static_cast<char*>(buffer) + i * blockSize;
}
}
return nullptr;
}
void deallocate(void* ptr) override {
if (!ptr) return;
// Calculate block index
size_t offset = static_cast<char*>(ptr) - static_cast<char*>(buffer);
size_t blockIndex = offset / blockSize;
if (blockIndex < blockUsed.size()) {
blockUsed[blockIndex] = false;
}
}
};
class ResourceManager {
private:
// Previous members...
std::unordered_map<ResourceType, std::unique_ptr<MemoryAllocator>> allocators;
public:
ResourceManager() {
// Create allocators for different resource types
allocators[ResourceType::Texture] = std::make_unique<PoolAllocator>(100 * 1024 * 1024, 1024 * 1024);
allocators[ResourceType::Audio] = std::make_unique<PoolAllocator>(50 * 1024 * 1024, 512 * 1024);
allocators[ResourceType::Mesh] = std::make_unique<PoolAllocator>(200 * 1024 * 1024, 2 * 1024 * 1024);
// Add more allocators as needed...
}
private:
void* allocateResourceMemory(ResourceType type, size_t size) {
auto it = allocators.find(type);
if (it != allocators.end()) {
return it->second->allocate(size);
}
// Fallback to standard allocation
return malloc(size);
}
void deallocateResourceMemory(ResourceType type, void* ptr) {
auto it = allocators.find(type);
if (it != allocators.end()) {
it->second->deallocate(ptr);
} else {
// Fallback to standard deallocation
free(ptr);
}
}
};
Memory Budgets
Implement memory budgets for different resource types:
class ResourceManager {
private:
// Previous members...
struct MemoryBudget {
size_t current;
size_t maximum;
};
std::unordered_map<ResourceType, MemoryBudget> memoryBudgets;
public:
ResourceManager() {
// Set memory budgets
memoryBudgets[ResourceType::Texture] = {0, 100 * 1024 * 1024}; // 100 MB
memoryBudgets[ResourceType::Audio] = {0, 50 * 1024 * 1024}; // 50 MB
memoryBudgets[ResourceType::Mesh] = {0, 200 * 1024 * 1024}; // 200 MB
// Add more budgets as needed...
}
bool canAllocateMemory(ResourceType type, size_t size) {
auto it = memoryBudgets.find(type);
if (it == memoryBudgets.end()) return true;
return it->second.current + size <= it->second.maximum;
}
void trackMemoryAllocation(ResourceType type, size_t size) {
auto it = memoryBudgets.find(type);
if (it != memoryBudgets.end()) {
it->second.current += size;
}
}
void trackMemoryDeallocation(ResourceType type, size_t size) {
auto it = memoryBudgets.find(type);
if (it != memoryBudgets.end()) {
it->second.current -= size;
}
}
float getMemoryUsagePercentage(ResourceType type) {
auto it = memoryBudgets.find(type);
if (it == memoryBudgets.end() || it->second.maximum == 0) return 0.0f;
return static_cast<float>(it->second.current) / it->second.maximum;
}
};
Resource Dependencies
Handle dependencies between resources:
class ResourceManager {
private:
// Previous members...
struct ResourceDependency {
uint32_t resourceId;
std::vector<uint32_t> dependencies;
};
std::vector<ResourceDependency> dependencies;
public:
void addDependency(ResourceHandle resource, ResourceHandle dependency) {
if (resource.id >= resources.size() || dependency.id >= resources.size()) return;
// Find or create dependency entry
for (auto& dep : dependencies) {
if (dep.resourceId == resource.id) {
// Add dependency if not already present
if (std::find(dep.dependencies.begin(), dep.dependencies.end(), dependency.id) == dep.dependencies.end()) {
dep.dependencies.push_back(dependency.id);
// Increment reference count of dependency
resources[dependency.id].referenceCount++;
}
return;
}
}
// Create new dependency entry
ResourceDependency newDep;
newDep.resourceId = resource.id;
newDep.dependencies.push_back(dependency.id);
dependencies.push_back(newDep);
// Increment reference count of dependency
resources[dependency.id].referenceCount++;
}
void releaseResource(ResourceHandle handle) {
if (handle.id >= resources.size()) return;
ResourceData& resource = resources[handle.id];
resource.referenceCount--;
if (resource.referenceCount == 0) {
// Release dependencies first
for (auto& dep : dependencies) {
if (dep.resourceId == handle.id) {
for (uint32_t depId : dep.dependencies) {
if (depId < resources.size()) {
// Decrement reference count of dependency
resources[depId].referenceCount--;
if (resources[depId].referenceCount == 0) {
// Unload dependency
unloadResource(resources[depId]);
}
}
}
break;
}
}
// Unload the resource
unloadResource(resource);
}
}
};
Complete Resource Management System
Here's a complete example of a data-oriented resource management system:
#include <vector>
#include <unordered_map>
#include <string>
#include <memory>
#include <mutex>
#include <future>
#include <chrono>
#include <functional>
#include <algorithm>
#include <cstdio>
#include <cstdlib>
#include <cstring>
// Resource types
enum class ResourceType {
Texture,
Audio,
Mesh,
Shader,
Font,
Data
};
// Resource handle
struct ResourceHandle {
uint32_t id;
ResourceType type;
bool isValid() const { return id != 0; }
};
// Memory allocator interface
class MemoryAllocator {
public:
virtual void* allocate(size_t size) = 0;
virtual void deallocate(void* ptr) = 0;
virtual ~MemoryAllocator() {}
};
// Pool allocator implementation
class PoolAllocator : public MemoryAllocator {
private:
void* buffer;
size_t bufferSize;
size_t blockSize;
std::vector<bool> blockUsed;
public:
PoolAllocator(size_t totalSize, size_t blockSize)
: bufferSize(totalSize), blockSize(blockSize) {
buffer = malloc(totalSize);
blockUsed.resize(totalSize / blockSize, false);
}
~PoolAllocator() {
free(buffer);
}
void* allocate(size_t size) override {
if (size > blockSize) return nullptr;
// Find free block
for (size_t i = 0; i < blockUsed.size(); ++i) {
if (!blockUsed[i]) {
blockUsed[i] = true;
return static_cast<char*>(buffer) + i * blockSize;
}
}
return nullptr;
}
void deallocate(void* ptr) override {
if (!ptr) return;
// Calculate block index
size_t offset = static_cast<char*>(ptr) - static_cast<char*>(buffer);
size_t blockIndex = offset / blockSize;
if (blockIndex < blockUsed.size()) {
blockUsed[blockIndex] = false;
}
}
};
// Resource manager class
class ResourceManager {
private:
// Resource data
struct ResourceData {
uint32_t id;
ResourceType type;
std::string path;
void* data;
size_t size;
bool loaded;
uint32_t referenceCount;
uint64_t lastModifiedTime;
};
// Resource dependency
struct ResourceDependency {
uint32_t resourceId;
std::vector<uint32_t> dependencies;
};
// Async load request
struct LoadRequest {
uint32_t resourceId;
std::future<bool> future;
};
// Memory budget
struct MemoryBudget {
size_t current;
size_t maximum;
};
// Collections
std::vector<ResourceData> resources;
std::unordered_map<std::string, uint32_t> pathToIdMap;
std::vector<ResourceDependency> dependencies;
std::vector<LoadRequest> loadRequests;
std::unordered_map<std::string, std::vector<uint32_t>> resourceGroups;
std::unordered_map<ResourceType, std::unique_ptr<MemoryAllocator>> allocators;
std::unordered_map<ResourceType, MemoryBudget> memoryBudgets;
// LRU cache
std::list<uint32_t> lruList; // Most recently used at front
std::unordered_map<uint32_t, std::list<uint32_t>::iterator> lruMap;
size_t maxCacheSize = 100;
// Hot reload
bool hotReloadEnabled = false;
std::chrono::steady_clock::time_point lastHotReloadCheck;
// Threading
std::mutex resourceMutex;
// ID counter
uint32_t nextResourceId = 1;
public:
ResourceManager() {
// Create allocators for different resource types
allocators[ResourceType::Texture] = std::make_unique<PoolAllocator>(100 * 1024 * 1024, 1024 * 1024);
allocators[ResourceType::Audio] = std::make_unique<PoolAllocator>(50 * 1024 * 1024, 512 * 1024);
allocators[ResourceType::Mesh] = std::make_unique<PoolAllocator>(200 * 1024 * 1024, 2 * 1024 * 1024);
// Set memory budgets
memoryBudgets[ResourceType::Texture] = {0, 100 * 1024 * 1024}; // 100 MB
memoryBudgets[ResourceType::Audio] = {0, 50 * 1024 * 1024}; // 50 MB
memoryBudgets[ResourceType::Mesh] = {0, 200 * 1024 * 1024}; // 200 MB
// Initialize hot reload
lastHotReloadCheck = std::chrono::steady_clock::now();
}
~ResourceManager() {
// Unload all resources
for (auto& resource : resources) {
if (resource.loaded && resource.data) {
unloadResource(resource);
}
}
}
// Synchronous resource loading
ResourceHandle loadResource(const std::string& path, ResourceType type) {
std::lock_guard<std::mutex> lock(resourceMutex);
// Check if already loaded
auto it = pathToIdMap.find(path);
if (it != pathToIdMap.end()) {
uint32_t id = it->second;
resources[id].referenceCount++;
// Update LRU cache
updateLRU(id);
return {id, type};
}
// Create new resource
ResourceData resource;
resource.id = nextResourceId++;
resource.type = type;
resource.path = path;
resource.data = nullptr;
resource.size = 0;
resource.loaded = false;
resource.referenceCount = 1;
resource.lastModifiedTime = getFileModificationTime(path);
// Load the resource based on type
bool loadSuccess = false;
switch (type) {
case ResourceType::Texture:
loadSuccess = loadTexture(resource);
break;
case ResourceType::Audio:
loadSuccess = loadAudio(resource);
break;
case ResourceType::Mesh:
loadSuccess = loadMesh(resource);
break;
case ResourceType::Shader:
loadSuccess = loadShader(resource);
break;
case ResourceType::Font:
loadSuccess = loadFont(resource);
break;
case ResourceType::Data:
loadSuccess = loadData(resource);
break;
default:
break;
}
if (!loadSuccess) {
// Handle loading failure
return {0, type};
}
// Store the resource
uint32_t index = resources.size();
resources.push_back(resource);
pathToIdMap[path] = index;
// Update LRU cache
updateLRU(index);
return {index, type};
}
// Asynchronous resource loading
ResourceHandle loadResourceAsync(const std::string& path, ResourceType type) {
std::lock_guard<std::mutex> lock(resourceMutex);
// Check if already loaded or loading
auto it = pathToIdMap.find(path);
if (it != pathToIdMap.end()) {
uint32_t id = it->second;
resources[id].referenceCount++;
return {id, type};
}
// Create new resource
ResourceData resource;
resource.id = nextResourceId++;
resource.type = type;
resource.path = path;
resource.data = nullptr;
resource.size = 0;
resource.loaded = false;
resource.referenceCount = 1;
resource.lastModifiedTime = getFileModificationTime(path);
// Store the resource
uint32_t index = resources.size();
resources.push_back(resource);
pathToIdMap[path] = index;
// Start asynchronous loading
std::future<bool> future = std::async(std::launch::async, [this, index, type]() {
// Load the resource based on type
switch (type) {
case ResourceType::Texture:
return loadTexture(resources[index]);
case ResourceType::Audio:
return loadAudio(resources[index]);
case ResourceType::Mesh:
return loadMesh(resources[index]);
case ResourceType::Shader:
return loadShader(resources[index]);
case ResourceType::Font:
return loadFont(resources[index]);
case ResourceType::Data:
return loadData(resources[index]);
default:
return false;
}
});
loadRequests.push_back({index, std::move(future)});
return {index, type};
}
// Release a resource
void releaseResource(ResourceHandle handle) {
std::lock_guard<std::mutex> lock(resourceMutex);
if (handle.id >= resources.size()) return;
ResourceData& resource = resources[handle.id];
resource.referenceCount--;
if (resource.referenceCount == 0) {
// Release dependencies first
for (auto& dep : dependencies) {
if (dep.resourceId == handle.id) {
for (uint32_t depId : dep.dependencies) {
if (depId < resources.size()) {
// Decrement reference count of dependency
resources[depId].referenceCount--;
if (resources[depId].referenceCount == 0) {
// Unload dependency
unloadResource(resources[depId]);
}
}
}
break;
}
}
// Don't unload immediately, let LRU cache handle it
// updateLRU will unload if cache is full
}
}
// Get resource data
void* getResourceData(ResourceHandle handle) {
std::lock_guard<std::mutex> lock(resourceMutex);
if (handle.id >= resources.size() || !resources[handle.id].loaded) return nullptr;
// Update LRU cache
updateLRU(handle.id);
return resources[handle.id].data;
}
// Check if resource is loaded
bool isResourceLoaded(ResourceHandle handle) {
std::lock_guard<std::mutex> lock(resourceMutex);
if (handle.id >= resources.size()) return false;
return resources[handle.id].loaded;
}
// Add dependency between resources
void addDependency(ResourceHandle resource, ResourceHandle dependency) {
std::lock_guard<std::mutex> lock(resourceMutex);
if (resource.id >= resources.size() || dependency.id >= resources.size()) return;
// Find or create dependency entry
for (auto& dep : dependencies) {
if (dep.resourceId == resource.id) {
// Add dependency if not already present
if (std::find(dep.dependencies.begin(), dep.dependencies.end(), dependency.id) == dep.dependencies.end()) {
dep.dependencies.push_back(dependency.id);
// Increment reference count of dependency
resources[dependency.id].referenceCount++;
}
return;
}
}
// Create new dependency entry
ResourceDependency newDep;
newDep.resourceId = resource.id;
newDep.dependencies.push_back(dependency.id);
dependencies.push_back(newDep);
// Increment reference count of dependency
resources[dependency.id].referenceCount++;
}
// Resource group management
void createResourceGroup(const std::string& groupName) {
std::lock_guard<std::mutex> lock(resourceMutex);
resourceGroups[groupName] = std::vector<uint32_t>();
}
void addToResourceGroup(const std::string& groupName, ResourceHandle handle) {
std::lock_guard<std::mutex> lock(resourceMutex);
if (handle.id >= resources.size()) return;
auto it = resourceGroups.find(groupName);
if (it != resourceGroups.end()) {
it->second.push_back(handle.id);
}
}
void loadResourceGroup(const std::string& groupName) {
std::lock_guard<std::mutex> lock(resourceMutex);
auto it = resourceGroups.find(groupName);
if (it == resourceGroups.end()) return;
for (uint32_t resourceId : it->second) {
if (resourceId < resources.size() && !resources[resourceId].loaded) {
// Load the resource
switch (resources[resourceId].type) {
case ResourceType::Texture:
loadTexture(resources[resourceId]);
break;
case ResourceType::Audio:
loadAudio(resources[resourceId]);
break;
case ResourceType::Mesh:
loadMesh(resources[resourceId]);
break;
case ResourceType::Shader:
loadShader(resources[resourceId]);
break;
case ResourceType::Font:
loadFont(resources[resourceId]);
break;
case ResourceType::Data:
loadData(resources[resourceId]);
break;
default:
break;
}
// Update LRU cache
updateLRU(resourceId);
}
}
}
void unloadResourceGroup(const std::string& groupName) {
std::lock_guard<std::mutex> lock(resourceMutex);
auto it = resourceGroups.find(groupName);
if (it == resourceGroups.end()) return;
for (uint32_t resourceId : it->second) {
if (resourceId < resources.size() && resources[resourceId].loaded) {
// Only unload if reference count is 0
if (resources[resourceId].referenceCount == 0) {
unloadResource(resources[resourceId]);
}
}
}
}
// Hot reload control
void enableHotReload(bool enable) {
std::lock_guard<std::mutex> lock(resourceMutex);
hotReloadEnabled = enable;
lastHotReloadCheck = std::chrono::steady_clock::now();
}
// Memory usage statistics
float getMemoryUsagePercentage(ResourceType type) {
std::lock_guard<std::mutex> lock(resourceMutex);
auto it = memoryBudgets.find(type);
if (it == memoryBudgets.end() || it->second.maximum == 0) return 0.0f;
return static_cast<float>(it->second.current) / it->second.maximum;
}
size_t getTotalMemoryUsage() {
std::lock_guard<std::mutex> lock(resourceMutex);
size_t total = 0;
for (const auto& budget : memoryBudgets) {
total += budget.second.current;
}
return total;
}
// Loading progress
float getLoadingProgress() {
std::lock_guard<std::mutex> lock(resourceMutex);
if (loadRequests.empty()) return 1.0f;
// Count completed requests
int completed = 0;
for (const auto& request : loadRequests) {
if (request.future.wait_for(std::chrono::seconds(0)) == std::future_status::ready) {
completed++;
}
}
return static_cast<float>(completed) / loadRequests.size();
}
// Update function
void update() {
std::lock_guard<std::mutex> lock(resourceMutex);
// Check for completed load requests
for (auto it = loadRequests.begin(); it != loadRequests.end(); ) {
if (it->future.wait_for(std::chrono::seconds(0)) == std::future_status::ready) {
bool success = it->future.get();
if (!success) {
// Handle loading failure
// ...
} else {
// Update LRU cache
updateLRU(it->resourceId);
}
it = loadRequests.erase(it);
} else {
++it;
}
}
// Check for hot reload
if (hotReloadEnabled) {
auto now = std::chrono::steady_clock::now();
auto elapsed = std::chrono::duration_cast<std::chrono::seconds>(now - lastHotReloadCheck).count();
// Only check every few seconds to avoid excessive file system access
if (elapsed >= 2) {
lastHotReloadCheck = now;
for (auto& resource : resources) {
if (!resource.loaded || resource.path.empty()) continue;
uint64_t currentModTime = getFileModificationTime(resource.path);
if (currentModTime > resource.lastModifiedTime) {
// File has been modified, reload it
unloadResource(resource);
switch (resource.type) {
case ResourceType::Texture:
loadTexture(resource);
break;
case ResourceType::Audio:
loadAudio(resource);
break;
case ResourceType::Mesh:
loadMesh(resource);
break;
case ResourceType::Shader:
loadShader(resource);
break;
case ResourceType::Font:
loadFont(resource);
break;
case ResourceType::Data:
loadData(resource);
break;
default:
break;
}
resource.lastModifiedTime = currentModTime;
// Notify systems that resource has been reloaded
// ...
}
}
}
}
}
private:
// Resource loading implementations
bool loadTexture(ResourceData& resource) {
// Check memory budget
if (!canAllocateMemory(ResourceType::Texture, 1024 * 1024)) { // Estimate size
return false;
}
// Load texture from file (simplified)
FILE* file = fopen(resource.path.c_str(), "rb");
if (!file) return false;
// Get file size
fseek(file, 0, SEEK_END);
size_t fileSize = ftell(file);
fseek(file, 0, SEEK_SET);
// Allocate memory
resource.data = allocateResourceMemory(ResourceType::Texture, fileSize);
if (!resource.data) {
fclose(file);
return false;
}
// Read file data
size_t bytesRead = fread(resource.data, 1, fileSize, file);
fclose(file);
if (bytesRead != fileSize) {
deallocateResourceMemory(ResourceType::Texture, resource.data);
resource.data = nullptr;
return false;
}
// Track memory usage
resource.size = fileSize;
trackMemoryAllocation(ResourceType::Texture, fileSize);
// Set as loaded
resource.loaded = true;
return true;
}
bool loadAudio(ResourceData& resource) {
// Similar to loadTexture but for audio files
// ...
return true;
}
bool loadMesh(ResourceData& resource) {
// Similar to loadTexture but for mesh files
// ...
return true;
}
bool loadShader(ResourceData& resource) {
// Similar to loadTexture but for shader files
// ...
return true;
}
bool loadFont(ResourceData& resource) {
// Similar to loadTexture but for font files
// ...
return true;
}
bool loadData(ResourceData& resource) {
// Similar to loadTexture but for data files
// ...
return true;
}
void unloadResource(ResourceData& resource) {
if (!resource.loaded || !resource.data) return;
// Track memory deallocation
trackMemoryDeallocation(resource.type, resource.size);
// Free memory
deallocateResourceMemory(resource.type, resource.data);
resource.data = nullptr;
resource.loaded = false;
// Remove from LRU cache
auto it = lruMap.find(resource.id);
if (it != lruMap.end()) {
lruList.erase(it->second);
lruMap.erase(it);
}
}
// Memory management
void* allocateResourceMemory(ResourceType type, size_t size) {
auto it = allocators.find(type);
if (it != allocators.end()) {
return it->second->allocate(size);
}
// Fallback to standard allocation
return malloc(size);
}
void deallocateResourceMemory(ResourceType type, void* ptr) {
auto it = allocators.find(type);
if (it != allocators.end()) {
it->second->deallocate(ptr);
} else {
// Fallback to standard deallocation
free(ptr);
}
}
bool canAllocateMemory(ResourceType type, size_t size) {
auto it = memoryBudgets.find(type);
if (it == memoryBudgets.end()) return true;
return it->second.current + size <= it->second.maximum;
}
void trackMemoryAllocation(ResourceType type, size_t size) {
auto it = memoryBudgets.find(type);
if (it != memoryBudgets.end()) {
it->second.current += size;
}
}
void trackMemoryDeallocation(ResourceType type, size_t size) {
auto it = memoryBudgets.find(type);
if (it != memoryBudgets.end()) {
it->second.current -= size;
}
}
// LRU cache management
void updateLRU(uint32_t resourceId) {
// If already in LRU list, remove it
auto it = lruMap.find(resourceId);
if (it != lruMap.end()) {
lruList.erase(it->second);
}
// Add to front of LRU list
lruList.push_front(resourceId);
lruMap[resourceId] = lruList.begin();
// If cache is full, remove least recently used
if (lruList.size() > maxCacheSize) {
uint32_t lruResourceId = lruList.back();
// Only unload if reference count is 0
if (resources[lruResourceId].referenceCount == 0) {
unloadResource(resources[lruResourceId]);
lruMap.erase(lruResourceId);
lruList.pop_back();
}
}
}
// Utility functions
uint64_t getFileModificationTime(const std::string& path) {
// Platform-specific implementation
#ifdef _WIN32
struct _stat64 fileStat;
if (_stat64(path.c_str(), &fileStat) == 0) {
return fileStat.st_mtime;
}
#else
struct stat fileStat;
if (stat(path.c_str(), &fileStat) == 0) {
return fileStat.st_mtime;
}
#endif
return 0;
}
};
Using the Resource Management System
Here's how you might use this resource management system in your game:
int main() {
// Create resource manager
ResourceManager resourceManager;
// Enable hot reload during development
#ifdef _DEBUG
resourceManager.enableHotReload(true);
#endif
// Create resource groups
resourceManager.createResourceGroup("Level1");
resourceManager.createResourceGroup("Level2");
resourceManager.createResourceGroup("UI");
// Load some resources
ResourceHandle playerTexture = resourceManager.loadResource("textures/player.png", ResourceType::Texture);
ResourceHandle backgroundTexture = resourceManager.loadResource("textures/background.png", ResourceType::Texture);
ResourceHandle jumpSound = resourceManager.loadResource("sounds/jump.wav", ResourceType::Audio);
// Add resources to groups
resourceManager.addToResourceGroup("Level1", playerTexture);
resourceManager.addToResourceGroup("Level1", backgroundTexture);
resourceManager.addToResourceGroup("Level1", jumpSound);
// Load UI resources asynchronously
ResourceHandle buttonTexture = resourceManager.loadResourceAsync("textures/button.png", ResourceType::Texture);
ResourceHandle fontResource = resourceManager.loadResourceAsync("fonts/main.ttf", ResourceType::Font);
resourceManager.addToResourceGroup("UI", buttonTexture);
resourceManager.addToResourceGroup("UI", fontResource);
// Game loop
bool running = true;
while (running) {
// Update resource manager
resourceManager.update();
// Check loading progress
float progress = resourceManager.getLoadingProgress();
if (progress < 1.0f) {
// Show loading screen
// ...
continue;
}
// Access resources
void* playerTextureData = resourceManager.getResourceData(playerTexture);
void* backgroundTextureData = resourceManager.getResourceData(backgroundTexture);
void* jumpSoundData = resourceManager.getResourceData(jumpSound);
// Use resources
// ...
// When changing levels
if (changingToLevel2) {
// Unload Level1 resources
resourceManager.unloadResourceGroup("Level1");
// Load Level2 resources
resourceManager.loadResourceGroup("Level2");
}
// Check memory usage
float textureMemoryUsage = resourceManager.getMemoryUsagePercentage(ResourceType::Texture);
if (textureMemoryUsage > 0.9f) {
// Memory usage is high, consider unloading some resources
// ...
}
}
// Resources will be automatically unloaded when ResourceManager is destroyed
return 0;
}
Conclusion
A well-designed resource management system is crucial for efficiently handling game assets. By using a data-oriented approach with contiguous arrays, custom memory allocators, and efficient caching strategies, you can create a system that minimizes loading times and memory usage.
Key takeaways:
- Use reference counting to track resource usage
- Implement caching strategies like LRU to optimize memory usage
- Load resources asynchronously to avoid blocking the game
- Group related resources for batch loading and unloading
- Use custom memory allocators for different resource types
- Implement hot-reloading for faster development iteration
In the next section, we'll explore performance optimization techniques that can be applied across all components of your game engine.
Performance Optimization Techniques
Introduction to Performance Optimization
Performance optimization is a critical aspect of game engine development. A well-optimized engine ensures smooth gameplay, consistent frame rates, and efficient resource utilization. This is especially important for games that need to run on a variety of hardware configurations.
In this section, we'll explore various performance optimization techniques specifically tailored for non-OOP game engines. We'll focus on data-oriented design principles, memory management strategies, multithreading approaches, and profiling methodologies that can help you identify and resolve performance bottlenecks.
Data-Oriented Design for Performance
Data-oriented design (DOD) is a programming paradigm that focuses on organizing data for optimal processing rather than organizing code around objects. This approach is particularly well-suited for performance-critical applications like game engines.
Cache Efficiency
One of the primary benefits of DOD is improved cache efficiency. Modern CPUs have multiple levels of cache that store frequently accessed data. When the CPU needs data that isn't in the cache (a cache miss), it must fetch it from main memory, which is significantly slower.
// Object-oriented approach (poor cache locality)
class Entity {
public:
Vector3 position;
Vector3 velocity;
float rotation;
float health;
// Many other properties...
void update(float deltaTime) {
position += velocity * deltaTime;
// Other logic...
}
};
std::vector<Entity> entities;
// Update all entities
for (auto& entity : entities) {
entity.update(deltaTime);
}
In the OOP approach above, each entity contains various properties. When updating positions, the CPU needs to load entire entity objects into cache, even though it only needs the position and velocity data.
// Data-oriented approach (good cache locality)
struct EntityData {
std::vector<Vector3> positions;
std::vector<Vector3> velocities;
std::vector<float> rotations;
std::vector<float> healths;
// Other arrays...
};
EntityData entities;
// Update all positions
for (size_t i = 0; i < entities.positions.size(); ++i) {
entities.positions[i] += entities.velocities[i] * deltaTime;
}
In the DOD approach, similar data is stored together in contiguous arrays. When updating positions, the CPU can efficiently load position and velocity data into cache without wasting cache space on unneeded properties.
Structure of Arrays vs. Array of Structures
The example above demonstrates the "Structure of Arrays" (SoA) pattern, as opposed to the "Array of Structures" (AoS) pattern used in OOP. SoA is generally more cache-friendly for operations that process one property across many entities.
// Array of Structures (AoS)
struct Particle {
Vector3 position;
Vector3 velocity;
float lifetime;
float size;
};
std::vector<Particle> particles;
// Structure of Arrays (SoA)
struct ParticleSystem {
std::vector<Vector3> positions;
std::vector<Vector3> velocities;
std::vector<float> lifetimes;
std::vector<float> sizes;
};
ParticleSystem particles;
SoA is particularly beneficial when:
- You're processing one property across many entities
- You have a large number of entities
- Each entity has many properties, but operations typically use only a few
However, AoS can be more efficient when:
- You typically access all properties of an entity together
- You have a small number of entities
- Entities are frequently created and destroyed individually
Data Alignment
Proper data alignment can significantly improve memory access performance:
// Poorly aligned structure
struct Vertex {
float x, y, z; // 12 bytes
uint8_t r, g, b, a; // 4 bytes
float u, v; // 8 bytes
}; // Total: 24 bytes
// Well-aligned structure
struct Vertex {
float x, y, z; // 12 bytes
float u, v; // 8 bytes
uint8_t r, g, b, a; // 4 bytes
// 4 bytes padding added by compiler
}; // Total: 28 bytes (but better performance)
// Even better for SIMD
struct alignas(16) Vertex {
float x, y, z, w; // 16 bytes (aligned for SIMD)
float u, v, 0, 0; // 16 bytes (aligned for SIMD)
uint8_t r, g, b, a; // 4 bytes
uint8_t padding[12];// 12 bytes padding
}; // Total: 48 bytes (but optimal for SIMD)
While the well-aligned structures may use more memory, they can be accessed more efficiently by the CPU, especially when using SIMD instructions.
SIMD (Single Instruction, Multiple Data)
SIMD instructions allow you to perform the same operation on multiple data elements simultaneously:
// Scalar implementation
void addVectors(const float* a, const float* b, float* result, size_t count) {
for (size_t i = 0; i < count; ++i) {
result[i] = a[i] + b[i];
}
}
// SIMD implementation using SSE
void addVectorsSIMD(const float* a, const float* b, float* result, size_t count) {
size_t i = 0;
// Process 4 floats at a time
for (; i + 3 < count; i += 4) {
__m128 va = _mm_load_ps(a + i);
__m128 vb = _mm_load_ps(b + i);
__m128 vr = _mm_add_ps(va, vb);
_mm_store_ps(result + i, vr);
}
// Handle remaining elements
for (; i < count; ++i) {
result[i] = a[i] + b[i];
}
}
SIMD is particularly useful for operations on vectors, matrices, and other mathematical structures common in game engines.
Efficient Memory Management
Memory management has a significant impact on performance. Here are some techniques to optimize memory usage:
Custom Allocators
Standard allocators like malloc
and new
are general-purpose and may not be optimal for game-specific patterns. Custom allocators can be tailored to your specific needs:
// Pool allocator for fixed-size objects
class PoolAllocator {
private:
struct Block {
Block* next;
};
void* memory;
size_t blockSize;
Block* freeList;
public:
PoolAllocator(size_t totalSize, size_t blockSize)
: blockSize(std::max(blockSize, sizeof(Block))) {
// Allocate memory
memory = malloc(totalSize);
// Initialize free list
freeList = nullptr;
char* start = static_cast<char*>(memory);
size_t numBlocks = totalSize / this->blockSize;
for (size_t i = 0; i < numBlocks; ++i) {
Block* block = reinterpret_cast<Block*>(start + i * this->blockSize);
block->next = freeList;
freeList = block;
}
}
~PoolAllocator() {
free(memory);
}
void* allocate() {
if (!freeList) return nullptr;
Block* block = freeList;
freeList = block->next;
return block;
}
void deallocate(void* ptr) {
if (!ptr) return;
Block* block = static_cast<Block*>(ptr);
block->next = freeList;
freeList = block;
}
};
Linear Allocator
A linear allocator is extremely fast for allocations but doesn't support individual deallocations:
class LinearAllocator {
private:
void* memory;
size_t size;
size_t offset;
public:
LinearAllocator(size_t size) : size(size), offset(0) {
memory = malloc(size);
}
~LinearAllocator() {
free(memory);
}
void* allocate(size_t bytes, size_t alignment = 8) {
// Align offset
size_t mask = alignment - 1;
size_t alignedOffset = (offset + mask) & ~mask;
// Check if we have enough space
if (alignedOffset + bytes > size) {
return nullptr;
}
// Allocate
void* ptr = static_cast<char*>(memory) + alignedOffset;
offset = alignedOffset + bytes;
return ptr;
}
void reset() {
offset = 0;
}
};
Linear allocators are perfect for per-frame allocations that can all be freed at once.
Memory Arenas
Memory arenas combine multiple allocators for different purposes:
class MemoryArena {
private:
LinearAllocator frameAllocator;
PoolAllocator entityAllocator;
PoolAllocator particleAllocator;
public:
MemoryArena()
: frameAllocator(10 * 1024 * 1024), // 10 MB for per-frame data
entityAllocator(5 * 1024 * 1024, 256), // 5 MB for entities (256 bytes each)
particleAllocator(2 * 1024 * 1024, 64) // 2 MB for particles (64 bytes each)
{}
void* allocateFrameMemory(size_t bytes) {
return frameAllocator.allocate(bytes);
}
void resetFrameMemory() {
frameAllocator.reset();
}
void* allocateEntity() {
return entityAllocator.allocate();
}
void deallocateEntity(void* ptr) {
entityAllocator.deallocate(ptr);
}
void* allocateParticle() {
return particleAllocator.allocate();
}
void deallocateParticle(void* ptr) {
particleAllocator.deallocate(ptr);
}
};
Memory Prefetching
Prefetching can improve performance by loading data into cache before it's needed:
void processEntities(EntityData& entities) {
for (size_t i = 0; i < entities.positions.size(); ++i) {
// Prefetch data for the next entity
if (i + 1 < entities.positions.size()) {
__builtin_prefetch(&entities.positions[i + 1], 0, 3);
__builtin_prefetch(&entities.velocities[i + 1], 0, 3);
}
// Process current entity
entities.positions[i] += entities.velocities[i] * deltaTime;
}
}
Multithreading Strategies
Modern CPUs have multiple cores, and utilizing them effectively can significantly improve performance.
Task-Based Parallelism
Task-based parallelism involves breaking work into independent tasks that can be executed in parallel:
class TaskSystem {
private:
struct Task {
std::function<void()> function;
};
std::vector<std::thread> threads;
std::queue<Task> tasks;
std::mutex taskMutex;
std::condition_variable condition;
bool shuttingDown = false;
public:
TaskSystem(size_t numThreads = std::thread::hardware_concurrency()) {
for (size_t i = 0; i < numThreads; ++i) {
threads.emplace_back([this] {
while (true) {
Task task;
{
std::unique_lock<std::mutex> lock(taskMutex);
condition.wait(lock, [this] {
return !tasks.empty() || shuttingDown;
});
if (shuttingDown && tasks.empty()) {
return;
}
task = std::move(tasks.front());
tasks.pop();
}
task.function();
}
});
}
}
~TaskSystem() {
{
std::unique_lock<std::mutex> lock(taskMutex);
shuttingDown = true;
}
condition.notify_all();
for (auto& thread : threads) {
thread.join();
}
}
void addTask(std::function<void()> function) {
{
std::unique_lock<std::mutex> lock(taskMutex);
tasks.push({std::move(function)});
}
condition.notify_one();
}
template<typename Iterator, typename Function>
void parallelFor(Iterator begin, Iterator end, Function function) {
size_t numItems = std::distance(begin, end);
size_t numThreads = threads.size();
size_t batchSize = std::max(numItems / numThreads, size_t(1));
std::atomic<size_t> completedTasks(0);
size_t totalTasks = (numItems + batchSize - 1) / batchSize;
for (size_t i = 0; i < numItems; i += batchSize) {
size_t start = i;
size_t end = std::min(i + batchSize, numItems);
addTask([start, end, &begin, &function, &completedTasks] {
for (size_t j = start; j < end; ++j) {
function(*(begin + j));
}
completedTasks++;
});
}
// Wait for all tasks to complete
while (completedTasks < totalTasks) {
std::this_thread::yield();
}
}
};
Data Parallelism
Data parallelism involves applying the same operation to different parts of a data set:
void updateParticles(ParticleSystem& particles, float deltaTime, TaskSystem& taskSystem) {
size_t numParticles = particles.positions.size();
taskSystem.parallelFor(0, numParticles, [&](size_t i) {
particles.positions[i] += particles.velocities[i] * deltaTime;
particles.lifetimes[i] -= deltaTime;
// Other particle updates...
});
}
Job System
A job system is a more sophisticated task system that handles dependencies between tasks:
class JobSystem {
private:
struct Job {
std::function<void()> function;
std::atomic<int> dependencyCount;
std::vector<Job*> dependents;
};
TaskSystem taskSystem;
std::mutex jobMutex;
std::vector<std::unique_ptr<Job>> jobs;
public:
JobSystem(size_t numThreads = std::thread::hardware_concurrency())
: taskSystem(numThreads) {}
Job* createJob(std::function<void()> function) {
std::lock_guard<std::mutex> lock(jobMutex);
jobs.push_back(std::make_unique<Job>());
Job* job = jobs.back().get();
job->function = std::move(function);
job->dependencyCount = 0;
return job;
}
void addDependency(Job* job, Job* dependency) {
dependency->dependents.push_back(job);
job->dependencyCount++;
}
void scheduleJob(Job* job) {
if (job->dependencyCount == 0) {
taskSystem.addTask([this, job] {
job->function();
// Schedule dependent jobs
for (Job* dependent : job->dependents) {
if (--dependent->dependencyCount == 0) {
scheduleJob(dependent);
}
}
});
}
}
};
Thread-Safe Data Structures
When sharing data between threads, thread-safe data structures are essential:
template<typename T>
class ThreadSafeQueue {
private:
std::queue<T> queue;
std::mutex mutex;
std::condition_variable condition;
public:
void push(T value) {
std::lock_guard<std::mutex> lock(mutex);
queue.push(std::move(value));
condition.notify_one();
}
bool tryPop(T& value) {
std::lock_guard<std::mutex> lock(mutex);
if (queue.empty()) {
return false;
}
value = std::move(queue.front());
queue.pop();
return true;
}
void waitAndPop(T& value) {
std::unique_lock<std::mutex> lock(mutex);
condition.wait(lock, [this] { return !queue.empty(); });
value = std::move(queue.front());
queue.pop();
}
bool empty() const {
std::lock_guard<std::mutex> lock(mutex);
return queue.empty();
}
};
Profiling and Optimization Techniques
Profiling is essential for identifying performance bottlenecks. Here are some techniques and tools to help with profiling:
Simple Profiling Timer
A basic profiling timer can help measure the execution time of different parts of your code:
class ScopedTimer {
private:
const char* name;
std::chrono::high_resolution_clock::time_point startTime;
public:
ScopedTimer(const char* name) : name(name) {
startTime = std::chrono::high_resolution_clock::now();
}
~ScopedTimer() {
auto endTime = std::chrono::high_resolution_clock::now();
auto duration = std::chrono::duration_cast<std::chrono::microseconds>(endTime - startTime).count();
std::cout << name << ": " << duration << " us" << std::endl;
}
};
// Usage
void someFunction() {
ScopedTimer timer("someFunction");
// Code to profile...
}
Hierarchical Profiling
For more complex applications, a hierarchical profiler can provide more detailed information:
class Profiler {
private:
struct ProfileNode {
std::string name;
int64_t startTime;
int64_t totalTime;
int callCount;
std::vector<ProfileNode> children;
};
ProfileNode root;
std::vector<ProfileNode*> stack;
public:
Profiler() : root({"Root", 0, 0, 0}) {
stack.push_back(&root);
}
void beginSample(const std::string& name) {
ProfileNode* parent = stack.back();
// Find or create child
ProfileNode* child = nullptr;
for (auto& node : parent->children) {
if (node.name == name) {
child = &node;
break;
}
}
if (!child) {
parent->children.push_back({name, 0, 0, 0});
child = &parent->children.back();
}
child->startTime = getCurrentTime();
child->callCount++;
stack.push_back(child);
}
void endSample() {
ProfileNode* node = stack.back();
int64_t endTime = getCurrentTime();
node->totalTime += endTime - node->startTime;
stack.pop_back();
}
void printResults() {
printNode(root, 0);
}
private:
int64_t getCurrentTime() {
return std::chrono::duration_cast<std::chrono::microseconds>(
std::chrono::high_resolution_clock::now().time_since_epoch()
).count();
}
void printNode(const ProfileNode& node, int depth) {
if (depth > 0) {
for (int i = 0; i < depth - 1; ++i) {
std::cout << " ";
}
std::cout << "|- ";
std::cout << node.name << ": " << node.totalTime << " us";
std::cout << " (" << node.callCount << " calls)" << std::endl;
}
for (const auto& child : node.children) {
printNode(child, depth + 1);
}
}
};
// Global profiler instance
Profiler gProfiler;
// Scoped profiler helper
class ScopedProfiler {
private:
const std::string name;
public:
ScopedProfiler(const std::string& name) : name(name) {
gProfiler.beginSample(name);
}
~ScopedProfiler() {
gProfiler.endSample();
}
};
// Usage
void gameLoop() {
ScopedProfiler profiler("GameLoop");
{
ScopedProfiler profiler("Update");
updateGame();
}
{
ScopedProfiler profiler("Render");
renderGame();
}
// Print results every 100 frames
static int frameCount = 0;
if (++frameCount % 100 == 0) {
gProfiler.printResults();
}
}
External Profiling Tools
Many external tools can provide more detailed profiling information:
-
Platform-specific profilers:
- Visual Studio Profiler
- Xcode Instruments
- Linux perf
-
Third-party profilers:
- Valgrind
- Intel VTune
- AMD CodeXL
- Tracy
- Remotery
Common Optimization Techniques
Once you've identified bottlenecks, here are some common optimization techniques:
1. Algorithmic Optimizations
Improving the algorithm can often yield the most significant performance gains:
// O(n^2) algorithm
bool hasDuplicate(const std::vector<int>& values) {
for (size_t i = 0; i < values.size(); ++i) {
for (size_t j = i + 1; j < values.size(); ++j) {
if (values[i] == values[j]) {
return true;
}
}
}
return false;
}
// O(n log n) algorithm
bool hasDuplicateOptimized(std::vector<int> values) {
std::sort(values.begin(), values.end());
for (size_t i = 1; i < values.size(); ++i) {
if (values[i] == values[i - 1]) {
return true;
}
}
return false;
}
// O(n) algorithm
bool hasDuplicateOptimized2(const std::vector<int>& values) {
std::unordered_set<int> seen;
for (int value : values) {
if (seen.count(value) > 0) {
return true;
}
seen.insert(value);
}
return false;
}
2. Batch Processing
Processing data in batches can improve cache utilization and reduce function call overhead:
// Individual processing
void updateEntities(EntityData& entities, float deltaTime) {
for (size_t i = 0; i < entities.positions.size(); ++i) {
updateEntity(entities, i, deltaTime);
}
}
void updateEntity(EntityData& entities, size_t index, float deltaTime) {
entities.positions[index] += entities.velocities[index] * deltaTime;
// Other entity updates...
}
// Batch processing
void updateEntitiesBatched(EntityData& entities, float deltaTime) {
// Update positions
for (size_t i = 0; i < entities.positions.size(); ++i) {
entities.positions[i] += entities.velocities[i] * deltaTime;
}
// Update rotations
for (size_t i = 0; i < entities.rotations.size(); ++i) {
entities.rotations[i] += entities.angularVelocities[i] * deltaTime;
}
// Other batch updates...
}
3. Avoiding Branches
Branch mispredictions can significantly impact performance. Techniques like branch-free code can help:
// Branching version
int abs(int x) {
if (x < 0) {
return -x;
} else {
return x;
}
}
// Branch-free version
int absBranchless(int x) {
int mask = x >> 31; // All 1s if negative, all 0s if positive
return (x ^ mask) - mask;
}
// Another example: min/max
int min(int a, int b) {
return a < b ? a : b;
}
int minBranchless(int a, int b) {
return b + ((a - b) & ((a - b) >> 31));
}
4. Loop Unrolling
Unrolling loops can reduce loop overhead and improve instruction-level parallelism:
// Regular loop
void vectorAdd(const float* a, const float* b, float* result, size_t count) {
for (size_t i = 0; i < count; ++i) {
result[i] = a[i] + b[i];
}
}
// Unrolled loop
void vectorAddUnrolled(const float* a, const float* b, float* result, size_t count) {
size_t i = 0;
// Process 4 elements at a time
for (; i + 3 < count; i += 4) {
result[i] = a[i] + b[i];
result[i + 1] = a[i + 1] + b[i + 1];
result[i + 2] = a[i + 2] + b[i + 2];
result[i + 3] = a[i + 3] + b[i + 3];
}
// Handle remaining elements
for (; i < count; ++i) {
result[i] = a[i] + b[i];
}
}
5. Compiler Optimizations
Modern compilers have powerful optimization capabilities. Use appropriate compiler flags and hints:
// Inform the compiler about pointer aliasing
void addVectors(const float* __restrict a, const float* __restrict b, float* __restrict result, size_t count) {
for (size_t i = 0; i < count; ++i) {
result[i] = a[i] + b[i];
}
}
// Hint for branch prediction
if (__builtin_expect(condition, 1)) { // Expect condition to be true
// Likely path
} else {
// Unlikely path
}
// Force inline
__forceinline void criticalFunction() {
// Function body
}
Case Study: Optimizing a Particle System
Let's apply these optimization techniques to a particle system:
Initial Implementation
struct Particle {
Vector3 position;
Vector3 velocity;
float lifetime;
float size;
Color color;
};
class ParticleSystem {
private:
std::vector<Particle> particles;
public:
void update(float deltaTime) {
for (auto it = particles.begin(); it != particles.end(); ) {
it->lifetime -= deltaTime;
if (it->lifetime <= 0.0f) {
it = particles.erase(it);
} else {
it->position += it->velocity * deltaTime;
it->size *= 0.99f;
it->color.a = it->lifetime / 2.0f; // Fade out
++it;
}
}
}
void render() {
for (const auto& particle : particles) {
drawSprite(particle.position, particle.size, particle.color);
}
}
void emit(const Vector3& position, int count) {
for (int i = 0; i < count; ++i) {
Particle particle;
particle.position = position;
particle.velocity = randomDirection() * randomFloat(1.0f, 2.0f);
particle.lifetime = randomFloat(1.0f, 2.0f);
particle.size = randomFloat(0.1f, 0.3f);
particle.color = Color(1.0f, 0.5f, 0.2f, 1.0f);
particles.push_back(particle);
}
}
};
Optimized Implementation
struct ParticleData {
std::vector<Vector3> positions;
std::vector<Vector3> velocities;
std::vector<float> lifetimes;
std::vector<float> sizes;
std::vector<Color> colors;
size_t count = 0;
size_t capacity = 0;
};
class ParticleSystem {
private:
ParticleData particles;
LinearAllocator frameAllocator;
TaskSystem taskSystem;
public:
ParticleSystem(size_t maxParticles = 10000)
: frameAllocator(maxParticles * sizeof(Vector3) * 2) // For temporary data
{
// Pre-allocate memory
particles.positions.resize(maxParticles);
particles.velocities.resize(maxParticles);
particles.lifetimes.resize(maxParticles);
particles.sizes.resize(maxParticles);
particles.colors.resize(maxParticles);
particles.capacity = maxParticles;
}
void update(float deltaTime) {
ScopedProfiler profiler("ParticleSystem.update");
// Allocate temporary arrays for parallel processing
Vector3* newPositions = static_cast<Vector3*>(
frameAllocator.allocate(particles.count * sizeof(Vector3)));
float* newLifetimes = static_cast<float*>(
frameAllocator.allocate(particles.count * sizeof(float)));
// Update particles in parallel
taskSystem.parallelFor(0, particles.count, [&](size_t i) {
newLifetimes[i] = particles.lifetimes[i] - deltaTime;
newPositions[i] = particles.positions[i] + particles.velocities[i] * deltaTime;
particles.sizes[i] *= 0.99f;
particles.colors[i].a = newLifetimes[i] / 2.0f; // Fade out
});
// Remove dead particles (compact arrays)
size_t newCount = 0;
for (size_t i = 0; i < particles.count; ++i) {
if (newLifetimes[i] > 0.0f) {
if (newCount != i) {
particles.positions[newCount] = newPositions[i];
particles.velocities[newCount] = particles.velocities[i];
particles.lifetimes[newCount] = newLifetimes[i];
particles.sizes[newCount] = particles.sizes[i];
particles.colors[newCount] = particles.colors[i];
} else {
particles.positions[newCount] = newPositions[i];
particles.lifetimes[newCount] = newLifetimes[i];
}
newCount++;
}
}
particles.count = newCount;
// Reset frame allocator
frameAllocator.reset();
}
void render() {
ScopedProfiler profiler("ParticleSystem.render");
// Batch rendering
beginBatchDraw();
for (size_t i = 0; i < particles.count; ++i) {
addSpriteToBatch(particles.positions[i], particles.sizes[i], particles.colors[i]);
}
endBatchDraw();
}
void emit(const Vector3& position, int count) {
ScopedProfiler profiler("ParticleSystem.emit");
size_t newCount = std::min(particles.count + count, particles.capacity);
size_t emitCount = newCount - particles.count;
for (size_t i = 0; i < emitCount; ++i) {
size_t index = particles.count + i;
particles.positions[index] = position;
particles.velocities[index] = randomDirection() * randomFloat(1.0f, 2.0f);
particles.lifetimes[index] = randomFloat(1.0f, 2.0f);
particles.sizes[index] = randomFloat(0.1f, 0.3f);
particles.colors[index] = Color(1.0f, 0.5f, 0.2f, 1.0f);
}
particles.count = newCount;
}
};
The optimized implementation uses:
- Structure of Arrays (SoA) for better cache locality
- Pre-allocated memory to avoid dynamic allocations
- Parallel processing for updates
- Batch rendering to reduce draw calls
- Profiling to identify bottlenecks
Conclusion
Performance optimization is an ongoing process that requires careful measurement, analysis, and improvement. By applying data-oriented design principles, efficient memory management, multithreading, and proper profiling, you can create a high-performance game engine that runs smoothly on a variety of hardware.
Key takeaways:
- Focus on data organization and memory access patterns
- Use custom allocators tailored to your specific needs
- Leverage multiple CPU cores through effective multithreading
- Profile your code to identify and address bottlenecks
- Apply optimization techniques like SIMD, loop unrolling, and batch processing
Remember that premature optimization can lead to more complex, harder-to-maintain code. Always measure first, then optimize the parts that actually need it. With the techniques covered in this section, you'll be well-equipped to create a performant game engine that delivers a smooth and responsive gaming experience.
Recommended Resources
Articles and Blog Posts
1. The Faster You Unlearn OOP, The Better For You And Your Software
- Author: Dawid Ciężarkiewicz
- Key Points:
- Criticizes OOP for encouraging complexity and poor data architecture
- Emphasizes "data is more important than code" principle
- Recommends focusing on data organization first, then code
- Discusses how OOP leads to performance issues through excessive indirection
- Suggests alternatives like data-oriented design
2. Data-Oriented Design
- Author: Noel Llopis
- Key Points:
- Explains the fundamental principles of data-oriented design
- Contrasts DOD with OOP approaches
- Provides practical examples of transforming OOP code to DOD
- Discusses cache coherency and performance benefits
3. How was game development done in C with no OOP?
- Source: Stack Overflow
- Key Points:
- Historical perspective on pre-OOP game development
- Explains procedural approaches to game architecture
- Discusses how C games managed entity systems without OOP
- Shows that OOP is just an abstraction, not a requirement
4. Entity Component System vs. OO Game Engine
- Author: Paul Gestwicki
- Key Points:
- Compares Entity Component System (ECS) with traditional OOP
- Explains how ECS avoids inheritance hierarchies
- Discusses composition over inheritance
Video Tutorials
1. Intro to Data Oriented Design for Games
- Creator: Nic Barker
- Duration: ~30 minutes
- Key Points:
- Introduction to data-oriented thinking for game developers
- Explains how CPU caches work and why they matter
- Demonstrates practical examples of data-oriented design
- Shows performance improvements over OOP approaches
2. CppCon 2014: Mike Acton "Data-Oriented Design and C++"
- Speaker: Mike Acton (Former Engine Director at Insomniac Games)
- Duration: 1 hour
- Key Points:
- Foundational talk on data-oriented design
- Explains the "three big lies" of OOP
- Demonstrates practical examples of transforming OOP code
- Focuses on real-world performance considerations
3. CppCon 2018: Stoyan Nikolov "OOP Is Dead, Long Live Data-Oriented Design"
- Speaker: Stoyan Nikolov
- Duration: ~1 hour
- Key Points:
- Explains why OOP often leads to performance problems
- Shows how data-oriented design improves cache utilization
- Provides practical examples of refactoring from OOP to DOD
Open-Source Projects and Examples
1. Amethyst
- Language: Rust
- Description: Data-oriented and data-driven game engine
- Key Features:
- Entity Component System (ECS) architecture
- Parallel processing by design
- Clean separation of data and behavior
- Note: Development has halted, but codebase remains valuable for study
2. Bevy
- Language: Rust
- Description: A refreshingly simple data-driven game engine
- Key Features:
- Modern ECS architecture
- Cache-friendly data layout
- Excellent documentation for learning
3. EnTT
- Language: C++
- Description: Gaming meets modern C++ - a fast and reliable entity component system
- Key Features:
- Header-only library
- Compile-time dispatch system
- Used in Minecraft Bedrock Edition
4. DOOM (1993) Source Code
- Language: C
- Description: Original source code for the classic DOOM game
- Key Features:
- Historical example of non-OOP game architecture
- Procedural design with clear data structures
- Efficient memory management techniques
Books and In-depth Resources
1. Game Engine Architecture
- Author: Jason Gregory
- Relevance: While not exclusively about non-OOP approaches, contains valuable sections on data-oriented design and performance considerations
2. Game Programming Patterns
- Author: Robert Nystrom
- Relevance: Covers various design patterns including data locality and component systems that work well in non-OOP contexts
- Note: Available free online
3. Data-Oriented Design Book
- Author: Richard Fabian
- Relevance: Comprehensive guide to data-oriented design principles
- Note: Available free online
Community Resources
1. Data-Oriented Design Community
- Reddit community focused on DOD discussions and resources
2. Handmade Network
- Community focused on software craftsmanship with emphasis on performance and understanding how things work
- Many members practice data-oriented and procedural programming approaches
3. Game Programming Discord Servers
- Various Discord communities with channels dedicated to engine development and DOD
Tools and Frameworks
1. SDL (Simple DirectMedia Layer)
- Low-level C library that provides direct access to graphics, audio, and input hardware
- Excellent foundation for building non-OOP game engines
2. SFML
- C++ library with a simple API for graphics, audio, networking, etc.
- Can be used in a procedural style despite being written in C++
3. Raylib
- Simple and easy-to-use library for videogame programming
- C-based with procedural design philosophy
- Excellent for learning non-OOP game development