Phase 2 Months 7 – 12

Systems & Performance

This is where you develop a genuine edge: advanced C++, profiling tools, OS internals, networking, and your first real domain project. By the end of this phase you should be competitive for entry-level and junior quant dev roles, and able to pass a technical phone screen at mid-level.

01 Advanced C++

Templates, memory model & advanced features (5 topics)

Templates and metaprogramming basics: Function templates, class templates, template specialisation. Understand how the compiler instantiates templates. SFINAE basics. C++20 concepts as a cleaner alternative.
Memory model and atomics: std::atomic, std::memory_order. Sequential consistency vs relaxed ordering. Acquire/release semantics. This underpins everything lock-free in Phase 3.
Cache-friendly data structures: Struct of arrays vs array of structs. False sharing. Cache line size (64 bytes on most current x86-64 CPUs). Why linked lists are slower than vectors in practice despite O(1) insert.
Undefined behaviour deep dive: Strict aliasing, signed integer overflow, unsequenced modifications, lifetime violations. Use -fsanitize=address,undefined to catch many of them at runtime (the sanitizers do not detect every case, strict aliasing violations in particular).
Read Effective Modern C++ (Meyers): Work through all 42 items. Write the code for each one. This book bridges "knows C++" and "thinks in C++".
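To make the struct-of-arrays vs array-of-structs item above concrete, here is a minimal sketch. The type names (OrderAoS, OrdersSoA) and the three fields are hypothetical, chosen only for illustration. Both layouts compute the same sum; the difference is how much unrelated data each loop drags through the cache.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Array of structs: every element carries all fields, so a loop that only
// needs `price` still pulls `quantity` and `id` into the cache.
struct OrderAoS {
    double price;
    double quantity;
    long   id;
};

// Struct of arrays: each field is stored contiguously, so a loop over
// `price` touches only price data.
struct OrdersSoA {
    std::vector<double> price;
    std::vector<double> quantity;
    std::vector<long>   id;

    void push_back(double p, double q, long i) {
        price.push_back(p);
        quantity.push_back(q);
        id.push_back(i);
    }

    std::size_t size() const {
        // Invariant: the three columns always have the same length.
        assert(price.size() == quantity.size() && price.size() == id.size());
        return price.size();
    }
};

// Strided access: 8 useful bytes out of every 24-byte element (typical x86-64).
double sum_prices(const std::vector<OrderAoS>& orders) {
    double total = 0.0;
    for (const OrderAoS& o : orders) total += o.price;
    return total;
}

// Dense access: every byte fetched is a price.
double sum_prices(const OrdersSoA& orders) {
    double total = 0.0;
    for (double p : orders.price) total += p;
    return total;
}
```

Timing the two sum_prices overloads on a few million elements is a reasonable first experiment; the size of the gap depends on the hardware and on how many fields the struct carries.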

HFT roles often demand cache-aware data structures and lock-free programming. You cannot design a cache-friendly structure without understanding the memory hierarchy. You cannot write lock-free code without understanding atomics and the memory model.
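As a small illustration of acquire/release semantics, the sketch below hands one value from a producer thread to a consumer thread. The Mailbox class is hypothetical and assumes exactly one producer and one consumer; it is not a general-purpose queue.

```cpp
#include <atomic>
#include <cassert>

// Single-slot mailbox for ONE producer thread and ONE consumer thread.
class Mailbox {
public:
    // Producer side. Returns false if the previous value was not consumed yet.
    bool publish(int value) {
        // Acquire pairs with the consumer's release below, so the consumer's
        // read of payload_ is finished before we overwrite it.
        if (ready_.load(std::memory_order_acquire)) return false;
        payload_ = value;                               // plain write
        ready_.store(true, std::memory_order_release);  // publish the write
        return true;
    }

    // Consumer side. Returns false if nothing has been published.
    bool try_consume(int& out) {
        // Acquire pairs with the producer's release: once we observe
        // ready_ == true, the write to payload_ is guaranteed to be visible.
        if (!ready_.load(std::memory_order_acquire)) return false;
        out = payload_;
        ready_.store(false, std::memory_order_release); // hand the slot back
        return true;
    }

private:
    int payload_ = 0;
    std::atomic<bool> ready_{false};
};
```

Replacing the release store in publish with memory_order_relaxed would remove the guarantee that the consumer sees the payload write, which is exactly the kind of bug that rarely shows up on x86 and then appears on weaker hardware.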

Resources

Effective Modern C++ (Meyers)
C++ High Performance (Andrist & Sehr)
CppCon: "The C++ Memory Model" (Herb Sutter)

02 Profiling & Optimisation

Tools and techniques (9 tasks)

Learn perf: Linux perf stat, perf record, perf report. Understand IPC, cache-misses, and branch-misses as reported metrics. Profile a real program you wrote.
Learn Valgrind + Cachegrind: Memory error detection. Cache simulation. Run your data structure implementations through both.
Learn gprof or VTune: Function-level profiling. Hotspot identification. Intel VTune is widely used in HFT environments.
Create microbenchmarks (cache-miss impact): Traverse an array sequentially vs randomly. Measure the difference. Understand why spatial locality matters.
Create microbenchmarks (branch-prediction impact): Sorted vs unsorted array in a branch-heavy loop. Measure. Know what a branch predictor does and when it fails.
Optimise using inline functions: When does inlining help? When does it hurt? __attribute__((always_inline)) vs the compiler's decision.
Optimise using data alignment: alignas(). Padding in structs. Why does alignment affect performance?
Optimise using loop unrolling: #pragma GCC unroll N. When the compiler does it automatically. Trade-offs with instruction cache.
Read What Every Programmer Should Know About Memory: Ulrich Drepper's free paper. Long but foundational. Read sections 1–4 as a minimum.
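The cache-miss microbenchmark in the list above could start from something like the following sketch. All function names are hypothetical. Both traversals go through an index array so that the only difference between them is the order in which the data is touched.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <random>
#include <vector>

// Sum `data`, visiting elements in the order given by `order`.
std::uint64_t sum_in_order(const std::vector<std::uint32_t>& data,
                           const std::vector<std::size_t>& order) {
    std::uint64_t total = 0;
    for (std::size_t idx : order) total += data[idx];
    return total;
}

struct TraversalResult {
    std::uint64_t sum;          // returned so the loop cannot be optimised away
    double nanos_per_element;   // wall-clock cost per visited element
};

TraversalResult time_traversal(const std::vector<std::uint32_t>& data,
                               const std::vector<std::size_t>& order) {
    assert(!order.empty());
    auto start = std::chrono::steady_clock::now();
    std::uint64_t sum = sum_in_order(data, order);
    auto stop = std::chrono::steady_clock::now();
    double ns = std::chrono::duration<double, std::nano>(stop - start).count();
    return {sum, ns / static_cast<double>(order.size())};
}

// Indices 0, 1, ..., n-1: good spatial locality.
std::vector<std::size_t> sequential_order(std::size_t n) {
    std::vector<std::size_t> order(n);
    std::iota(order.begin(), order.end(), std::size_t{0});
    return order;
}

// The same indices, shuffled with a fixed seed: poor spatial locality.
std::vector<std::size_t> shuffled_order(std::size_t n, unsigned seed) {
    std::vector<std::size_t> order = sequential_order(n);
    std::mt19937 rng(seed);
    std::shuffle(order.begin(), order.end(), rng);
    return order;
}
```

For the effect to show up, the data should be much larger than the last-level cache (tens of millions of elements is a reasonable starting point), and each variant should be run several times. For serious measurements, the Google Benchmark library listed under Resources handles warm-up and repetition.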

Being able to profile your own code and explain performance characteristics is a senior-level signal. At quant firms it is table stakes. This skill is also what makes your order book project go from interesting to impressive.
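For the data-alignment item in the list above, here is a minimal sketch of using alignas() to give each counter its own cache line. The 64-byte line size is an assumption (typical for x86-64), and the struct names are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kCacheLine = 64;  // assumption: 64-byte cache lines

// Two counters updated by different threads. Packed like this they will
// usually share a cache line, so each write invalidates the other core's copy.
struct PackedCounters {
    std::atomic<long> a{0};
    std::atomic<long> b{0};
};

// alignas() forces the start address to a multiple of kCacheLine and pads
// the size up to kCacheLine, so no two PaddedCounter objects share a line.
struct alignas(kCacheLine) PaddedCounter {
    std::atomic<long> value{0};
};

struct PaddedCounters {
    PaddedCounter a;
    PaddedCounter b;
};

static_assert(alignof(PaddedCounter) == kCacheLine, "counter is line-aligned");
static_assert(sizeof(PaddedCounter) == kCacheLine, "counter fills one line");
static_assert(sizeof(PaddedCounters) == 2 * kCacheLine, "one line per counter");
```

The cost is memory: each counter now occupies a full line instead of eight bytes. A worthwhile experiment is to increment a and b from two threads with both layouts and compare the cache-miss counts under perf stat.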

Resources

What Every Programmer Should Know About Memory (Drepper, free)
Google Benchmark library (C++)
Brendan Gregg: perf examples

03 OS Internals

Deep Linux knowledge (5 topics)

Context switching (mechanics and cost): What does the kernel save and restore? How long does a context switch take (~1–10μs)? Why does this matter for latency?
CPU scheduling: CFS (Completely Fair Scheduler; replaced by EEVDF as the default scheduler from Linux 6.6). Real-time scheduling policies (SCHED_FIFO, SCHED_RR). CPU pinning (taskset, sched_setaffinity).
Interrupts and interrupt handling: Hardware interrupts. Interrupt coalescing. Why network interrupts can kill latency and how to mitigate (IRQ affinity, NAPI).
Experiment with eBPF tracing: Write a simple eBPF program to trace system calls or measure latency distributions. Tools: bcc, bpftrace.
Experiment with kernel parameter tuning: /proc/sys/net, /sys/kernel. Disable transparent huge pages. CPU frequency scaling (performance governor). NUMA balancing.
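CPU pinning from the scheduling item above can be done from inside a program as well as with taskset. The sketch below is Linux-specific and relies on glibc's sched_setaffinity and CPU_* macros (g++ defines _GNU_SOURCE by default, which these need). The helper names are hypothetical.

```cpp
#include <cassert>
#include <sched.h>  // sched_getaffinity, sched_setaffinity, CPU_* (Linux, glibc)

// Lowest CPU index the calling thread is currently allowed to run on,
// or -1 if the affinity mask could not be read.
int first_allowed_cpu() {
    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0) return -1;  // 0 = caller
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
        if (CPU_ISSET(cpu, &set)) return cpu;
    }
    return -1;
}

// Restrict the calling thread to a single CPU. Returns true on success.
bool pin_to_cpu(int cpu) {
    if (cpu < 0 || cpu >= CPU_SETSIZE) return false;
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set) == 0;
}
```

Pinning only stops the scheduler from migrating the thread; other tasks can still be scheduled on the same core unless the core is also isolated, which is a separate kernel-level configuration step.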

HFT systems frequently bypass or tune the kernel to shave microseconds. Understanding how the scheduler and interrupt system work is what separates a systems engineer from a software engineer in this domain.

Resources

Linux Insides (free online book)
Systems Performance (Brendan Gregg)
eBPF.io (getting started)

04 Networking Fundamentals

Sockets, TCP/UDP, and latency measurement (4 tasks)

Implement TCP server/client in C++: Raw POSIX sockets. socket(), connect(), bind(), listen(), accept(). Handle partial reads. Non-blocking mode with select() or epoll (epoll_create1(), epoll_ctl(), epoll_wait()).
Learn TCP vs UDP deeply: Beyond "TCP is reliable, UDP is fast." Nagle's algorithm (disable it with TCP_NODELAY for low latency). TCP handshake overhead. Why HFT uses UDP multicast for market data.
Implement UDP multicast listener: Join a multicast group. Receive simulated price ticks. Parse a simple binary format. Measure inter-packet latency.
Measure latency between two machines: Set up two processes (on one host first, then on two machines), send a message, measure round-trip time. Use CLOCK_MONOTONIC. Understand the difference between latency and throughput.
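"Handle partial reads" in the first item above means that a single read() on a stream socket may return fewer bytes than requested, so the caller has to loop. A minimal sketch, assuming a blocking file descriptor (the helper names read_exact and write_all are hypothetical):

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <unistd.h>  // read, write, pipe, close (POSIX)

// Read exactly `len` bytes from a BLOCKING fd into `buf`.
// Returns false if the peer closes early or a non-recoverable error occurs.
bool read_exact(int fd, void* buf, std::size_t len) {
    char* out = static_cast<char*>(buf);
    std::size_t got = 0;
    while (got < len) {
        ssize_t n = ::read(fd, out + got, len - got);
        if (n > 0) {
            got += static_cast<std::size_t>(n);  // partial read: keep going
        } else if (n == 0) {
            return false;                        // EOF before the full message
        } else if (errno != EINTR) {
            return false;                        // real error (EINTR: retry)
        }
    }
    return true;
}

// Write exactly `len` bytes; write() can also be partial.
bool write_all(int fd, const void* buf, std::size_t len) {
    const char* in = static_cast<const char*>(buf);
    std::size_t sent = 0;
    while (sent < len) {
        ssize_t n = ::write(fd, in + sent, len - sent);
        if (n > 0) {
            sent += static_cast<std::size_t>(n);
        } else if (n < 0 && errno != EINTR) {
            return false;
        }
    }
    return true;
}
```

The same functions work on a connected TCP socket, since read() and write() accept any stream file descriptor. With a non-blocking socket, EAGAIN would need separate handling (typically returning to the select()/epoll loop), which this sketch does not attempt.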

Market data in HFT travels via UDP multicast. Order execution goes via TCP (often with Nagle disabled). Writing socket code yourself makes network latency tangible — you stop thinking in milliseconds and start thinking in microseconds.
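For the latency-measurement task, a sketch of the timing side only: a CLOCK_MONOTONIC timestamp helper, a loop that times a caller-supplied round trip, and a nearest-rank percentile. All names are hypothetical, and the round trip itself (send a message, wait for the echo) is left to the caller.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <time.h>  // clock_gettime, CLOCK_MONOTONIC (POSIX)
#include <vector>

// Monotonic timestamp in nanoseconds. Unlike wall-clock time, this never
// jumps backwards when the system clock is adjusted.
std::int64_t now_ns() {
    timespec ts{};
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return static_cast<std::int64_t>(ts.tv_sec) * 1000000000LL + ts.tv_nsec;
}

// Call `round_trip` `iterations` times and record each duration in ns.
template <typename Fn>
std::vector<std::int64_t> time_calls(Fn&& round_trip, std::size_t iterations) {
    std::vector<std::int64_t> samples;
    samples.reserve(iterations);
    for (std::size_t i = 0; i < iterations; ++i) {
        const std::int64_t start = now_ns();
        round_trip();
        samples.push_back(now_ns() - start);
    }
    return samples;
}

// Nearest-rank percentile, p in (0, 100]. Takes a copy because it sorts.
std::int64_t percentile(std::vector<std::int64_t> samples, double p) {
    assert(!samples.empty() && p > 0.0 && p <= 100.0);
    std::sort(samples.begin(), samples.end());
    std::size_t rank = static_cast<std::size_t>(
        std::ceil(p / 100.0 * static_cast<double>(samples.size())));
    if (rank == 0) rank = 1;
    return samples[rank - 1];
}
```

Reporting the median together with a high percentile such as p99 is usually more informative than the mean, because latency distributions tend to have long tails.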

Resources

Beej's Guide to Network Programming (free)
TCP/IP Illustrated Vol. 1 (Stevens)

05 Trading Domain: Introduction

Market structure & HFT architecture (4 topics)

Understand order books: Bid/ask spread. Limit orders vs market orders. Price-time priority. What happens when a market order arrives. Level 1 vs Level 2 data.
Understand exchanges and market microstructure: How do exchanges match orders? What is co-location? What is direct market access? Why does latency to the exchange matter?
Learn market data feeds: Binary protocols. FAST compression. SBE (Simple Binary Encoding). UDP multicast distribution. Sequence numbers and gap detection.
Read architecture examples of HFT systems: A complete HFT engine has: network stack → protocol parsing → order book management → strategy → execution → risk. Draw this diagram and explain each block.
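"Sequence numbers and gap detection" in the market data item above can be sketched as follows. This assumes a feed where every packet carries one monotonically increasing sequence number starting at a known value; real feeds differ in detail (for example, a packet may cover a range of sequence numbers), and the class name is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

enum class SeqStatus { InOrder, Gap, Duplicate };

// Tracks the next expected sequence number on a UDP feed, where packets
// can be lost, duplicated, or arrive late.
class GapDetector {
public:
    explicit GapDetector(std::uint64_t first_expected = 1)
        : expected_(first_expected) {}

    SeqStatus on_packet(std::uint64_t seq) {
        if (seq == expected_) {
            ++expected_;
            return SeqStatus::InOrder;
        }
        if (seq < expected_) {
            return SeqStatus::Duplicate;  // already seen, or arrived late
        }
        // seq > expected_: packets expected_ .. seq-1 have not been seen.
        missing_ += seq - expected_;
        expected_ = seq + 1;
        return SeqStatus::Gap;
    }

    std::uint64_t expected() const { return expected_; }
    std::uint64_t missing() const { return missing_; }  // total packets skipped

private:
    std::uint64_t expected_;
    std::uint64_t missing_ = 0;
};
```

On a Gap, a production feed handler would typically request a retransmission or fall back to a snapshot; here the detector only counts what was skipped.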

An order book project in C++ signals to Optiver/IMC that you understand their domain. Most applicants have no idea what a matching engine is. You will — and you will have built one.

Resources

Trading and Exchanges (Larry Harris)
Architecture of HFT Systems (Medium)

06 Portfolio, Phase 1: Limit Order Book

Build a basic matching engine in C++ (5 deliverables)

Support add order: Limit orders with price, quantity, side. Efficient bid/ask book representation (sorted maps or custom structures).
Support cancel order: O(1) cancel by order ID. How do you index into the book efficiently?
Support order matching: Price-time priority. Match a market order against the book. Partial fills. Generate trade records.
Write unit tests: Test edge cases: crossed book, zero quantity, duplicate IDs. Show correctness before performance.
Publish on GitHub with a clear README: Explain your data structure choices and their complexity. This README is read in interviews. Write it like you're explaining to a senior HFT engineer.
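One possible starting point for the deliverables above, written for correctness rather than speed. It uses the "sorted maps" option from the first item, integer prices in ticks, and an ID index for cancels. Every name here is hypothetical. Two simplifications are deliberate: limit orders rest without being matched against the opposite side, and any unfilled market quantity is dropped.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <iterator>
#include <list>
#include <map>
#include <unordered_map>
#include <vector>

enum class Side { Buy, Sell };
struct Order { std::uint64_t id; std::int64_t price; std::uint64_t qty; };
struct Trade { std::uint64_t resting_id; std::int64_t price; std::uint64_t qty; };
// Where a resting order lives, so cancel can jump straight to it.
struct Handle { Side side; std::int64_t price; std::list<Order>::iterator pos; };

class OrderBook {
public:
    // Rest a limit order. Rejects zero quantity and duplicate IDs.
    bool add_limit(std::uint64_t id, Side side, std::int64_t price, std::uint64_t qty) {
        if (qty == 0 || index_.count(id) != 0) return false;
        std::list<Order>& level = (side == Side::Buy) ? bids_[price] : asks_[price];
        level.push_back(Order{id, price, qty});  // back of the queue = time priority
        index_.emplace(id, Handle{side, price, std::prev(level.end())});
        return true;
    }

    // Hash lookup plus list erase are O(1) on average; removing a price
    // level that became empty costs O(log L) in the number of levels.
    bool cancel(std::uint64_t id) {
        auto it = index_.find(id);
        if (it == index_.end()) return false;
        if (it->second.side == Side::Buy) erase_resting(bids_, it->second);
        else erase_resting(asks_, it->second);
        index_.erase(it);
        return true;
    }

    // A market buy consumes asks from the lowest price upwards;
    // a market sell consumes bids from the highest price downwards.
    std::vector<Trade> match_market(Side side, std::uint64_t qty) {
        return (side == Side::Buy) ? consume(asks_, qty) : consume(bids_, qty);
    }

    std::size_t order_count() const { return index_.size(); }

private:
    template <typename Levels>
    static void erase_resting(Levels& levels, const Handle& h) {
        auto level = levels.find(h.price);
        assert(level != levels.end());
        level->second.erase(h.pos);
        if (level->second.empty()) levels.erase(level);
    }

    template <typename Levels>
    std::vector<Trade> consume(Levels& levels, std::uint64_t qty) {
        std::vector<Trade> trades;
        while (qty > 0 && !levels.empty()) {
            auto level = levels.begin();             // best price first
            Order& resting = level->second.front();  // oldest order at that price
            const std::uint64_t fill = std::min(qty, resting.qty);
            trades.push_back(Trade{resting.id, resting.price, fill});
            qty -= fill;
            resting.qty -= fill;                     // partial fill stays in place
            if (resting.qty == 0) {
                index_.erase(resting.id);
                level->second.pop_front();
                if (level->second.empty()) levels.erase(level);
            }
        }
        return trades;
    }

    std::map<std::int64_t, std::list<Order>, std::greater<std::int64_t>> bids_;  // highest first
    std::map<std::int64_t, std::list<Order>> asks_;                              // lowest first
    std::unordered_map<std::uint64_t, Handle> index_;
};
```

Matching incoming limit orders against the opposite side, and replacing the node-based containers with flatter ones, are natural next steps once the unit tests pass. The README is a good place to explain whichever trade-offs you end up choosing.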

This is the most impactful project for quant applications. It demonstrates domain knowledge, C++ ability, and performance thinking in one place. Start simple — correctness first, then optimise.

References

cedwies/low-latency-trading (open source reference)
How to Build a Fast Limit Order Book (classic post)

07 DSA: Continued

150 → 300 problems, timed solves (focus: trees, graphs, DP)

Trees and binary trees: Inorder/preorder/postorder traversal. BST validation. Lowest common ancestor. Serialisation.
Graphs: Adjacency list, DFS, BFS, topological sort, union-find. Dijkstra's algorithm. Detect cycles.
Heap patterns: K-th largest/smallest. Merge K sorted lists. Top-K problems.
Start DP (1D patterns): Climbing stairs, house robber, coin change, longest increasing subsequence. Understand the recurrence before the code.
Timed solves from week 1 of Phase 2: 25-minute hard limit per problem. If stuck, read approach only. This is the transition from practice to interview-readiness.
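"Understand the recurrence before the code" can be illustrated with coin change from the 1D DP item above. The recurrence is dp[0] = 0 and dp[a] = 1 + min over coins c ≤ a of dp[a − c]; the sketch below is a direct bottom-up translation of it (the function name is arbitrary).

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Fewest coins needed to make `amount`, or -1 if it cannot be made.
// dp[a] holds the answer for sub-amount a, filled from 0 upwards.
int coin_change(const std::vector<int>& coins, int amount) {
    assert(amount >= 0);
    const int kUnreachable = amount + 1;  // larger than any valid answer
    std::vector<int> dp(static_cast<std::size_t>(amount) + 1, kUnreachable);
    dp[0] = 0;                            // base case: zero coins make zero
    for (int a = 1; a <= amount; ++a) {
        for (int c : coins) {
            if (c > 0 && c <= a) {
                dp[a] = std::min(dp[a], dp[a - c] + 1);
            }
        }
    }
    return dp[amount] >= kUnreachable ? -1 : dp[amount];
}
```

With coins {1, 2, 5} and amount 11, the table ends with dp[11] = 3 (5 + 5 + 1). Time is O(amount × number of coins) and space is O(amount).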

08 Math & Probability: Continued

Deeper probability + first brainteasers (ongoing, 3 per week)

Law of Large Numbers and Central Limit Theorem: Why the average of many i.i.d. random variables converges. What "in distribution" means. Why normal distributions appear everywhere.
Markov chains: Transition matrices. Stationary distributions. Absorbing states. "How many steps to reach state X?"
Brainteasers at interview pace: Work through 3 per week. Write your reasoning step-by-step. Common types: expected value problems, geometric probability, combinatorial counting, game theory basics.
Finish Blitzstein & Hwang: Complete the book by end of this phase. Work through the exercises, not just the reading.
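As a worked instance of the "how many steps to reach state X?" question from the Markov chains item, take a standard brainteaser: the expected number of fair coin flips until two heads appear in a row. The notation is chosen here for illustration: E_0 is the expected number of remaining flips with no current streak, E_1 with one head just flipped, and the absorbing state (two heads) needs 0 more flips. Conditioning on the next flip gives:

```latex
\begin{aligned}
E_0 &= 1 + \tfrac{1}{2} E_1 + \tfrac{1}{2} E_0, \\
E_1 &= 1 + \tfrac{1}{2} \cdot 0 + \tfrac{1}{2} E_0 = 1 + \tfrac{1}{2} E_0, \\
E_0 &= 1 + \tfrac{1}{2}\left(1 + \tfrac{1}{2} E_0\right) + \tfrac{1}{2} E_0
     = \tfrac{3}{2} + \tfrac{3}{4} E_0
     \;\Longrightarrow\; E_0 = 6, \quad E_1 = 4.
\end{aligned}
```

The same first-step conditioning works for any absorbing chain: write one equation per transient state, then solve the linear system.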

Milestone — End of Month 12

You should be able to:
Pass a technical phone screen at a quant firm.
Explain cache misses, the C++ memory model, and context switching in an interview.
Show a working limit order book on GitHub with a quality README.
Solve probability brainteasers without hints.

You should also have solved 300 LeetCode problems, with medium difficulty feeling comfortable, and finished Blitzstein & Hwang. At this point you resemble a junior-to-mid quant dev and are genuinely competitive for interviews.
