STOCK EXCHANGE
RESOURCES
- Alex Xu, System Design Interview Volume 2 (stock exchange chapter)
- “How to Build an Exchange”, a talk by Jane Street – https://www.youtube.com/watch?v=b1e4t2k2KJY
REQUIREMENTS
- Order placement, matching, and execution (see the matching sketch after this list)
- real-time price feed (market data)
- broker functionality (the client-facing side, e.g. Robinhood) is out of scope
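A minimal sketch of price-time-priority matching for a single ticker, assuming limit orders only. The Order/OrderBook names and the heap layout are illustrative assumptions, not taken from any of the resources above.

    import heapq
    from dataclasses import dataclass, field
    from itertools import count

    _seq = count()  # arrival order breaks price ties (price-time priority)

    @dataclass(order=True)
    class Order:
        sort_key: tuple = field(init=False)
        side: str = field(compare=False)    # "buy" or "sell"
        price: float = field(compare=False)
        qty: int = field(compare=False)

        def __post_init__(self):
            # buys: highest price first; sells: lowest price first
            price_key = -self.price if self.side == "buy" else self.price
            self.sort_key = (price_key, next(_seq))

    class OrderBook:
        def __init__(self):
            self.buys, self.sells = [], []   # heaps of resting orders

        def place(self, order):
            """Match against the opposite side, rest any remainder."""
            book, opposite = (self.buys, self.sells) if order.side == "buy" else (self.sells, self.buys)
            while order.qty and opposite:
                best = opposite[0]
                crosses = order.price >= best.price if order.side == "buy" else order.price <= best.price
                if not crosses:
                    break
                fill = min(order.qty, best.qty)
                print(f"executed {fill} @ {best.price}")   # execution report
                order.qty -= fill
                best.qty -= fill
                if best.qty == 0:
                    heapq.heappop(opposite)
            if order.qty:
                heapq.heappush(book, order)

    book = OrderBook()
    book.place(Order("sell", 101.0, 50))
    book.place(Order("buy", 102.0, 30))     # -> executed 30 @ 101.0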
VARIATIONS:
- sometimes the focus is on the price feed
- sometimes the focus is on the broker portion
APPROACHES:
- sharding by ticker; sidecar-deployed Redis for the wallet, with replicas kept consistent by gossiping CRDTs (see the routing sketch after this list)
- one very, very beefy mainframe
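A minimal sketch of the sharding-by-ticker routing mentioned above; the shard hosts and the hashing scheme are assumptions for illustration. The key property is that every order for a given ticker lands on the same machine, so that ticker's order book never spans shards.

    import hashlib

    MATCHING_ENGINE_SHARDS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # assumed hosts

    def shard_for(ticker: str) -> str:
        """All orders for one ticker go to one shard, so matching stays local."""
        digest = hashlib.md5(ticker.encode()).digest()
        idx = int.from_bytes(digest[:4], "big") % len(MATCHING_ENGINE_SHARDS)
        return MATCHING_ENGINE_SHARDS[idx]

    print(shard_for("AAPL"), shard_for("TSLA"))

A static ticker -> shard map (discussed in the Q&A below) is an alternative to hashing: it avoids moving tickers when shard counts change, at the cost of distributing the map to clients.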
CHECKLIST:
- load balancers
- CDNs?
- DB schemas
- partitioning keys, secondary indices
- NALSD numbers, i.e. back-of-envelope capacity estimates (return to this at the end)
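Back-of-envelope NALSD sketch, using the 100k TPS figure that comes up later in the discussion and assuming ~100 bytes per order message (both numbers are assumptions for illustration):

    orders_per_sec = 100_000          # peak order placement rate (assumed)
    bytes_per_order = 100             # rough wire size of one order message (assumed)
    trading_secs = 6.5 * 3600         # one US trading day, 09:30-16:00

    ingest_bw = orders_per_sec * bytes_per_order          # ~10 MB/s of order traffic
    orders_per_day = orders_per_sec * trading_secs        # ~2.34e9 orders/day
    storage_per_day = orders_per_day * bytes_per_order    # ~234 GB/day of raw order log

    print(f"{ingest_bw/1e6:.0f} MB/s in, {orders_per_day/1e9:.2f}B orders/day, "
          f"{storage_per_day/1e9:.0f} GB/day")

At roughly 10 MB/s of order flow, bandwidth is not the hard part; matching latency, market-data fan-out, and durability of the order log are, which is part of why the single-node option is even on the table.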
LIVE DISCUSSION QUESTIONS:
I think Alex Xu’s book actually has a very interesting take on this problem; I thought it was legit. -– Yes.
So are we going to deal with many concurrent buy and sell transactions taking place at scale? -– Correct :)
Yeah, Robinhood isn’t a market maker; they actually just pipe the order to a real market maker (MM) that performs the trades.
A little surprised that it is not using WebSockets for the live feed -– WebSockets run on top of TCP, and we can’t build on top of TCP because we care about speed; TCP handshakes and acknowledgements add a lot of latency.
Machine sizing might be interesting to discuss here since we are sticking everything into a single node. The low-level design (LLD) might also be interesting, to discuss disk I/O with regard to latency.
I read there’s an active-shadow replication setup: both replicas do the same computation in the matching engine, and if the active goes down, the shadow takes over. Sharding by ticker is what SEBI follows in India.
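A rough sketch of the active-shadow idea as I understand it (names and the heartbeat-timeout failover are assumptions; real failover protocols are more involved): both replicas consume the same sequenced order stream and compute identical books, but only the active one publishes executions, and the shadow promotes itself if heartbeats stop.

    import time

    class MatchingReplica:
        def __init__(self, role):
            self.role = role                 # "active" or "shadow"
            self.last_heartbeat = time.time()

        def on_order(self, order):
            executions = self.match(order)   # same deterministic computation on both replicas
            if self.role == "active":
                self.publish(executions)     # only the active emits executions / market data

        def on_heartbeat(self):
            self.last_heartbeat = time.time()

        def maybe_promote(self, timeout_s=0.5):
            # shadow takes over if the active has been silent for too long
            if self.role == "shadow" and time.time() - self.last_heartbeat > timeout_s:
                self.role = "active"

        def match(self, order): return []    # stand-in for the real matching engine
        def publish(self, executions): pass  # stand-in for the execution/market-data feed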
The problem with sharding is that we’ll introduce a network hop; that adds millisecond-level latency, which might not be acceptable.
Stack Overflow has machines with terabytes of RAM. I wonder if that would be able to host everything on a single machine. I didn’t do the math, so I’m not sure.
Agreed; a network hop is going to add double-digit millisecond latency. Need to check what kind of beefy machine would be able to host everything on one host.
My earlier comment around sharding: does the client need to first connect to a router that knows, for a given stock, which machine the request needs to go to? -– The client could cache which ticker is on which IP address.
We could hard-code the stock-to-machine map into the client, but that makes client updates tricky: an older client would not be able to trade any newly added stocks. -– You could just load the map at the start of the day and cache it on the client.
Brokers like Robinhood/Binance share a socket connection between themselves and the end user (app/webpage). Any idea why the stock exchange does UDP multicast? -– Think “retail investors” (college kids on Robinhood) vs. “institutional investors” (Citadel / Two Sigma / Goldman Sachs):
- college kids on robinhood (don’t necessarily require low latency)
- HFTs (require low latency)
- market makers (require low latency)
- investment banks / hedge funds (don’t necessarily require low latency)
We could do forced updates for the client, but that makes the UX pretty bad; also, what happens if the forced update fails? -– The cache has a TTL of one day.
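A sketch of the client-side cache being described: load the ticker -> host map once at the start of the day and refresh it after a one-day TTL. The fetch callback and the route format are assumptions for illustration.

    import time

    TTL_SECONDS = 24 * 3600   # "the cache has a TTL of one day"

    class TickerRouteCache:
        def __init__(self, fetch_map):
            self.fetch_map = fetch_map       # e.g. calls a routing service; assumed to exist
            self.loaded_at = 0.0
            self.routes = {}

        def host_for(self, ticker):
            if time.time() - self.loaded_at > TTL_SECONDS:
                self.routes = self.fetch_map()        # reload at start of day / on expiry
                self.loaded_at = time.time()
            return self.routes.get(ticker)            # None => newly listed ticker, refresh upstream

    cache = TickerRouteCache(lambda: {"AAPL": "10.0.0.1", "TSLA": "10.0.0.2"})
    print(cache.host_for("AAPL"))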
Don’t quote me on this, but from a first-principles perspective: order placement needs to happen over TCP, since we need to ensure the request goes through. Pricing updates can go through UDP, since TCP’s acknowledgement round trips add a lot of latency, and there’s no point in trying to deliver an old price value when a newer one has already become available. So UDP wins for the price feed.
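A minimal UDP multicast sketch of the price feed. The group address, port, and text payload are arbitrary demo choices; real exchange feeds use compact binary protocols (e.g. NASDAQ’s ITCH), but the delivery idea is the same: one send reaches every subscriber, with no connections and no retransmits.

    import socket, struct

    GROUP, PORT = "224.1.1.1", 5007   # arbitrary multicast group for the demo

    def publish(ticker, price):
        """Exchange side: one send fans out to every subscribed consumer."""
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 2)
        sock.sendto(f"{ticker} {price}".encode(), (GROUP, PORT))

    def subscribe():
        """Consumer side (HFT / market maker): join the group and read updates."""
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", PORT))
        mreq = struct.pack("4sl", socket.inet_aton(GROUP), socket.INADDR_ANY)
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
        while True:
            data, _ = sock.recvfrom(1024)
            print("price update:", data.decode())   # a stale update can simply be dropped

    publish("AAPL", 189.25)   # no handshake, no retransmits, no per-client fan-out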
Got it, that helps! I don’t have much background on UDP multicast; will definitely check it out. Any resources you have on this? Thanks.
You might not like this answer: Wikipedia.
Why is there an arrow between the Redis exchange gateways for different machines? -– Gossiping, to make the leaderless replicas of the sidecar-deployed Redis consistent with each other (see the CRDT sketch below).
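A sketch of one way the gossiped wallet state could converge without a leader, using a PN-counter-style CRDT per account. This illustrates the CRDT merge idea only; it is not what the Redis sidecar itself ships.

    class PNCounter:
        """Positive-negative counter: each replica only increments its own slots,
        so merging is a commutative, idempotent element-wise max."""
        def __init__(self, replica_id):
            self.replica_id = replica_id
            self.incs = {}   # replica_id -> total credits applied there
            self.decs = {}   # replica_id -> total debits applied there

        def credit(self, amount):
            self.incs[self.replica_id] = self.incs.get(self.replica_id, 0) + amount

        def debit(self, amount):
            self.decs[self.replica_id] = self.decs.get(self.replica_id, 0) + amount

        def merge(self, other):
            """Called when a gossip message from another replica arrives."""
            for rid, v in other.incs.items():
                self.incs[rid] = max(self.incs.get(rid, 0), v)
            for rid, v in other.decs.items():
                self.decs[rid] = max(self.decs.get(rid, 0), v)

        def balance(self):
            return sum(self.incs.values()) - sum(self.decs.values())

    a, b = PNCounter("shard-1"), PNCounter("shard-2")
    a.credit(100); b.debit(30)
    a.merge(b); b.merge(a)
    print(a.balance(), b.balance())   # both converge to 70

One caveat worth raising in the discussion: concurrent debits on different replicas can drive a CRDT balance negative before the gossip converges, so whether this is acceptable for a wallet is itself a design question.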
How is the sequencer designed? -– See Database Internals by Alex Petrov.
Percolator-style transactions -> leaderless databases use a “sequencer” component to outsource the challenge of total order broadcast when they want to implement serializable transactions.
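A sketch of the sequencer idea: a single component stamps every incoming order with a monotonically increasing sequence number, which gives all downstream consumers (matching engine, shadow replica, market data) the same total order. This is illustrative only; a production sequencer also handles durability and failover.

    import itertools, queue, threading

    class Sequencer:
        """Assigns a global, gap-free sequence number to every order it sees."""
        def __init__(self):
            self._next = itertools.count(1)
            self._lock = threading.Lock()
            self.log = queue.Queue()          # totally ordered stream for all consumers

        def submit(self, order):
            with self._lock:                  # single point of ordering = total order broadcast
                seq = next(self._next)
                self.log.put((seq, order))
            return seq

    seq = Sequencer()
    print(seq.submit({"side": "buy", "ticker": "AAPL", "qty": 10}))   # -> 1
    print(seq.submit({"side": "sell", "ticker": "AAPL", "qty": 5}))   # -> 2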
What kind of machines serve 100k TPS here? -– “Bare metal” EC2 instances (the AMD ones that have ~100 cores on them), not just a 4XL instance.
With 5-6 “xlarge” instances we were only getting “slices” of a machine.