The Coupon Collector's Problem (with Geoff Marshall)
Check out Geoff's channel. Here's a video I'm in about Platforms Zero: https://www.youtube.com/watch?v=TTHOyTypNs8
Find your nearest Park Run: https://www.parkrun.com/
Thanks to all of Geoff's running buddies for being involved. This is Matt's Runderground channel: https://www.youtube.com/c/runderground
Cheers to my Patreon supporters who keep this whole channel running. But not literally. You can also help support and shape the videos I make: https://www.patreon.com/standupmaths
CORRECTIONS
- 10:49 Yes, I said "converges" by accident when filming and I dropped in a "diverges" in the edit. I don't think anyone will notice.
- I think my big divergent observation may not hold! Clarence Lam was the first to spot that the lead n out the front of the series can explain the increasing times without the series itself needing to diverge. I suspect there is still an argument to be made around the rate at which times go up outpacing n, but I’m not sure it’ll be super intuitive.
- Let me know if you spot any other mistakes!
Early morning filming and editing by Alex Genn-Bash
Props by Matt Parker
Music by Howard Carter
Design by Simon Wright and Adam Robinson
English subtitles by Max, Rob Macdonald, Eric Rodríguez and Matt Parker
MATT PARKER: Stand-up Mathematician
Website: http://standupmaths.com/
US book: https://www.penguinrandomhouse.com/books/610964/humble-pi-by-matt-parker/
UK book: https://mathsgear.co.uk/collections/books/products/humble-pi-signed-paperback
Ok, many are suggesting I should have stood up to reveal an even bigger table next to me. Great concept, but ideas like that require some serious resources. cough http://patreon.com/standupmaths
I think random time is a bad assumption. A good runner is likely to have a time of 16:45 for a 5k. A time of 16:01 will not happen by chance. I think you need to do a normal distribution around an expected time.
Also, for sandbaggers you should do a normal distribution around the center of a clump of numbers. So say a runner has 3 numbers left: 45, 46 and 47. They have a probability given by a normal distribution centered on their target, with a standard deviation of 5 seconds or so. Finally, you can use this model to figure out how accurate and precise a sandbagger they are. Is there an offset between the stopwatch time and the official time? This could be fun. Think of the Monte Carlo simulations with Python!
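The Monte Carlo idea in this comment could be sketched roughly as follows. This is only an illustration under assumed numbers: the 16:45 target (1005 seconds) comes from the comment, but the two spreads (15 s and 120 s), the trial count, and the function names `runs_until_bingo` and `mean_runs` are made up for the example, and no sandbagging is modelled.

```python
import random

def runs_until_bingo(mu, sigma, rng):
    """Count simulated runs until all 60 seconds values (0-59) have appeared."""
    seen = set()
    runs = 0
    while len(seen) < 60:
        finish = rng.gauss(mu, sigma)      # finish time in seconds (assumed normal)
        seen.add(int(round(finish)) % 60)  # keep only the seconds part of the time
        runs += 1
    return runs

def mean_runs(mu, sigma, trials=300, seed=1):
    """Average number of runs to complete the bingo over several simulated runners."""
    rng = random.Random(seed)
    return sum(runs_until_bingo(mu, sigma, rng) for _ in range(trials)) / trials

consistent = mean_runs(mu=1005, sigma=15)   # 16:45 target, tight spread (assumption)
erratic = mean_runs(mu=1005, sigma=120)     # same target, wide spread (assumption)
print(f"sigma=15s: {consistent:.0f} runs, sigma=120s: {erratic:.0f} runs")
```

With the wide spread the seconds value is close to uniform, so the result should sit near the video's figure of about 281; the tight spread makes the seconds far from the target rare, which pushes the average up.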
I’m way ahead… I’m on 75 parkruns and have 42 “coupons” and I’m not sandbagging it at all!
This was fantastic!
I'm trying to work out the average number of runs needed until your first duplicate. I found a PDF by Philippe Duchon and Cyril Nicaud which suggested sqrt((pi * n)/2). So for 60 items, it's ~9.708 (so 9th or 10th run). Am I applying the correct formula?
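One way to check the arithmetic is to compare the quoted approximation with an exact sum. The sketch below assumes all n values are equally likely and counts the draw on which the first repeat happens; the function name `expected_draws_to_first_repeat` is just for illustration.

```python
import math

def expected_draws_to_first_repeat(n):
    """Exact E[T], where T is the draw on which the first repeat occurs."""
    total = 0.0
    p_all_distinct = 1.0  # P(first k draws are all distinct), starting at k = 0
    for k in range(n + 1):
        total += p_all_distinct          # adds P(T > k)
        p_all_distinct *= (n - k) / n    # probability that k + 1 draws are all distinct
    return total

n = 60
approx = math.sqrt(math.pi * n / 2)
exact = expected_draws_to_first_repeat(n)
print(f"sqrt(pi*n/2) = {approx:.3f}, exact sum = {exact:.3f}")
```

The square-root formula is a leading-order approximation, so for n as small as 60 the exact sum comes out somewhat higher, a little above 10 when the repeating draw itself is counted.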
15:03 I'd imagine the other peak would be well above 281, because people are naturally going to be able to consistently do a park run in roughly the same amount of time. They're not going to have one that's super fast followed by an extremely slow one, so they might get more repeat times than expected if they're not trying to get the bingo.
I will be very disappointed if this video doesn't have a "Parker-run" joke in it
I'm starting to question whether coupon is even a real word at about 8 minutes. Semantic satiation, isn't it?
So this is essentially the magic the gathering collector problem.
It is spelled "parkrun" all lower case. 🙂
Also, some lucky parkrun enthusiasts can run 55 parkruns a year (52 weekends + Christmas double special + x2 New years special)
Who else was waiting for that tiny desk?
Where's the link for this video?
13:20 I was expecting Matt to explain why the graph looks like a plot of a log function, but it never came.
Following a mathematical rabbit hole, I accidentally proved that 1 = 1 for all values n > 0.
I hate it when I do that.
Possible correction:
Wouldn't the last number take 41.24 tries, not 60?
Chance of wrong number is 59/60 = 0.9833.
0.9833^41.241286 (tries) = 50% chance of still missing it.
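The two figures in this comment answer different questions, which a few lines of arithmetic make visible. This sketch assumes each run independently hits the one missing value with probability 1/60; the variable names are illustrative.

```python
import math

p = 1 / 60                                      # chance a given run hits the one missing value
mean_tries = 1 / p                              # mean of a geometric distribution
half_point = math.log(0.5) / math.log(1 - p)    # tries until P(still missing) falls to 50%
print(f"mean = {mean_tries:.2f}, 50% point = {half_point:.2f}")
```

So about 41 tries is the point where you are as likely as not to have the last number (a median-style figure), while 60 is the average, which is pulled up by the occasional very long wait.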
My intuition is that the median (and other quantiles) would be more informative than the mean in this case. Average amount of runs to get bingo doesn’t matter if most people only go for it once; what you want is how many runs it should be until you have a 50% chance of having done it by then.
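For anyone wanting the 50% point for the whole bingo rather than the mean, one possible approach is to track the probability of holding exactly k distinct values after each run. This is a sketch assuming 60 equally likely, independent seconds values; `bingo_cdf` and the 1500-run cut-off are choices made for the example.

```python
def bingo_cdf(n, max_runs):
    """Return a list c where c[t] = P(all n values collected within t runs)."""
    dist = [0.0] * (n + 1)   # dist[k] = P(exactly k distinct values so far)
    dist[0] = 1.0
    cdf = [0.0]
    for _ in range(max_runs):
        new = [0.0] * (n + 1)
        for k, prob in enumerate(dist):
            if prob == 0.0:
                continue
            new[k] += prob * k / n                 # repeat of a value already held
            if k < n:
                new[k + 1] += prob * (n - k) / n   # a value not seen before
        dist = new
        cdf.append(dist[n])
    return cdf

n = 60
cdf = bingo_cdf(n, 1500)
median_bingo = next(t for t, c in enumerate(cdf) if c >= 0.5)
mean_bingo = sum(1 - c for c in cdf)   # E[T] = sum of P(T > t), truncated at 1500 runs
print(f"median = {median_bingo}, mean = {mean_bingo:.1f}")
```

Under these assumptions the median lands below the mean of roughly 281, since the distribution has a long right tail.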
6:15 It's not 1/52 times, it's more like 1/31, because you can get the desired card at any of those times; by the 31st try you have a 50% chance of having gotten the card already.
Let's extend this to something that costs money. Panini stickers. Apparently there are 670 stickers in the 2022 World Cup pack. 670 * (1 + 1/2 + … + 1/670) = 4747. There are five stickers to a pack, so you're looking at just over 949 packs. At 70p per pack, that works out at £665 to complete your collection.
Short version – save yourself money and buy them a football instead. Or even a PS5 with FIFA22 AND a football. It's still cheaper.
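The sticker arithmetic above can be reproduced in a few lines. The figures (670 stickers, 5 per pack, 70p per pack) are the ones quoted in the comment; the sketch additionally assumes every sticker is equally likely and that stickers within a pack are independent, which real packs may not satisfy.

```python
import math

stickers = 670          # album size quoted in the comment
per_pack = 5
price_per_pack = 0.70   # pounds, as quoted in the comment

harmonic = sum(1 / k for k in range(1, stickers + 1))   # 1 + 1/2 + ... + 1/670
expected_stickers = stickers * harmonic                 # coupon collector mean, n * H_n
packs = math.ceil(expected_stickers / per_pack)         # round up to whole packs
cost = packs * price_per_pack
print(f"{expected_stickers:.0f} stickers, {packs} packs, about £{cost:.0f}")
```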
The fact that the seconds in a person's park run time is more or less random reminds me of one way to get truly random numbers in a computer. You simply time the intervals between successive user key strokes in microseconds, then throw away everything except the right-most digit. You can do that as often as necessary to obtain a sequence of truly random numbers.
How did I miss this Video… Amazing Job.. I guess we will never see the parker sandbag metric
Maybe I shouldn't have looked it up … I have 59 out of 60 in parkrun bingo. Lacking only '22'. I've done 165 runs but don't know when I got my 59th 'coupon' or what it was.
I have had quite a lot of these over the years and am very aware of the average times. However, at one point a related question came up (assuming no sandbagging of course): what is the expected value of the highest stack once you finish? I.e., once you have your 60th seconds value, you will most likely have some times you have run twice, thrice, n times. One of those values you will have encountered the most times (or a couple tied at the same amount), but what is this highest number expected to be?
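A simulation is probably the easiest way to get a feel for this "highest stack" question. The sketch below assumes 60 equally likely seconds values and no sandbagging; the function name `tallest_stack`, the seed, and the 2000 trials are arbitrary choices for the example, and the result is an estimate rather than an exact answer.

```python
import random
from collections import Counter

def tallest_stack(n, rng):
    """Run until all n values appear; return the largest count of any single value."""
    counts = Counter()
    while len(counts) < n:
        counts[rng.randrange(n)] += 1   # one more run landing on a random value
    return max(counts.values())

rng = random.Random(0)
trials = 2000
average_tallest = sum(tallest_stack(60, rng) for _ in range(trials)) / trials
print(f"average tallest stack over {trials} trials: {average_tallest:.2f}")
```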
4:13 Wouldn't "trials" be a better term than "times" in this case?
I’d like to see a proof that expected time is 1/p – it’s intuitive, but that’s not enough in math.
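One standard argument for the 1/p claim, sketched here under the assumption that every try succeeds independently with the same probability p, so that the number of tries T up to and including the first success is geometric:

```latex
\begin{align*}
P(T > k) &= (1-p)^k, \qquad k = 0, 1, 2, \dots \\
E[T] &= \sum_{k=0}^{\infty} P(T > k)
      = \sum_{k=0}^{\infty} (1-p)^k
      = \frac{1}{1-(1-p)}
      = \frac{1}{p}, \qquad 0 < p \le 1.
\end{align*}
```

The first equality for E[T] is the tail-sum formula for non-negative integer random variables, and the middle step is a geometric series. With p = 1/60 this gives the 60 runs quoted for the last missing number.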
At around 13:30 Matt says that on average it should take 60 runs to get the last needed number which is obviously incorrect. It would take 30 runs on average to get any particular number.