EnchantedOatmeal
4 years ago | Seasoned Rookie
Next pack drop date based on the maths
TLDR: I am a nerdy engineer who played with some statistical methods to predict within +/-32 days the next pack drop dates.
I am excited for whatever the next pack is so I did some math. I'm an engineer and I get to play in Excel a lot at work. One of the easiest things I do is create interpolations and extrapolations based upon collected data to predict future trends that help operators run equipment more efficiently or determine maintenance dates, etc, etc. So this only took me about an hour to pull everything into a spreadsheet and work some magic. It was also really fun! (Can you tell I love my job?)
So if you know anything about Excel, you already know how extremely difficult it is to get it to recognize date formats and do math with them. So I converted all the release dates into days counted from January 1, 2015 (the year the first pack dropped) and assumed that each month has 30 days. I put days on the y-axis and pack number on the x-axis. For pack number, I counted Outdoor Retreat as Pack #1 and Paranormal as Pack #37, and I combined the first three kits into Pack #38 because they did not release on unique dates and thus would be a statistical irregularity that would throw off my calculations. Once I completed my date conversions, I graphed it and produced a linear fit trendline. I then went back and used the trendline to calculate a fitted release date for each individual pack and compared those against the actual dates to get a standard deviation, so I know how accurate my fit line is. It was NOT accurate. I had a standard deviation of 66 days. Here is the first graph.
https://i.imgur.com/F67qaqY.png
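For anyone who wants to try the same steps outside Excel, a minimal Python sketch might look like the following. It is not the spreadsheet itself: the function names, the 360-day year that follows from 30-day months, the sample (n-1) standard deviation, and the placeholder dates are all assumptions made for illustration, not the real release data.

```python
import math

def to_days(month, day, year, base_year=2015):
    """Days since January 1 of base_year, treating every month as 30 days.

    Assumption: a year is therefore 12 * 30 = 360 days, and January 1 is day 0.
    """
    return (year - base_year) * 360 + (month - 1) * 30 + (day - 1)

def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def residual_std(xs, ys, slope, intercept):
    """Sample standard deviation of (actual day - fitted day) for each pack."""
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    mean_r = sum(residuals) / len(residuals)
    variance = sum((r - mean_r) ** 2 for r in residuals) / (len(residuals) - 1)
    return math.sqrt(variance)

# Placeholder (month, day, year) values, NOT the real pack release dates.
placeholder_dates = [(1, 15, 2015), (4, 10, 2015), (6, 20, 2015), (9, 5, 2015)]
pack_numbers = list(range(1, len(placeholder_dates) + 1))  # x-axis
days = [to_days(m, d, y) for m, d, y in placeholder_dates]  # y-axis

slope, intercept = linear_fit(pack_numbers, days)
spread = residual_std(pack_numbers, days, slope, intercept)
print(f"days = {slope:.1f} * pack + {intercept:.1f}, residual std = {spread:.1f} days")
```

Swapping the placeholder list for the actual release dates would give the trendline and the spread around it.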
So I analyzed the data (I have Six Sigma training, for those of y'all who know what that is) and I decided to remove all irregularities in the data because I realllllly wanted a linear fit. To remove the irregularities, I decided to remove all data before 2018, because from 2018 to 2020 only five packs were released each year. There were 8 in 2015, 7 in 2016, and 6 in 2017. I believe the closeness AND the irregular release times were throwing my interpolation off. Here is my second graph.
https://i.imgur.com/Tl4DQzj.png
Needless to say, my intercept was....{insert strong language here}... so I decided to try a quadratic fit trendline. That gave me worse results, and a fifth-degree polynomial fit was worse still. So I decided to go back and reanalyze my data with the linear fit in Graph 2, because something obviously went wrong. I copied the sheet, deleted everything before 2018, and set my days to be based on 0 days = January 1, 2018. That didn't fix nearly as much as setting Laundry Day to Pack #1 instead of #22. Suddenly all my numbers started making sense again! (I cannot emphasize enough how weird statistics is.) Now I have a standard deviation of 32 days with the linear fit trendline in Graph 3.
https://i.imgur.com/HiKRIi7.png
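One way to see why renumbering the packs mattered so much for the intercept: in a least-squares line, shifting every pack number by the same amount leaves the slope alone and moves the intercept by slope times the shift. Here is a small Python sketch of that, where the day values are made up for illustration (not the real release data) and `linear_fit` is a hypothetical helper name.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up day counts since January 1, 2018 for five packs (illustrative only).
days = [40, 130, 190, 290, 350]

old_numbers = [22, 23, 24, 25, 26]           # first pack numbered #22
new_numbers = [n - 21 for n in old_numbers]  # same packs, renumbered from #1

slope_old, intercept_old = linear_fit(old_numbers, days)
slope_new, intercept_new = linear_fit(new_numbers, days)

print("slope:    ", slope_old, "vs", slope_new)
print("intercept:", intercept_old, "vs", intercept_new)
print("intercept shift equals 21 * slope:",
      abs(intercept_new - (intercept_old + 21 * slope_old)) < 1e-6)
```

With the packs numbered from #1, the intercept sits close to the first data point instead of 21 packs back in time, which makes it much easier to sanity-check.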
At this point, I realized that Sims pack release dates cannot be accurately predicted because the timing is irregular, depending on holidays, setbacks, PR, etc. (So glad I don't deal with consistently irregular data at work.) So I went ahead and extrapolated some data to try to determine future pack drop dates to within +/-32 days.
7/13/21 (or rather, anywhere between 6/11/21 and 8/15/21)
9/29/21
12/15/21
2/25/22
5/11/22
Thus (QED) I have predicted the two-month range in which each of the next five packs will drop. Disclaimer: the further out I extrapolate, the more the actual dates may deviate from the fit.
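The +/-32 day ranges can be worked out with the same 30-day-month bookkeeping. Below is a rough Python sketch; the function names and the 360-day year are assumptions for illustration. Under that convention the first prediction gives back the 6/11/21 to 8/15/21 range listed above (on the real calendar the upper end would land on 8/14/21 instead).

```python
def to_day_count(month, day, year):
    """Absolute day count, assuming 30-day months and 360-day years."""
    return year * 360 + (month - 1) * 30 + (day - 1)

def from_day_count(count):
    """Inverse of to_day_count: returns (month, day, year)."""
    year, remainder = divmod(count, 360)
    month_index, day_index = divmod(remainder, 30)
    return month_index + 1, day_index + 1, year

def window(month, day, year, spread=32):
    """Earliest and latest date within +/- spread days of the prediction."""
    center = to_day_count(month, day, year)
    return from_day_count(center - spread), from_day_count(center + spread)

# The five predicted drop dates from the list above, as (month, day, year).
predictions = [(7, 13, 2021), (9, 29, 2021), (12, 15, 2021),
               (2, 25, 2022), (5, 11, 2022)]

for m, d, y in predictions:
    (m1, d1, y1), (m2, d2, y2) = window(m, d, y)
    print(f"{m}/{d}/{y}: between {m1}/{d1}/{y1} and {m2}/{d2}/{y2}")
```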
I may, in the future, take another shot at making this more accurate (and possibly precise) but for now, I just want to play the Sims! So Happy Simming everyone, from your neighborhood nerd. Sul Sul!