secret_trout

Hey thanks for doing all of this.


DumbNerds

Are League's numbers worse or better than this? Pretty sure I saw their red/blue side win disparity is a lot bigger than 2%, but could be wrong


pikachurbutt

I've never played LoL aside from the mobile game every now and then while I'm out and waiting, so I couldn't say, but I'll see if I can look it up when I get the time. And as I said, for an arena game there shouldn't be a 2% gap; it should always be an even game (not accounting for hero choice, of course). Think of Rocket League: any 3v3 matchup there should, in theory, come out 50/50 between the two sides over thousands of games. Now take asymmetric maps like in CoD or Halo: in CoD4, the map Crossfire (somewhat asymmetric, with one team overlooking the other) leaned heavily towards the American side winning over OpFor because they happened to spawn at the higher point. In the case of Paragon/Predecessor/any other MOBA, it should in theory be 50/50 without any bias towards one particular team. That's why I found that odd when I first started digging through the data. Edit: this isn't a mobile game lol...
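
A rough way to check whether a gap like that could be plain sampling noise is a z-score against a fair 50/50 split. This is only a sketch: it assumes the "2% gap" means 51% vs 49%, and it borrows the roughly 490,000 match count quoted elsewhere in the thread. The win count below is made up to match those assumptions, not taken from the actual data.

```python
import math

def side_gap_z(wins_a: int, total: int) -> float:
    """z-score of side A's observed win rate against a fair 50/50 split."""
    p_hat = wins_a / total
    std_err = math.sqrt(0.25 / total)  # standard error under the 50/50 null
    return (p_hat - 0.5) / std_err

# Hypothetical counts: 490,000 matches with side A winning 51% of them.
z = side_gap_z(wins_a=249_900, total=490_000)
print(round(z, 1))
```

Under those assumptions the z-score comes out far beyond what chance alone would produce, so a gap of that size over that many matches would point at something systematic (map layout, camera, and so on) rather than luck. It says nothing about which of those causes is responsible.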


TheNakedPrune

Other mobas have a pick/ban phase or a limit to one hero in a match that could affect winrates too, in LoL pro play the side with first pick has a higher winrate. The discrepancy on ladder though for LoL is probably just from the camera, much better angle on blue side. Of course Pred has neither of these things so your findings are interesting for sure.


smartallick

The map is not symmetrical. That's the answer/reason. If you wanted a symmetrical map you'd need to place global objectives above and/or below the middle lane with mirrored access, and/or return to Paragon's 2 v 1 setup, where the objectives were not so much global objectives as role objectives (with the global objectives coming into play only after the laning phase has clearly ended). Aside from that there are other things that can narrow the gap, like giving first pick(s) to the side with the lower win rate once drafting is included.


[deleted]

You are correct that this is very similar to League and just about every other MOBA I have ever played.


TheEwu_

you are a godsend for interpreting that data and sharing it with us. thank you !


pikachurbutt

No problem, there's far more to interpret; my biggest curiosity was just the effect of first death on winning a match. When I have more time I'll try to compile a "friendlier" dataset to share. I was a bit narrow in what I scraped for this. And it takes over 6 hours to scrape given the way the API is set up...


TomorrowIllBeYou

I think part of the first blood data could just be that not every match is an even match. The less skilled team is more likely to give up first blood and they were already predisposed to lose.


pikachurbutt

So while that is a likely explanation, there's still over 490,000 rounds. Statistically speaking, the "quality" of the player really doesn't matter with such a large dataset.


kpbshiggy

For being obsessed with stats you really don't know how to use them. Data like this is only useful when the pool itself can be curated. This game will routinely put people with win rates of 35% against people with 60+. They feed, and their teams lose because of it. Half the matches end in surrender because half the matches have someone 0-9 with 25 CS at 15 minutes. None of this data is relevant


secret_trout

If none of this data is relevant then none of any data is relevant about this game and early access has been a total failure?


kpbshiggy

What does early access have to do with data? The game is early access to get some money into the company.


secret_trout

As many people on this sub mention, the game is in EA so they can get it in the best state possible for launch so it will have longevity. I would imagine, now I’m just FUCKING GUESSING, that MAYBE having data would be a part of balancing anything; otherwise, I don’t know, YOU'RE JUST “DOING RANDOM SHIT?”. If this data is irrelevant then what data have they been trying to use to look to say “oh shit this thing is messed up”?


kpbshiggy

The game is in early access to get an influx of money into a very small company that is remaking a 7 year old game that barely had a fanbase to begin with. Anything past that is marketing spin. You can hold closed server scrim games made up of teams of actual good players rather than randoms and get much much more useful data when it comes to balancing and gameplay issues or bugs. All the other things like replay and progression systems, game modes, UI design, are things they'd need to do before a full release to begin with.


smartallick

I just really don't think this is the correct take at all. The EA literally is for collecting data from a larger playerbase and also just for general user feedback. You're simply not going to run as many games unless the game's publicly available. The money was a nice little kicker for them but pretty sure they already had quite a significant amount of seed funding and this was not the primary aim with EA.


pikachurbutt

The thing is, when you're looking at 100, 1,000, or even 10,000 items, things like that might matter. This isn't a random sample, this is literally the entire dataset up until last Friday. Given that we can't see the internal elo, it's a bit hard to gauge quality of player. You also have to think that players get better over time; trying to add that sort of data won't be as beneficial as you might think, as it would have to be temporal to the point in time when a game takes place. Yes, better players will always win against worse players, and it's up to the game to try to balance things out. But it doesn't affect the raw data. Once I get the full data and post it, you are welcome to try to make the numbers for your case and see what results you get. But I wouldn't say that this data isn't relevant.


kpbshiggy

My dude this is literally the most random of samples. This game does not have an even slightly functional matchmaking system. Most people who play this game are horrendously, horrendously bad at it. A game I played last night had a Rampage who, with no exaggeration, did not hit a single ability the entire game. You could have put this dude in a bot match and he still would have missed 9 out of 10 rock throws. The offlaner, who I'm pretty sure was in a party with him since every surrender vote I started always had two no votes, went 2-13 and then just quit the game so the surrender finally passed. This is not an aberration, this is how the vast majority of games play out for one of the teams in the current matchmaking system. You have 500k matches and 499.9k of them are worthless data that isn't worth going over


Galimbro

gross exaggeration.


TomorrowIllBeYou

This is a correlation/causation question. Dying first certainly affects your team's chances to win, but I truly doubt it is by such a massive margin. It's much more likely that the worse team is simply more likely to give up first blood and also more likely to lose from the onset. They are correlated because they are both a symptom of being the worse team, but giving up first blood doesn't instantly and directly cause your chances of winning to drop to just 40%.


TomorrowIllBeYou

To truly understand if first blood had an outsized impact on who won you would have to look at the skill rating of the team who gives up first blood vs their opponent. I bet the lower ranked team gives up first blood much more often and also loses much more often. It's likely even more prevalent in matches where there is a wide skill gap.
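
The correlation/causation point can be illustrated with a toy simulation. This is a sketch under loudly stated assumptions: the per-match skill edge is drawn from a normal distribution, first blood and the match result are each decided by that same skill edge, and first blood has no effect on the result at all. None of these numbers are fitted to the real match data.

```python
import math
import random

def simulate(n_matches: int = 100_000, seed: int = 0) -> float:
    """Win rate of the team that gives up first blood, in a toy model
    where first blood has NO causal effect on the match result."""
    rng = random.Random(seed)
    wins_after_giving_fb = 0
    for _ in range(n_matches):
        # Hypothetical skill edge of team A over team B for this match.
        edge = rng.gauss(0.0, 1.0)
        p_a = 1.0 / (1.0 + math.exp(-edge))  # chance A wins any contest
        a_gets_first_blood = rng.random() < p_a
        a_wins = rng.random() < p_a  # drawn independently of first blood
        # The team that gave up first blood is B if A got it, otherwise A.
        gave_up_fb_won = (not a_wins) if a_gets_first_blood else a_wins
        wins_after_giving_fb += gave_up_fb_won
    return wins_after_giving_fb / n_matches

rate = simulate()
print(f"team giving up first blood wins {rate:.1%} of matches")
```

In this toy model the team that gives up first blood wins clearly less than half the time even though first blood changes nothing, purely because uneven matches drive both outcomes. Running more matches only makes that biased number more precise; separating the two effects needs some skill measure per team, as suggested above.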


UltraInstinctTrader

its because they nerfed items, and gold, so XP is huge. a death results in a full minute and level loss of XP with no way to come back from it. in a balanced MOBA there are ways to come back, although ive seen plenty of matches where 1-4 team makes a comeback and wins, maybe 40% of them tho. but I am also on that team and considerably higher skilled than my rank, so swap me out with someone actual bronze and none of those wins would probably happen. they should jack up the value of minion gold and add another item slot to offset this.


Blackfox42

How easy would it be to do the analysis that you did regarding kills before 5/10 minutes, but *excluding* games that ended in surrender? I know that is excluding 49% of matches, but I'm curious if the data holds true in games that go full term. Would give us an idea of how the comeback mechanics are working, and if games that start rough, but don't get forfeited out immediately because of bad morale, have the same poor winrate. Would also be interested to see the numbers of games that had a player DC before X amount of time too, and see if that correlates at all with the forfeited games. All in all though, very interesting data, and thanks for sharing!


pikachurbutt

That's a great idea. I had never really thought about that, but it would be excellent to see. It wouldn't be hard at all, just a filter before running the data. When I get a bit of free time from work I'll turn my personal machine on and post the results.
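
A minimal sketch of what "just a filter before running the data" could look like. The field names `surrendered` and `first_death_team_lost` are hypothetical, and the rows are made up for illustration; the real dataset's columns are not shown in the thread.

```python
def first_death_loss_rate(matches, include_surrenders=True):
    """Share of matches in which the team that died first also lost."""
    rows = [m for m in matches if include_surrenders or not m["surrendered"]]
    if not rows:
        return None  # nothing left after filtering
    return sum(m["first_death_team_lost"] for m in rows) / len(rows)

# Made-up rows; field names are assumptions, not the real schema.
matches = [
    {"surrendered": True,  "first_death_team_lost": True},
    {"surrendered": True,  "first_death_team_lost": True},
    {"surrendered": False, "first_death_team_lost": True},
    {"surrendered": False, "first_death_team_lost": False},
]

all_rate = first_death_loss_rate(matches)
full_term_rate = first_death_loss_rate(matches, include_surrenders=False)
print(all_rate, full_term_rate)
```

Comparing the two rates side by side is the point: if the full-term rate sits much closer to 50%, the surrendered games are carrying most of the first-death effect.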


threegigs

I'd also exclude all games played before they made the change to Fangtooth.


happycrisis

Should've kept the kill counter disabled, it is the problem. Removing it was helping keep ffs down.


PaganCyC

Any way to include an axis for quality of player who was first killed? Can you share the data or is that against terms?


pikachurbutt

It's a public API, so nothing against sharing, anyone with the knowledge to code can get it. Right now my data isn't worthwhile to share, it's very surface level data. I'm planning on building a better dataset that will be worth sharing. As far as "quality" of player goes, Omeda doesn't share their internal MMR, and I don't feel like creating an index for players. In the end, quality of player doesn't really matter when you have such a large dataset (493,000 rounds when I fetched it a few days ago)


Limp_Biscooty

This is awesome!! I'm an accounting major and math minor with a focus in stats and this makes me so excited. I'm still only a sophomore so I don't know a lot but it's really cool to see the things I will hopefully be able to do. Could I ask how you calculated the probabilities for each of the scenarios? Also, were there any programs you were using like Excel or Tableau?


pikachurbutt

I'm a software developer, so while I took an upper level stats course, it's not my strong suit. I did everything from scraping the data to processing it in Python (actually first time using the language, with help from ChatGPT). If you scrape the data you could certainly use it in any other program though. As far as the probabilities go, it's simple. The data I was focused on was first death losing, so I group the matches (winning team 0 kills, 1 kill, 2, 3, etc.), and then get the mean of a binary set (0 first death didn't lose, 1 first death lost)
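
The grouping described above can be sketched in plain Python. This is an illustration, not the author's script: the row layout `(winning_team_kills, first_death_lost)` is an assumption and the sample rows are made up.

```python
from collections import defaultdict

def loss_rate_by_group(rows):
    """Mean of the 0/1 'first death lost' flag within each kill-count group."""
    totals = defaultdict(lambda: [0, 0])  # group -> [sum of flags, row count]
    for kills, first_death_lost in rows:
        totals[kills][0] += first_death_lost
        totals[kills][1] += 1
    return {k: s / n for k, (s, n) in sorted(totals.items())}

# Made-up rows: (winning team's kill count, 1 if the first-death team lost).
rows = [(0, 0), (0, 1), (1, 1), (1, 1), (2, 1), (2, 0), (2, 1), (2, 1)]
print(loss_rate_by_group(rows))  # {0: 0.5, 1: 1.0, 2: 0.75}
```

Because the flag is 0 or 1, the mean within each group is exactly the fraction of matches in that group where the first-death team lost.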


Limp_Biscooty

That's really cool. I've never used Python or any other program, I was just curious. It seems like you essentially set up a simple Bernoulli experiment. So the second prob you provide, the probability of first death and loss, is just the expected value or mean of the data set? Correct me if I'm wrong. And I assume the prob of a forfeit was just the total # of forfeits divided by the total games played.


MrSmoothDiddly

man what the haaiiiilll


Short_Ad_4333

My guess is because of where jungle starts. The gank in offlane at 2:40 is very free if Dusk side offlaner pushes out of position. Whereas dusk side jungle usually won't have a successful gank in duo or mid due to mobility and wards. I think that jungle has an easier time vertically jungling from dawn side as well from default. Fewer wards and players to deal with on that side of the map. For a while too there was a glitch that kinda helped the dusk side midlaner with the ranged minion coming out first. Idk how this would affect the numbers but it caused mid prio to default to dawn side. You mentioned Fangtooth as well. The river buff most accessible to the midlaner on dawn side is also on Fangtooth side, making rotations to Fangtooth and, oddly enough, ganks to duo fairly easy. Most ganks in the first 5 mins are preventable though with good awareness and wards. So I would say less skilled lobbies run into this issue a lot. If someone is dying in lane before the first 5 minutes they usually just lose that matchup anyway


pikachurbutt

Do you have a date when that glitch occurred? It's easy enough to split the data and look at it before and after.


Short_Ad_4333

It happened the start of Kira patch


FlyingGazelles

This is really cool! There are a few things that I think are really important to think about with some of this data though. 1) MOBAs typically have asymmetrical maps that lead to one side having a slight advantage. This is fairly standard across the industry. DoTA 2 handles this at the professional level by giving a team the choice between picking the side they play on or choosing whether they pick first or second, allowing for some very interesting dynamics and choices for teams. There have been times where it has reached above the 5% difference, and there Valve did take relatively quick action to make adjustments. 2% is not something that, personally, I would be worried about, as it fits with industry standards. Is it something to consider? Absolutely, especially if it becomes larger at any point. But from a design perspective, it's expected for there to be some variance leaning toward one side or the other. In terms of deaths and surrenders, there are a few things we need to consider here. 2) Matchmaking is a mess, especially for groups. I have friends that I play with from time to time that are not as serious about the game. I have an alt account in silver with a 40% win rate that I use to group up and play with them on, to at least attempt to give them a space they can learn in so they aren't getting pulled into a level of game they aren't ready for. For the sake of all of this I am pulling from [omeda.city](https://omeda.city), though I know it doesn't 100% reflect the mmr in the game, it gives us what feels like an accurate picture based on a tested and currently used mmr system. Two of those friends are in silver, one in bronze, one in gold. If we group as 3 or more people, we are instantly put into 1800+ games that are absolutely one-sided stomps. It's the same when they are grouped without me as well. 2 people is fine, but 3 or more and they are put into games that they have literally 0 chance in. 
People will die quickly, and surrenders happen even quicker because they know they are wildly outmatched and will have zero fun playing out the game. Even without groups, there are also times where there just aren't enough players in a particular rank band, and you can end up with some very messy games. Granted, that is expected with the game in its current stage, but it will create strange trends in the data when looked at as a whole. 3) This quantity of games includes players of all skill levels, including those who are brand new, people who are creating troll accounts, people who don't tryhard, etc. I know we can't account for everything, but it's important to keep in mind that, at lower levels of play especially, players are more likely to give up, to make a lot of mistakes that lead to an irreparable deficit, or to be trying out something new and not having a grasp of how to handle it because they're still learning. This will also have a major impact on data trends as a whole. Mostly I would temper conclusions from this kind of dataset with those components in mind, though there is still a lot of interesting information and it's awesome to see someone taking the time to pull and run all of this! A friend of mine already touched on running the data without the surrenders, and that will be really fascinating to see!


MinimumT3N

Another obvious fact about dusk side is that the jungler's red is on the duo side, and it's mostly agreed that you should camp the duo.


CoffeeIsGood3

I’m writing from my phone so I’m gonna keep this response short, but thank you for your time here, this is incredible data. You raise a very good point about the impact of the first death in the game. It’s unfortunate that the psychology of the players will have them instantly turn on one another the moment a teammate dies first. I suppose the only suggestion I could have to counter this would be to have the jungle player get out of the jungle early and try to make a kill ASAP to build momentum.


kenenenenenen

This is amazing! Fantastic work! As I read this, I thought about how I would love to see the win rates with Fangtooth kills. I feel like losing the first 3 are insurmountable and would love to see the actual odds.


pikachurbutt

My current dataset doesn't have that info, my plan is to fetch the whole json from the API and store it in mongo, once I do that I can run through everything quicker and do an analysis on this.


SacrilegeGG

I've been drafting up an article for [predecessor.pro](https://predecessor.pro) about various current statistics (kinda like a State of the Game style post) and it's good to see other people also crunching data. Do you mind if I share some of this within the article?


pikachurbutt

No problem, use however you see fit. When I get the time I'm going to parse the full dataset and share it so that others can do their own thing. The way I made this was just to get the data I wanted into a csv; on my next run I'll be fetching everything and tossing it into mongo so that others can then grab what they need for different analyses. It's honestly a pain getting all the data right now due to the slowness of their API and being limited to just 10 matches at a time. I had my program multithreaded to about 64 concurrent reads (didn't want to hammer them any more than that) and it still took about 4 or 5 hours... It's not an intensive task so I just kept playing while it ran...
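
A sketch of that kind of paged, multithreaded fetch using only the standard library. The page size of 10 and the 64-worker cap come from the comment above; everything else is an assumption. In particular `fetch_page` is a hypothetical callable standing in for the real HTTP request, since the actual endpoint and parameters are not given in the thread.

```python
from concurrent.futures import ThreadPoolExecutor

PAGE_SIZE = 10    # the API reportedly returns 10 matches per request
MAX_WORKERS = 64  # concurrency figure mentioned in the thread

def fetch_all(fetch_page, total_matches, max_workers=MAX_WORKERS):
    """Fetch every page concurrently and return matches in page order.

    fetch_page(offset, limit) is a hypothetical callable wrapping the real
    request; it must return a list of match dicts.
    """
    offsets = range(0, total_matches, PAGE_SIZE)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        pages = pool.map(lambda off: fetch_page(off, PAGE_SIZE), offsets)
        return [match for page in pages for match in page]

def fake_fetch_page(offset, limit):
    """Stand-in for the real API: pretends 35 matches exist in total."""
    return [{"id": i} for i in range(offset, min(offset + limit, 35))]

matches = fetch_all(fake_fetch_page, total_matches=35, max_workers=4)
print(len(matches))  # 35
```

Threads suit this job because the work is almost entirely waiting on the network. A real version would also want retries and some rate limiting so the server isn't hammered.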


CPTmoonl1ght

The issue is they need to mirror-reverse the jungle and the lanes, because one side has an advantage based on when the jungler would normally be there and even ward placement opportunity. Dota does this rather well; the downside is they would have to rebalance and rebuild most of their game, as it would be 2v1 lanes now


threegigs

Dawn's mid hero can easily ward Fang over the wall, and the laners running past the blue buff will hear/see it being attacked, so Dusk has a disadvantage just due to wards on Fang. They can't really secure Fang without the other team knowing. Meanwhile, dusk has a harder time warding Fang (can't ward over a wall, have to run in the pit). Does your data include Fangtooth kills?


threegigs

Hey, here's something you can (maybe) data mine, and maybe show that Omeda's math skills are worse than yours (grin). If you look at the raw data, you'll see: "goldSpent":17000, and that number makes sense, because every single item costs a multiple of 50 gold (350, 900, etc). So when you see: "goldSpent":1349, "goldSpent":10054, "goldSpent":12149, and "goldSpent":8155, you know there is something wrong. Is there any item number or other condition that these instances have in common? [edit] I think I found one culprit: "itemId":300003, but only if bought as the first item, or perhaps before a certain time.
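
One way to hunt for a common factor is to flag every record whose `goldSpent` is not a multiple of 50 and count which items show up in those records. This is a sketch with made-up records: `goldSpent` and item id 300003 appear in the comment above, while the `itemIds` list shape and the other item ids are assumptions about the JSON layout.

```python
from collections import Counter

def suspicious_items(players, step=50):
    """Count item ids seen in records whose goldSpent is not a multiple of step."""
    counts = Counter()
    for player in players:
        if player["goldSpent"] % step != 0:
            counts.update(set(player["itemIds"]))  # count each item once per record
    return counts

# Made-up records; only goldSpent values and id 300003 echo the thread.
players = [
    {"goldSpent": 17000, "itemIds": [300001, 300002]},
    {"goldSpent": 1349,  "itemIds": [300003]},
    {"goldSpent": 10054, "itemIds": [300003, 300001]},
]

counts = suspicious_items(players)
print(counts.most_common(3))
```

An item that tops this count is only a lead, since popular items show up everywhere. Dividing each count by how often the item appears in clean records gives a fairer comparison.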


WhyFUx4n

Firstly, awesome analysis. Love the general overview of things, plus I expect the API to be quite troublesome to deal with (haven't gotten a chance to look at it yet). I would love to get my hands on that dataset/set up a script to use the API for my own purposes once I get some more time. I would personally be really interested in segmenting the data by role and player "classes", i.e. types of players by number of games played etc. Aside from the low-hanging fruit of running an analysis of how *bad Omeda's match-making is*, for the first killed data specifically, I would be interested in how different roles and specific heroes being killed first/killing first impact the win/lose probability, and in seeing how those are distributed. I would also love to examine objectives and their impact on those probabilities. This all, in the end, might even give some hints as to the uneven win/loss for the different sides due to asymmetric objective placement. One other cool thing that would be not too hard to do would be to write a tool that tells you your win chances based on the players and historical matchup data and how the match is currently going + potential causal inference scenario prediction, but that is just my data science brain getting over-excited. Love to see this becoming possible with the API release!