Can Diverse Gaming Labels Lead To Better Algorithms?

Category Computer Science

tldr #

New research from Cornell, Xbox and Microsoft Research shows the importance of diversifying the data used to label gaming titles. Surveying 5,174 gamers from around the world, the team found that a model trained on a global dataset outperformed a model trained on a single-country dataset. The research highlights the need to account for cultural diversity when labeling datasets in order to build more accurate AI models.


content #

Is The Witcher immersive? Is The Sims a role-playing game? Gamers from around the world may have differing opinions, but this diversity of thought makes for better algorithms that help audiences everywhere pick the right games, according to new research from Cornell, Xbox and Microsoft Research. With the help of more than 5,000 gamers, researchers show that predictive models, fed on massive datasets labeled by gamers from different countries, offer better personalized gaming recommendations than those labeled by gamers from a single country.

The team's findings and corresponding guidelines have broad application beyond gaming for researchers and practitioners who seek more globally applicable data labeling and, in turn, more accurate predictive artificial intelligence (AI) models. "We show that, in fact, you can do just as well, if not better, by diversifying the underlying data that goes into predictive models," said Allison Koenecke, assistant professor of information science in the Cornell Ann S. Bowers College of Computing and Information Science.

Koenecke is the senior author of "Auditing Cross-Cultural Consistency of Human-Annotated Labels for Recommendation Systems," which was presented at the Association for Computing Machinery Fairness, Accountability, and Transparency (ACM FAccT) conference in June. Massive datasets inform the predictive models behind recommendation systems. A model's accuracy depends on its underlying data, especially the proper labeling of each individual piece within that massive trove.

Researchers and practitioners are increasingly turning to crowdsourced workers to do this labeling for them, but crowdsourced workforces tend to be homogeneous. During this data-labeling phase, cultural bias can creep in and, ultimately, skew a predictive model intended to serve global audiences, Koenecke said. "For the datasets used in algorithmic processes, someone still has to come up with either some rules or just some general idea of what it means for a data point to be labeled in some way," Koenecke said.

Data labeling is the practice of assigning tags to data points so that an AI model can learn from them and make predictions.

"That's where this human aspect comes in, because humans do have to be the decision makers at some point in this process."

The team surveyed 5,174 Xbox gamers from around the world to help label gaming titles. They were asked to apply labels like "cozy," "fantasy," or "pacifist" to games they had played, and to consider different factors, such as whether a title is low or high complexity, or how difficult the game controls are.

Gaming titles are often labeled with terms such as "cozy," "fantasy," or "pacifist."

Some game labels, like "zen," which is used to describe peaceful, calming games, were applied consistently across countries; others, like whether a game is "replayable," were applied inconsistently. Using computational methods, the team found that both cultural differences among gamers and translational and linguistic quirks of certain labels contributed to these labeling differences across countries.
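One simple way to picture cross-country label consistency is to compare, for each game, the share of respondents in each country who applied a given label. The sketch below is illustrative only: the rows in `RESPONSES` are invented, the function names are hypothetical, and the standard-deviation spread is an assumed stand-in, not the computational method the researchers used.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical toy survey rows: (country, game, label, applied), applied is 0 or 1.
RESPONSES = [
    ("US", "Game A", "zen", 1), ("US", "Game A", "zen", 1),
    ("JP", "Game A", "zen", 1), ("JP", "Game A", "zen", 1),
    ("DE", "Game A", "zen", 1), ("DE", "Game A", "zen", 1),
    ("US", "Game A", "replayable", 1), ("US", "Game A", "replayable", 1),
    ("JP", "Game A", "replayable", 0), ("JP", "Game A", "replayable", 0),
    ("DE", "Game A", "replayable", 1), ("DE", "Game A", "replayable", 0),
]


def label_rates(responses):
    """Map (label, game) to {country: share of respondents who applied the label}."""
    votes = defaultdict(lambda: defaultdict(list))
    for country, game, label, applied in responses:
        votes[(label, game)][country].append(applied)
    return {
        key: {country: mean(v) for country, v in by_country.items()}
        for key, by_country in votes.items()
    }


def label_inconsistency(responses):
    """Per label, average the cross-country spread (std dev) of application rates."""
    spreads = defaultdict(list)
    for (label, _game), rates in label_rates(responses).items():
        if len(rates) > 1:  # a spread needs at least two countries
            spreads[label].append(pstdev(list(rates.values())))
    return {label: mean(values) for label, values in spreads.items()}


scores = label_inconsistency(RESPONSES)
print(scores)
```

In this made-up data, every country applies "zen" at the same rate, so its spread is zero, while "replayable" varies by country and scores higher. A larger score means less agreement across countries; it does not say which country's labeling is "right."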

Cultural bias can creep in when data is labeled by homogeneous groups, and can lead to inaccurate predictions in AI models.

The researchers then built two models that could predict how gamers from each country would label a certain game: one was fed survey data from globally representative gamers, and the second used survey data from only U.S. gamers. They found that the model trained on the global dataset outperformed the model trained on the single-country dataset. This suggests that Japanese gamers, for example, may identify a game as peaceful while U.S. gamers may not, and both assessments could be valid. The team's research and data can be used to build better labels for games by accounting for cultural diversity.
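The global-versus-single-country comparison can be pictured with a deliberately tiny sketch. It assumes a majority-vote predictor per (game, label) pair, which is a stand-in chosen for brevity and not the models the researchers built. The `TRAIN` and `TEST` rows, the game names, and the function names are all hypothetical, and the resulting numbers say nothing about the study's actual results.

```python
from collections import defaultdict

# Hypothetical toy rows: (country, game, label, applied), applied is 0 or 1.
TRAIN = [
    ("US", "Game A", "peaceful", 0), ("US", "Game A", "peaceful", 0),
    ("JP", "Game A", "peaceful", 1), ("JP", "Game A", "peaceful", 1),
    ("DE", "Game A", "peaceful", 1), ("DE", "Game A", "peaceful", 1),
    ("US", "Game B", "zen", 1), ("JP", "Game B", "zen", 1), ("DE", "Game B", "zen", 1),
]
TEST = [
    ("US", "Game A", "peaceful", 0),
    ("JP", "Game A", "peaceful", 1),
    ("DE", "Game A", "peaceful", 1),
    ("US", "Game B", "zen", 1),
    ("JP", "Game B", "zen", 1),
    ("DE", "Game B", "zen", 1),
]


def fit_majority(rows):
    """Learn, per (game, label), whether most training respondents applied the label."""
    votes = defaultdict(list)
    for _country, game, label, applied in rows:
        votes[(game, label)].append(applied)
    # Ties count as "applied".
    return {key: int(2 * sum(v) >= len(v)) for key, v in votes.items()}


def accuracy(model, rows):
    """Share of held-out responses the model predicts correctly (unseen pairs predict 0)."""
    hits = [model.get((game, label), 0) == applied for _c, game, label, applied in rows]
    return sum(hits) / len(hits)


us_only_model = fit_majority([row for row in TRAIN if row[0] == "US"])
global_model = fit_majority(TRAIN)

us_only_acc = accuracy(us_only_model, TEST)
global_acc = accuracy(global_model, TEST)
print(f"US-only: {us_only_acc:.2f}  global: {global_acc:.2f}")
```

Because the toy data was constructed so that U.S. respondents disagree with the others on "peaceful," the U.S.-only predictor misses the Japanese and German test rows for that label, while both predictors agree on "zen." The point is only to show the shape of the evaluation: train on each dataset, then score both against respondents from every country.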

