Twitch in the Ever-Evolving Landscape of Digital Communication: An Innovative Method to Boost Moderation Performance
Category: Computer Science | Wednesday, January 3, 2024, 05:34 UTC

USC Viterbi's Information Sciences Institute (ISI) researchers Dong-Ho Lee and Jay Pujara have developed an innovative method that boosts the performance of content moderation in live streams by 35%. Using a dataset of 4,583 norm-violating comments on Twitch moderated by human channel moderators, they built a model that takes into account factors such as how words are used and where they are positioned in the context of a conversation. The model achieved an accuracy rate of 90% and successfully facilitated the curation of conversations into communities that share similar interests and values.
Twitch. Some see it as a fun online community of gamers and good-natured e-sports fandom. For others, it's a perilous stream of potentially toxic content and hate speech.

In the ever-evolving landscape of digital communication, the real-time nature of messages on live-stream platforms like Twitch and YouTube Live brings unique challenges for content moderation. At present, effective tools for moderating content in live streams are lacking, because existing models have been trained on non-real-time social media platforms like Facebook or Twitter.
Research Assistant Dong-Ho Lee and Principal Scientist Jay Pujara, both from USC Viterbi's Information Sciences Institute (ISI), set out to change that. They have developed an innovative method that boosts the performance of moderation models on live platforms by 35%.

Getting in sync

Pujara said, "If I post something on Twitter or Reddit, someone might respond hours or days later. But if we're looking at Twitch, it's a very different environment.
People are sending messages every second."

It all comes down to timing. Twitter, Facebook, and Reddit are asynchronous: users post their thoughts, but the responses are not immediate. Twitch, YouTube Live, and other live-streaming platforms, on the other hand, are synchronous, the equivalent of being in a live conversation.

In conversations on asynchronous platforms, thoughts are typically grouped into threads that provide conversational context.
And users have no time constraints, so they can comment with better thought-out responses. On synchronous platforms, by contrast, thoughts are presented in real time, consecutively, with no structure to indicate context. The fast-paced nature encourages quick responses and multiple short comments.

A first-of-its-kind approach

Seeing this gap in the research, Lee and Pujara conducted the first NLP study of detecting norm violations in live-stream chat.
"Norm violations" refer to instances where users on online platforms breach the established rules or guidelines for acceptable behavior. Pujara explained, "Typically there will be a set of rules that are published when you join [a live stream], and there are moderators who are trying to figure out if people are breaking these rules. Are you harassing someone? Are you trying to change the topic? Are you sending spam messages?"

The team of authors, including ISI Ph.D. students Justin Cho and Woojeong Jin, and Jonathan May, a research associate professor in the USC Viterbi Thomas Lord Department of Computer Science, used a dataset of 4,583 norm-violating comments on Twitch that were moderated by human channel moderators.

"They gathered the chat rules of each Twitch streamer, held iterative meetings to categorize types of norm violations, and managed annotators in labeling various live-streaming sessions to analyze norm violations on Twitch," said Lee. "This involved a significant joint effort between various industry partners and academic institutions for the first study of norm violations in live-stream chat."

Bring in the humans… and the details

Pujara said, "An interesting thing about Twitch is that users' aggression is very nuanced; scenes are very closely connected, and there can be jokes, shades of sarcasm, and puns. The challenge has been how to handle these situations in real time using a moderation model to detect and determine the context of the conversation."

Lee and Pujara's model was designed to help automate the process of detecting norm violations in real time.
The model proves effective in live-stream chat conversations in particular because it leverages natural language processing and machine learning to identify subtle patterns in conversations. It takes into account factors such as how words are used and where they are positioned in the context of a conversation. This capability allows the model to better determine the intent and severity of the conversation and inform the moderator's decision.
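The article does not detail the model's architecture, but one simple, illustrative way to give a live-chat classifier the kind of positional context described above is to pair each incoming message with the few messages that immediately preceded it, since meaning in synchronous chat depends heavily on what was just said. The function name and window size here are hypothetical, not taken from the study.

```python
# Illustrative sketch (not the authors' model): attach a rolling window of
# recent chat messages to each new message as conversational context.

from collections import deque

def with_context(messages, window=3):
    """Yield (context, message) pairs, where context is the last
    `window` messages seen before the current one."""
    recent = deque(maxlen=window)  # automatically drops the oldest message
    for msg in messages:
        yield list(recent), msg
        recent.append(msg)
```

A downstream classifier could then score each `(context, message)` pair rather than the message in isolation, which is one way a model might weigh where words appear relative to the surrounding conversation.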
The team found that, "in comparison to other existing models, which require thousands of messages from users to learn patterns, this model is much more efficient in its ability to identify violations, even with a training dataset of only 600 messages." The model was put to the test on a dataset of over 1 million norm-violating messages, and it detected norm violations with 90% accuracy.
Additionally, the team reduced false positives by 10% and deployed a "confidence threshold" that filters out messages the model is uncertain about.

Concluding thoughts

Lee and Pujara's model serves as a benchmark for more efficient and reliable content moderation on live-streaming platforms. It successfully facilitates the curation of conversations into communities that share similar interests and values, and protects those communities from harassment, spam, and other unwanted content.
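The article mentions a "confidence threshold" without describing its mechanics. A minimal sketch of how such a filter commonly works, assuming a classifier that returns a probability that a message violates a norm (the function names and the 0.8 threshold are illustrative, not from the study):

```python
# Hypothetical confidence-threshold filter: auto-handle only the messages
# the classifier is sure about, and defer the uncertain middle band to a
# human moderator.

def moderate(messages, classify, threshold=0.8):
    """Route each chat message into one of three buckets.

    classify(message) is assumed to return a float in [0, 1]: the
    model's confidence that the message violates a channel norm.
    """
    flagged, allowed, deferred = [], [], []
    for msg in messages:
        p = classify(msg)
        if p >= threshold:
            flagged.append(msg)    # confident violation: auto-flag
        elif p <= 1 - threshold:
            allowed.append(msg)    # confident non-violation: let through
        else:
            deferred.append(msg)   # uncertain: send to a human moderator
    return flagged, allowed, deferred
```

Deferring the uncertain band, rather than guessing, is one way such a system can cut false positives while keeping humans in the loop.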
Going forward, the team hopes to develop additional projects that build on their findings, such as a model for identifying user friendliness, trustworthiness, and influence in conversations.