Thursday, January 26, 2023

Sentiment Analysis On 2021 Shotgun Scenario Reviews

The 2021 Shotgun Scenario Contest was the first Delta Green contest I entered. Unfortunately, due to voter fraud no winner was declared that year. This sucks for everyone who submitted a scenario, but it also makes it harder for everyone else to find the best 2021 scenarios without having to read them all.

However, some nice people on the Night at the Opera Discord server have read a lot of the scenarios and shared their feedback in review documents. In total there are 461 reviews, which means that on average there are 8.87 reviews for each of the 52 submitted scenarios. If we can find a way to turn a review into a score, we can compute the average score for each scenario and rank them accordingly. Doing this by hand for all 461 reviews would be quite cumbersome, so let's automate this.

Sentiment Analysis on Scenario Reviews

Sentiment analysis is a well-studied sub-field of Natural Language Processing and a lot of open source models are available, e.g. on the HuggingFace model hub. The goal of sentiment analysis is to assign a sentiment label (e.g. positive, negative, neutral) to a given piece of text (like a product review). I tried some of the models on HuggingFace on the scenario reviews, but the results were not as good as I had hoped. Many of the models were trained on domains that are very different from DG scenario reviews, e.g. on Twitter data. Others were not fine-grained enough and only assigned "positive" and "negative" labels, without a "neutral" or "mixed" class.

My next approach led to better results. I used the OpenAI API and asked their largest GPT-3 model to label each review with one of five classes. I used the following prompt and had GPT-3 complete it:

Decide whether the sentiment of a Delta Green scenario review is positive, mostly positive, mixed, mostly negative, or negative.

Review: <review text goes here>
Sentiment:

Here is a toy example; GPT-3's output is the word after "Sentiment:".

Decide whether the sentiment of a Delta Green scenario review is positive, mostly positive, mixed, mostly negative, or negative.

Review: This scenario is awesome and I want to run it!

Sentiment: Positive

I tested with some actual reviews and got reasonable results, so I wrote a script to do this for all available reviews. A spreadsheet with the results can be found here.
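The labeling step of such a script could be sketched roughly as follows. This is not the original script: the helper names (`build_prompt`, `parse_label`, `label_reviews`) are made up for illustration, and `complete` stands in for whatever function sends a prompt to the model and returns its completion text.

```python
from typing import Callable, Iterable, List, Optional

# The five classes from the prompt. The "mostly ..." labels come first so that
# "mostly positive" is not mistaken for plain "positive" when matching prefixes.
LABELS = ["mostly positive", "mostly negative", "positive", "negative", "mixed"]

PROMPT_TEMPLATE = (
    "Decide whether the sentiment of a Delta Green scenario review is "
    "positive, mostly positive, mixed, mostly negative, or negative.\n"
    "\n"
    "Review: {review}\n"
    "Sentiment:"
)


def build_prompt(review: str) -> str:
    """Insert one review into the prompt shown above."""
    return PROMPT_TEMPLATE.format(review=review.strip())


def parse_label(completion: str) -> Optional[str]:
    """Map the raw completion text to one of the five classes, or None."""
    text = completion.strip().lower()
    for label in LABELS:
        if text.startswith(label):
            return label
    return None  # unexpected output; better to check these by hand


def label_reviews(
    reviews: Iterable[str], complete: Callable[[str], str]
) -> List[Optional[str]]:
    """Label every review. `complete` is a placeholder for the model call."""
    return [parse_label(complete(build_prompt(review))) for review in reviews]
```

Passing the model call in as a function keeps the sketch independent of any particular API client, and it makes it easy to try the parsing logic with a dummy function first.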

To rank the scenarios, I assigned a score to each of the five classes (positive = 5, mixed = 3, negative = 1) and computed the average score for each scenario.
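As a minimal sketch, the scoring and ranking could look like this. The scores for "mostly positive" (4) and "mostly negative" (2) are my assumption based on the three values given above, and the scenario names and labels in the example are invented placeholders, not real results.

```python
from collections import defaultdict
from statistics import mean

SCORES = {
    "positive": 5,
    "mostly positive": 4,  # assumed: halfway between positive and mixed
    "mixed": 3,
    "mostly negative": 2,  # assumed: halfway between mixed and negative
    "negative": 1,
}


def rank_scenarios(labeled_reviews):
    """Average the scores per scenario and sort from best to worst."""
    scores_by_scenario = defaultdict(list)
    for scenario, label in labeled_reviews:
        scores_by_scenario[scenario].append(SCORES[label])
    averages = {name: mean(scores) for name, scores in scores_by_scenario.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


# Invented example data: (scenario, predicted label) pairs.
example = [
    ("Scenario A", "positive"),
    ("Scenario A", "mixed"),
    ("Scenario B", "mostly negative"),
    ("Scenario B", "mixed"),
    ("Scenario C", "positive"),
]
ranking = rank_scenarios(example)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```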

Of course, this whole approach to determining good scenarios has some limitations:

  • GPT-3 might get it wrong. It's a powerful language model, but it can still make mistakes.
  • The predictions are still not very fine-grained. The most common class that GPT-3 predicted was "mixed", because even for good scenarios a lot of reviewers had some minor complaints or suggestions for improvements. However, I don't trust GPT-3 to accurately and consistently grade reviews on a 1-10 scale. 
  • Not all scenarios have been reviewed by all reviewers. This could introduce some bias. E.g., if one reviewer always reviews very positively but leaves out one scenario, then this scenario has a disadvantage.
  • The selection of reviewers might also be biased. All reviewers are active on the N@tO Discord server. It is possible that the tastes of N@tO members and the wider Delta Green community don't perfectly align. I don't know if this is actually the case, but it's something to keep in mind.
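To make the point about missing reviews concrete, here is a tiny illustration with invented numbers. Two hypothetical scenarios get identical scores from the reviewers they share, but one of them was skipped by a reviewer who always gives top marks.

```python
from statistics import mean

# Invented scores on the 1-5 scale; the reviewer "generous" always gives a 5.
scores = {
    "Scenario X": {"generous": 5, "strict": 3, "typical": 4},
    "Scenario Y": {"strict": 3, "typical": 4},  # "generous" skipped this one
}

averages = {
    name: mean(by_reviewer.values()) for name, by_reviewer in scores.items()
}
print(averages)
```

Scenario X ends up with a higher average than Scenario Y, even though the two reviewers who read both scenarios rated them the same.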

There are probably more caveats, so it should be clear that my way of determining the best scenarios is not a replacement for an actual vote. This means that you should take the following results with a grain of salt.

Results

Here are the top 5 scenarios according to the procedure described above:

  1. Hemimetabolism
  2. Werewolf Gimmick
  3. Pattern Recognition
  4. When The Boat Comes In
  5. TOGA Party

 The next three positions were all shared by two scenarios with the same score:

My method to rank these scenarios was not totally off, because I think that this is quite a cool collection. Go and check them out!
 
