Bots that are going to play Stall Catchers alongside humans

Just as the GAIA catcher bot participated in the 2021 International Catchathon, GAIA will be joined by two new catcher bots to help analyze the next dataset, so we can get a better idea of how well bots can work with humans and other bots to analyze Stall Catchers data. Today, researchers Laura Onac (L.) and Pietro Michelucci (P.) explain more about machine learning, bots, and how they will take part in Stall Catchers.

Machine learning might look like something very difficult that only data scientists work with. If I am not using any fancy machines, how does machine learning affect my everyday life?

L.: Machine learning algorithms are present in our everyday lives through the apps we use on our phones and desktops. Among other things, they ensure we get music playlists tailored to our preferences and see content that’s relevant to us in different media streams.

P.: A common application of machine learning is in recommendation algorithms, which try to figure out what your shopping preferences are and show you advertisements that are most likely to be interesting to you or result in a purchase. These same kinds of algorithms might recommend new movies to watch on Netflix based on movies you have already chosen to see.

Most of us have already encountered machine learning. Photo by Brian J. Matis - 2010

What does the machine that you are training look like?

P.: Most machine learning models used today involve some version of an artificial neural network, which is inspired by how the human brain works. They are much simpler than the brain, but they have “nodes” that behave like brain cells because they can be activated by other nodes and can themselves send a signal to other nodes, just like neurons.
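To make the idea of "nodes" a little more concrete, here is a minimal sketch in Python of a single artificial node and a tiny network built from three of them. It is only an illustration: the function names, the weights, and the choice of a sigmoid activation are assumptions made for this example and are not taken from any Stall Catchers model.

```python
import math

def neuron(inputs, weights, bias):
    """One node: a weighted sum of incoming signals passed through a sigmoid."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    # The sigmoid squashes the sum into (0, 1): how strongly the node "fires".
    return 1.0 / (1.0 + math.exp(-total))

def tiny_network(inputs):
    """Two hidden nodes feed one output node (all weights are made-up values)."""
    h1 = neuron(inputs, [0.5, -0.4], 0.1)
    h2 = neuron(inputs, [-0.3, 0.8], 0.0)
    # The output node is activated by the signals sent from the hidden nodes.
    return neuron([h1, h2], [1.2, -0.7], 0.05)

if __name__ == "__main__":
    print(tiny_network([1.0, 0.5]))
```

Training a real network means adjusting the weights and biases automatically from examples, rather than writing them by hand as done here.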

Human brains can be trained in many ways. Can machines also be trained in various ways?

P.: A common way to differentiate among different types of machine learning is based on how much supervision they get. Are they just expected to look at a bunch of data and figure things out on their own or does someone tell them when they are correct or incorrect?

L.: The main types of machine learning are:

Supervised learning, where we often use manually labeled data to train models that perform classifications or predict outcomes.
Unsupervised learning, where the models analyze unlabeled data to discover patterns.
Reinforcement learning, where intelligent agents interact with their environment and learn to maximize a certain goal by trial and error.

How does machine learning work in the Stall Catchers game?

P.: Machine learning has played a role in Stall Catchers since the beginning of the project. We have always used machine learning to find the vessel segments that are embedded in large images and draw outlines around them so we can create the vessel movies that are shown. We have also used machine learning to try to weed out bad vessel movies. And more recently, we are trying something completely new in the field of machine learning: we are turning machine learning algorithms into bots that can participate in crowdsourced data analysis among humans who are doing the same thing. In other words, we are turning the machine learning into “Crowd Bots”, or in this case, “Catcher bots”!

L.: Our machine learning models were developed by participants in a competition on DrivenData. We used Convolutional Neural Networks (CNNs) to find patterns in the videos and to formalize the concept of a “stall”.

How do bots and human participants play Stall Catchers alongside each other?

L.: CNNs are black box systems, so it’s difficult to interpret what they’re seeing in the data. We integrated the models into bots that look at new videos alongside the other players on the platform and classify the blood vessels as either flowing or stalled.

“Most machine learning models used today involve some version of an artificial neural network, which is inspired by how the human brain works” - Pietro Michelucci.

What are the differences between how humans and bots perform in Stall Catchers?

P.: This is a very interesting question that we hope to answer in our bot research. One main difference is that a bot will almost always give the same answer for the same input (vessel movie), whereas a human that is shown the same movie at different times might give different answers.

L.: Stall Catchers is a crowdsourcing platform, so the performance of the crowd is often better than the performance of a single person. While a single bot might be better at annotating vessels than an average person is, it is not better than the collective answer of a crowd.
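The reason a crowd can beat a single annotator can be sketched with a little arithmetic. The Python example below computes the chance that a simple majority of annotators is correct, under the simplifying assumptions that every annotator has the same accuracy and that their mistakes are independent. The function name and the 80% figure are hypothetical, and this is not necessarily how Stall Catchers combines answers.

```python
from math import comb

def majority_accuracy(n, p):
    """Probability that more than half of n independent voters are correct,
    when each voter is correct with probability p (n is assumed to be odd)."""
    need = n // 2 + 1  # smallest number of correct votes that forms a majority
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(need, n + 1))

if __name__ == "__main__":
    for crowd_size in (1, 3, 7, 15):
        print(crowd_size, round(majority_accuracy(crowd_size, 0.8), 4))
```

Under these assumptions, the combined answer becomes more reliable as the crowd grows, as long as each annotator is right more often than wrong.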

Laura - does your bot have a name?

L.: GAIA is the name of our first intelligent bot that uses deep learning to classify blood vessels as either flowing or stalled. Since it was the first one, we gave it the name of a Greek primordial deity - the great mother of all creation.

What makes GAIA learn?

L.: GAIA has learned from a small subset of 7,931 videos from the data annotated by all the Stall Catchers players.

P.: In her current form, GAIA does not continue to learn. (As is her prerogative as creator, Laura has previously specified that GAIA’s pronouns are she/her.) However, another research area we would like to explore is how bots can continue to learn and improve while they are playing Stall Catchers, by showing them the final crowd answers for vessel movies shown in new datasets.

Why do we need GAIA and other bots in Citizen Science?

Will there be more bots playing Stall Catchers in the future?

L.: Yes, we plan on including more bots in the platform in the near future.

P.: We are very excited to be pioneering new machine learning methods by incorporating these algorithms into Stall Catchers as Catcher bots. In this new study we will use three bots, and in the future, perhaps more. The most important aspect of this approach is that, unlike traditional uses of machine learning, it can be successful even when the bots are not perfect. Once machine learning researchers get past the mental barrier of having to create a near-perfect bot, many new opportunities open up for making an impact today with machine learning models that would otherwise be discarded or set aside in hopes of more training data to help improve performance.

Are all the bots the same, and can any bot participate in Stall Catchers?

P.: We want all the bots to be different from each other. The way our wisdom-of-crowds methods work in Stall Catchers is based on the idea that everyone thinks differently. If there are 5 other people analyzing the same vessel I’m analyzing, then when I get it wrong, hopefully there are enough other people who get it right that the combined answer is correct. And when someone else gets it wrong, maybe I’ll get it right. If all the bots always gave the exact same answer for the same vessel movies in Stall Catchers, then there wouldn’t be any value in having more than one bot playing. But what we discovered is that the 50 bots created in the ClogLoss machine learning challenge are all fundamentally different in their design, how they are taught, and how they decide on their answers.
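The value of diversity can be illustrated with a toy example in Python. The three bots and their answers below are entirely made up for illustration, and a plain majority vote stands in for whatever crowd method is actually used: each hypothetical bot is wrong on a different vessel movie, so the combined answer is right every time, while three copies of the same bot simply repeat that bot's mistake.

```python
from collections import Counter

def majority_vote(answers):
    """Return the most common label among the given answers."""
    return Counter(answers).most_common(1)[0][0]

def combine(bots):
    """Combine per-movie answers from several bots by majority vote."""
    return [majority_vote(votes) for votes in zip(*bots)]

# Made-up ground truth for four vessel movies.
truth = ["stalled", "flowing", "flowing", "stalled"]

# Hypothetical bots: each one makes a mistake on a different movie.
bot_a = ["flowing", "flowing", "flowing", "stalled"]  # wrong on movie 0
bot_b = ["stalled", "stalled", "flowing", "stalled"]  # wrong on movie 1
bot_c = ["stalled", "flowing", "stalled", "stalled"]  # wrong on movie 2

diverse = combine([bot_a, bot_b, bot_c])  # different mistakes cancel out
clones = combine([bot_a, bot_a, bot_a])   # identical bots add nothing new

if __name__ == "__main__":
    print("diverse bots match truth:", diverse == truth)
    print("cloned bots match truth:", clones == truth)
```

In this sketch the errors cancel only because they fall on different movies; bots that share the same blind spots would not help each other in the same way.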

L.: Every bot is different and uses a different machine learning algorithm, each with their own strengths and biases. All the models integrated in the bots have been developed by participants in the DrivenData competition and only verified bots can participate in Stall Catchers.

P.: It’s exactly this diversity that gives us hope that multiple bots could be very useful to Stall Catchers, and we are soon going to try again with three bots playing instead of one, as we did with our GAIA pilot.

“Since it was the first one, we gave it the name of a Greek primordial deity - the great mother of all creation” - Laura Onac.

L.: And even then, no single bot is accurate enough to make the help of humans unnecessary. We are currently studying the performance of hybrid crowds, with both humans and bots.

P.: Yes, no single bot is able to achieve close to 99% accuracy in Stall Catchers, which is our goal. But we are hoping that by allowing bots to play Stall Catchers alongside human players, we will be able to speed up the analysis without any loss of data quality. After all, bots do not need to sleep :)

A bot that never sleeps sounds like a perfect assistant! Is there any scientific achievement made with machine learning that could not have been achieved without it?

P.: Machine learning is good at detecting complex patterns that humans might not notice. But machine learning is not very good at knowing when patterns are meaningful or important - because that requires the kind of working knowledge that typically only humans possess. So it’s really a partnership between machine learning and humans that has enabled new discoveries, including answering research questions with Stall Catchers.

Thank you, Laura and Pietro, for the discussion! If you would like to learn more about Stall Catchers and play alongside GAIA and the two NEW bots, please join us in analyzing the next Stall Catchers dataset!