The newsworthy.ml team, generously supported by Concertation Montreal as winners of the AI4Good Entrepreneurial Award, has developed an ML-powered platform to help you spot bias in your own news-reading habits. Click through their dynamic map to read local news stories that often get buried under aggrandized headlines, follow their development journey, and find out where they're headed next!
Team members (top row left to right): Janna Agustin, Harpriya Bagri, Jodi Boone
Team members (bottom row left to right): Miya Keilin, Raveen Sidhu
Keywords: Information literacy, disinformation, misinformation, media, news, sensationalism, news bias, polarization
How can AI be leveraged as a driver for social good? We revisited this question many times as we, the newsworthy.ml team, advanced through AI4Good’s intensive curriculum of machine learning and Artificial Intelligence this summer.
The virtual setting of the Lab provided a daily reminder of the adjustments we’ve all faced in the wake of the pandemic, so the application of AI as a mitigator of COVID-19 was frequently discussed during the lectures and seminars. In fact, the Lab’s hackathon was COVID-19 themed. All of our current team members chose to embark on the news project during the hackathon, and returned to further develop the prototype for Demo Day.
THE PROBLEMS WITH OUR INFORMATION CONSUMPTION AND LITERACY
The digital age has entirely changed media and the ways in which we consume it. With this in mind, we identified two problems that we wanted our project to address:
Throughout the COVID-19 crisis, misinformation has seemed to spread at a rate greater than ever before. Not only that, but every single day it has felt like there’s been another overwhelming update about the pandemic in the news. News about other topics has felt sparse and fleeting. Moreover, global events have dominated the news on social media and television, making it difficult to hear what is happening in our local communities. Community news is essential to navigating local hotspots, testing-sites, and restrictions, yet this crucial information has felt harder to access than ever before.
Echo-chambers of information
The very algorithms designed to provide personalized content that keeps us scrolling and clicking are the same algorithms that entomb us in echo chambers, perpetuating our own biases. Social media platforms like Facebook and Twitter disseminate stories in a manner that complicates our access to, and understanding of, important news stories. Although these algorithms are comfortable and convenient for users, by dictating what people see and what they do not, they allow opinions to go unchallenged. We believe divisive opinions are amplified by the echo chambers that catered media creates.
A way out of the chamber?
As a team, we wanted to address these gaps in news consumption. We believed that there had to be a better way to stay in touch with accurate news about local and international events without contributing to echo chambers. Our goal was to create something that people could use to gain a less-biased and more-holistic understanding of the media they consume.
SURELY THERE’S AN APP FOR THAT?
We aligned ourselves in this manner because we saw other news dissemination apps were personalized and perpetuated bias. Other projects that addressed media bias and coverage imbalance were either discontinued or not actionable; they would bring awareness through a static data set, but would offer no tips on how readers could overcome this.
Figure 1: Other apps, chrome extensions, and projects like Unfiltered.news, Flipboard, Google Chrome extensions, News tab, Panda 5, Breaking News Tab varied between urging exploration, perpetuating bias, and being actionable.
A NEWSWORTHY SOLUTION?
Our solution was newsworthy.ml: a site that displays news articles on a map with trending scores to keep you alert to events in your community as well as around the world. The trending score shows which events are of global significance, so you can learn about big news stories without being overwhelmed while staying informed about smaller local stories.
Our vision for newsworthy.ml also included a dashboard to make readers aware of their news-reading habits and how they could improve their information and news literacy skills. We also envisioned a Chrome extension that could provide tips and analytics on articles that users read on their own.
HOW DOES IT WORK?
Demo Day Version (v0.1)
The design presented on Demo Day was the first iteration of newsworthy.ml and also featured filters so that users could navigate between topics.
Figure 2: newsworthy.ml’s first prototype that was presented on Demo Day in July 2020.
For the first prototype that we presented at AI4Good’s Demo Day, we focused on building functionalities for the map feature. We scraped news articles from 21 news outlets in Canada, mostly those based in the Metro-Vancouver area. We then used topic modelling to group these articles by topic, with each topic represented as a bubble on the map. We used scoring by frequency to represent the amount of coverage each topic gets, shown by the size of each topic bubble. Finally, the colour of each bubble represented the news category that topic belonged to.
In total, we scraped 4206 news articles from these 21 outlets, sampled from a range of news categories.
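The frequency scoring behind the bubble sizes can be sketched roughly as follows. This is only an illustration under our own assumptions, not the code that runs on newsworthy.ml: the function name `bubble_radii`, the `max_radius` parameter, the square-root scaling, and the sample topic labels are all hypothetical.

```python
import math
from collections import Counter


def bubble_radii(article_topics, max_radius=40.0):
    """Map each topic to a bubble radius based on how many articles cover it.

    `article_topics` holds one topic label per article (hypothetical input).
    The most-covered topic gets `max_radius`; other topics are scaled with a
    square root so that bubble *area* is proportional to the article count.
    """
    counts = Counter(article_topics)
    if not counts:
        return {}
    most_covered = max(counts.values())
    return {
        topic: max_radius * math.sqrt(n / most_covered)
        for topic, n in counts.items()
    }


# Made-up example: 8 articles on one topic, 2 on another, 1 on a third.
sample_topics = ["covid"] * 8 + ["transit"] * 2 + ["housing"]
radii = bubble_radii(sample_topics)
```

Scaling the radius by the square root of the count is a design choice we assume here so that a topic with four times the coverage covers four times the area, rather than sixteen times.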
Figure 3: Breakdown of articles we pulled from each respective outlet.
Figure 4: Breakdown of news articles by category.
V0.1 Non-Negative Matrix Factorization
To generate topics from the articles we scraped, we used an unsupervised technique called Non-negative Matrix Factorization (NMF). NMF factorizes the data matrix A, which is built from each article’s text, into two further matrices, W and H, such that none of the three matrices has any negative entries. W and H represent the topics found and the weights of those topics, respectively. Starting from initial values, W and H are iteratively updated until their product approximates A.
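To make the factorization concrete, here is a minimal pure-Python sketch of NMF using the classic multiplicative update rules. It is not the library code we ran (that was gensim and sklearn); the helper names, the tiny term-by-article matrix, and the fixed iteration count are assumptions made for illustration. Rows of A are terms and columns are articles, so W holds the topics and H holds the topic weights per article, matching the description above. Note that sklearn expects the transposed layout (articles as rows).

```python
import random


def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def transpose(X):
    return [list(col) for col in zip(*X)]


def frobenius_error(A, W, H):
    """Frobenius norm of A - WH, i.e. how far the product is from A."""
    WH = matmul(W, H)
    return sum((a - b) ** 2 for ra, rb in zip(A, WH) for a, b in zip(ra, rb)) ** 0.5


def nmf(A, n_topics, n_iter=200, seed=0, eps=1e-9):
    """Factorize non-negative A (terms x articles) into W (terms x topics)
    and H (topics x articles) with multiplicative updates."""
    rng = random.Random(seed)
    n_terms, n_articles = len(A), len(A[0])
    # Random positive starting values for W and H.
    W = [[rng.random() + eps for _ in range(n_topics)] for _ in range(n_terms)]
    H = [[rng.random() + eps for _ in range(n_articles)] for _ in range(n_topics)]
    for _ in range(n_iter):
        # H <- H * (W^T A) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, A), matmul(matmul(Wt, W), H)
        H = [[h * n / (d + eps) for h, n, d in zip(rh, rn, rd)]
             for rh, rn, rd in zip(H, num, den)]
        # W <- W * (A H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(A, Ht), matmul(W, matmul(H, Ht))
        W = [[w * n / (d + eps) for w, n, d in zip(rw, rn, rd)]
             for rw, rn, rd in zip(W, num, den)]
    return W, H


# Made-up counts: rows are the terms "vaccine", "testing", "transit", "bus";
# columns are four articles.
A = [
    [3, 2, 0, 0],
    [2, 3, 0, 0],
    [0, 0, 4, 1],
    [0, 0, 1, 3],
]
W, H = nmf(A, n_topics=2)
```

Because each update only multiplies by non-negative ratios, W and H stay non-negative throughout, which is what keeps the resulting topics interpretable as additive combinations of terms.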
We used two different libraries to implement NMF in v0.1. To determine the optimal number of topics, we used gensim’s NMF model because it was compatible with the gensim Coherence Model. Model coherence was the metric we used to choose the optimal number of topics. We then used sklearn to extract the topics from our dataset and sort the articles into their respective topics.
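The idea of choosing the number of topics by coherence can be sketched without any libraries. The example below is a simplified stand-in, not gensim's implementation: it uses a UMass-style coherence score (an assumption, since several coherence measures exist), and the function names, toy documents, and hand-written candidate topics are hypothetical. In practice each candidate would come from fitting a topic model with that number of topics.

```python
import math


def umass_coherence(top_words, documents, smoothing=1.0):
    """UMass-style coherence of one topic: average log ratio of how often
    word pairs co-occur in documents versus how often the earlier word occurs."""
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents containing all of the given words.
        return sum(all(w in d for w in words) for d in doc_sets)

    scores = []
    for i in range(1, len(top_words)):
        for j in range(i):
            base = doc_freq(top_words[j])
            if base == 0:
                continue  # skip words that never appear in the corpus
            joint = doc_freq(top_words[i], top_words[j])
            scores.append(math.log((joint + smoothing) / base))
    return sum(scores) / len(scores) if scores else 0.0


def model_coherence(topics, documents):
    """Mean coherence over all topics of one candidate model."""
    return sum(umass_coherence(t, documents) for t in topics) / len(topics)


def pick_num_topics(candidates, documents):
    """Return the topic count whose topics have the highest mean coherence."""
    return max(candidates, key=lambda k: model_coherence(candidates[k], documents))


# Toy tokenized articles (made up for illustration).
docs = [
    ["covid", "testing", "clinic"],
    ["covid", "vaccine", "clinic"],
    ["transit", "bus", "strike"],
    ["transit", "bus", "fare"],
    ["covid", "testing", "vaccine"],
]
# Hypothetical top words per topic for two candidate topic counts.
candidates = {
    2: [["covid", "testing", "vaccine"], ["transit", "bus", "fare"]],
    3: [["covid", "bus", "strike"], ["transit", "clinic", "fare"],
        ["testing", "vaccine", "strike"]],
}
best_k = pick_num_topics(candidates, docs)
```

Topics whose top words tend to appear in the same articles score higher, so the candidate that mixes unrelated words is penalized.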
Figure 5: Our workflow and ML pipeline for v0.1.
Figures 6 and 7: Two examples of generated topics from the classifier after training it with ~1700 articles.
V0.1 Support Vector Machine
To classify the topics into one of ten preset news categories, we used a supervised model called Support Vector Machine (SVM). Our categories were as follows:
V0.2 Technical Details
Since Demo Day, many of our pipelines have improved. For example, we no longer use a static dataset. Web scraping now runs automatically every Sunday, which means we collect and present news out of the Metro-Vancouver area, hot off the press!
Figure 8: v0.2 of newsworthy.ml features UI changes.
Aside from automated web scraping, here are some other things that have been improved in newsworthy.ml’s v0.2:
News classifier model improved to 95% accuracy
NLP model trained for geoparsing
Improved topic modelling algorithm by 50%
Secured domain and integrated with Vercel
Restructured backend for scaling
Topic model and classifier model automated on AWS
Mobile-optimized our site
As mentioned earlier, an essential component of newsworthy.ml is a dashboard that would provide users with analytics based on their news-reading habits and improve their news intake with suggestions on information and news literacy. A Chrome extension that presents analytics is also in development.
Figure 9: A mock-up of the dashboard that is currently in development for future launches.
Figure 10: A mock-up of the Chrome extension feature that is currently in development for future launches.
WHAT WE LEARNED
To make newsworthy.ml a fully-functioning web app, we quickly realized that we had to be adaptable and open to learning as a team. In our group of five, every one of us had to learn something new for this project, whether that was navigating MapBox, landing page development, social media marketing, or topic modelling. As we continue to develop newsworthy.ml, we will likely keep expanding our skill sets.
Winning the prize helped us realize there was potential in our idea and that others wanted to see our work come to life. Had it not been for AI4Good’s support, it is unlikely we would have seen the success that we did. We have been able to expand our demo day prototype into a more developed product. We have a long way to go to achieve our goal with newsworthy.ml (with a fully functioning dashboard, chrome extension, and an expanded set of global articles), but we are willing to put the work in to accomplish this. AI4Good saw our potential and made us realize we had it too.
July 2020: AI4Good Entrepreneurial Award
After winning ‘Most Likely to Become an Entrepreneurial Business’ from AI4Good, we continued to develop and compete.
September 2020: Youth Impact Challenge Top 7
After winning the AI4Good Lab’s top prize, we submitted newsworthy.ml to the Youth Impact Challenge (YIC). We were selected as one of the Top 30 teams out of 180+ applicants, for which YIC awarded our team $1k. After competing among the Top 30 teams, we were selected as one of the Top 7 teams in the challenge and received additional prize money.
October 2020: pioneer.app Global Top 50
Most recently, as of October 2020, newsworthy.ml placed within the Top 50 Projects on the Global Leaderboard on pioneer.app, a remote accelerator. On the America (Non-US) Leaderboard, we ranked in the Top 3.
WE NEED YOUR HELP!
We will continue to improve newsworthy.ml. Currently, our goal is to remain in the Top 50 Projects on pioneer.app.
We are moving in the direction of a closed beta with a revamped landing page for v0.3. If you are interested in giving feedback or becoming a beta user, feel free to message newsworthy.ml at firstname.lastname@example.org.
Visit newsworthy.ml to try out the app!
newsworthy.ml would not exist today if it were not for the kind and patient individuals who gave us their time, unwavering support, and insightful feedback. During the intense 3-week project-development phase, our team’s TA, Aayushi Kulshrestha, met with our team nearly every day to keep us on track for our Demo Day deadline. Aayushi actively provided us with information and tips on how we could improve our site.
Our AI4Good mentor, Francis Duplessis, went above and beyond to connect newsworthy.ml with resources and encouraged us when it was much needed.
Lastly, Maya Marcus-Sells, Christina Isaicu, and Yosra Kazemi managed to convert this program to an online format and created a positive and encouraging environment that enabled us to create newsworthy.ml.
Thank you to all these individuals and the rest of the AI4Good Team and 2020 cohort for your continued support. Thank you for believing in us.