Alex Whidden, Jocelyn Wong, Kate Kim, Marie-Elise Latorre, Lily Andruschak, Kiara Wimbush
Track and predict your hormone levels without having to wait for your next blood test.
Thyroid, ARIMA, Gaussian Process Regression, hormone levels
Inspiration for the project
Our teammate Lily was diagnosed with Graves' disease in early 2019. Because the thyroid is a hormone-producing gland that affects every cell, tissue, and organ in the body, the disease took an extreme toll on them. Unpredictable changes in energy levels and in regulatory functions such as heart rate hindered Lily's everyday life. Although they have since received treatment in the form of surgeries and medication, Lily often struggles with the medical system and its lack of consistency in monitoring their symptoms and hormone levels. Lily believed that the gap they saw in their own life while navigating this disease could be tackled through AI, making a positive impact on the more than 200 million people affected across the globe.
Indeed, the need for such an application became clear when we came across a Reddit community whose members had already been sharing their experiences in hopes of reaching a solution. When we shared our project with them, we received many positive comments from members expressing gratitude that we were addressing this taxing matter, which rarely receives recognition.
In addition to leveraging Lily's knowledge of the disease, our team first conducted background research to better understand the roles and associated mechanisms of the thyroid. Baseline hormone levels, the factors that influence where this baseline lies, and the ways different hormones interact with one another were among the many topics we delved into in the early phases of our project. An appropriate machine learning model must accurately reflect the problem at hand, which one can only grasp with a solid understanding of its scope. While what we aimed to accomplish was different from Hu et al.'s work, studying their findings gave us guidance on which features may or may not be important, which models to look into, and which limitations were plausible.
List of technologies used
- Google Colab
- Gaussian Process Regression
Detailed description of the project
People who suffer from thyroid diseases typically take blood tests only every 6 weeks to check whether their hormone levels are in the healthy range. However, their thyroid medication needs to be taken daily. Tyro accompanies you in this process by keeping track of your hormone levels between doctor's visits. Using an easily accessible record of your bloodwork and symptoms, we generate intuitive visualizations of previous and forecasted hormone levels, allowing you to make better-informed decisions and take preemptive measures. These functions are based on Lily's personal experience navigating their thyroid health.
The most common type of surgery to treat hyperthyroidism (i.e. an overproduction of thyroid hormones) is a total thyroidectomy, which is performed on 150 thousand individuals in the U.S. each year. Such was Lily's case: after getting their thyroid removed, they had to take daily medication to compensate for the lack of thyroid hormone production. Our app is built for individuals like Lily, whose hormone levels can be attributed solely to medication.
Since we wanted to model and predict the user's thyroid hormone levels through time, we looked for a time series dataset with measurements of the same person's hormone levels at different moments. However, we couldn't find such a dataset and decided to create our own from Lily's existing data. We also compiled a list of 20+ potential candidates who are willing to contribute to a more complete dataset. Collecting such data is very sensitive and requires following ethical guidelines and respecting privacy regulations, so we left this for future work and focused only on Lily's data to develop our models.
Tyro builds a personal time series model for each user, trained on the blood levels they previously filled in. The trained model then generates predictions of the user's hormone levels and highlights symptoms they had recorded when their hormones were at similar levels. After exploring different models, we chose a combination of ARIMA and Gaussian Process Regression, two explainable models that work well with small data sets. The two serve complementary purposes: ARIMA makes long-term predictions about the user's hormone levels in the forthcoming months, while Gaussian Process Regression makes short-term predictions about daily hormone levels.
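The symptom highlighting described above could be sketched roughly as follows. This is a minimal illustration, not Tyro's actual code: the `BloodTest` record layout, the `symptoms_at_similar_levels` helper, the 0.5 mIU/L tolerance, and all values are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class BloodTest:
    """One logged blood test (hypothetical record layout)."""
    taken_on: date
    tsh: float            # TSH level in mIU/L
    symptoms: list[str]   # symptoms the user recorded around that test


def symptoms_at_similar_levels(history, predicted_tsh, tolerance=0.5):
    """Return symptoms logged when TSH was within `tolerance` of the prediction."""
    matches = []
    for test in history:
        if abs(test.tsh - predicted_tsh) <= tolerance:
            for symptom in test.symptoms:
                if symptom not in matches:  # keep first-seen order, no duplicates
                    matches.append(symptom)
    return matches


# Made-up values for illustration only (not real patient data).
history = [
    BloodTest(date(2022, 1, 10), 0.3, ["palpitations", "insomnia"]),
    BloodTest(date(2022, 2, 21), 2.1, []),
    BloodTest(date(2022, 4, 4), 5.2, ["fatigue", "feeling cold"]),
    BloodTest(date(2022, 5, 16), 4.9, ["fatigue"]),
]

similar = symptoms_at_similar_levels(history, predicted_tsh=5.0)
print(similar)
```

A fixed tolerance is the simplest possible notion of "similar levels"; a real implementation might scale it with the uncertainty of the prediction.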
Here, the graph describes 6 data points: Lily's TSH levels over the course of 5 blood tests, followed by the ARIMA prediction for the next blood test. ARIMA stands for AutoRegressive Integrated Moving Average. It is a statistical model designed for the analysis of non-seasonal time series that show patterns. We implemented it using the time series analysis module of Python's statsmodels library. An ARIMA model is characterized by 3 parameters (p, d, q), respectively: the order of the autoregressive term, the number of differencing steps required to make the series stationary, and the order of the moving average term (i.e. the number of lagged forecast errors that should go in the model) [1]. By experimenting with different parameter values, we found a model that fits the data well and provides realistic predictions. Tyro uses ARIMA to predict the user's blood levels in the forthcoming months, at the given dates of the blood tests.
This graph shows the same data that was used on the previous graph, but this model gives a slightly different prediction.
Gaussian Process Regression is a supervised learning method that uses a Bayesian approach to solve regression problems. First, it assumes a Gaussian Process prior that is specified using a mean function and a covariance function [2]. The mean is usually 0 or the mean of the training data; our model uses the latter. The covariance is specified using a kernel function, which can be selected from a variety of options such as the Linear, Matern, and RBF kernels. We tried different kernels and chose the RBF kernel because it provided the best predictions for our data. To implement the model, we used the Gaussian Process Regression implementation provided by scikit-learn [3]. Once the prior is specified, the model computes a posterior distribution by conditioning the prior on the training observations. Evaluated at a new test input, the posterior yields a mean predicted value and a standard deviation for the prediction.
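For reference, the prior and posterior can be written out in their standard textbook form. The notation is an assumption introduced here, not taken from the project: X and y are the training inputs and targets, X_* the test inputs, m the prior mean function, \ell the RBF length scale, and \sigma_n^2 an observation noise variance.

```latex
% RBF kernel with length scale \ell
k(x, x') = \exp\!\left(-\frac{(x - x')^2}{2\ell^2}\right)

% Posterior mean at the test inputs X_*
\mu_* = m(X_*) + K(X_*, X)\,\bigl[K(X, X) + \sigma_n^2 I\bigr]^{-1}\,\bigl(y - m(X)\bigr)

% Posterior covariance at the test inputs X_*
\Sigma_* = K(X_*, X_*) - K(X_*, X)\,\bigl[K(X, X) + \sigma_n^2 I\bigr]^{-1}\,K(X, X_*)
```

The predicted value is \mu_*, and the standard deviation of each prediction is the square root of the corresponding diagonal entry of \Sigma_*. When a test input is far from every training input, K(X_*, X) approaches zero and the prediction falls back to the prior mean.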
Tyro uses Gaussian Process Regression to predict the user's daily hormone level between blood tests, because the model works well for interpolating values between data points. The model also provides a confidence interval for its predictions, which gives users a way to judge the reliability of the values. It is less effective for long-term predictions because, far from the observed data, the prediction reverts to the mean of the data. This is why it is complemented with ARIMA to complete the machine learning component of our application.
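The interpolation and mean-reversion behavior can be illustrated with scikit-learn's `GaussianProcessRegressor`. This is a sketch under stated assumptions: the blood test values are made up, the length scale of 40 days and the noise level `alpha` are guesses, and hyperparameter optimization is switched off (`optimizer=None`) only to keep the example deterministic.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Made-up blood tests: day of test and TSH (mIU/L), roughly six weeks apart.
test_days = np.array([0, 42, 84, 126, 168], dtype=float).reshape(-1, 1)
tsh = np.array([4.8, 3.9, 3.1, 3.4, 2.8])

gpr = GaussianProcessRegressor(
    kernel=RBF(length_scale=40.0),  # assumed length scale, in days
    alpha=0.05,                     # assumed noise variance (on normalized targets)
    normalize_y=True,               # prior mean becomes the training mean
    optimizer=None,                 # keep the kernel fixed for a deterministic sketch
)
gpr.fit(test_days, tsh)

# Daily predictions between tests, plus a stretch well beyond the last test.
days = np.arange(0, 400, dtype=float).reshape(-1, 1)
mean, std = gpr.predict(days, return_std=True)
lower, upper = mean - 1.96 * std, mean + 1.96 * std

print(f"day 100: {mean[100]:.2f} (95% interval {lower[100]:.2f} to {upper[100]:.2f})")
print(f"day 399: {mean[399]:.2f}, training mean: {tsh.mean():.2f}")
```

Between blood tests the interval stays comparatively narrow, while far past the last test the prediction settles at the training mean and the interval widens. That pattern is the reason to use this model for short-term values only.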
Impact & Innovation
While our competitors also track hormone levels, our innovation lies in our predictive component, which forecasts when levels may go outside of the normal range and attaches a confidence interval to each forecast. We also continually improve each model by comparing our predictions to the actual results. The other options available on the market only serve as a hub for blood test results. As a result, health complications still have to be dealt with as they arise, since these platforms have no way to foresee them.
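One simple way to turn forecasts with uncertainty into an out-of-range warning, and to score past predictions against actual results, is sketched below. Everything here is hypothetical: the helper names, the made-up forecasts, and the reference range of 0.4 to 4.0 mIU/L, which is a commonly quoted TSH range that differs between laboratories.

```python
def first_day_outside_range(means, stds, low=0.4, high=4.0, z=1.96):
    """Return the first index whose z-sigma interval leaves [low, high], else None."""
    for day, (mean, std) in enumerate(zip(means, stds)):
        if mean - z * std < low or mean + z * std > high:
            return day
    return None


def mean_absolute_error(predicted, actual):
    """Average absolute gap between earlier forecasts and the results that followed."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


# Made-up daily forecasts (mIU/L) with growing uncertainty, for illustration only.
means = [2.0, 2.3, 2.7, 3.1, 3.4]
stds = [0.1, 0.2, 0.3, 0.4, 0.5]

warning_day = first_day_outside_range(means, stds)
error = mean_absolute_error([3.0, 2.5], [3.4, 2.2])
print(warning_day, error)
```

Flagging on the interval rather than on the mean alone makes the warning more conservative: a day is flagged as soon as an out-of-range value becomes plausible, not only once it is the most likely outcome.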
We acknowledge that we are not professionals in the medical field and took the necessary precautions prior to launching our project. Our medical disclaimer emphasizes that Tyro is not a substitute for professional medical advice. When formulating our machine learning model, we prioritized transparency to reduce unforeseen biases in blackbox operations. Explainability makes result tracing and model improvement easier if issues were to arise. Tyro's dedication to data security also guarantees that all personal information is confidential and secure. These measures ensure that Tyro receives informed consent from all participants.
Challenges we ran into & how we overcame them
The first and most obvious challenge we ran into, as mentioned previously, was our lack of a complete dataset. Previous research has shown that other aspects of one's health and background affect TSH levels, so we limited our data to people who have had a total thyroidectomy, whose hormone levels we could attribute fully to their medication. Many datasets report patients' TSH and T4 hormone levels, but we weren't able to find one that focuses on patients whose thyroid was removed. Therefore, we decided to base our model solely on Lily's past blood tests, and planned to leverage platforms like Reddit to build our own dataset.
Due to the short timeframe of the bootcamp, although we found people who were enthusiastic to contribute, many of them were not able to retrieve their results in time to help us with our project. This was also an issue when we tried to gain access to related datasets used in other studies. As a result, we adapted our models so that they would build on each user's dataset separately.
Another reason we decided not to use existing datasets was the privacy concerns that could arise. Although hospitals do issue datasets, they were generally hard to access, especially for a group of 6 students. This was only the beginning of our ethical concerns, and more emerged as we developed the project: questions arose concerning the data we were collecting from both the volunteers found on Reddit and the app's future users. These issues have not all been dealt with, as the app has not been launched yet and no medical information has been retrieved apart from Lily's. We wrote a tentative consent form to be signed by anyone willing to contribute to our dataset but, if the project were to continue, we would first have to consult medical ethics professionals before collecting more data.
What we learned & accomplishments we're proud of
The AI4Good Lab taught us many useful skills, empowering us in our pursuit of AI-related careers. During the lectures, conferences and workshops, we were equipped with many tools that will undoubtedly be useful in our careers in machine learning: introductions to collaborative interfaces for coding and designing, an overview of the different types of models, which allowed us to identify those that could work with our data, and tips for convincingly pitching and marketing our ideas.
As for the project-building phase, we were able to apply all of these skills, and learned many more as time went on. We all learned more about thyroid health in addition to connecting with the community that would benefit from our project. On the computational side, some of us explored a wide variety of models beyond the scope of what was taught in the sessions, while others familiarized themselves with various UI/UX tools. Because we come from different disciplines, we were able to accommodate everyone's strengths and weaknesses. Despite working virtually across different time zones, we stayed motivated and on track, producing something we were all proud of in the end.
What's next for us & the project
While our team still believes in Tyro and can see many exciting ways we could continue to improve, we are all pursuing different paths and have no plans to actively work on our project. If anyone is interested in taking up Tyro, feel free to contact us at email@example.com and we'd be happy to answer any questions.
However, if we were to further our project, we would first consult medical professionals, data scientists, and ethics advisors to ensure that everything is as accurate and ethical as possible. Only after that would we work on finishing our back-end implementation and formally launch the beta version of our web app for user feedback.
We would try to collect data that represents a wider variety of cases, such as women and adults over 85 years old, since both groups are at a higher risk. We would also include other relevant parameters such as age, gender, BMI, and additional medications to improve our predictions.
The first feature we'd like to include is natural language processing on users' symptoms inputs, which would allow us to discover trends and thus further analyze their physical state.
Finally, our ultimate goal is to expand our user base to include other thyroid disease patients, like those who still have their thyroid intact, as well as to include a community aspect that would serve as a platform for sharing experiences and questions.
Though we are saddened to say goodbye to Tyro, we believe that alternative platforms on the market like ThyForLife can benefit those affected by thyroid diseases.
Kiara, Marie-Elise, and Jocelyn are all continuing into their third year of the Cognitive Science Undergraduate program at McGill University, with an enhanced interest in AI classes.
Alex has graduated from their Electrical Engineering program at Dalhousie University, and has an increased interest in pursuing a Master's degree involving machine learning.
Lily is entering their final year of the Bachelor of Computer Science program at Dalhousie University. Upon graduation, Lily would like to pursue employment in the area of AI and Machine Learning, having been inspired by the Tyro team's success at AI4Good.
Kate has graduated with a degree in Neuroscience from McGill University and is planning on pursuing further education on the use of machine learning to answer questions in the field of genomics and medicine.
References & Acknowledgements
We would like to thank our TA Hugo, as well as our mentors Mojgan and Arjun, for their unending assistance. Hugo provided resources and encouraged us to make mistakes, as well as keeping us on track and providing unwavering encouragement. We were also lucky in that both of our mentors are from radically different fields, allowing us to receive a variety of perspectives when tackling problems. Mojgan, being from a medical background, was able to better inform us of the necessary steps to take when taking on a project in this subject matter. Arjun, being a data scientist, gave us an objective and tangible way of thinking when working on the model itself. We are immensely grateful for the opportunity to participate in the AI4Good program, which wouldn't have been possible without the administrative team, as well as all the lecturers and guest speakers that shaped the curriculum. We would also like to thank the 2022 cohort, for believing in our project and voting for us for the Montreal Accelerator Award.
[1] "Introduction to ARIMA Models," People.Duke.Edu, 2022, https://people.duke.edu/~rnau/411arim.htm
[2] Hilarie Sit, "Quick Start to Gaussian Process Regression," Towards Data Science, June 19, 2019, https://towardsdatascience.com/quick-start-to-gaussian-process-regression-36d838810319
[3] "1.7. Gaussian Processes," scikit-learn, 2022, https://scikit-learn.org/stable/modules/gaussian_process.html