Modifying My Expectations
It is perhaps one of my personality traits that I cannot follow any timeline I set up for myself, and this time is no different. I could easily become a raging cynic here, but that would be too painful and boring to read, so I will refrain since I do not want to scare you away!
My original timeline for the Outreachy internship with PyMC3 looked like this:
May 17 - May 24: Introduction & community bonding period
May 24 - June 10: Getting a better understanding of the existing PyMC3 use cases. Go through all of the notebooks in ‘examples’ to understand the current use cases and applications of PyMC3, bring them up to best practices using numpy.random.Generator, ArviZ, xarray
June 11 - June 17: Update overview wiki pages as and where required to sync with the updated notebooks. Understand the changes and upgrades in PyMC3 v4, learn more about Aesara and its usages
June 18 - July 10: Update the existing notebooks to work with PyMC3 4.0 and Aesara
July 11 - July 25: Work on generating example notebooks to highlight key differences between usages of PyMC3 v3 and v4, write wiki pages to give an overview of the usage differences
July 26 - Aug 4: Reviews, modifications in notebooks implemented so far (buffer period to work on rectifying any implementation mistakes, finishing pending notebooks, updating the wiki)
Aug 5 - Aug 15: Write/augment the existing documentation for the example notebooks
Aug 16 - Aug 24: Publish a blog post about the project, write additional usage guides/documentation as required
Pretty ambitious, right? So where did it all go wrong?
I started a week later than my fellow Outreachy May ’21 interns because of engagements with my previous workplace. A few weeks after beginning, I had to take extensions: I moved to a part-time arrangement and also took an entire week off when COVID-19 descended upon my family. So now I am running a whole month behind the others.
To say it has been frustrating not to have made the progress I was hoping for by now is an understatement. Among the ambitious things I had hoped to achieve, one crucial thing that I breezily overlooked was how much I would struggle to understand the concepts used in the tutorial notebooks I am meant to work on. A pretty big detail for sure! There are about 80 notebooks in the repository and I am in the 8th week of my internship, yet I am nowhere close to being 50% done. Understanding theoretical concepts has easily been the trickiest part of my internship so far (apart from working with new Python libraries and getting to know their APIs). That has always been my experience with machine learning and statistics: I am endlessly intrigued by them and curious to know more, but the subject is so broad and deep that I always find myself spiralling into confusion.
While I do not have a great track record of being realistic, my mentor thankfully does. He reassured me that this is how he had pictured the first half of the internship panning out, with the intern taking time to get familiar with PyMC3 and the environment, pushing commits incrementally, working on notebooks and learning the theory in each. So while I have gained enough familiarity to no longer fumble with my doubts and to know what to look up when I am stuck, I am only about 25-30% done with the notebooks.
The brighter part of my experience so far has definitely been my growing confidence in myself, along with the gradually increasing pace and decreasing hesitation in my work. I get stuck less often, push commits more often, ask for help more readily and have improved my communication ever so slightly. I realize that timelines are meant to serve as a reference point, not to dictate our actions but to steer them in a fruitful direction. I can only hope to be more patient with myself and to enjoy learning more about a subject (machine learning and statistics) I frustratingly can’t get enough of!