It’s hard to believe it’s already been over a month since my last update.
I’m still plugging away at the Coursera Data Science Specialization. I’m just finishing up the third course, Getting and Cleaning Data.
The first course, The Data Scientist’s Toolbox, was essentially a brief overview of the data science field along with some of the tools involved. It didn’t have much depth (by design), but it was interesting, and a good course for getting one’s feet wet.
The second course was titled R Programming, and it was a pretty painful course. The lectures didn’t really prepare me for the difficulty of the assignments, and the bizarre syntax of the R programming language didn’t help things either. But don’t let that discourage you; the forums were very helpful and filled in the gaps, and I certainly learned a good amount about R for such a short course. The course also makes use of Swirl, an R package that includes a series of interactive R tutorials. Swirl was an optional part of the course, but it was a nice touch, and you could get extra credit for completing the prescribed modules. That syntax though; ugh. I think I might end up being a Python man, but who knows.
The third course is somewhat neat; I’ve never taken a course like this, so it’s certainly been a unique experience. The vast number of R packages dedicated to making reading and cleansing data easier is also impressive, and I can see why so many prefer R for data science.
I’m hoping that the fourth course, Exploratory Data Analysis, will be more interesting than the second and third courses.
On the school front, I’ve managed to get accepted into UTSA’s M.S. in Statistics and Data Science program (formerly M.S. in Applied Statistics). I’m hoping to learn a lot, and of course, a master’s degree will certainly not hurt my employment prospects (I’ve noticed that many data science job postings require an advanced degree, and that the majority of them prefer candidates with one). The focus of the degree is certainly statistics, and the program appears to be very comprehensive. I think at this point I would prefer it that way, as I’ve heard employers prefer those educated in STEM rather than data science itself. Data science is also a very new subject for universities to teach, and the value of data science courses isn’t yet proven (but I’m not saying they’re without value). I also think I’ll be able to pick up most of the data science pieces on my own, through online courses and work experience; I’ll use my traditional coursework to build theoretical foundations, as I’ve already been doing up to this point.
Unfortunately, the program doesn’t start until next fall. But at least there are plenty of online courses available to keep me busy in the meantime.