The daunting task of getting your Mac ready for data science

Today I did something that I have been postponing for months: setting up an environment on my iMac so I can run my own data analysis projects.

I learned to use Python and data science packages such as pandas and matplotlib, but always in the safe environment of DataCamp. Now that I’m contemplating running my own projects, I had to install Python 3 and those packages on my own computer. I started searching online last week, got overwhelmed by the many different setups other people said they had installed, felt too confused to continue and closed my browser without installing anything.

This afternoon I was ready to try again. I was mentally prepared this time, so I took my time to compare the different ways of installing Python 3. I quickly discovered that I should narrow my search to Python for data science, as otherwise I would end up installing tools geared towards software developers.

I first installed Python 3 manually and then read on several data science websites about Anaconda, ‘your data science toolkit’ and ‘developed for solo practitioners’. That sounds like me. I installed it, created a new environment using the latest Python version (3.10.0) and ran straight into trouble when installing some packages. Of course the error messages were very human readable (not), but they mentioned lots of version numbers and greater than, equal to, or smaller than signs. I had clearly chosen the wrong version of Python to work with. I trashed the environment I had created, made a new one using the auto-suggested Python version and automagically everything I needed was in there.
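Those greater than and smaller than signs are version constraints: each package declares which Python versions it supports, and the install fails when your environment falls outside that range. Here is a minimal sketch of what such a check boils down to. The bounds `MIN_PYTHON` and `MAX_PYTHON` and the helper `is_supported` are made up for illustration and are not taken from any real package.

```python
import sys

# Hypothetical bounds for illustration: a package declaring
# "python >=3.7,<3.10" supports 3.7 up to, but not including, 3.10.
MIN_PYTHON = (3, 7)
MAX_PYTHON = (3, 10)


def is_supported(version, minimum=MIN_PYTHON, maximum=MAX_PYTHON):
    """Return True if (major, minor) satisfies minimum <= version < maximum."""
    return minimum <= tuple(version[:2]) < maximum


if __name__ == "__main__":
    current = sys.version_info[:2]
    print("Running Python %d.%d" % current)
    print("Within the hypothetical bounds:", is_supported(current))
```

With bounds like these, a brand-new Python release is rejected until the package catches up, which is probably why an older, auto-suggested version works more smoothly.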

As a final step I installed Jupyter Notebook by clicking the install button in Anaconda’s GUI and tested it with a bit of example code from one of the helpful instruction sites. It worked! I now have a fully functioning data science environment waiting for me to do some awesome projects.
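A quick check along these lines, pasted into a notebook cell, can confirm that the environment is wired up. It is a minimal sketch rather than the example code from the instruction site: the `check_packages` helper and the package list are assumptions for illustration, and it uses only Python's standard library.

```python
import importlib.util
from importlib import metadata

# Packages assumed for illustration; adjust to what your projects need.
PACKAGES = ["pandas", "matplotlib", "notebook"]


def check_packages(names):
    """Map each package name to its installed version, or None if missing."""
    report = {}
    for name in names:
        if importlib.util.find_spec(name) is None:
            report[name] = None
            continue
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Importable, but no distribution metadata under this name.
            report[name] = "unknown"
    return report


for name, version in check_packages(PACKAGES).items():
    print(f"{name}: {version if version else 'NOT INSTALLED'}")
```

Any package reported as NOT INSTALLED is missing from the active environment and can be added through Anaconda before starting a project.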