Quantitative Guide
Quantitative analysis software falls into two broad buckets: commercial and open source.
There is a real trade-off between commercial and open source software. Commercial software is built on relatively easy-to-use interfaces that reduce or eliminate the need to write code. Open source software is a collection of tools built by different developers to do different kinds of tasks, all with transparent code. Open source tools have more bugs and quirks, but they also tend to be more flexible and are where newer kinds of analysis show up first; commercial software often lags behind and limits the changes you can make to parameters.
Commercial software with graphical user interfaces and, generally, many point-and-click features
- The most popular are Stata (popular in economics and government), SAS (popular in business), SPSS (known for being user-friendly and used primarily in the social sciences) and MATLAB (popular in engineering and the physical sciences).
- Currently, MATLAB is available free at McMaster. Stata, SAS and SPSS must be purchased.
- They vary largely by the disciplines that use them. People generally choose one and learn it well, but may learn the basic features of others, or work with colleagues who use others, to access features that work better in that software.
- Each has kinds of analysis it works especially well for (for predictive modelling, SAS is great; for visualization of multivariate or geographic data, nothing beats MATLAB).
Free, open source software
- The two we are aware of are R/RStudio and Python
- R and RStudio are for data analysis. If that’s what you want to focus on, and you want to go open source, that’s where to start.
- Increasingly, some people are choosing to learn Python first, or only Python. The advantage is that it’s not specific to data analysis, so it can be used for many other purposes, including website development and scraping data off the internet.
- The two languages are similar, but different enough that you want to be fully comfortable with one before learning the other. Otherwise you can spend a lot of time hitting your head against the wall when your code doesn’t work.
- Others don’t have to buy the software you use in order to open and run your code and scripts.
- There are active user communities and a lot of peer to peer support for open source software.
- You can spend as much time, or more, getting your code to work as you spend running and interpreting your analysis.
- In exchange, you have much more control and flexibility in how you code your analyses. This is great if you know what you’re doing math-wise. It can also get you into trouble that commercial software, with its more rigid structure, will keep you out of.
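To give a feel for what coding an analysis yourself looks like, here is a minimal sketch in Python. It is an illustration only, not a recommended workflow: the data are made up, the function names `summarize` and `fit_line` are our own invention, and it uses only Python's built-in `statistics` module rather than the analysis packages most people would install. It computes a mean, a standard deviation and a simple straight-line fit.

```python
import statistics

# Hypothetical data, invented for this illustration:
# hours studied and the exam score each student received.
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 61, 70, 74]


def summarize(values):
    """Return the mean and sample standard deviation of a list of numbers."""
    return statistics.mean(values), statistics.stdev(values)


def fit_line(x, y):
    """Ordinary least squares with one predictor; returns (slope, intercept)."""
    x_bar = statistics.mean(x)
    y_bar = statistics.mean(y)
    # Sum of cross-products and sum of squares around the means.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


mean_score, sd_score = summarize(scores)
slope, intercept = fit_line(hours, scores)

print(f"mean score = {mean_score:.1f}, sd = {sd_score:.2f}")
print(f"fitted line: score = {intercept:.2f} + {slope:.2f} * hours")
```

Every step of the calculation is visible and can be changed, which is the flexibility described above. It is also where the risk comes from: nothing in the script stops you from fitting a straight line to data where that makes no sense.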
Your field and the kind of work you want to do generally dictate which quantitative software you learn first.
Those of us who have been doing data analysis for many years often come to learn a couple of different programs or languages, or at least pieces of several. Spark can support researchers working in Stata, SPSS, R or Python. McMaster Libraries has support for users of SAS.