Here you can find simple guides to different techniques for statistical analysis with the software Stata. The focus is on running and interpreting the analyses, not on the theory and assumptions that underpin them. The guides show the code and output from the statistics software, together with explanations in text. All code is meant to be reproducible, so you can download the data linked in the guides and follow along with the instructions. The site is run by Anders Sundell.
Different operations required to prepare the data for analysis.

Getting started with Stata
The different parts of the program, setting a project folder, loading data, do-files, etc.

Create datasets and import data
Import data or create a dataset from scratch.

Recode variables
Change or remove certain values from variables to prepare them for analysis, using the commands "recode", "generate" and "replace".

If qualifiers and conditions
Use conditions to run analyses and other commands on selected groups of observations.

Combining datasets
Add data from other sources with the command "merge".

Logarithms
Use the logarithmic transformation on variables to account for skewness, for instance arising from exponential growth.

Aggregate datasets
Use the command "collapse" to aggregate datasets to show statistics such as means and standard deviations for groups in the data.
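
To give a feel for the data preparation commands named above, here is a minimal sketch of a do-file. It assumes the example dataset auto that ships with Stata rather than the data linked in the guides, and the new variable names (rep3, expensive, ln_price, mean_price, sd_price) and the cutoff of 6000 are made up for illustration.

```stata
* Load the example dataset that ships with Stata (an assumption, not the guides' data)
sysuse auto, clear

* recode: group the five-point repair record into three categories in a new variable
recode rep78 (1 2 = 1) (3 = 2) (4 5 = 3), generate(rep3)

* generate and replace: build an indicator for cars priced above 6000 (arbitrary cutoff)
generate expensive = 0
replace expensive = 1 if price > 6000 & !missing(price)

* if qualifier: run a command on a selected group of observations only
summarize mpg if foreign == 1

* Logarithm: transform the right-skewed price variable
generate ln_price = ln(price)

* collapse: aggregate to one row per group, saved to a temporary file
preserve
collapse (mean) mean_price = price (sd) sd_price = price, by(foreign)
list
tempfile groupstats
save `groupstats'
restore

* merge: add the group-level statistics back onto the car-level data
merge m:1 foreign using `groupstats'
```

The collapse step is wrapped in preserve and restore because collapse replaces the dataset in memory. The tempfile line relies on a local macro, so the block should be run in one go as a do-file.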
Work with time in Stata, either for one unit (time series) or many (panel data).

Setting up data for time series
Set time variable, lags, leads, delta variable, plot data over time.
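
As a rough illustration of the time series setup, the sketch below builds a small yearly dataset from scratch. The variable names (year, sales) and the numbers are invented for the example, and the first difference D.sales is only one way of measuring change between periods.

```stata
* Create ten yearly observations with roughly exponential growth (invented data)
clear
set obs 10
generate year = 2000 + _n
generate sales = 100 * 1.1^_n

* Declare the time variable so that time series operators can be used
tsset year

* Lag (previous year), lead (next year) and first difference (change since last year)
generate sales_lag = L.sales
generate sales_lead = F.sales
generate sales_diff = D.sales
list year sales sales_lag sales_lead sales_diff

* Plot the series over time
tsline sales

* For panel data, xtset takes a unit identifier before the time variable,
* for example: xtset country year (hypothetical variable names)
```

The first observation has no lag or difference and the last has no lead, so those cells are missing.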
Descriptive statistics and simpler analyses
Get an overview of the data before proceeding to more advanced analysis.

Simple descriptive statistics
Use the commands codebook, summarize and tab to quickly find out the mean, median, min and max values (among other things) for a variable.

Mean values (averages) in different groups
Compare groups in a straightforward way by comparing mean values in different groups, using the commands sum and table.

t-test
Test differences between groups for statistical significance.

Correlation
Simple and very common measure to show the strength and direction of association between two variables.

Crosstabs
Relationships between two categorical variables shown with percentages.
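
The descriptive commands listed above can be tried in a few lines. This is a sketch that again assumes Stata's bundled auto dataset rather than the data used in the guides. Group means are shown with bysort and summarize here because the syntax of the table command differs between Stata versions.

```stata
* Load the example dataset that ships with Stata (an assumption, not the guides' data)
sysuse auto, clear

* Overview of a single variable
codebook mpg
summarize mpg, detail
tabulate rep78

* Mean values in different groups
bysort foreign: summarize mpg

* t-test: is the difference in mean mileage between the groups statistically significant?
ttest mpg, by(foreign)

* Correlation: strength and direction of the association between two variables
correlate mpg weight

* Crosstab of two categorical variables with column percentages
tabulate rep78 foreign, column
```

The detail option on summarize adds the median and other percentiles to the default output of mean, standard deviation, min and max.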