GGS-II data is available for download in the GGP User Space after registering and completing the application procedure. Users who already have access to one GGS-II dataset receive immediate access to the other GGS-II countries after submitting a data access request.
The data is available in different formats: .dta, .sav, and .xlsx.
The dataset of each country is processed in such a way that it is harmonized with the other GGS-II datasets to reduce the need for users to post-harmonize the data. As such, it is possible to append all country data for a cross-country comparison.
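As an illustration of appending country files, here is a minimal sketch using pandas. It assumes each country file has already been loaded into a DataFrame (for the .dta files, for example, with `pandas.read_stata`); the function name `append_countries` and the tiny example values are hypothetical and not part of the GGS-II distribution.

```python
import pandas as pd


def append_countries(frames):
    """Stack per-country DataFrames row-wise.

    Columns that exist in only some countries (for example,
    country-specific variables) are kept and left missing elsewhere.
    """
    return pd.concat(frames, axis=0, join="outer", ignore_index=True, sort=False)


# Hypothetical toy data standing in for two loaded country files.
country_a = pd.DataFrame({"country": [1, 1], "dem01": [1, 2]})
country_b = pd.DataFrame({"country": [2], "dem01": [3], "dem01_2401": [5]})

pooled = append_countries([country_a, country_b])
```

Because the harmonized variables share names across files, they line up in one column, while a country-specific column such as `dem01_2401` stays empty for the other countries.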
The datasets are prepared based on the baseline questionnaire 3.1. That means that in countries that fielded an earlier version of the questionnaire (e.g., Norway) where additional variables are still included, the variables are coded as country-specific. This ensures harmonization between countries.
The variables contain the labels and response options as in the baseline questionnaire. The full question is also stored within the variable.
The GGS-II documentation is available online on the GGP Colectica Portal: https://ggp.colectica.org/
It contains:
Any country-specific deviations are systematically coded using four-digit country codes. The variable “country” provides an overview of the country codes.
Country-specific values are added when the question follows the baseline questionnaire, but the answers are not, or only partly, compatible with the baseline response options. They consist of the country code plus a number, e.g., 2901.
A country-specific variable is introduced when the question differs from the baseline questionnaire or has been added to it. This kind of variable is identified by a suffix consisting of the country code plus a number, e.g., dem01_2401.
Download the list of country-specific questions here.
Download the list of country-specific response options here.
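To show how country-specific variables could be picked out of a dataset by name, here is a minimal Python sketch. It assumes, based only on the example dem01_2401, that such variables end in an underscore followed by exactly four digits; the helper name `split_country_specific` is hypothetical, and the downloadable lists remain the authoritative source.

```python
import re

# Assumption: a country-specific variable name ends in "_" plus four digits,
# as in the documented example dem01_2401.
_SUFFIX = re.compile(r"^(?P<base>.+)_(?P<code>\d{4})$")


def split_country_specific(name):
    """Return (base_name, four_digit_code), or (name, None) if no suffix is found."""
    match = _SUFFIX.match(name)
    if match is None:
        return name, None
    return match.group("base"), match.group("code")


columns = ["dem01", "dem01_2401"]
country_specific = [c for c in columns if split_country_specific(c)[1] is not None]
```

The same pattern cannot be applied safely to values: a four-digit number in a continuous variable (a year, for instance) is not necessarily a country-specific code.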
Missing values in the dataset are indicated by system-generated codes. When a value is missing due to specific reasons, it is marked as follows:
.a Don’t know
.b Refusal
.c Not applicable
.h Incomplete survey
. Filter
Certain variables feature unique response categories. The coding of these special response categories depends on whether the variable is continuous or categorical. In continuous variables, the special answer categories are also coded as system-missing values so that the variable remains continuous:
.d Never
.e Mainly work from home
.f Not at all
.g Not working or homemaker
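The two code lists above can be kept in a small lookup table when recoding. The sketch below is only an illustration: it assumes the codes are available as strings such as ".a", which depends on the software and options used to read the file, and the function name `classify` is hypothetical.

```python
# Labels copied from the lists above. A plain "." marks a filtered question.
MISSING_REASONS = {
    ".": "Filter",
    ".a": "Don't know",
    ".b": "Refusal",
    ".c": "Not applicable",
    ".h": "Incomplete survey",
}

# Special answer categories stored as missing codes in continuous variables.
SPECIAL_ANSWERS = {
    ".d": "Never",
    ".e": "Mainly work from home",
    ".f": "Not at all",
    ".g": "Not working or homemaker",
}


def classify(code):
    """Return (kind, label), where kind is "missing", "special", or "valid"."""
    if code in MISSING_REASONS:
        return "missing", MISSING_REASONS[code]
    if code in SPECIAL_ANSWERS:
        return "special", SPECIAL_ANSWERS[code]
    return "valid", None
```

Keeping the two groups apart matters in analysis: a code such as ".d Never" is a substantive answer that may need to be recoded to a number, whereas ".b Refusal" is a genuine non-response.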
Post-stratification weights and design weights are included in the datasets. The post-stratification weights are produced using Iterative Proportional Fitting based on the most recent and reliable information on population figures provided by the country teams on five items: age, gender, region, level of education, and marital status. This accounts for selectivity in response, making within-country and cross-country-comparative research more reliable. In some countries, more detailed information is available so country teams chose to produce additional weights themselves. This weight variable is called cnt_weight and can only be used for within-country analyses.
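For readers unfamiliar with Iterative Proportional Fitting (raking), the idea can be sketched in a few lines of Python. This is a simplified illustration, not the procedure used to produce the GGS-II weights: it rakes on two made-up margins instead of the five items listed above, and the function name `rake`, the category labels, and the target shares are all hypothetical.

```python
def rake(rows, targets, weights=None, max_iter=100, tol=1e-9):
    """Iterative Proportional Fitting on categorical margins.

    rows: list of dicts, one per respondent, e.g. {"gender": "f", "age": "young"}
    targets: {variable: {category: population share}}, shares summing to 1
    weights: optional starting weights (e.g. design weights), default 1.0 each
    Returns weights rescaled to sum to the number of respondents.
    """
    n = len(rows)
    w = list(weights) if weights is not None else [1.0] * n
    for _ in range(max_iter):
        max_change = 0.0
        for var, shares in targets.items():
            total = sum(w)
            # Current weighted total in each category of this variable.
            current = {cat: 0.0 for cat in shares}
            for row, wi in zip(rows, w):
                current[row[var]] += wi
            # Rescale so each category's weighted share equals its target.
            for i, row in enumerate(rows):
                factor = shares[row[var]] * total / current[row[var]]
                max_change = max(max_change, abs(factor - 1.0))
                w[i] *= factor
        if max_change < tol:
            break  # all margins matched within tolerance
    scale = n / sum(w)
    return [wi * scale for wi in w]


# Hypothetical sample of four respondents and made-up population shares.
sample = [
    {"gender": "f", "age": "young"},
    {"gender": "f", "age": "old"},
    {"gender": "m", "age": "young"},
    {"gender": "m", "age": "young"},
]
population = {
    "gender": {"f": 0.5, "m": 0.5},
    "age": {"young": 0.7, "old": 0.3},
}
ps_weights = rake(sample, population)
```

Each pass adjusts the weights to match one margin at a time; repeating the passes brings all margins into agreement simultaneously, provided every target category is represented in the sample.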
A break-off refers to when a respondent quits the survey before reaching the final question. Those cases are marked with the missing value “.h incomplete survey”. Respondents who quit the survey in the first two sections, DEM or LHI, are removed from the dataset.
Some variables are filtered, which is denoted by a dot missing value (“.” in Stata). This means that the respondent was not asked the respective question because of answers given previously. To find out how a variable is filtered, check the documentation of the data in the GGP Colectica Portal. Please note that the filtering may vary across countries due to updates to the survey instrument.
Download an overview of filter differences across countries here.
The GGS-II Baseline questionnaire builds on previous versions and has involved various groups, developments, and testing phases. The development is described here.
Download GGS-II Baseline Questionnaire here
Some questions were optional: