Dr. Insa Bechert, Dr. Rabia Karatoprak Ersen, and Nidanur Baştürk at the GESIS - Leibniz Institute for the Social Sciences are using Harmony in the initial phase of identifying similar questions administered across international survey programs. These items make up a Variable Database, which serves as an entry point to harmonized cross-national survey data on the five NextGenEU youth policy topics: Make it Green, Make it Digital, Make it Healthy, Make it Strong, and Make it Equal. The work is part of the EU-funded project Infra4NextGen.
Nidanur Baştürk, a research associate and PhD candidate, needed to identify similar questions across surveys. Together with Ms. Baştürk, we have been exploring ways to improve the Harmony R library. In the process, we identified limitations in how the current version handles large numbers of question items: although Harmony could process smaller questionnaires well, the large dataset used at GESIS caused the tool to crash. Harmony is therefore being modified to process questionnaire items in batches so that it can handle larger datasets.
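To illustrate the general idea of batching, here is a minimal Python sketch that splits a long list of question items into fixed-size chunks so that each chunk can be processed separately. It is not Harmony's actual implementation: the function name `batch_items` and the batch size are assumptions chosen for illustration only.

```python
from typing import Iterator, List, Sequence


def batch_items(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `batch_size` items.

    Hypothetical helper for illustration; not part of the Harmony library.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be a positive integer")
    for start in range(0, len(items), batch_size):
        # Slicing past the end is safe, so the last chunk may be shorter.
        yield list(items[start:start + batch_size])


# Example: five placeholder question items processed two at a time.
questions = ["Q1", "Q2", "Q3", "Q4", "Q5"]
for batch in batch_items(questions, batch_size=2):
    print(batch)
```

Processing one chunk at a time keeps the amount of text handled in a single step bounded, which is the usual motivation for batching when a full dataset is too large to process in one go.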
Ms. Baştürk also prepared an R notebook as a walkthrough of how the GESIS team would find similar questions across surveys, and she wrote code to generate a crosswalk table from Harmony’s similarity matrix, as well as a collection of string cleaning rules to remove question introductions. All these contributions are making the tool more robust.
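As a rough illustration of what a crosswalk table derived from a similarity matrix can look like, the Python sketch below lists every pair of questions from two surveys whose similarity score meets a cut-off. This is not Ms. Baştürk's R code: the function name `build_crosswalk`, the example questions, the scores, and the threshold of 0.7 are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def build_crosswalk(
    row_labels: Sequence[str],
    col_labels: Sequence[str],
    similarity: Sequence[Sequence[float]],
    threshold: float = 0.7,  # assumed cut-off, chosen only for illustration
) -> List[Tuple[str, str, float]]:
    """Return (row question, column question, score) for pairs at or above the threshold.

    Hypothetical helper; `similarity[i][j]` is the score between
    `row_labels[i]` and `col_labels[j]`.
    """
    crosswalk = []
    for i, row_label in enumerate(row_labels):
        for j, col_label in enumerate(col_labels):
            score = similarity[i][j]
            if score >= threshold:
                crosswalk.append((row_label, col_label, score))
    # Strongest matches first, so the most likely equivalents are reviewed first.
    crosswalk.sort(key=lambda pair: pair[2], reverse=True)
    return crosswalk


# Made-up example: two questions from survey A, three from survey B.
survey_a = ["How worried are you about climate change?", "How often do you use the internet?"]
survey_b = ["Are you concerned about climate change?", "How old are you?", "How frequently do you go online?"]
scores = [
    [0.91, 0.05, 0.12],
    [0.10, 0.08, 0.84],
]
for a, b, score in build_crosswalk(survey_a, survey_b, scores):
    print(f"{score:.2f}  {a}  <->  {b}")
```

A table of this shape still needs human review: a high similarity score suggests that two questions may be comparable, but it does not establish that they measure the same concept.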
Ms. Baştürk wrote to us:
I’ve been testing the Harmony website tool, and the results are really promising—thank you so much for this incredible initiative!