Research data is often siloed and difficult to find. Once found, it is even harder to compare. Harmony provides two powerful tools to bridge this gap:
Harmony Meta - Our new discovery engine that helps you search through vast catalogues of longitudinal studies to find the variables you need.
Harmony - Our flagship tool that uses Large Language Models to automatically match items from different questionnaires, even across different languages.
Harmony



pip install harmonydata
import harmony
harmony.download_models()
instruments = harmony.example_instruments["CES_D English"],
harmony.example_instruments["GAD-7 Portuguese"]
questions, similarity, query_similarity, _ = harmony.match_instruments
(instruments)
# How to load a PDF, Excel or Word into an instrument
harmony.load_instruments_from_local_file("gad-7.pdf")
# install.packages("devtools") # If you don't have devtools installed already or CRAN is down.
# library(devtools)
# devtools::install_github("harmonydata/harmony_r")
install.packages("harmonydata")
library(harmonydata)
instrument = load_instruments_from_file(path = "examples/GAD-7.pdf")
instrument_2 = load_instruments_from_file("https://medfam.umontreal.ca/wp-content/uploads/sites/16/GAD-7-fran%C3%A7ais.pdf")
instruments = append(instrument, instrument_2)
match = match_instruments(instruments)
names(match)
Really useful! Would have been a great tool and saved me a lot of time when I was trying to externally validate my risk prediction model in two cohorts.
Our tool, Harmony, allows researchers to upload a set of mental health questionnaires in PDF or Excel format, such as the GAD-7 anxiety questionnaire. It identifies which questions among questionnaires are identical, similar in meaning, or antonyms of each other, and generates a network graph. This allows researchers to harmonise datasets.
Uniquely, Harmony relies on Transformer neural network architectures and is not dependent on a dictionary approach or word list. This allows for multilingual data harmonisation (English and Portuguese are our languages of focus), and Harmony is able to correctly map the GAD-7 used in the UK to the GAD-7 used in Brazil, despite the Brazilian questionnaire being in Brazilian Portuguese.
Using Harmony, our team was able to harmonise multilingual datasets and conduct groundbreaking research into social isolation and anxiety with NLP supplying a quantitative measure of the equivalence of the different mental health datasets.

HARMONY