Harmony’s research has been published in BMC Psychiatry!

BMC Psychiatry has published our paper validating Harmony on real-world data

We are pleased to announce the publication of a paper validating Harmony on real-world data: "Using natural language processing to facilitate the harmonisation of mental health questionnaires: a validation study using real-world data", authored by Eoin McElroy, Thomas Wood, Raymond Bond, Maurice Mulvenna, Mark Shevlin, George B. Ploubidis, Mauricio Scopel Hoffmann and Bettina Moltrecht, and published in BMC Psychiatry.

Summary of the Harmony real-world validation study

Our study aimed to evaluate the effectiveness of Natural Language Processing (NLP) in harmonising mental health questionnaires for cross-study research.

We compared the semantic similarity of questionnaire items, as scored by NLP (the Sentence-BERT transformer model), with the actual correlations between those items in a sample population. We found a moderate relationship (r = .48, p < .001) between the two measures. This suggests that NLP can identify similar questions across different questionnaires.
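To make the comparison concrete, here is a minimal sketch of the idea in plain Python. It is not the code used in the paper. The item names, the three-dimensional "embeddings" and the observed correlations are all made-up toy values, standing in for real Sentence-BERT embeddings and real survey data. The sketch scores each pair of items by cosine similarity, then checks how well those scores track the pairs' observed correlations using Pearson's r.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy vectors, standing in for sentence embeddings of item text.
item_embeddings = {
    "feeling_anxious": [0.8, 0.3, 0.1],
    "feeling_nervous": [0.9, 0.2, 0.1],
    "poor_appetite": [0.2, 0.3, 0.9],
    "trouble_sleeping": [0.1, 0.9, 0.2],
}

# Hypothetical toy correlations between item responses in a sample.
observed_correlation = {
    ("feeling_anxious", "feeling_nervous"): 0.70,
    ("feeling_anxious", "poor_appetite"): 0.25,
    ("feeling_anxious", "trouble_sleeping"): 0.40,
    ("feeling_nervous", "poor_appetite"): 0.20,
    ("feeling_nervous", "trouble_sleeping"): 0.35,
    ("poor_appetite", "trouble_sleeping"): 0.30,
}

# Every unordered pair of items, in a fixed (alphabetical) order.
pairs = list(combinations(sorted(item_embeddings), 2))

similarities = [
    cosine_similarity(item_embeddings[a], item_embeddings[b]) for a, b in pairs
]
correlations = [observed_correlation[pair] for pair in pairs]

# How closely do the similarity scores track the observed correlations?
r = pearson_r(similarities, correlations)
print(f"Pearson r between similarity and observed correlation: {r:.2f}")
```

With real data, the toy vectors would be replaced by embeddings of the questionnaire item text, and the toy correlations by correlations computed from participants' responses.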

While the NLP model showed promise in uncovering underlying patterns in the data, it required manual intervention to determine which relationships were truly relevant.

Our study showed that NLP can be a useful tool to match similar questions from different questionnaires, but it’s not perfect and should be used with caution.

Citing the BMC validation paper

A BibTeX entry for LaTeX users is:

@article{mcelroy2024using,
  title={Using natural language processing to facilitate the harmonisation of mental health questionnaires: a validation study using real-world data},
  author={McElroy, Eoin and Wood, Thomas and Bond, Raymond and Mulvenna, Maurice and Shevlin, Mark and Ploubidis, George B and Hoffmann, Mauricio Scopel and Moltrecht, Bettina},
  journal={BMC Psychiatry},
  volume={24},
  number={1},
  pages={530},
  year={2024},
  publisher={Springer}
}
