Easy issues to get started on in Harmony: Python and R

We have added a few more issues to the issue trackers.

If you are new and would like to make a pull request in either the Python or R library, feel free to pick these up. They should be quite small.

Easy issues in Python library

We would like to expose the “within instrument matches” and “negation” switches in the Python library and then through the API. Ultimately, this will allow the R library to expose this functionality too.

https://github.com/harmonydata/harmony/issues/60: Allow user to turn on/off the “within instrument matches” behaviour
https://github.com/harmonydata/harmony/issues/59: Allow user to turn on/off the negation behaviour

Moderate issue in Python

https://github.com/harmonydata/harmony/issues/56: Users are having real problems processing large lists of items, so the items need to be sent to the LLM in batches. The tricky part is that the batch size must be configurable with a sensible default and exposed through the API, without allowing a user to overload the API.
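
As a starting point, the batching idea could be sketched roughly as follows. This is only an illustration and not Harmony's actual code: the names resolve_batch_size, iter_batches, embed_in_batches and embed_batch, and the values 50 and 200, are all hypothetical. The default applies when the caller gives no batch size, and the cap stops a caller from requesting an oversized batch.

```python
from typing import Callable, Iterator, List, Optional, Sequence

# Hypothetical values for illustration only; not real Harmony settings.
DEFAULT_BATCH_SIZE = 50
MAX_BATCH_SIZE = 200


def resolve_batch_size(requested: Optional[int] = None) -> int:
    """Use the default if no size is given, and cap any request at the maximum."""
    if requested is None:
        return DEFAULT_BATCH_SIZE
    if requested < 1:
        raise ValueError("batch_size must be at least 1")
    return min(requested, MAX_BATCH_SIZE)


def iter_batches(items: Sequence[str], batch_size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def embed_in_batches(
    items: Sequence[str],
    embed_batch: Callable[[Sequence[str]], List[List[float]]],
    batch_size: Optional[int] = None,
) -> List[List[float]]:
    """Call embed_batch (a stand-in for the model call) once per batch, keeping item order."""
    size = resolve_batch_size(batch_size)
    vectors: List[List[float]] = []
    for batch in iter_batches(items, size):
        vectors.extend(embed_batch(batch))
    return vectors
```

Capping the size on the server side, rather than rejecting the request, keeps the API forgiving for clients while still bounding the work done per model call. Rejecting oversized requests with an error would be an equally reasonable choice.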

Easy issues for R library

https://github.com/harmonydata/harmony_r/issues/4: Can we have a built-in function to turn the matching scores into a crosswalk table (e.g. a table like the website produces) and output it as a dataframe?
https://github.com/harmonydata/harmony_r/issues/5: Expose an easy and idiomatic R-like way to create instruments in R
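
For the crosswalk issue above, the underlying logic is language-agnostic, so here is a rough sketch in Python that could be ported to R. It assumes the matching scores arrive as a matrix with one row per question in the first instrument and one column per question in the second. The function name crosswalk, the threshold of 0.7 and the row layout are hypothetical, and the table the website produces may be structured differently.

```python
from typing import List, Sequence, Tuple


def crosswalk(
    questions_a: Sequence[str],
    questions_b: Sequence[str],
    scores: Sequence[Sequence[float]],
    threshold: float = 0.7,  # hypothetical cut-off, not a Harmony default
) -> List[Tuple[str, str, float]]:
    """Return (question_a, question_b, score) rows for every pair at or above the threshold."""
    rows: List[Tuple[str, str, float]] = []
    for i, question_a in enumerate(questions_a):
        for j, question_b in enumerate(questions_b):
            score = scores[i][j]
            if score >= threshold:
                rows.append((question_a, question_b, score))
    # Strongest matches first, so the table reads from most to least similar.
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows
```

In R, the same rows would map naturally onto a data frame with three columns.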

How else can I contribute?

First of all, have a look at our Large Language Model training and fine tuning challenge! This is an online competition to train a Large Language Model for mental health data and improve Harmony. You don’t need any prior experience of training a Large Language Model. We provide the data. First prize for the most accurate LLM is £500 in vouchers!

Secondly, keep an eye out for our next hackathon. We ran one in 2024 and we are planning more for the future. Find out how to find AI hackathons here.

Please also take a look at the issue trackers on our repositories. There are issues tagged as good first issue which you can pick up:

Python - the main core library and the Python package, which is on PyPI
R - the R port is on CRAN and is slightly less mature than the Python library, so we would really appreciate it if you could give the R package some TLC.
API - the Python API is built with Pydantic and FastAPI and runs on an on-prem server, which enables the web app to work
Web front end - we welcome feedback and contributions on front end and UX issues
If you’re doing research and found Harmony useful, please cite us!
If you’re a researcher trying to use the tool and you encounter a problem, a bug, or a feature which you would like us to implement, please raise an issue on GitHub or message us on Discord.

Harmony at MQ and DataMind Data Science Workshop

On 2 May 2025, Dr Eoin McElroy demonstrated Harmony at the MQ and Datamind Data Science workshop at Deutsche Bank. Eoin’s presentation focused on “Maximising the use of existing survey data: facilitating cross-study research using retrospective harmonization.” The workshop brought together researchers interested in applying novel harmonisation techniques to existing datasets. Eoin explained traditional harmonisation processes and presented a user-friendly guide to the Harmony tool, demonstrating how natural language processing can streamline the harmonisation process.

'Send to Harmony' Chrome plugin

[Beta mode: we are currently testing this extension] We have developed a browser extension for Harmony called “Send to Harmony”, which lets you send selected text to the Harmony Data Harmonization platform (https://harmonydata.ac.uk/) for analysis with a right-click. For PDFs, use the popup to paste your selected text. The extension adds a right-click (context menu) item that brings text into your harmonisations, making it easier to compare and analyse different measurement scales across research studies.