We have a number of exciting updates to Harmony including:
Over a period of a few months ending in January 2025 we made a dataset public and available for people all over the world to download and fine tune their own sentence transformer model. We ran this as a challenge on DOXA AI.
The motivation for this challenge was that users were reporting that Harmony often lacked psychology specific domain knowledge and tends to group items together if they appear superficially similar, e.g. items to do with the topic of “sleep”, and a mental health specific model with needed which would better differentiate topics within the mental health domain. We found that Harmony and other open weights and proprietary LLMs had varying performances on different datasets and we thought it would be nice to try making our own LLM - or rather, asking our community to do it!
We received a fantastic number of submissions to the matching challenge, including a flurry of changes to our model leaderboard in the last few hours before the competition closed the winner was announced as José Inés Martínez Berard and the second place was awarded to Rafi Ahmed Riyaz Ahmed Patel. Both received prizes. We also appreciate the many runners up who submitted models to the challenge.
You can now try José’s winning model on Harmony on the web. Just select harmonydata/mental_health_harmonisation_1
from the model dropdown.
If you are coding in Python, you can run this command to set an environment variable before you import Harmony
export HARMONY_SENTENCE_TRANSFORMER_PATH=harmonydata/mental_health_harmonisation_1
then you can use Harmony as usual in the Python terminal
import harmony
instruments = [harmony.example_instruments["CES_D English"],
harmony.example_instruments["GAD-7 Portuguese"]]
match_response = harmony.match_instruments(instruments)
similarity = match_response.similarity_with_polarity
for cluster in match_response.clusters:
print (f"Cluster #{cluster.cluster_id}: {cluster.text_description}")
for question in cluster.items:
print ("\t", question.question_text)
The model is also available on HuggingFace Hub under the model ID harmonydata/mental_health_harmonisation_1. The model converts English texts to 768 dimensions and has been trained by José to better differentiate mental health specific items.
Here are the updates to the R library from Omar Hassoun. The R library is on CRAN at: https://cran.r-project.org/web/packages/harmonydata/index.html
Check out the example R notebook in Google Colab: https://colab.research.google.com/github/harmonydata/experiments/blob/main/Harmony_R_example.ipynb
The new library:
We would be grateful if you could give it a try and let us know if anything’s unclear
There are a few other issues that we are working on, such as allowing users to harmonise response options and choose their own topics. We also welcome your contributions in these areas - check out the repo and issue board for the Harmony R library!
If you missed the DOXA competition, don’t worry… there’s another one! We’re running a competition with £1000 in vouchers as first prize, where the challenge is to improve Harmony’s questionnaire parsing. Find out more here.
If you’d like to meet us in person, Harmony is an official sponsor of AI UK 2025, run by the Alan Turing Institute, and we will have a stand at the event from 17 to 18 March at the QEII Conference Centre in London.