We are proud to have launched our first competition on Kaggle!
The primary challenge of this competition is to develop an AI tool or method that can accurately extract questionnaire questions from documents, primarily PDFs.
This competition gives participants an opportunity to contribute to natural language processing, document analysis, and open-source software for social science while developing solutions with real-world applications. We encourage participants to think creatively, use the available resources, and push the boundaries of current technologies.
Requirements: Python 3.10 or greater
Create an account on Kaggle.
Install the Kaggle API client on your computer:
pip install kaggle
On the Kaggle website, download your kaggle.json file and put it in your home folder under .kaggle/kaggle.json.
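Before downloading the data, it can help to confirm that the credentials file is where the Kaggle client expects it. The following is a minimal sketch, not part of the competition code: the helper name check_kaggle_credentials is hypothetical, and it assumes the file is a JSON object with "username" and "key" fields and that, on Linux and macOS, it should not be readable by other users.

```python
import json
import os
import stat
from pathlib import Path


def check_kaggle_credentials(path: Path) -> list[str]:
    """Return a list of problems found with a kaggle.json file (empty if none).

    Hypothetical helper for illustration; it only inspects the local file
    and never contacts Kaggle.
    """
    if not path.is_file():
        return [f"{path} does not exist"]
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        return [f"{path} is not valid JSON"]
    if not isinstance(data, dict):
        return [f"{path} does not contain a JSON object"]

    problems = []
    # Assumption: the token file carries exactly these two fields.
    for field in ("username", "key"):
        if not data.get(field):
            problems.append(f"missing field: {field}")
    # On POSIX systems, flag files that group or other users can access.
    if os.name == "posix" and path.stat().st_mode & (stat.S_IRWXG | stat.S_IRWXO):
        problems.append("file is accessible by other users; consider chmod 600")
    return problems


if __name__ == "__main__":
    for problem in check_kaggle_credentials(Path.home() / ".kaggle" / "kaggle.json"):
        print(problem)
```

If the script prints nothing, the file was found and looks well formed.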
Download and unzip the competition data:
kaggle competitions download -c harmony-pdf-and-word-questionnaires-extract
unzip harmony-pdf-and-word-questionnaires-extract.zip
To generate predictions for the training data and write to train_predictions.csv:
python create_sample_submission.py train
To evaluate the train predictions:
python evaluate_train_results.py
To modify the prediction logic or inject your own model, edit the function dummy_extract_questions.
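As a starting point for your own logic, here is a minimal sketch of a rule-based extractor. It is an illustration only: the name extract_questions_heuristic is hypothetical, and the signature of dummy_extract_questions is not shown here, so this sketch assumes the document text is already available as a plain string and that a list of question strings is wanted. Adapt the input and output to whatever the starter script actually passes and expects.

```python
import re

# Leading numbering such as "1.", "2)", "(a)" or "Q3:" at the start of a line.
_NUMBERING = re.compile(
    r"^\s*(?:\(?[0-9]{1,3}[.)]|\(?[a-zA-Z][.)]|Q\s*[0-9]+\s*[.:)]?)\s*"
)


def extract_questions_heuristic(text: str) -> list[str]:
    """Return lines that look like questions, with leading numbering removed.

    Hypothetical baseline: a line counts as a question if it ends with "?"
    and has at least three words after the numbering is stripped.
    """
    questions = []
    for line in text.splitlines():
        candidate = _NUMBERING.sub("", line).strip()
        if candidate.endswith("?") and len(candidate.split()) >= 3:
            questions.append(candidate)
    return questions


if __name__ == "__main__":
    sample = (
        "Wellbeing survey\n"
        "1. How often do you feel nervous?\n"
        "Please tick one box.\n"
        "(a) Do you sleep well?\n"
        "Q3: Are you happy today?\n"
    )
    for question in extract_questions_heuristic(sample):
        print(question)
```

A rule like this will miss questionnaire items phrased as statements (for example "I feel nervous") and anything split across lines by PDF extraction, which is where a trained model is likely to do better.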
To generate predictions for the test data and write to submission.csv:
python create_sample_submission.py test
To submit your predictions to the competition:
kaggle competitions submit -c harmony-pdf-and-word-questionnaires-extract -f submission.csv -m "Message"