We are proud to have launched our first competition on Kaggle!
The primary challenge of this competition is to develop a tool or method that can accurately extract questionnaire questions from documents, primarily PDFs.
This competition offers a unique opportunity for participants to contribute to the field of natural language processing and document analysis while developing solutions that have real-world applications. We encourage participants to think creatively, leverage available resources, and push the boundaries of current technologies.
Requirements: Python 3.10 or greater
Create an account on Kaggle.
Install the Kaggle command-line client on your computer:
pip install kaggle
On the Kaggle website, download your kaggle.json file and put it in your home folder under .kaggle/kaggle.json.
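If the Kaggle client later complains about authentication, a quick check of the credentials file can help. The following is a minimal sketch, not part of the competition code: the helper name check_kaggle_credentials is hypothetical, and it assumes the file is a JSON object with "username" and "key" fields, which is the usual layout of a downloaded kaggle.json.

```python
import json
import os
import stat
from pathlib import Path


def check_kaggle_credentials(path: Path) -> list[str]:
    """Return a list of problems found with the credentials file (empty if none).

    Hypothetical helper for illustration; assumes the file holds a JSON
    object with "username" and "key" fields.
    """
    if not path.is_file():
        return [f"{path} does not exist"]
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        return [f"{path} is not valid JSON"]
    if not isinstance(data, dict):
        return [f"{path} does not contain a JSON object"]

    problems = []
    for field in ("username", "key"):
        if field not in data:
            problems.append(f"missing field: {field}")

    # On POSIX systems, flag a file that group members or other users can read.
    if os.name == "posix":
        mode = path.stat().st_mode
        if mode & (stat.S_IRGRP | stat.S_IROTH):
            problems.append("file is readable by other users; consider chmod 600")
    return problems
```

Calling check_kaggle_credentials(Path.home() / ".kaggle" / "kaggle.json") and printing the returned list should show an empty list when the file is in place and private.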
Download and unzip the competition data:
kaggle competitions download -c harmony-pdf-and-word-questionnaires-extract
unzip harmony-pdf-and-word-questionnaires-extract.zip
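On systems without an unzip command (for example a default Windows install), the archive can be extracted from Python instead. This is a small sketch using only the standard library; the function name unzip_archive is an illustrative choice, not something provided by the competition.

```python
import zipfile
from pathlib import Path


def unzip_archive(zip_path: Path, dest_dir: Path) -> list[str]:
    """Extract every member of zip_path into dest_dir and return the member names."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        return archive.namelist()
```

For the competition data this would be called as unzip_archive(Path("harmony-pdf-and-word-questionnaires-extract.zip"), Path(".")), assuming the download step above saved the archive in the current folder.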
To generate predictions for the training data and write to train_predictions.csv:
python create_sample_submission.py train
To evaluate the train predictions:
python evaluate_train_results.py
To modify the prediction logic or inject your own model, you can edit the function dummy_extract_questions.
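As a starting point for your own logic, here is a rough rule-based sketch. It is not the competition baseline, and the signature of dummy_extract_questions is not shown here, so this example assumes a function that takes the plain text of a document and returns a list of question strings. The name extract_questions_heuristic and the sample text in the usage note are made up for illustration.

```python
import re

# Leading item numbering such as "1.", "2)", "(3)" or "Q4:", followed by whitespace.
_NUMBERING = re.compile(r"^(?:\(?\d+[.)]|Q\d+[.:)])\s+", re.IGNORECASE)


def extract_questions_heuristic(text: str) -> list[str]:
    """Return lines that look like questionnaire items.

    Assumption for illustration: a line counts as an item if it starts with
    item numbering or ends with a question mark. Numbering is stripped.
    """
    questions = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        had_number = _NUMBERING.match(stripped) is not None
        candidate = _NUMBERING.sub("", stripped, count=1).strip()
        if candidate and (had_number or candidate.endswith("?")):
            questions.append(candidate)
    return questions
```

Given a text containing the lines "Instructions: tick one box.", "1. I often feel anxious" and "Do you sleep well?", the function keeps the second and third lines and drops the first. A rule like this will miss unnumbered statement items and pick up numbered headings, which is where a trained model is likely to do better.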
To generate predictions for the test data and write to submission.csv:
python create_sample_submission.py test
To submit your predictions to the competition:
kaggle competitions submit -c harmony-pdf-and-word-questionnaires-extract -f submission.csv -m "Message"