We are proud to have launched our first competition on Kaggle!
The primary challenge of this competition is to develop an AI tool or method that can accurately extract questionnaire questions from documents, primarily PDFs.
This competition gives participants an opportunity to contribute to natural language processing, document analysis, and open source software for social science while developing solutions that have real-world applications. We encourage participants to think creatively, leverage available resources, and push the boundaries of current technologies.
Requirements: Python 3.10 or greater
Create an account on Kaggle.
Install the Kaggle command-line tool on your computer:
pip install kaggle
On the Kaggle website, download your kaggle.json file and put it in your home folder under .kaggle/kaggle.json.
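Before downloading any data, it can help to confirm that the credentials file is where the Kaggle tool expects it. The following is a minimal sketch, not part of the competition code: the function name check_kaggle_credentials is hypothetical, and the permission check reflects the fact that the Kaggle tool warns when the key file is readable by other users on the system.

```python
import os
import stat
from pathlib import Path


def check_kaggle_credentials(path=None):
    """Return a list of problems found with the Kaggle credentials file.

    Hypothetical helper for illustration; an empty list means no problem was found.
    """
    # Default location described above: <home>/.kaggle/kaggle.json
    path = Path(path) if path is not None else Path.home() / ".kaggle" / "kaggle.json"
    problems = []
    if not path.is_file():
        problems.append(f"missing file: {path}")
        return problems
    # On POSIX systems, flag a key file that group or other users can read.
    if os.name == "posix":
        mode = stat.S_IMODE(path.stat().st_mode)
        if mode & (stat.S_IRGRP | stat.S_IROTH):
            problems.append(f"{path} is readable by other users; consider chmod 600")
    return problems


if __name__ == "__main__":
    for problem in check_kaggle_credentials():
        print(problem)
```

If the script prints nothing, the file exists and, on Linux or macOS, is readable only by you.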
Download and unzip the competition data:
kaggle competitions download -c harmony-pdf-and-word-questionnaires-extract
unzip harmony-pdf-and-word-questionnaires-extract.zip
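If the unzip command is not available on your system (for example on Windows), the archive can also be extracted with the Python standard library. This is a small sketch under that assumption; the function name extract_competition_data is hypothetical, and the archive name is taken from the download command above.

```python
import zipfile
from pathlib import Path


def extract_competition_data(zip_path, dest_dir="."):
    """Extract every file in the archive into dest_dir and return the member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return archive.namelist()


if __name__ == "__main__":
    archive_name = "harmony-pdf-and-word-questionnaires-extract.zip"
    # Only attempt extraction if the download step has already been run.
    if Path(archive_name).exists():
        for name in extract_competition_data(archive_name):
            print(name)
```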
To generate predictions for the training data and write them to train_predictions.csv:
python create_sample_submission.py train
To evaluate the train predictions:
python evaluate_train_results.py
To modify the prediction logic or inject your own model, you can edit the function dummy_extract_questions.
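As a starting point for your own logic, here is a rough heuristic sketch. It is not the competition baseline, and the signature of dummy_extract_questions is not shown here, so the helper below (extract_questions_from_text, a hypothetical name) simply assumes the document text is already available as a plain string and returns a list of question strings; adapt the inputs and outputs to whatever the real function expects. It keeps lines that end in a question mark or that start with a list marker such as "1." or "a)".

```python
import re

# Leading list markers such as "1.", "2)", "a)", "-", "•" or "*"
_LIST_MARKER = re.compile(r"^\s*(?:\d+|[a-zA-Z])[.)]\s+|^\s*[-•*]\s+")


def extract_questions_from_text(text):
    """Return lines that look like questionnaire items (hypothetical helper)."""
    questions = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        had_marker = bool(_LIST_MARKER.match(line))
        # Strip the list marker so only the item text is kept.
        cleaned = _LIST_MARKER.sub("", line, count=1).strip()
        if cleaned and (cleaned.endswith("?") or had_marker):
            questions.append(cleaned)
    return questions


if __name__ == "__main__":
    sample = "Wellbeing survey\n1. I feel rested in the morning\nHow often do you exercise?"
    for question in extract_questions_from_text(sample):
        print(question)
```

A rule-based pass like this will miss unnumbered statements and pick up numbered instructions, so treat it as a baseline to compare against a trained model using the evaluation script.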
To generate predictions for the test data and write them to submission.csv:
python create_sample_submission.py test
To submit submission.csv to the competition:
kaggle competitions submit -c harmony-pdf-and-word-questionnaires-extract -f submission.csv -m "Message"
Train your own Large Language Model to parse PDFs and win up to £1000 in vouchers!
Join a competition to train a machine learning model to improve Harmony’s PDF parsing. You don’t need to have trained a machine learning model before.
Register on DOXA AI
Enter the competition on DOXA AI by fine-tuning your own model and improve Harmony!
Join our Discord
Join the Harmony Discord server. Check out the 🏅「matching-challenge」 channel!
Harmony at GenAI and LLMs night at Google London on 10 December 2024
Above: video of the AICamp meetup in London on 10 December 2024. Harmony starts at 40:00. The first talk is by Connor Leahy of Conjecture.
We presented the AI tool Harmony at the GenAI and LLMs night at Google London on 10 December 2024, organised by AI Camp at the Google Cloud Startup Hub. AI Camp and Google hosted two deep-dive tech talks on AI, GenAI, LLMs and machine learning, with food and drink and networking with speakers and fellow developers.