Harmony tends to perform better if you upload a file with item numbers.
If there are no question numbers in the instrument, it’s very hard for Harmony to distinguish question text from other content, such as the copyright information. Click here to see an example PDF with question numbers included.
Also, if your PDF is a scanned document, please see if you can find a fully digitised (OCR’ed) version of the document.
We suggest finding a file with question numbers or better-quality content, or trying a different file format such as Word, Excel or CSV. We have guidance on formatting your files for Harmony.
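To illustrate why item numbers help, here is a minimal sketch in Python. It is not Harmony's actual parser. The pattern `ITEM_PATTERN`, the function names, the sample questionnaire text and the CSV column names `question_no` and `question_text` are all assumptions made for this example; please check the formatting guidance for the layout Harmony expects. The sketch shows that when each question starts with a number, even a simple rule can separate questions from titles, copyright lines and page footers, and the result can then be saved as CSV.

```python
import csv
import io
import re

# Hypothetical pattern: a line starting with an item number such as "1.", "2)" or "Q3:".
ITEM_PATTERN = re.compile(r"^\s*(?:Q\s*)?(\d+)\s*[.):]\s+(\S.*)$", re.IGNORECASE)


def extract_numbered_items(text):
    """Return (number, question_text) pairs for lines that start with an item number."""
    items = []
    for line in text.splitlines():
        match = ITEM_PATTERN.match(line)
        if match:
            items.append((match.group(1), match.group(2).strip()))
    return items


def items_to_csv(items):
    """Serialise the items as CSV text with a header row (column names are illustrative)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["question_no", "question_text"])
    writer.writerows(items)
    return buffer.getvalue()


# Made-up text, standing in for what might be extracted from a PDF.
sample = """Example Anxiety Questionnaire
Copyright 2020 Example Publisher. All rights reserved.
1. Feeling nervous, anxious or on edge
2) Not being able to stop or control worrying
Q3: Trouble relaxing
Page 1 of 1
"""

items = extract_numbered_items(sample)
csv_text = items_to_csv(items)
print(csv_text)
```

In this sketch the title, copyright line and page footer are skipped only because they do not start with a number. Without item numbers there is no such simple signal, which is why unnumbered instruments are harder to parse.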
Harmony supports:
Finally, feel free to raise an issue to let us know that your PDF isn’t being parsed. Please also share the PDF in question. Harmony is an open source tool for social sciences research.
Train your own Large Language Model to parse PDFs and win up to £1000 in vouchers!
Join a competition to train a machine learning model to improve Harmony’s PDF parsing. You don’t need to have trained a machine learning model before.
Register on DOXA AI: enter the competition on DOXA AI by fine-tuning your own model and improve Harmony!
Join our Discord: join the Harmony Discord server and check out the 🏅「matching-challenge」 channel!
Harmony at GenAI and LLMs night at Google London on 10 December 2024
Above: video of the AICamp meetup in London on 10 December 2024. Harmony starts at 40:00; the first talk is by Connor Leahy of Conjecture.
We presented the AI tool Harmony at the GenAI and LLMs night at Google London on 10 December 2024, organised by AI Camp at Google Cloud Startup Hub. AI Camp and Google hosted two deep-dive tech talks on AI, GenAI, LLMs and machine learning, with food and drink and networking with speakers and fellow developers.