The Australian Data Archive (ADA) is a national service for the collection and preservation of digital research data, similar to the UK Data Archive (UKDA).
The ADA provides data access through the ADA Dataverse. The collection includes surveys on housing conditions in Australian states, political views across the country over time, employment, and health, along with other datasets that the ADA has collected over the years (such as the Australian Election Study).
In 2023, the ADA embarked on a project to harmonise its large collection of survey questions and sought a tool that could identify and group similar items across different studies. Researchers at the ADA came across Harmony, a data harmonisation tool powered by natural language processing (NLP), and recognised its potential to streamline this process.
The ADA faces several challenges in managing its extensive questionnaire data.
The ADA may integrate Harmony into its processes, using its NLP capabilities to address these challenges and expedite questionnaire harmonisation.
Harmony has the potential to bring significant benefits to the ADA’s data harmonisation processes.
The Australian Data Archive aims to make its data management tasks more efficient. Harmony’s capacity to automatically compare, group, and categorise questionnaire items can help streamline the ADA’s harmonisation processes. This approach could reduce operational effort while also improving data quality and making datasets easier to use together. As the ADA expands its repository of social science research data, Harmony has the potential to play an important role in preserving the integrity and accessibility of this valuable resource.
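To make the idea of comparing and grouping questionnaire items more concrete, the following is a minimal sketch in Python. It is not Harmony's implementation or API: Harmony relies on NLP models, whereas this sketch swaps in a simple bag-of-words cosine similarity so that it runs with the standard library alone. The item labels, question texts, the `group_items` helper, and the 0.5 threshold are all hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a question and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def group_items(items, threshold=0.5):
    """Greedily add each item to the first group whose founding item is
    at least `threshold` similar; otherwise start a new group."""
    groups = []
    for label, text in items:
        vector = Counter(tokenize(text))
        for group in groups:
            if cosine_similarity(vector, group["vector"]) >= threshold:
                group["members"].append(label)
                break
        else:
            groups.append({"vector": vector, "members": [label]})
    return [group["members"] for group in groups]


# Hypothetical survey items, invented for this example (not ADA data).
items = [
    ("study_a_q1", "How satisfied are you with your current housing?"),
    ("study_b_q4", "How satisfied are you with your housing situation?"),
    ("study_a_q2", "Which party did you vote for in the last federal election?"),
    ("study_b_q7", "Which party did you vote for at the most recent election?"),
]

groups = group_items(items)
for members in groups:
    print(members)
```

Word overlap only catches items that share vocabulary. A tool based on NLP models would also be expected to match questions that are phrased differently but mean the same thing, which is the harder part of harmonisation across studies.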