DDS-AI: Dental Data Sharing for Artificial Intelligence
Sharing Dental Datasets Globally
Dental data sharing is often hindered by concerns over data security, vendor restrictions, and high infrastructure costs. To address this challenge, we facilitate the sharing of de-identified dental datasets among the global community through a streamlined process.
Promoting Diversity in Dental Datasets
Existing dental datasets often lack diversity, with limited representation from different demographics, clinical practices, and imaging equipment. To address this issue, we create diverse dental datasets that capture a range of demographics, clinical practices, and imaging equipment used worldwide.
Enabling Clinically Relevant AI in Dentistry
Public dental datasets often provide simplified annotations that do not reflect the complexity of clinical decisions. To address this challenge, we work with multidisciplinary teams to curate dental datasets with relevant clinical information and detailed, high-quality annotations. Our multi-institutional datasets enable the development of AI models that can be validated across diverse clinical settings, making them more clinically useful.
The topic group Dentistry (TG-Dentistry) focuses on enabling, facilitating, and implementing artificial intelligence (AI) in the oral health domain. Its purpose is to foster AI tools and applications of the highest quality in dentistry, supporting patient care, research, and education.
The mandate for this group is to establish comprehensive benchmarking criteria and datasets, to develop and implement rigorous evaluation strategies, and to promote multi-center research in AI for dentistry. Benchmarking these AI systems is anticipated to produce more robust and reliable models and algorithms, ultimately aligning the interests of developers, clinicians, and patients with the WHO's overarching goal to improve the population's health and increase dental coverage. Compared to current in-sample validation strategies, benchmarking on independent data might initially yield lower reported accuracy, but it allows for transparent comparisons across various models and algorithms.
Benchmarking Dataset Creation: Construct and supply a benchmarking dataset specifically for dental tasks, ensuring it is created and annotated consistently and reproducibly. This dataset will serve as a foundational tool for effective benchmarking in dental applications.
Promotion of Multi-Center Research: Actively encourage and facilitate research initiatives that span multiple centers. This approach is crucial for enhancing the generalizability and robustness of AI solutions in dentistry, ensuring they are tested and validated across diverse clinical environments.
Consent form for data donation
Metadata standards for datasets for AI tasks in the dental domain
Data management plan
Benchmarking test dataset for one or more community AI tasks
TG-Dentistry is organized into three key groups to efficiently address the benchmarking problem in AI in dentistry:
The Data (Collection) Group is tasked with formulating comprehensive guidelines for dental data collection and storage, specifying data types, formats, and protocols, and with sourcing and curating diverse, high-quality international datasets relevant to dentistry. In parallel, the group will develop metrics to evaluate the quality, AI relevance, and FAIRness (Findability, Accessibility, Interoperability, and Reusability) of dental datasets. The Data (Collection) Group works towards ensuring that the collected data aligns with the needs of AI implementations in dental diagnostics and digital dentistry. Members collaborate with global partners to secure datasets that are representative and legally usable for assessing and training AI models.
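One simple ingredient of a dataset quality or FAIRness metric is metadata completeness. The following is a minimal sketch only: the field names in REQUIRED_FIELDS and the function metadata_completeness are hypothetical illustrations, not a standard defined by TG-Dentistry, and a real FAIR assessment would cover far more than field presence.

```python
from typing import Mapping, Sequence

# Hypothetical required metadata fields, chosen for illustration only.
# They are NOT an official TG-Dentistry metadata standard.
REQUIRED_FIELDS = (
    "identifier",
    "license",
    "modality",
    "file_format",
    "country",
    "access_url",
)


def metadata_completeness(
    record: Mapping[str, object],
    required: Sequence[str] = REQUIRED_FIELDS,
) -> float:
    """Return the fraction of required fields that are present and non-empty."""
    if not required:
        raise ValueError("required must list at least one field")
    filled = 0
    for name in required:
        value = record.get(name)
        # A field counts as filled only if it exists and is not blank.
        if value is not None and str(value).strip() != "":
            filled += 1
    return filled / len(required)


example_record = {
    "identifier": "ds-001",
    "license": "CC-BY-4.0",
    "modality": "panoramic radiograph",
    "file_format": "",   # blank, so not counted
    "country": None,     # missing value, so not counted
}
score = metadata_completeness(example_record)  # 3 of 6 fields filled
```

A score like this could feed a data dashboard as one indicator among several; it says nothing about whether the metadata values are correct.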
The Data Annotation Group focuses on developing a pipeline for consensus reviewing of annotation guidelines. This subgroup ensures the creation of well-curated data by establishing high-quality, reproducible annotations in order to enhance the reliability and consistency of AI algorithms trained on dental data.
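As an illustration of what a consensus step in an annotation pipeline might look like, the sketch below resolves several annotators' labels for one item by majority vote and flags items with too little agreement for further review. The function name consensus_label, the 0.6 threshold, and the example labels are assumptions made for this example; the source does not specify the group's actual consensus procedure.

```python
from collections import Counter
from typing import Optional, Sequence, Tuple


def consensus_label(
    labels: Sequence[str],
    min_agreement: float = 0.6,  # assumed threshold, for illustration only
) -> Tuple[Optional[str], float]:
    """Return (majority label or None, agreement share) for one annotated item.

    The label is None when the most frequent label does not reach
    min_agreement, signalling that the item needs adjudication.
    """
    if not labels:
        raise ValueError("labels must contain at least one annotation")
    top_label, top_count = Counter(labels).most_common(1)[0]
    agreement = top_count / len(labels)
    if agreement >= min_agreement:
        return top_label, agreement
    return None, agreement


# Two of three annotators agree, so the majority label is accepted.
accepted = consensus_label(["caries", "caries", "sound"])
# An even split falls below the threshold and is sent back for review.
disputed = consensus_label(["caries", "sound"])
```

Majority voting is only one option; pipelines for image annotations often need overlap-based agreement measures instead of label counts.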
The Benchmarking Group is dedicated to providing evaluation standards. Maintaining a continuously growing benchmarking dataset is one of its core tasks.
Organization of Data Donation:
a. Assess global interest and willingness to donate different types of data
b. Expand network to underrepresented regions
c. Define data standards
d. Data Donation Infrastructure Setup
i. Legal and ethical clearance
ii. Data hosting
iii. Data administration
iv. Set up anonymization pipeline
Data Collection and Processing
a. Definition of community AI task set (continuous work)
b. Quality control and supervision (data dashboard)
c. Descriptive documents: Structure test datasets along compartments of data representative for different characteristics, e.g. population age, gender, ethnicity/source, data generation mode
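Structuring a test dataset along compartments, as described in item c, can be sketched as counting records per combination of characteristics and flagging combinations that are thinly represented. The field names (age_group, imaging_mode), the helper functions, and the 30% threshold below are hypothetical and used for illustration only.

```python
from collections import Counter
from typing import Iterable, List, Mapping, Sequence, Tuple

Compartment = Tuple[object, ...]


def compartment_counts(
    records: Iterable[Mapping[str, object]],
    keys: Sequence[str],
) -> Counter:
    """Count records per compartment, i.e. per combination of the given keys."""
    return Counter(
        tuple(record.get(key, "unknown") for key in keys) for record in records
    )


def underrepresented(counts: Counter, min_share: float) -> List[Compartment]:
    """Return compartments whose share of all records is below min_share."""
    total = sum(counts.values())
    if total == 0:
        return []
    return [c for c, n in counts.items() if n / total < min_share]


# Hypothetical per-record descriptors for a small test dataset.
records = [
    {"age_group": "adult", "imaging_mode": "panoramic"},
    {"age_group": "adult", "imaging_mode": "panoramic"},
    {"age_group": "adult", "imaging_mode": "panoramic"},
    {"age_group": "child", "imaging_mode": "panoramic"},
]
counts = compartment_counts(records, keys=("age_group", "imaging_mode"))
sparse = underrepresented(counts, min_share=0.3)
```

Here the child/panoramic compartment holds one of four records (25%), so it is flagged under the assumed 30% threshold. A summary of this kind could be part of the descriptive documents or the data dashboard.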
Establishment of data donation network: The establishment of a global data donation network is a crucial first step toward the benchmarking dataset. This initiative will explore the willingness of various entities to contribute data, focusing on onboarding new collaborators and addressing legal and ethical considerations. The project's core objectives include assessing global data donation readiness, fostering partnerships across diverse sectors, and establishing a legal framework to ensure data privacy and compliance.
Data management guidelines and development of a benchmarking dataset: This task includes the development of data management guidelines for AI in dentistry and the creation of a benchmarking dataset. Key areas of focus include establishing data collection, storage, and security standards that meet the needs of AI in dental diagnostics and digital dentistry. The team will also create a diverse, high-quality benchmarking dataset that is representative of different dental conditions and meets legal and ethical standards. Work will include standardizing data formats, ensuring the FAIRness of the dataset, and collaborating internationally to collect a broad range of dental data.
Sergio Uribe (PI), Riga Stradins University, Riga, Latvia & Universidad de Valparaíso, Valparaíso, Chile
Julien Issa, Poznań University of Medical Sciences, Poland
Anahita Haiat, University of Western Australia, Australia
Nightingale Open Science is a platform that connects researchers with world-class medical data from health systems around the world. Datasets of electrocardiogram waveforms, x-rays and CT scans, tissue biopsy images, and more are linked to ground-truth labels. Access requires registration.
OASIS provides open access to multimodal neuroimaging datasets for normal aging and Alzheimer's Disease, facilitating future discoveries in basic and clinical neuroscience through freely available data for hypothesis-driven analyses, atlas development, and algorithm development.
More datasets: Centaurus list