Making the most of healthcare data – towards savings through preventive care

Most Finnish patient data is stored in the Patient Data Repository, which is part of the Kanta Services provided by the Social Insurance Institution of Finland. This patient data is primarily collected and used for healthcare needs but has the potential to be utilized for predicting the risks of various diseases and identifying individuals at risk. Risk calculators based on population studies are already used in healthcare settings, and automating these tools could improve current practices.

Most Finnish healthcare providers, both public and private, store their patient data in the national Patient Data Repository. The data is recorded in two forms: as free text and in a structured format, which covers entries such as diagnosis codes, laboratory test results, and risk information.

Early disease prevention and proactive care play a key role in cutting healthcare costs. When diseases are prevented before they start, considerable financial savings can be achieved.

Data volume on the rise

The volume of data in the Patient Data Repository has been increasing, and with it the amount of structured data available for use. Structured data is particularly interesting for automated risk calculation, because the input values for risk calculators could potentially be extracted directly from the data fields. Currently, healthcare professionals enter the necessary data into risk calculators manually, which can be time-consuming.
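As a minimal sketch of what extracting input values directly from data fields could look like, the Python snippet below picks the most recent value for each input a risk calculator needs and reports which inputs are missing. The field names and the record layout are hypothetical and do not reflect the actual Patient Data Repository schema.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical inputs a cardiovascular risk calculator might need.
# These names are illustrative, not real repository field codes.
REQUIRED_FIELDS = ("systolic_bp", "total_cholesterol", "smoker")


@dataclass
class StructuredEntry:
    field: str         # hypothetical field name
    value: float       # recorded value
    recorded_on: date  # date of the recording


def latest_inputs(entries):
    """Return the newest value per required field and the fields still missing."""
    latest = {}
    for entry in entries:
        if entry.field not in REQUIRED_FIELDS:
            continue  # ignore data the calculator does not use
        current = latest.get(entry.field)
        if current is None or entry.recorded_on > current.recorded_on:
            latest[entry.field] = entry
    values = {name: entry.value for name, entry in latest.items()}
    missing = [name for name in REQUIRED_FIELDS if name not in values]
    return values, missing


# Made-up example history for one patient.
history = [
    StructuredEntry("systolic_bp", 150.0, date(2021, 3, 1)),
    StructuredEntry("systolic_bp", 138.0, date(2023, 9, 12)),
    StructuredEntry("total_cholesterol", 5.9, date(2022, 5, 20)),
    StructuredEntry("body_weight", 82.0, date(2023, 9, 12)),
]

values, missing = latest_inputs(history)
print(values)
print(missing)
```

In this made-up history the smoking status was never recorded in a structured field, so the calculator could not run without manual input. This is the kind of gap that currently limits automation.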

The potential of automated risk calculation

Automated risk calculation could help prevent various diseases if people with an increased risk of illness were identified at an early stage. Calculators for predicting the risk of cardiovascular diseases and adult-onset diabetes are examples of tools that could be adapted for automated use. With healthcare resources stretched thin, prioritizing care based on risk levels could optimize resource allocation and reduce treatment costs.
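Prioritizing care by risk level can be illustrated with a short Python sketch. It assumes that some risk calculator has already produced a score between 0 and 1 for each patient. The patient identifiers, scores, threshold, and capacity below are all made up for illustration.

```python
def prioritize(risk_by_patient, threshold, capacity):
    """Return up to `capacity` patient ids at or above `threshold`, highest risk first."""
    at_risk = [
        (patient_id, risk)
        for patient_id, risk in risk_by_patient.items()
        if risk >= threshold
    ]
    # Highest risk first, so limited appointments go to those who need them most.
    at_risk.sort(key=lambda item: item[1], reverse=True)
    return [patient_id for patient_id, _ in at_risk[:capacity]]


# Hypothetical scores from an automated risk calculator.
scores = {"patient-a": 0.05, "patient-b": 0.22, "patient-c": 0.31, "patient-d": 0.12}

invited = prioritize(scores, threshold=0.10, capacity=2)
print(invited)
```

With two preventive appointments available, only the two highest-risk patients above the threshold are selected. In practice the threshold and capacity would be set by clinical policy, not fixed in code.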

Challenges for implementing automated risk calculation include insufficient structured input values, the short length of patient histories, and inconsistent data recording practices across healthcare providers. Extracting input values from free text is one possible solution, but this requires advanced methods for processing Finnish medical text.
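To show why free text is harder to use than structured fields, the sketch below tries to pull a blood pressure reading out of a clinical note with a simple pattern. It assumes the reading is written as "RR" followed by two numbers separated by a slash. The example notes are invented, and a single pattern like this is only a naive baseline, not one of the advanced methods the task requires.

```python
import re

# Assumed notation: "RR 145/90" or "RR: 145/90" (an illustrative convention).
BP_PATTERN = re.compile(r"\bRR\s*:?\s*(\d{2,3})\s*/\s*(\d{2,3})")


def extract_blood_pressure(note):
    """Return (systolic, diastolic) from the first match, or None if nothing matches."""
    match = BP_PATTERN.search(note)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Invented example notes in Finnish.
print(extract_blood_pressure("Vastaanotolla RR 145/90, pulssi 72."))
print(extract_blood_pressure("Verenpaine koholla, seurataan."))
```

The second note mentions elevated blood pressure in words but gives no number in the expected notation, so the pattern finds nothing. Handling spelling variants, inflected word forms, and abbreviations that differ between providers is what makes this task demanding.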

The growing amount of patient data recorded in a structured format gives reason to believe that risk predictions for various diseases could be developed more effectively in the future. More advanced recording practices and medical text analysis methods may soon make it possible to use the data in the Patient Data Repository more extensively.


Henna Kujanen
Software Developer

I began my role as a software developer at Atostek in early 2023. I graduated from Tampere University with a Master’s degree in biotechnology in spring 2022 and with a Master’s degree in information management in spring 2024. At Atostek, my focus has been on data analysis for health data research projects, as well as software development and testing. My Master’s thesis explored the use of the Finnish Patient Data Repository for predicting cardiovascular disease risks.