Quality assurance processes
Quality assurance is integrated into our processes and computer systems and applied throughout the data-matching cycle.
These assurance processes include:
- registering the intention to undertake a data-matching program on an internal register
- risk assessment and approval from the data steward and relevant senior executive service (SES) officers prior to any data-matching program being undertaken
- conducting program pilots or obtaining sample data to ensure the data-matching program will achieve its objectives prior to full datasets being obtained
- notifying the OAIC of our intention to undertake the data-matching program and seeking permission to vary from the data-matching guidelines (where applicable)
- restricting access to the data to approved users, with access management logs recording details of who has accessed the data
- quality assurance processes embedded into compliance activities, including:
  - review of risk assessments, taxpayer profiles and case plans by senior officers prior to client contact
  - ongoing reviews of cases by subject matter technical experts at key points during the life cycle of a case
  - regular independent panel reviews of samples of case work to ensure our case work is accurate and consistent.
These processes ensure data is collected and used in accordance with our data-management policies and principles and complies with the OAIC’s Guidelines on data matching in Australian Government administration.
How we ensure data quality
Data quality is a measure of how fit for purpose data is. It is valuable because it helps us understand the data asset and what it can be used for.
Data quality management allows us to use data with greater confidence, assists in meeting data governance requirements and ensures a greater understanding of the data we hold.
The ATO Enterprise Data Quality (DQ) framework provides clarity and structure to our management of data quality and may be applied in determining how business areas can make effective and sound use of data.
This framework outlines 6 core DQ dimensions:
- Accuracy – the degree to which the data correctly represents the actual value.
- Completeness – whether all expected data in a data set is present.
- Consistency – whether data values in a data set are consistent with values elsewhere within the data set or in another data set.
- Validity – whether data values are presented in the correct format and fall within a predefined set of values.
- Uniqueness – whether duplicated files or records are in the data set.
- Timeliness – how quickly the data is available for use from the time of collection.
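Several of the dimensions above can be expressed as simple measurable checks. The following is a minimal sketch only, not the ATO's implementation: the record layout, field names and the four-digit postcode pattern are assumptions made for illustration. Accuracy and consistency are omitted because they need a trusted reference value or a second data set to compare against.

```python
import re
from datetime import date

# Hypothetical sample records; all field names are illustrative assumptions.
records = [
    {"id": "A1", "postcode": "2600", "amount": 120.0,
     "collected": date(2024, 7, 1), "available": date(2024, 7, 3)},
    {"id": "A2", "postcode": "26O0", "amount": None,  # letter O, missing amount
     "collected": date(2024, 7, 1), "available": date(2024, 7, 10)},
    {"id": "A1", "postcode": "3000", "amount": 75.5,  # repeated id
     "collected": date(2024, 7, 2), "available": date(2024, 7, 4)},
]

def completeness(rows, field):
    """Share of rows where the field is present and not None."""
    return sum(r.get(field) is not None for r in rows) / len(rows)

def validity(rows, field, pattern):
    """Share of rows whose field value fully matches the expected format."""
    matches = sum(bool(re.fullmatch(pattern, str(r.get(field, "")))) for r in rows)
    return matches / len(rows)

def duplicate_keys(rows, key):
    """Key values that occur more than once (a uniqueness check)."""
    seen, dupes = set(), set()
    for r in rows:
        value = r[key]
        if value in seen:
            dupes.add(value)
        seen.add(value)
    return dupes

def max_lag_days(rows):
    """Longest delay from collection to availability (a timeliness check)."""
    return max((r["available"] - r["collected"]).days for r in rows)
```

Each function returns a ratio, a set or a count rather than a pass or fail result, so a business area could set its own thresholds for what counts as fit for purpose.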
To assure that specific data is fit for consumption and its intended use throughout our data-matching programs, the following data quality elements may also be applied:
- Currency – how recent the time period is that the data set covers.
- Precision – the level of detail of a data element.
- Privacy – access control and usage monitoring.
- Reasonableness – whether data is within the bounds of common sense or a specific operational context.
- Referential integrity – whether all intended references within a data set, or with other data sets, are valid.
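Three of these additional elements (referential integrity, reasonableness and currency) lend themselves to a short illustration. The sketch below uses two hypothetical data sets; the field names and the reasonable range for an amount are assumptions chosen for the example and would depend on the operational context.

```python
from datetime import date

# Hypothetical data sets; names and values are illustrative assumptions.
accounts = [{"account_id": 1}, {"account_id": 2}]
transactions = [
    {"txn_id": 10, "account_id": 1, "amount": 250.0,
     "period_end": date(2023, 6, 30)},
    {"txn_id": 11, "account_id": 3, "amount": -40.0,
     "period_end": date(2024, 6, 30)},
]

def orphaned_references(children, parents, key):
    """Child rows whose key has no matching parent (referential integrity)."""
    parent_keys = {p[key] for p in parents}
    return [c for c in children if c[key] not in parent_keys]

def unreasonable(rows, field, low, high):
    """Rows whose value falls outside the agreed range (reasonableness)."""
    return [r for r in rows if not (low <= r[field] <= high)]

def latest_period(rows):
    """Most recent period covered by the data set (currency)."""
    return max(r["period_end"] for r in rows)
```

Here transaction 11 would be flagged twice: it refers to an account that does not exist in the accounts data set, and its negative amount falls outside an assumed range of 0 to 1,000,000.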
Data is sourced from providers' systems and may not be available in a format that can be readily processed by our own systems. We apply additional levels of scrutiny and analytics to verify the quality of these datasets.
This includes but is not limited to:
- meeting with data providers to understand their data holdings, including their data use, data currency, formats, compatibility and natural systems
- sampling data to ensure it is fit for purpose before fully engaging providers on task
- verification practices at receipt of data to check against confirming documentation; we then use algorithms and other analytical methods to refine the data
- transforming data into a standardised format and validating to ensure that it contains the required data elements prior to loading to our computer systems; our data quality practices may also be applied during this transformation process
- undertaking program evaluations to measure effectiveness before determining whether to continue to collect future years of the data or to discontinue the program.
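The transformation and validation step listed above can be sketched as follows. This is an illustrative example under stated assumptions, not a description of our systems: the provider column names, the standard field names, the day/month/year date format and the list of required elements are all hypothetical.

```python
from datetime import datetime

# Hypothetical mapping from a provider's column names to standard names.
FIELD_MAP = {"Full Name": "name", "DOB": "date_of_birth", "Amt": "amount"}

# Hypothetical list of data elements required before loading.
REQUIRED_FIELDS = ("name", "date_of_birth", "amount")

def standardise(raw):
    """Rename provider fields and normalise values to a standard format."""
    row = {FIELD_MAP.get(k, k): v for k, v in raw.items()}
    if row.get("name"):
        # Collapse repeated whitespace and use a single letter case.
        row["name"] = " ".join(row["name"].split()).upper()
    if row.get("date_of_birth"):
        # Assumes the provider supplies dates as DD/MM/YYYY.
        parsed = datetime.strptime(row["date_of_birth"], "%d/%m/%Y")
        row["date_of_birth"] = parsed.date().isoformat()
    if row.get("amount") not in (None, ""):
        # Remove thousands separators before converting to a number.
        row["amount"] = float(str(row["amount"]).replace(",", ""))
    return row

def missing_fields(row):
    """Required data elements that are absent or empty in a standardised row."""
    return [f for f in REQUIRED_FIELDS if row.get(f) in (None, "")]
```

A record would be loaded only when `missing_fields(standardise(raw))` returns an empty list; otherwise it could be held back for follow-up with the provider.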