
Data Quality Dimensions


Why This Matters

In Business Intelligence, your dashboards, reports, and predictive models are only as good as the data feeding them. Data quality dimensions give you a framework for evaluating whether your data is actually fit for purpose—and when exam questions ask you to diagnose why a BI initiative failed or recommend improvements, these dimensions are your diagnostic toolkit. You're being tested on your ability to identify which dimension is compromised in a given scenario and explain the downstream business impact.

Think of data quality dimensions as falling into four categories: intrinsic quality (is the data itself correct?), structural quality (does it stay sound across systems and time?), contextual quality (is it useful for this specific purpose?), and accessibility quality (can stakeholders actually use it?). Don't just memorize definitions—know how each dimension connects to data governance, ETL processes, and decision-making reliability. When you see a case study about conflicting reports or missed business opportunities, your first instinct should be to ask: which quality dimension broke down?


Intrinsic Quality: Is the Data Itself Correct?

These dimensions evaluate whether data accurately represents reality, independent of how it's being used. Intrinsic quality failures corrupt everything downstream—no amount of sophisticated analytics can fix fundamentally flawed inputs.

Accuracy

  • Measures how closely data values match real-world truth—the foundation of all trustworthy analytics
  • Verification methods include cross-referencing against authoritative sources, validation rules, and audit sampling
  • Business impact is severe: inaccurate customer data leads to failed deliveries, wrong pricing, and compliance violations

Validity

  • Confirms data conforms to defined formats, ranges, and business rules—distinct from accuracy because data can be valid but still wrong
  • Examples include date formats (MM/DD/YYYY vs. DD/MM/YYYY), acceptable value ranges, and required field constraints
  • Caught during ETL through validation checks; invalid records typically get rejected or flagged for review
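An ETL-style validity check can be sketched in a few lines. The field names and rules below are illustrative assumptions, not from any specific tool:

```python
from datetime import datetime

def validate_record(record):
    """Return a list of validity errors; an empty list means the record passes."""
    errors = []
    # Required-field constraint
    if not record.get("customer_id"):
        errors.append("customer_id is required")
    # Format rule: order_date must parse as MM/DD/YYYY
    try:
        datetime.strptime(record.get("order_date", ""), "%m/%d/%Y")
    except ValueError:
        errors.append("order_date must be MM/DD/YYYY")
    # Range rule: quantity must be a positive integer
    if not isinstance(record.get("quantity"), int) or record["quantity"] <= 0:
        errors.append("quantity must be a positive integer")
    return errors

# An impossible date fails the format rule; the record is flagged, not loaded
print(validate_record({"customer_id": "C001",
                       "order_date": "02/30/1990",
                       "quantity": 5}))
```

In practice these rules live in the ETL layer, and flagged records are routed to a review queue rather than silently dropped.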

Precision

  • Indicates the level of granularity and exactness in data values—think decimal places, measurement units, or category specificity
  • Context-dependent: financial transactions need penny-level precision; customer satisfaction scores may not
  • Over-precision can actually harm analysis by introducing noise or false confidence in measurements
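The context-dependence of precision can be shown concretely. The numbers below are invented for illustration:

```python
from decimal import Decimal

# Over-precision: a survey average from ~30 responses doesn't justify six decimals
raw_satisfaction = 4.273619
print(round(raw_satisfaction, 1))  # 4.3 -- precision matched to the sample size

# Under-precision would be unacceptable for money: use exact decimal arithmetic
price = Decimal("19.99") * 3
print(price)  # 59.97 -- penny-level, no floating-point drift
```

Reporting 4.273619 would imply false confidence; reporting an invoice total rounded to the nearest dollar would be a genuine precision failure.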

Compare: Accuracy vs. Validity—both assess "correctness," but accuracy asks "is this the right value?" while validity asks "is this value properly formatted?" A birthdate of 02/30/1990 is invalid (impossible date), while 02/28/1991 for someone born 02/28/1990 is inaccurate (wrong year). FRQs love this distinction.
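The birthdate distinction above can be demonstrated programmatically. This sketch uses Python's datetime; the "ground truth" value is an assumed real-world fact for illustration:

```python
from datetime import datetime

def is_valid_date(text):
    """Validity check: does the value conform to the MM/DD/YYYY format rule?"""
    try:
        datetime.strptime(text, "%m/%d/%Y")
        return True
    except ValueError:
        return False

# 02/30/1990 fails validity: February has no 30th day
print(is_valid_date("02/30/1990"))  # False

# 02/28/1991 passes validity but fails accuracy against real-world truth
ground_truth = "02/28/1990"
stored_value = "02/28/1991"
print(is_valid_date(stored_value))   # True  (well-formed)
print(stored_value == ground_truth)  # False (wrong year -> inaccurate)
```

Note that the accuracy check requires an external authoritative source to compare against; validity can be checked from the value alone.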


Structural Quality: Is the Data Internally Sound?

These dimensions ensure data maintains its quality across systems, time, and transformations. Structural failures often emerge during data integration when multiple sources collide.

Consistency

  • Ensures uniform representation across datasets and systems—same customer shouldn't appear as "IBM" in one system and "International Business Machines" in another
  • Master data management (MDM) is the primary solution for maintaining consistency across enterprise systems
  • Reconciliation reports help identify inconsistencies before they contaminate downstream analytics
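One common MDM tactic is mapping source-system variants to a single master value. A simplified sketch, with an alias table invented for illustration:

```python
# Master data mapping: variants observed in source systems -> canonical name
CANONICAL_NAMES = {
    "IBM": "International Business Machines",
    "I.B.M.": "International Business Machines",
    "International Business Machines": "International Business Machines",
}

def canonicalize(name):
    """Return the master record name; unknown variants are flagged for review."""
    return CANONICAL_NAMES.get(name.strip(), f"UNMAPPED: {name.strip()}")

print(canonicalize("IBM"))                    # International Business Machines
print(canonicalize("Intl Business Machines")) # flagged -> reconciliation report
```

Unmapped values feed the reconciliation report, where a data steward decides whether to add a new alias or correct the source system.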

Integrity

  • Refers to data remaining accurate and unaltered throughout its lifecycle—from creation through storage, transfer, and archival
  • Referential integrity ensures relationships between tables remain valid (foreign keys point to existing records)
  • Compromised by system failures, unauthorized modifications, or broken ETL pipelines; detected through checksums and audit trails
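Checksum-based detection of alteration can be sketched with Python's hashlib. The CSV payload is a made-up example:

```python
import hashlib

def checksum(data: bytes) -> str:
    """SHA-256 fingerprint, recorded when the data is created or transferred."""
    return hashlib.sha256(data).hexdigest()

original = b"customer_id,balance\nC001,1500.00\n"
stored_hash = checksum(original)  # saved alongside the file in an audit trail

# Later, after storage or transfer, recompute and compare
tampered = b"customer_id,balance\nC001,9500.00\n"
print(checksum(original) == stored_hash)  # True:  integrity preserved
print(checksum(tampered) == stored_hash)  # False: the data was altered
```

A mismatch doesn't identify what changed, only that something did; the audit trail narrows down where in the lifecycle the alteration happened.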

Reliability

  • Measures the dependability of data sources and consistency of values over time—can you trust this source repeatedly?
  • Assessed through source reputation, historical accuracy rates, and documentation of collection methods
  • Unreliable sources should be flagged in data catalogs so analysts can weight their conclusions appropriately

Compare: Consistency vs. Integrity—consistency is about uniformity across systems (horizontal alignment), while integrity is about preservation over time (vertical alignment). A database could have perfect integrity but still be inconsistent with other systems using different naming conventions.


Contextual Quality: Is the Data Useful for This Purpose?

These dimensions evaluate whether data serves the specific business need at hand. Data that's perfect for one use case may be worthless for another.

Completeness

  • Measures whether all required data elements are present—null values, missing records, and partial entries all reduce completeness
  • Acceptable thresholds vary by use case: 95% completeness might suffice for trend analysis but fail for regulatory reporting
  • Root causes include optional fields, system integration gaps, and data entry abandonment
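Completeness is typically measured as the share of non-null values for a required field. A sketch with illustrative records:

```python
records = [
    {"customer_id": "C001", "email": "a@example.com", "phone": "555-0100"},
    {"customer_id": "C002", "email": None,            "phone": "555-0101"},
    {"customer_id": "C003", "email": "c@example.com", "phone": None},
]

def completeness(records, field):
    """Fraction of records with a non-null value for the given field."""
    present = sum(1 for r in records if r.get(field) is not None)
    return present / len(records)

for field in ("customer_id", "email", "phone"):
    print(f"{field}: {completeness(records, field):.0%}")
```

Here email and phone sit at 67%, so a trend analysis with a loose threshold might proceed while a regulatory report requiring 95% would be blocked.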

Timeliness

  • Assesses whether data is current enough for the decision at hand—real-time trading needs second-level freshness; annual planning can use month-old data
  • Data latency describes the gap between when an event occurs and when it appears in your BI system
  • Batch vs. streaming architectures represent fundamental design choices driven by timeliness requirements
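Data latency can be computed directly from event and load timestamps. A sketch with assumed timestamp formats and values:

```python
from datetime import datetime

def latency_seconds(event_time: str, loaded_time: str) -> float:
    """Gap between when an event occurred and when it landed in the BI system."""
    fmt = "%Y-%m-%d %H:%M:%S"
    return (datetime.strptime(loaded_time, fmt)
            - datetime.strptime(event_time, fmt)).total_seconds()

# A nightly batch load: the event happened at 09:15, appeared in BI at 02:00
lag = latency_seconds("2024-03-01 09:15:00", "2024-03-02 02:00:00")
print(lag / 3600)  # 16.75 hours -- fine for annual planning, fatal for trading
```

Tracking this metric per pipeline is what turns "the dashboard feels stale" into an architectural decision between batch and streaming.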

Relevance

  • Evaluates whether data actually applies to the business question being asked—collecting everything isn't a strategy
  • Irrelevant data increases storage costs, slows queries, and distracts analysts from meaningful patterns
  • Data minimization principles (especially under GDPR) make relevance a compliance concern, not just an efficiency one

Compare: Completeness vs. Relevance—these create tension in data strategy. Completeness pushes toward "capture everything," while relevance argues "capture only what matters." The best BI architectures balance both by defining clear data requirements before collection begins.


Accessibility Quality: Can Stakeholders Use the Data?

This dimension determines whether quality data actually reaches the people who need it. The best data in the world is worthless if decision-makers can't access it.

Accessibility

  • Measures how easily authorized users can obtain and utilize data—includes technical access, format usability, and discoverability
  • Barriers include siloed systems, inadequate permissions, poor documentation, and lack of self-service tools
  • Data democratization initiatives aim to improve accessibility while maintaining appropriate governance controls

Compare: Accessibility vs. Timeliness—both affect whether data reaches users when needed, but timeliness is about data freshness while accessibility is about delivery mechanisms. Real-time data that's locked in a system only IT can query fails on accessibility, not timeliness.


Quick Reference Table

| Concept | Best Examples |
| --- | --- |
| Intrinsic Quality | Accuracy, Validity, Precision |
| Structural Quality | Consistency, Integrity, Reliability |
| Contextual Quality | Completeness, Timeliness, Relevance |
| Accessibility Quality | Accessibility |
| ETL Validation Focus | Validity, Consistency, Completeness |
| Governance Priority | Integrity, Reliability, Accuracy |
| User-Facing Concerns | Timeliness, Accessibility, Relevance |
| Compliance-Related | Accuracy, Integrity, Relevance |

Self-Check Questions

  1. A company's CRM shows a customer's address as "123 Main St" while the billing system shows "123 Main Street, Suite 100." Which two dimensions are potentially compromised, and how would you distinguish between them?

  2. Your sales dashboard shows last month's figures, but executives need to respond to a competitor's price change announced yesterday. Which dimension has failed, and what architectural change would address it?

  3. Compare and contrast accuracy and precision using an example of customer age data. How might data be precise but inaccurate, or accurate but imprecise?

  4. An FRQ describes a merger where two companies' product databases use different category taxonomies. Which dimensions are at risk, and what data management strategy would you recommend?

  5. A data analyst complains that the information they need exists but is trapped in a legacy system requiring IT tickets to extract. Which dimension is failing, and how does this differ from a timeliness problem?