
Cognitive Computing in Business

Data Mining Techniques


Why This Matters

Data mining sits at the heart of cognitive computing in business—it's how organizations transform raw data into actionable intelligence. When you're tested on these techniques, you're really being evaluated on your understanding of when to apply which method, what business problems each solves, and how algorithms learn patterns from data. The exam will expect you to distinguish between supervised and unsupervised approaches, recognize appropriate use cases, and understand the tradeoffs between interpretability and predictive power.

Don't fall into the trap of memorizing algorithm names without context. Instead, focus on the underlying logic: classification assigns labels, clustering finds natural groupings, regression predicts continuous values, and association rules uncover hidden relationships. Each technique answers a different business question, and knowing which question each answers is what separates strong exam performance from mediocre recall.


Supervised Learning: When You Know What You're Looking For

These techniques require labeled training data—you're teaching the algorithm what "correct" looks like so it can make predictions on new data. The model learns a mapping function from inputs to known outputs.

Classification

  • Assigns data to predefined categories based on learned patterns—think of it as teaching a system to sort items into labeled bins
  • Business applications include spam detection, customer churn prediction, and credit risk scoring—anywhere you need a yes/no or category decision
  • Evaluation metrics like accuracy, precision, and recall measure how well the model categorizes unseen data
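The three evaluation metrics above can be computed directly from a confusion of predicted vs. actual labels. A minimal sketch with made-up churn labels (the data here is hypothetical, purely for illustration):

```python
# Toy illustration of classification evaluation metrics on hypothetical labels:
# 1 = customer churned, 0 = customer stayed.
actual    = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]

tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)  # true positives
fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)  # false positives
fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)  # false negatives
tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)  # true negatives

accuracy  = (tp + tn) / len(actual)  # fraction of all predictions that are correct
precision = tp / (tp + fp)           # of predicted churners, how many actually churned
recall    = tp / (tp + fn)           # of actual churners, how many the model caught

print(accuracy, precision, recall)   # 0.8 0.8 0.8 on this toy data
```

Note that precision and recall answer different business questions: precision guards against false alarms, recall against missed cases.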

Decision Trees

  • Flowchart-like structure that splits data based on feature values—each branch represents a decision rule leading to an outcome
  • High interpretability makes these ideal for business contexts where stakeholders need to understand why a decision was made
  • Handles both classification and regression tasks, though prone to overfitting without pruning or ensemble methods
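How a tree picks its split rules can be sketched with Gini impurity: the tree prefers the threshold whose child nodes are purest. The (income, defaulted?) records below are hypothetical:

```python
# Sketch of how a decision tree scores a candidate split: weighted Gini
# impurity of the two child nodes (lower is better, 0 is a perfect split).
def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    p1 = labels.count(1) / n
    return 1.0 - p1 ** 2 - (1 - p1) ** 2

# Hypothetical (income in $k, defaulted?) records
data = [(20, 1), (25, 1), (30, 1), (45, 0), (50, 0), (60, 0)]

def split_quality(threshold):
    left  = [y for x, y in data if x <= threshold]
    right = [y for x, y in data if x > threshold]
    n = len(data)
    # Weighted impurity of the two child nodes
    return (len(left) / n) * gini(left) + (len(right) / n) * gini(right)

print(split_quality(30))  # 0.0: "income <= 30" perfectly separates the classes
print(split_quality(45))  # 0.25: a worse split, the left child is mixed
```

Each chosen threshold becomes one branch in the flowchart, which is exactly why the resulting rules are easy to explain to stakeholders.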

Support Vector Machines

  • Finds the optimal hyperplane that maximizes the margin between classes in high-dimensional feature space
  • Excels with complex boundaries and works well when classes are clearly separable—common in text classification and image recognition
  • Kernel trick allows SVMs to handle non-linear relationships by transforming data into higher dimensions
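The two ideas above can be sketched in a few lines: a linear SVM classifies by which side of the hyperplane w·x + b a point falls on, and an RBF kernel scores similarity after an implicit high-dimensional mapping. The weights here are made up, not a trained model:

```python
import math

# Hypothetical hyperplane parameters (a trained SVM would learn these)
w, b = [2.0, -1.0], 0.5

def classify(x):
    """Sign of w.x + b decides which side of the hyperplane x is on."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1

def rbf_kernel(x, z, gamma=0.5):
    """K(x, z) = exp(-gamma * ||x - z||^2): near 1 for close points, near 0 for distant ones."""
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * sq_dist)

print(classify([1.0, 1.0]))        # 1  (score = 2 - 1 + 0.5 = 1.5)
print(classify([-1.0, 2.0]))       # -1 (score = -2 - 2 + 0.5 = -3.5)
print(rbf_kernel([0, 0], [0, 0]))  # 1.0: identical points are maximally similar
```

The kernel trick's payoff is that the SVM only ever needs these pairwise similarity scores, never the explicit high-dimensional coordinates.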

Compare: Decision Trees vs. Support Vector Machines—both handle classification, but Decision Trees prioritize interpretability while SVMs prioritize accuracy in complex, high-dimensional spaces. If an FRQ asks about explaining a model to executives, go with Decision Trees; for maximum predictive power with messy data, SVMs are your answer.

Naive Bayes

  • Probabilistic classifier using Bayes' theorem, P(A|B) = P(B|A) · P(A) / P(B)—calculates the probability of each class given the input features
  • Assumes feature independence, which rarely holds in reality but still performs surprisingly well on text classification tasks
  • Fast and scalable for large datasets—ideal for real-time spam filtering and sentiment analysis where speed matters
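A worked instance of Bayes' theorem makes the classifier concrete. Here A = "email is spam" and B = "email contains the word 'free'"; all three input probabilities are assumed for illustration:

```python
# Hypothetical probabilities, purely for illustration
p_a = 0.20              # P(A): prior probability an email is spam
p_b_given_a = 0.60      # P(B|A): spam emails containing "free"
p_b_given_not_a = 0.05  # P(B|not A): legitimate emails containing "free"

# P(B) via the law of total probability
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
p_a_given_b = p_b_given_a * p_a / p_b
print(round(p_a_given_b, 2))  # 0.75: seeing "free" raises spam probability from 20% to 75%
```

A Naive Bayes classifier repeats this update for every feature, multiplying the per-feature likelihoods as if they were independent—that multiplication is the "naive" assumption.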

K-Nearest Neighbors (KNN)

  • Instance-based learning that classifies new points by majority vote of their k nearest neighbors—no explicit training phase required
  • Distance metric selection (Euclidean, Manhattan, etc.) significantly impacts performance—choose based on your data's characteristics
  • Computationally expensive at prediction time since it must calculate distances to all training points for each new observation
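The whole algorithm fits in a few lines, which also makes its prediction-time cost visible: every query scans all training points. The 2-D points and labels below are hypothetical:

```python
import math
from collections import Counter

# Hypothetical labeled 2-D training points
training = [((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((3.0, 4.0), "B"),
            ((5.0, 7.0), "B"), ((3.5, 4.5), "B")]

def knn_predict(point, k=3):
    # Sort ALL training points by Euclidean distance -- this full scan
    # is why KNN is expensive at prediction time.
    nearest = sorted(training, key=lambda item: math.dist(point, item[0]))
    # Majority vote among the k closest neighbors
    votes = Counter(label for _, label in nearest[:k])
    return votes.most_common(1)[0][0]

print(knn_predict((1.2, 1.5)))  # "A": two of the three nearest neighbors are class A
print(knn_predict((4.0, 5.0)))  # "B"
```

Swapping `math.dist` for a Manhattan distance (`sum(abs(a - b) ...)`) is the one-line change the distance-metric bullet refers to.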

Compare: Naive Bayes vs. KNN—both are simple to implement, but Naive Bayes builds a probabilistic model while KNN stores all training data. Naive Bayes handles high-dimensional text data efficiently; KNN struggles with the "curse of dimensionality" but captures local patterns better.


Unsupervised Learning: Discovering Hidden Structure

These techniques work with unlabeled data—the algorithm identifies patterns without being told what to look for. The goal is to uncover natural groupings or relationships that humans might miss.

Clustering

  • Groups similar data points without predefined labels—the algorithm decides what "similar" means based on distance or density measures
  • Key algorithms include K-Means (partition-based), Hierarchical Clustering (builds dendrograms), and DBSCAN (density-based, handles irregular shapes)
  • Business applications span market segmentation, customer profiling, and anomaly detection—anywhere you need to discover natural categories
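K-Means illustrates the unsupervised loop well: alternate between assigning points to the nearest centroid and moving each centroid to the mean of its cluster. The 1-D monthly-spend values and starting centroids below are made up:

```python
# Minimal K-Means sketch on hypothetical 1-D customer-spend data.
points = [10.0, 12.0, 11.0, 85.0, 90.0, 88.0]  # monthly spend (made-up)
centroids = [10.0, 90.0]                        # initial guesses

for _ in range(5):  # a few iterations converge on this tiny example
    # Assignment step: each point joins its nearest centroid's cluster
    clusters = [[] for _ in centroids]
    for p in points:
        idx = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
        clusters[idx].append(p)
    # Update step: each centroid moves to the mean of its cluster
    centroids = [sum(c) / len(c) for c in clusters]

print(centroids)  # roughly [11.0, 87.67]: a low-spend and a high-spend segment
```

No labels were supplied anywhere—the two segments emerge purely from the distance structure of the data, which is the defining trait of unsupervised learning.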

Association Rule Mining

  • Discovers co-occurrence relationships in transactional data—the classic "customers who bought X also bought Y" insight
  • Evaluated using three metrics: support (how often items appear together), confidence (probability of Y given X), and lift (strength above random chance)
  • Market basket analysis remains the signature application, but also used in recommendation systems and cross-selling strategies
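The three metrics can be computed by counting over transactions. A sketch for the candidate rule "bread → butter" on hypothetical basket data:

```python
# Hypothetical transaction data for market basket analysis
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]
n = len(transactions)

count_bread  = sum(1 for t in transactions if "bread" in t)               # 4
count_butter = sum(1 for t in transactions if "butter" in t)              # 4
count_both   = sum(1 for t in transactions if {"bread", "butter"} <= t)   # 3

support    = count_both / n               # how often X and Y co-occur: 3/5 = 0.6
confidence = count_both / count_bread     # P(butter | bread): 3/4 = 0.75
lift       = confidence / (count_butter / n)  # strength vs. chance: 0.75 / 0.8 = 0.9375

print(support, confidence, lift)
```

A lift below 1, as here, means the pairing actually occurs slightly *less* often than chance would predict—only rules with lift well above 1 suggest a genuine cross-selling opportunity.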

Compare: Clustering vs. Association Rule Mining—both are unsupervised, but clustering groups entities (customers, products) while association rules find relationships between items (purchase patterns). Clustering answers "who are my customer segments?" while association rules answer "what do they buy together?"


Predictive Modeling: Forecasting Continuous Outcomes

When your target variable is a number rather than a category, you need regression techniques. These models estimate the mathematical relationship between predictors and outcomes.

Regression Analysis

  • Models relationships between dependent and independent variables—linear regression assumes y = β₀ + β₁x + ε, a straight-line relationship
  • Types serve different purposes: linear regression for continuous outcomes, logistic regression for binary classification (despite the name), polynomial regression for curved relationships
  • Essential for forecasting applications like sales predictions, demand planning, and risk assessment where you need numerical estimates
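For simple linear regression the least-squares fit has a closed form: the slope is the covariance of x and y divided by the variance of x. A sketch on made-up (ad spend, sales) data:

```python
from statistics import mean

# Hypothetical (ad spend, sales) observations
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 4.0, 6.2, 7.9, 10.1]

x_bar, y_bar = mean(x), mean(y)
# Slope b1: covariance of x and y divided by variance of x
b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
      / sum((xi - x_bar) ** 2 for xi in x))
b0 = y_bar - b1 * x_bar  # intercept: the fitted line passes through the means

def predict(new_x):
    return b0 + b1 * new_x

print(round(b1, 2), round(b0, 2))  # 1.99 0.09: each unit of spend adds ~2 units of sales
print(round(predict(6.0), 2))      # 12.03: forecast for the next period
```

The interpretable coefficients are the selling point: b1 directly states how much the outcome moves per unit of the predictor, which is the explainability advantage referenced in the comparison below.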

Neural Networks

  • Interconnected layers of nodes loosely inspired by biological neurons—each connection has a weight that's adjusted during training
  • Excel at complex, non-linear patterns in unstructured data like images, audio, and natural language—the foundation of deep learning
  • Require substantial resources: large training datasets, significant computational power, and careful hyperparameter tuning to avoid overfitting
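A forward pass through a tiny network shows the weighted-connection idea: each node computes a weighted sum of its inputs plus a bias, then applies a non-linear activation. The weights below are fixed, hypothetical values—training via backpropagation is what would actually set them:

```python
import math

def sigmoid(z):
    """Squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, b_hidden, w_out, b_out):
    # Hidden layer: one sigmoid unit per row of w_hidden
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    # Output layer combines the hidden activations
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)

# Made-up weights for a 2-input, 2-hidden, 1-output network
w_hidden = [[0.5, -0.4], [0.3, 0.8]]
b_hidden = [0.1, -0.2]
w_out = [1.2, -0.7]
b_out = 0.05

y = forward([1.0, 2.0], w_hidden, b_hidden, w_out, b_out)
print(round(y, 3))  # a value in (0, 1), usable as a class probability
```

Stacking many such layers is what gives deep networks their capacity for non-linear patterns—and also why they need large datasets and careful tuning.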

Compare: Regression Analysis vs. Neural Networks—both predict continuous values, but regression offers interpretable coefficients (you can explain why the prediction changed) while neural networks are "black boxes" that often achieve higher accuracy on complex tasks. Choose regression when explainability matters; neural networks when prediction accuracy is paramount.


Outlier and Risk Detection: Finding What Doesn't Belong

Some business problems require identifying exceptions rather than rules. These techniques flag observations that deviate significantly from expected patterns.

Anomaly Detection

  • Identifies rare observations that differ significantly from the majority—statistical outliers that may indicate problems or opportunities
  • Multiple approaches exist: statistical tests (z-scores, IQR), clustering-based methods (points far from any cluster), and supervised models trained on known anomalies
  • Critical business applications include fraud detection, network intrusion identification, and equipment failure prediction—high-stakes scenarios where missing an anomaly is costly
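The z-score approach from the statistical-tests bullet is the simplest to sketch: flag any observation more than a chosen number of standard deviations from the mean. The daily transaction amounts below are hypothetical:

```python
from statistics import mean, stdev

# Hypothetical daily transaction amounts, one of which is suspicious
amounts = [102.0, 98.0, 101.0, 99.0, 100.0, 97.0, 103.0, 450.0]

mu, sigma = mean(amounts), stdev(amounts)

def is_anomaly(x, threshold=2.0):
    """Flag x if it lies more than `threshold` standard deviations from the mean."""
    return abs(x - mu) / sigma > threshold

flagged = [x for x in amounts if is_anomaly(x)]
print(flagged)  # [450.0]: the outlier, flagged with no labeled fraud examples
```

Note that no fraud labels were needed—the 450.0 stands out purely because it deviates from the pattern, which is exactly the advantage over classification when fraud patterns keep evolving.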

Compare: Anomaly Detection vs. Classification—both can identify fraud, but classification requires labeled examples of fraud to train on, while anomaly detection can flag unusual patterns without prior fraud examples. Use classification when you have good historical data; anomaly detection when fraud patterns constantly evolve.


Quick Reference Table

| Concept | Best Examples |
| --- | --- |
| Supervised Classification | Decision Trees, Support Vector Machines, Naive Bayes, KNN |
| Unsupervised Grouping | Clustering (K-Means, DBSCAN), Association Rule Mining |
| Continuous Prediction | Regression Analysis, Neural Networks |
| High Interpretability | Decision Trees, Linear Regression, Naive Bayes |
| Complex Pattern Recognition | Neural Networks, Support Vector Machines |
| Text/Document Analysis | Naive Bayes, Support Vector Machines |
| Fraud/Outlier Detection | Anomaly Detection, Clustering-based methods |
| Real-time/Fast Prediction | Naive Bayes, Decision Trees |

Self-Check Questions

  1. Which two techniques both handle classification but differ most in their interpretability—and when would you choose each in a business context?

  2. A retailer wants to understand which products are frequently purchased together. Which technique should they use, and what three metrics would they use to evaluate the discovered patterns?

  3. Compare and contrast clustering and classification: What fundamental difference in the training data determines which approach is appropriate?

  4. Your company has transaction data but very few confirmed fraud cases to learn from. Would you recommend a classification approach or anomaly detection—and why?

  5. An FRQ asks you to recommend a data mining approach for predicting next quarter's sales revenue. Which technique category applies, and what's one key assumption of the simplest model in that category?