AP Cybersecurity Unit 5 Review

5.6 Detecting Attacks on Data and Applications

Written by the Fiveable Content Team • Last updated June 2026
Verified for the 2027 exam

Catching an attacker in the act (or at least catching them before they do too much damage) is one of the most important jobs in cybersecurity. Even with strong preventive controls, smart adversaries find ways in. That's where detective controls come in: tools and techniques that spot suspicious activity on your data and applications. This topic covers how logs, honeypots, hashes, and DLP services work together to flag attacks, plus how to read log files for signs of common application attacks like SQL injection and XSS.

Detecting Attacks on Data

Most attacks on data leave footprints. The trick is knowing where to look and what counts as suspicious.

Accounting and Log Analysis

Whenever a user opens, copies, moves, or deletes a file, the system can record that activity in a log. This process of recording and monitoring user activities is called accounting. Logs typically capture who did what, when they did it, where they did it from, and which device they used.

Reviewing these logs can reveal an adversary at work. Things that should make you suspicious:

  • Unusual files being accessed. If an intern in marketing suddenly opens the company's payroll database, that's a red flag.
  • Activity outside normal patterns. A user who always logs in from Chicago at 9 a.m. on a work laptop suddenly logs in from another country at 3 a.m. on an unknown device. That mismatch in time, location, and device type screams compromised account.
  • Attempts to delete or copy sensitive files. Someone trying to mass-download HR records or wipe financial documents is rarely doing it for a good reason.
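
The red flags above can be turned into simple automated checks. What follows is a minimal sketch, not a real accounting system: the LoginEvent fields, the BASELINE table, and the user and device names are all hypothetical, and a real log format will look different.

```python
from dataclasses import dataclass


@dataclass
class LoginEvent:
    user: str
    hour: int      # 0-23, local hour of the login
    country: str
    device: str


# Hypothetical baseline of "normal" behavior per user (an assumption for illustration).
BASELINE = {
    "alice": {
        "hours": range(7, 19),
        "countries": {"US"},
        "devices": {"work-laptop-17"},
    },
}


def anomalies(event: LoginEvent) -> list[str]:
    """Return the reasons an event departs from the user's baseline."""
    profile = BASELINE.get(event.user)
    if profile is None:
        return ["unknown user"]
    reasons = []
    if event.hour not in profile["hours"]:
        reasons.append("unusual time")
    if event.country not in profile["countries"]:
        reasons.append("unusual location")
    if event.device not in profile["devices"]:
        reasons.append("unknown device")
    return reasons
```

A login that matches the baseline returns an empty list. A 3 a.m. login from another country on an unrecognized device returns all three reasons, which is the kind of mismatch worth escalating.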

Honeypots

A honeypot is a file (or sometimes a whole system) that's designed to look valuable but is actually fake. Imagine a file named customer_credit_cards_2024.xlsx sitting on a server. To an attacker poking around, it looks like gold. In reality, the data inside is bogus, and the system is set up to alert defenders the moment anyone touches it.

The logic is simple: there's no legitimate business reason for anyone to open that file. So if it gets accessed, you almost certainly have an intruder. Honeypots work for things like fake credit card data, fake PII, or fake password files.
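
The alerting idea can be sketched in a few lines. This assumes a hypothetical access log where each entry is a dictionary with user, action, and path keys, and a hypothetical decoy path; real honeypot tooling hooks into the file system or network instead of reading a list.

```python
# Hypothetical decoy paths. No legitimate workflow should ever touch them.
HONEYPOT_PATHS = {"/srv/finance/customer_credit_cards_2024.xlsx"}


def honeypot_alerts(access_log: list[dict]) -> list[str]:
    """Return one alert message for every log entry that touches a decoy file."""
    alerts = []
    for entry in access_log:
        if entry["path"] in HONEYPOT_PATHS:
            alerts.append(
                f"ALERT: {entry['user']} performed {entry['action']} on {entry['path']}"
            )
    return alerts
```

Notice there is no scoring or threshold here. Any single touch of the decoy produces an alert, because no legitimate access is expected.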

Cryptographic Hashes for Integrity

A cryptographic hash function takes a file as input and produces a fixed-length string called a digest. The key property: the same input always produces the same output. Change even one character in the file, and the hash changes completely.

This is great for detecting tampering. If you record a file's hash today and check it next week, a different hash means someone (or something) altered the file. Unexpected changes to system files or configuration files often point to malicious activity, like malware modifying a binary.
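
To make the "change one character" property concrete, here is a small sketch using Python's standard hashlib module. The two configuration strings are made up for illustration; they differ by a single character.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


# Illustrative inputs that differ in exactly one character.
original = sha256_hex(b"PermitRootLogin no")
tampered = sha256_hex(b"PermitRootLogin mo")

print(original)
print(tampered)
print("match" if original == tampered else "digests differ: content was altered")
```

Both digests are 64 hexadecimal characters long, and hashing the same bytes again always gives the same digest, but the two outputs look unrelated to each other.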

Choosing the Right Detective Controls

Not every organization needs the same tools. Picking detective controls comes down to a few key criteria.

Cost

Some detection methods are basically free. Setting up a honeypot file or running hash checks costs almost nothing beyond a little admin time. On the other end, data loss prevention (DLP) services are third-party tools that continuously monitor how data is accessed, used, and transmitted across an entire organization. DLP catches things like an employee emailing a spreadsheet of customer SSNs to a personal Gmail account. DLP is powerful but expensive, so it's usually a fit for bigger organizations with serious data risk.
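
To get a feel for the SSN example, here is a heavily simplified sketch of one kind of rule a DLP service might apply. It is an illustration only, not how any particular product works: the company domain, the function name, and the pattern (three digits, two digits, four digits separated by hyphens) are assumptions.

```python
import re

# Matches text shaped like a U.S. Social Security number, e.g. 123-45-6789.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Hypothetical company mail domain (an assumption for illustration).
COMPANY_DOMAIN = "example.com"


def should_flag(recipient: str, body: str) -> bool:
    """Flag a message that leaves the company and contains SSN-shaped text."""
    external = not recipient.lower().endswith("@" + COMPANY_DOMAIN)
    return external and SSN_PATTERN.search(body) is not None
```

A message to a personal Gmail address containing 123-45-6789 is flagged, while the same text sent to an internal address is not. Real DLP services also inspect attachments, uploads, and removable media, which is part of why they cost more than a honeypot file.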

Sensitivity and Criticality

The more sensitive or critical the data, the more closely you should watch it. A public marketing PDF doesn't need much monitoring. A database of patient medical records or the source code for your company's main product absolutely does. Attackers target high-value assets, so your detection effort should follow the value.

Data Classification

Certain categories of data come with legal or regulatory monitoring requirements. Examples:

  • Private data (general PII)
  • Educational records (covered by laws like FERPA in the U.S.)
  • Healthcare data (HIPAA)
  • Financial data (PCI DSS for payment cards, SOX for financial reporting)

If your data falls into one of these classes, you often have no choice. The applicable law or industry standard requires you to log and monitor access.

Evaluating Detection Methods

Detection methods aren't all equal. They differ in speed, in whether they catch attacks live or after the fact, and in what they can miss.

Speed and Automation

Manually scrolling through logs doesn't scale. Modern systems generate millions of log entries a day, so log analysis has to be automated to be useful. Automated tools can flag anomalies in near real time.

Honeypots are even faster. Because there's no legitimate reason to touch a honeypot, a single access attempt triggers an instant alert. That's about as close to instantaneous detection as you can get.

Real-Time vs. Retrospective Detection

Some detective controls catch attacks while they're happening:

  • DLP tools (many of them)
  • Honeypots
  • Real-time automated log analysis

These let defenders respond quickly and potentially stop the attack before it spreads.

Other controls only catch attacks after the fact:

  • Retrospective (after-the-fact) log analysis
  • Hash-based integrity checks

These still matter for forensics, understanding what happened, and patching the hole, but the damage is usually already done by the time you spot it.

False Negatives

A false negative is when an attack happens but your detection method misses it. Every detection tool has blind spots:

  • Cryptographic hashes only detect changes. If an attacker just views or copies a sensitive file without modifying it, the hash stays the same. The theft goes undetected.
  • Honeypots only detect attackers who interact with them. A careful adversary who avoids the bait file gets in and out without tripping the alarm.

The lesson: stack multiple detective controls so the weaknesses of one are covered by the strengths of another.

Verifying a File with Hashes

Hashing a file is one of the most practical skills in this topic. Since the same file always produces the same hash, you can verify whether a file has been altered by hashing it twice and comparing the results.

Generating Hashes on Different Systems

You can compute hashes using built-in command line tools. Here's how to generate a SHA256 hash of a file called testfile on each major OS:

Windows PowerShell:

</>Code
Get-FileHash testfile -Algorithm SHA256

Linux (BASH):

</>Code
sha256sum testfile

macOS (zsh):

</>Code
shasum -a 256 testfile

You can also use websites that compute hashes or specialized software, but the command line is fast and free.

The Verification Process

The workflow is straightforward:

  1. Hash the file when you know it's in a trusted state. Record the digest somewhere safe.
  2. Later, hash the file again.
  3. Compare the two digests.

If they match, the file hasn't changed. If they differ, something modified the file between the two hash checks. That could be a legitimate update, or it could be malware tampering.

This is also how you verify downloads. Software publishers often post the official SHA256 hash of an installer. You hash the file you downloaded and compare. If the digests match, you got the same file the publisher hashed, as long as the posted hash itself came from the publisher's genuine site. If they don't, the download was corrupted or tampered with.
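
The same hash-and-compare workflow can be scripted. This is a minimal sketch using Python's standard hashlib and pathlib modules; the function names are made up for illustration, and the trusted digest is assumed to have been recorded earlier or copied from a publisher's page.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path: Path, trusted_digest: str) -> bool:
    """Compare the file's current digest with one recorded in a trusted state."""
    return file_sha256(path) == trusted_digest.strip().lower()
```

Calling is_unchanged(Path("testfile"), recorded_digest) returns True only if every byte of the file matches what was hashed originally. The strip() and lower() calls are there because digests copied from a terminal or website often carry stray whitespace or uppercase letters.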

Spotting Application Attacks in Log Files

Web application logs record every request users send. Attackers have to send requests too, and their requests often contain telltale patterns. Knowing what to look for turns log files into an attack detector.

SQL Injection Indicators

SQL injection happens when an attacker sneaks SQL code into a user input field, hoping the database will execute it. Logs of user input often reveal the attempt. Watch for:

  • Quote characters: single quotes ('), double quotes ("), or backticks (`). Attackers use these to break out of input strings.
  • Boolean conditions like OR 1=1, which always evaluates to true and can bypass login checks.
  • The double dash (--), which starts a comment in SQL. Attackers use it to ignore the rest of a query.
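  • SQL control words, often in capital letters, like WHERE, IN, FROM, SELECT, UNION. Normal users rarely type these into a login form. SQL keywords are not case-sensitive, so lowercase versions next to quotes or comment markers are just as suspicious.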

A login field that receives input like admin' OR 1=1 -- is almost certainly an attack attempt.
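
The indicators above can be checked with a few regular expressions. This is a rough sketch, not a complete detector: the pattern names and the function are made up for illustration, and the control-word pattern only matches capital letters, because a case-insensitive match on a word like IN would fire on ordinary text.

```python
import re

# One pattern per indicator from the list above.
SQLI_PATTERNS = {
    "quote character": re.compile(r"['\"`]"),
    "boolean tautology": re.compile(r"\bOR\s+1\s*=\s*1\b", re.IGNORECASE),
    "comment marker": re.compile(r"--"),
    # Capital letters only (see the note above about false matches).
    "SQL control word": re.compile(r"\b(SELECT|UNION|WHERE|FROM|IN)\b"),
}


def sqli_indicators(user_input: str) -> list[str]:
    """Return the names of every indicator found in the input."""
    return [name for name, pattern in SQLI_PATTERNS.items() if pattern.search(user_input)]
```

For the input admin' OR 1=1 -- this reports the quote character, the boolean tautology, and the comment marker. A legitimate name like O'Brien trips the quote check on its own, which is why a single indicator is a reason to look closer rather than proof of an attack.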

Cross-Site Scripting (XSS) Indicators

XSS attacks inject malicious scripts into web pages that other users will view. The dead giveaway in logs is the <script>...</script> tag in user input. If someone submits a comment that contains <script>alert('xss')</script>, that's not a normal comment. It's an XSS probe. Any HTML tags in user input where they shouldn't be (like in a username field) deserve a closer look.
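
A log scanner can look for both cases: script tags specifically, and any HTML tag in a field that should be plain text. The sketch below is an illustration with made-up names. It only inspects the text it is given, so a payload that arrives URL-encoded (for example %3Cscript%3E) would need to be decoded first.

```python
import re
from typing import Optional

# An opening or closing script tag, in any letter case.
SCRIPT_TAG = re.compile(r"<\s*/?\s*script\b", re.IGNORECASE)
# Any other HTML-looking tag, such as <b> or </img>.
ANY_TAG = re.compile(r"<\s*/?\s*[a-zA-Z][^>]*>")


def xss_indicator(user_input: str) -> Optional[str]:
    """Classify the input as containing a script tag, some other tag, or neither."""
    if SCRIPT_TAG.search(user_input):
        return "script tag"
    if ANY_TAG.search(user_input):
        return "html tag"
    return None
```

The comment <script>alert('xss')</script> is reported as a script tag, <b>hi</b> as an HTML tag, and a harmless comment like "3 < 5" as neither.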

Buffer Overflow Indicators

A buffer overflow tries to crash or hijack an application by sending more data than it expects. For web apps, you can detect attempts by checking the length of incoming data. Common fields to monitor:

  • URL length
  • Cookie length
  • Query string length
  • Total request length

A normal URL might be 50 to 200 characters. A URL that's 5,000 characters long is not a person browsing the site. It's likely a buffer overflow attempt.
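
Length checks are easy to automate. In this sketch the thresholds are hypothetical values chosen for illustration; a real deployment would tune them to what the application normally sees. The function expects the lengths to have been measured already from each logged request.

```python
# Hypothetical maximum lengths, in characters (assumptions, not standards).
LIMITS = {"url": 2000, "cookie": 4096, "query_string": 1024, "total": 8192}


def oversized_fields(lengths: dict[str, int]) -> list[str]:
    """Return the monitored fields whose measured length exceeds its limit."""
    return [field for field, limit in LIMITS.items() if lengths.get(field, 0) > limit]
```

A request with a 120-character URL passes cleanly, while one with a 5,000-character URL comes back with "url" in the list and can be flagged for review.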

Directory Traversal Indicators

Directory traversal attacks try to escape a web app's intended folder and access files elsewhere on the server, like /etc/passwd. The classic indicator is sequences of ../ in the URL path of an HTTP GET request. Each ../ means "go up one directory."

A log entry like:

</>Code
GET /images/../../../../etc/passwd HTTP/1.1

is a clear directory traversal attempt. Any request with multiple ../ sequences should trigger an alert.
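
Counting ../ sequences in a request line can be sketched as follows. The function name is made up for illustration, and it assumes log lines shaped like the example above (method, path, protocol). It also decodes percent-encoding first, since %2e%2e/ is just ../ written differently; that step goes slightly beyond the indicator described here.

```python
import re
from urllib.parse import unquote

# "../" on Unix-style paths, or "..\" on Windows-style paths.
TRAVERSAL = re.compile(r"\.\./|\.\.\\")


def traversal_depth(request_line: str) -> int:
    """Count parent-directory sequences in the path of an HTTP request line."""
    parts = request_line.split()
    if len(parts) < 2:
        return 0  # not a well-formed request line
    path = unquote(parts[1])  # decode %2e%2e%2f style encodings
    return len(TRAVERSAL.findall(path))
```

For GET /images/../../../../etc/passwd HTTP/1.1 the function returns 4, and for an ordinary request like GET /images/logo.png HTTP/1.1 it returns 0, so an alert rule could simply fire when the count is 2 or more.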

Putting It Together

When analyzing logs for application attacks, you're scanning for these specific patterns across user input fields and request data. Most real attacks aren't subtle: the indicators above show up directly in the raw logs. The challenge is volume, which is why automated log analysis (SIEMs, intrusion detection systems) does the first pass and flags entries for a human to review.

Vocabulary

The following words are mentioned explicitly in the College Board Course and Exam Description for this topic.

Term

Definition

accounting

The process of recording and monitoring user activities and data access to track when and by whom data are accessed.

alerts

Notifications generated by security tools when suspicious activity or attacks are detected.

application logs

Records of events and activities generated by an application that can be analyzed to detect suspicious behavior and security incidents.

buffer overflows

An attack where excessive data is sent to an application to overflow its memory buffer, potentially allowing an attacker to execute arbitrary code or crash the application.

critical data

Information essential to organizational operations that, if compromised or lost, would significantly impact the organization.

cross-site scripting attacks

Attacks in which malicious scripts are injected through user input to compromise web applications or steal user data.

cryptographic hash functions

Mathematical functions that generate a fixed-length digest for data, effectively unique to that data, so that any alteration or modification can be detected.

data classification

The process of categorizing data based on sensitivity levels such as private, educational, healthcare, or financial to determine appropriate security controls.

data digest

An output generated by a cryptographic hash function, effectively unique to its input, that can reveal if data have been changed.

data integrity

The assurance that data has not been altered or corrupted and remains accurate and complete.

data loss prevention (DLP)

Security services that monitor and control data access, usage, and transmission to detect and prevent unauthorized data movement or theft.

detective controls

Security controls that help identify attacks when they occur, such as intrusion detection systems and security information and event management (SIEM) systems.

directory traversal attacks

An attack where an adversary uses path sequences like '../' in HTTP requests to navigate outside the intended directory and access unauthorized files on a server.

DLP tools

Data Loss Prevention tools that monitor and control the movement of sensitive data to prevent unauthorized access or exfiltration.

false negatives

Instances where a detection system fails to identify an actual attack or security threat that has occurred.

hash

A fixed-length binary string output produced by a cryptographic hash function from an input of arbitrary length.

hash values

Cryptographic outputs used to verify data integrity by detecting any unauthorized changes to data.

honeypots

Decoy systems or resources designed to attract and detect attackers by monitoring unauthorized access attempts.

indicators of application attacks

Specific patterns, signatures, or anomalies in logs and user input that suggest an attempted or ongoing attack on an application.

log analysis

The examination and interpretation of system logs to identify and investigate security events and potential attacks.

malicious activity

Harmful actions or behaviors conducted by adversaries on networks, such as unauthorized access, data theft, or system compromise.

real-time automated log analysis

Automated systems that continuously monitor and analyze logs as events occur to provide immediate detection of attacks.

retrospective log analysis

The examination of logs after an attack has occurred to identify what happened and how the system was compromised.

sensitive data

Information that requires protection from unauthorized access, such as personal credentials, financial information, or private communications.

server logs

Records of server activities and requests that can be reviewed to identify indicators of attacks and unauthorized access attempts.

SHA256

A specific cryptographic hash function that produces a 256-bit hash output, commonly used to verify file integrity.

SQL control words

Reserved keywords in SQL language such as WHERE, IN, and FROM that are used to construct database queries and can indicate SQL injection attempts when found in unexpected user input.

SQL injection attacks

A type of application attack where malicious SQL code is inserted into user input fields to manipulate database queries and gain unauthorized access to data.
