Detecting attacks on data and applications means using detective controls to spot suspicious activity that preventive controls miss. The main tools are log analysis (accounting), honeypots, cryptographic hashes for integrity, and data loss prevention (DLP) services. For applications, you read log files for attack signatures like SQL keywords, <script> tags, oversized requests, and ../ sequences.

Why This Matters for the AP Cybersecurity Exam

This topic sits in Unit 5, where the focus shifts to protecting the data and applications that adversaries usually want most. Detection is the layer that catches what prevention lets slip through, so expect questions that ask you to pick the right detective control for a situation, judge how fast or reliable a method is, or read a log entry and name the attack.

You should be ready to do four things: explain how to detect attacks on data, choose detective controls based on cost, sensitivity, and data classification, evaluate a detection method's speed and blind spots, and analyze log files for indicators of application attacks. The hashing skill is hands-on, so know the actual commands and the verification workflow, not just the concept.

Key Takeaways

Accounting is the recording and monitoring of user activity. Log analysis flags suspicious behavior like odd file access, activity outside a user's normal time, location, or device, and attempts to copy or delete sensitive files.
A honeypot is a fake file that looks valuable. Any access to it is an indicator of malicious activity because there is no legitimate reason to open it.
Cryptographic hashes detect whether a file was altered, but they cannot tell if data was only viewed or copied. The same input always produces the same digest.
Choose detective controls based on cost, the sensitivity or criticality of the data, and its classification (private, educational, healthcare, financial), since some classifications carry legal monitoring requirements.
Honeypots, real-time automated log analysis, and some DLP tools alert during an attack. Retrospective log analysis and hash checks identify attacks after they happen.
Application attacks show up in logs as recognizable patterns: SQL control words and symbols, <script> tags, oversized request fields, and ../ directory traversal sequences.

Detecting Attacks on Data

Most attacks on data leave footprints. The trick is knowing where to look and what counts as suspicious.

Accounting and Log Analysis

Whenever a user opens, copies, moves, or deletes a file, the system can record that activity in a log. This process of recording and monitoring user activities is called accounting. Logs typically capture who did what, when they did it, where they did it from, and which device they used.

Reviewing these logs can reveal an adversary at work. Things that should make you suspicious:

Unusual files being accessed. If an intern in marketing suddenly opens the company's payroll database, that's a red flag.
Activity outside normal patterns. A user who always logs in from Chicago at 9 a.m. on a work laptop suddenly logs in from another country at 3 a.m. on an unknown device. That mismatch in time, location, and device type points to a compromised account.
Attempts to delete or copy sensitive files. Someone trying to mass-download HR records or wipe financial documents is rarely doing it for a good reason.
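
To make those red flags concrete, here is a minimal sketch of an automated check over accounting records. Everything named here (the `AccessEvent` fields, the `BASELINES` table, the `/hr/` and `/finance/` prefixes) is a hypothetical illustration, not a real product's format.

```python
from dataclasses import dataclass

@dataclass
class AccessEvent:
    user: str
    hour: int      # 0-23, local hour when the event happened
    country: str
    device: str
    action: str    # "open", "copy", "move", or "delete"
    path: str

# Hypothetical per-user baseline; a real system would build this from history.
BASELINES = {
    "jsmith": {"hours": range(8, 18), "countries": {"US"}, "devices": {"LAPTOP-0042"}},
}

# Assumed locations of sensitive files.
SENSITIVE_PREFIXES = ("/hr/", "/finance/")

def suspicious_reasons(event) -> list:
    """Return the reasons an event looks suspicious (empty list if none)."""
    baseline = BASELINES.get(event.user)
    if baseline is None:
        return ["no baseline for user"]
    reasons = []
    if event.hour not in baseline["hours"]:
        reasons.append("unusual time")
    if event.country not in baseline["countries"]:
        reasons.append("unusual location")
    if event.device not in baseline["devices"]:
        reasons.append("unknown device")
    if event.action in {"copy", "delete"} and event.path.startswith(SENSITIVE_PREFIXES):
        reasons.append("copy/delete of sensitive file")
    return reasons

normal = AccessEvent("jsmith", 9, "US", "LAPTOP-0042", "open", "/marketing/plan.docx")
odd = AccessEvent("jsmith", 3, "RO", "UNKNOWN-PC", "copy", "/hr/records.csv")

print(suspicious_reasons(normal))
print(suspicious_reasons(odd))
```

The second event trips every rule at once, which mirrors the compromised-account pattern described above: wrong time, wrong place, wrong device, and a sensitive file being copied.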

Honeypots

A honeypot is a file that's designed to look valuable but is actually fake. Imagine a file named customer_credit_cards_2024.xlsx sitting on a server. To an attacker poking around, it looks like gold. In reality, the data inside is bogus, and the system is set up to alert defenders the moment anyone touches it.

The logic is simple: there's no legitimate business reason for anyone to open that file. So if it gets accessed, you almost certainly have an intruder. Honeypots work for things like fake credit card data, fake PII, or fake password files.
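
The detection logic is simple enough to sketch in a few lines. The file path and the log record layout below are assumptions made up for illustration; the point is only that any match is alert-worthy, with no baseline or threshold needed.

```python
# Hypothetical location of the bait file.
HONEYPOT_PATHS = {"/srv/share/customer_credit_cards_2024.xlsx"}

def honeypot_alerts(access_log):
    """Yield an alert for every log record that touches a bait file."""
    for record in access_log:
        if record["path"] in HONEYPOT_PATHS:
            yield (f"ALERT: {record['user']} accessed honeypot "
                   f"{record['path']} at {record['time']}")

# Made-up access records in an assumed format.
access_log = [
    {"time": "2024-05-01T09:14:02", "user": "jsmith",
     "path": "/srv/share/q2_report.docx"},
    {"time": "2024-05-01T03:02:47", "user": "svc_backup",
     "path": "/srv/share/customer_credit_cards_2024.xlsx"},
]

alerts = list(honeypot_alerts(access_log))
for alert in alerts:
    print(alert)
```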

Cryptographic Hashes for Integrity

A cryptographic hash function takes a file as input and produces a fixed-length string called a digest. The key property: the same input always produces the same output. Change even one character in the file, and the hash changes completely.

This is useful for detecting tampering. If you record a file's hash today and check it next week, a different hash means someone (or something) altered the file. Unexpected changes to system files or configuration files often point to malicious activity, like malware modifying a binary.
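
Both properties are easy to see for yourself. This short sketch uses Python's standard `hashlib` module; the two messages are invented examples that differ by a single character.

```python
import hashlib

original = b"transfer $100 to account 12345"
tampered = b"transfer $900 to account 12345"  # one character changed

digest_a = hashlib.sha256(original).hexdigest()
digest_b = hashlib.sha256(original).hexdigest()  # same input hashed again
digest_c = hashlib.sha256(tampered).hexdigest()

print(digest_a == digest_b)           # same input, same digest
print(digest_a == digest_c)           # changed input, different digest
print(len(digest_a), len(digest_c))   # digests are the same fixed length
```

SHA256 digests are 256 bits, which prints as 64 hexadecimal characters no matter how long the input is.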

Choosing the Right Detective Controls

Not every organization needs the same tools. Picking detective controls comes down to a few key criteria.

Cost

Some detection methods are basically free. Setting up a honeypot file or running hash checks costs almost nothing beyond a little admin time. On the other end, data loss prevention (DLP) services are third-party tools that continuously monitor how data is accessed, used, and transmitted across an entire organization. DLP catches things like an employee emailing a spreadsheet of customer SSNs to a personal Gmail account. DLP provides strong detection capabilities but costs more, so it usually fits bigger organizations with serious data risk.
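
To give a rough feel for the SSN example, here is a toy sketch of one kind of content check. It is not how any particular DLP product works: the `example.com` company domain and the single SSN pattern are assumptions, and real services inspect far more than one regular expression.

```python
import re

# U.S. SSN layout such as 123-45-6789 (format check only, not validation).
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Assumed company mail domain.
INTERNAL_DOMAIN = "example.com"

def dlp_violation(recipient: str, body: str) -> bool:
    """Flag outbound mail that carries SSN-like strings to an outside domain."""
    is_external = not recipient.lower().endswith("@" + INTERNAL_DOMAIN)
    has_ssn = SSN_PATTERN.search(body) is not None
    return is_external and has_ssn

print(dlp_violation("pat@gmail.com", "Customer list: 123-45-6789, 987-65-4321"))
print(dlp_violation("hr@example.com", "Customer list: 123-45-6789"))
print(dlp_violation("pat@gmail.com", "Lunch at noon?"))
```

Only the first message is flagged: it combines sensitive-looking content with an outside destination.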

Sensitivity and Criticality

The more sensitive or critical the data, the more closely you should watch it. A public marketing PDF doesn't need much monitoring. A database of patient medical records or the source code for your company's main product absolutely does. Attackers target high-value assets, so your detection effort should follow the value.

Data Classification

Certain categories of data come with legal or regulatory monitoring requirements. These classifications include:

Private data (general PII)
Educational records
Healthcare data
Financial data

If your data falls into one of these classes, you often have no choice. The law or regulation requires you to log and monitor access. (Laws such as FERPA and HIPAA, and industry standards such as PCI DSS, are real-world examples of these requirements, not required AP content.)

Evaluating Detection Methods

Detection methods aren't all equal. They differ in speed, in whether they catch attacks live or after the fact, and in what they can miss.

Speed and Automation

Manually scrolling through logs doesn't scale. Modern systems generate huge numbers of log entries a day, so log analysis has to be augmented with automation to operate at an effective speed. Automated tools can flag anomalies in near real time.

Honeypots are even faster. Because there's no legitimate reason to touch a honeypot, a single access attempt triggers an instant alert. That's about as close to instantaneous detection as you can get.

Real-Time vs. Retrospective Detection

Some detective controls catch attacks while they're happening:

Some DLP tools
Honeypots
Real-time automated log analysis

These let defenders respond quickly and potentially stop the attack before it does more harm.

Other controls only catch attacks after they have occurred:

Retrospective (after-the-fact) log analysis
Hash-based integrity checks

These still matter for forensics, understanding what happened, and patching the hole, but the damage is usually already done by the time you spot it.

False Negatives

A false negative is when an attack happens but your detection method misses it. Every detection tool has blind spots:

Cryptographic hashes only detect changes. If an attacker just views or copies a sensitive file without modifying it, the hash stays the same. The theft goes undetected.
Honeypots only detect attackers who interact with them. A careful adversary who avoids the bait file gets in and out without tripping the alarm.

The lesson: stack multiple detective controls so the weaknesses of one are covered by the strengths of another.
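
The hash blind spot can be demonstrated directly. This sketch creates a throwaway file (the name `payroll.csv` and its contents are made up), then views and copies it the way a data thief would, and shows the digest never moves.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Return the SHA256 hex digest of a small file."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

with tempfile.TemporaryDirectory() as tmp:
    secret = Path(tmp) / "payroll.csv"
    secret.write_text("name,salary\nAlice,90000\n")
    before = sha256_of(secret)

    stolen_copy = Path(tmp) / "exfil.csv"
    shutil.copy(secret, stolen_copy)   # attacker copies the file
    _ = secret.read_text()             # attacker views the file

    after = sha256_of(secret)
    copy_digest = sha256_of(stolen_copy)

print(before == after)        # the original file's hash did not change
print(copy_digest == before)  # the stolen copy is a perfect duplicate
```

An integrity check run afterward would report nothing wrong, which is exactly why a hash check needs to be paired with a control that watches access, such as log analysis.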

Verifying a File with Hashes

Hashing a file is one of the most practical skills in this topic. Since the same file always produces the same hash, you can verify whether a file has been altered by hashing it at two different times and comparing the digests.

Generating Hashes on Different Systems

You can compute hashes using built-in command line tools. Here's how to generate a SHA256 hash of a file called testfile on each major OS:

Windows PowerShell:

Get-FileHash testfile -Algorithm SHA256

Linux (BASH):

sha256sum testfile

macOS (zsh):

shasum -a 256 testfile

You can also use websites that compute hashes or specialized software, but the command line is fast and free.

The Verification Process

The workflow is straightforward:

Hash the file when you know it's in a trusted state. Record the digest somewhere safe.
Later, hash the file again.
Compare the two digests.

If they match, the file hasn't changed. If they differ, something modified the file between the two hash checks. That could be a legitimate update, or it could be malware tampering.
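
If you would rather script the three steps than run the commands by hand, the workflow can be sketched like this. The helper names `sha256_file` and `is_unchanged` are invented for this example, and the file is a temporary stand-in for `testfile`.

```python
import hashlib
import tempfile
from pathlib import Path

def sha256_file(path, chunk_size=65536):
    """Hash a file in chunks so large files need not fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def is_unchanged(path, trusted_digest):
    """Compare the file's current digest with the recorded one."""
    # Lower-case the recorded value because some tools print uppercase hex.
    return sha256_file(path) == trusted_digest.lower()

with tempfile.TemporaryDirectory() as tmp:
    testfile = Path(tmp) / "testfile"
    testfile.write_bytes(b"known good contents\n")

    trusted = sha256_file(testfile)                  # step 1: record the digest
    first_check = is_unchanged(testfile, trusted)    # steps 2-3: re-hash, compare

    testfile.write_bytes(b"known good contents!\n")  # simulate tampering
    second_check = is_unchanged(testfile, trusted)

print(first_check)    # file untouched
print(second_check)   # file modified
```

The digest this produces for a file should match what `sha256sum` or `shasum -a 256` prints for the same file, since all of them compute SHA256.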

This is also how you verify downloads. Software publishers often post the official SHA256 hash of an installer. You hash the file you downloaded and compare. If the digests match, your copy is identical to the file the publisher hashed. If they don't, the download was corrupted or tampered with.

Spotting Application Attacks in Log Files

Web application logs record every request users send. Attackers have to send requests too, and their requests often contain telltale patterns. Knowing what to look for turns log files into an attack detector.

SQL Injection Indicators

SQL injection happens when an attacker sneaks SQL code into a user input field, hoping the database will execute it. Logs of user input often reveal the attempt. Watch for:

Quote characters: single quotes ('), double quotes ("), or backticks (`). Attackers use these to break out of input strings.
Boolean conditions like OR 1=1, which always evaluates to true and can bypass login checks.
The double dash (--), which starts a comment in SQL. Attackers use it to ignore the rest of a query.
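SQL control words in capital letters, like WHERE, IN, and FROM. Normal users rarely type these into a login form. (SQL itself ignores keyword case, so lowercase versions in an unexpected field deserve the same suspicion.)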

A login field that receives input like admin' OR 1=1 -- is almost certainly an attack attempt.
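
The indicators in the list above translate naturally into pattern checks. This is a minimal sketch with hand-written regular expressions, not a complete or production-grade detector; the indicator names are labels chosen for this example.

```python
import re

SQLI_INDICATORS = {
    "quote character": re.compile(r"['\"`]"),
    "boolean tautology": re.compile(r"\bOR\s+1\s*=\s*1\b", re.IGNORECASE),
    "SQL comment": re.compile(r"--"),
    # Case-sensitive on purpose: lowercase "in" and "from" are common
    # words in ordinary text and would flood the results.
    "SQL control word": re.compile(r"\b(WHERE|IN|FROM)\b"),
}

def sqli_indicators(user_input: str) -> list:
    """Return the names of all SQL injection indicators found in the input."""
    return [name for name, pattern in SQLI_INDICATORS.items()
            if pattern.search(user_input)]

print(sqli_indicators("admin' OR 1=1 --"))
print(sqli_indicators("jsmith"))
print(sqli_indicators("O'Brien"))
```

Notice that the harmless name O'Brien still trips the quote check. A single indicator is a reason to look closer, while several indicators in one field make an attack far more likely.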

Cross-Site Scripting (XSS) Indicators

XSS attacks inject malicious scripts into web pages that other users will view. The dead giveaway in logs is the <script>...</script> tag in user input. If someone submits a comment that contains <script>alert('xss')</script>, that's not a normal comment. It's an XSS probe. Tags in user input where they shouldn't be (like in a username field) deserve a closer look.
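
A log scanner can look for that tag with one regular expression. In this sketch, decoding URL-encoded input first is an added assumption (logged requests often store < as %3C); the function name is made up for the example.

```python
import re
from urllib.parse import unquote

# Matches an opening or closing script tag, ignoring case and stray spaces.
SCRIPT_TAG = re.compile(r"<\s*/?\s*script\b", re.IGNORECASE)

def has_script_tag(user_input: str) -> bool:
    """True if the input contains a script tag, raw or URL-encoded."""
    return SCRIPT_TAG.search(unquote(user_input)) is not None

print(has_script_tag("<script>alert('xss')</script>"))
print(has_script_tag("%3Cscript%3Ealert(1)%3C%2Fscript%3E"))
print(has_script_tag("Great post!"))
```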

Buffer Overflow Indicators

A buffer overflow tries to crash or hijack an application by sending more data than it expects. For web apps, you can detect attempts by checking the length of incoming data. Common fields to monitor:

URL length
Cookie length
Query string length
Total request length

A normal URL might be 50 to 200 characters. A URL that's 5,000 characters long is not a person browsing the site. It's likely a buffer overflow attempt.
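
A length check like this is one of the easiest detections to automate. The thresholds below are hypothetical placeholders; sensible limits depend on the application being monitored.

```python
# Hypothetical maximum lengths in characters; tune these per application.
MAX_LENGTHS = {
    "url": 2000,
    "cookie": 4096,
    "query_string": 1024,
    "total_request": 8192,
}

def oversized_fields(request_lengths: dict) -> list:
    """Return the fields whose logged length exceeds its threshold."""
    return [field for field, limit in MAX_LENGTHS.items()
            if request_lengths.get(field, 0) > limit]

# Made-up length measurements for two logged requests.
normal = {"url": 120, "cookie": 300, "query_string": 40, "total_request": 900}
attack = {"url": 5000, "cookie": 300, "query_string": 4900, "total_request": 5600}

print(oversized_fields(normal))
print(oversized_fields(attack))
```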

Directory Traversal Indicators

Directory traversal attacks try to escape a web app's intended folder and access files elsewhere on the server, like /etc/passwd. The classic indicator is sequences of ../ in the URL path of an HTTP GET request. Each ../ means "go up one directory."

A log entry like:

GET /images/../../../../etc/passwd HTTP/1.1

is a clear directory traversal attempt. Any request with multiple ../ sequences should trigger an alert.
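
Counting ../ sequences in a request line takes only a few lines of code. This sketch assumes the log stores the request line in the format shown above; decoding URL-encoded paths is an added assumption, since attackers sometimes write ../ as %2e%2e%2f.

```python
import re
from urllib.parse import unquote

# Matches request lines such as: GET /path HTTP/1.1
REQUEST_LINE = re.compile(r"^(?P<method>[A-Z]+) (?P<path>\S+) HTTP/\d(?:\.\d)?$")

def traversal_depth(request_line: str) -> int:
    """Count ../ sequences in the request path (0 if the line can't be parsed)."""
    match = REQUEST_LINE.match(request_line.strip())
    if not match:
        return 0
    path = unquote(match.group("path"))  # decode %2e%2e%2f-style encodings
    return path.count("../")

print(traversal_depth("GET /images/../../../../etc/passwd HTTP/1.1"))
print(traversal_depth("GET /images/%2e%2e/%2e%2e/etc/passwd HTTP/1.1"))
print(traversal_depth("GET /images/logo.png HTTP/1.1"))
```

An alerting rule could then fire whenever the count is two or more, matching the "multiple ../ sequences" guideline.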

Putting It Together

When analyzing logs for application attacks, you're scanning for these specific patterns across user input fields and request data. Many common attacks aren't subtle: the indicators above show up directly in the raw logs. The challenge is volume, which is why automated log analysis does the first pass and flags entries for a human to review.

How to Use This on the AP Cybersecurity Exam

Choosing a Detective Control

When a prompt describes a scenario and asks for the best detective control, match the control to the goal. Want instant detection with almost no cost? A honeypot. Need to verify a file wasn't altered? A cryptographic hash. Want organization-wide monitoring of data movement and have the budget? DLP. Always tie your choice back to cost, the sensitivity of the data, and its classification.

Evaluating a Method

If asked to evaluate or judge a detection method, hit three angles: speed (real-time vs. after the fact), what it catches, and what it misses. Saying a hash check is reliable but only flags an attack after the file is already changed shows you understand the tradeoff. Naming a specific false negative, like a honeypot missing an attacker who never touches it, earns the point.

Reading Log Files

For log analysis questions, learn to scan for the signatures rather than memorizing whole logs. SQL control words and symbols (', --, OR 1=1, WHERE, FROM) point to SQL injection. A <script> tag points to XSS. An unusually long URL, cookie, or query string points to a buffer overflow. A path with ../ sequences points to directory traversal. Then state both the indicator and the attack name.

Code and Command Recall

You may need to recognize or produce the hashing commands. Connect each to its system: Get-FileHash testfile -Algorithm SHA256 for Windows PowerShell, sha256sum testfile for BASH, and shasum -a 256 testfile for macOS zsh. Remember the verification idea too: same file gives the same digest, so a changed digest means the file was altered.

Common Misconceptions

"A hash check proves nobody stole the data." It only proves whether the file changed. An attacker can read and copy a file without altering it, and the hash stays identical. That is a classic false negative.
"A honeypot catches every intruder." It only catches attackers who actually interact with the bait. A careful adversary who avoids it leaves no honeypot alert behind.
"Detection and prevention are the same thing." Detective controls spot attacks; they do not block them on their own. They work alongside preventive controls, not instead of them.
"All detection happens in real time." Honeypots, real-time automated log analysis, and some DLP tools alert during an attack, but retrospective log analysis and hash checks only reveal attacks after they have already happened.
"Logs review themselves." Useful log analysis needs automation to keep up with the volume of entries. Without it, attacks hide in the noise.
"Any quote or keyword in a log means an attack." These are indicators, not proof. Legitimate input can sometimes contain similar characters, so indicators flag entries for review rather than confirm an attack by themselves.

Vocabulary

The following words are mentioned explicitly in the AP® course framework for this topic.

Term	Definition
accounting	The process of recording and monitoring user activities and data access to track when and by whom data are accessed.
alerts	Notifications generated by security tools when suspicious activity or attacks are detected.
application logs	Records of events and activities generated by an application that can be analyzed to detect suspicious behavior and security incidents.
buffer overflows	An attack where excessive data is sent to an application to overflow its memory buffer, potentially allowing an attacker to execute arbitrary code or crash the application.
critical data	Information essential to organizational operations that, if compromised or lost, would significantly impact the organization.
cross-site scripting attacks	Attacks in which malicious scripts are injected through user input to compromise web applications or steal user data.
cryptographic hash functions	Mathematical functions that generate a fixed-length digest, unique for all practical purposes, used to detect whether data have been altered or modified.
data classification	The process of categorizing data based on sensitivity levels such as private, educational, healthcare, or financial to determine appropriate security controls.
data digest	The output generated by a cryptographic hash function, unique to its input for all practical purposes, that can reveal if data have been changed.
data integrity	The assurance that data has not been altered or corrupted and remains accurate and complete.
data loss prevention (DLP)	Security services that monitor and control data access, usage, and transmission to detect and prevent unauthorized data movement or theft.
detective controls	Security controls that help identify attacks when they occur, such as intrusion detection systems and security information and event management (SIEM) systems.
directory traversal attacks	An attack where an adversary uses path sequences like '../' in HTTP requests to navigate outside the intended directory and access unauthorized files on a server.
DLP tools	Data Loss Prevention tools that monitor and control the movement of sensitive data to prevent unauthorized access or exfiltration.
false negatives	Instances where a detection system fails to identify an actual attack or security threat that has occurred.
hash	A fixed-length binary string output produced by a cryptographic hash function from an input of arbitrary length.
hash values	Cryptographic outputs used to verify data integrity by detecting any unauthorized changes to data.
honeypots	Decoy systems or resources designed to attract and detect attackers by monitoring unauthorized access attempts.
indicators of application attacks	Specific patterns, signatures, or anomalies in logs and user input that suggest an attempted or ongoing attack on an application.
log analysis	The examination and interpretation of system logs to identify and investigate security events and potential attacks.
malicious activity	Harmful actions or behaviors conducted by adversaries on networks, such as unauthorized access, data theft, or system compromise.
real-time automated log analysis	Automated systems that continuously monitor and analyze logs as events occur to provide immediate detection of attacks.
retrospective log analysis	The examination of logs after an attack has occurred to identify what happened and how the system was compromised.
sensitive data	Information that requires protection from unauthorized access, such as personal credentials, financial information, or private communications.
server logs	Records of server activities and requests that can be reviewed to identify indicators of attacks and unauthorized access attempts.
SHA256	A specific cryptographic hash function that produces a 256-bit hash output, commonly used to verify file integrity.
SQL control words	Reserved keywords in SQL language such as WHERE, IN, and FROM that are used to construct database queries and can indicate SQL injection attempts when found in unexpected user input.
SQL injection attacks	A type of application attack where malicious SQL code is inserted into user input fields to manipulate database queries and gain unauthorized access to data.

Frequently Asked Questions

What is a honeypot in AP Cybersecurity and how does it detect attacks?

A honeypot is a fake file designed to look like it contains valuable data, such as credit card numbers or passwords, but the data inside is not real. Because there is no legitimate reason for anyone to access it, any attempted access is treated as an indicator of malicious activity and triggers an alert for defenders.

How do cryptographic hashes detect if a file has been tampered with?

A cryptographic hash function always produces the same output for the same input, so you can hash a file when it is in a trusted state and record the digest. If you hash the file again later and the digest has changed, the file was altered between the two checks.

What are the indicators of SQL injection in a log file for AP Cybersecurity?

In application and server logs, SQL injection attempts often appear as user input containing single or double quote characters, boolean conditions like OR 1=1, a double dash used to start a SQL comment, or SQL control words in capital letters such as WHERE, IN, or FROM. Seeing these patterns in a login or input field is a strong sign of an attack attempt.

What is the difference between real-time and retrospective detection in AP Cybersecurity 5.6?

Real-time detection tools like honeypots, some DLP services, and automated log analysis alert defenders while an attack is in progress, allowing a faster response. Retrospective log analysis and cryptographic hash checks only identify that an attack occurred after the fact, which is still useful for forensics but cannot stop damage already done.

How do you choose the right detective control for data in AP Cybersecurity?

The three main criteria are cost, the sensitivity or criticality of the data, and the data's classification. Honeypots and hash checks are inexpensive options, while DLP services cost more but offer stronger coverage; highly sensitive or legally classified data such as healthcare or financial records may also carry regulatory requirements that dictate how closely it must be monitored.