String Escaping Guide: JSON, HTML, URL, and More

After debugging my third XSS vulnerability caused by improper escaping, I finally sat down and learned the rules properly. Here is my complete guide to escaping strings for JSON, HTML, URLs, and more - with the gotchas that bit me.

2026-02-26 · 9 min
Related tool: String Escape/Unescape

Use the tool alongside this guide for hands-on practice.

The escaping bug that taught me a lesson

I thought I understood escaping until a security audit at Šikulovi s.r.o. found an XSS vulnerability in a form I had built. User input with angle brackets was rendering as HTML. Classic mistake, but I made it because I did not fully grasp which context needed which escaping.

String escaping converts special characters into safe representations that do not break the containing format. A newline becomes \n in JSON, < becomes &lt; in HTML. Without proper escaping, special characters get interpreted as syntax - causing errors at best, security holes at worst.

The tricky part is that every format has different rules. What works for JSON breaks in HTML. URL encoding differs from both. I finally sat down and learned them all properly after that security audit, and that is what this guide covers.

JSON escaping: the most common case

JSON escaping is probably what you need most often. When embedding text in JSON, you must escape double quotes, backslashes, and control characters like newlines.

The escape sequences are: \" for double quotes, \\ for backslash, \n for newline, \r for carriage return, \t for tab, \f for form feed, \b for backspace. Any other control character (U+0000 through U+001F) must be written as \uXXXX, and any other character may be written that way too.

  • Always escape: \ (backslash), " (double quote)
  • Control characters: \n (newline), \r (carriage return), \t (tab)
  • Forward slash can be escaped as \/ but usually isn't required
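  • Unicode: \u0000 through \uFFFF covers the Basic Multilingual Plane; characters beyond U+FFFF (most emoji, for example) are written as a surrogate pair such as \uD83D\uDE00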
  • Single quotes don't need escaping in JSON (JSON uses double quotes only)
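
To make these rules concrete, here is a minimal sketch using Python's standard json module. The sample string is made up for illustration. In practice I let the serializer do the escaping rather than writing backslashes by hand.

```python
import json

# Hypothetical input containing a double quote, a backslash, a newline and a tab.
raw = 'She said "hi"\\ and left\n\tbye'

# json.dumps produces a complete JSON string literal, quotes included.
encoded = json.dumps(raw)
print(encoded)  # "She said \"hi\"\\ and left\n\tbye"

# Decoding the literal gives back the original text unchanged.
decoded = json.loads(encoded)

# By default non-ASCII characters are \u-escaped; ensure_ascii=False keeps them literal.
ascii_only = json.dumps("Šikula")                    # "\u0160ikula"
literal = json.dumps("Šikula", ensure_ascii=False)   # "Šikula"

# Single quotes are left alone, as the list above says.
single = json.dumps("it's")                          # "it's"
```

The round trip through json.loads is a cheap sanity check that nothing was lost or mangled on the way.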

JavaScript escaping: single quotes matter

JavaScript escaping is similar to JSON but also handles single quotes since JS strings can use either delimiter. If your string is in single quotes, you need to escape single quotes. In double quotes, escape double quotes.

  • In 'single quoted' strings: escape \' (single quote) and \\ (backslash)
  • In "double quoted" strings: escape \" (double quote) and \\ (backslash)
  • Template literals (`backticks`): escape \` (backtick) and \${ (to avoid interpolation)
  • Control characters work the same as JSON: \n, \r, \t
  • JavaScript also has \0 for null character

HTML escaping: preventing XSS

This is where my security audit failure happened. HTML escaping is not just about syntax - it is about security. Display user input without escaping and they can inject arbitrary HTML and JavaScript. This is XSS, and it is still one of the most common web vulnerabilities I see in audits.

HTML uses entity references: &lt; for <, &gt; for >, &amp; for &, &quot; for double quotes. Once I memorized these four, XSS stopped being a mystery. The browser sees text, not markup.

  • < becomes &lt; - prevents opening tags
  • > becomes &gt; - prevents closing tags
  • & becomes &amp; - must be escaped first to avoid double-encoding
  • " becomes &quot; - important in attribute values
  • ' becomes &#39; (or &apos; in HTML5 and XML; HTML 4 does not define &apos;) - important in attribute values
  • Modern frameworks (React, Vue, Angular) auto-escape by default
  • Use innerHTML or dangerouslySetInnerHTML only with sanitized content
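
Here is a small sketch of the same rules using html.escape from Python's standard library. The input strings are invented examples of hostile input. Note that html.escape emits &#x27; for a single quote, which is the hexadecimal form of &#39;.

```python
import html

# Hypothetical untrusted input, e.g. the body of a comment form.
user_input = '<script>alert("xss")</script> & \'more\''

# quote=True is the default, so " and ' are escaped along with < > &.
safe = html.escape(user_input)
print(safe)
# &lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt; &amp; &#x27;more&#x27;

# Attribute values need the quote escaping: without it this input
# would close the attribute and add an event handler.
attr_value = 'x" onmouseover="alert(1)'
tag = f'<input value="{html.escape(attr_value)}">'
print(tag)
# <input value="x&quot; onmouseover=&quot;alert(1)">
```

This only covers text content and quoted attribute values. It is not a sanitizer for untrusted HTML that you actually want to render as markup.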

URL encoding: percent-encoding

URL encoding (percent-encoding) converts characters that aren't allowed in URLs into %XX format where XX is the hex value of the byte. Space becomes %20, & becomes %26, = becomes %3D.

This is especially important for query parameters. If your search query is 'cats & dogs', without encoding the & would be interpreted as a parameter separator.

  • Space: %20 (or + in query strings)
  • Reserved characters: ! # $ & ' ( ) * + , / : ; = ? @ [ ]
  • Use encodeURIComponent() for values, encodeURI() for full URLs
  • Don't double-encode - %20 should not become %2520
  • UTF-8 characters become multiple %XX sequences
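
The same ideas in Python, as a rough sketch with urllib.parse. Here quote(..., safe="") plays roughly the role that encodeURIComponent() plays in JavaScript, although the two differ slightly in which punctuation they leave alone. The query values are made up.

```python
from urllib.parse import quote, quote_plus, unquote, urlencode

query = "cats & dogs"

# Encode a single value: space -> %20, & -> %26.
as_component = quote(query, safe="")
print(as_component)  # cats%20%26%20dogs

# Form-style encoding for query strings uses + for spaces.
as_form = quote_plus(query)
print(as_form)  # cats+%26+dogs

# urlencode builds a whole query string and encodes each value.
qs = urlencode({"q": query, "page": 2})
print(qs)  # q=cats+%26+dogs&page=2

# Non-ASCII characters are encoded as their UTF-8 bytes.
non_ascii = quote("Š")
print(non_ascii)  # %C5%A0

# Decoding reverses the component encoding.
restored = unquote(as_component)
```

Building the query string with urlencode rather than string concatenation means each value is encoded exactly once.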

Regex escaping: treating metacharacters literally

Regex has many metacharacters with special meanings: . matches any character, * means repeat, + means one or more. When you want to match these literally, they need escaping.

I've been burned by this more times than I can count. Want to search for 'file.txt'? Without escaping the dot, you'll also match 'fileXtxt', 'file-txt', anything with exactly one character between 'file' and 'txt'.

  • Metacharacters to escape: . * + ? ^ $ { } [ ] \ | ( )
  • Escape with backslash: \. matches a literal dot
  • Most languages have a regex escape function - use it
  • JavaScript: str.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
  • When building regex from user input, always escape first
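
A quick sketch of the difference in Python, where the built-in escape function is re.escape. The filename stands in for any user-supplied search term.

```python
import re

# Hypothetical user-supplied search term.
filename = "file.txt"

# Used as-is, the dot is a metacharacter that matches any single character.
unescaped = re.compile(filename)

# re.escape puts a backslash in front of the dot so it matches literally.
escaped_source = re.escape(filename)
escaped = re.compile(escaped_source)
print(escaped_source)  # file\.txt

loose_hit = unescaped.fullmatch("fileXtxt") is not None   # True: "." matched "X"
strict_hit = escaped.fullmatch("fileXtxt") is not None    # False
exact_hit = escaped.fullmatch("file.txt") is not None     # True
```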

SQL escaping: prefer parameterized queries

I have strong opinions here. SQL escaping traditionally means doubling single quotes: O'Brien becomes O''Brien. But do not do this manually. I refuse to even teach the technique properly because I do not want anyone using it in production.

Use parameterized queries. Always. They separate the SQL structure from the data, so a value can never be parsed as SQL. Every single SQL injection I have seen in code reviews at Šikulovi s.r.o. came from manual string building. Every single one.

  • Basic SQL escaping: ' becomes '' (two single quotes)
  • But seriously: use parameterized queries / prepared statements
  • In Node.js: connection.query('SELECT * FROM users WHERE id = ?', [userId])
  • In Python: cursor.execute('SELECT * FROM users WHERE id = %s', (user_id,)) - the placeholder depends on the driver (%s in psycopg2 and the MySQL drivers, ? in sqlite3)
  • String escaping tools are for debugging and learning, not production SQL
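
To show what "separating structure from data" looks like, here is a self-contained sketch using Python's built-in sqlite3 module with an in-memory database. The table, names, and hostile input are invented for the demo, and sqlite3 uses ? as its placeholder. The string-built query at the end is there for contrast only.

```python
import sqlite3

# In-memory database with a hypothetical users table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("O'Brien",), ("Alice",)])

# Classic injection payload, passed in as if it were a name.
hostile = "x' OR '1'='1"

# Parameterized: the payload is compared as a plain string and matches nothing.
rows = conn.execute("SELECT name FROM users WHERE name = ?", (hostile,)).fetchall()
print(rows)  # []

# A legitimate value containing a quote needs no manual escaping.
rows_ok = conn.execute("SELECT name FROM users WHERE name = ?", ("O'Brien",)).fetchall()
print(rows_ok)  # [("O'Brien",)]

# For contrast only, never do this: string building lets the payload
# rewrite the WHERE clause, so every row comes back.
unsafe_sql = f"SELECT name FROM users WHERE name = '{hostile}'"
leaked = conn.execute(unsafe_sql).fetchall()
print(len(leaked))  # 2

conn.close()
```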

CSV escaping: fields with special characters

CSV files have simple escaping rules: if a field contains commas, newlines, or double quotes, wrap it in double quotes and double any quotes inside.

This catches people off guard. A simple field like 'Smith, John' breaks the CSV structure unless properly escaped to '"Smith, John"'.

  • Fields with commas must be quoted: "John, Smith"
  • Fields with newlines must be quoted
  • Fields with quotes must be quoted with doubled quotes: "He said ""Hello"""
  • Fields without special characters need no escaping
  • Use a proper CSV library - the edge cases are tricky
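
As an illustration of letting a library handle the edge cases, here is a short sketch with Python's csv module. The rows are made-up data that hit each rule from the list: a comma, embedded quotes, and a newline inside a field.

```python
import csv
import io

# Hypothetical rows covering the tricky cases.
rows = [
    ["name", "quote"],
    ["Smith, John", 'He said "Hello"'],
    ["plain", "line one\nline two"],
]

# Write to an in-memory buffer; the writer quotes only the fields that need it.
buf = io.StringIO()
csv.writer(buf).writerows(rows)
text = buf.getvalue()
print(text)
# name,quote
# "Smith, John","He said ""Hello"""
# plain,"line one
# line two"

# Reading it back recovers the original fields, including the embedded newline.
parsed = list(csv.reader(io.StringIO(text, newline="")))
```

The writer terminates rows with \r\n by default, and when you read from a real file the csv documentation recommends opening it with newline="".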

FAQ

What is the difference between escaping and encoding?

The terms are often used interchangeably. Strictly speaking, escaping adds special characters (like backslashes) to represent problematic characters, while encoding converts to a different representation entirely (like percent-encoding). In practice, both make strings safe for their intended context.

Why do different formats have different escaping rules?

Each format has different reserved characters and different ways of representing strings. JSON uses double quotes and backslash escapes. HTML uses angle brackets and entity references. URLs use reserved characters for structure. Each needed its own escaping scheme.

Is HTML escaping enough to prevent XSS?

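For text content, yes - escaping < > & " ' prevents tag injection. But other contexts have their own rules: inside script tags, event handler attributes, or URL attributes such as href (where a javascript: URL needs no angle brackets at all), HTML escaping alone is not enough. If you have to render untrusted HTML as markup, run it through a proper sanitization library like DOMPurify.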

When should I manually escape strings vs using library functions?

Always prefer library functions and framework features. Manual escaping is error-prone and easy to forget. React's JSX auto-escapes, SQL prepared statements handle escaping, URL APIs handle encoding. Manual escaping is mainly useful for debugging and understanding.

What is double escaping and how do I avoid it?

Double escaping happens when you escape an already-escaped string. & becomes &amp; then becomes &amp;amp;. Avoid it by escaping only once, at the last moment before output. If you're seeing &amp;amp; in your output, something is escaping twice.
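
Double escaping is easy to reproduce. This small Python sketch shows both the HTML and the URL flavor, using made-up inputs:

```python
import html
from urllib.parse import quote

# HTML: the second pass escapes the & that the first pass introduced.
once = html.escape("Tom & Jerry")
twice = html.escape(once)
print(once)   # Tom &amp; Jerry
print(twice)  # Tom &amp;amp; Jerry

# URLs: the % from the first pass is itself encoded as %25.
url_once = quote("a b")
url_twice = quote(url_once)
print(url_once)   # a%20b
print(url_twice)  # a%2520b
```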

How do I escape user input for SQL queries?

Don't escape it - use parameterized queries instead. Virtually every database driver supports prepared statements where you pass values separately from the query. The values are then never interpreted as SQL, which closes off injection through them and removes the need for manual escaping.

Martin Šikula

Founder of CodeUtil. Web developer building tools I actually use. When I'm not coding, I experiment with productivity techniques (with mixed success).
