How to Generate and Validate UUIDs in Any Programming Language

Learn what UUIDs are, understand the differences between UUID versions (v1-v5), when to use each version, and how to generate UUIDs in JavaScript, Python, Go, and Java.

2024-11-08 | 12 min

Related tool: UUID Generator

Use the tool alongside this guide for hands-on practice.

The problem UUIDs solve (and why I use them everywhere)

Before I understood UUIDs, I used auto-incrementing IDs for everything. Then I built my first system that needed to sync data between multiple databases at Šikulovi s.r.o. Suddenly, both databases wanted to create record ID 1,247. That's when I really got why UUIDs exist - they're unique for all practical purposes, without any coordination.

A UUID is 128 bits of data, usually displayed as 32 hex characters with hyphens: 550e8400-e29b-41d4-a716-446655440000. Any device, anywhere, can generate one and be confident it won't collide with any other UUID ever created. The math works out. Trust me on this one.

UUID vs GUID (spoiler: same thing)

I wasted time early in my career thinking these were different. They're not. Microsoft calls them GUIDs, everyone else calls them UUIDs. Same format, same algorithms, same size. If someone asks about the difference in an interview, they're testing if you know this trivia.

UUID: The standard term from RFC 4122 (since superseded by RFC 9562)
GUID: Microsoft terminology (same exact thing)
Both: 128 bits, 8-4-4-4-12 format
Case varies by library, but it doesn't matter functionally

Decoding the format

The format xxxxxxxx-xxxx-Mxxx-Nxxx-xxxxxxxxxxxx tells you more than you might think. M is the version (1-5), and N is the variant (should be 8, 9, a, or b for RFC 4122 UUIDs). Once you know this, you can glance at a UUID and know how it was generated.

Position 15 (counting from 1, right after the second hyphen): Version digit (1, 2, 3, 4, or 5)
Position 20 (counting from 1, right after the third hyphen): Variant digit (8, 9, a, or b)
Example: 550e8400-e29b-41d4-a716-... is v4 (random)
36 characters with hyphens, 32 without
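
Here is a minimal sketch of reading those two digits in Python. The helper name describe is my own, not a library function, and note that string indexes in code start at 0, so position 15 becomes index 14 and position 20 becomes index 19.

```python
import uuid


def describe(text: str) -> dict:
    """Report the version and variant of a canonical 36-character UUID string."""
    parsed = uuid.UUID(text)
    return {
        "version_char": text[14],   # 15th character, the M in the format
        "variant_char": text[19],   # 20th character, the N in the format
        "version": parsed.version,  # same information, decoded by the library
        "variant": parsed.variant,
    }


info = describe("550e8400-e29b-41d4-a716-446655440000")
print(info)
```

Reading the characters directly only works on the hyphenated form; the parsed attributes work whatever the input formatting was.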

v1: Time + MAC address (I rarely use this)

v1 embeds a timestamp and your machine's MAC address. You can recover creation order from the timestamp, which is nice, but the raw string doesn't sort chronologically because the low bits of the timestamp come first. It also reveals when and where each UUID was generated, which can be a privacy problem. I only use v1 when I explicitly need time-ordering and privacy isn't a concern.

Timestamp with 100-nanosecond precision
MAC address baked in (privacy concern)
Orderable by the embedded timestamp (not by a plain string sort)
Use case: Event logs, distributed tracing
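
To make the "when and where" part concrete, here is a small sketch that pulls the timestamp and node out of a v1 UUID with Python's uuid module. The function name v1_created_at is hypothetical; the time and node attributes are part of the standard library.

```python
import uuid
from datetime import datetime, timedelta, timezone

# v1 timestamps count 100-nanosecond intervals from the start of the
# Gregorian calendar, not from the Unix epoch.
GREGORIAN_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)


def v1_created_at(u: uuid.UUID) -> datetime:
    """Return the creation time embedded in a version 1 UUID."""
    if u.version != 1:
        raise ValueError("not a version 1 UUID")
    # u.time is the 60-bit timestamp; dividing by 10 gives microseconds.
    return GREGORIAN_EPOCH + timedelta(microseconds=u.time // 10)


fresh = uuid.uuid1()
print(v1_created_at(fresh))   # roughly "now", in UTC
print(f"{fresh.node:012x}")   # the 48-bit node, usually the MAC address

# Sorting by the embedded timestamp gives creation order.
batch = [uuid.uuid1() for _ in range(3)]
in_creation_order = sorted(batch, key=lambda u: u.time)
```

Anyone holding the UUID can run the same extraction, which is exactly the privacy concern.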

v2: Skip this one

v2 exists for DCE security stuff from the 90s. I've never used it, never seen it used, and most libraries don't even implement it. Pretend it doesn't exist.

DCE security - largely obsolete
Most libraries don't support it
Not recommended for new applications

v3 and v5: Deterministic from name

These are interesting - you give them a namespace and a name, and they always produce the same UUID. v3 uses MD5, v5 uses SHA-1. I use v5 when I need to generate consistent IDs from known inputs, like deriving a stable user ID from an email address.

Same input always gives same UUID
v3 uses MD5, v5 uses SHA-1 (prefer v5)
Requires namespace UUID + name string
Use case: Consistent IDs from emails, URLs, etc.
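
A minimal sketch of the email case in Python. The APP_NAMESPACE value, the user_id_for helper, and the choice to lowercase and trim the email first are all my assumptions for illustration; only uuid.uuid5 and uuid.NAMESPACE_DNS come from the standard library.

```python
import uuid

# Hypothetical application namespace, itself derived with v5 so that it is
# reproducible. Any fixed UUID would do.
APP_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "example.com")


def user_id_for(email: str) -> uuid.UUID:
    """Derive a stable v5 UUID from an email address."""
    # Normalizing first is an assumption: without it, "Ada@Example.com" and
    # "ada@example.com" would map to two different UUIDs.
    return uuid.uuid5(APP_NAMESPACE, email.strip().lower())


first = user_id_for("ada@example.com")
second = user_id_for("  Ada@Example.com ")
print(first, first == second)
```

The namespace matters as much as the name: the same email under a different namespace gives an unrelated UUID.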

v4: Random (my default)

v4 is what I use 95% of the time. 122 random bits. No coordination needed. No information leaked. crypto.randomUUID() in modern JS, uuid.uuid4() in Python. Simple, secure, done.

122 random bits (6 for version/variant)
No timestamp, no MAC, nothing predictable
Collision probability is effectively zero
My default for database IDs and request IDs (for session IDs and API keys, use a dedicated secure token generator)
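
If you're curious where the "6 for version/variant" goes, here is a sketch that builds a v4 UUID by hand from 16 random bytes. In real code I'd just call uuid.uuid4(); the make_v4 helper is hypothetical and exists only to show which bits are fixed.

```python
import os
import uuid
from typing import Optional


def make_v4(random_bytes: Optional[bytes] = None) -> uuid.UUID:
    """Build a version 4 UUID from 16 bytes of randomness."""
    raw = bytearray(os.urandom(16) if random_bytes is None else random_bytes)
    if len(raw) != 16:
        raise ValueError("need exactly 16 bytes")
    raw[6] = (raw[6] & 0x0F) | 0x40  # top 4 bits of byte 6: version 4
    raw[8] = (raw[8] & 0x3F) | 0x80  # top 2 bits of byte 8: RFC variant
    return uuid.UUID(bytes=bytes(raw))


print(make_v4())           # random, like uuid.uuid4()
print(make_v4(bytes(16)))  # all-zero input shows only the fixed bits
```

Four version bits plus two variant bits are overwritten, which leaves 128 - 6 = 122 bits of randomness.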

Which version do I actually use?

Choosing the right UUID version depends on your requirements. Most applications should use v4. Use v1 when ordering matters, and v5 when you need deterministic IDs.

v4 (random): Default choice for most applications, about 95% of my use cases
v1 (time-based): Event logs, distributed tracing, when I need sorting
v5 (deterministic): Consistent IDs from known inputs
v2, v3: Basically never

JavaScript: crypto.randomUUID() is all you need

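If you're on a modern browser (in a secure context) or Node.js 19+, just use the global crypto.randomUUID(). Done. On older Node.js versions (14.17+), the same function is available from the built-in crypto module. For v1, v3, or v5, install the uuid package from npm.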

Native: crypto.randomUUID() - v4, built-in
npm: uuid package for other versions
import { v4 as uuidv4 } from "uuid"
v5: uuidv5('name', uuidv5.URL)

Python: built into the standard library

Python's uuid module has everything. No pip install needed. I use this constantly for quick scripts and data processing.

import uuid
uuid.uuid4() - random
uuid.uuid1() - time-based
uuid.uuid5(uuid.NAMESPACE_URL, "https://example.com") - deterministic
str(uuid.uuid4()) for string output

Go: use google/uuid

Go doesn't have UUIDs in the standard library, which is annoying. The google/uuid package is the de facto standard. go get github.com/google/uuid and you're good.

go get github.com/google/uuid
uuid.New() - returns v4 (panics on failure; uuid.NewRandom() returns an error instead)
uuid.NewUUID() - returns v1 and an error
uuid.Parse() for validation

Java: UUID.randomUUID()

Java's built-in java.util.UUID handles v3 and v4. For v1 or v5, you'll need java-uuid-generator or similar.

UUID.randomUUID() - v4
UUID.nameUUIDFromBytes() - v3
UUID.fromString() to parse
For v1/v5: java-uuid-generator library

Validation (quick tip)

Most libraries have a parse function that throws on invalid input. That's your validator. If you need regex, here it is:

Regex: ^[0-9a-f]{8}-[0-9a-f]{4}-[1-5][0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$
Case-insensitive: Add i flag or use [0-9a-fA-F]
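JavaScript: Use regex or validate() from the uuid package
Python: Try uuid.UUID(string) and catch ValueError (it also accepts braces, a urn:uuid: prefix, and hex without hyphens)
Go: _, err := uuid.Parse(string); err != nil means invalid (it also accepts braces, a urn:uuid: prefix, and hex without hyphens)
Java: Try UUID.fromString() and catch IllegalArgumentException (lenient: short groups such as 1-1-1-1-1 still parse)
Version check: Extract the character at index 14, counting from 0 (version digit)
Variant check: Extract the character at index 19, counting from 0 (should be 8, 9, a, or b)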
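
Here is a sketch that puts the regex and the parse-and-catch approach side by side in Python, so you can see where they disagree. The function names is_strict_uuid and is_parseable_uuid are mine, not part of any library.

```python
import re
import uuid

# Same pattern as the regex above, without the anchors because fullmatch()
# already requires the whole string to match.
UUID_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[1-5][0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}",
    re.IGNORECASE,
)


def is_strict_uuid(text: str) -> bool:
    """True only for the canonical hyphenated form with version 1-5."""
    return UUID_RE.fullmatch(text) is not None


def is_parseable_uuid(text: str) -> bool:
    """True for anything uuid.UUID() accepts, which is more forgiving."""
    try:
        uuid.UUID(text)
    except ValueError:
        return False
    return True


for candidate in (
    "550e8400-e29b-41d4-a716-446655440000",
    "{550e8400-e29b-41d4-a716-446655440000}",
    "00000000-0000-0000-0000-000000000000",
    "not-a-uuid",
):
    print(candidate, is_strict_uuid(candidate), is_parseable_uuid(candidate))
```

Which one you want depends on the job: the strict check suits input that must already be canonical, the parser suits input you plan to normalize anyway.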

"But what about collisions?"

I get asked this a lot. Short answer: don't worry about it. The math is overwhelming. You'd need to generate a billion UUIDs per second for about 86 years to have a 50% chance of one collision. Your server will die of old age first. Just generate and use them without checking for duplicates.

2^122 possible values (5.3 x 10^36)
Collision probability is effectively zero
You're more likely to get hit by a meteor
No need to check database for uniqueness
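
If you want to check the math yourself, this sketch uses the standard birthday-problem approximation, n ≈ sqrt(2 · N · ln(1 / (1 - p))), with N = 2^122. The function name and the billion-per-second rate are just the assumptions from the paragraph above.

```python
import math

SPACE = 2 ** 122  # number of distinct v4 UUIDs


def uuids_for_collision_probability(p: float) -> float:
    """Approximate how many random UUIDs give probability p of a collision."""
    return math.sqrt(2 * SPACE * math.log(1 / (1 - p)))


needed = uuids_for_collision_probability(0.5)
seconds = needed / 1e9                    # at one billion UUIDs per second
years = seconds / (365.25 * 24 * 3600)

print(f"{needed:.3e} UUIDs, about {years:.0f} years")
```

That comes out to roughly 2.7 x 10^18 UUIDs before a collision becomes a coin flip.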

UUIDs vs auto-increment: the tradeoff

Auto-increment IDs are smaller (4-8 bytes vs 16) and index better. But they leak information (user 1,247 means you have at least 1,247 users), require database coordination in distributed systems, and can't be generated client-side. I use UUIDs for public-facing IDs and sometimes auto-increment internally.

Auto-increment: Compact, fast, but leaky and needs coordination
UUID: Larger, random, but safe and coordination-free
My approach: UUID for external IDs, sometimes auto-increment internally

Performance: yes, there are tradeoffs

Random UUIDs are bad for B-tree indexes because inserts land all over the index and cause page splits. If you're doing millions of inserts, this matters. Solutions: use a time-ordered ID (ULID, or v1 with its timestamp fields reordered) and store the value as 16 bytes of binary instead of varchar. PostgreSQL's native UUID type already stores it as 16 bytes.

varchar(36) wastes space - use binary(16) or native UUID type
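Random inserts fragment indexes - consider a time-ordered ID such as ULID for high volume
MySQL: UUID_TO_BIN() and BIN_TO_UUID(); UUID_TO_BIN(uuid, 1) also reorders v1 time fields so rows index in creation order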
PostgreSQL: Native UUID type is already efficient
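
The size difference is easy to see from Python. This is only a sketch of the two representations; how you bind the bytes to a binary(16) column depends on your database driver, which I'm not assuming anything about here.

```python
import uuid

u = uuid.UUID("550e8400-e29b-41d4-a716-446655440000")

as_text = str(u)     # what a varchar(36) column would hold
as_binary = u.bytes  # what a binary(16) column would hold

print(len(as_text), len(as_binary))

# Reading it back: rebuild the UUID from the stored bytes.
restored = uuid.UUID(bytes=as_binary)
print(restored == u)
```

More than halving the key size also shrinks every index that includes it.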

Alternatives worth knowing

UUIDs aren't the only option. ULID gives you sorting. Snowflake gives you 64-bit IDs with timestamps. NanoID gives you URL-safe short IDs. I still use UUIDs most of the time, but these are worth considering for specific use cases.

ULID: Sortable, same size as UUID, lexicographically ordered
Snowflake: 64-bit, timestamp + machine + sequence (Twitter, Discord)
NanoID: URL-safe, customizable length
CUID2: Designed for horizontal scaling

Mistakes I see constantly

After reviewing a lot of code at Šikulovi s.r.o., these are the UUID antipatterns I see most often:

Math.random() for UUIDs - use crypto.randomUUID()
Assuming UUIDs are sequential - v4 is random, and v1 only orders by its embedded timestamp
varchar(36) in the database - use native UUID type
v1 in privacy-sensitive contexts - it leaks your MAC address
Case-sensitive comparison - normalize case (or parse both sides) before comparing

My UUID decision tree

After 10+ years of building systems at Šikulovi s.r.o., here's my mental model: v4 for about 95% of cases. It's random, safe, simple. v1 if you need time-ordering (logs, events). v5 if you need reproducibility (URL-to-ID mapping). That's literally it.

The best part? Most languages have native or near-native support now. crypto.randomUUID() in JavaScript, uuid.uuid4() in Python, UUID.randomUUID() in Java, and uuid.New() from google/uuid in Go. Generating unique identifiers is a solved problem. That's why I included a UUID generator in CodeUtil - for those quick moments when you just need a valid UUID without writing code. Use the tools, don't overthink it, and move on to the interesting parts of your application.

FAQ

What UUID version should I use?

Use UUID v4 (random) for most applications. It requires no coordination, reveals no information, and has extremely low collision probability. Use v1 if you need time-ordering, and v5 if you need deterministic IDs from consistent inputs.

Can two UUIDs ever be the same?

Theoretically yes, but practically no. UUID v4 has 2^122 possible values. The probability of generating duplicate UUIDs is so low that you would need to generate billions per second for decades to have a meaningful chance of collision. For all practical purposes, UUIDs are unique.

Is UUID v4 cryptographically secure?

UUID v4 is only as unpredictable as the random source behind it. crypto.randomUUID() and Python's uuid.uuid4() draw from a cryptographically secure generator, but not every library is required to, so check yours. Either way, UUIDs are identifiers, not secrets. Do not use UUIDs as passwords, tokens, or encryption keys. Use dedicated secrets management for security-sensitive values.

Why does UUID v1 reveal privacy information?

UUID v1 includes the MAC address of the generating machine and a timestamp. This reveals which device created the UUID and when. In privacy-sensitive applications, use UUID v4 instead, which contains only random data.

How do I generate the same UUID from the same input?

Use UUID v5 (or v3) with a namespace and name. Given the same namespace UUID and name string, v5 always generates the same UUID. This is useful for creating consistent identifiers for known resources like URLs or email addresses.

Should I use UUID or auto-increment for primary keys?

Use auto-increment for internal-only IDs where performance is critical. Use UUIDs for IDs exposed in APIs or URLs, distributed systems, or when you need to generate IDs without database access. Many applications use both: auto-increment internally and UUIDs externally.

What is the nil UUID?

The nil (or null) UUID is 00000000-0000-0000-0000-000000000000 (all zeros). It is a valid UUID that represents the absence of a value, similar to null. Some systems use it as a placeholder or default value. Be aware of this when validating UUIDs: a regex that requires a version digit of 1-5 rejects it, while most parse functions accept it.
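
A short sketch of treating the nil UUID explicitly in Python. The NIL constant and is_nil helper are names I made up for this example; uuid.UUID(int=0) is the standard library call that builds the all-zero value.

```python
import uuid

NIL = uuid.UUID(int=0)  # 128 zero bits


def is_nil(value: uuid.UUID) -> bool:
    """True when the UUID is the all-zero placeholder."""
    return value == NIL


print(NIL)
print(is_nil(uuid.UUID("00000000-0000-0000-0000-000000000000")))
print(is_nil(uuid.uuid4()))
```

Checking for it explicitly lets you decide per field whether "no value" is acceptable, instead of leaving that to whichever validator happens to run.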

Are UUIDs case-sensitive?

UUIDs are case-insensitive by specification. "550E8400-E29B-41D4-A716-446655440000" and "550e8400-e29b-41d4-a716-446655440000" represent the same UUID. However, lowercase is conventional. When comparing UUIDs, normalize the case first to avoid mismatches.
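
One way to normalize in Python is to parse both sides before comparing, sketched below. The same_uuid helper is hypothetical; comparing uuid.UUID objects compares the underlying 128-bit value, so case never enters into it.

```python
import uuid


def same_uuid(left: str, right: str) -> bool:
    """Compare two UUID strings by value rather than by spelling."""
    return uuid.UUID(left) == uuid.UUID(right)


upper = "550E8400-E29B-41D4-A716-446655440000"
lower = "550e8400-e29b-41d4-a716-446655440000"

print(upper == lower)             # plain string comparison
print(same_uuid(upper, lower))    # value comparison
print(str(uuid.UUID(upper)))      # str() always gives the lowercase form
```

If you store UUIDs as text, normalizing with str(uuid.UUID(value)) on the way in saves you from doing this on every read.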

Martin Šikula

Founder of CodeUtil. Web developer building tools I actually use. When I'm not coding, I experiment with productivity techniques (with mixed success).
