REST APIs: use UUID to identify a Resource

The fundamental concept in any RESTful API is the resource, and one of the main issues is how to identify it uniquely. Let's go through the most commonly used techniques and patterns; at the end I'll give my favorite choice.

TLDR

  • A REST API can expose resources through UUID identifiers (to avoid enumeration).
  • If the resources have well-defined logical keys, it is better to use a UUID-V3 generated from the logical key.
  • If a resource doesn't have a logical key, you can use a random UUID (UUID-V4).
  • The backend should use and store its own immutable primary keys for its internal logic (for example, auto-generated numeric IDs).

1. Database Auto Generated Primary Keys

This is one of the most widely used types of identifier, especially when the resource is stored in a relational database. It is the simplest way to identify a resource, since the DB engine guarantees uniqueness. You often find this kind of resource in legacy (monolithic) systems that have been exposed to other services through a modern REST layer. For example, an old app that needs a fresh UI or some integrations with third parties may expose its internal IDs directly. The resource then usually looks like this:

https://music.app/song/1

Usually resources (songs) map one-to-one to stored table rows; in the previous example we are identifying the song (the row) with ID 1.

Advantages

  • Simple: you can autogenerate it using your DB.
  • The ID is mandatory and not editable. It never changes, and that really is a big advantage.

Drawbacks

  • Not suitable for distributed systems, since every node would have to coordinate on the same counter.
  • Very hard to synchronize environments or external services. Production and test environments are never fully synchronized, so the same ID can point to different records, and overlaps can be a very big problem (especially for your customers).
  • Security should always be taken into account. Enumerating this kind of resource is easy: you only need to increment the ID counter to find new records (resources).
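
To make the enumeration risk concrete, here is a minimal Python sketch of how a client could guess resource URLs from a sequential ID. The function name `enumerate_song_urls` is made up for illustration; only the `https://music.app/song/<id>` shape comes from the example above.

```python
def enumerate_song_urls(base_url: str, start: int, count: int) -> list[str]:
    """Build candidate resource URLs by simply incrementing a numeric ID."""
    return [f"{base_url}/song/{song_id}" for song_id in range(start, start + count)]


# Knowing a single valid ID (1) is enough to guess its neighbours.
candidates = enumerate_song_urls("https://music.app", 1, 3)
for url in candidates:
    print(url)
```

Nothing here is specific to songs: any resource exposed through an auto-incremented key can be walked the same way.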

2. Logical Key(s)

Another commonly used approach is to identify the resource with a logical key. It is usually used when the resources form a closed domain, for example the days of the week or country names. Good resources of this kind are:

https://my.app/weekday/monday
https://my.app/country/italy

Advantages

  • Easy to interpret for users and API developers: the identifier tells you what the resource contains.
  • Easy to synchronize data across environments: no overlaps, and customers are happy about that.
  • If you use the logical key as a route (in a web app, for example), it is very expressive.

Drawbacks

  • You need to guarantee uniqueness, so the backend has to be more sophisticated.
  • Not suitable for relational data, and if you want to use it as a PK (and FK) you are mad.
  • The logical key can usually be modified. Imagine that your resource is identified by a company name: company names change. It is not that common, but it is possible. And when you change the logical key, you need to change all its references (FK)… Blood…
  • When the resource can only be identified by a set of fields, you need to craft the logical key (concatenation? another pattern?).

So in real life the logical key is usually exposed to the world, while the app uses a numeric primary key internally.
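
For resources identified by several fields, one possible way to craft the logical key is to normalize each field and join them with a separator. The sketch below is only one option among many: the helper `make_logical_key`, the `|` separator, the lowercasing rule, and the sample field values are all assumptions made for illustration.

```python
def make_logical_key(*fields: str, separator: str = "|") -> str:
    """Join normalized fields into a single logical key.

    Each field is trimmed and lowercased so that the same logical resource
    always yields the same key. Fields containing the separator are rejected,
    otherwise two different field sets could produce the same key.
    """
    parts = []
    for field in fields:
        normalized = field.strip().lower()
        if separator in normalized:
            raise ValueError(f"field {field!r} contains the separator {separator!r}")
        parts.append(normalized)
    return separator.join(parts)


# Hypothetical example: an album identified by its title and its artist.
key = make_logical_key("Abbey Road", " The Beatles ")
print(key)
```

Whatever rule you pick, it has to be stable over time, because every environment must derive the same key from the same fields.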

3. The right way: expose UUID-V3 — store PK

This is the best approach I have found for developing my APIs.

  1. I use autogenerated serial primary keys (from the DB) inside the backend, so references (FK) always use a numeric PK.
  2. Then I decorate resources with a logical key, for example the code or the name. For domains the logical key is simple; for complex resources it may be a concatenation of fields.
  3. Every exposed resource has a UUID-V3 generated from the logical key.

A UUID-V3 is a universally unique identifier generated from an MD5 hash of a namespace and a name. Formally the namespace is itself a UUID; it usually represents my app name (or endpoint), and the name is the entity plus the logical key.

For example, imagine that your app identifies users by their username. You can expose your users both by logical key (the username) and by UUID-V3:

https://my.app/users/andrea
https://my.app/users/92068a6a-822a-3865-9970-bd4bc3e96f2c
// NB. the UUID is always the same for the name andrea. I used the
// string user_andrea as logical key
// https://uuidonline.com/?version=3&namespace=user_andrea
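
As a rough sketch, the same idea can be reproduced with Python's standard `uuid` module. The namespace below is an assumption: I derive it from the DNS name `my.app` with `uuid.NAMESPACE_DNS`, and `resource_uuid` is a hypothetical helper. Because the result depends on the namespace, the value printed here will not necessarily match the UUID shown in the URL above.

```python
import uuid

# Assumption: the application namespace is derived from its DNS name.
APP_NAMESPACE = uuid.uuid3(uuid.NAMESPACE_DNS, "my.app")


def resource_uuid(entity: str, logical_key: str) -> uuid.UUID:
    """Return the UUID-V3 for an entity type and its logical key."""
    # The name is "entity + logical key", e.g. "user_andrea".
    return uuid.uuid3(APP_NAMESPACE, f"{entity}_{logical_key}")


first = resource_uuid("user", "andrea")
second = resource_uuid("user", "andrea")

print(first)            # same value on every run and in every environment
print(first == second)  # True: UUID-V3 is deterministic
print(first.version)    # 3
```

Since the UUID is computed and not stored, a test and a production environment that share the namespace agree on the identifier of `andrea` without exchanging any data.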

Advantages

  • The backend uses serial numeric PKs, so I keep the power of relational databases and foreign keys.
  • UUIDs are the best choice for synchronization, especially for resources with composite logical keys, which the application would otherwise have to handle field by field, and that is not so easy.
  • You can expose both the UUID for complex resources and the logical key for simple resources like a domain, and switch between UUID and logical key according to your needs. For example, a GUI route can use the logical key.

Drawbacks

  • The backend must be implemented well enough to understand and expose both the logical key and the UUID. I always expose the UUID-V3, and I usually also expose the logical key if it is composed of a single field.
  • When you update the logical key (if permitted), you also need to update the UUID. But since the app's primary key stays the same (autogenerated by the DB), this operation is not so expensive.
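
The sketch below shows how these pieces could fit together, using an in-memory SQLite database. It is an illustration under assumptions, not a reference design: the `users` table, the `public_id` column, the `my.app` namespace and the helper functions are all hypothetical.

```python
import sqlite3
import uuid

# Assumption: the application namespace is derived from its DNS name.
APP_NAMESPACE = uuid.uuid3(uuid.NAMESPACE_DNS, "my.app")


def user_uuid(username: str) -> str:
    """UUID-V3 exposed to the outside world, derived from the logical key."""
    return str(uuid.uuid3(APP_NAMESPACE, f"user_{username}"))


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE users (
        id        INTEGER PRIMARY KEY AUTOINCREMENT,  -- internal key, used by FKs
        username  TEXT NOT NULL UNIQUE,               -- logical key
        public_id TEXT NOT NULL UNIQUE                -- exposed UUID-V3
    )
    """
)


def create_user(username: str) -> int:
    cur = conn.execute(
        "INSERT INTO users (username, public_id) VALUES (?, ?)",
        (username, user_uuid(username)),
    )
    return cur.lastrowid


def rename_user(user_id: int, new_username: str) -> None:
    # Only this row changes: foreign keys point to `id`, which stays the same.
    conn.execute(
        "UPDATE users SET username = ?, public_id = ? WHERE id = ?",
        (new_username, user_uuid(new_username), user_id),
    )


def find_user(identifier: str):
    """Resolve a user from either the exposed UUID or the logical key."""
    return conn.execute(
        "SELECT id, username, public_id FROM users"
        " WHERE public_id = ? OR username = ?",
        (identifier, identifier),
    ).fetchone()


pk = create_user("andrea")
old_public_id = user_uuid("andrea")
rename_user(pk, "andrea2")
row = find_user("andrea2")
print(row)
```

After the rename, the row keeps its numeric `id` while `username` and `public_id` change together, and the old UUID no longer resolves to any user.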

4. Random UUID: when you don't have a logical key, use UUID-V4

It may happen that your resources don't have a well-defined logical key. Think of the identifier of a transaction or of a web user session: they need a unique identifier, but there is no well-defined (closed domain) logical key to associate with them. The easiest way is then usually a numeric auto-incremented primary key.

But if we want to expose the resource without binding its external identifier to its internal primary key, we can also add a UUID-V4. For example:

https://my.app/orders/92068a6a-822a-4865-9870-ad4bc3e96f2a
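
Generating such an identifier is a one-liner in most languages. Here is a minimal Python sketch using the standard `uuid` module; the helper `new_public_id` is a hypothetical name, and only the `https://my.app/orders/<uuid>` shape comes from the example above.

```python
import uuid


def new_public_id() -> uuid.UUID:
    """Generate a random (version 4) identifier for a new resource."""
    return uuid.uuid4()


order_id = new_public_id()
url = f"https://my.app/orders/{order_id}"

print(url)               # different on every run
print(order_id.version)  # 4
```

Unlike a UUID-V3, this value cannot be recomputed later, so it has to be stored next to the internal primary key when the resource is created.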

Advantages

  • Easy to develop. The UUID-V4 is a random identifier of a single resource.
  • Your API is protected from enumeration.

Drawbacks

  • Synchronization between environments may be difficult if you don't have a logical key.
  • If the UUID is the only key and there is no internal numeric primary key, you need to be careful with indexes, joins, queries and so on, because performance could become a problem.

Conclusion

UUIDs are very good resource identifiers for REST APIs, especially if you don't have strict storage requirements and you use them only as identifiers for the external world. There are different UUID versions to choose from, so pay attention to selecting the right one for your needs.