
Subscriptions overhaul #171

Open

Sporiff wants to merge 7 commits into `main` from `subscriptions-redux`
Conversation

Sporiff (Member) commented Mar 15, 2026

This PR is an overhaul of the subscriptions endpoint to address the following:

  1. Offline-first approach
  2. Bulk submission
  3. Multi-status responses
  4. Global log style of data storage

The goal with this approach is to facilitate a per-entity bulk update and sync, as well as a future firehose-like approach to handling updates coming from clients using an enum-based action switch.

This PR overhauls the structure of the existing Subscriptions endpoint documentation to align it more closely with discussed mechanisms and to present it as a more RFC-like specification rather than a series of endpoint descriptions.

Sporiff self-assigned this Mar 15, 2026

netlify bot commented Mar 15, 2026

Deploy Preview for openpodcastapi ready!

| Name | Link |
| --- | --- |
| 🔨 Latest commit | 69a44a4 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/openpodcastapi/deploys/69b72cb976b54f00084480f4 |
| 😎 Deploy Preview | https://deploy-preview-171--openpodcastapi.netlify.app |


### 3.1 Normative language

The key words "MUST", "MUST NOT", "SHOULD", "SHOULD NOT", and "MAY" in this document are to be interpreted as described in RFC 2119[^2].


Hmm, I feel like writing more in the style of API documentation rather than an RFC makes it easier for people to adapt. Look at the podcast namespace docs, for example. Those are also not written as an RFC.


@ByteHamster I see your point. I chose this structure because the aim was to make this something of a standard, which means defining behaviors clearly to avoid ambiguity. However, this structure is a bit cumbersome, so once we're happy with the content itself I think we can probably edit it down into something more readable.


To calculate the UUID value, the client MUST do the following:

1. Normalize the `feed_url` by removing the scheme (for example: `https://`) and all trailing slashes (`/`).
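The calculation above could be sketched as follows (illustrative Python; it assumes the Podcast Index `podcast:guid` convention of a UUIDv5 over the normalized URL, using the namespace `ead4c236-bf58-58c6-a2c6-a6b28d128cb6` — confirm against the Podcast Index documentation before relying on it):

```python
import uuid

# Namespace UUID used by the Podcast Index for podcast GUIDs
# (assumption: this spec follows the same convention).
PODCAST_NAMESPACE = uuid.UUID("ead4c236-bf58-58c6-a2c6-a6b28d128cb6")

def feed_uuid(feed_url: str) -> uuid.UUID:
    """Derive the deterministic feed UUID from a feed URL."""
    # 1. Strip the scheme (for example "https://").
    _, _, rest = feed_url.partition("://")
    normalized = rest or feed_url
    # 2. Strip all trailing slashes.
    normalized = normalized.rstrip("/")
    # 3. UUIDv5 over the normalized URL.
    return uuid.uuid5(PODCAST_NAMESPACE, normalized)
```

Because the scheme is stripped, `http://` and `https://` variants of the same feed produce the same identifier, which is the point of the normalization.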


What about stuff like ?utm_source=xy?


@ByteHamster Good point. I don't think the Podcast Index references this, but they probably implicitly mean that all query params are stripped out, too. We should be explicit there.


Removing all might cause problems again. The semi-official German public broadcasting website relies on query parameters: https://mediathekviewweb.de/feed?query=abc. I don't have a solution, just trying to break things ;)

| Field | Type | Required | Description |
| ------------ | ---------------- | -------- | ------------------------------------------------------- |
| `uuid` | UUID | Yes | Deterministic identifier for the feed |
| `feed_url` | string | Yes | The RSS feed's canonical URL used to calculate the UUID |


What if the URL changes (redirect, etc), and the old one is turned off a month later? This sounds like it would store the non-working URL here?


@ByteHamster Hm, yes. This is the same issue we discussed with episodes, right? Where we essentially need to store and resolve "alias" feed URLs in case things change.

In this model, the `uuid` is the identifier, so that remains static no matter what. It can be calculated only once per the Podcast Index's logic, then the `feed_url` can change after the fact and the `uuid` will remain the same. We definitely need a way to handle this.

Comment on lines +193 to +194
| `created_at` | string (RFC3339) | Yes | Server-authoritative creation timestamp |
| `updated_at` | string (RFC3339) | Yes | Server-authoritative update timestamp |


Specify the time zone, we had that problem with one of the gpodder re-implementations


@ByteHamster I did write in the conventions section that all timestamps must be UTC so that there's no ambiguity between the server and the client, and the client can then present the timestamps in a local timezone. But I agree it bears repeating (people might skip past that section or be linked directly). Never hurts to make it clear.

| `created_at` | string (RFC3339) | Yes | Server-authoritative creation timestamp |
| `updated_at` | string (RFC3339) | Yes | Server-authoritative update timestamp |

Normative rule: `created_at` and `updated_at` are managed by the server. Clients MAY supply `subscribed_at` and `unsubscribed_at` in requests, but these do not override the server's canonical timestamps.


So the server is required to ignore them? Why should they be allowed then?


@ByteHamster My language was not clear enough here. But to explain: there is a difference between when a record was created_at and when it was subscribed_at. If the client records that a user subscribed to a feed at 2026-03-17T07:48:00.000Z, but then doesn't come online for a day or so, the client should still record that the user subscribed at 2026-03-17T07:48:00.000Z. The record, however, would be created a day later at 2026-03-18T07:48:00.000Z.

The client cannot tell the server when the subscription was recorded by the server, only when it was recorded by the client. To be honest, this note is probably more confusing than helpful, as most people would be able to work that out implicitly.
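The split could look something like this server-side (a minimal sketch; the function and field-handling names are mine, not from the spec):

```python
from datetime import datetime, timezone

def apply_subscription(record: dict, client_payload: dict) -> dict:
    """Merge a client-submitted subscription into a stored record.

    `subscribed_at` / `unsubscribed_at` come from the client (when the
    user actually acted, possibly while offline); `created_at` /
    `updated_at` are always set by the server.
    """
    now = (datetime.now(timezone.utc)
           .isoformat(timespec="milliseconds")
           .replace("+00:00", "Z"))
    for field in ("subscribed_at", "unsubscribed_at"):
        if field in client_payload:
            record[field] = client_payload[field]
    record.setdefault("created_at", now)  # set once, never overwritten
    record["updated_at"] = now            # always server time
    # Any created_at/updated_at the client sent is simply ignored.
    return record
```

So a device that comes online a day late keeps its true `subscribed_at`, while the server still records when the row itself appeared.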

| Parameter        | Type   | In    | Required | Description                                                                          |
| ---------------- | ------ | ----- | -------- | ------------------------------------------------------------------------------------ |
| `cursor` | string | Query | No | The Base64-encoded cursor to query from |
| `page_size` | number | Query | No | The number of results to return per-page |
| `direction` | string | Query | No | The direction in which to search for results. `ascending` (default) or `descending`. |


So it is implicitly sorted by time? Then why use a cursor instead of just a time?


@ByteHamster There's no reason it couldn't be a time stamp. Encoded cursors that make use of a combination of data such as id AND created_at tend to be more efficient than just a timestamp, but I see the point that it's probably not enough of a performance improvement to justify the opacity it introduces.

I'll update this pagination section to remove the base64 and make use of the created_at timestamp for simplicity.

| Field | Type | Required | Description |
| --------------------------------------- | ---------------- | -------- | ----------------------------------------------------------------------- |
| `next_cursor` | string | No | The Base64-encoded cursor for the next page of results |
| `prev_cursor` | string | Yes | The Base64-encoded cursor for the current page of results |


Prev or current?


@ByteHamster This is a semantic thing used in a lot of APIs that kind of confused me, too. It's technically the previous cursor because of how you use it: you likely would not need to return to a position you've already consumed, but you would need to page backwards from that position to read new updates. I agree that it's a confusing nomenclature, but it is a widely used one.

As mentioned above, I think that the better approach here would be to use the timestamp as a cursor and to specify the following:

  1. If the client doesn't pass a `since` parameter with a valid timestamp, the server must respond with the most recent set of results from the endpoint.
  2. If the client provides a valid timestamp in the `since` parameter, the server must respond only with records that were created after that timestamp, providing a link to the next page of results each time for the client to follow until the list is exhausted.


### 9.5 Client behavior

1. The Client MAY provide any combination of supported query parameters, or none.


None? Subscribe without a feed url or id?


@ByteHamster This is for the GET endpoint, not the POST endpoint. The client may pass parameters to limit the number/page of results they receive, but they do not need to. The server should fall back to sensible defaults.


### 9.6 Server behavior

1. The Server MUST discard invalid query parameters and use default parameters.


Silently? That sounds dangerous. It should reject invalid ones


@ByteHamster For a GET endpoint I'm not sure about that. We could enforce a return of 400 if the client passes in an invalid param, which maybe is better practice. But I don't think it's dangerous either way.
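The stricter reject-with-400 option could look roughly like this (a sketch; the parameter names match the table above, but the limits and error shape are illustrative, not from the spec):

```python
ALLOWED_DIRECTIONS = {"ascending", "descending"}

def validate_query(params: dict) -> tuple:
    """Validate GET query parameters, returning (status, body).

    Invalid values produce a 400 with per-field errors rather than
    being silently replaced with defaults, so client bugs surface
    immediately instead of producing subtly wrong result pages.
    """
    errors = {}
    direction = params.get("direction", "ascending")
    if direction not in ALLOWED_DIRECTIONS:
        errors["direction"] = f"must be one of {sorted(ALLOWED_DIRECTIONS)}"
    page_size = params.get("page_size", "50")
    if not str(page_size).isdigit() or not (1 <= int(page_size) <= 200):
        errors["page_size"] = "must be an integer between 1 and 200"
    if errors:
        return 400, {"errors": errors}
    return 200, {"direction": direction, "page_size": int(page_size)}
```

Missing parameters still fall back to defaults; only present-but-invalid ones are rejected, which splits the difference between the two positions above.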

Comment on lines +584 to +595
"feed": {
"uuid": "fc4ed290-4621-54fe-b5b4-a001343aeed7",
"feed_url": "https://example.com/feed3.rss/",
"created_at": "2026-03-15T03:05:01.000Z",
"updated_at": "2026-03-15T03:05:01.000Z"
},
"subscription": {
"subscribed_at": "2026-03-15T03:05:01.000Z",
"unsubscribed_at": "2026-03-16T05:21:48.000Z",
"created_at": "2026-03-15T03:05:01.000Z",
"updated_at": "2026-03-16T06:05:02.000Z"
}


Why is there a difference between feed.created_at and subscription.created_at and subscription.subscribed_at? This will very likely lead to different clients using different fields and users then wondering why it doesn't work. If there is only one timestamp, client developers cannot mess it up


@ByteHamster The Feed may have been created previously. In this model, the Feed exists separately from the Subscription. If User A subscribes to Feed A, and Feed A does not exist, the server implicitly creates a record for Feed A. If User B then subscribes to Feed A, it doesn't need to recreate the Feed record. It needs only to create a new Subscription for User B.

As for the subscription.created_at and subscription.subscribed_at, I think I've mentioned previously that the server-authoritative created_at timestamp is only a record of when the entity was created in the database, whereas the subscribed_at timestamp is a client-supplied record of when the user actually subscribed. It's for the case where devices record subscription actions offline and then come online later. The server still needs to keep a record of when the entity was created, but the clients need to know when the actual subscription happened.

As you mentioned above, if the user subscribes, then unsubscribes, then resubscribes, the subscribed_at timestamp will change, but the created_at timestamp will not.

> This will very likely lead to different clients using different fields and users then wondering why it doesn't work. If there is only one timestamp, client developers cannot mess it up

We can be very clear in the language used that the client developers must only ever use the subscribed_at timestamp for their internal logic, and maybe there's an argument to be made that we don't need to return any other timestamps in the GET request. The timestamps are mostly for server actions such as compression and cursoring, so maybe the client doesn't need to care about them, I'm not sure.
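For what it's worth, the feed/subscription split described above could be condensed into something like this (an in-memory sketch with invented names; `now` is passed in explicitly only to keep the example deterministic):

```python
class Store:
    """Minimal sketch of the shared-feed / per-user-subscription model.

    Feeds are created implicitly on first subscribe and shared by all
    users; each subscription is keyed by (user, feed) and carries the
    client-supplied subscribed_at alongside the server-set created_at.
    """
    def __init__(self):
        self.feeds = {}          # uuid -> feed record
        self.subscriptions = {}  # (user_id, uuid) -> subscription record

    def subscribe(self, user_id, feed_uuid, feed_url, subscribed_at, now):
        if feed_uuid not in self.feeds:
            # Implicit feed creation: happens only for the first subscriber.
            self.feeds[feed_uuid] = {"uuid": feed_uuid,
                                     "feed_url": feed_url,
                                     "created_at": now}
        key = (user_id, feed_uuid)
        sub = self.subscriptions.setdefault(key, {"created_at": now})
        sub["subscribed_at"] = subscribed_at  # client-supplied; moves on resubscribe
        sub["unsubscribed_at"] = None
        sub["updated_at"] = now               # server-authoritative
        return sub
```

A second subscriber reuses the existing feed record, and a resubscribe updates `subscribed_at` while `created_at` stays put, which is exactly the behaviour discussed in this thread.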
