> ### 3.1 Normative language
>
> The key words "MUST", "MUST NOT", "SHOULD", "SHOULD NOT", and "MAY" in this document are to be interpreted as described in RFC 2119[^2].
Hmm, I feel like writing this more in the style of API documentation rather than an RFC makes it easier for people to adapt. Look at the podcast namespace docs, for example; those are also not written as an RFC.
@ByteHamster I see your point. I chose this structure because the aim was to make this something of a standard, which means defining behaviors clearly to avoid ambiguity. However, this structure is a bit cumbersome, so once we're happy with the content itself I think we can probably edit it down into something more readable.
> To calculate the UUID value, the client MUST do the following:
>
> 1. Normalize the `feed_url` by removing the scheme (for example: `https://`) and all trailing slashes (`/`).
What about stuff like `?utm_source=xy`?

@ByteHamster Good point. I don't think Podcast Index references this, but they probably implicitly mean that all query params are stripped out, too. We should be explicit there.
Removing all might cause problems again. The semi-official German public broadcasting website relies on query parameters: https://mediathekviewweb.de/feed?query=abc. I don't have a solution, just trying to break things ;)
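To make the normalization discussion concrete, here's a minimal sketch of rule 1 plus the UUID derivation. Assumptions flagged: the draft doesn't pin down a UUIDv5 namespace, so Python's built-in `NAMESPACE_URL` stands in, and query parameters are left untouched since that's still an open question in this thread.

```python
from uuid import uuid5, NAMESPACE_URL

def normalize_feed_url(feed_url: str) -> str:
    """Rule 1 above: strip the scheme and all trailing slashes.

    Query parameters are deliberately left alone here, since their
    handling is still an open question in this thread.
    """
    for scheme in ("https://", "http://"):
        if feed_url.startswith(scheme):
            feed_url = feed_url[len(scheme):]
            break
    return feed_url.rstrip("/")

def feed_uuid(feed_url: str) -> str:
    # Assumption: a UUIDv5 over the normalized URL. The draft doesn't
    # specify a namespace, so the stdlib NAMESPACE_URL stands in.
    return str(uuid5(NAMESPACE_URL, normalize_feed_url(feed_url)))
```

One nice property of this scheme: `http://` and `https://` variants of the same feed collapse to the same UUID.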
> | Field      | Type   | Required | Description                                             |
> | ---------- | ------ | -------- | ------------------------------------------------------- |
> | `uuid`     | UUID   | Yes      | Deterministic identifier for the feed                   |
> | `feed_url` | string | Yes      | The RSS feed's canonical URL used to calculate the UUID |
What if the URL changes (redirect, etc.), and the old one is turned off a month later? This sounds like it would store the non-working URL here?
@ByteHamster Hm, yes. This is the same issue we discussed with episodes, right? Where we essentially need to store and resolve "alias" feed URLs in case things change.
In this model, the `uuid` is the identifier, so that would remain static no matter what. It can be calculated only once per the Podcast Index's logic, then the `feed_url` can change after the fact and the `uuid` will remain the same. We definitely need a way to handle this.
> | `created_at` | string (RFC3339) | Yes | Server-authoritative creation timestamp |
> | `updated_at` | string (RFC3339) | Yes | Server-authoritative update timestamp   |
Specify the time zone; we had that problem with one of the gpodder re-implementations.
@ByteHamster I did write in the conventions section that all timestamps must be UTC so that there's no ambiguity between the server and the client, and the client can then present the timestamps in the local timezone. But I agree it bears repeating (people might skip past that section or be linked directly). It never hurts to make it clear.
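For implementers skimming past the conventions section, a sketch of producing the kind of UTC RFC 3339 timestamp these tables describe. Millisecond precision and the trailing `Z` (rather than `+00:00`) are assumptions based on the examples in this PR.

```python
from datetime import datetime, timezone

def utc_timestamp() -> str:
    """RFC 3339 timestamp in UTC, e.g. '2026-03-17T07:48:00.000Z'.

    Always UTC, so clients can safely convert to a local zone
    for display without guessing the offset.
    """
    now = datetime.now(timezone.utc)
    return now.strftime("%Y-%m-%dT%H:%M:%S.") + f"{now.microsecond // 1000:03d}Z"
```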
> | `created_at` | string (RFC3339) | Yes | Server-authoritative creation timestamp |
> | `updated_at` | string (RFC3339) | Yes | Server-authoritative update timestamp   |
>
> Normative rule: `created_at` and `updated_at` are managed by the server. Clients MAY supply `subscribed_at` and `unsubscribed_at` in requests but it doesn't override the server’s canonical timestamps.
So the server is required to ignore them? Why should they be allowed then?
@ByteHamster My language was not clear enough here. To explain: there is a difference between when a record was `created_at` and when the user `subscribed_at`. If the client records that a user subscribed to a feed at 2026-03-17T07:48:00.000Z, but then doesn't come online for a day or so, the client should still record that the user subscribed at 2026-03-17T07:48:00.000Z. The record, however, would be created a day later, at 2026-03-18T07:48:00.000Z.
The client cannot tell the server when the subscription was recorded by the server, only when it was recorded by the client. To be honest, this note is probably more confusing than helpful, as most people would be able to work that out implicitly.
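A tiny sketch of that offline case. The payload shape and function names are hypothetical; the point is only that `subscribed_at` travels from the client while `created_at`/`updated_at` come from the server's clock.

```python
import json

def queue_subscribe(feed_url: str, subscribed_at: str) -> str:
    """Client side: record the user action immediately, even while offline."""
    return json.dumps({"feed_url": feed_url, "subscribed_at": subscribed_at})

def apply_subscribe(payload: str, server_now: str) -> dict:
    """Server side: keep the client-supplied subscribed_at, but
    created_at/updated_at are server-authoritative."""
    body = json.loads(payload)
    return {
        "feed_url": body["feed_url"],
        "subscribed_at": body["subscribed_at"],  # client's clock
        "created_at": server_now,                # server's clock
        "updated_at": server_now,
    }
```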
> | ---------------- | ------ | ----- | -------- | ------------------------------------------------------------------------------------ |
> | `cursor`         | string | Query | No       | The Base64-encoded cursor to query from                                              |
> | `page_size`      | number | Query | No       | The number of results to return per-page                                             |
> | `direction`      | string | Query | No       | The direction in which to search for results. `ascending` (default) or `descending`. |
So it is implicitly sorted by time? Then why use a cursor instead of just a time?
@ByteHamster There's no reason it couldn't be a timestamp. Encoded cursors that combine data such as `id` and `created_at` tend to be more efficient than a timestamp alone, but I take the point that it's probably not enough of a performance improvement to justify the opacity it introduces.
I'll update this pagination section to remove the Base64 cursor and use the `created_at` timestamp for simplicity.
> | Field         | Type   | Required | Description                                               |
> | ------------- | ------ | -------- | --------------------------------------------------------- |
> | `next_cursor` | string | No       | The Base64-encoded cursor for the next page of results    |
> | `prev_cursor` | string | Yes      | The Base64-encoded cursor for the current page of results |
@ByteHamster This is a semantic convention used in a lot of APIs that kind of confused me, too. It's technically the previous cursor because of how you use it: you likely would not need to return to a specific position once you've consumed the information at that cursor, but you would need to page backwards from that position to read new updates. I agree that it's confusing nomenclature, but it is a widely used one.
As mentioned above, I think that the better approach here would be to use the timestamp as a cursor and to specify the following:
- If the client doesn't pass a `since` parameter with a valid timestamp, the server must respond with the most recent set of results from the endpoint.
- If the client provides a valid timestamp in the `since` parameter, the server must respond only with records that were created after that timestamp, providing a link to the next page of results each time for the client to follow until the list is exhausted.
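A rough sketch of those two `since` rules, assuming records carry RFC 3339 UTC `created_at` strings (a uniform format compares correctly as plain strings, so no datetime parsing is needed). The function name and response shape are made up for illustration.

```python
from typing import Optional

def page_after(records: list[dict], since: Optional[str], page_size: int = 50) -> dict:
    """Sketch of the two `since` rules, sorted ascending by created_at."""
    ordered = sorted(records, key=lambda r: r["created_at"])
    if since is None:
        # No valid `since`: the most recent set of results.
        page = ordered[-page_size:]
    else:
        # Only records created strictly after `since`, oldest first.
        page = [r for r in ordered if r["created_at"] > since][:page_size]
    # The "link to the next page" is just the last timestamp on this page.
    return {"results": page, "next": page[-1]["created_at"] if page else None}
```

The client follows `next` repeatedly until it receives an empty `results` list, at which point the sync is caught up.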
> ### 9.5 Client behavior
>
> 1. The Client MAY provide any combination of supported query parameters, or none.
None? Subscribe without a feed url or id?
@ByteHamster This is for the GET endpoint, not the POST endpoint. The client may pass parameters to limit the number/page of results they receive, but they do not need to. The server should fall back to sensible defaults.
> ### 9.6 Server behavior
>
> 1. The Server MUST discard invalid query parameters and use default parameters.
Silently? That sounds dangerous. It should reject invalid ones
@ByteHamster For a GET endpoint I'm not sure about that. We could enforce a return of 400 if the client passes in an invalid param, which maybe is better practice. But I don't think it's dangerous either way.
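To show the difference between the two approaches, a hypothetical lenient parser for `page_size` that falls back to a default on invalid input; a stricter server would respond 400 at the marked spots instead. The default and maximum values are illustrative, not from the spec.

```python
def parse_page_size(raw, default: int = 50, maximum: int = 200) -> int:
    """Lenient parsing per the draft: invalid values fall back to the default."""
    try:
        value = int(raw)
    except (TypeError, ValueError):
        return default  # a stricter server would respond 400 here...
    if not 1 <= value <= maximum:
        return default  # ...and here
    return value
```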
> ```json
> "feed": {
>     "uuid": "fc4ed290-4621-54fe-b5b4-a001343aeed7",
>     "feed_url": "https://example.com/feed3.rss/",
>     "created_at": "2026-03-15T03:05:01.000Z",
>     "updated_at": "2026-03-15T03:05:01.000Z"
> },
> "subscription": {
>     "subscribed_at": "2026-03-15T03:05:01.000Z",
>     "unsubscribed_at": "2026-03-16T05:21:48.000Z",
>     "created_at": "2026-03-15T03:05:01.000Z",
>     "updated_at": "2026-03-16T06:05:02.000Z"
> }
> ```
Why is there a difference between feed.created_at and subscription.created_at and subscription.subscribed_at? This will very likely lead to different clients using different fields and users then wondering why it doesn't work. If there is only one timestamp, client developers cannot mess it up
@ByteHamster The Feed may have been created previously. In this model, the Feed exists separately from the Subscription. If User A subscribes to Feed A, and Feed A does not exist, the server implicitly creates a record for Feed A. If User B then subscribes to Feed A, it doesn't need to recreate the Feed record. It needs only to create a new Subscription for User B.
As for the subscription.created_at and subscription.subscribed_at, I think I've mentioned previously that the server-authoritative created_at timestamp is only a record of when the entity was created in the database, whereas the subscribed_at timestamp is a client-supplied record of when the user actually subscribed. It's for the case where devices record subscription actions offline and then come online later. The server still needs to keep a record of when the entity was created, but the clients need to know when the actual subscription happened.
As you mentioned above, if the user subscribes, then unsubscribes, then resubscribes, the subscribed_at timestamp will change, but the created_at timestamp will not.
> This will very likely lead to different clients using different fields and users then wondering why it doesn't work. If there is only one timestamp, client developers cannot mess it up
We can be very clear in the language that client developers must only ever use the `subscribed_at` timestamp for their internal logic, and maybe there's an argument that we don't need to return any other timestamps in the GET response at all. The timestamps are mostly for server actions such as compression and cursoring, so maybe the client doesn't need to care about them; I'm not sure.
This PR is an overhaul of the subscriptions endpoint to address the following:
The goal with this approach is to facilitate a per-entity bulk update and sync, as well as a future firehose-like approach to handling updates coming from clients using an enum-based action switch.
This PR overhauls the structure of the existing Subscriptions endpoint documentation to align it more closely with discussed mechanisms and to present it as a more RFC-like specification rather than a series of endpoint descriptions.