Skip to content

THREESCALE-14261: Fix object stale check#4270

Open
mayorova wants to merge 5 commits intomasterfrom
fix-api-object-caching
Open

THREESCALE-14261: Fix object stale check#4270
mayorova wants to merge 5 commits intomasterfrom
fix-api-object-caching

Conversation

@mayorova
Copy link
Copy Markdown
Contributor

What this PR does / why we need it:

Not sure if this feature has always been broken (without nobody noticing), or it got broken at some point.
Apparently, .dup was added explicitly in https://github.com/3scale/porta/pull/1643/changes

But currently, when testing it, I realized that .dup creates a new instance of the object, with a nil ID, and also updated_at (among other fields) are empty, which actually makes this field unusable for figuring out whether the cached object should be returned or not.

Removing it makes the cache work correctly (I think).

Which issue(s) this PR fixes

https://redhat.atlassian.net/browse/THREESCALE-14261

Verification steps

  1. Find an active account in the admin portal
  2. Call Account Find endpoint from 3scale API Docs to find this account
  3. Check that in the response it's "approved"
  4. Suspend the account from the admin portal
  5. Call the same Account Find endpoint with the same parameters
  6. Verify that the response is 200 OK and the state of the account is "suspended" - before the fix the response was 304 Not Modified and the state was "approved".

Note that the bug only manifests in the browser (i.e. via 3scale API Docs) and not simple curl/Postman calls, because it seems that unlike curl, the browser sets the If-None-Match header. Without the cache-related headers, the updated object is always returned.

Special notes for your reviewer:

@mayorova mayorova force-pushed the fix-api-object-caching branch from 414d4f6 to 8b3af65 Compare April 14, 2026 11:58
Copy link
Copy Markdown
Contributor

@jlledom jlledom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests are failing, so this must be breaking something.

Comment thread config/environments/test.rb Outdated
Comment thread test/integration/admin/api/accounts_controller_test.rb Outdated
Comment thread test/integration/admin/api/accounts_controller_test.rb Outdated
@mayorova
Copy link
Copy Markdown
Contributor Author

Tests are failing, so this must be breaking something.

Yeah, I made a stupid mistake 😅 Fixed it. I'll address your other comments.

@mayorova mayorova force-pushed the fix-api-object-caching branch from 8b3af65 to f7d84cf Compare April 15, 2026 11:37
@mayorova mayorova force-pushed the fix-api-object-caching branch from f7d84cf to d4a3ba1 Compare April 30, 2026 13:46
@qltysh
Copy link
Copy Markdown

qltysh Bot commented Apr 30, 2026

❌ 4 blocking issues (4 total)

Tool Category Rule Count
rubocop Style Wrap hash in \{ and \}. 1
rubocop Lint Cyclomatic complexity for api\_behavior is too high. [9/7] 1
rubocop Lint Perceived complexity for api\_behavior is too high. [10/8] 1
reek Lint Admin::Api::AccountPlansControllerTest#test_account_plan_representer has approx 11 statements 1

@mayorova mayorova force-pushed the fix-api-object-caching branch from d4a3ba1 to 067a609 Compare April 30, 2026 14:04
Comment thread config/environments/test.rb Outdated
akostadinov
akostadinov previously approved these changes Apr 30, 2026
Comment thread app/lib/three_scale/api/responder.rb Outdated
case
when get?
display(resource, status: :ok) if controller.stale?(serializable.dup)
display(resource, status: :ok) if controller.stale?(serializable)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good, did you go back to see why dup was used in the first place?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't quiet get it 🥲 379b3dc

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused, API docs say that we should pass options like etag: object, not the object directly 🤔

https://apidock.com/rails/ActionController/Base/stale%3F

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That comment suggests stale called to_hash but then why should we care?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@akostadinov This seems to be an old doc:

This method is deprecated or moved on the latest stable version. The last existing version (v2.3.8) is shown here.

https://api.rubyonrails.org/v7.1/classes/ActionController/ConditionalGet.html#method-i-stale-3F

You can also pass an object that responds to maximum, such as a collection of records:

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how does it break?

Like this:

Minitest::Assertion: --- expected
+++ actual
@@ -1 +1 @@
-"<?xml version=\"1.0\" encoding=\"UTF-8\"?><plans><plan default=\"true\"><id>195</id><name>Plan 1777320341</name><type>account_plan</type><state>published</state><approval_required>false</approval_required><setup_fee>0.0</setup_fee><cost_per_month>0.0</cost_per_month><trial_period_days>0</trial_period_days><cancellation_period>0</cancellation_period></plan></plans>"
+"<?xml version=\"1.0\" encoding=\"UTF-8\"?><plans><plan default=\"true\"><id>195</id><name>Plan 1777320341</name><type>account_plan</type><state>published</state><approval_required>false</approval_required><setup_fee>0.0</setup_fee><cost_per_month>0.0</cost_per_month><trial_period_days/><cancellation_period>0</cancellation_period></plan></plans>"

test/integration/admin/api/account_plans_controller_test.rb:33:in `test_get'

The difference is: <trial_period_days>0</trial_period_days> vs <trial_period_days/>

I wonder, what if we use #clone instead of #dup. That should retain modification times and such.

But why do you think it's worth it? If just removing #dup doesn't break anything?

Copy link
Copy Markdown
Contributor

@akostadinov akostadinov May 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I've got it.

It seems like in Rails 5, #stale? was calling to_hash which caused the represented model to be instrumented with the wrong representer methods.

Also it seems like in Rails 7.1 to_hash is not called anymore so the represented object is not broken by it anymore.

So we should be fine with your original version.

Would be nice though to have the original comment updated with that information and have a detailed commit message explaining this so in case Rails starts calling to_hash again, future people will have more clue what the problem is.

Might be a little more solid if we call #stale? before we instrument the object with the representer on line 7 but to me it's fine as it is also.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ehm... I'm not sure to_hash is not called now 🤔

I am not exactly sure how to change that comment 😅

Copy link
Copy Markdown
Contributor

@akostadinov akostadinov May 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it was called, then the test would have failed, no? Because when you uncomment the to_hash line it fails you said.

Maybe something like this:

  # In Rails 5.0 `controller#stale?(resource)` called `#to_hash` on the object.
  # And `#stale?` is called from our responder:
  # https://github.com/3scale/porta/blob/bdc91b894eac16fbd81afd4f05198eb5cb8beee9/app/lib/three_scale/api/responder.rb#L11
  # But the +representable+ gem is extending each items with the representer inside the `#to_hash`
  # https://github.com/trailblazer/representable/blob/v2.3.0/lib/representable/hash.rb#L32
  #
  # We implicitly extend those items with the JSON representer as #to_json uses #to_hash
  # So it may break xml representation when `#to_hash` is called by `#stale?`.
  # In Rails 7.1 though, `#stale?` doesn't seem to call `#to_hash` anymore so we are good.
  # Hopefully this test will catch it if that changes in the future.

idk, it's a little weird

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know @akostadinov I have a strong feeling that there was some confusion, and the test is not actually related to the "fix" provided in the same commit (adding .dup to the object in .stale? call).

This is what I think about what that test actually verifies - 02bbb40

Also, I checked that if the account plan in the test actually has trial_period_days set to a valid number, uncommenting that line doesn't break the test (i.e. the representation is the same with and without calling .to_hash).

Again, I think it's just too painful to try to set up the environment to be able to check out/run this old code and verify if this theory (about .dup not being needed in the first place) is true or not.

But if you want, please suggest what else I should do 😬

Comment thread app/lib/three_scale/api/responder.rb Outdated
case
when get?
display(resource, status: :ok) if controller.stale?(serializable.dup)
display(resource, status: :ok) if controller.stale?(serializable)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe even cleaner - on line 7 we instrument the object with the represented. Lets check stale? before that and only if not stale, go ahead, instrument the AR object and continue the logic. There is no need to instrument with the representer before checking stale. And we save ourselves a lot of crap as well a few CPU cycles.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about this? 1f4ce00

buyer_user = @buyer.users.last!

# First request - get the account and capture headers
get find_admin_api_accounts_path(format: :json), params: params.merge({ user_id: buyer_user.id })
Copy link
Copy Markdown
Contributor

@akostadinov akostadinov May 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it is worth using the call used in 379b3dc , i.e. admin_api_account_plans_path so that we know we are testing what previously was known to break because of the #to_hash.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I used this call here in the test, because I was fixing https://redhat.atlassian.net/browse/THREESCALE-14261, and this is the call where the issue was reproduced. So, I think it makes sense to use it.

If you're still worried about that other test, I can investigate further.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually the other test is already testing that other call so perhaps we are fine. Makes sense what you did.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, according to Bob, stale? never called to_hash explicitly 😬

Conclusion

  • Rails stale?() has NEVER called to_hash - it uses to_a, cache_key, or to_param
  • The .dup was added based on a misunderstanding of how Rails processes cache keys
  • The .dup removal is safe because stale?() doesn't trigger the mutation
  • The test remains valid as a regression guard
  • The "uncommenting breaks test" comment is accurate - it demonstrates representable's mutation behavior when to_hash is explicitly called

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All connects except it doesn't explain why it failed on Rails 5.0 if it didn't call #to_hash. I somehow doubt our ancestors made an assessment so wrong.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe they made-up an issue and this description, and implemented a test for no reason.

Well, I am not saying they made it up, but they could have been slightly confused by something else 🤷 I am certainly very confused by many things in this project 🫠 It also seems like the PR consists of cherry-picking the commit from somewhere else, I can easily see how this could have been some confusion...

And by the way, if they were trying to fix something happening in stale? method, I would expect some tests covering the conditional GET Functionality (that uses this stale? method).

Anyway, I don't want to dig any deeper into this. Feel free to update the test/comment on it to your liking in a separate commit and push to this branch.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When Hery has done something, I would quadruple check before beginning to consider the remote possibility that he got confused and wrote slop.

Try the attached ugly AI written test that Rails 5.0 indeed calls #to_hash on the object stale_check_rails5_test.rb.gz

podman run --rm -it   -v "$PWD/stale_check_rails5_test.rb:/app/stale_check_rails5_test.rb:Z" -w /app docker.io/library/ruby:2.4 bash -c "ruby stale_check_rails5_test.rb"

I think your comment is probably alright. Just because we inevitably rely on trust when reviewing stuff, I thought it made sense to extend our circle of trust a little.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great stuff @akostadinov thank you for building that test.
Do you want me to update the comment again including this info?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you feel the comment can be improved, I'm fine either way.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

akostadinov
akostadinov previously approved these changes May 5, 2026
Copy link
Copy Markdown
Contributor

@akostadinov akostadinov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! A very nasty issue!

jlledom
jlledom previously approved these changes May 5, 2026
@mayorova mayorova dismissed stale reviews from jlledom and akostadinov via ee80057 May 5, 2026 15:48
Copy link
Copy Markdown
Contributor

@akostadinov akostadinov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks awesome!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants