@freemandealer
Contributor

reset_range could dereference a null cell if a block was evicted or removed while a downloader thread was finalizing a partial block. The FileBlock stayed alive through its reference count, but its FileBlockCell had already been erased from _files (during a clear_cache operation), so the lookup returned null and the dereference caused a SIGSEGV.
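
The race described above can be illustrated with a minimal sketch. It is not the actual Doris code: `Block`, `Cell`, `Cache`, `add`, `clear`, `get_cell`, and `used` are hypothetical, simplified stand-ins, and only the names `reset_range` and `_files` are taken from the description. The sketch assumes the guard is a null check on the cell lookup; the identity check against the stored block is an extra precaution added here, not something the description states.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <mutex>

// Hypothetical stand-in for a cached block; kept alive by shared_ptr refcount.
struct Block {
    std::size_t offset;
    std::size_t size;
};

// Hypothetical stand-in for the cache's bookkeeping entry for one block.
struct Cell {
    std::shared_ptr<Block> block;
    std::size_t size;
};

class Cache {
public:
    // Registers a new block and returns a reference the caller may hold
    // for longer than the cell exists.
    std::shared_ptr<Block> add(std::size_t offset, std::size_t size) {
        std::lock_guard<std::mutex> lock(_mutex);
        auto block = std::make_shared<Block>(Block{offset, size});
        _files[offset] = Cell{block, size};
        _used += size;
        return block;
    }

    // Models a clear_cache operation: every cell is erased, but blocks
    // that are still referenced elsewhere stay alive.
    void clear() {
        std::lock_guard<std::mutex> lock(_mutex);
        _files.clear();
        _used = 0;
    }

    // Shrinks a partially downloaded block to new_size.
    // Returns false, without touching anything, if the cell is gone.
    bool reset_range(const std::shared_ptr<Block>& block, std::size_t new_size) {
        std::lock_guard<std::mutex> lock(_mutex);
        Cell* cell = get_cell(block->offset);
        // Guard: the cell may have been erased concurrently (null), or
        // replaced by a cell for a different block at the same offset.
        if (cell == nullptr || cell->block != block) {
            return false;
        }
        if (new_size > cell->size) {
            return false;  // this sketch only supports shrinking
        }
        _used -= cell->size - new_size;
        cell->size = new_size;
        block->size = new_size;
        return true;
    }

    std::size_t used() const {
        std::lock_guard<std::mutex> lock(_mutex);
        return _used;
    }

private:
    // Returns nullptr when no cell exists for the offset.
    Cell* get_cell(std::size_t offset) {
        auto it = _files.find(offset);
        return it == _files.end() ? nullptr : &it->second;
    }

    mutable std::mutex _mutex;
    std::map<std::size_t, Cell> _files;
    std::size_t _used = 0;
};
```

In this model, a caller that still holds the `shared_ptr<Block>` after `clear()` gets `false` back from `reset_range` instead of crashing, and the cache's size accounting is left untouched.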

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

…ently

Signed-off-by: zhengyu <zhangzhengyu@selectdb.com>
@Thearas
Contributor

Thearas commented Jan 27, 2026

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@freemandealer
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 32679 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 69caa8715edd2a4133f52767c8b0307799c05e47, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	17678	5277	5054	5054
q2	2039	306	194	194
q3	10209	1294	755	755
q4	10221	921	319	319
q5	7555	2160	1896	1896
q6	195	182	147	147
q7	880	724	615	615
q8	9263	1364	1077	1077
q9	5125	4851	4796	4796
q10	6765	1961	1561	1561
q11	511	295	278	278
q12	335	391	224	224
q13	17779	4045	3245	3245
q14	236	238	217	217
q15	894	827	811	811
q16	665	682	629	629
q17	632	765	503	503
q18	6680	6479	7358	6479
q19	2092	1060	638	638
q20	440	383	244	244
q21	2907	2303	1954	1954
q22	1126	1077	1043	1043
Total cold run time: 104227 ms
Total hot run time: 32679 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	5639	5539	5440	5440
q2	267	342	246	246
q3	2387	3043	2539	2539
q4	1461	2031	1462	1462
q5	4650	4600	4727	4600
q6	241	185	139	139
q7	2057	1898	1726	1726
q8	2559	2469	2420	2420
q9	7705	7542	7413	7413
q10	2873	3040	2497	2497
q11	541	465	442	442
q12	624	722	565	565
q13	3573	4070	3241	3241
q14	268	288	262	262
q15	850	804	808	804
q16	631	692	635	635
q17	1073	1321	1336	1321
q18	7645	7366	7325	7325
q19	793	773	796	773
q20	1948	2060	1911	1911
q21	4543	4149	4085	4085
q22	1047	1050	988	988
Total cold run time: 53375 ms
Total hot run time: 50834 ms

@doris-robot

ClickBench: Total hot run time: 28.37 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 69caa8715edd2a4133f52767c8b0307799c05e47, data reload: false

query	cold (s)	hot run 1 (s)	hot run 2 (s)
query1	0.06	0.05	0.05
query2	0.10	0.05	0.04
query3	0.25	0.08	0.09
query4	1.60	0.11	0.11
query5	0.27	0.26	0.24
query6	1.16	0.68	0.69
query7	0.04	0.03	0.03
query8	0.06	0.04	0.04
query9	0.56	0.50	0.50
query10	0.56	0.56	0.56
query11	0.14	0.09	0.09
query12	0.15	0.10	0.10
query13	0.63	0.62	0.63
query14	1.06	1.07	1.05
query15	0.87	0.86	0.87
query16	0.41	0.40	0.39
query17	1.17	1.12	1.14
query18	0.22	0.22	0.21
query19	2.10	2.03	2.05
query20	0.02	0.01	0.02
query21	15.40	0.27	0.14
query22	5.32	0.06	0.05
query23	16.00	0.28	0.11
query24	0.98	0.24	0.58
query25	0.07	0.09	0.07
query26	0.14	0.14	0.13
query27	0.07	0.06	0.06
query28	3.96	1.16	0.96
query29	12.56	3.89	3.14
query30	0.27	0.13	0.12
query31	2.83	0.66	0.40
query32	3.24	0.59	0.50
query33	3.29	3.22	3.27
query34	16.17	5.39	4.72
query35	4.84	4.82	4.85
query36	0.66	0.51	0.49
query37	0.12	0.07	0.07
query38	0.08	0.04	0.04
query39	0.04	0.03	0.03
query40	0.19	0.17	0.15
query41	0.09	0.04	0.03
query42	0.05	0.03	0.03
query43	0.06	0.04	0.04
Total cold run time: 97.86 s
Total hot run time: 28.37 s

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 16.67% (1/6) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 52.49% (19160/36501)
Line Coverage 35.89% (178227/496540)
Region Coverage 32.35% (137818/425976)
Branch Coverage 33.29% (59659/179197)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 16.67% (1/6) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.13% (26166/35778)
Line Coverage 56.19% (278699/495996)
Region Coverage 53.94% (232190/430481)
Branch Coverage 55.60% (100054/179965)
