[NAE-2408] Implement field sanitization modes for text data fields by renczesstefan · Pull Request #432 · netgrif/application-engine

renczesstefan · 2026-04-15T11:49:04Z

Description

Implement field sanitization modes for text data fields

Dependencies

Third party dependencies

<dependency>
    <groupId>com.googlecode.owasp-java-html-sanitizer</groupId>
    <artifactId>owasp-java-html-sanitizer</artifactId>
    <version>20260313.1</version>
</dependency>

Blocking Pull requests

There are no dependencies on other PR.

How Has Been This Tested?

This was tested manually and with unit tests.

Test Configuration

Name	Tested on
OS	macOS Tahoe 26.3
Runtime	Java 21
Dependency Manager	Maven 3.9.9n
Framework version	Spring Boot 3.4.4
Run parameters
Other configuration

Checklist:

Summary by CodeRabbit

New Features
- Added configurable text field sanitization with multiple modes including plain text, various safe HTML presets, and JavaScript disabling.
- Introduced sanitization actions to either sanitize unsafe content or reject it entirely.
- Sanitization settings can be configured on a per-field basis.

- Introduces `IFieldSanitizationService` and its implementation. - Adds configuration for HTML sanitization modes and actions. - Updates `DataService` to sanitize text fields during data modification. - Includes unit and integration tests for sanitization behavior. - Bumps project version to 6.4.3-SNAPSHOT.

Introduced a sanitization mechanism for text fields using IFieldSanitizationService, ensuring that input values are sanitized before being processed. This includes methods to resolve the appropriate field and component for sanitization. Updated the data parsing logic to integrate the sanitization step.

Revised test assertions to match the updated error message format. This ensures consistency with the new message structure indicating the configured action for unsafe content.

Changed the default fallback sanitization mode from PLAIN_TEXT to DISABLE_JAVASCRIPT in SanitizationMode and FieldSanitizationService. This ensures better protection against potential JavaScript-related vulnerabilities.

coderabbitai · 2026-04-15T13:39:10Z

Warning

Rate limit exceeded

@renczesstefan has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 22 minutes and 57 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 22 minutes and 57 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 33be2947-b0f0-41b2-835b-7dc1922f2184

📥 Commits

Reviewing files that changed from the base of the PR and between ff46792 and 569634b.

📒 Files selected for processing (3)

src/main/java/com/netgrif/application/engine/workflow/service/sanitization/SanitizationMode.java
src/test/java/com/netgrif/application/engine/workflow/service/FieldSanitizationIntegrationTest.java
src/test/java/com/netgrif/application/engine/workflow/service/FieldSanitizationServiceTest.java

Walkthrough

Version bump from 6.4.2 to 6.4.3-SNAPSHOT. Added OWASP HTML sanitizer dependency. Introduced field sanitization feature with FieldSanitizationService supporting multiple sanitization modes and rejection actions. Integrated sanitization into DataService.setData(). Comprehensive unit and integration tests added.

Changes

Cohort / File(s)	Summary
Maven Configuration `pom.xml`	Updated project version to `6.4.3-SNAPSHOT` and added compile dependency on OWASP HTML sanitizer library (version `20260313.1`).
Service Interface `src/main/java/com/netgrif/application/engine/workflow/service/interfaces/IFieldSanitizationService.java`	New interface defining single `sanitize(String value, Field<?> field)` method for field value sanitization contract.
DataService Integration `src/main/java/com/netgrif/application/engine/workflow/service/DataService.java`	Added `IFieldSanitizationService` dependency injection and integrated sanitization into `setData()` workflow. New protected methods: `sanitizeValueIfNeeded()`, `resolveFieldForSanitization()`, `resolveSanitizationComponent()` to handle field-level sanitization configuration and component resolution.
Sanitization Service Implementation `src/main/java/com/netgrif/application/engine/workflow/service/sanitization/FieldSanitizationService.java`, `src/main/java/com/netgrif/application/engine/workflow/service/sanitization/SanitizationMode.java`, `src/main/java/com/netgrif/application/engine/workflow/service/sanitization/SanitizationAction.java`	Implemented `FieldSanitizationService` with multiple OWASP `PolicyFactory` instances for different sanitization modes. Added `SanitizationMode` enum (OFF, PLAIN_TEXT, SAFE_HTML variants, DISABLE_JAVASCRIPT) and `SanitizationAction` enum (SANITIZE, REJECT) with factory methods for string-to-enum conversion.
Unit Tests `src/test/java/com/netgrif/application/engine/workflow/service/FieldSanitizationServiceTest.java`	Comprehensive unit tests for `FieldSanitizationService` covering null handling, mode/action resolution, safe HTML preservation, property extraction, and rejection behavior across sanitization configurations.
Integration Tests `src/test/java/com/netgrif/application/engine/workflow/service/FieldSanitizationIntegrationTest.java`	End-to-end integration tests importing Petri net, creating cases, and verifying sanitization behavior across all modes (OFF, PLAIN_TEXT, SAFE_HTML variants, DISABLE_JAVASCRIPT) with SANITIZE and REJECT actions.
Test Resources `src/test/resources/sanitization_modes.xml`, `src/test/resources/data_text_validation.xml`	Added `sanitization_modes.xml` defining Petriflow test document with data fields and transitions covering all sanitization modes and actions. Minor formatting update to `data_text_validation.xml`.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant DataService
    participant FieldSanitizationService
    participant Field
    participant Case

    Client->>DataService: setData(case, task, data)
    DataService->>Field: extract newValue for each field
    
    loop For each field
        DataService->>DataService: sanitizeValueIfNeeded(case, task, field, newValue)
        alt Is TextField and value is String?
            DataService->>DataService: resolveFieldForSanitization(case, task, field)
            DataService->>DataService: resolveSanitizationComponent(case, task, field)
            DataService->>FieldSanitizationService: sanitize(newValue, field)
            FieldSanitizationService->>Field: getComponent().getProperty(SANITIZATION_MODE)
            FieldSanitizationService->>Field: getComponent().getProperty(SANITIZATION_ACTION)
            alt Mode is OFF?
                FieldSanitizationService-->>DataService: return original value
            else Mode is SANITIZE or SAFE_HTML?
                FieldSanitizationService->>FieldSanitizationService: resolvePolicy(mode)
                FieldSanitizationService->>FieldSanitizationService: apply OWASP PolicyFactory.sanitize()
                alt Action is REJECT and content changed?
                    FieldSanitizationService-->>DataService: throw IllegalArgumentException
                else
                    FieldSanitizationService-->>DataService: return sanitized value
                end
            end
        else
            DataService-->>DataService: return value unchanged
        end
        
        DataService->>Case: apply sanitized value to field
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

[NAE-2303] TaskRef Security Improvements #400: Modifies DataService.setData() with additional validation logic (runStrict, forbidden-field checks) that intersects with the sanitization integration in the main PR.
[NAE-2303] TaskRef Security Improvements #399: Modifies DataService.setData() processing pipeline (runStrict validation and field-type constraints) alongside the sanitization feature introduced in the main PR.

Suggested labels

new feature, improvement, Large

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 2.04% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title directly and accurately describes the main change: implementing field sanitization modes for text data fields, which is the core feature across all modified files.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/main/java/com/netgrif/application/engine/workflow/service/DataService.java (1)
282-317: ⚠️ Potential issue | 🟠 Major

Apply incoming component-property changes before sanitizing the submitted value.

sanitizeValueIfNeeded() reads the current component from the case/task, but the new properties from the same payload are only applied later. A single setData() call that changes sanitizationMode/sanitizationAction and value together will therefore sanitize with stale rules. That can let unsafe HTML through when tightening a field in the same request, or strip content unexpectedly in the opposite direction.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@src/main/java/com/netgrif/application/engine/workflow/service/DataService.java`
around lines 282 - 317, The sanitization is run before applying incoming
component property changes, so sanitizeValueIfNeeded() sees stale component
rules; parse the properties via parseProperties(entry.getValue()) and apply them
(call changeComponentProperties(useCase, task, field.getStringId(), properties)
or otherwise update the in-memory component/state) before calling
sanitizeValueIfNeeded(useCase, task, field, newValue); then continue with
setting value, allowedNets, filterMetadata, options, choices and adding the
outcome — ensure modified flags and outcome.addOutcome ordering reflect the
early application of properties so sanitization uses the updated
sanitizationMode/sanitizationAction.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@src/main/java/com/netgrif/application/engine/workflow/service/sanitization/SanitizationAction.java`:
- Around line 7-18: The from method in SanitizationAction should trim and
validate the input instead of silently returning SANITIZE for malformed or
whitespace-padded values; update SanitizationAction.from to first handle null
(return SANITIZE), then trim the input, return SANITIZE only if the trimmed
string is blank, otherwise compare trimmed value case-insensitively against enum
names, and if no match is found throw an IllegalArgumentException (or at least
log an error) naming the invalid value so mis-typed values like " REJECT " fail
fast and are visible.

In
`@src/main/java/com/netgrif/application/engine/workflow/service/sanitization/SanitizationMode.java`:
- Around line 14-25: Change SanitizationMode.from so it trims the input, treats
null/blank as "unspecified" (return null or the previous-compatibility sentinel
instead of defaulting to DISABLE_JAVASCRIPT), only accept DISABLE_JAVASCRIPT
when the trimmed value explicitly matches that name, and reject other invalid
values by throwing an IllegalArgumentException; i.e., in from(String) call value
= value == null ? null : value.trim(), if value==null or value.isEmpty() return
null (compat behavior), if value.equalsIgnoreCase("DISABLE_JAVASCRIPT") return
SanitizationMode.DISABLE_JAVASCRIPT, otherwise loop values() to match names
case-insensitively and return match or throw IllegalArgumentException when no
match is found.

In
`@src/test/java/com/netgrif/application/engine/workflow/service/FieldSanitizationIntegrationTest.java`:
- Around line 424-432: The two tests dataLevelSafeHtmlShouldApplyConfiguration
and dataLevelSafeHtmlShouldKeepSafeHtmlUntouched are duplicates of earlier cases
using the same field id text_safe_html and identical inputs/expectations; update
them to target a distinct data-level configured field (e.g., create/use a new
field id like text_safe_html_data_level) or explicitly document in the test
method Javadoc/inline comment that these exercise the data-level configuration
path (as opposed to transition/task-level) so the intent is clear; ensure any
new field id is wired to the data-level sanitization configuration and adjust
assertStoredEquals calls to reference that id.
- Around line 419-422: The test
disableJavascriptRejectShouldRejectNormalizedSafeMarkup currently asserts
rejection for input "<p><img src=\"https://example.com/image.png\"></p>" which
appears safe; update the test to clarify why rejection is expected: either
rename the test to something like
disableJavascriptRejectShouldRejectWhenHtmlIsNormalized (or similar) and/or add
an inline comment inside that test referencing the normalization behavior that
triggers rejection (e.g., self-closing tag or attribute normalization) and that
this is why assertRejected("text_disable_javascript_reject", ...) is used;
ensure the comment mentions the relevant behavior being tested so future readers
understand the reason for rejecting otherwise-safe markup.
- Around line 548-549: The assertion is doing an extra DB lookup by calling
getCase(stored.getStringId()) again; replace that redundant call and directly
use the already fetched Case instance stored to check the field value (i.e. use
stored.getDataField("text_plain_sanitize").getValue() in the assertNull),
keeping the same assertion intent but avoiding the unnecessary getCase
invocation.
- Line 70: Replace the manual instantiation of ObjectMapper in
FieldSanitizationIntegrationTest with the Spring-managed bean: remove "new
ObjectMapper()" and inject the application ObjectMapper (e.g., via an
`@Autowired/private` final field or constructor injection) so the test uses the
same JSON configuration and registered modules as the rest of the app.
- Around line 575-580: The createCase() method opens a FileInputStream with new
FileInputStream(XML_PATH) and passes it to petriNetService.importPetriNet
without closing it; wrap the FileInputStream creation in a try-with-resources in
createCase() so the stream is automatically closed even if importPetriNet throws
(i.e., use try (InputStream is = new FileInputStream(XML_PATH)) {
petriNetService.importPetriNet(is, VersionType.MAJOR,
superCreator.getLoggedSuper()); } and proceed to return the created Case) to
eliminate the resource leak.

In
`@src/test/java/com/netgrif/application/engine/workflow/service/FieldSanitizationServiceTest.java`:
- Around line 48-63: Tests assume plain-text stripping but
SanitizationMode.from() now defaults to DISABLE_JAVASCRIPT, so update the tests
to either explicitly set the field to plain-text mode or change
expectations/names to the new behavior: in FieldSanitizationServiceTest update
the two methods (shouldStripHtmlTagsWhenDefaultModeIsPlainText and
shouldStripDangerousAttributesWhenDefaultModeIsPlainText) to either call
field.setSanitizationMode(SanitizationMode.PLAIN_TEXT) before invoking
service.sanitize(...) (keeping existing expected strings) or rename the tests
and change expected results to reflect DISABLE_JAVASCRIPT semantics (e.g.,
expect "<b>Hello</b>  world" for the first and "test" vs "<img src=\"x\">test"
as appropriate) and reference SanitizationMode.from(), TextField, and
service.sanitize to locate the code to change.

In `@src/test/resources/sanitization_modes.xml`:
- Around line 9-15: The fixture with id "text_plain_default" lacks an explicit
sanitizationMode and thus resolves to DISABLE_JAVASCRIPT in production; update
the <data type="text"> element for id "text_plain_default" to include an
explicit <sanitizationMode>PLAIN</sanitizationMode> (or alternatively rename
id/title/placeholder to reflect DISABLE_JAVASCRIPT if you intend to adopt the
new default) so tests assert the intended plain-text stripping behavior.

---

Outside diff comments:
In
`@src/main/java/com/netgrif/application/engine/workflow/service/DataService.java`:
- Around line 282-317: The sanitization is run before applying incoming
component property changes, so sanitizeValueIfNeeded() sees stale component
rules; parse the properties via parseProperties(entry.getValue()) and apply them
(call changeComponentProperties(useCase, task, field.getStringId(), properties)
or otherwise update the in-memory component/state) before calling
sanitizeValueIfNeeded(useCase, task, field, newValue); then continue with
setting value, allowedNets, filterMetadata, options, choices and adding the
outcome — ensure modified flags and outcome.addOutcome ordering reflect the
early application of properties so sanitization uses the updated
sanitizationMode/sanitizationAction.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 52340b05-073e-4155-b1fd-75f9492704d7

📥 Commits

Reviewing files that changed from the base of the PR and between c4f79bd and ff46792.

📒 Files selected for processing (10)

pom.xml
src/main/java/com/netgrif/application/engine/workflow/service/DataService.java
src/main/java/com/netgrif/application/engine/workflow/service/interfaces/IFieldSanitizationService.java
src/main/java/com/netgrif/application/engine/workflow/service/sanitization/FieldSanitizationService.java
src/main/java/com/netgrif/application/engine/workflow/service/sanitization/SanitizationAction.java
src/main/java/com/netgrif/application/engine/workflow/service/sanitization/SanitizationMode.java
src/test/java/com/netgrif/application/engine/workflow/service/FieldSanitizationIntegrationTest.java
src/test/java/com/netgrif/application/engine/workflow/service/FieldSanitizationServiceTest.java
src/test/resources/data_text_validation.xml
src/test/resources/sanitization_modes.xml

Updated the default sanitization mode from 'DISABLE_JAVASCRIPT' to 'OFF' to align with behavior changes. Removed redundant test cases that validated HTML stripping in 'PLAIN_TEXT' mode, as it is no longer relevant. Adjusted existing tests to match the new default mode behavior.

machacjozef and others added 3 commits April 15, 2026 10:02

Update error message assertions in FieldSanitizationServiceTest

26e17b4

Revised test assertions to match the updated error message format. This ensures consistency with the new message structure indicating the configured action for unsafe content.

renczesstefan requested review from Kovy95, Retoocs, machacjozef and tuplle April 15, 2026 11:49

Retoocs reviewed Apr 15, 2026

View reviewed changes

Comment thread ...a/com/netgrif/application/engine/workflow/service/sanitization/FieldSanitizationService.java

Comment thread src/main/java/com/netgrif/application/engine/workflow/service/DataService.java

Retoocs requested changes Apr 15, 2026

View reviewed changes

Update default sanitization mode to DISABLE_JAVASCRIPT

ff46792

Changed the default fallback sanitization mode from PLAIN_TEXT to DISABLE_JAVASCRIPT in SanitizationMode and FieldSanitizationService. This ensures better protection against potential JavaScript-related vulnerabilities.

coderabbitai Bot added improvement A change that improves on an existing feature new feature A change that introduces new functionality Large labels Apr 15, 2026

coderabbitai Bot requested changes Apr 15, 2026

View reviewed changes

machacjozef changed the title ~~Nae 2408~~ [NAE-2408] Implement field sanitization modes for text data fields Apr 15, 2026

netgrif deleted a comment from Retoocs Apr 15, 2026

machacjozef requested a review from Retoocs April 15, 2026 15:25

Retoocs assigned machacjozef Apr 16, 2026

Retoocs approved these changes Apr 16, 2026

View reviewed changes

coderabbitai Bot approved these changes Apr 16, 2026

View reviewed changes

machacjozef approved these changes Apr 16, 2026

View reviewed changes

Kovy95 approved these changes Apr 16, 2026

View reviewed changes

machacjozef merged commit 4ee8eab into release/6.4.3 Apr 16, 2026
6 of 7 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[NAE-2408] Implement field sanitization modes for text data fields#432

[NAE-2408] Implement field sanitization modes for text data fields#432
machacjozef merged 5 commits intorelease/6.4.3from
NAE-2408

renczesstefan commented Apr 15, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot commented Apr 15, 2026 •

edited

Loading

Rate limit exceeded

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

renczesstefan commented Apr 15, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Dependencies

Third party dependencies

Blocking Pull requests

How Has Been This Tested?

Test Configuration

Checklist:

Summary by CodeRabbit

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot commented Apr 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rate limit exceeded

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested labels

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

renczesstefan commented Apr 15, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 15, 2026 •

edited

Loading