Description
Describe the bug
When using a DirectoryClient created by a FileSystemClient, the directory path gets URL-encoded twice when calling ListPaths.
For example, if the directory path dataset/timestamp=2026-01-25T0000+1300/v3 is passed to GetDirectoryClient:
- It will be encoded to dataset/timestamp=2026-01-25T0000%2B1300/v3 (via UrlEncodePath) before being stored in m_pathUrl of the DirectoryClient.
- In ListPaths, the directory path is extracted from m_pathUrl (without decoding) and assigned to the Path variable of the options (ListFileSystemPathsOptions) passed to the REST client.
- The REST client (re-)encodes the path as the directory query parameter to dataset/timestamp=2026-01-25T0000%252B1300/v3, this time with UrlEncodeQueryParameter.
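The sequence above can be reproduced without the SDK. The following is a minimal sketch, not the library's code: PercentEncode is a hypothetical stand-in for UrlEncodePath and UrlEncodeQueryParameter, and the set of characters it leaves untouched ("/" and "=") is an assumption chosen only to match the strings quoted in this report.

```cpp
#include <string>

// Hypothetical stand-in for the SDK encoders (NOT the real implementation):
// percent-encodes every byte that is neither an RFC 3986 unreserved
// character nor listed in `keep`.
std::string PercentEncode(const std::string& input, const std::string& keep)
{
  static const char* hex = "0123456789ABCDEF";
  std::string out;
  for (unsigned char c : input)
  {
    const bool unreserved = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
        || (c >= '0' && c <= '9') || c == '-' || c == '.' || c == '_' || c == '~';
    if (unreserved || keep.find(static_cast<char>(c)) != std::string::npos)
    {
      out += static_cast<char>(c);
    }
    else
    {
      out += '%';
      out += hex[c >> 4];
      out += hex[c & 0x0F];
    }
  }
  return out;
}

// Mimics the reported flow: encode once when the client is created, then
// encode the already-encoded value again when building the query parameter.
std::string DoubleEncodedDirectory(const std::string& path)
{
  const std::string stored = PercentEncode(path, "/="); // kept in m_pathUrl
  return PercentEncode(stored, "/="); // '%' becomes "%25" on the second pass
}
```

With this sketch, the second pass turns the "%" of "%2B" into "%25", which is how "+" ends up as "%252B" in the directory query parameter.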
To Reproduce
Call GetDirectoryClient with a directory path that has characters that would be encoded by UrlEncodePath (e.g. +), and then call ListPaths.
Expected behavior
The directory path should be decoded before re-encoding as a query parameter.
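As an illustration of the expected behavior, here is a minimal, self-contained sketch of decoding before re-encoding. PercentDecode, PercentEncode, and DirectoryQueryValue are hypothetical helpers written for this example, not SDK functions, and the characters left unencoded ("/" and "=") are an assumption based on the strings shown in this report.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: turns each valid "%XX" sequence back into its byte.
std::string PercentDecode(const std::string& input)
{
  auto hexValue = [](char c) -> int {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
  };
  std::string out;
  for (std::size_t i = 0; i < input.size(); ++i)
  {
    if (input[i] == '%' && i + 2 < input.size())
    {
      const int hi = hexValue(input[i + 1]);
      const int lo = hexValue(input[i + 2]);
      if (hi >= 0 && lo >= 0)
      {
        out += static_cast<char>(hi * 16 + lo);
        i += 2; // skip the two hex digits just consumed
        continue;
      }
    }
    out += input[i]; // not a valid escape: copy the character as-is
  }
  return out;
}

// Hypothetical encoder: escapes everything except RFC 3986 unreserved
// characters and the characters listed in `keep`.
std::string PercentEncode(const std::string& input, const std::string& keep)
{
  static const char* hex = "0123456789ABCDEF";
  std::string out;
  for (unsigned char c : input)
  {
    const bool unreserved = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
        || (c >= '0' && c <= '9') || c == '-' || c == '.' || c == '_' || c == '~';
    if (unreserved || keep.find(static_cast<char>(c)) != std::string::npos)
      out += static_cast<char>(c);
    else
    {
      out += '%';
      out += hex[c >> 4];
      out += hex[c & 0x0F];
    }
  }
  return out;
}

// Decode the stored (already encoded) path first, then encode it once.
std::string DirectoryQueryValue(const std::string& storedPath)
{
  return PercentEncode(PercentDecode(storedPath), "/=");
}
```

In this sketch, passing the stored path dataset/timestamp=2026-01-25T0000%2B1300/v3 through DirectoryQueryValue yields the same singly encoded string rather than the %252B form.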
Setup (please complete the following information):
- Version of the Library used: used via pyarrow and duckdb-azure; the code from the latest release (azure-storage-files-datalake_12.14.0) still has the issue.
Additional context
I'm hitting this bug via both Apache Arrow (AzureFileSystem) and duckdb-azure which both use the azure-sdk-for-cpp library.
Information Checklist
Kindly make sure that you have added all the following information above and check off the required fields; otherwise we will treat the issue as an incomplete report
- Bug Description Added
- Repro Steps Added
- Setup information Added