diff --git a/README.md b/README.md
index 70f2f158e2f4..0735dae57cdc 100644
--- a/README.md
+++ b/README.md
@@ -42,7 +42,7 @@ This is a great place to meet other contributors and get guidance on where to co
 However, all technical designs should also be recorded and formalized in GitHub issues, so that they are accessible to everyone.
 In Slack, find us in the `#arrow-rust` channel and feel free to ask for an invite via Discord, GitHub issues, or other means.

-There is more information in the [contributing] guide.
+There is more information in the [contributing] guide and the [security] policy.

 ## Repository Structure

@@ -186,3 +186,4 @@ You can find more details about each crate in their respective READMEs.
 [issues]: https://github.com/apache/arrow-rs/issues
 [pull requests]: https://github.com/apache/arrow-rs/pulls
 [discussions]: https://github.com/apache/arrow-rs/discussions
+[security]: SECURITY.md
diff --git a/SECURITY.md b/SECURITY.md
new file mode 100644
index 000000000000..da071d6ed6ba
--- /dev/null
+++ b/SECURITY.md
@@ -0,0 +1,49 @@
+
+
+# Security Policy
+
+This document outlines the security model for the Rust implementation of Apache Arrow (`arrow-rs`) and how to report vulnerabilities.
+
+## Security Model
+
+The `arrow-rs` project follows the [Apache Arrow Security Model]. Key aspects include:
+- Reading data from untrusted sources (e.g., over a network or from a file) requires explicit validation.
+- Failure to validate untrusted data before use may lead to security issues. This implementation provides APIs to validate Arrow data. For example, [`ArrayData::validate_full`] can be used to ensure that data conforms to the Arrow specification.
+
+## Rust Safety and Undefined Behavior
+
+We strive to uphold the [Rust Soundness Pledge].
+
+- **Undefined Behavior (UB) is a bug:** Any instance of UB is a bug we are committed to fixing.
+- **UB as a Security Issue:** Any **exploitable** UB triggered via safe APIs is a security issue. Other UB instances are bugs, and we welcome help fixing them.
+
+## Reporting a Vulnerability
+
+**Do not file a public issue.** Follow the [ASF security reporting process] by emailing [security@apache.org](mailto:security@apache.org).
+
+Include in your report:
+- A clear description and minimal reproducer.
+- Affected crates and versions.
+- Potential impact.
+
+[Apache Arrow Security Model]: https://arrow.apache.org/docs/dev/format/Security.html
+[`ArrayData::validate_full`]: https://docs.rs/arrow/latest/arrow/array/struct.ArrayData.html#method.validate_full
+[Rust Soundness Pledge]: https://raphlinus.github.io/rust/2020/01/18/soundness-pledge.html
+[ASF security reporting process]: https://www.apache.org/security/#reporting-a-vulnerability
diff --git a/arrow-avro/README.md b/arrow-avro/README.md
index c5776c125b0a..dbc1e1760ea3 100644
--- a/arrow-avro/README.md
+++ b/arrow-avro/README.md
@@ -212,9 +212,11 @@ async fn main() -> anyhow::Result<()> {
 * **Confluent Schema Registry wire format**: 1‑byte magic `0x00` + 4‑byte BE schema ID + Avro body; supports decode + encode helpers.
 * **Avro Single‑Object Encoding (SOE)**: 2‑byte magic `0xC3 0x01` + 8‑byte LE CRC‑64‑AVRO fingerprint + Avro body; supports decode + encode helpers.

----
+## Security
+
+See the [Security Policy] for information on the security model and how to report vulnerabilities.

-## Examples
+[Security Policy]: https://github.com/apache/arrow-rs/blob/main/SECURITY.md

 * Read/write OCF in memory and from files (see crate docs “OCF round‑trip”).
 * Confluent wire‑format and SOE quickstarts are provided as runnable snippets in docs.
diff --git a/arrow-csv/README.md b/arrow-csv/README.md
new file mode 100644
index 000000000000..e4d3dd87c227
--- /dev/null
+++ b/arrow-csv/README.md
@@ -0,0 +1,33 @@
+
+
+# `arrow-csv`
+
+Support for reading/writing CSV files to/from [Apache Arrow].
+
+See the [main repository README] and the [API documentation] for more details.
+
+## Security
+
+See the [Security Policy] for information on the security model and how to report vulnerabilities.
+
+[Apache Arrow]: https://arrow.apache.org/
+[main repository README]: https://github.com/apache/arrow-rs
+[API documentation]: https://docs.rs/arrow-csv/latest
+[Security Policy]: https://github.com/apache/arrow-rs/blob/main/SECURITY.md
diff --git a/arrow-flight/README.md b/arrow-flight/README.md
index 1cd8f5cfe21b..a7d5e49cc261 100644
--- a/arrow-flight/README.md
+++ b/arrow-flight/README.md
@@ -81,4 +81,8 @@ $ flight_sql_client --host example.com statement-query "SELECT 1;"
 +----------+
 ```

-[apache arrow flightsql]: https://arrow.apache.org/docs/format/FlightSql.html
+## Security
+
+See the [Security Policy] for information on the security model and how to report vulnerabilities.
+
+[security policy]: https://github.com/apache/arrow-rs/blob/main/SECURITY.md
diff --git a/arrow-ipc/README.md b/arrow-ipc/README.md
new file mode 100644
index 000000000000..bc8c563d9cec
--- /dev/null
+++ b/arrow-ipc/README.md
@@ -0,0 +1,34 @@
+
+
+# `arrow-ipc`
+
+Support for reading/writing files and streams of the [Arrow IPC Format] to/from [Apache Arrow].
+
+See the [main repository README] and the [API documentation] for more details.
+
+## Security
+
+See the [Security Policy] for information on the security model and how to report vulnerabilities.
+
+[Apache Arrow]: https://arrow.apache.org/
+[Arrow IPC Format]: https://arrow.apache.org/docs/format/Columnar.html#format-ipc
+[main repository README]: https://github.com/apache/arrow-rs
+[API documentation]: https://docs.rs/arrow-ipc/latest
+[Security Policy]: https://github.com/apache/arrow-rs/blob/main/SECURITY.md
diff --git a/arrow-json/README.md b/arrow-json/README.md
new file mode 100644
index 000000000000..ea790d6b9f73
--- /dev/null
+++ b/arrow-json/README.md
@@ -0,0 +1,33 @@
+
+
+# `arrow-json`
+
+Support for reading and writing JSON to/from [Apache Arrow].
+
+See the [main repository README] and the [API documentation] for more details.
+
+## Security
+
+See the [Security Policy] for information on the security model and how to report vulnerabilities.
+
+[Apache Arrow]: https://arrow.apache.org/
+[main repository README]: https://github.com/apache/arrow-rs
+[API documentation]: https://docs.rs/arrow-json/latest
+[Security Policy]: https://github.com/apache/arrow-rs/blob/main/SECURITY.md
diff --git a/arrow/README.md b/arrow/README.md
index 7c55932d2f3e..fd47f0d8ab00 100644
--- a/arrow/README.md
+++ b/arrow/README.md
@@ -76,32 +76,21 @@ The `arrow` crate provides the following features which may be enabled in your `

 The [Apache Arrow Status](https://arrow.apache.org/docs/status.html) page lists which features of Arrow this crate supports.

-## Safety

-Arrow seeks to uphold the Rust Soundness Pledge as articulated eloquently [here](https://raphlinus.github.io/rust/2020/01/18/soundness-pledge.html). Specifically:
+## Safety and Security

-> The intent of this crate is to be free of soundness bugs. The developers will do their best to avoid them, and welcome help in analyzing and fixing them
+`arrow-rs` follows the [Apache Arrow Security Model]. Any **exploitable** instance of undefined behavior (UB) triggered via safe APIs is a security issue. See our [Security Policy] for reporting.

-Where soundness in turn is defined as:
+We uphold the [Rust Soundness Pledge], aiming to be free of UB from safe APIs. While `unsafe` is used for performance or FFI, we mitigate risk through:
+- Strongly-typed `Array` and `ArrayBuilder` APIs.
+- Extensive `ArrayData` validation for untrusted sources.
+- [MIRI] verification in CI.
+- A `force_validate` feature for extra checks.

-> Code is unable to trigger undefined behavior using safe APIs
-
-One way to ensure this would be to not use `unsafe`, however, as described in the opening chapter of the [Rustonomicon](https://doc.rust-lang.org/nomicon/meet-safe-and-unsafe.html) this is not a requirement, and flexibility in this regard is one of Rust's great strengths.
-
-In particular there are a number of scenarios where `unsafe` is largely unavoidable:
-
-- Invariants that cannot be statically verified by the compiler and unlock non-trivial performance wins, e.g. values in a StringArray are UTF-8, [TrustedLen](https://doc.rust-lang.org/std/iter/trait.TrustedLen.html) iterators, etc...
-- FFI
-
-Additionally, this crate exposes a number of `unsafe` APIs, allowing downstream crates to explicitly opt-out of potentially expensive invariant checking where appropriate.
-
-We have a number of strategies to help reduce this risk:
-
-- Provide strongly-typed `Array` and `ArrayBuilder` APIs to safely and efficiently interact with arrays
-- Extensive validation logic to safely construct `ArrayData` from untrusted sources
-- All commits are verified using [MIRI](https://github.com/rust-lang/miri) to detect undefined behaviour
-- Use a `force_validate` feature that enables additional validation checks for use in test/debug builds
-- There is ongoing work to reduce and better document the use of unsafe, and we welcome contributions in this space
+[Rust Soundness Pledge]: https://raphlinus.github.io/rust/2020/01/18/soundness-pledge.html
+[MIRI]: https://github.com/rust-lang/miri
+[Apache Arrow Security Model]: https://arrow.apache.org/docs/dev/format/Security.html
+[Security Policy]: https://github.com/apache/arrow-rs/blob/main/SECURITY.md

 ## Building for WASM

diff --git a/arrow/src/lib.rs b/arrow/src/lib.rs
index f9b0c717f0b3..c38ab9713ed2 100644
--- a/arrow/src/lib.rs
+++ b/arrow/src/lib.rs
@@ -335,14 +335,24 @@
 //! * [`parquet`](https://docs.rs/parquet) - support for [Apache Parquet]
 //! * [`arrow-avro`](https://docs.rs/arrow-avro) - support for [Apache Avro]
 //!
-//! # Safety and Security
+//! # Security
 //!
-//! Like many crates, this crate makes use of unsafe where prudent. However, it endeavours to be
-//! sound. Specifically, **it should not be possible to trigger undefined behaviour using safe APIs.**
+//! This project follows the [Apache Arrow Security Model]. Any exploitable
+//! instance of undefined behavior using `safe` APIs and having a clear explanation
+//! or reproducer is considered a security issue.
 //!
-//! If you think you have found an instance where this is possible, please file
-//! a ticket in our [issue tracker] and it will be triaged and fixed. For more information on
-//! arrow's use of unsafe, see [here](https://github.com/apache/arrow-rs/tree/main/arrow#safety).
+//! If you think you have found a security vulnerability or a soundness bug,
+//! please follow the instructions in our [security policy] for reporting.
+//!
+//! # Safety
+//!
+//! Like many crates, this crate makes use of `unsafe` where prudent. However, it endeavors to be
+//! sound. Specifically, **it should not be possible to trigger undefined behavior using safe APIs.**
+//!
+//! For more information on the use of unsafe, see [here](https://github.com/apache/arrow-rs/tree/main/arrow#safety-and-security).
+//!
+//! [Apache Arrow Security Model]: https://arrow.apache.org/docs/dev/format/Security.html
+//! [security policy]: https://github.com/apache/arrow-rs/blob/main/SECURITY.md
 //!
 //! # Higher-level Processing
 //!
diff --git a/parquet/README.md b/parquet/README.md
index 9e4e91d85d73..8fb48856fe79 100644
--- a/parquet/README.md
+++ b/parquet/README.md
@@ -79,6 +79,12 @@ information on the status of this implementation.
 [implementation status page]: https://parquet.apache.org/docs/file-format/implementationstatus/
 [apache parquet]: https://parquet.apache.org/

+## Security
+
+See the [Security Policy] for information on the security model and how to report vulnerabilities.
+
+[security policy]: https://github.com/apache/arrow-rs/blob/main/SECURITY.md
+
 ## License

 Licensed under the Apache License, Version 2.0: .
diff --git a/parquet_derive/README.md b/parquet_derive/README.md
index 783c71abd599..b7a23f5a7245 100644
--- a/parquet_derive/README.md
+++ b/parquet_derive/README.md
@@ -144,6 +144,12 @@ To compile and test doctests, run `cargo test --doc -- --show-output`
 To build documentation, run `cargo doc --no-deps`.
 To compile and view in the browser, run `cargo doc --no-deps --open`.

+## Security
+
+See the [Security Policy] for information on the security model and how to report vulnerabilities.
+
+[Security Policy]: https://github.com/apache/arrow-rs/blob/main/SECURITY.md
+
 ## License

 Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0.
\ No newline at end of file