Skip to content

add spark runner documentation#367

Open
leroyjb wants to merge 20 commits intogoogle:mainfrom
leroyjb:main
Open

add spark runner documentation#367
leroyjb wants to merge 20 commits intogoogle:mainfrom
leroyjb:main

Conversation

@leroyjb
Copy link
Copy Markdown

@leroyjb leroyjb commented Nov 19, 2025

No description provided.

@leroyjb leroyjb marked this pull request as ready for review November 20, 2025 13:18
Copy link
Copy Markdown
Collaborator

@RamSaw RamSaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thank you! Let's fix small nits, I will take another look and submit likely.

Comment thread .github/workflows/maven.yml
Comment thread .github/workflows/maven.yml Outdated
Comment thread .github/workflows/maven.yml Outdated
Comment thread examples/pipelinedp4j/README.md Outdated
Comment thread examples/pipelinedp4j/README.md Outdated
Comment thread examples/pipelinedp4j/WORKSPACE.bazel Outdated
],

)
maven_install(
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why should it be a separate maven install?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

discussed offline: it is because we need different scala version here (2.12) and above we use (2.13);

let's add this comment in code

"org.scala-lang:scala-library:%s" % SCALA_LIBRARY_TAG,
"info.picocli:picocli:4.7.6",
# For Apache Spark Runner testing locally
"org.apache.spark:spark-streaming_%s:%s" % (2.12, SPARK_TAG),
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why it should be here and the other deps in separate maven install?

Comment thread examples/pipelinedp4j/beam/pom.xml Outdated
Comment thread examples/pipelinedp4j/beam/pom.xml Outdated
@@ -0,0 +1,36 @@
# Copyright 2024 Google LLC
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is already examples/pipelinedp4j/beam/src/main/java/com/google/privacy/differentialprivacy/pipelinedp4j/examples/BUILD.bazel

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

has to be reverted

<?xml version="1.0" encoding="UTF-8"?>

<!--
Copyright 2024 Google LLC
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you revert these formatting changes for better diff?

Comment thread examples/pipelinedp4j/README.md
@@ -0,0 +1,36 @@
# Copyright 2024 Google LLC
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

has to be reverted


View the results with `cat output-spark-runner.txt`.

# Build and execute locally
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

header has to be smaller?

@@ -159,6 +159,53 @@ For Spark the output is written to a folder and the
result is stored in a file whose name starts with `part-00000`: `cat
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add exaplanations how to run with bazel

@@ -159,6 +159,53 @@ For Spark the output is written to a folder and the
result is stored in a file whose name starts with `part-00000`: `cat
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: could you add a note somewhere in the beginning of bazel and maven instructions to reference yml files saying that these yml files for CI essential show how we build these examples test.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants