Skip to content

fix: Expose cached AI pdf data url to webview#954

Open
kaaviya98 wants to merge 5 commits intomasterfrom
cache-ai-pdf
Open

fix: Expose cached AI pdf data url to webview#954
kaaviya98 wants to merge 5 commits intomasterfrom
cache-ai-pdf

Conversation

@kaaviya98
Copy link
Contributor

Changes done

Reason for the changes

Fixes # .

@gemini-code-assist
Copy link

Summary of Changes

Hello @kaaviya98, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the application's ability to integrate local PDF content with web-based views, particularly for AI chat functionalities. It establishes a secure and efficient way for webviews to access cached PDF files by exposing their base64 encoded data URLs through a dedicated JavaScript interface. This change improves the user experience by allowing seamless display of offline PDF content within web components.

Highlights

  • PDF Data Exposure to Webview: Introduced a mechanism to expose locally cached PDF data URLs to webviews, enabling web-based components to access and display these files.
  • New JavaScript Interface: A CachedPdfPathProvider class was added to serve as a JavaScript interface, allowing webviews to query for cached PDF status and retrieve base64 encoded PDF data URLs.
  • Fragment Integration: The AIChatPdfFragment now initializes this JavaScript interface upon webview load, and the DocumentViewerFragment passes the cached PDF path to AIChatPdfFragment.
  • PDF Path Retrieval: The PDFDownloadManager was extended with a new method to easily retrieve the absolute path of a cached PDF.
  • Sample Configuration Update: Updated the sample application's InstituteSettings to point to "staging.testpress.in" instead of "sandbox.testpress.in".
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request aims to expose a cached PDF's data URL to a WebView for the AI chat feature. The implementation correctly passes the PDF path and uses a JavaScript interface to expose the data. However, there are a few significant issues. A critical security vulnerability is introduced by enabling broad file access for the WebView, which is unnecessary for the current implementation. Additionally, the new CachedPdfPathProvider has resource management and potential performance problems, such as not closing file streams correctly and risking OutOfMemoryError with large files. My review includes suggestions to fix these issues.

Comment on lines +73 to +76
webViewFragment.webView.settings.apply {
allowFileAccessFromFileURLs = true
allowUniversalAccessFromFileURLs = true
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

Enabling allowFileAccessFromFileURLs and allowUniversalAccessFromFileURLs poses a significant security risk, especially when loading content from a remote URL. These settings could allow malicious JavaScript on the web page to access local files on the user's device.

Since the PDF data is being provided to the WebView via a JavaScript interface as a Base64-encoded data URL, direct file system access from the WebView is not necessary. Please remove these settings to enhance security.

Comment on lines +21 to +35
fun getBase64PdfData(): String {
return if (isPDFCached()) {
try {
val file = File(pdfPath)
val inputStream = FileInputStream(file)
val bytes = inputStream.readBytes()
inputStream.close()
Base64.encodeToString(bytes, Base64.DEFAULT)
} catch (e: Exception) {
""
}
} else {
""
}
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

This method has a few areas for improvement:

  1. Resource Handling: The FileInputStream is not guaranteed to be closed if an exception occurs during readBytes() or encodeToString(). It's safer to use the use extension function, which automatically closes the stream.
  2. Exception Handling: Catching a generic Exception is too broad. It's better to catch more specific exceptions, like IOException, to handle file-related errors more precisely.
  3. Memory Usage: The readBytes() function reads the entire file into a byte array in memory. This can lead to an OutOfMemoryError if the PDF file is large. While addressing this might require a larger design change, we should at least ensure resources are handled correctly.

Here's a suggested refactoring to address the resource and exception handling:

Suggested change
fun getBase64PdfData(): String {
return if (isPDFCached()) {
try {
val file = File(pdfPath)
val inputStream = FileInputStream(file)
val bytes = inputStream.readBytes()
inputStream.close()
Base64.encodeToString(bytes, Base64.DEFAULT)
} catch (e: Exception) {
""
}
} else {
""
}
}
fun getBase64PdfData(): String {
return if (isPDFCached()) {
try {
File(pdfPath).inputStream().use {
val bytes = it.readBytes()
Base64.encodeToString(bytes, Base64.DEFAULT)
}
} catch (e: java.io.IOException) {
// It's good practice to log the exception for debugging purposes.
""
}
} else {
""
}
}

Comment on lines +16 to +18
fun isPDFCached(): Boolean {
return !pdfPath.isEmpty() && File(pdfPath).exists()
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

For better code clarity and to follow idiomatic Kotlin conventions, it's recommended to use isNotEmpty() instead of !pdfPath.isEmpty().

Suggested change
fun isPDFCached(): Boolean {
return !pdfPath.isEmpty() && File(pdfPath).exists()
}
fun isPDFCached(): Boolean {
return pdfPath.isNotEmpty() && File(pdfPath).exists()
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant