[New Feature]: Add BrowserOS MCP support for better agent-controlled browsing

### Target Component

External Integrations (LLM/Search APIs)

### Enhancement Description

PentAGI currently has a browser tool that uses the external scraper service to fetch:

* Markdown content
* HTML content
* Links
* Screenshots

This is useful for basic extraction, but it is still mostly a request/response scraper flow. For modern web apps, the agent may need a more interactive browser: click buttons, fill forms, handle **login pages**, wait for JavaScript, **inspect dynamic UI**, and extract data after navigation.

Can we add optional **BrowserOS MCP integration** so PentAGI agents can control a real browser through MCP?

### Current behavior

From reviewing the source code and docs, it looks like the current browser tool:

* lives in `backend/pkg/tools/browser.go`
* uses configured scraper URLs
* calls scraper endpoints like `/markdown`, `/html`, `/links`, and `/screenshot`
* captures screenshots in parallel with content extraction
* supports public/private scraper routing
* returns browser errors as text instead of hard-failing the agent chain

This works for simple pages, but it is limited when the task needs real browser interaction. 🥹

### Proposed feature

Add BrowserOS as an optional MCP browser backend. 💯 

When enabled, PentAGI agents should be able to use BrowserOS MCP tools for actions like:

* open a URL
* click elements
* type into inputs
* submit forms
* wait for page changes
* extract selected page content
* capture screenshots
* inspect visible UI state
* navigate multi-step websites

### Why BrowserOS?

BrowserOS is an [open-source](https://github.com/browseros-ai/BrowserOS) AI browser with a built-in MCP server. It allows external AI clients to control the browser using MCP.

This could make PentAGI browsing easier and more reliable than relying only on the existing scraper-style flow, especially on JavaScript-heavy or multi-step pages.

### Suggested config

```env
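# Which backend the browser tool uses.
# options: scraper (current behavior), browseros_mcp, hybrid
BROWSER_BACKEND=scraper

# Example of an opted-in setup; the shipped default should be false.
BROWSEROS_MCP_ENABLED=true
BROWSEROS_MCP_TRANSPORT=http
# PORT is a placeholder for the port the BrowserOS MCP server listens on.
BROWSEROS_MCP_URL=http://host.docker.internal:PORT
BROWSEROS_MCP_REQUIRE_APPROVAL=false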
```
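
A rough sketch of how these variables could be parsed and validated on startup. Everything here is an assumption for illustration: `BrowserConfig`, `LoadBrowserConfig`, and the rule that non-scraper modes require `BROWSEROS_MCP_ENABLED=true` are not part of PentAGI today.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// BrowserConfig is a hypothetical struct; PentAGI's real config may differ.
type BrowserConfig struct {
	Backend         string // "scraper", "browseros_mcp", or "hybrid"
	MCPEnabled      bool
	MCPTransport    string
	MCPURL          string
	RequireApproval bool
}

// LoadBrowserConfig reads the suggested variables through getenv
// (os.Getenv in production, a map lookup in tests).
func LoadBrowserConfig(getenv func(string) string) (BrowserConfig, error) {
	cfg := BrowserConfig{
		Backend:      getenv("BROWSER_BACKEND"),
		MCPTransport: getenv("BROWSEROS_MCP_TRANSPORT"),
		MCPURL:       getenv("BROWSEROS_MCP_URL"),
	}
	if cfg.Backend == "" {
		cfg.Backend = "scraper" // keep the current behavior by default
	}
	// Unset or malformed booleans are treated as false, so the feature stays off.
	cfg.MCPEnabled, _ = strconv.ParseBool(getenv("BROWSEROS_MCP_ENABLED"))
	cfg.RequireApproval, _ = strconv.ParseBool(getenv("BROWSEROS_MCP_REQUIRE_APPROVAL"))

	switch cfg.Backend {
	case "scraper":
		return cfg, nil // MCP settings are ignored in scraper mode
	case "browseros_mcp", "hybrid":
		// validated below
	default:
		return cfg, fmt.Errorf("unknown BROWSER_BACKEND %q", cfg.Backend)
	}
	if !cfg.MCPEnabled {
		return cfg, fmt.Errorf("BROWSER_BACKEND=%s requires BROWSEROS_MCP_ENABLED=true", cfg.Backend)
	}
	u, err := url.Parse(cfg.MCPURL)
	if err != nil || u.Host == "" {
		return cfg, fmt.Errorf("invalid BROWSEROS_MCP_URL %q", cfg.MCPURL)
	}
	return cfg, nil
}
```

Calling `LoadBrowserConfig(os.Getenv)` with nothing set yields scraper mode with MCP off, which matches the "disabled by default" goal. A URL that still contains the literal `PORT` placeholder is rejected, because `url.Parse` does not accept a non-numeric port.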

### Suggested modes

#### 1. Scraper mode

Current behavior.

Use existing scraper endpoints:

* `/markdown`
* `/html`
* `/links`
* `/screenshot`

#### 2. BrowserOS MCP mode 🥇 

Use [BrowserOS MCP](https://docs.browseros.com/features/use-with-claude-code) as the main browser tool.

The agent can perform interactive browsing actions instead of only extracting static content.

#### 3. Hybrid mode

Use the current scraper first for simple extraction.

If the scraper fails or the task needs interaction, fall back to BrowserOS MCP.

Example (just my idea):

```text
browser markdown extraction failed
→ try BrowserOS MCP
→ open URL
→ wait for page load
→ extract visible text
→ capture screenshot
→ return result to agent
```

### Example agent flow

Current flow:

```text
Agent asks browser tool for URL markdown
PentAGI calls scraper /markdown
PentAGI returns extracted text
```

New BrowserOS MCP flow:

```text
Agent asks browser tool to inspect target
PentAGI sends MCP call to BrowserOS
BrowserOS opens the page
Agent clicks / types / navigates if needed
BrowserOS returns page text, DOM info, or screenshot
PentAGI stores the result in the flow
```
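
For the "PentAGI sends MCP call to BrowserOS" step: MCP tool invocations are JSON-RPC 2.0 requests with the method `tools/call`. The sketch below only builds the request body. The type and function names are hypothetical, and a real client would also need the MCP `initialize` handshake and transport handling, or an existing MCP client library.

```go
package main

import "encoding/json"

// toolCall mirrors the JSON-RPC 2.0 envelope MCP uses for tool invocations.
type toolCall struct {
	JSONRPC string         `json:"jsonrpc"`
	ID      int            `json:"id"`
	Method  string         `json:"method"`
	Params  toolCallParams `json:"params"`
}

type toolCallParams struct {
	Name      string         `json:"name"`      // tool name as advertised by the server
	Arguments map[string]any `json:"arguments"` // tool-specific arguments
}

// NewToolCall builds the request body for a single MCP "tools/call".
func NewToolCall(id int, tool string, args map[string]any) ([]byte, error) {
	return json.Marshal(toolCall{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params:  toolCallParams{Name: tool, Arguments: args},
	})
}
```

For example, `NewToolCall(1, "browser_navigate", map[string]any{"url": "https://example.com"})` would produce a navigate request. Note that `browser_navigate` is a made-up tool name; the real names should come from BrowserOS's `tools/list` response.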

### Possible implementation idea

Add a browser backend interface:

```go
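import "context"

// BrowserBackend abstracts the scraper and BrowserOS MCP implementations.
type BrowserBackend interface {
    // One-shot extraction, mirroring the existing scraper endpoints.
    // The two string results are presumably the content and a screenshot
    // reference; the exact meaning is still open.
    Markdown(ctx context.Context, url string) (string, string, error)
    HTML(ctx context.Context, url string) (string, string, error)
    Links(ctx context.Context, url string) (string, string, error)

    // Interactive actions on the current page. A scraper-only backend
    // would likely have to return a "not supported" error for these.
    Navigate(ctx context.Context, url string) error
    Click(ctx context.Context, selector string) error
    Type(ctx context.Context, selector string, text string) error
    Screenshot(ctx context.Context) (string, error)
    Extract(ctx context.Context) (string, error)
}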
```

Then keep the existing scraper implementation as one backend and add BrowserOS MCP as another backend.

### Benefits 👍🏻 

* Better support for JavaScript-heavy websites
* Better handling of login pages and forms
* More natural browser automation for agents
* Less need for brittle HTML scraping
* Easier screenshots and visual inspection
* Reuses the growing MCP ecosystem
* Fits well with the existing MCP client proposal

### Safety considerations

Because PentAGI is a pentesting tool, BrowserOS MCP should be controlled safely:

* disabled by default
* configurable per deployment
* log every browser action
* optionally require approval for form submission or credential use
* keep browser sessions isolated per flow
* avoid leaking cookies/session data between flows
* allow admins to restrict BrowserOS to approved targets only (this one is a must)
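
A minimal sketch of how the target restriction and approval checks could be enforced before each browser action. `Policy`, `Check`, and the action names `submit` and `type_credentials` are all hypothetical; logging and per-flow session isolation are left out.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// Policy is a hypothetical per-deployment guard for BrowserOS actions.
type Policy struct {
	AllowedHosts    []string // exact hostnames, or ".example.com" for subdomains
	RequireApproval bool
}

// sensitive lists the (made-up) actions that need approval when
// RequireApproval is set.
var sensitive = map[string]bool{"submit": true, "type_credentials": true}

// Check returns nil when the action may proceed against rawURL.
// An empty AllowedHosts list denies everything.
func (p Policy) Check(action, rawURL string, approved bool) error {
	u, err := url.Parse(rawURL)
	if err != nil || u.Hostname() == "" {
		return fmt.Errorf("invalid target URL %q", rawURL)
	}
	host := strings.ToLower(u.Hostname())
	allowed := false
	for _, h := range p.AllowedHosts {
		h = strings.ToLower(h)
		if host == h || (strings.HasPrefix(h, ".") && strings.HasSuffix(host, h)) {
			allowed = true
			break
		}
	}
	if !allowed {
		return fmt.Errorf("target %q is not on the approved list", host)
	}
	if p.RequireApproval && sensitive[action] && !approved {
		return fmt.Errorf("action %q on %q requires approval", action, host)
	}
	return nil
}
```

Running the check on every action, not just the first navigation, matters here because a click or redirect can move the browser to a host outside the approved scope.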

### Acceptance criteria

* PentAGI can connect to BrowserOS MCP as an optional browser backend.
* Agents can use BrowserOS for interactive browser actions.
* Existing scraper behavior still works.
* Hybrid fallback from scraper to BrowserOS is supported.
* Screenshots and extracted content are stored in the normal PentAGI flow logs.
* Browser actions are visible in logs/observability.


### Technical Details

_No response_

### Designs and Mockups

_No response_

### Alternative Solutions

_No response_

### Verification

- [x] I have checked that this enhancement hasn't been already proposed
- [x] This enhancement aligns with PentAGI's goal of autonomous penetration testing
- [x] I have considered the security implications of this enhancement
- [x] I have provided clear use cases and benefits
