VulnAI is an AI-powered Static Application Security Testing (SAST) engine that uses machine learning to detect vulnerabilities in source code. Built on a transformer model (CodeBERT), it classifies findings by CWE category and explains them by highlighting the code the model attended to.
- ML-Based Detection: Transformer-based vulnerability classification
- Multi-Language Support: Python, Java, JavaScript, TypeScript, C/C++
- 10+ Vulnerability Categories: SQL Injection, XSS, Code Injection, and more
- REST API: FastAPI-based detection service
- CLI Tool: Easy command-line scanning
- Vector Database: Similarity search for vulnerability intelligence
- False Positive Reduction: Rule-based filtering + taint analysis
- Explainable AI: Attention visualization for vulnerability highlighting
┌─────────────────────────────────────────────────────────────────────────────┐
│                     AI-Powered SAST Engine Architecture                     │
└─────────────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────────┐
│                            USER INTERFACE LAYER                             │
│  ┌─────────────────┐                   ┌─────────────────────────────────┐  │
│  │    REST API     │                   │       CLI Detection Tool        │  │
│  │    (FastAPI)    │                   │     (Python-based Scanner)      │  │
│  └────────┬────────┘                   └────────────────┬────────────────┘  │
└───────────┼─────────────────────────────────────────────┼───────────────────┘
            │                                             │
            ▼                                             ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                            DETECTION ENGINE LAYER                           │
│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────────────────┐  │
│  │   Code Parser   │  │  AST Generator  │  │      Rule-Based Filter      │  │
│  │  (Multi-lang)   │  │  (Tree-sitter)  │  │ (False Positive Reduction)  │  │
│  └────────┬────────┘  └────────┬────────┘  └──────────────┬──────────────┘  │
│           │                    │                          │                 │
│           └──────────┬─────────┘                          │                 │
│                      ▼                                    │                 │
│  ┌────────────────────────────────────────────────────────┴──────────────┐  │
│  │                        MODEL INFERENCE ENGINE                         │  │
│  │  ┌─────────────────┐  ┌─────────────────┐  ┌───────────────────────┐  │  │
│  │  │    CodeBERT     │  │   Similarity    │  │   Output Formatter    │  │  │
│  │  │    Embedding    │  │     Search      │  │    (JSON Results)     │  │  │
│  │  └────────┬────────┘  └────────┬────────┘  └───────────┬───────────┘  │  │
│  └───────────┼────────────────────┼───────────────────────┼──────────────┘  │
└──────────────┼────────────────────┼───────────────────────┼─────────────────┘
               │                    │                       │
               ▼                    ▼                       ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                      VULNERABILITY INTELLIGENCE LAYER                       │
│  ┌───────────────────────────────────────────────────────────────────────┐  │
│  │                    PostgreSQL + pgvector Database                     │  │
│  │  ┌──────────────────────┐   ┌──────────────────────────────────────┐  │  │
│  │  │ vulnerabilities      │   │ detected_issues                      │  │  │
│  │  │ - id                 │   │ - id                                 │  │  │
│  │  │ - cwe_id             │   │ - file_name                          │  │  │
│  │  │ - name               │   │ - line_number                        │  │  │
│  │  │ - description        │   │ - detected_cwe                       │  │  │
│  │  │ - severity           │   │ - confidence                         │  │  │
│  │  │ - remediation        │   │ - timestamp                          │  │  │
│  │  │ - embedding_vector   │   │                                      │  │  │
│  │  └──────────────────────┘   └──────────────────────────────────────┘  │  │
│  └───────────────────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────────────┘
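
The similarity search in the diagram can be pictured as ranking stored `embedding_vector` values against the embedding of the code under analysis. The following is a minimal sketch of that idea in plain Python, not the project's implementation: `VulnerabilityRecord`, `cosine_similarity`, and `most_similar` are hypothetical names, and the three-dimensional vectors are toy data (CodeBERT base embeddings have 768 dimensions).

```python
import math
from dataclasses import dataclass


@dataclass
class VulnerabilityRecord:
    """Hypothetical in-memory stand-in for a row of the `vulnerabilities` table."""
    cwe_id: str
    name: str
    embedding_vector: list[float]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query: list[float], records: list[VulnerabilityRecord], top_k: int = 3):
    """Return up to `top_k` (score, record) pairs, highest similarity first."""
    scored = [(cosine_similarity(query, r.embedding_vector), r) for r in records]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Toy data for illustration only.
known = [
    VulnerabilityRecord("CWE-89", "SQL Injection", [1.0, 0.0, 0.0]),
    VulnerabilityRecord("CWE-79", "Cross-Site Scripting (XSS)", [0.0, 1.0, 0.0]),
]
for score, record in most_similar([0.9, 0.1, 0.0], known, top_k=1):
    print(f"{record.cwe_id} ({record.name}): {score:.3f}")
```

With pgvector the same ranking would normally run inside PostgreSQL, where the `<=>` operator computes cosine distance, rather than in application code.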
- Python 3.10+
- PostgreSQL 15+ (optional, for database)
- CUDA-capable GPU (recommended for training)
# Clone the repository
git clone https://github.com/vulnai/sast-engine.git
cd sast-engine
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Install package
pip install -e .

# Scan a single file
vulnai scan -f path/to/code.py
# Scan a directory
vulnai scan -d ./src
# Output JSON
vulnai scan -f app.py -o json
# Specify language
vulnai scan -f main.js -l javascript
# Scan with verbose output
vulnai scan -f app.py -v

# Start the API server
uvicorn vulnai.api.main:app --host 0.0.0.0 --port 8000
# API Documentation
# Open http://localhost:8000/docs in your browser

from vulnai.detection.engine import DetectionEngine

# Initialize engine
engine = DetectionEngine(
    model_path="models/trained/vulnai_classifier.pt",
    confidence_threshold=0.5
)

# Detect vulnerabilities
result = engine.detect(code="your code here")
print(f"Is vulnerable: {result.is_vulnerable}")
for vuln in result.vulnerabilities:
    print(f"  {vuln.cwe_id} at line {vuln.line_number}")

from vulnai.models.trainer import ModelTrainer, TrainingConfig
from vulnai.data.loader import load_training_data

# Load the train, validation, and test data loaders
train_loader, val_loader, test_loader = load_training_data()

# Configure training
config = TrainingConfig(
    model_name="microsoft/codebert-base",
    num_epochs=10,
    batch_size=16,
    learning_rate=2e-5
)

# Train model
trainer = ModelTrainer(config)
history = trainer.train(train_loader, val_loader)

| CWE ID | Vulnerability Type | Severity |
|---|---|---|
| CWE-89 | SQL Injection | HIGH |
| CWE-79 | Cross-Site Scripting (XSS) | MEDIUM |
| CWE-94 | Code Injection | HIGH |
| CWE-78 | OS Command Injection | HIGH |
| CWE-287 | Improper Authentication | HIGH |
| CWE-862 | Missing Authorization | MEDIUM |
| CWE-434 | Unrestricted File Upload | HIGH |
| CWE-502 | Insecure Deserialization | HIGH |
| CWE-119 | Buffer Overflow | HIGH |
| CWE-200 | Information Exposure | LOW |
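
One way to use the severity column is to triage findings: drop low-confidence results, then list the most severe first. The sketch below is illustrative only. The severity labels are copied from the table above, while `Finding`, `SEVERITY_RANK`, and `triage` are hypothetical names that are not part of the VulnAI API; the `Finding` fields loosely follow the `detected_issues` columns.

```python
from dataclasses import dataclass

# Severity labels copied from the table above.
CWE_SEVERITY = {
    "CWE-89": "HIGH",
    "CWE-79": "MEDIUM",
    "CWE-94": "HIGH",
    "CWE-78": "HIGH",
    "CWE-287": "HIGH",
    "CWE-862": "MEDIUM",
    "CWE-434": "HIGH",
    "CWE-502": "HIGH",
    "CWE-119": "HIGH",
    "CWE-200": "LOW",
}

# Hypothetical ordering, used only for sorting (lower sorts first).
SEVERITY_RANK = {"HIGH": 0, "MEDIUM": 1, "LOW": 2}
UNKNOWN_RANK = len(SEVERITY_RANK)  # CWEs outside the table sort last


@dataclass
class Finding:
    """Hypothetical finding shape, loosely modeled on the detected_issues columns."""
    file_name: str
    line_number: int
    detected_cwe: str
    confidence: float


def severity_rank(cwe_id: str) -> int:
    """Map a CWE ID to its sort rank, falling back to UNKNOWN_RANK."""
    label = CWE_SEVERITY.get(cwe_id)
    return SEVERITY_RANK.get(label, UNKNOWN_RANK)


def triage(findings: list[Finding], min_confidence: float = 0.5) -> list[Finding]:
    """Keep findings at or above the threshold; sort by severity, then by confidence."""
    kept = [f for f in findings if f.confidence >= min_confidence]
    return sorted(kept, key=lambda f: (severity_rank(f.detected_cwe), -f.confidence))


findings = [
    Finding("app.py", 12, "CWE-200", 0.90),
    Finding("app.py", 40, "CWE-89", 0.60),
    Finding("app.py", 57, "CWE-79", 0.40),  # below the threshold, dropped
]
for f in triage(findings):
    print(f"{f.detected_cwe} {f.file_name}:{f.line_number} ({f.confidence:.2f})")
```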
Environment variables can be set in a .env file:
# Database
DATABASE_URL=postgresql://user:pass@localhost:5432/vulnai_db
# Model
MODEL_NAME=microsoft/codebert-base
MAX_SEQ_LENGTH=512
# API
API_HOST=0.0.0.0
API_PORT=8000
# Detection
CONFIDENCE_THRESHOLD=0.5

vulnai/
├── api/                    # FastAPI REST API
│   ├── main.py
│   ├── models/
│   └── routes/
├── cli/                    # CLI tool
├── core/                   # Configuration & logging
├── data/                   # Data collection & loading
├── detection/              # Detection engine
├── models/                 # ML models & training
├── preprocessing/          # Code preprocessing
└── storage/                # Database & vector store
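
The tree places configuration in `core/`, and the environment variables listed earlier are its natural input. As a rough sketch of how they might be read into a typed object (`Settings` and `load_settings` are hypothetical names, not the project's actual configuration code), using only the standard library:

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Hypothetical typed view of the variables shown in the sample .env file."""
    database_url: str
    model_name: str
    max_seq_length: int
    api_host: str
    api_port: int
    confidence_threshold: float


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from a mapping, falling back to the sample .env values."""
    threshold = float(env.get("CONFIDENCE_THRESHOLD", "0.5"))
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("CONFIDENCE_THRESHOLD must be between 0 and 1")
    return Settings(
        # Assumption: an empty URL means "no database", since PostgreSQL is optional.
        database_url=env.get("DATABASE_URL", ""),
        model_name=env.get("MODEL_NAME", "microsoft/codebert-base"),
        max_seq_length=int(env.get("MAX_SEQ_LENGTH", "512")),
        api_host=env.get("API_HOST", "0.0.0.0"),
        api_port=int(env.get("API_PORT", "8000")),
        confidence_threshold=threshold,
    )


settings = load_settings({"API_PORT": "9000"})
print(settings.api_port, settings.model_name)
```

This reads the process environment only; a `.env` file still has to be loaded into it first, for example with a package such as python-dotenv.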
Run evaluation on test data:
from vulnai.models.evaluator import evaluate_model
# test_loader: the test split produced by load_training_data()
results = evaluate_model(
    model_path="models/trained/vulnai_classifier.pt",
    dataloader=test_loader,
    output_dir="evaluation"
)
print(results.accuracy)
print(results.f1_score)
print(results.false_positive_rate)

| Method | Endpoint | Description |
|---|---|---|
| POST | /api/v1/detect | Detect vulnerabilities in code |
| GET | /api/v1/vulnerabilities | List stored vulnerabilities |
| GET | /api/v1/vulnerabilities/{cwe_id} | Get specific vulnerability |
| POST | /api/v1/feedback | Submit feedback for learning |
| GET | /api/v1/stats | Get detection statistics |
| GET | /api/v1/health | Health check |
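
Calling the detection endpoint from Python might look like the sketch below, which uses only the standard library. The JSON field names `code` and `language` are assumptions, since the request schema is not documented here; the generated docs at `/docs` are the authority. `build_detect_request` and `detect` are hypothetical helper names.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # host and port from the uvicorn example


def build_detect_request(code: str, language: str,
                         base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST request for /api/v1/detect without sending it."""
    # ASSUMPTION: the body fields "code" and "language" are guesses at the schema.
    payload = json.dumps({"code": code, "language": language}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/v1/detect",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def detect(code: str, language: str, base_url: str = BASE_URL,
           timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response (requires a running server)."""
    request = build_detect_request(code, language, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_detect_request("eval(user_input)", "python")
print(request.get_method(), request.full_url)
```

Keeping request construction separate from sending makes the payload easy to inspect or test without a server running.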
# Build image
docker build -t vulnai/sast-engine .
# Run container
docker run -p 8000:8000 vulnai/sast-engine

Contributions are welcome! Please read our contributing guidelines first.
This project is licensed under the MIT License - see the LICENSE file for details.
