When working with server-side rendering, developers often face the challenge of retrieving fully rendered HTML from modern websites. Many websites rely on JavaScript to dynamically load and update content, which makes traditional HTTP requests insufficient. In this tutorial, we’ll create an endpoint deployed to Google Cloud Run that leverages Playwright to render HTML, including all dynamically injected DOM nodes.
Overview of the Solution
We’ll build a Flask application that:
- Exposes an endpoint to accept a URL.
- Uses Playwright to fetch and render the URL.
- Waits for all network activity to settle before extracting the fully rendered HTML.
- Deploys the API on Google Cloud Run for scalable, serverless hosting.
Prerequisites
- Python 3.13 or later installed locally.
- Docker installed to containerize the application.
- A Google Cloud account with gcloud CLI configured.
Step 1: Set up the skeleton
Create a new directory for your project and touch the following files:
mkdir playwright-flask-cloud-run
cd playwright-flask-cloud-run
touch Dockerfile
touch requirements.txt
touch app.py
touch Makefile
Step 2: Writing the Dockerfile
To deploy on Google Cloud Run, we need to containerize the application. The Dockerfile installs all necessary dependencies, including Playwright.
Dockerfile:
# Use an official Python image
FROM python:3.13-slim
# Set the working directory
WORKDIR /app
# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
ENV PLAYWRIGHT_BROWSERS_PATH=/app/ms-playwright
# Copy only the dependency list first so this layer stays cached when the code changes
COPY requirements.txt .
# Install Flask, gunicorn, and the Playwright Python package from requirements.txt
RUN pip install --upgrade pip
RUN pip install -r requirements.txt
# Install Chromium and the system libraries it needs (--with-deps runs apt-get itself)
RUN python -m playwright install --with-deps chromium
# Copy the application files
COPY . .
# Expose port 8080 for Google Cloud Run
EXPOSE 8080
# Run the application (shell form so that $PORT, which Cloud Run sets, is expanded)
CMD exec gunicorn --bind :$PORT --workers 1 --threads 8 --timeout 300 app:app
Step 3: Writing the requirements.txt
We need to specify the dependencies for the Flask application and Playwright in requirements.txt:
playwright==1.49.0
flask==3.1.0
gunicorn==23.0.0
Step 4: Setting Up the Flask Application
Let’s start with the Flask API. The application will accept a URL in a POST request, use Playwright to render the URL, and return the fully rendered HTML.
app.py:
import os

from flask import Flask, request, jsonify
from playwright.sync_api import sync_playwright

app = Flask(__name__)


@app.route('/render', methods=['POST'])
def render_url():
    # silent=True returns None instead of raising when the body is not valid JSON
    data = request.get_json(silent=True) or {}
    url = data.get("url")
    if not url:
        return jsonify({"error": "URL is required"}), 400
    try:
        with sync_playwright() as p:
            browser = p.chromium.launch(headless=True)
            context = browser.new_context()
            page = context.new_page()
            # Navigate to the URL and wait for network activity to stop
            page.goto(url, wait_until="load")
            page.wait_for_load_state("networkidle")
            # Get the rendered HTML
            html = page.content()
            browser.close()
        return jsonify({"html": html}), 200
    except Exception as e:
        return jsonify({"error": str(e)}), 500


if __name__ == "__main__":
    # Local development only; in the container, gunicorn serves the app
    app.run(debug=True, host="0.0.0.0", port=int(os.environ.get("PORT", 8080)))
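The handler above only checks that a URL was supplied. As a minimal sketch of a stricter check, the helper below rejects values that are not absolute http or https URLs before a browser is ever launched. The names validate_url and ALLOWED_SCHEMES are hypothetical and not part of the original application; the handler could call validate_url(url) and return a 400 response when it raises ValueError.

```python
from urllib.parse import urlsplit

# Assumption: only plain web URLs should be rendered by this service.
ALLOWED_SCHEMES = {"http", "https"}


def validate_url(raw):
    """Return the stripped URL, or raise ValueError if it is not an http(s) URL.

    Hypothetical helper for illustration; it is not part of the original app.py.
    """
    if not isinstance(raw, str) or not raw.strip():
        raise ValueError("URL is required")
    url = raw.strip()
    parts = urlsplit(url)
    if parts.scheme.lower() not in ALLOWED_SCHEMES:
        raise ValueError("URL must start with http:// or https://")
    if not parts.hostname:
        raise ValueError("URL must include a host")
    return url
```

This check is purely syntactic. It does not confirm that the host exists or is reachable; that still surfaces as a Playwright error during navigation.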
Step 5: Building and Deploying the Application
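Now that we have the Dockerfile and Flask application ready, we can build the Docker image and deploy it to Google Cloud Run. Put the following targets in the Makefile, replacing YOUR_PROJECT_ID with your Google Cloud project ID. Recipe lines must start with a tab: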
build:
	docker build -t gcr.io/YOUR_PROJECT_ID/playwright-flask .
push:
	docker push gcr.io/YOUR_PROJECT_ID/playwright-flask
deploy:
	gcloud run deploy playwright-flask \
		--image gcr.io/YOUR_PROJECT_ID/playwright-flask \
		--platform managed \
		--region us-central1 \
		--allow-unauthenticated \
		--memory 2Gi \
		--timeout 300s
test:
	curl -X POST \
		-H "Content-Type: application/json" \
		-d '{"url": "https://example.com"}' \
		https://playwright-flask-XXXXXXXXXX-uc.a.run.app/render
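The same request can be made from Python. The following is a rough sketch using only the standard library; build_render_request and fetch_rendered_html are hypothetical helper names, and the service URL you pass in is whatever gcloud prints after deployment.

```python
import json
import urllib.request


def build_render_request(service_url, target_url):
    """Build the POST request for the /render endpoint (hypothetical helper)."""
    body = json.dumps({"url": target_url}).encode("utf-8")
    return urllib.request.Request(
        service_url.rstrip("/") + "/render",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_rendered_html(service_url, target_url, timeout=300):
    """Send the request and return the "html" field of the JSON response."""
    req = build_render_request(service_url, target_url)
    # urlopen raises urllib.error.HTTPError for the 400 and 500 responses
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["html"]
```

Calling fetch_rendered_html with your service URL and "https://example.com" should return the rendered page as a string. The 300-second client timeout is an assumption chosen to match the Cloud Run timeout used in the deploy target.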
Key Notes:
- Network Idle: page.wait_for_load_state("networkidle") waits until there have been no network connections for at least 500 ms before the content is retrieved. Pages that poll or stream continuously may never reach this state, so the call can time out.
- JSON Input/Output: The server expects a POST request with JSON input ({"url": "…"}) and returns the rendered HTML as JSON.
- Resource Limits: Rendering pages with Playwright is resource-intensive. Choose appropriate resource limits when configuring your Cloud Run service.
- Timeouts: Ensure your Cloud Run timeout settings accommodate the time Playwright needs to render large or complex pages.
- Error Handling: Add robust error handling to manage issues like invalid URLs, timeouts, or JavaScript errors.
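One way to act on the timeout and error-handling notes is to give each render a fixed time budget that stays below the Cloud Run request timeout. The sketch below is an illustration under assumptions, not the application's actual code: remaining_ms and render_html are hypothetical names, and the 270-second budget is an arbitrary value chosen to sit under the 300-second timeout used in this tutorial.

```python
import time


def remaining_ms(deadline, floor_ms=1000.0):
    """Milliseconds left until `deadline` (a time.monotonic() value).

    Never returns less than floor_ms, because Playwright treats a timeout
    of 0 as "no timeout".
    """
    return max(floor_ms, (deadline - time.monotonic()) * 1000.0)


def render_html(url, budget_s=270.0):
    """Render `url` within roughly budget_s seconds and return its HTML."""
    # Imported here so remaining_ms can be used without Playwright installed.
    from playwright.sync_api import TimeoutError as PlaywrightTimeoutError
    from playwright.sync_api import sync_playwright

    deadline = time.monotonic() + budget_s
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            page.goto(url, wait_until="load", timeout=remaining_ms(deadline))
            try:
                page.wait_for_load_state("networkidle", timeout=remaining_ms(deadline))
            except PlaywrightTimeoutError:
                # The page never went idle; fall through and return what rendered.
                pass
            return page.content()
        finally:
            # Close the browser even if navigation raised.
            browser.close()
```

Returning partial content when the page never goes idle is a design choice. A stricter service could let the timeout propagate and answer with an error status instead.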
Conclusion
And with that, you have successfully set up a Playwright server with Flask on Google Cloud Run. This solution provides a scalable, serverless way to render web pages with Playwright, making it ideal for server-side rendering and other automation tasks.