Introduction
Want to use AI at work without risking data leaks or paying API fees? This guide shows you how to build your own ChatGPT Operator that runs entirely on your local machine—no internet, no cloud, no cost.
For professionals and companies handling sensitive data, relying on cloud-based AI services—even those compliant with GDPR—can still pose security risks. But by running AI models like LLaMA, DeepSeek R1, and Qwen locally on your machine’s GPU or CPU, you ensure that no data ever leaves your device. This eliminates privacy concerns and reduces AI usage costs to zero.
This step-by-step guide will help you set up a self-hosted AI assistant to boost productivity while keeping your data secure and your operational costs minimal. Whether you’re a developer, enterprise user, or researcher, this is the future of AI—powerful, private, and cost-free. 🚀
Sample Scenario
For some time now, I’ve been developing a web-based Iron Ore Procurement application as a side project (built on React and PostgreSQL) to simplify and digitize the traditionally cumbersome processes of asset pricing, stock tracking, invoice calculation, and so on. I wanted to take it one step further—by integrating artificial intelligence. I needed a fully local AI system that could seamlessly interact with it.
That’s where open-source AI models (DeepSeek-R1, Llama) come in. Now, instead of manually navigating through dashboards, clicking buttons, and extracting data, I can delegate these tasks to AI. After I write a few sentences as a simple prompt and rest my eyes, the AI is hard at work.
A prompt may be:
"Get the whole production data of six facilities from 2022, and comment on performance month by month." The AI will retrieve the full dataset, and may even apply time-series analysis to compare the same month across different years before presenting its result. 💀 That’s all it takes. The AI logs in, clicks through the interface, retrieves the required data, and presents the result, all autonomously, without cloud dependencies or security risks.
Requirements
- PC (Highly recommended but not required: an NVIDIA-based GPU or Apple Silicon. If your computer is not capable of running a local LLM and you would still like to see the web-ui interaction, I recommend using the Gemini 2.0 Flash API)
- Visual Studio Code
- Python
- Ollama (simplifies running open-source large language models (LLMs) directly on your computer, exposing a localhost endpoint to interact with your LLM)
- LLM (Deepseek-R1, Meta's Llama, or any open source LLM)
- Browser Use WebUI and uv (uv will find or install Python if needed as part of syncing or running code in your environment)
- Optional: Web Based Software
Install
To check that Python is installed successfully:
python3 --version
pip3 --version
macOS:
curl -LsSf https://astral.sh/uv/install.sh | sh
Windows:
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
- Ollama
- Open Source Language Models (LLM)
Deepseek R1:7B (4.7GB) \
ollama run deepseek-r1:7b
Deepseek R1:14B (9.0GB) \
ollama run deepseek-r1:14b
Meta Llama 2 (3.8GB) \
ollama run llama2:7b
Visit the Ollama website to explore more models. Ollama Explore Models
The "b" in model names stands for billion, indicating the number of parameters in the model. For example, a 7B model has 7 billion parameters, which determines its size. Larger models generally perform better but require significantly more computing power and memory, making them difficult to run on standard home hardware. If a model is too large, your system may run out of both RAM and VRAM.
Ollama, built on top of the llama.cpp project, supports CPU offloading. This means it first loads model parameters into VRAM, and once that is full, the remaining layers are offloaded to the CPU and system RAM. However, since CPUs are much slower than GPUs, performance may suffer if a model exceeds available VRAM.
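As a rough back-of-the-envelope check, you can estimate whether a model fits in your VRAM before pulling it. This is only a sketch, not Ollama's actual allocator: the ~5 bits-per-parameter figure is an assumption based on typical 4-bit quantized downloads (it roughly matches the 4.7 GB / 9.0 GB sizes listed above), and real GGUF files vary.

```python
def approx_model_size_gb(params_billion: float, bits_per_param: float = 5.0) -> float:
    """Rough size of a quantized model: parameters x bits, converted to GB.

    ~5 bits/param approximates a 4-bit quantization plus metadata overhead;
    treat the result as a ballpark only.
    """
    return params_billion * 1e9 * bits_per_param / 8 / 1e9

def vram_overflow_gb(model_gb: float, vram_gb: float) -> float:
    """How much of the model would spill from VRAM into system RAM (CPU offload)."""
    return max(0.0, model_gb - vram_gb)

for params in (7, 14, 70):
    size = approx_model_size_gb(params)
    spill = vram_overflow_gb(size, vram_gb=8.0)  # example: an 8 GB GPU
    print(f"{params}B model ~ {size:.1f} GB, spills {spill:.1f} GB to CPU/RAM")
```

Anything that spills is the part the CPU will crunch, which is where the slowdown described above comes from.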
Note 1: Browser-use web-ui is not fully compatible with all models, particularly when it comes to validating unexpected responses. At this stage, I recommend using Deepseek R1:7B. Also, to check system functionality, you can get a Gemini API key from Google AI Studio and run the agent with a default prompt (e.g., "Go to google.com, search for OpenAI, and return the first URL").
Note 2: If your model does not respond to any prompt (a common problem in web-ui), open the terminal and check the name of the model you downloaded. Even if the model you pulled ends with 7b, Ollama may list it as :latest. In that case, enter deepseek-r1:latest in web-ui's interface instead of deepseek-r1:7b \
- Browser Use WebUI
Create an empty folder on your Desktop. Open a terminal and navigate to the folder with the cd command.
cd ~/Desktop/yourFolder
Clone the repository and enter the web-ui folder.
git clone https://github.com/browser-use/web-ui.git
cd web-ui
Set up a virtual environment:
uv venv --python 3.11
Activate
source .venv/bin/activate
Install the dependencies:
uv pip install -r requirements.txt
Install Playwright (end-to-end test automation):
playwright install
Launch Browser Use WebUI on 127.0.0.1:7788.
python webui.py --ip 127.0.0.1 --port 7788
💣 Here it is
- Agent Type: "org" is the standardized agent, while "custom" is more flexible, which is what we need for our chosen open-source model.
- Max Run Steps: maximum number of steps the agent will take.
- Max Actions per Step: maximum number of actions the agent will take per step.
- LLM Provider: the AI service that runs your model; in our case we want to run locally, so Ollama.
- LLM Model: the specific model you want to use.
- API Key: if you want responses from paid/free cloud AI services, enter your API key here. Not needed for Ollama.
- Browser Settings: web-ui can run your test on Chromium, Firefox, etc.
- Choose agent type as custom.
- As LLM Provider, choose Ollama.
- Model Name: deepseek-r1:latest
- If you would like to run the web test within 127.0.0.1 without opening a Chromium window, go to Browser Settings and turn on 'Headless Mode'.
- Optional: Run your web application.
- Go to the Run Agent tab and describe your task. Click the 'Run Agent' button.
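If the agent seems stuck, you can sanity-check that Ollama itself answers prompts before blaming web-ui, via Ollama's local HTTP API (it serves on http://localhost:11434 by default). This is a minimal sketch: the model name is an assumption, so swap in whatever `ollama list` shows.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local port

def build_request(model: str, prompt: str) -> bytes:
    # /api/generate takes a JSON body; "stream": False asks for one
    # complete JSON reply instead of newline-delimited chunks.
    return json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()

def extract_answer(body: dict) -> str:
    # In the non-streaming reply, the generated text is in "response".
    return body.get("response", "")

def ask(model: str, prompt: str) -> str:
    req = urllib.request.Request(
        OLLAMA_URL,
        data=build_request(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return extract_answer(json.load(resp))

# Usage (requires a running `ollama serve` with the model pulled):
# print(ask("deepseek-r1:latest", "Say hello in one word."))
```

If this call returns text but web-ui stays silent, the problem is the model name or validation in web-ui, not Ollama.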
You can follow the running task in the terminal window of Visual Studio Code or your terminal. Generally, if the model's responses pass WebUI's validation, step 1 takes around 40-80 seconds, depending on your computer's specs.
- You have AI Power on Chrome!