Expose a Local LLM

Running a large language model locally with Ollama or LM Studio? Localport lets you share that model's API with teammates, connect it to external apps, or access it from other devices, all through a public URL.

Why tunnel a local LLM?

Share with your team. One person runs the model, everyone uses it via the API.
Connect external apps. Tools like Continue, Cursor, or custom scripts can call your model from anywhere.
Access from other devices. Use the model from your phone, tablet, or another machine on a different network.
No cloud GPU costs. Run models on your own hardware and share the inference endpoint.

Ollama

Ollama serves a REST API on port 11434 by default. Tunnel it like any other HTTP service:

localport http 11434 --token YOUR_TOKEN

Now anyone with the URL can call the Ollama API:

curl https://abc123.tunnel.localport.dev/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Explain TCP tunneling in one paragraph",
  "stream": false
}'
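The same call can be made from a script. The following is a minimal sketch using only the Python standard library; the tunnel URL is the placeholder from the curl example, and the helper names `build_generate_request` and `generate` are illustrative, not part of Ollama or Localport.

```python
import json
import urllib.request

# Placeholder tunnel URL from the curl example; substitute your own.
TUNNEL_URL = "https://abc123.tunnel.localport.dev"


def build_generate_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(base_url: str, model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the request and return the generated text."""
    request = build_generate_request(base_url, model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # With "stream": false, Ollama returns a single JSON object whose
    # "response" field holds the full completion.
    return body["response"]


# Example call (needs a running tunnel, so it is left commented out):
# print(generate(TUNNEL_URL, "llama3.2", "Explain TCP tunneling in one paragraph"))
```

The generous timeout is a guess: a cold model may need time to load before it answers, so adjust it to your hardware.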

Ollama needs to allow external connections

By default, Ollama listens only on localhost (127.0.0.1). To allow connections from Localport, set the environment variable OLLAMA_HOST=0.0.0.0 before starting Ollama, or add it as an Environment= entry in the [Service] section of your systemd service file and restart the service.

LM Studio

LM Studio's local server runs on port 1234 by default:

localport http 1234 --token YOUR_TOKEN

LM Studio exposes an OpenAI-compatible API, so any tool that works with the OpenAI API can point to your tunnel URL instead:

curl https://abc123.tunnel.localport.dev/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
  "model": "local-model",
  "messages": [{"role": "user", "content": "Hello"}]
}'
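For scripts, the request and response follow the usual OpenAI chat-completions shape. Here is a rough sketch with the Python standard library, assuming the placeholder tunnel URL and the "local-model" name from the curl example; the function names are made up for illustration. It sets the Content-Type header explicitly, which OpenAI-style servers generally expect.

```python
import json
import urllib.request


def build_chat_request(base_url: str, model: str, user_message: str) -> urllib.request.Request:
    """Build a POST request for an OpenAI-style /v1/chat/completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_reply(body: dict) -> str:
    """Pull the assistant's text out of an OpenAI-style response body."""
    return body["choices"][0]["message"]["content"]


def chat(base_url: str, model: str, user_message: str, timeout: float = 120.0) -> str:
    """Send one user message and return the assistant's reply."""
    request = build_chat_request(base_url, model, user_message)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_reply(json.load(response))


# Example call (needs a running tunnel, so it is left commented out):
# print(chat("https://abc123.tunnel.localport.dev", "local-model", "Hello"))
```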

Connecting AI coding tools

Many AI-powered development tools let you configure a custom API endpoint. Point them to your tunnel URL:

Tool	Setting
Continue (VS Code)	Set `apiBase` in config to your tunnel URL
Open WebUI	Set the Ollama URL to your tunnel URL
Custom scripts	Replace `http://localhost:11434` with your tunnel URL
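For custom scripts, one way to avoid hard-coding either address is to read the base URL from the environment and fall back to localhost. This is only a sketch: the variable name `LLM_BASE_URL` and both helper functions are hypothetical, chosen for this example.

```python
import os

# Ollama's default local address, used when no override is set.
DEFAULT_BASE_URL = "http://localhost:11434"


def resolve_base_url(environ=os.environ) -> str:
    """Return the API base URL, preferring the hypothetical LLM_BASE_URL variable."""
    return environ.get("LLM_BASE_URL", DEFAULT_BASE_URL).rstrip("/")


def endpoint(path: str, environ=os.environ) -> str:
    """Join the resolved base URL and an API path such as /api/generate."""
    return resolve_base_url(environ) + "/" + path.lstrip("/")
```

With this pattern, teammates set `LLM_BASE_URL` to the tunnel URL, while the person hosting the model leaves it unset and keeps talking to localhost.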

Tips

Keep the model loaded. Ollama unloads a model after five minutes of inactivity by default. Set OLLAMA_KEEP_ALIVE=-1 before starting Ollama to keep models in memory while you share the endpoint.
Monitor usage. The Localport dashboard shows connection counts and bandwidth, so you can see how much your model is being used.
Reserve a subdomain (any plan). Keep a stable URL for team configurations that shouldn't change.
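On the "keep the model loaded" tip: besides the environment variable, Ollama's API accepts a `keep_alive` field on individual requests, and a request without a prompt loads the model without generating text. The sketch below builds such a preload request; the helper name is illustrative, and the base URL assumes Ollama's default local address.

```python
import json
import urllib.request


def build_preload_request(base_url: str, model: str) -> urllib.request.Request:
    """Build a request that loads `model` and asks Ollama to keep it in memory."""
    # No "prompt" field: Ollama loads the model without generating anything.
    # keep_alive = -1 asks Ollama to keep the model loaded indefinitely.
    payload = {"model": model, "keep_alive": -1}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Example call (needs a running Ollama instance, so it is left commented out):
# urllib.request.urlopen(build_preload_request("http://localhost:11434", "llama3.2"))
```

Running this once on the host machine before sharing the URL means the first teammate request does not have to wait for the model to load.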

Public by default

Localport tunnels are public by default, so anyone with the URL can call your LLM API. Lock it down with the built-in access controls (IP allow lists and password protection, available on all plans) or add authentication to your API layer.
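If you choose to add authentication at the API layer, one possible approach is a small local proxy that checks a bearer token before forwarding requests to the model server. The sketch below is an illustration under stated assumptions, not a Localport feature: the token value, port 8080, and all names are hypothetical, it handles POST only, and it buffers each response, so streamed output arrives in one piece.

```python
import hmac
import http.server
import urllib.error
import urllib.request

UPSTREAM = "http://localhost:11434"  # Ollama's default local address
API_TOKEN = "change-me"              # hypothetical shared secret; replace it


def is_authorized(header_value, expected_token: str) -> bool:
    """Check an Authorization header of the form 'Bearer <token>'."""
    if not header_value or not header_value.startswith("Bearer "):
        return False
    supplied = header_value[len("Bearer "):]
    # compare_digest avoids leaking the token through timing differences.
    return hmac.compare_digest(supplied.encode(), expected_token.encode())


class AuthProxy(http.server.BaseHTTPRequestHandler):
    def do_POST(self):
        if not is_authorized(self.headers.get("Authorization"), API_TOKEN):
            self.send_error(401, "Missing or invalid bearer token")
            return
        length = int(self.headers.get("Content-Length", 0))
        upstream_request = urllib.request.Request(
            UPSTREAM + self.path,
            data=self.rfile.read(length),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        try:
            with urllib.request.urlopen(upstream_request) as upstream:
                status, body = upstream.status, upstream.read()
        except urllib.error.HTTPError as error:
            # Pass upstream error responses (4xx/5xx) through unchanged.
            status, body = error.code, error.read()
        except urllib.error.URLError:
            self.send_error(502, "Model server is not reachable")
            return
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Run the proxy on localhost; tunnel this port instead of 11434."""
    http.server.ThreadingHTTPServer(("127.0.0.1", port), AuthProxy).serve_forever()
```

With the proxy running via `serve()`, you would tunnel its port (for example `localport http 8080 --token YOUR_TOKEN`), and clients would send an `Authorization: Bearer ...` header with each request.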

Next steps

HTTP Tunnels. More about HTTP tunneling.
Shared Tunnels. Let multiple people connect to the same endpoint.