Expose a Local LLM

Share your local Ollama or LM Studio AI model with teammates, apps, or other devices. Tunnel a local LLM API to a public URL.

Running a large language model locally with Ollama or LM Studio? Localport lets you share that model's API with teammates, connect it to external apps, or access it from other devices, all through a public URL.

Why tunnel a local LLM?

  • Share with your team. One person runs the model, everyone uses it via the API.
  • Connect external apps. Tools like Continue, Cursor, or custom scripts can call your model from anywhere.
  • Access from other devices. Use the model from your phone, tablet, or another machine on a different network.
  • No cloud GPU costs. Run models on your own hardware and share the inference endpoint.

Ollama

Ollama serves a REST API on port 11434 by default. Tunnel it like any other HTTP service:

localport http 11434 --token YOUR_TOKEN

Now anyone with the URL can call the Ollama API:

curl https://abc123.tunnel.localport.dev/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Explain TCP tunneling in one paragraph",
  "stream": false
}'
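The same call can be made from a script. The sketch below uses only the Python standard library. The tunnel URL is the placeholder from the curl example, and the helper names (build_generate_request, generate) are illustrative, not part of Ollama or Localport. It assumes the non-streaming reply carries the generated text in a "response" field, which is how Ollama's /api/generate endpoint answers when "stream" is false.

```python
import json
import urllib.request

# Placeholder tunnel URL from the curl example; replace it with your own.
TUNNEL_URL = "https://abc123.tunnel.localport.dev"


def build_generate_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST to Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(base_url: str, model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the request through the tunnel and return the generated text."""
    request = build_generate_request(base_url, model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        body = json.load(reply)
    # Assumption: a non-streaming reply holds the text under "response".
    return body["response"]
```

Calling generate(TUNNEL_URL, "llama3.2", "Explain TCP tunneling in one paragraph") should return the model's answer as a string. The generous timeout is a guess that allows for slow first responses while the model loads.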

Ollama needs to allow external connections

By default, Ollama only listens on localhost. To allow connections from Localport, set the environment variable OLLAMA_HOST=0.0.0.0 before starting Ollama, or add it to your systemd service file.

LM Studio

LM Studio's local server runs on port 1234 by default:

localport http 1234 --token YOUR_TOKEN

LM Studio exposes an OpenAI-compatible API, so any tool that works with the OpenAI API can point to your tunnel URL instead:

curl https://abc123.tunnel.localport.dev/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
  "model": "local-model",
  "messages": [{"role": "user", "content": "Hello"}]
}'
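For scripts, the curl request above translates into a small standard-library client. This is a minimal sketch, assuming the server returns the usual OpenAI-style chat completion shape (choices[0].message.content). The function names are hypothetical, and "local-model" is the placeholder model name from the curl example.

```python
import json
import urllib.request


def build_chat_request(base_url: str, model: str, user_message: str) -> urllib.request.Request:
    """Build a POST to the OpenAI-compatible /v1/chat/completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_reply(completion: dict) -> str:
    """Pull the assistant's text out of an OpenAI-style chat completion."""
    return completion["choices"][0]["message"]["content"]


def chat(base_url: str, model: str, user_message: str, timeout: float = 120.0) -> str:
    """Send one user message through the tunnel and return the reply text."""
    request = build_chat_request(base_url, model, user_message)
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return extract_reply(json.load(reply))
```

A call such as chat("https://abc123.tunnel.localport.dev", "local-model", "Hello") would send the same request as the curl command. Client libraries that accept a custom base URL typically expect the tunnel URL with /v1 appended, but check the library's own documentation.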

Connecting AI coding tools

Many AI-powered development tools let you configure a custom API endpoint. Point them to your tunnel URL:

Tool                 Setting
Continue (VS Code)   Set apiBase in config to your tunnel URL
Open WebUI           Set the Ollama URL to your tunnel URL
Custom scripts       Replace http://localhost:11434 with your tunnel URL
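For custom scripts, one way to avoid hard-coding either address is to read the base URL from the environment and fall back to the local default. The sketch below assumes a hypothetical variable name, LLM_BASE_URL, that is not defined by Localport, Ollama, or LM Studio. Pick whatever name suits your project.

```python
import os

# Local Ollama default, used when no override is set.
DEFAULT_BASE_URL = "http://localhost:11434"


def resolve_base_url(environ=os.environ) -> str:
    """Return the API base URL, preferring the (hypothetical) LLM_BASE_URL variable."""
    # An unset or empty variable falls back to the local default.
    base_url = environ.get("LLM_BASE_URL") or DEFAULT_BASE_URL
    return base_url.rstrip("/")


def endpoint(path: str, environ=os.environ) -> str:
    """Join an API path such as '/api/generate' onto the resolved base URL."""
    return resolve_base_url(environ) + "/" + path.lstrip("/")
```

With LLM_BASE_URL set to the tunnel URL, endpoint("/api/generate") points at the tunnel. With it unset, the same script keeps talking to the local server.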

Tips

  • Keep the model loaded. Ollama unloads a model after a period of inactivity (five minutes by default). If you're sharing the endpoint, set OLLAMA_KEEP_ALIVE=-1 in Ollama's environment to keep models loaded.
  • Monitor usage. The Localport dashboard shows connection counts and bandwidth, so you can see how much your model is being used.
  • Reserve a subdomain (any plan). Keep a stable URL for team configurations that shouldn't change.
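On keeping the model loaded: Ollama's API also accepts a keep_alive field on individual requests, so a script can ask for a model to be loaded and kept in memory without changing the server's environment. The sketch below only builds the request body, and the function name is illustrative. It assumes, per Ollama's API documentation, that a /api/generate request with no prompt loads the model and that a keep_alive of -1 keeps it loaded indefinitely.

```python
import json


def preload_payload(model: str) -> bytes:
    """JSON body asking Ollama to load `model` and keep it in memory.

    keep_alive=-1 requests that the model stay loaded indefinitely.
    Leaving out the prompt means no text is generated.
    """
    return json.dumps({"model": model, "keep_alive": -1}).encode("utf-8")
```

POSTing preload_payload("llama3.2") to /api/generate, either on localhost or through the tunnel, should warm the model before teammates start sending requests.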

Public by default

Localport tunnels are public by default, so anyone with the URL can call your LLM API. Lock it down with the built-in access controls (IP allow lists and password protection, available on all plans) or add authentication to your API layer.
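If you add authentication at the API layer yourself, for example in a small proxy or wrapper in front of the model server, the core of it is a token check on each request. The following is a minimal sketch, not a Localport or Ollama feature. The function name and the bearer-token scheme are assumptions, and the headers argument is taken to be a plain dict keyed by "Authorization" (real frameworks usually provide a case-insensitive mapping).

```python
import hmac


def is_authorized(headers: dict, expected_token: str) -> bool:
    """Return True if the request carries 'Authorization: Bearer <expected_token>'."""
    value = headers.get("Authorization", "")
    scheme, _, token = value.partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not token:
        return False
    # compare_digest runs in constant time, so the comparison does not
    # reveal how many leading characters of the token matched.
    return hmac.compare_digest(token.encode("utf-8"), expected_token.encode("utf-8"))
```

A wrapper would call is_authorized on every incoming request and answer 401 when it returns False, before forwarding anything to the model. Clients then send the shared token in an Authorization header alongside the usual JSON body.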

Next steps