Struggling to run large language models locally due to limited GPU resources? Discover how to effortlessly execute Ollama models on Google Colab's powerful cloud infrastructure. In this tutorial, I'll guide you through the entire process, from setting up your Colab environment to running your first Ollama model. No need for expensive hardware – let Colab do the heavy lifting!
If you prefer learning through a visual approach or want to gain additional insight into this topic, be sure to check out my YouTube video on this subject!
Quick Links
- Jupyter Notebooks
  - Run Ollama on Colab from TechXplainator
  - Run Ollama on Colab with models stored on Google Drive from TechXplainator
  - Original notebook from Ollama
- Ngrok
- Ollama
- Google Colab
Setup
Here's what I'll be using:
- Ollama: a framework for running large language models locally
  - Open-source and easy to set up
  - See the link above for the installation process; you can verify your local install with the quick check after this list
- Google Colab: a cloud-based platform for running Python code and Jupyter notebooks
  - A free account is required; this tutorial assumes you already have one
  - Consider upgrading to Colab Pro for faster LLMs
- Ngrok: a service that gives local web applications a public URL
  - Sign up for a free account (linked above)
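To confirm that Ollama is installed on your local machine before you begin, a quick terminal check is enough:

ollama --version  # prints the installed Ollama version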
Downsides of running Ollama on Google Colab
- Loss of privacy: interactions with your LLM are no longer private, since Google Colab and Ngrok can see every HTTP request
- Cost
  - Powerful GPU runtimes require a Colab Pro subscription
  - This is an ongoing monthly expense to factor into your budget
Setup Steps
- Log in to Ngrok and obtain an authentication token and a stable domain.
- Open a pre-prepared Jupyter Notebook on Colab while logged in to your Google account.
- Copy and paste the Ngrok authentication token into the Colab notebook.
- Execute the Jupyter Notebook, which sets up Ollama on Colab and links it to a public, stable Ngrok URL.
- Connect your local Ollama to the public Ngrok URL.
- Run Ollama.
Step 1: Get Authentication Token and Stable Domain from Ngrok
- Access your Ngrok account
- Get your unique authentication token by clicking on "Your Auth-Token"
- Copy the token at the top of the page
Optional:
- Navigate to "Domains" section
- Create a static domain if you don't already have one
Step 2: Open Prepared Jupyter Notebook
- Open the Jupyter notebook for running Ollama in Google Colab
- Make sure you're logged in to your Google account first
Step 3: Store the Auth Token in Colab
- Navigate to "Secrets" section within Colab
- Add new secret and name it 'NGROK_AUTH_TOKEN'
- Paste authentication code from Ngrok into value field
- Enable notebook access
- Replace <insert-your-statik-ngrok-domain-here> in line 10 of the last cell with the static domain you copied from Ngrok
If you're NOT using a static Ngrok domain, uncomment line 9 and comment out line 10 in the final cell of the notebook!
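For orientation, here is a minimal sketch of what that choice boils down to, assuming the final cell invokes the ngrok CLI against Ollama's default port 11434 (the exact lines vary by notebook version, and the domain is a placeholder):

ngrok http 11434  # line 9: a new random URL on every run
ngrok http --domain=<your-static-domain> 11434  # line 10: reuses your static Ngrok domain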
Step 4: Execute Jupyter Notebook
- Run all cells in the notebook
- The output should include a line like the following:
"started tunnel" obj=tunnels name=command_line addr=http://localhost:11434 url=https://example.ngrok-free.app
If you are using a static Ngrok URL, the example URL will correspond to your static Ngrok domain. If not, Ngrok will generate a random URL for you.
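Before moving on, you can sanity-check the tunnel from your local machine: Ollama's root endpoint answers plain HTTP requests with a short status message (replace the URL with the one from your output):

curl https://example.ngrok-free.app/  # should print: Ollama is running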
Step 5: Connect Local Ollama to Public Ngrok URL
- Copy the generated URL from the notebook's output
- Open a command-line tool, such as the Terminal app on a Mac
- Configure your local Ollama installation to communicate with the instance hosted on Colab using the following command:
export OLLAMA_HOST=<paste_url_here>
- If you're using a static URL, store this export command for future reference
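For example, with the placeholder domain from the sample output above, the command (plus an optional line to persist it across terminal sessions, assuming zsh) looks like this:

export OLLAMA_HOST=https://example.ngrok-free.app
echo 'export OLLAMA_HOST=https://example.ngrok-free.app' >> ~/.zshrc  # optional: persist for new shells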
Step 6: Run Ollama
- Start by running
ollama list
and notice that the model list is empty
- Download a model, such as Llama 3, using the command
ollama run llama3
- Interact with Ollama locally and notice how fast the responses are, since the heavy lifting happens on Google's servers through the Colab environment
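The tunnel exposes Ollama's full HTTP API, not just the CLI, so you can also query the Colab-hosted instance directly with the llama3 model pulled above:

curl $OLLAMA_HOST/api/generate -d '{"model": "llama3", "prompt": "Why is the sky blue?"}'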
Ending and Restarting Ollama on Colab
- End Ollama by disconnecting and deleting the runtime in Google Colab
- To restart Ollama:
  - run the Jupyter notebook (step 4)
  - export the Ollama host variable (step 5)
Alternative Method: Installing Ollama Models on Google Drive
- Install models directly on Google Drive using the alternative Jupyter Notebook linked above, so downloaded models survive when the Colab runtime is deleted
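A minimal sketch of the underlying idea, assuming Drive is already mounted at /content/drive (the notebook handles this in a Python cell) and using a placeholder folder name; the actual notebook may differ:

export OLLAMA_MODELS=/content/drive/MyDrive/ollama_models  # store and load models on Drive
ollama serve  # model downloads now land in that Drive folder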
Combining with Open WebUI
If you're using Ollama together with Open WebUI, you'll need one additional step:
Install Open WebUI into a Docker container using the URL provided by Ngrok (a stable URL is required):
docker run -d -p 4000:8080 -e OLLAMA_BASE_URL=<paste_ngrok_url_here> -v open-webui:/app/backend/data --name ngrok-open-webui --restart always ghcr.io/open-webui/open-webui:main
This creates a Docker container called ngrok-open-webui and serves Open WebUI at http://localhost:4000/
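If the UI doesn't come up at that address, the container's logs are the first place to look:

docker logs -f ngrok-open-webui  # follow Open WebUI's startup output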