Dead Simple Large Language Models with Ollama
For the last couple of months I’ve been really enjoying using Ollama to experiment with large language models on my Apple Silicon Macs. It lets you pull and run models through a really simple, Docker-ish interface. The fact it was created by Jeffrey Morgan, who used to work at Docker, goes some way to explaining why it feels that way. Recently they announced a new Linux version, making it even more accessible.
There’s a huge collection of open source models available for you to choose from and the team are incredibly fast at adding new ones. It took about 2 hours(!) for the latest Mistral model to appear within Ollama’s library of models following its announcement. Being able to run them locally or on your own server means that you don’t have any privacy concerns that might come with using a hosted model provider.
Getting set up on my Mac was just an app install away, and their recent Linux release means it can now be installed on any cloud server you like, CPU or GPU, with a shell script. You can also install it on Windows under WSL.
Running an Existing Model
Once Ollama is installed, running a model is as simple as:
ollama run llama2
This will pull down and run the Llama 2 7B-parameter model for you to interact with. If you want the 13B or 70B models you can pull them using a tag after the model name. The 70B model is a whopper at 39GB though, so it’s probably best to stick with the small ones whilst experimenting.
ollama run llama2:70b
Once started, you’ll jump right into a prompt where you can interact with the model in natural language. There are also text completion models, which you can run using the “text” tag. With llama2 there are also uncensored variants you can play with under the model name llama2-uncensored. I needed one of these for simple programming questions that the default model deemed something it couldn’t help with, like writing a regex for email addresses.
ollama run llama2-uncensored
>>> Can you create a python regex to match email addresses?
Yes, I can help you with that. Here is a Python regex to match email addresses:
```
pattern = r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[A-Z]{2,}'
```
This regular expression pattern matches any string of characters between 1 and 63 in length starting from the left-hand side of the string (inclusive), followed by at least one character from a list
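The model’s answer is roughly the classic email pattern. As a quick sketch (names are my own, and proper email validation is famously harder than one regex), here’s what using a pattern like that looks like in practice:

```python
import re

# A common email-matching pattern; it won't cover every RFC 5322 address,
# but it's fine for casually pulling email-like strings out of text.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def find_emails(text):
    """Return all email-like substrings found in text."""
    return EMAIL_RE.findall(text)

print(find_emails("Contact alice@example.com or bob.smith@mail.co.uk"))
# -> ['alice@example.com', 'bob.smith@mail.co.uk']
```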
Creating Custom Models
You can also create custom models using model files. In this example, lifted from Ollama’s GitHub repo, the system prompt has been changed so the model behaves like Mario:
FROM llama2
PARAMETER temperature 1
SYSTEM """
You are Mario from Super Mario Bros, acting as an assistant.
"""
You can see again the similarity to how Docker does things: we’re extending from the llama2 model. To build it, call create with the name of the new model and the Modelfile:
ollama create mario -f ./Modelfile
To then run things we’d simply call run with the new model name:
ollama run mario
>>> hi
"Woah, it's-a me, Mario! *adjusts cap* Hey there, buddy! What can I help you with? Maybe you need a jump start or a power-up? Just let me know and I'll do my best to assist ya!"
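Alongside the CLI, Ollama also serves a REST API on localhost:11434 by default, so you can talk to your custom model from code too. Here’s a minimal sketch in Python (the helper names are my own; it assumes the server is running and the mario model has been created):

```python
import json
import urllib.request

# Ollama's default local generate endpoint.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_generate_request(model, prompt):
    """Build the JSON payload for Ollama's /api/generate endpoint."""
    # stream=False asks for one complete JSON response rather than streamed chunks.
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model, prompt):
    """Send a prompt to a locally running Ollama server and return the reply text."""
    payload = json.dumps(build_generate_request(model, prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]

if __name__ == "__main__":
    print(generate("mario", "hi"))
```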
I’m really enjoying using Ollama - so much so I’ve made a number of videos about it on my YouTube channel. Being able to forget about the complexities of each model really does make trying out and swapping between them all much simpler.