The Ollama R library provides the easiest way to integrate R with Ollama, which lets you run language models locally on your own machine. Main site: https://hauselin.github.io/ollama-r/
Note: You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 13B models, and 32 GB to run the 33B models.
See Ollama’s GitHub page for more information. See also the Ollama API documentation and endpoints. For Ollama Python, see ollama-python.
You’ll need the Ollama app installed on your computer to use this library. Download it from Ollama.
Open/launch the Ollama app to start the local server. You can then run your language models locally, on your own machine.
Install the development version of the ollamar R library like so. If it doesn’t work or you don’t have devtools installed, run install.packages("devtools") in R or RStudio first.
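A minimal install sketch, assuming the development version is hosted in the hauselin/ollama-r GitHub repository behind the main site above:
devtools::install_github("hauselin/ollama-r") # repo path assumed from the main site URL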
library(ollamar)
test_connection() # test connection to Ollama server; returns a httr2 response object
# Ollama local server running
# <httr2_response>
list_models() # list available models (models you've pulled/downloaded)
# A tibble: 16 × 4
#   name           model          parameter_size quantization_level
#   <chr>          <chr>          <chr>          <chr>
# 1 mixtral:latest mixtral:latest 47B            Q4_0
# 2 llama3:latest  llama3:latest  8B             Q4_0
Optional/advanced parameters (see the API docs), such as temperature, are not yet implemented but will be added in future versions.
If you don’t have the Ollama app running, you’ll get an error. Make sure to open the Ollama app before using this library.
test_connection()
# Ollama local server not running or wrong server.
# Error in `httr2::req_perform()` at ollamar/R/test_connection.R:18:9:
ollamar uses the httr2 library to make HTTP requests to the Ollama server. If a function in the library returns an httr2_response object, you can parse the output with resp_process().
resp <- list_models(output = "resp") # returns a httr2 response object
# process the httr2 response object with the resp_process() function
resp_process(resp, "df")
resp_process(resp, "jsonlist") # list
resp_process(resp, "raw") # raw string
resp_process(resp, "resp") # returns the input httr2 response object
Download a model from the Ollama library (see API doc). For the list of models you can pull/download, see the Ollama library.
pull("llama3") # returns a httr2 response object
pull("mistral-openorca")
list_models() # verify you've pulled/downloaded the model
Delete a model and its data (see API doc). You can see what models you’ve downloaded with list_models(). To delete a model, specify its name.
list_models() # see the models you've pulled/downloaded
delete("all-minilm:latest") # returns a httr2 response object
Generate the next message in a chat (see API doc).
messages <- list(
  list(role = "user", content = "Who is the prime minister of the uk?")
)
chat("llama3", messages) # returns httr2 response object
chat("llama3", messages, output = "df") # data frame/tibble
chat("llama3", messages, output = "raw") # raw string
chat("llama3", messages, output = "jsonlist") # list
messages <- list(
  list(role = "user", content = "Hello!"),
  list(role = "assistant", content = "Hi! How are you?"),
  list(role = "user", content = "Who is the prime minister of the uk?"),
  list(role = "assistant", content = "Rishi Sunak"),
  list(role = "user", content = "List all the previous messages.")
)
chat("llama3", messages)
To stream the response as it is generated, set stream = TRUE (reusing the messages list defined above):
chat("llama3", messages, stream = TRUE)
Get the vector embedding of some prompt/text (see API doc). By default, the embeddings are normalized to length 1, which means the cosine similarity of two embeddings can be computed as a simple dot product:
embeddings("llama3", "Hello, how are you?")
# don't normalize embeddings
embeddings("llama3", "Hello, how are you?", normalize = FALSE)
# get embeddings for similar prompts
e1 <- embeddings("llama3", "Hello, how are you?")
e2 <- embeddings("llama3", "Hi, how are you?")
# compute cosine similarity
sum(e1 * e2) # 0.9859769
sum(e1 * e1) # 1 (identical vectors/embeddings)
# non-normalized embeddings
e3 <- embeddings("llama3", "Hello, how are you?", normalize = FALSE)
e4 <- embeddings("llama3", "Hi, how are you?", normalize = FALSE)
sum(e3 * e4) # 23695.96
sum(e3 * e3) # 24067.32
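With non-normalized embeddings, the dot product alone is no longer the cosine similarity; you also have to divide by the product of the vector norms. A worked example reusing e3 and e4 from above:
# cosine similarity = dot product / (norm of e3 * norm of e4)
sum(e3 * e4) / (sqrt(sum(e3 * e3)) * sqrt(sum(e4 * e4))) # should match sum(e1 * e2) above, ~0.986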
Generate a response for a given prompt (see API doc).
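A minimal sketch of a generate() call, assuming it accepts the same output options as chat() (the prompt text is illustrative):
generate("llama3", "Tomorrow is a...") # returns a httr2 response object
generate("llama3", "Tomorrow is a...", output = "df") # data frame/tibble (output options assumed to match chat())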