Grok-1
The last open-source model we’ll discuss in this section is Grok-1, which was released by xAI in early 2024. Like Mixtral, it uses a mixture-of-experts architecture and is not purpose-built for a particular product domain. It was inspired by the science fiction classic “The Hitchhiker’s Guide to the Galaxy,” and is intended to have a more humorous personality than other models.
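Since Grok-1 shares Mixtral’s mixture-of-experts design (eight experts, of which two are active for each token), it can help to see the routing idea in miniature. The sketch below is a toy, dependency-free illustration: the gate weights, expert functions, and top-2 selection shown here are illustrative assumptions, not Grok-1’s actual implementation.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of gate scores
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_forward(x, gate_weights, experts, top_k=2):
    """Toy mixture-of-experts layer: score each expert with a linear
    gate, keep the top_k experts, and mix their outputs weighted by
    the renormalized gate probabilities."""
    # Gate score for each expert: dot product of its gate row with x
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in gate_weights]
    probs = softmax(scores)
    # Select the top_k experts by gate probability
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in top)
    # Only the selected experts run; their outputs are mixed
    out = [0.0] * len(x)
    for i in top:
        y = experts[i](x)
        for j in range(len(x)):
            out[j] += (probs[i] / norm) * y[j]
    return out
```

Because only `top_k` experts execute per token, a model like this can hold far more parameters than it uses for any single forward pass, which is the efficiency argument behind the architecture.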
Unlike the other models in this chapter, we cannot load Grok-1 directly through the pipeline module. Instead, we can use the following code to load the weights and run the model:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

torch.set_default_dtype(torch.bfloat16)

tokenizer = AutoTokenizer.from_pretrained(
    "hpcai-tech/grok-1",
    trust_remote_code=True,
)
model = AutoModelForCausalLM.from_pretrained(
    "hpcai-tech/grok-1",
    trust_remote_code=True,
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

# Tokenize a prompt, generate a completion, and decode it
inputs = tokenizer("Where is the best place in the galaxy?",
                   return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))