Image by Author
Hugging Face has developed a new serialization format called Safetensors, aimed at simplifying and streamlining the storage and loading of large and complex tensors. Tensors are the primary data structure used in deep learning, and their size can pose challenges when it comes to efficiency.
Safetensors uses a combination of efficient serialization and compression algorithms to reduce the size of large tensors, making it faster and more efficient than other serialization formats like pickle. This means that loading <code>model.safetensors</code> is 76.6X faster on CPU and 2X faster on GPU compared to the traditional PyTorch serialization format, <code>pytorch_model.bin</code>. Check out the Speed Comparison.
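Those benchmark numbers come from Hugging Face's published comparison, but you can get a rough feel for the difference on your own machine. Below is a minimal timing sketch; the paths <code>pytorch_model.bin</code> and <code>model.safetensors</code> are placeholders for a checkpoint you already have saved in both formats.

import time

import torch
from safetensors.torch import load_file

# Placeholder paths -- point these at the same checkpoint saved in both formats.
start = time.perf_counter()
torch.load("pytorch_model.bin", map_location="cpu")  # pickle-based PyTorch format
print(f"torch.load: {time.perf_counter() - start:.2f}s")

start = time.perf_counter()
load_file("model.safetensors", device="cpu")  # safetensors format
print(f"load_file: {time.perf_counter() - start:.2f}s")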
Ease of use
Safetensors has a simple and intuitive API for serializing and deserializing tensors in Python. This means that developers can focus on building their deep learning models instead of spending time on serialization and deserialization.
Cross-platform compatibility
You can serialize tensors in Python and conveniently load the resulting files in various programming languages and platforms, such as C++, Java, and JavaScript. This allows for seamless sharing of models across different programming environments.
Speed
Safetensors is optimized for speed and can efficiently handle the serialization and deserialization of large tensors. As a result, it is an excellent choice for applications that use large language models.
Size Optimization
It uses a combination of efficient serialization and compression algorithms to reduce the size of large tensors, resulting in faster and more efficient performance compared to other serialization formats such as pickle.
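If you have the same checkpoint saved in both formats, you can also compare the on-disk footprint directly. A small sketch with placeholder file names:

import os

# Placeholder paths -- point these at the same checkpoint saved in both formats.
for path in ("pytorch_model.bin", "model.safetensors"):
    print(f"{path}: {os.path.getsize(path) / 1e6:.1f} MB")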
Secure
To prevent any corruption during storage or transfer of serialized tensors, Safetensors uses a checksum mechanism. This adds a layer of security, ensuring that all data stored in Safetensors is accurate and reliable. Moreover, it prevents DOS attacks.
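Unlike pickle, loading a safetensors file never executes arbitrary code, and you can inspect a file without materializing any tensors. A minimal sketch using <code>safe_open</code> (the file name is a placeholder):

from safetensors import safe_open

# Read only the header: tensor names and optional metadata, no tensor data.
with safe_open("model.safetensors", framework="pt") as f:
    print(f.keys())
    print(f.metadata())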
Lazy loading
When working in distributed settings with multiple nodes or GPUs, it is helpful to load only a portion of the tensors on each model. BLOOM uses this format to load the model on 8 GPUs in just 45 seconds, compared to the regular PyTorch weights, which took 10 minutes.
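In practice, this partial loading is done with the same <code>safe_open</code> API, which reads only the tensors (or tensor slices) you ask for. A minimal sketch, assuming the file contains a tensor named "a.weight":

from safetensors import safe_open

with safe_open("model.safetensors", framework="pt", device="cpu") as f:
    weight = f.get_tensor("a.weight")       # load a single tensor
    weight_slice = f.get_slice("a.weight")  # or only part of one
    rows, cols = weight_slice.get_shape()
    first_half = weight_slice[:, : cols // 2]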
In this section, we'll look at the <code>safetensors</code> API and how you can save and load tensor files.
We can simply install safetensors using the pip package manager:
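pip install safetensors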
We'll use the example from Torch shared tensors to build a simple neural network and save the model using the <code>safetensors.torch</code> API for PyTorch.
from torch import nn

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.a = nn.Linear(100, 100)
        self.b = self.a  # self.b shares the same weights as self.a

    def forward(self, x):
        return self.b(self.a(x))

model = Model()
print(model.state_dict())
As we can see, we have successfully created the model.
OrderedDict([('a.weight', tensor([[-0.0913, 0.0470, -0.0209, ..., -0.0540, -0.0575, -0.0679], [ 0.0268, 0.0765, 0.0952, ..., -0.0616, 0.0146, -0.0343], [ 0.0216, 0.0444, -0.0347, ..., -0.0546, 0.0036, -0.0454], ...,
Now, we'll save the model by providing the <code>model</code> object and the file name. After that, we'll load the saved file into the <code>model</code> object created using <code>nn.Module</code>.
from safetensors.torch import load_model, save_model

save_model(model, "model.safetensors")
load_model(model, "model.safetensors")
print(model.state_dict())
OrderedDict([('a.weight', tensor([[-0.0913, 0.0470, -0.0209, ..., -0.0540, -0.0575, -0.0679], [ 0.0268, 0.0765, 0.0952, ..., -0.0616, 0.0146, -0.0343], [ 0.0216, 0.0444, -0.0347, ..., -0.0546, 0.0036, -0.0454], ...,
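One thing worth noting: we use <code>save_model</code> here rather than <code>save_file</code> because <code>self.a</code> and <code>self.b</code> point to the same underlying weights. <code>save_model</code> deduplicates shared tensors, while (at the time of writing) passing a state dict with shared tensors to <code>save_file</code> raises a <code>RuntimeError</code> suggesting <code>save_model</code> instead. A quick way to see this for yourself:

from safetensors.torch import save_file

try:
    # a.weight/a.bias and b.weight/b.bias share the same storage
    save_file(model.state_dict(), "model.safetensors")
except RuntimeError as err:
    print(err)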
In the second example, we'll try to save the tensors created using <code>torch.zeros</code>. For that, we'll use the <code>save_file</code> function.
import torch
from safetensors.torch import save_file, load_file
tensors = {
    "weight1": torch.zeros((1024, 1024)),
    "weight2": torch.zeros((1024, 1024))
}
save_file(tensors, "new_model.safetensors")
And to load the tensors, we'll use the <code>load_file</code> function.
load_file("new_model.safetensors")
{'weight1': tensor([[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
...,
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.]]),
'weight2': tensor([[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
...,
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.]])}
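<code>load_file</code> can also place the tensors directly on an accelerator by passing a <code>device</code> argument; a small sketch (assumes a CUDA-capable machine):

from safetensors.torch import load_file

# Load tensors straight onto the first GPU instead of CPU memory.
gpu_tensors = load_file("new_model.safetensors", device="cuda:0")
print(gpu_tensors["weight1"].device)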
The safetensors API is available for PyTorch, TensorFlow, PaddlePaddle, Flax, and NumPy. You can learn more by reading the Safetensors documentation.
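For example, because every framework reads the same file format, you can write a file with the NumPy API and read it back with the PyTorch API. A minimal sketch (the tensor name and file name are arbitrary):

import numpy as np
from safetensors.numpy import save_file as np_save_file
from safetensors.torch import load_file as torch_load_file

# Write with the NumPy API ...
np_save_file({"embedding": np.zeros((4, 8), dtype=np.float32)}, "numpy_model.safetensors")

# ... and read the same file back as PyTorch tensors.
tensors = torch_load_file("numpy_model.safetensors")
print(tensors["embedding"].shape)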
Image from Torch API
In short, safetensors is a new way to store the large tensors used in deep learning applications. Compared to other techniques, it offers faster, more efficient, and user-friendly features. Additionally, it ensures the confidentiality and safety of data while supporting various programming languages and platforms. By using Safetensors, machine learning engineers can optimize their time and focus on developing better models.
I highly recommend using Safetensors for your projects. Many top AI companies, such as Hugging Face, EleutherAI, and StabilityAI, use Safetensors for their projects.
Abid Ali Awan (@1abidaliawan) is a certified data scientist professional who loves building machine learning models. Currently, he is focusing on content creation and writing technical blogs on machine learning and data science technologies. Abid holds a Master's degree in Technology Management and a bachelor's degree in Telecommunication Engineering. His vision is to build an AI product using a graph neural network for students struggling with mental illness.