- Ollama is used for inference. It serves quantized open-source models locally, which makes it convenient for building chat applications.
- Unsloth is used for both training and inference. It also provides quantized models.
- Hugging Face (HF) gives access to open-source models without any quantization, which is useful for both training and inference. It also provides the flexibility to quantize the models yourself.
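
To see why quantization matters when choosing between these tools, here is a rough sketch of how weight precision affects memory use. The 7-billion-parameter size and the helper name `weight_memory_gb` are assumptions for illustration only, and "unquantized" is taken here to mean 16-bit weights. The estimate covers the weights alone and ignores activations, the KV cache, and per-block quantization metadata, so real usage will be somewhat higher.

```python
def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory for model weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Hypothetical 7-billion-parameter model, chosen only for illustration.
params = 7e9

for label, bits in [("16-bit (unquantized)", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gb(params, bits):.1f} GB")
```

Under these assumptions, the same model needs roughly a quarter of the weight memory at 4-bit compared with 16-bit, which is largely why quantized models are attractive for local inference.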