Keras, a popular machine learning tool known for its high-level abstractions and user-friendliness, faces challenges around the cost of training large models, the complexity of preprocessing and metrics computation, and the need for better training performance. In response, researchers from the Keras Team at Google have introduced KerasCV and KerasNLP. These extensions of the Keras API target computer vision (CV) and natural language processing (NLP), with an emphasis on ease of use and performance. Both are designed to run on JAX, TensorFlow, and PyTorch.
KerasCV and KerasNLP are notable for their modular design. At a low level, they offer building blocks for both models and data preprocessing. At a higher level, they provide pretrained task models for popular architectures such as Stable Diffusion and GPT-2. These high-level models include built-in preprocessing, pretrained weights, and fine-tuning capabilities. The libraries also support XLA compilation and use TensorFlow’s tf.data API for efficient preprocessing.
Like Hugging Face’s Transformers library, KerasCV and KerasNLP offer pretrained model checkpoints for many transformer architectures. However, while Hugging Face follows a “repeat yourself” approach, in which each model's implementation is largely self-contained, KerasNLP uses a layered strategy that reimplements large language models from shared components with as little code as possible. All of the pretrained models are published on Kaggle Models, which also makes them usable offline.
The Keras Domain Packages API takes a layered approach built on three main levels of abstraction: Foundational Components (independent, composable modules), Pretrained Backbones (fine-tuning-ready models, with matched tokenizers in the NLP case), and Task Models (specialized models that combine the lower-level modules behind a unified training and inference interface).
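The three levels can be pictured with a small, library-free sketch. Everything in it is hypothetical and chosen only for illustration: the class names `WhitespaceTokenizer`, `ToyBackbone`, and `ToyClassifier`, the fixed "weights", and the scoring rule are assumptions, not the KerasCV or KerasNLP API. The point is only how a task model can wrap a backbone, which in turn carries its own matched tokenizer.

```python
class WhitespaceTokenizer:
    """Level 1 (foundational component): a standalone, reusable module."""

    def __init__(self, vocabulary):
        self.token_to_id = {tok: i for i, tok in enumerate(vocabulary)}
        self.unk_id = len(vocabulary)  # id reserved for unknown tokens

    def __call__(self, text):
        words = text.lower().split()
        return [self.token_to_id.get(w, self.unk_id) for w in words]


class ToyBackbone:
    """Level 2 (backbone): a feature extractor paired with its tokenizer."""

    def __init__(self, vocabulary):
        self.tokenizer = WhitespaceTokenizer(vocabulary)
        # Stand-in for pretrained weights: known tokens score 1.0,
        # the unknown token scores 0.0.
        self.weights = [1.0] * len(vocabulary) + [0.0]

    def __call__(self, token_ids):
        return [self.weights[i] for i in token_ids]


class ToyClassifier:
    """Level 3 (task model): raw text in, prediction out."""

    def __init__(self, backbone, threshold=0.5):
        self.backbone = backbone
        self.threshold = threshold

    def predict(self, text):
        token_ids = self.backbone.tokenizer(text)  # built-in preprocessing
        features = self.backbone(token_ids)
        if not features:
            return 0
        score = sum(features) / len(features)
        return int(score >= self.threshold)


backbone = ToyBackbone(["keras", "is", "fun"])
classifier = ToyClassifier(backbone)
prediction = classifier.predict("Keras is fun")
```

In this toy setup each level remains usable on its own: the tokenizer can be called directly, the backbone can be reused under a different task head, and the classifier hides both behind a single `predict` call.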
Keras 3 lets users select the fastest backend for a given task, and with that choice it consistently outperforms its predecessor, Keras 2. Looking ahead, the Keras Team plans to expand the project’s capabilities, in particular by widening the range of available multimodal models and improving integration with backend-specific serving solutions for large models. KerasCV and KerasNLP provide configurable components for rapid model prototyping, as well as a variety of pretrained models for CV and NLP tasks. These resources are designed for users of JAX, TensorFlow, or PyTorch and aim to deliver state-of-the-art training and inference performance. Detailed user guides for both libraries are available on keras.io.
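As a rough sketch of what selecting a backend can look like in practice: Keras 3 reads the `KERAS_BACKEND` environment variable, which has to be set before `keras` is first imported. The helper names `pick_fastest_backend` and `configure_backend` below are hypothetical, and the timings are placeholder values for illustration only, not benchmark results.

```python
import os

# Backend identifiers accepted by the KERAS_BACKEND environment variable.
SUPPORTED_BACKENDS = ("jax", "tensorflow", "torch")


def pick_fastest_backend(step_times_ms):
    """Return the backend with the lowest measured step time.

    `step_times_ms` maps backend name -> step time in milliseconds,
    as measured by the user on their own model and hardware.
    """
    if not step_times_ms:
        raise ValueError("no timings provided")
    unknown = set(step_times_ms) - set(SUPPORTED_BACKENDS)
    if unknown:
        raise ValueError(f"unsupported backends: {sorted(unknown)}")
    return min(step_times_ms, key=step_times_ms.get)


def configure_backend(name):
    """Set KERAS_BACKEND; call this before the first `import keras`."""
    if name not in SUPPORTED_BACKENDS:
        raise ValueError(f"unsupported backend: {name!r}")
    os.environ["KERAS_BACKEND"] = name
    return name


# Placeholder numbers for illustration only, NOT real measurements.
example_times_ms = {"jax": 1.0, "tensorflow": 2.0, "torch": 3.0}
chosen = configure_backend(pick_fastest_backend(example_times_ms))
# A subsequent `import keras` in this process would now use `chosen`.
```

Which backend is actually fastest depends on the model and hardware, so the timings would need to come from the user's own measurements.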