We introduce Maya, an open-source Multilingual Multimodal model. Our contributions are:
- A multilingual image-text pretraining dataset in eight languages, based on the LLaVA pretraining dataset;
- A novel toxicity-free version of this dataset across the same eight languages; and
- A multilingual image-text 8B model supporting these languages, aimed at improving cultural and linguistic understanding.