InternVL

Posts

  • 2025/04/11

    InternVL3: Advancing Open-Source Multimodal Models with Native Multimodal Pretraining

  • 2025/03/13

    VisualPRM: An Effective Process Reward Model for Multimodal Reasoning

  • 2024/12/20

    InternVL2.5-MPO: Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization

  • 2024/12/05

    InternVL2.5: Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling

  • 2024/10/25

    Mini-InternVL 2.0: A Flexible-Transfer Pocket Multimodal Model with 5% Parameters and 90% Performance

  • 2024/10/10

    Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training

  • 2024/07/31

    InternOmni: Extending InternVL with Audio Modality

  • 2024/07/04

    InternVL2: Better than the Best—Expanding Performance Boundaries of Open-Source Multimodal Models with the Progressive Scaling Strategy

  • 2024/05/31

    ShareGPT-4o: Comprehensive Multimodal Annotations With GPT-4o

  • 2024/05/25

    Mini-InternVL 1.5: A Powerful Pocket Multimodal Model with 8% Parameters for 80% Performance

  • 2024/04/30

    InternVL 1.5: How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites

  • 2024/02/21

    InternVL 1.2: Scaling up LLM to 34B

  • 2024/01/24

    InternVL 1.1: Enhance Chinese and OCR Capabilities

  • 2023/12/12

    InternVL 1.0: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks