Overview
Name: RSVLM_SFT
Source: RL-MIND
Episodes: 0
Robot count: 0
Format: other
Description
MF-RSVLM
FUSE-RSVLM: Feature Fusion Vision-Language Model for Remote Sensing
Project Page | Paper | Model | Dataset
If this project helps you, please give us a star on GitHub.
Overview
MF-RSVLM is a remote sensing vision-language model (VLM). It combines a CLIP vision encoder, a two-layer MLP projector, and a Vicuna-7B LLM, and is trained in two stages for modality alignment and instruction following.
Visual Encoder: CLIP…
See the full description on the dataset page: https://huggingface.co/datasets/RL-MIND/RSVLM_SFT.
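The encoder, projector, and LLM arrangement described in the overview can be illustrated with a small sketch. This is not the project's code: it is a toy, dependency-free illustration of how a two-layer MLP projector might map vision-encoder patch features into an LLM's embedding space. The class and function names, the GELU activation between the two layers, the toy dimensions, and the choice to place visual tokens before the text embeddings are all assumptions made for illustration; the excerpt above does not specify them.

```python
import math
import random


def gelu(x):
    """Exact GELU activation (assumed here; the source does not name one)."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def make_linear(in_dim, out_dim, rng):
    """Create (weights, bias) for a dense layer with small random weights."""
    scale = 1.0 / math.sqrt(in_dim)
    weights = [[rng.uniform(-scale, scale) for _ in range(in_dim)]
               for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weights, bias


def linear(x, layer):
    """Apply a dense layer to a single feature vector."""
    weights, bias = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


class MLPProjector:
    """Hypothetical two-layer MLP: vision_dim -> llm_dim -> llm_dim."""

    def __init__(self, vision_dim, llm_dim, seed=0):
        rng = random.Random(seed)
        self.fc1 = make_linear(vision_dim, llm_dim, rng)
        self.fc2 = make_linear(llm_dim, llm_dim, rng)

    def __call__(self, patch_features):
        # Project each patch feature independently into the LLM space.
        projected = []
        for feature in patch_features:
            hidden = [gelu(h) for h in linear(feature, self.fc1)]
            projected.append(linear(hidden, self.fc2))
        return projected


def build_llm_input(visual_tokens, text_embeddings):
    """Concatenate visual tokens and text embeddings (ordering is an assumption)."""
    return visual_tokens + text_embeddings


# Toy sizes for illustration only; real encoder and LLM widths are far larger.
VISION_DIM, LLM_DIM = 8, 16
rng = random.Random(1)

# Stand-ins for vision-encoder patch features and embedded prompt tokens.
patch_features = [[rng.gauss(0, 1) for _ in range(VISION_DIM)] for _ in range(4)]
text_embeddings = [[rng.gauss(0, 1) for _ in range(LLM_DIM)] for _ in range(3)]

projector = MLPProjector(VISION_DIM, LLM_DIM)
visual_tokens = projector(patch_features)
sequence = build_llm_input(visual_tokens, text_embeddings)

print(len(sequence), len(sequence[0]))  # prints: 7 16
```

In a two-stage recipe of this kind, the projector is typically the component trained during modality alignment, with instruction tuning following afterwards; which parameters are frozen in each stage is not stated in this excerpt.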
Robots used: null
Links
Hugging Face dataset: https://huggingface.co/datasets/RL-MIND/RSVLM_SFT