CogVLM¶
Class: CogVLMBlockV1
Source: inference.core.workflows.core_steps.models.foundation.cog_vlm.v1.CogVLMBlockV1
CogVLM reached End Of Life
Due to dependency conflicts with newer models and security vulnerabilities discovered in the transformers library, which were patched only in library versions incompatible with the model, we announced End Of Life for CogVLM support in inference, effective since release 0.38.0.
We are leaving this block in the ecosystem until release 0.42.0 so that clients can learn about the change.
As of now, all Workflows using the block are no longer functional (a runtime error is raised). After inference release 0.42.0, the block will be removed and the Execution Engine will raise a compilation error when it sees the block in a Workflow definition.
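Because Workflows that contain this block fail at runtime and will later fail at compilation, it can help to scan stored Workflow definitions for it before upgrading inference. The following is a minimal sketch, assuming a definition is a dictionary with a top-level "steps" list whose entries carry "name" and "type" fields. The helper name find_cog_vlm_steps is hypothetical and is not part of inference.

```python
from typing import Any, Dict, List

# Type identifier documented on this page.
COG_VLM_TYPE = "roboflow_core/cog_vlm@v1"


def find_cog_vlm_steps(workflow_definition: Dict[str, Any]) -> List[str]:
    """Return the names of all steps that use the CogVLM block.

    Hypothetical helper: assumes steps live in a top-level "steps" list
    and that each step is a dict with "type" and "name" keys.
    """
    steps = workflow_definition.get("steps", [])
    return [
        step.get("name", "<unnamed>")
        for step in steps
        if isinstance(step, dict) and step.get("type") == COG_VLM_TYPE
    ]


if __name__ == "__main__":
    # Made-up definition used only to exercise the helper.
    definition = {
        "steps": [
            {"name": "ask_cog", "type": COG_VLM_TYPE},
            {"name": "other", "type": "some_other_block@v1"},
        ]
    }
    print(find_cog_vlm_steps(definition))
```

Any step name this returns points at a step that needs to be replaced or removed before the block disappears from the ecosystem.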
Ask a question to CogVLM, an open source vision-language model.
This model requires a GPU and can only be run on self-hosted devices, and is not available on the Roboflow Hosted API.
This model was previously part of the LMM block.
Type identifier¶
Use the following identifier in the step "type" field: roboflow_core/cog_vlm@v1 to add the block as a step in your workflow.
Properties¶
Name | Type | Description | Refs |
---|---|---|---|
name | str | Enter a unique identifier for this step. | ❌ |
prompt | str | Text prompt to the CogVLM model. | ✅ |
json_output_format | Dict[str, str] | Dictionary that maps the name of each requested output field to its description. | ❌ |
The Refs column marks whether a property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
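To illustrate the Refs column: prompt may hold a reference that is resolved at runtime, while name and json_output_format must be literal values. The sketch below assumes that a dynamic reference is a string starting with "$" (the "$inputs.image" form appears in the example on this page). The "$inputs.prompt" parameter and the helper names is_selector and misplaced_refs are hypothetical.

```python
from typing import Any, Dict, List

# Properties from the table above, mapped to whether the Refs column allows
# dynamic values (True) or requires a literal (False).
ACCEPTS_REFS = {"name": False, "prompt": True, "json_output_format": False}


def is_selector(value: Any) -> bool:
    """Assumption: a dynamic reference is a string starting with '$'."""
    return isinstance(value, str) and value.startswith("$")


def misplaced_refs(step: Dict[str, Any]) -> List[str]:
    """List documented properties that hold a reference but must be literal."""
    return [
        prop
        for prop, accepts in ACCEPTS_REFS.items()
        if not accepts and is_selector(step.get(prop))
    ]


step = {
    "name": "cog",
    "type": "roboflow_core/cog_vlm@v1",
    "images": "$inputs.image",
    "prompt": "$inputs.prompt",  # hypothetical runtime parameter
    "json_output_format": {"count": "number of cats in the picture"},
}

print(misplaced_refs(step))
```

The check only covers the three properties listed in the table; it is not a substitute for the validation the Execution Engine performs.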
Available Connections¶
Compatible Blocks
Check what blocks you can connect to CogVLM in version v1.
- inputs: Keypoint Visualization, Background Color Visualization, Stitch OCR Detections, VLM as Classifier, Polygon Zone Visualization, OCR Model, Stitch Images, Florence-2 Model, VLM as Detector, Florence-2 Model, Polygon Visualization, Crop Visualization, Image Convert Grayscale, Grid Visualization, Twilio SMS Notification, Slack Notification, Image Blur, Mask Visualization, Relative Static Crop, Anthropic Claude, OpenAI, SIFT, Camera Calibration, Circle Visualization, Image Slicer, Color Visualization, Image Slicer, OpenAI, CogVLM, Label Visualization, Dot Visualization, Absolute Static Crop, Blur Visualization, Pixelate Visualization, Dynamic Crop, LMM, Keypoint Detection Model, Roboflow Dataset Upload, Corner Visualization, Google Vision OCR, Image Preprocessing, SIFT Comparison, Image Threshold, Single-Label Classification Model, Stability AI Image Generation, Trace Visualization, Google Gemini, Bounding Box Visualization, Local File Sink, Camera Focus, Webhook Sink, Llama 3.2 Vision, Email Notification, Ellipse Visualization, Stability AI Inpainting, Instance Segmentation Model, Roboflow Custom Metadata, Roboflow Dataset Upload, Model Monitoring Inference Aggregator, Perspective Correction, Model Comparison Visualization, Object Detection Model, Depth Estimation, Reference Path Visualization, LMM For Classification, CSV Formatter, Multi-Label Classification Model, Line Counter Visualization, Triangle Visualization, Clip Comparison, Halo Visualization, Image Contours, Classification Label Visualization
- outputs: Keypoint Visualization, Multi-Label Classification Model, Stitch OCR Detections, Instance Segmentation Model, First Non Empty Or Default, Detections Stitch, Relative Static Crop, Detections Merge, OpenAI, LMM, Dot Visualization, Roboflow Dataset Upload, Rate Limiter, Byte Tracker, Velocity, SIFT Comparison, Continue If, QR Code Detection, Byte Tracker, Clip Comparison, Pixel Color Count, Model Comparison Visualization, Overlap Filter, Triangle Visualization, Cosine Similarity, Stability AI Image Generation, Image Contours, Keypoint Detection Model, Detections Stabilizer, Size Measurement, Expression, Florence-2 Model, Grid Visualization, Byte Tracker, SIFT, Image Slicer, Dominant Color, Color Visualization, Detection Offset, CogVLM, Label Visualization, JSON Parser, Detections Consensus, Google Vision OCR, Buffer, Time in Zone, SmolVLM2, Webhook Sink, Email Notification, Model Monitoring Inference Aggregator, Line Counter, Perspective Correction, Object Detection Model, Reference Path Visualization, CSV Formatter, Multi-Label Classification Model, Clip Comparison, Segment Anything 2 Model, OpenAI, Qwen2.5-VL, Classification Label Visualization, Polygon Zone Visualization, Stitch Images, VLM as Detector, Florence-2 Model, Polygon Visualization, Barcode Detection, Crop Visualization, YOLO-World Model, Slack Notification, Image Blur, Mask Visualization, Distance Measurement, Line Counter, Time in Zone, Anthropic Claude, VLM as Classifier, Identify Changes, Circle Visualization, Cache Get, Detections Classes Replacement, Dynamic Crop, Identify Outliers, Single-Label Classification Model, Image Preprocessing, Image Threshold, Single-Label Classification Model, Local File Sink, Moondream2, Camera Focus, Instance Segmentation Model, Stability AI Inpainting, Ellipse Visualization, Roboflow Custom Metadata, Line Counter Visualization, Detections Filter, Property Definition, Dimension Collapse, Background Color Visualization, VLM as Classifier, OCR Model, Template Matching, Dynamic Zone, Image Convert Grayscale, Path Deviation, Path Deviation, Camera Calibration, Google Gemini, Image Slicer, VLM as Detector, Keypoint Detection Model, Absolute Static Crop, Blur Visualization, Pixelate Visualization, Corner Visualization, Bounding Rectangle, SIFT Comparison, Delta Filter, Trace Visualization, Bounding Box Visualization, Llama 3.2 Vision, Roboflow Dataset Upload, Data Aggregator, LMM For Classification, Depth Estimation, CLIP Embedding Model, Detections Transformation, Object Detection Model, Halo Visualization, Cache Set, Twilio SMS Notification, Gaze Detection
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds CogVLM in version v1 has.
Bindings
- input
- output
    - parent_id (parent_id): Identifier of parent for step output.
    - root_parent_id (parent_id): Identifier of parent for step output.
    - image (image_metadata): Dictionary with image metadata required by supervision.
    - structured_output (dictionary): Dictionary.
    - raw_output (string): String value.
    - * (*): Equivalent of any element.
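Other steps and Workflow outputs refer to these outputs through selectors. The sketch below assumes the "$steps.<step_name>.<output_name>" selector form. The helper name output_selectors and the step name "cog" are hypothetical.

```python
from typing import Dict

# Output names listed in the bindings above (the "*" wildcard is omitted).
COG_VLM_OUTPUTS = (
    "parent_id",
    "root_parent_id",
    "image",
    "structured_output",
    "raw_output",
)


def output_selectors(step_name: str) -> Dict[str, str]:
    """Map each documented output name to an assumed selector string."""
    return {
        output: f"$steps.{step_name}.{output}" for output in COG_VLM_OUTPUTS
    }


selectors = output_selectors("cog")
print(selectors["structured_output"])
print(selectors["raw_output"])
```

In this reading, structured_output carries the dictionary shaped by json_output_format and raw_output carries the unparsed model text.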
Example JSON definition of step CogVLM in version v1

    {
        "name": "<your_step_name_here>",
        "type": "roboflow_core/cog_vlm@v1",
        "images": "$inputs.image",
        "prompt": "my prompt",
        "json_output_format": {
            "count": "number of cats in the picture"
        }
    }
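For context, the step can be placed in a complete Workflow definition. The sketch below assumes the usual top-level layout with "version", "inputs", "steps" and "outputs", and the "WorkflowImage" and "JsonField" entry types. The step name "cog" and the output name "answer" are made up for illustration. Since the block has reached End Of Life, a definition like this is mainly useful for recognising legacy Workflows, not for running them.

```python
import json

# Assumed overall layout of a Workflow definition; only the step entry
# is taken from the example on this page.
workflow_definition = {
    "version": "1.0",
    "inputs": [{"type": "WorkflowImage", "name": "image"}],
    "steps": [
        {
            "name": "cog",  # hypothetical step name
            "type": "roboflow_core/cog_vlm@v1",
            "images": "$inputs.image",
            "prompt": "my prompt",
            "json_output_format": {"count": "number of cats in the picture"},
        }
    ],
    "outputs": [
        {
            "type": "JsonField",
            "name": "answer",  # hypothetical output name
            "selector": "$steps.cog.structured_output",
        }
    ],
}

# Serialise the definition so it can be stored or inspected as JSON.
print(json.dumps(workflow_definition, indent=2))
```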