A ComfyUI custom node collection that estimates the Field of View (FOV) and tilt angle (horizon angle) of images using computer vision techniques.
- Two Analysis Methods:
- RGB-based: Analyzes images directly using edge detection and line detection
- Depth-based: Uses depth maps for more robust geometric analysis (recommended)
- FOV Estimation: Estimates the camera's field of view in degrees by detecting vanishing points
- Tilt Detection: Detects the horizon angle and tilt of the camera in degrees
- Visual Debugging: Optionally overlays detected lines, vanishing points, and measurements on the output image
- Configurable Parameters: Adjust thresholds for different image types and depth estimation models
-
Navigate to your ComfyUI custom nodes directory:
cd ComfyUI/custom_nodes/ -
Clone this repository:
git clone https://github.com/gitcapoom/comfyui_fovestimator.git
-
Install dependencies:
cd comfyui_fovestimator pip install -r requirements.txt -
Restart ComfyUI
Two nodes are available under the image/analysis category:
Uses depth maps for more accurate and robust estimation.
Workflow:
Load Image → Depth Estimation (MiDaS/Depth Anything) → FOV & Tilt Estimator (Depth)
↓
Original Image (for visualization)
Inputs:
- depth_map (required): Depth map from a depth estimation node (MiDaS, Depth Anything, etc.)
- image (required): Original RGB image for visualization overlay
- depth_edge_threshold (optional, default: 0.1): Threshold for depth discontinuities (0.01-1.0)
- line_threshold (optional, default: 50): Threshold for Hough line detection (10-300)
- visualize (optional, default: True): Whether to draw detected features
Outputs:
- annotated_image: RGB image with visualization overlays
- fov_degrees: Estimated horizontal field of view (float)
- tilt_degrees: Estimated tilt angle (float)
- info: Formatted text with results
Example Workflow:
- Load your image
- Pass it through a depth estimation node (MiDaS, Depth Anything, etc.)
- Connect both the depth map and original image to the Depth FOV Estimator
- The node will analyze depth discontinuities (3D edges) for more accurate results
Analyzes RGB images directly without requiring depth estimation.
Inputs:
- image (required): The input image to analyze
- edge_threshold_low (optional, default: 50): Lower threshold for Canny edge detection (0-255)
- edge_threshold_high (optional, default: 150): Upper threshold for Canny edge detection (0-255)
- line_threshold (optional, default: 100): Threshold for Hough line detection (10-500)
- visualize (optional, default: True): Whether to draw detected features
Outputs:
- annotated_image: The input image with visualization overlays
- fov_degrees: Estimated field of view in degrees (float)
- tilt_degrees: Estimated tilt/horizon angle in degrees (float)
- info: Text string with formatted results
Example Workflow:
- Load an image
- Connect it directly to the FOV & Tilt Estimator (RGB) node
- View the results with detected features
FOV Estimation from Depth:
- Detects depth discontinuities (3D edges) using Sobel gradients on the depth map
- Applies bilateral filtering to preserve edges while smoothing
- Detects lines from depth discontinuities using Hough transform
- Filters for non-horizontal lines (converging building edges, roads, etc.)
- Finds dominant vanishing point using RANSAC-like clustering
- Calculates focal length from vanishing point position
- Converts to horizontal FOV:
hfov = 2 × arctan(width / (2 × focal_length))
Tilt Estimation from Depth:
- Detects depth discontinuities (horizon appears as depth transition)
- Finds horizontal lines in depth edges
- Calculates horizon position relative to frame center
- Converts to tilt angle using vertical FOV
- Falls back to depth gradient analysis if no clear horizon found
Advantages:
- Depth discontinuities represent real 3D geometry, independent of texture/lighting
- More robust to shadows, reflections, and complex textures
- Cleaner edge detection from structural boundaries
- Can analyze depth gradients when lines aren't clear
FOV Estimation: Uses the pinhole camera model and vanishing point analysis:
- Applies Canny edge detection to find edges in RGB image
- Uses Hough transform to detect lines
- Filters for non-horizontal converging lines
- Finds dominant vanishing point through intersection clustering
- Calculates focal length from vanishing point distance to image center
- Converts to FOV:
hfov = 2 × arctan(width / (2 × focal_length))
Tilt Estimation:
- Detects edges using Canny
- Finds horizontal lines (potential horizons)
- Calculates horizon position relative to frame center
- Converts to tilt angle using vertical FOV
- Tilt = 0° when horizon is centered
Key Principle: The distance from the image center to a vanishing point approximates the camera's focal length in pixels, enabling FOV calculation from pure geometry.
- Use high-quality depth maps: MiDaS v3.1 or Depth Anything v2 work best
- Adjust depth_edge_threshold:
- Lower (0.05-0.08) for subtle depth changes
- Higher (0.15-0.25) for noisy depth maps
- For outdoor scenes: Works well even without strong architectural features
- For indoor scenes: Excellent with depth - detects walls, furniture edges clearly
- For architecture/buildings: Default settings work well
- For outdoor landscapes: Reduce line_threshold to 50-70
- For noisy images: Increase edge_threshold_low to 70-100
- For subtle edges: Decrease edge_threshold_high to 100-120
- Use depth-based method whenever possible - it's significantly more robust
- Images with clear geometric features (buildings, roads) work best for both methods
- For organic scenes (forests, clouds), depth-based method has better fallback behavior
- Requires depth estimation preprocessing (adds computational cost)
- Accuracy depends on depth map quality
- Very noisy depth maps (from poor lighting) may produce unreliable results
- Still assumes pinhole camera model (no fisheye correction)
- FOV estimation requires clear converging lines (buildings, roads, railroad tracks)
- Tilt estimation requires visible horizons
- Sensitive to texture, lighting, and shadows
- May struggle with:
- Very cluttered scenes with few clear lines
- Organic/natural scenes without geometric features
- Low contrast images
- Assume linear perspective (pinhole camera model)
- Not designed for extreme wide-angle or fisheye distortion
- Accuracy degrades with severe lens distortion
- Computer Vision: Uses OpenCV for all image processing operations
- Depth Processing (Depth-based node):
- Sobel gradient computation for depth discontinuities
- Bilateral filtering for edge-preserving smoothing
- Depth gradient analysis for fallback tilt estimation
- Handles various depth map formats (1-channel, 3-channel, normalized/unnormalized)
- Edge Detection:
- RGB: Canny edge detection with configurable thresholds
- Depth: Gradient-based depth discontinuity detection
- Line Detection: Hough line transform (both standard and probabilistic)
- FOV Estimation:
- RANSAC-like clustering for robust vanishing point detection
- Pinhole camera model:
focal_length ≈ distance(principal_point, vanishing_point) - FOV calculation:
hfov = 2 × arctan(width / (2 × f)) - Intelligent fallbacks based on aspect ratio
- Tilt Estimation:
- Horizon line detection from horizontal edges
- Position analysis relative to frame center
- Vertical FOV conversion:
vfov = 2 × arctan(tan(hfov/2) / aspect_ratio) - Median-based robust estimation resistant to outliers
- Depth gradient analysis for scenes without clear horizons
For best results with the depth-based estimator, use these ComfyUI depth estimation nodes:
- MiDaS (v3.0, v3.1) - Good general-purpose depth estimation
- Depth Anything (v1, v2) - State-of-the-art depth estimation, recommended
- ZoeDepth - Metric depth estimation
- LeReS - High-quality depth for indoor scenes
Install these through ComfyUI Manager or manually from their respective repositories.
MIT License
Contributions are welcome! Please feel free to submit issues or pull requests.