In this project, we explore the robustness and reliability of camera calibration by performing uncertainty analysis on real-world data. Camera calibration is the process of determining a camera's intrinsic parameters (like focal length) and extrinsic parameters (position and orientation in 3D space) using known reference points. But how reliable are these estimates? That's what we set out to discover.
The Original Calibration Code
The initial code (provided by our professor) performs a straightforward camera calibration using 6 known points with their GPS coordinates (latitude, longitude, altitude) and their corresponding pixel locations (x, y) in an image. The goal is to estimate the camera's intrinsic matrix K (containing focal length and principal point) and the camera's 3D location in GPS coordinates.
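The camera's position is not a direct output of calibration; it falls out of the extrinsic rotation and translation. A minimal sketch, assuming the OpenCV convention x_cam = R·X_world + t (with R obtained, for example, from the rotation vector). The helper name `camera_center` is ours and is not part of the original code, and the result is in the local frame of the 3D points, so it would still need converting back to GPS.

```python
def camera_center(R, t):
    """Return the camera position C = -R^T t in world coordinates.

    Assumes the convention x_cam = R @ X_world + t, with R given as a
    3x3 nested list and t as a length-3 sequence.
    """
    return tuple(
        -sum(R[row][col] * t[row] for row in range(3))
        for col in range(3)
    )


# Hypothetical example: a camera rotated 90 degrees about the z-axis.
R = [[0.0, -1.0, 0.0],
     [1.0,  0.0, 0.0],
     [0.0,  0.0, 1.0]]
t = [1.0, 2.0, 3.0]
center = camera_center(R, t)
```

Plugging `center` back into R·C + t gives the zero vector, which is a quick sanity check that the point really is the camera origin.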
The calibration process uses OpenCV's calibrateCamera function, which minimizes the reprojection error between the observed pixel locations and where the 3D points would project given the estimated camera parameters. The original code successfully calibrated the camera with a reprojection error of approximately 39 pixels.
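To make "reprojection error" concrete, here is a small sketch of the quantity being minimized: project each 3D point through a pinhole model and measure how far it lands from the observed pixel. The function names and the sample numbers are illustrative assumptions, the points are taken to be already in the camera frame, and lens distortion is ignored.

```python
import math


def project(point_cam, f, cx, cy):
    """Project a 3D point in the camera frame (z forward) to pixels."""
    x, y, z = point_cam
    return (f * x / z + cx, f * y / z + cy)


def rms_reprojection_error(observed, projected):
    """Root-mean-square pixel distance between matching 2D points."""
    squared = [
        (u - pu) ** 2 + (v - pv) ** 2
        for (u, v), (pu, pv) in zip(observed, projected)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical values, chosen only to illustrate the arithmetic.
f, cx, cy = 1000.0, 960.0, 540.0
points_cam = [(1.0, 0.5, 10.0), (-2.0, 1.0, 20.0)]
observed = [(1063.0, 594.0), (860.0, 590.0)]
projected = [project(p, f, cx, cy) for p in points_cam]
error = rms_reprojection_error(observed, projected)
```

Here the first point is off by 5 pixels and the second is exact, so the RMS comes out at about 3.5 pixels.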
Key outputs from the original calibration:
- Focal length: ~1,328 pixels
- Camera location: approximately 43.07°N, 89.41°W, at ~323m altitude
- Horizontal FOV: 88.62°, Vertical FOV: 72.42°
Part (a): Leave-One-Out Analysis
To test the robustness of our calibration to individual points, we implemented leave-one-out cross-validation. This technique systematically removes one point at a time and recalibrates the camera using the remaining 5 points, generating 6 different camera location estimates.
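The loop itself is simple. A minimal sketch of its shape, where `calibrate` is a hypothetical stand-in for whatever function wraps the real calibration call (the toy version below just averages numbers so the example runs on its own):

```python
def leave_one_out(points, calibrate):
    """Run `calibrate` once per point, each time with that point removed.

    `calibrate` is a hypothetical stand-in for the real calibration
    wrapper; it takes a list of points and returns a result.
    """
    results = []
    for i in range(len(points)):
        subset = points[:i] + points[i + 1:]
        results.append(calibrate(subset))
    return results


# Toy stand-in: "calibrate" by averaging, just to show the loop shape.
def toy_calibrate(points):
    return sum(points) / len(points)


estimates = leave_one_out([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], toy_calibrate)
```

With 6 input points this yields 6 estimates, one per removed point, matching the count described above.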
Results
The leave-one-out analysis revealed good stability in the calibration:
- Camera Location Variability:
  - Latitude std: 0.0001833° (~20 meters)
  - Longitude std: 0.0003581° (~27 meters)
  - Altitude std: 7.41 meters
- Focal Length Variability:
  - Mean: 1,425.63 pixels
  - Std: 250.89 pixels
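The meter figures in parentheses come from converting small angular spreads into ground distance. A rough sketch of that conversion, assuming a spherical Earth with a mean radius of 6,371 km (our assumption; the original script may have used a different constant). Results can differ by a few meters from the rounded values quoted in this post, depending on the radius and latitude assumed.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; spherical approximation


def lat_deg_to_m(delta_deg):
    """Meters spanned by a small change in latitude."""
    return math.radians(delta_deg) * EARTH_RADIUS_M


def lon_deg_to_m(delta_deg, at_lat_deg):
    """Meters spanned by a small change in longitude at a given latitude."""
    shrink = math.cos(math.radians(at_lat_deg))  # meridians converge poleward
    return math.radians(delta_deg) * EARTH_RADIUS_M * shrink


lat_std_m = lat_deg_to_m(0.0001833)
```

A degree of longitude covers less ground the farther you are from the equator, which is why the longitude conversion needs the latitude as a second argument.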
All 6 estimated camera locations form a tight cluster in 3D space. The altitude distribution shows 6 distinct values ranging from ~312m to ~334m, while the focal length histogram reveals a bimodal distribution with most estimates clustering around 1,250 and 1,450 pixels, with one outlier at ~1,934 pixels.
The relatively small spread suggests the calibration is not overly dependent on any single point — no single point is an obvious outlier causing instability.
Part (b): Noise Addition Analysis
To simulate real-world measurement uncertainty, we added Gaussian noise to our data and ran 100 calibration trials:
- 3D coordinate noise: Gaussian with 0 mean, 1-meter std (simulating GPS/altitude error)
- 2D pixel coordinate noise: Gaussian with 0 mean, 1-pixel std (simulating image measurement error)
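The noise trials described above can be sketched as follows. This assumes each correspondence is stored as an `(x, y, z, u, v)` tuple with the 3D part in meters in a local frame; that layout, the helper names, and the toy `calibrate` function are all assumptions made for illustration.

```python
import random


def add_noise(points, sigma_world=1.0, sigma_pixel=1.0, rng=random):
    """Return a noisy copy of (x, y, z, u, v) correspondences.

    x, y, z are meters in a local frame and u, v are pixels; the tuple
    layout is an assumption made for this sketch.
    """
    noisy = []
    for x, y, z, u, v in points:
        noisy.append((
            x + rng.gauss(0.0, sigma_world),
            y + rng.gauss(0.0, sigma_world),
            z + rng.gauss(0.0, sigma_world),
            u + rng.gauss(0.0, sigma_pixel),
            v + rng.gauss(0.0, sigma_pixel),
        ))
    return noisy


def monte_carlo(points, calibrate, trials=100, seed=0):
    """Calibrate `trials` times, each on a freshly perturbed copy."""
    rng = random.Random(seed)
    return [calibrate(add_noise(points, rng=rng)) for _ in range(trials)]


# Toy stand-in for the real calibration: mean z of the noisy points.
def toy_calibrate(points):
    return sum(p[2] for p in points) / len(points)


base = [(0.0, 0.0, 10.0, 100.0, 200.0), (5.0, 5.0, 20.0, 300.0, 400.0)]
results = monte_carlo(base, toy_calibrate)
```

Seeding the generator makes a run repeatable, which helps when a single bad trial needs to be inspected later.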
Results
The noise addition analysis revealed extreme sensitivity to measurement errors:
- Camera Location Variability:
  - Latitude std: 0.0123° (~1,370 meters), 67× larger than leave-one-out
  - Longitude std: 0.5564° (~42,500 meters), 1,555× larger
  - Altitude std: 13,512 meters, 1,824× larger
- Focal Length Variability:
  - Mean: 71,117 pixels, Median: 1,342 pixels
  - Std: 304,675 pixels, 1,214× larger than leave-one-out
While most camera locations still cluster around the original estimate (~43.07°N, 89.41°W), several outliers show dramatically different estimates, some with altitudes of -60,000m or +60,000m. The focal length histogram is even more revealing: ~95 of 100 trials produced reasonable focal lengths, but a handful produced absurdly large values (up to 1.5 million pixels), completely skewing the mean.
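The gap between mean and median is easy to reproduce. The numbers below are hypothetical (95 identical plausible values and 5 blow-ups), not the actual trial results, but they show how a few extreme trials drag the mean far from the typical value while the median barely notices.

```python
import statistics

# Hypothetical trial results: 95 plausible focal lengths and 5 blow-ups.
focal_lengths = [1340.0] * 95 + [1_500_000.0] * 5

mean_f = statistics.mean(focal_lengths)
median_f = statistics.median(focal_lengths)
```

This is why the median is the more trustworthy summary whenever some trials diverge.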
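This reveals a critical insight: with only 6 points, small measurement errors can occasionally lead to catastrophically wrong solutions. The optimization can converge to local minima that produce geometrically valid but physically unrealistic results. The blow-ups in focal length and altitude are consistent with the usual trade-off between focal length and camera distance: a very distant camera with a very long focal length can reproduce nearly the same pixel positions.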
Looking at the combined comparison:
- Leave-one-out (blue) shows a tight, well-behaved cluster — calibration is structurally sound when data quality is good.
- Noise addition (red) shows both the tight cluster and extreme outliers.
Part 2: Taking from the Real World
Part 2 is all about the real world. We wanted to take a static webcam image and build a model that maps any 3D GPS coordinate to a 2D pixel in that image. It sounds like sci-fi, but it's a classic computer vision problem.
This was a journey, and we learned a ton — including the major mistake that almost derailed the whole thing.
Image One: Northeastern University Webcam
For our first image, we selected the Northeastern University webcam in Boston.
Initially it was very hard to figure out exactly where the image was from. After digging into it and decoding some URLs, we found the source. This camera is part of a weather camera system, and we were able to see the exact data it records.
It also had a direct link to the camera's coordinates, elevation, and heading.
Camera details we uncovered:
- Camera Coordinates: 42°20'09.2"N 71°05'17.8"W
- Decimal: 42.33587843449229, -71.08826335575469
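The two coordinate formats above are related by a simple conversion. A minimal sketch (the function name is ours):

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to signed decimal degrees."""
    value = degrees + minutes / 60.0 + seconds / 3600.0
    return -value if hemisphere in ("S", "W") else value


lat = dms_to_decimal(42, 20, 9.2, "N")
lon = dms_to_decimal(71, 5, 17.8, "W")
```

The results agree with the decimal values listed above to about four decimal places; the remaining difference is just the rounding of the seconds field.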
The next step was identifying 5–10 immovable landmarks to use as calibration points. Using a small script, we loaded the image, tagged 10 points, and extracted their 2D pixel coordinates.
We then found the real-world latitude, longitude, and altitude of each point using Google Earth, Google Maps, and Apple Maps. Working together, we found the most accurate representations possible.
| img_x | img_y | label | map_lat | map_lng | map_alt |
|---|---|---|---|---|---|
| 306 | 855 | p1 | 42.347251 | -71.088745 | 98 |
| 725 | 1045 | p2 | 42.336671 | -71.087771 | 43 |
| 678 | 779 | p3 | 42.345504 | -71.084017 | 229 |
| 735 | 808 | p4 | 42.346976 | -71.082614 | 226 |
| 827 | 867 | p5 | 42.346431 | -71.081467 | 160 |
| 1039 | 859 | p6 | 42.3492 | -71.075302 | 235 |
| 1378 | 847 | p7 | 42.337178 | -71.085517 | 75 |
| 1435 | 937 | p8 | 42.336955 | -71.085676 | 50 |
| 1769 | 1022 | p9 | 42.336214 | -71.085598 | 23 |
| 1387 | 1065 | p10 | 42.336888 | -71.085971 | 23 |
Our first attempt at calibration was a total disaster. We fed 10 points in and the results came back completely off — all test points were projecting onto wrong pixels. After investigating, we found the problem: the calibration function expected a 1920×1080 snapshot, but we had used a 960×540 image to get the 2D points. The resolution mismatch threw off the entire calibration.
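Retagging was the fix used here. An alternative, shown only as a sketch, is to rescale the existing tags to the resolution the calibration expects. The sample points are hypothetical, and this ignores any half-pixel offset between pixel-center conventions.

```python
def rescale_points(points, tagged_size, target_size):
    """Scale (x, y) pixel coordinates from one image size to another."""
    sx = target_size[0] / tagged_size[0]
    sy = target_size[1] / tagged_size[1]
    return [(x * sx, y * sy) for x, y in points]


# Hypothetical tags made on the 960x540 preview.
half_res = [(153, 427.5), (480, 270)]
full_res = rescale_points(half_res, (960, 540), (1920, 1080))
```

Rescaling cannot recover precision lost by clicking on a smaller image, which is one reason retagging at full resolution is the safer choice.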
We fixed the issue by retagging the image at full resolution, and the projected points lined up with their landmarks.

Reference Origin
To establish a local East-North-Up (ENU) coordinate system, we set the origin near the camera:
- Latitude: 42.3359°N
- Longitude: 71.0883°W
- Altitude: 44 meters
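For points this close to the origin, the GPS-to-ENU step can be approximated with a flat-Earth formula. This is a sketch under that assumption, using a spherical Earth radius; the actual code may well use an exact conversion through Earth-centered coordinates instead.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # spherical approximation


def gps_to_enu(lat, lon, alt, lat0, lon0, alt0):
    """Approximate local East-North-Up offsets in meters from an origin.

    Uses a flat-Earth (equirectangular) approximation, reasonable only
    for points within a few kilometers of the origin.
    """
    scale = math.cos(math.radians(lat0))
    east = math.radians(lon - lon0) * EARTH_RADIUS_M * scale
    north = math.radians(lat - lat0) * EARTH_RADIUS_M
    up = alt - alt0
    return east, north, up


# Origin from the list above and point p2 from the landmark table.
east, north, up = gps_to_enu(42.336671, -71.087771, 43.0,
                             42.3359, -71.0883, 44.0)
```

Under these assumptions p2 sits roughly 43 m east and 86 m north of the origin, and 1 m below it.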
Initial Camera Calibration Results
Using OpenCV's calibrateCamera with all 10 calibration points:
Intrinsic Matrix (K):
[[1075.35 0.00 960.00]
[ 0.00 1075.35 540.00]
[ 0.00 0.00 1.00]]
Key parameters:
- Focal length: 1,075.35 pixels
- Horizontal FOV: 83.51°, Vertical FOV: 53.33°
- Reprojection error: 48.35 pixels
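The field-of-view figures follow directly from the focal length and the image size. A short check, assuming a pinhole model with the principal point at the image center and the 1920×1080 resolution mentioned earlier:

```python
import math


def fov_degrees(size_px, focal_px):
    """Full field of view along one image axis for a pinhole camera."""
    return math.degrees(2.0 * math.atan(size_px / (2.0 * focal_px)))


h_fov = fov_degrees(1920, 1075.35)
v_fov = fov_degrees(1080, 1075.35)
```

Both values agree with the reported 83.51° and 53.33° to within rounding.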

The reprojection error of 48.35 pixels is reasonable given the challenges of precisely identifying ground truth locations in an urban environment with varying building heights and perspectives.
Part 2a: Leave-One-Out Uncertainty Analysis
We performed leave-one-out cross-validation: remove one point, recalibrate with the remaining 9, record camera location and focal length. Repeat for all 10 points.
Success rate: All 10 calibrations succeeded.

Camera Location Statistics (GPS):
- Mean position: Lat = 42.33525°N, Lon = 71.08857°W, Alt = 42.36m
- Std: Lat = 0.000197° (~22m), Lon = 0.000280° (~21m), Alt = 22.21m
Focal Length Statistics:
- Mean: 1,082.30 pixels
- Std: 102.07 pixels (9.4% of mean)

Individual LOO Results
| Point Removed | Camera Altitude | Focal Length | Reprojection Error |
|---|---|---|---|
| p1 | 47.75 m | 1363.69 px | 30.00 px |
| p2 | 27.92 m | 1062.24 px | 27.72 px |
| p3 | 52.14 m | 1071.43 px | 49.34 px |
| p4 | 52.40 m | 1079.17 px | 49.97 px |
| p5 | -20.48 m | 1065.29 px | 116.50 px |
| p6 | 55.13 m | 1099.74 px | 48.61 px |
| p7 | 52.88 m | 1023.94 px | 46.84 px |
| p8 | 52.46 m | 1041.25 px | 47.18 px |
| p9 | 51.58 m | 945.72 px | 30.52 px |
| p10 | 51.87 m | 1070.49 px | 49.91 px |
Key observation: Removing point 5 produces an outlier result — negative altitude (-20.48m, physically impossible for a rooftop camera) and the highest reprojection error (116.50 pixels). This flags p5 as either critical or problematic.
The LOO analysis shows good overall stability with one exception. Nine out of ten results have altitudes between 28–56 meters (realistic for an urban rooftop). The 22-meter lat/lon std indicates acceptable position stability.
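The summary statistics and the p5 flag can be reproduced from the table. The flagging rule below (camera below 0 m, or reprojection error more than twice the median) is an illustrative assumption of ours, not the criterion used in the original analysis.

```python
import statistics

# (label, camera altitude in m, reprojection error in px) from the table.
loo = [
    ("p1", 47.75, 30.00), ("p2", 27.92, 27.72), ("p3", 52.14, 49.34),
    ("p4", 52.40, 49.97), ("p5", -20.48, 116.50), ("p6", 55.13, 48.61),
    ("p7", 52.88, 46.84), ("p8", 52.46, 47.18), ("p9", 51.58, 30.52),
    ("p10", 51.87, 49.91),
]

altitudes = [alt for _, alt, _ in loo]
mean_alt = statistics.mean(altitudes)
std_alt = statistics.pstdev(altitudes)  # population std (divides by n)

# Illustrative rule (our assumption): flag a run if the camera lands below
# 0 m or its reprojection error exceeds twice the median error.
median_err = statistics.median(err for _, _, err in loo)
flagged = [
    label for label, alt, err in loo
    if alt < 0.0 or err > 2.0 * median_err
]
```

Using the population standard deviation gives about 22.21 m, the altitude spread reported above, and only the p5 run is flagged.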
Part 2b: Monte Carlo Noise Sensitivity Analysis
We ran 100 calibration trials with Gaussian noise added to all points:
- 3D coordinate noise: σ = 1 meter (simulating GPS/altitude uncertainty)
- 2D pixel noise: σ = 1 pixel (simulating click/identification uncertainty)
Success rate: 100/100 trials succeeded with physically reasonable solutions.
Camera Location Statistics:
- Mean position: Lat = 42.33525°N, Lon = 71.08857°W, Alt = 42.36m
- Std: Lat = 0.000024° (~2.7m), Lon = 0.000023° (~1.7m), Alt = 1.12m

Focal Length: Mean = 1,075.35 px, Std = 8.60 px (0.8%)

The NEU webcam excels in Monte Carlo analysis because 10 well-distributed urban points provide strong geometric constraints. Buildings and rooftops offer precise, stable landmarks. The good altitude variation (23m to 235m) constrains camera height well. Even with noise added to all points, the overdetermined system finds the correct solution reliably.
Comparison: Part 2a vs. Part 2b
| Metric | Leave-One-Out | Monte Carlo | Ratio |
|---|---|---|---|
| Std Latitude | 0.000197° (~22m) | 0.000024° (~2.7m) | 8.2× |
| Std Longitude | 0.000280° (~21m) | 0.000023° (~1.7m) | 12.3× |
| Std Altitude | 22.21 m | 1.12 m | 19.8× |
| Focal Length std | 102.07 px | 8.60 px | 11.9× |
| Mean Altitude | 42.36 m | 42.36 m | Match |
| Outliers | 1 (p5: -20.48m) | 0 | — |

Leave-one-out shows 8–20× more variation than Monte Carlo. This makes sense: removing a point changes the geometric configuration entirely, while adding noise to all points still preserves the overall structure. Both analyses agree on the camera location (~42.36m altitude), giving us confidence in the result.
Conclusions for Image One
We successfully calibrated the NEU webcam with excellent real-world robustness:
- Focal length: 1,075.35 pixels, σ = 8.6 px (only 0.8%)
- Camera location: 42.33525°N, 71.08857°W, 42.36m altitude
- Position uncertainty: ~2 meters horizontal, ~1 meter vertical
- Success rate: 100/100 under 1m/1px noise
Why does removing point 5 cause failure? When p5 is removed, the camera gets placed at -20.48m with 116.50 px reprojection error. Point 5 sits at 160m altitude, in the middle of our range (23m to 235m), so it likely helps constrain the camera height. Without it, the optimizer appears to settle on a wrong but locally optimal solution. Monte Carlo succeeds 100/100 even with noise added to p5, which suggests the point itself isn't mislabeled; rather, the calibration depends heavily on it being present.
Image Two: Boston Harbor Islands Webcam
For the second image, we chose something completely different. We moved from an urban environment to a remote island on the outskirts of Boston, using the National Park Service's webcam overlooking the harbor.
We located the public record on the NPS website and found a Google Street View of the island itself. Since the camera is part of a lighthouse, we figured out the lighthouse height and made adjustments based on the camera's relative position.

One complication: Google Earth's 3D satellite imagery for the area hasn't been updated since 2022. We could see two distinct islands from the camera's view, but the satellite shows them connected. This meant guessing some points in an already structurally difficult environment.
Point Selection in a Marine Environment
Calibrating a camera overlooking open water presents unique challenges compared to urban environments. We carefully selected 9 distinct landmarks — coastal features, islands, and man-made structures.
Unlike urban environments with well-defined building corners, identifying precise points on natural coastal features and distant islands proved difficult. Tidal variations, wave action, and atmospheric haze all contribute to measurement uncertainty.
| img_x | img_y | label | map_lat | map_lng | map_alt |
|---|---|---|---|---|---|
| 891 | 1980 | p1 | 42.327871 | -70.890574 | 5 |
| 344 | 1356 | p2 | 42.32737 | -70.891641 | 2 |
| 1724 | 1263 | p3 | 42.328117 | -70.891248 | 4 |
| 1315 | 1239 | p4 | 42.327967 | -70.892254 | 0.21 |
| 2744 | 998 | p5 | 42.332095 | -70.895926 | 3 |
| 2086 | 912 | p6 | 42.346274 | -70.954216 | 44 |
| 724 | 941 | p7 | 42.327193 | -70.924872 | 3 |
| 2717 | 1925 | p8 | 42.32821 | -70.890555 | 6 |
| 3646 | 971 | p9 | 42.334612 | -70.894444 | 22 |
Reference origin (ENU): 42.3279°N, 70.8901°W, 23m altitude
Initial Calibration Results
Using all 9 calibration points:
- Focal length: 1,824.61 pixels
- Horizontal FOV: 92.92°, Vertical FOV: 61.24°
- Reprojection error: 204.01 pixels
The 204-pixel reprojection error is significantly higher than typical calibrations — a direct consequence of the difficulty in precisely identifying natural coastal landmarks.

Part 2a: Leave-One-Out Analysis
Success rate: All 9 LOO calibrations succeeded.

Camera Location Statistics:
- Mean position: Lat = 42.32790°N, Lon = 70.89011°W, Alt = 14.04m
- Std: Lat = 0.000650° (~72m), Lon = 0.001802° (~137m), Alt = 46.32m
Focal Length: Mean = 1,607.48 px, Std = 332.31 px
Point 2 stands out as critical: removing it drives the altitude estimate to -116.54m (well below sea level). This is consistent with the challenges we noted in identifying precise coastal landmarks.
Part 2b: Monte Carlo Analysis
Despite the high reprojection error, Monte Carlo performs well:
Success rate: 100/100 trials with physically reasonable solutions
Camera Location Statistics:
- Mean position: Lat = 42.32790°N, Lon = 70.89011°W, Alt = 28.25m
- Std: Lat = 0.000011° (~1.2m), Lon = 0.000024° (~1.7m), Alt = 1.23m

Focal Length: Mean = 1,826.07 px, Std = 27.72 px
When all 9 points are used together, even with noise, the consensus overwhelms individual measurement errors. Small noise applied to every point doesn't drastically change the geometric configuration.

Comparison: Part 2a vs. Part 2b
| Metric | Leave-One-Out | Monte Carlo | Ratio |
|---|---|---|---|
| Std Latitude | 0.000650° (~72m) | 0.000011° (~1.2m) | 59.1× |
| Std Longitude | 0.001802° (~137m) | 0.000024° (~1.7m) | 75.1× |
| Std Altitude | 46.32 m | 1.23 m | 37.7× |
| Focal Length std | 332.31 px | 27.72 px | 12.0× |
| Mean Altitude | 14.04 m (outlier affected) | 28.25 m (realistic) | n/a |


The extreme LOO/Monte Carlo disparity (about 38–75× more variation in position, 12× in focal length) reveals high sensitivity to geometric configuration changes. The system is fragile to geometric changes (removing a point changes the configuration drastically) but robust to measurement noise (small perturbations applied to every point don't break the consensus).
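The ratios in the comparison table are simply the leave-one-out spread divided by the Monte Carlo spread. A quick worked check using the values from the table:

```python
# (leave-one-out std, Monte Carlo std) pairs from the Harbor comparison table.
stds = {
    "latitude_deg": (0.000650, 0.000011),
    "longitude_deg": (0.001802, 0.000024),
    "altitude_m": (46.32, 1.23),
    "focal_px": (332.31, 27.72),
}

ratios = {name: round(loo / mc, 1) for name, (loo, mc) in stds.items()}
```

Because each ratio divides two quantities in the same unit, the results are unitless and can be compared across latitude, altitude, and focal length.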
Final Comparison: Urban vs. Marine
| Metric | NEU Webcam (Urban) | Harbor Webcam (Marine) | Winner |
|---|---|---|---|
| LOO Lat Std | 0.000197° | 0.000650° | NEU (3.3× better) |
| LOO Alt Std | 22.21 m | 46.32 m | NEU (2.1× better) |
| MC Lat Std | 0.000024° | 0.000011° | Harbor (2.2× better) |
| MC Alt Std | 1.12 m | 1.23 m | NEU (slightly better) |
| MC Focal Std | 8.60 px | 27.72 px | NEU (3.2× better) |
| LOO/MC Ratio | 8–20× | 38–75× | NEU (much better) |
Urban environments win for geometric stability. Well-defined building corners provide more reliable landmark identification than natural coastal features. The NEU webcam shows better overall stability across 5 of 6 metrics.
Both achieve ~1 meter altitude precision under realistic noise. Despite the NEU webcam's advantages in LOO stability, both calibrations are robust to measurement uncertainty when all points are present: Monte Carlo alt std is 1.12m (NEU) vs. 1.23m (Harbor).
The LOO/MC ratio tells the real story. NEU's 8–20× ratio vs. Harbor's 38–75× indicates that marine calibrations are much more fragile to changes in geometric configuration. Coastal features are harder to pin down, and the calibration is correspondingly more dependent on any single point being present and accurate.
The key lesson: Camera calibration from webcam feeds is surprisingly achievable with enough carefully chosen landmarks, even without direct access to the camera's manufacturer specs. But the choice of environment — and the quality of point identification — dramatically affects reliability. When you can't control your landmarks, Monte Carlo analysis tells you how much to trust your estimates.
