Performance Guide
This guide helps you choose the right algorithms and approaches for your use case.
Algorithm Characteristics
Distance Calculations
Euclidean Distance
- Treats coordinates as flat X,Y points
- Use for small geographic areas or projected coordinates
- Doesn’t account for Earth’s curvature
Haversine Distance
- Assumes Earth is a sphere
- Good for most geographic distance calculations
- Can be off by a few tenths of a percent because Earth is not a perfect sphere
Vincenty Distance
- Uses ellipsoidal Earth model (WGS84)
- More accurate for geographic distances
- Iterative, so more computational work than Haversine
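To make the difference between the first two models concrete, here is a minimal pure-Python sketch. The function names and the mean Earth radius constant are assumptions made for this example, not the API described in this guide. Vincenty is left out because its iterative ellipsoidal solution is considerably longer.

```python
import math

# Assumed mean Earth radius in meters for the spherical model.
EARTH_RADIUS_M = 6_371_008.8


def euclidean(x1, y1, x2, y2):
    """Straight-line distance between two planar points, in the input units."""
    return math.hypot(x2 - x1, y2 - y1)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# One degree of longitude along the equator, and a 3-4-5 planar triangle.
print(haversine_m(0.0, 0.0, 0.0, 1.0))
print(euclidean(0.0, 0.0, 3.0, 4.0))
```

Euclidean distance only gives meaningful results when both axes share the same linear unit, which is why it suits projected coordinates rather than raw latitude and longitude.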
Optimization Strategies
Algorithm Selection
Choose algorithms based on your accuracy requirements:
Use Euclidean for projected coordinates or small areas
Use Haversine for general geographic distance calculations
Use Vincenty when millimeter accuracy is required
Batch Processing
Process multiple items together when possible:
# More efficient: a single batch call
distances = batch.pairwise_haversine(points)
# Less efficient: one Python-level call per pair of consecutive points
distances = [haversine(p1, p2) for p1, p2 in zip(points[:-1], points[1:])]
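Much of the gain from batching comes from moving the per-pair loop out of Python. As a rough illustration of that idea (not the implementation behind batch.pairwise_haversine), the sketch below computes Haversine distances between consecutive points in one vectorized NumPy pass. The function name and the radius constant are assumptions made for this example.

```python
import numpy as np

# Assumed mean Earth radius in meters for the spherical model.
EARTH_RADIUS_M = 6_371_008.8


def consecutive_haversine_m(lats_deg, lons_deg):
    """Distances in meters between each point and the next one.

    Returns an array with one fewer element than the inputs.
    """
    lat = np.radians(np.asarray(lats_deg, dtype=np.float64))
    lon = np.radians(np.asarray(lons_deg, dtype=np.float64))
    dlat = np.diff(lat)  # lat[i + 1] - lat[i] for every consecutive pair
    dlon = np.diff(lon)
    a = np.sin(dlat / 2) ** 2 + np.cos(lat[:-1]) * np.cos(lat[1:]) * np.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_M * np.arcsin(np.sqrt(a))


# Three points spaced one degree apart along the equator.
segment_lengths = consecutive_haversine_m([0.0, 0.0, 0.0], [0.0, 1.0, 2.0])
print(segment_lengths)
```

Every arithmetic step here runs over whole arrays, so the interpreter overhead is paid once per call instead of once per pair.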
Memory Management
For large datasets:
Process data in chunks rather than loading everything into memory
Use streaming operations when available
Consider using NumPy arrays, which store coordinates contiguously and avoid the per-object overhead of Python lists
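The chunking advice can be sketched with the standard library alone. The example below sums the length of a path while holding only one chunk in memory at a time. The helper names are hypothetical, and planar distance (math.dist) stands in for whichever distance function you actually use.

```python
import math
from itertools import islice


def iter_chunks(points, size):
    """Yield lists of at most `size` items from any iterable."""
    it = iter(points)
    while chunk := list(islice(it, size)):
        yield chunk


def path_length(points, chunk_size=10_000):
    """Sum consecutive planar distances, one chunk at a time."""
    total = 0.0
    prev = None  # last point of the previous chunk
    for chunk in iter_chunks(points, chunk_size):
        if prev is not None:
            # Prepend the carried-over point so the segment that
            # crosses the chunk boundary is not lost.
            chunk = [prev] + chunk
        total += sum(math.dist(a, b) for a, b in zip(chunk, chunk[1:]))
        prev = chunk[-1]
    return total


# Works with generators, so the full dataset never has to be a list.
square = ((x, y) for x, y in [(0, 0), (3, 4), (3, 0), (0, 0)])
print(path_length(square, chunk_size=2))
```

Carrying the last point of each chunk into the next one is the detail that is easiest to miss: without it, one segment is dropped at every chunk boundary.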
Best Practices
Choose the right algorithm for your accuracy needs
Use batch operations when processing multiple items
Process large datasets in chunks to manage memory
Test performance with your actual data rather than making assumptions
For polylines, use precision 5 (about 1 m resolution) unless you need sub-meter accuracy
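The polyline recommendation follows from simple arithmetic: a precision of p stores coordinates in steps of 10^-p degrees. The sketch below converts that step to an approximate north-south ground distance. The function name and the mean Earth radius constant are assumptions made for this example.

```python
import math

# Assumed mean Earth radius in meters for the spherical model.
EARTH_RADIUS_M = 6_371_008.8


def polyline_step_m(precision):
    """Approximate ground distance in meters of one step of 10**-precision degrees of latitude."""
    step_deg = 10 ** -precision
    return math.radians(step_deg) * EARTH_RADIUS_M


for p in (5, 6):
    print(f"precision {p}: ~{polyline_step_m(p):.2f} m per step")
```

Under these assumptions, precision 5 works out to roughly 1.1 m per step and precision 6 to roughly 0.11 m. East-west steps shrink further with latitude.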