Performance Guide

This guide helps you choose the right algorithms and approaches for your use case.

Algorithm Characteristics

Distance Calculations

Euclidean Distance - Treats coordinates as flat X,Y points - Use for small geographic areas or projected coordinates - Doesn’t account for Earth’s curvature

Haversine Distance - Assumes Earth is a sphere - Good for most geographic distance calculations - Less accurate at very long distances due to spherical approximation

Vincenty Distance - Uses ellipsoidal Earth model (WGS84) - More accurate for geographic distances - More computational work than Haversine

Optimization Strategies

Algorithm Selection

Choose algorithms based on your accuracy requirements:

  • Use Euclidean for projected coordinates or small areas

  • Use Haversine for general geographic distance calculations

  • Use Vincenty when millimeter accuracy is required

Batch Processing

Process multiple items together when possible:

# More efficient
distances = batch.pairwise_haversine(points)

# Less efficient
distances = [haversine(p1, p2) for p1, p2 in zip(points[:-1], points[1:])]

Memory Management

For large datasets:

  • Process data in chunks rather than loading everything into memory

  • Use streaming operations when available

  • Consider using NumPy arrays for better memory layout

Best Practices

  • Choose the right algorithm for your accuracy needs

  • Use batch operations when processing multiple items

  • Process large datasets in chunks to manage memory

  • Test performance with your actual data rather than making assumptions

  • For polylines, use precision 5 unless you need sub-meter accuracy