Karmona Pragmatic Blog

Don't get overconfident… Tiny minds also think alike

Karmona Pragmatic Blog

Karmona Labs on Geo Distance

October 9th, 2010 by Moti Karmona | מוטי קרמונה · 6 Comments

Well.. Everyone are talking about Location-Location-Location so this weekend was all about Geographical distance

I will quickly preview few basics (geographic coordinate system, earth radius), introduce and compare four distance calculation models (Pythagorean, Law of Cosines, Haversine, Vincenty), finalize with a pragmatic recommendation (use Law of Cosines! :) and random quote for desert.

Start with the basics…

A Geographic Coordinate System Indicate location using lines of longitude and latitude

Latitude is the angle between the equatorial plane and a line that is normal to the reference ellipsoid, which approximates the shape of Earth to account for flattening of the poles and bulging of the equator.

Longitude is the angle east or west of a reference meridian between the two geographical poles to another meridian that passes through an arbitrary point. All meridians are halves of great circles, and are not parallel. They converge at the north and south poles.

Exempli Gratia
The Status of Liberty is located on 40° 41′ 21″ N , 74° 2′ 40″ W with the following Decimal representative: 40.689167, -74.044444

Since…
40.689167 = Degrees + Minutes/60 + Seconds/3600 = 40 + 41/60 + 21/3600 = 40.689167
-74.044444 = -1*(74 + 2/60 + 40/3600) = -74.044444 // the minus is used to represent South & West

What is Earth Radius? (hint: we will need later for the calculation)

To cut a *very* long story short, Google, IUGG and Karmona labs thinks it is 6378.137 (3963.19 miles)

Because the Earth is not perfectly spherical, no single value serves as its natural radius.

Distances from points on the surface to the center range and regardless of calculation model, the radius falls between 6,357 km and 6,378 km

Earth Radius (km) based on different models

EquatorialPolarVolumetricAuthalicMeridional
6,378.16,356.86,371.06,371.06,367.4


Geographic Distance Calculation Models

There are quite few geo distance calculation models but I will focus on the four I found most relevant:
* Pythagorean
* Law of Cosines
* Haversine
* Vincenty

I have done a little excel experiment (downloadable here):

  • I  have compared the distance between “The Empire State Building” and 15 other locations
  • I have used three geo distance calculation models (beside Vincenty)
  • Modeling this into Excel – I had two locations A (latA, longA) and B (latB, LongB) with R as Earth Radius (coordinates are used in their decimal representative)
    • Pythagorean =SQRT((111.2*(latA-latB))^2+(85.2*(LongA-LongB))^2)
    • Law of Cosines =ACOS(SIN(RADIANS(latA))*SIN(RADIANS(latB)) + COS(RADIANS(latA))*COS(RADIANS(latB))*COS(RADIANS(longA-longB)))*R
    • Haversine =2*ASIN(MIN(1, SQRT(SIN(RADIANS(latA-latB)/2)^2 + COS(RADIANS(latA))*COS(RADIANS(latB))*SIN(RADIANS(longA-longB)/2)^2)))*R
  • Additional Notes:
    • The reason I have used 111.2 and 85.2 in the Pythagorean equation is the fact that 1° latitude ≈ 111 km and 1° longitude can vary but the average is ≈ 82.2 km (the right thing to do actually is to choose the exact longitude/km conversion base on the degree)
    • The conversion from the original Geo Location representative to a decimal one was using this excel formula =IF(Degree<0,Degree-Minutes/60-Seconds/3600,Degree+Minutes/60+Seconds/3600)
  • The Results
    • Pythagorean is easy to compute but not that accurate
    • Law of Cosines and Haversine are almost the same
    • See comparison table below…
ModelPythagoreanLaw of CosinesHaversineVincenty
Formulad = sqrt((X2 – X1)^2 + (Y2 – Y1)^2)a = sin(lat1) * sin(lat2)
b = cos(lat1) * cos(lat2) * cos(lon2 – lon1)
c = arccos(a + b)
d = R * c
dlon = lon2 – lon1
dlat = lat2 – lat1
a = sin^2(dlat/2) + cos(lat1) * cos(lat2) * sin^2(dlon/2)
c = 2 * arcsin(min(1,sqrt(a)))
d = R * c
Too long… ;)
AssumptionsFlat Earth… :)Spherical EarthSpherical EarthEllipsoidal Earth
Accuracy (Worst | 1-5 | Best)(1) Estimated distance (good enough for less than 20km)(4) Good!(4) Good! + The Haversine formula is more robust to floating point errors(5) Great! The most accurate…
Computability (Slowest | 1-5 | Fastest)(5) Fastest!(4) 5-6 trig. calls(3) 5 trig. calls + SQRT (+ Floating Point)(1) Requires iteration (+ “The rest”)

Final note:

  • Simple pragmatic recommendation – Use Law of Cosines to calculate geographic distance – It will be suffice in 90% of your usages !
  • Complex pragmatic recommendation – It really depends – Call me…


*** *** *** *** *** *** *** *** *** *** *** *** *** ***

Random Quote“There’s no sense in being precise when you don’t even know what you’re talking about” | John von Neumann

Tags: Geo

6 responses so far ↓

  • 1  Barak // Oct 9, 2010 at 6:15 pm

    The quote seems fitting and suspiciously not random ;)

  • 2  Moti Karmona | מוטי קרמונה // Oct 9, 2010 at 6:58 pm

    There can always be a random fit… ;)

  • 3  Ken // Oct 9, 2010 at 7:32 pm

    I’d expand on your final note – if you are talking about short distances and large dataset, such as with pin-pointing geo-data to an address or a business, assuming you are always 100 meters from the nearest building, the Pythagorean way could prove useful. it is way easier to index on it, and various DB systems has builtin support for that OOTB.
    I did once implemented the Haversine method using User-Defined-Function within SQL Server. The dataset was towns and cities in the UK. There are 1000~ so the search was very quick and not too CPU intensive.
    For the actual location-services that pop up these days all the times, I’d go with Pythagorean. If it works for Foursquare …

  • 4  Moti Karmona | מוטי קרמונה // Oct 9, 2010 at 7:36 pm

    @Ken, where did you read that Foursquare is using Pythagorean?

  • 5  Ken // Oct 9, 2010 at 8:26 pm

    They told me that when I was interviewing there.

    nah, just kidding.

    https://docs.google.com/present/view?id=dhkkqm6q_13gm6jq5fv

    this was almost half a year ago. By now they have it all on MongoDB, and they have even got to the point that they had to outgrow the initial two machines setup and had a few problems with that (mainly due to having 1:1 relation of shards-to-machines )

  • 6  Moti Karmona | מוטי קרמונה // Oct 9, 2010 at 8:48 pm

    Very interesting presentation!

Leave a Comment

Allowed tags <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>

coemption-noncreativity