Karmona's Pragmatic Blog

Don't get overconfident… Tiny minds also think alike

Karmona's Pragmatic Blog

The GeoSpatial Cloud

November 1st, 2010 by Moti Karmona | מוטי קרמונה · 3 Comments

In continue to my geo distance post, I have decided to post something on the simplest cloud architecture (RDS) for location based services (LBS)

IMHO, (to cut a long story short :) Amazon RDS is more than enough for most geo location applications.

Amazon Relational Database Service (Amazon RDS) is a web service that makes it easy to set up, operate, and scale a relational database in the cloud.

Amazon RDS gives you access to the full capabilities of a familiar MySQL database. This means the code, applications, and tools you already use today with your existing MySQL databases work seamlessly with Amazon RDS.

Two preliminary steps:
* Start a small DB instance with the kind help of AWS management console (image above)
* Use the RDS instance as if it is your “disruptive”  MySQL instance e.g. manage it using MySQL Workbench 5.2.29

“Flirting” with MySQL spatial capabilities (which seems to be “fully” supported by AWS RDS)

* Create the (MyISAM) Table with spatial index

CREATE TABLE `locations` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`lat` decimal(10,6) DEFAULT NULL,
`long` decimal(10,6) DEFAULT NULL,
`loc` point NOT NULL,
`name` varchar(45) DEFAULT NULL,
PRIMARY KEY (`id`),
SPATIAL KEY `loc` (`loc`)
) ENGINE=MyISAM AUTO_INCREMENT=6 DEFAULT CHARSET=utf8

* Insert few values to populate your table

INSERT INTO locations VALUES(1,40.748433, -73.985655, GeomFromText(‘POINT(40.748433 -73.985655)’), ‘The Empire State Building‘);
INSERT INTO locations VALUES(2, 40.689166, -74.044444, GeomFromText(‘POINT(40.689166 -74.044444)’), ‘The Statue of Liberty’);
INSERT INTO locations VALUES(3, 40.758611, -73.979166, GeomFromText(‘POINT(40.758611 -73.979166)’), ‘Rockefeller Center’);
INSERT INTO locations VALUES(4, 40.757266, -73.985838, GeomFromText(‘POINT(40.757266 -73.985838)’), ‘Times Square’);
INSERT INTO locations VALUES(5, 40.7527, -73.9818, GeomFromText(‘POINT(40.7527 -73.9818)’), ‘New York Public Library’);

* Execute a simple test drive query which returns all the locations and their distance from ‘The Empire State Building’ + Comparing two distance calculation methods (1) MySQL euclidean calculation (2) Haversine calculation (results below)

SELECT locations.name,
astext(locations.loc),
GLength(LineStringFromWKB(LineString(locations.loc, GeomFromText(‘POINT(40.748433 -73.985655)’))))*100
AS euclidean,
6378.1 * 2 * ASIN(SQRT(
POWER(SIN(RADIANS(40.748433 – locations.lat) / 2),2) +
COS(RADIANS(40.748433)) * COS(RADIANS(locations.lat) )
* POWER(SIN((RADIANS(-73.985655 – locations.long)) /2), 2) )) AS haversine
FROM locations
— HAVING euclidean < 1
— ORDER BY euclidean ASC LIMIT 10;


Two Surprises/Issues

(1) SRID (Spatial Reference Identifier) support in MySQL is a disgrace – In MySQL, the SRID value is just an integer associated with the geometry value. All calculations are done assuming Euclidean (planar) geometry.

Possible Workarounds: (1) Euclidean calculation can be enough (2) Use Haversine function if you need the accuracy

(2) MySQL ‘Where’ clause can’t use column aliases for filtering

Possible Workarounds: (1) Use ‘Having’ clause (see above) (2) Use the explicit function or field and not the alias

*** *** *** *** *** *** *** *** *** *** *** ***

Related/Interesting reference – Geo Distance Search with MySQL (Presentation | 2008)

Important note: To help new AWS customers get started in the cloud, AWS is introducing a new free usage tier. Beginning November 1, new AWScustomers will be able to run a free Amazon EC2 Micro Instance for a year, while also leveraging a new free usage tier for Amazon S3, Amazon Elastic Block Store, Amazon Elastic Load Balancing, and AWSdata transfer – Very Exciting Times!!!

Disturbing unrelated fact: Starting in 1931, every graduate of the Japanese Naval Academy was asked: “How would you carry out a surprise attack on Pearl Harbor?”

→ 3 CommentsTags: Cloud · Development · Disruptive Technology · Geo

Karmona Labs on Geo Distance

October 9th, 2010 by Moti Karmona | מוטי קרמונה · 6 Comments

Well.. Everyone are talking about Location-Location-Location so this weekend was all about Geographical distance

I will quickly preview few basics (geographic coordinate system, earth radius), introduce and compare four distance calculation models (Pythagorean, Law of Cosines, Haversine, Vincenty), finalize with a pragmatic recommendation (use Law of Cosines! :) and random quote for desert.

Start with the basics…

A Geographic Coordinate System Indicate location using lines of longitude and latitude

Latitude is the angle between the equatorial plane and a line that is normal to the reference ellipsoid, which approximates the shape of Earth to account for flattening of the poles and bulging of the equator.

Longitude is the angle east or west of a reference meridian between the two geographical poles to another meridian that passes through an arbitrary point. All meridians are halves of great circles, and are not parallel. They converge at the north and south poles.

Exempli Gratia
The Status of Liberty is located on 40° 41′ 21″ N , 74° 2′ 40″ W with the following Decimal representative: 40.689167, -74.044444

Since…
40.689167 = Degrees + Minutes/60 + Seconds/3600 = 40 + 41/60 + 21/3600 = 40.689167
-74.044444 = -1*(74 + 2/60 + 40/3600) = -74.044444 // the minus is used to represent South & West

What is Earth Radius? (hint: we will need later for the calculation)

To cut a *very* long story short, Google, IUGG and Karmona labs thinks it is 6378.137 (3963.19 miles)

Because the Earth is not perfectly spherical, no single value serves as its natural radius.

Distances from points on the surface to the center range and regardless of calculation model, the radius falls between 6,357 km and 6,378 km

Earth Radius (km) based on different models

EquatorialPolarVolumetricAuthalicMeridional
6,378.16,356.86,371.06,371.06,367.4


Geographic Distance Calculation Models

There are quite few geo distance calculation models but I will focus on the four I found most relevant:
* Pythagorean
* Law of Cosines
* Haversine
* Vincenty

I have done a little excel experiment (downloadable here):

  • I  have compared the distance between “The Empire State Building” and 15 other locations
  • I have used three geo distance calculation models (beside Vincenty)
  • Modeling this into Excel – I had two locations A (latA, longA) and B (latB, LongB) with R as Earth Radius (coordinates are used in their decimal representative)
    • Pythagorean =SQRT((111.2*(latA-latB))^2+(85.2*(LongA-LongB))^2)
    • Law of Cosines =ACOS(SIN(RADIANS(latA))*SIN(RADIANS(latB)) + COS(RADIANS(latA))*COS(RADIANS(latB))*COS(RADIANS(longA-longB)))*R
    • Haversine =2*ASIN(MIN(1, SQRT(SIN(RADIANS(latA-latB)/2)^2 + COS(RADIANS(latA))*COS(RADIANS(latB))*SIN(RADIANS(longA-longB)/2)^2)))*R
  • Additional Notes:
    • The reason I have used 111.2 and 85.2 in the Pythagorean equation is the fact that 1° latitude ≈ 111 km and 1° longitude can vary but the average is ≈ 82.2 km (the right thing to do actually is to choose the exact longitude/km conversion base on the degree)
    • The conversion from the original Geo Location representative to a decimal one was using this excel formula =IF(Degree<0,Degree-Minutes/60-Seconds/3600,Degree+Minutes/60+Seconds/3600)
  • The Results
    • Pythagorean is easy to compute but not that accurate
    • Law of Cosines and Haversine are almost the same
    • See comparison table below…
ModelPythagoreanLaw of CosinesHaversineVincenty
Formulad = sqrt((X2 – X1)^2 + (Y2 – Y1)^2)a = sin(lat1) * sin(lat2)
b = cos(lat1) * cos(lat2) * cos(lon2 – lon1)
c = arccos(a + b)
d = R * c
dlon = lon2 – lon1
dlat = lat2 – lat1
a = sin^2(dlat/2) + cos(lat1) * cos(lat2) * sin^2(dlon/2)
c = 2 * arcsin(min(1,sqrt(a)))
d = R * c
Too long… ;)
AssumptionsFlat Earth… :)Spherical EarthSpherical EarthEllipsoidal Earth
Accuracy (Worst | 1-5 | Best)(1) Estimated distance (good enough for less than 20km)(4) Good!(4) Good! + The Haversine formula is more robust to floating point errors(5) Great! The most accurate…
Computability (Slowest | 1-5 | Fastest)(5) Fastest!(4) 5-6 trig. calls(3) 5 trig. calls + SQRT (+ Floating Point)(1) Requires iteration (+ “The rest”)

Final note:

  • Simple pragmatic recommendation – Use Law of Cosines to calculate geographic distance – It will be suffice in 90% of your usages !
  • Complex pragmatic recommendation – It really depends – Call me…


*** *** *** *** *** *** *** *** *** *** *** *** *** ***

Random Quote“There’s no sense in being precise when you don’t even know what you’re talking about” | John von Neumann

→ 6 CommentsTags: Geo