Karmona's Pragmatic Blog

Don't get overconfident… Tiny minds also think alike

The Size of the Internet

September 26th, 2007 by Moti Karmona | מוטי קרמונה · 3 Comments

“The size of the Web is 800 million pages, and the biggest search engine only covers about 16% of it.”
(Lawrence and Giles, 8 July 1999)

For various reasons (e.g. script-generated, dynamic, unlinked, limited-access or non-HTML content) the indexed web is only a portion of the World Wide Web, and later studies went further, claiming that the deep (un-indexed/unknown) web is ~500 times larger than the indexed web!

So… What is the size of the internet?

Well, how many pages are there?

According to the boutell.com guesstimate, there are ~29.7 billion indexed pages on the World Wide Web (updated Feb. 2007).
http://worldwidewebsize.com suggests ~22.34 billion indexed pages (Sep. 2007),
and a simple Google search for “or | -or +*” gave me an estimate of about 17,340,000,000 documents…

=> We are left with a nice round guesstimate of about 20 billion indexed pages.

Now, multiplying that by an average page size of 70 KB (based on my experience, utexas.edu and optimizationweek.com):

= ~1,300 terabytes = ~1.3 petabytes of known/indexed web, which might hide ~600 petabytes of deeper web
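The arithmetic above can be reproduced with a short sketch. The three page-count estimates, the 70 KB page size and the 500x deep-web factor come from the post; the use of binary units (1 TB = 1024^3 KB) is an assumption made here because it lands on ~1,300 TB rather than 1,400 TB, and all variable names are illustrative.

```python
# Back-of-the-envelope check of the figures quoted in the post.
# Assumption (not stated in the post): binary units, 1 TB = 1024**3 KB.

ESTIMATES_PAGES = {
    "boutell.com (Feb. 2007)": 29.7e9,
    "worldwidewebsize.com (Sep. 2007)": 22.34e9,
    "Google query (Sep. 2007)": 17.34e9,
}

ROUNDED_PAGES = 20e9     # the post's "nice round" figure
AVG_PAGE_KB = 70         # average page size used in the post
DEEP_WEB_FACTOR = 500    # deep web vs. indexed web, as quoted in the post

KB_PER_TB = 1024 ** 3
KB_PER_PB = 1024 ** 4

# Plain arithmetic mean of the three estimates, for comparison with 20 billion.
mean_pages = sum(ESTIMATES_PAGES.values()) / len(ESTIMATES_PAGES)

indexed_kb = ROUNDED_PAGES * AVG_PAGE_KB
indexed_tb = indexed_kb / KB_PER_TB
indexed_pb = indexed_kb / KB_PER_PB
deep_pb = indexed_pb * DEEP_WEB_FACTOR

print(f"mean of estimates: {mean_pages / 1e9:.2f} billion pages")
print(f"indexed web: {indexed_tb:,.0f} TB = {indexed_pb:.2f} PB")
print(f"deep web (x{DEEP_WEB_FACTOR}): {deep_pb:,.0f} PB")
```

Under these assumptions the plain mean of the three estimates is about 23.1 billion pages, so 20 billion is a rounded-down figure. The indexed web comes to about 1,304 TB (1.27 PB) and the deep web to roughly 637 PB, in the same ballpark as the ~600 quoted above.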

→ 3 Comments · Tags: Internet

The Blind Ostrich

September 21st, 2007 by Moti Karmona | מוטי קרמונה · 6 Comments

The Ostrich - IP Discovery Tool

Update (6 Jan. 2008): This post is about anonymous surfing, not about ostriches, but due to the emerging Google search traffic looking for ostrich images, I will end this post with a few interesting ostrich facts and images… ;)


How can you use the kind help of the Google Translate service for anonymous surfing?

The Ostrich – IP Discovery Tool @ http://eview.co.il // Will reveal your “true” IP


Google Translate service @ http://www.google.com/language_tools


The Blind Ostrich @ http://translate.google.com/translate?u=http%3A%2F%2Feview.co.il&sl=iw&tl=en&hl=en&ie=UTF-8

1+1=3 (!!!) // Notice the different IP ;)
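The "Blind Ostrich" link above is just the translate endpoint with the target page passed as a URL-encoded query parameter. A minimal sketch of building such a link follows. The function name `translate_proxy_url` is hypothetical, the query parameters are copied from the URL shown in this post, and whether the endpoint still behaves this way is not verified here. The code only builds the string and makes no network request.

```python
from urllib.parse import urlencode

# Endpoint and parameter names are taken from the link quoted in the post.
TRANSLATE_ENDPOINT = "http://translate.google.com/translate"


def translate_proxy_url(target_url, source_lang="iw", target_lang="en"):
    """Build a translate link that fetches target_url on the reader's behalf.

    Hypothetical helper for illustration only.
    """
    params = {
        "u": target_url,    # page to fetch, percent-encoded by urlencode
        "sl": source_lang,  # source language ("iw" = Hebrew, as in the post)
        "tl": target_lang,  # target language
        "hl": "en",         # interface language
        "ie": "UTF-8",      # input encoding
    }
    return f"{TRANSLATE_ENDPOINT}?{urlencode(params)}"


print(translate_proxy_url("http://eview.co.il"))
```

The IP discovery page then sees the address of the server that fetched it rather than the reader's own, which is why the two IPs differ.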


Ten most interesting facts about ostriches:

  1. Ostriches are true dinosaurs – ostrich skeletons and fossils have been found which date back over 120 million years.
  2. The ostrich has the largest eye of any land animal. Its eye measures almost five centimeters across.
  3. The flightless ostrich is the world’s largest bird.
  4. Though they cannot fly, ostriches are fleet, strong runners. They can sprint at up to 70 kilometers an hour and run over distance at 50 kilometers an hour.
  5. An ostrich kick can kill a human.
  6. Contrary to popular belief, ostriches do not bury their heads in the sand. The old saw probably originates with one of the bird’s defensive behaviors. At the approach of trouble, ostriches will lie low and press their long necks to the ground in an attempt to become less visible. Their plumage blends well with sandy soil and, from a distance, gives the appearance that they have buried their heads in the sand.
  7. An ostrich will live to be 50–75 years old.
  8. An ostrich egg weighs 1.6 kg and is equivalent to 2 dozen chicken eggs.
  9. An ostrich hen can lay 40–100 eggs per year, averaging about 60 eggs per year.
  10. An ostrich chick grows one foot taller each month until it is 7–8 months old.

[Images: Ostrich Escape · Ostrich Head in Sand Sign · Ostrich Meat Cuts · Ostrich Head in Sand]

→ 6 Comments · Tags: Internet

Chronicle of a Death Foretold

September 20th, 2007 by Moti Karmona | מוטי קרמונה · No Comments

Chronicle of a Death Foretold = Waterfall Shmoterfall = Checkmate in 10 moves
* Note: I have seen, participated in and led some successful waterfall projects (mainly thanks to some adoption of agile methodologies ;-) and this is my view of the projects which failed…

  1. Release scoping starts with marketing's high-level-copy-paste-from-last-year-marketing-presentations MRD, ~1 month late.
  2. 1 month of a quick lets-write-all-the-features-we-can PRD – this is also the last time you hear from the product manager until the next milestone-demo-crisis.
  3. A couple of weeks of high-level design, which sum up to a Very Rough Time Guesstimate a.k.a. VeRTiGo
  4. The release time-frame is set ~1 year ahead with the needed VeRTiGo “squeezing”, and the high-level time-frame is determined:
    • 2 months of the waste above and the last release's leftovers
    • 1-2 months of Planning (functional and technical design)
    • 4-5 months of Development – with ~3 Major Milestones
    • 3 months of QA & stabilization
    • 1 month of Project Buffer
  5. Very soon the development teams are scattered like lone wolves – everyone for himself until the next integration or major milestone, months away.
  6. The first milestone ends with:
    • 20% of the content is really Done a.k.a. “Even a Blind Chicken Finds a Kernel of Corn Now and Then”
    • 50% is “done” with dirty, buggy code, low quality, performance issues and missing or wrong functionality
    • 30% is just not ready
  7. Developers and low-level management remind themselves yet again to add more buffers…
  8. The PMO suggests (in a relaxed and trusting tone) postponing the milestone or removing content.
  9. Management doesn’t panic (they have seen it before ;-) and decides not to decide: “Let’s see if we can close the gap by the next milestone” a.k.a. the classic do {} while(timeRemaining > Last Milestone)
  10. The next milestone has much more content and the pressure builds up… until the last-milestone blame game, which usually ends with a ~2 month delay and half of the planned content.


→ No Comments · Tags: Agile · Development · Management · Planning · Project Management · Scrum · Software · Software Management

Privacy Time Bomb @ Google

September 13th, 2007 by Moti Karmona | מוטי קרמונה · No Comments

“We are moving to a Google that knows more about you.”
Google CEO Eric Schmidt, February 9, 2005 (2.5 years ago…)

Did you know that Gmail has a “No Deletion Policy” for your emails?

If not, you should read the Gmail “privacy” notice:
“… Similar to other web services, Google records information such as account activity (including storage usage, number of log-ins), data displayed or clicked on (including UI elements, ads, links); and other log information (including browser type, IP-address, date and time of access, cookie ID, and referrer URL)… Google may send you information related to your Gmail account or other Google services… Residual copies of deleted messages and accounts may take up to 60 days to be deleted from our active servers and may remain in our offline backup systems.”

–> Your private emails may be retained on Google’s backup systems even after you close your account…

Google records everything it can (and I mean everything) and retains all data indefinitely (and I mean indefinitely).

Google insists that it uses individual data and collects information about users’ activity to improve the quality of search results and not to create a profile for each user. But history shows that information seldom remains limited to the purpose for which it was collected.

Your Googled data (private emails and documents, browsing preferences, search history etc.) has become the most wanted honey-pot on the internet. It attracts hackers, crackers, identity thieves and, perhaps most worrisome of all, governmental Big Brother intentions…

Personally, I like Google and I will keep using Gmail, since it is the best online email service available today and I don’t trust the other webmail services ;-)


Update (15 December 2008): Google has dropped off the top-20 list of the most trusted U.S. companies for privacy (it ranked No. 10 last year), according to a report in the San Francisco Chronicle on Monday.

→ No Comments · Tags: Google · Internet

Internet Service Outage-Lie-of-the-Day

September 10th, 2007 by Moti Karmona | מוטי קרמונה · No Comments

Dilbert on Network Outage

Almost a month ago, I moved to my own hosted blog due to a Google Blogger outage.

Last week (September 3rd, ~00:00), I was visiting BlogCatalog and got the Internet Service Outage-Lie-of-the-Day: “Database Error – We’re sorry, but our database servers are currently overloaded. Please enjoy a cup of coffee and then try refreshing this page – Blog Catalog” [BlogCatalog Outage September 3rd – 2007] … so I followed the site's recommendation and enjoyed a great cup of Italian coffee, but I was very sorry to see that it didn’t help the BlogCatalog database recover…

These days, we are working intensively on the goals of the last sprint; one of them was to finish the architectural design of our unique internet service.

So… yesterday we were guesstimating (until the very late hours of the night) the scale needed for this service, and we started mapping the IT topology, database implementation and software back-end options that could viably support it.

Dilbert plan to Network Outage

I have a lot to say about the process, the technology challenges and the options in future posts, but I can sum it up for now: we all hope that by making the right decisions at these critical architectural junctions, we will be able to enjoy a good cup of Italian coffee while our databases are down for maintenance… but we also know that simple risk management on our guesstimates means being prepared for rainy days with a good enough Internet Service Outage-Lie-of-the-Day ;-)

→ No Comments · Tags: Blogging · Delver · Internet · Planning · Scrum