Karmona Pragmatic Blog

Pragmatic Software Management, Internet Trends, Life and more…

Random Rumbling on Technology Triggers

August 10th, 2009 by Moti Karmona | מוטי קרמונה · No Comments

“In the future everything will be augmented reality!”

I might be getting a little too old, visionless, pragmatic or pessimistic for this, but I find it very hard to travel to the promised la-la land that Gartner calls the “Peak of Inflated Expectations”.

When I encounter a new “Technology Trigger”, I skip right to the “Trough of Disillusionment” without really passing through the promising “peak”…

e.g. Hype Cycle for Emerging Technologies | Gartner, 2009 – Am I missing something here?

Hype Cycle for Emerging Technologies | Gartner, 2009

p.s. I do think Technology Triggers are very good for SEO and I will update you if it works… ;)

*** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
Human Augmentation, 3D Flat Panel Displays, Quantum Computing, Context Delivery Architecture, Video Search, Behavioral Economics, Mobile Robots, Surface Computers, Augmented Reality, 3D Printing, Internet TV, Wireless Power, Cloud Computing, E-Book Readers, Social Software Suites, Microblogging, Green IT, Video Telepresence, Mesh Networks, Online Video, Home Health Monitoring, Public Virtual Worlds, RFID, Social Network Analysis, Web 2.0, Idea Management, Tablet PC, Wikis, Corporate Blogging, SOA, Location Aware Applications, Speech Recognition etc.

→ No Comments · Tags: Conspiracy · Disruptive Technology

Backup your Life with Amazon S3

February 1st, 2009 by Moti Karmona | מוטי קרמונה · 10 Comments

S3 (Simple Storage Service) Overview

“Amazon S3 provides a simple web services interface that can be used to store and retrieve any amount of data, at any time, from anywhere on the web. It gives any developer access to the same highly scalable, reliable, fast, inexpensive data storage infrastructure that Amazon uses to run its own global network of web sites. The service aims to maximize benefits of scale and to pass those benefits on to developers.” (http://aws.amazon.com/s3/)

What can I do with this S3 thingy?

Many things… but I will focus on a pragmatic and common use case – you can use S3 as the ultimate network drive to share your music collection, back up your documents, store your blog images (a.k.a. CDN for the masses) etc.

Is it free???

Almost…   $0.15 per GB stored per month (you can estimate your monthly bill using the AWS Monthly Calculator)

Amazon S3 Pricing

How do I start?

To use the Amazon S3 service, you simply need to open an Amazon account and sign up for S3.

Interesting tools to simplify your S3 experience:

  • Amazon S3 Firefox Organizer – Simple Firefox add-on that provides an FTP-like interface (Windows Explorer) to upload and manage files on S3 – “S3Fox Organizer helps you organize/manage/store your files on Amazon S3. It is easy to install and use as it is integrated into the browser…”
  • DropBox – A new Amazon-S3-backed storage service (thanks to Shlomo for the introduction) – Very simple to use, with a 2GB storage limit on the default (free) account and a paid upgrade to 50GB of space for $9.99 / month, which is not that much above the $7.5 (50GB x $0.15) they need to pay Amazon (before optimizations ;)
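If you want to play with the storage math yourself, here is a minimal sketch. It only multiplies the stored gigabytes by the $0.15 figure quoted in this post; ignoring transfer and request fees is my simplifying assumption, and the function name is made up for illustration.

```python
# Back-of-envelope monthly storage cost.
# Assumption: only storage is billed; transfer and request fees are ignored.
PRICE_PER_GB_MONTH = 0.15  # USD, the figure quoted in this post


def monthly_storage_cost(gigabytes, price_per_gb=PRICE_PER_GB_MONTH):
    """Return the estimated monthly storage bill in USD."""
    return round(gigabytes * price_per_gb, 2)


# The 50GB paid tier mentioned above.
fifty_gb_cost = monthly_storage_cost(50)
print(fifty_gb_cost)
```

For the real bill, the AWS calculator is still the better tool; this is just the napkin version.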

P.S. If you liked DropBox you might also like many others

→ 10 Comments · Tags: Amazon · Disruptive Technology · Simplicity · Tools

Head in the Clouds

January 29th, 2009 by Moti Karmona | מוטי קרמונה · 2 Comments

La Condition Humaine | Rene Magritte, 1933

It seems like everywhere I go these days, people are talking about cloud computing… Would it be accurate to say that almost everyone is “fairly optimistic” regarding cloud computing?

For your convenience, I have collected a few cloud-buzz-quotes to spice up your cloud computing elevator pitch:

  • According to Gartner: “Cloud computing heralds an evolution of business that is no less influential than e-business”
  • IDC on Cloud Computing: “This is about the IT industry’s new model for the next 20 years”, Vernon Turner, head of enterprise infrastructure, consumer and telecoms research.

Cloud Computing The Latest Evolution of Hosting | Forrester Research

  • Merrill Lynch estimates that by 2012, the annual global market for cloud computing will surge to $95 billion and that 12% of the worldwide software market would go to the cloud in that period.
  • In January of 2008 Amazon announced that the Amazon Web Services now consume more bandwidth than do the entire global network of Amazon.com retail sites

Amazon Web Services Bandwidth

  • “… the new computing cloud age” Eric Schmidt (April, 2008)

Do you happen to have more???

→ 2 Comments · Tags: Amazon · Cloud · Conspiracy · Disruptive Technology

The Social Graph Challenge

December 30th, 2008 by Moti Karmona | מוטי קרמונה · 2 Comments

The Story Behind The Delver Kid Image

I have been analyzing, dreaming, monitoring, crawling, debugging, reading, breathing, cursing, scaling, visualizing and learning the social graph for the last couple of months, and I thought it might be a good idea to write a little something about The Social Graph Challenge with a pragmatic twist on a few other common concepts.

 

——— Blitz Introduction to The Social Graph ———

The social graph is just a simplified mathematical abstraction in which nodes are people and edges are the relations between them.

In the last decade the internet has become more social than anyone expected it to be, with the rapid growth and adoption of social networks, social media and user-generated contributions and interactions.
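To make the abstraction concrete, here is a toy sketch of people as nodes and labeled, directed relations as edges. The class, the relation labels and the names are all made up for illustration; a real social graph store looks nothing like an in-memory dictionary.

```python
from collections import defaultdict


class SocialGraph:
    """People are nodes; a directed, labeled edge is a relation between two people."""

    def __init__(self):
        # person -> set of (relation, other_person)
        self._edges = defaultdict(set)

    def add_relation(self, source, relation, target):
        self._edges[source].add((relation, target))
        self._edges[target]  # touch the target so it exists as a node too

    def neighbors(self, person):
        return sorted(target for _, target in self._edges.get(person, ()))

    def node_count(self):
        return len(self._edges)

    def edge_count(self):
        return sum(len(out) for out in self._edges.values())


# Hypothetical people and relations.
graph = SocialGraph()
graph.add_relation("alice", "friend", "bob")
graph.add_relation("alice", "colleague", "carol")
graph.add_relation("bob", "follows", "carol")
print(graph.neighbors("alice"), graph.node_count(), graph.edge_count())
```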

Nowadays, there is a growing feeling that it is feasible to model and map the social web into a real-life social graph replication.

Delver Starfish

——— Pragmatic Overview on The Social Graph Challenge ———

Modeling | Building | Processing | Size | Architecture

(1) Modeling the Social Graph

*** Vocabulary 

To better understand how complicated it is to create a vocabulary for expressing metadata about people, their interests, relationships and activities you should simply pay a quick visit to the FOAF Project technical specification page

The FOAF (“Friend of a Friend”) Project has the most comprehensive model available today, and it still lacks some basic modeling granularity, e.g. no time-awareness metadata, no privacy model and a poor relationship model.

*** The Social Cloud

It is a common mistake to forget that people are more than just flat internet identities (e.g. a LinkedIn profile); to complete the profile modeling we must add all their content to the graph, e.g. personal blog, Flickr images, YouTube videos, Delicious bookmarks, tweets, blog comments etc.

Modeling all of these content and consumption types will yield a broader definition (a.k.a. The Social Cloud) with even more complex modeling challenges.

More Delver Kids

(2) Building the Social Graph

*** The Paradigm Shift

While conventional internet crawlers follow hyperlinks within web pages and treat pages as plain text, social crawlers should have social “awareness”:

  • Identify and extract fragments of people’s identities (e.g. social network profiles, blog authors)
  • Identify relationships (e.g. social network connections, blog-roll fans)
  • Identify relations between content and people (author, bookmark, reference etc.)
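As a tiny illustration of the first two bullets, here is a sketch that pulls XFN-style rel values (e.g. rel="me", rel="friend") out of page links using Python's standard html.parser. The class name, the helper function, the short list of rel values and the sample URLs are my own assumptions for the example; a real social crawler needs far more than this, and a rel="me" claim says nothing about whether it is true.

```python
from html.parser import HTMLParser


class XfnLinkParser(HTMLParser):
    """Collect (rel_value, href) pairs from <a> tags that carry XFN rel values."""

    # Only a few XFN relationship values; the full vocabulary is larger.
    XFN_VALUES = {"me", "friend", "contact", "acquaintance", "colleague", "co-worker"}

    def __init__(self):
        super().__init__()
        self.relations = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        rel = attributes.get("rel") or ""
        if not href:
            return
        # rel may hold several space-separated values
        for value in rel.lower().split():
            if value in self.XFN_VALUES:
                self.relations.append((value, href))


def extract_relations(html):
    parser = XfnLinkParser()
    parser.feed(html)
    return parser.relations


# Hypothetical page fragment.
sample = """
<a href="http://example.com/alice" rel="me">My other profile</a>
<a href="http://example.com/bob" rel="friend met">Bob</a>
<a href="http://example.com/about">About</a>
"""
relations = extract_relations(sample)
print(relations)
```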

*** The Standards Dilemma – No Silver Bullet

Besides FOAF, there are several open standards, like RSS and Atom for content syndication and microformats like hCard and XFN for profiles and network discovery, that seem promising and can help with the identification quest. But although these are being pushed by giants (e.g. Google Social Graph API), adoption is still low and the data has many correctness and corruption issues - e.g. all these people claimed to be Wordpress.com using the XFN (rel="me") microformat

*** The Promise of Structured Sources (a.k.a. The structure myth)

The Myth: Most social media sites (e.g. FaceBook, LinkedIn, MySpace, Flickr etc.) have publicly available, structured profile pages, so in principle all that needs to be done is some XPath magic on the HTML DOM to finish the parsing task.

But… most of the work isn’t parsing but data modeling, which requires a deep understanding of each site’s user model and usage.

  • Many social media sites have EULA restrictions which prohibit any access to or use of the site content, but if you are lucky you will get some official APIs instead.
  • Social media sites have many (~weekly) structural changes in their CSS/HTML.
  • Social media sites have many (~monthly) changes in their data privacy policies and have complex privacy models, which create inconsistency in profile, network and content presentation.

*** A few more Challenges with Social Crawling:

  • Privacy-Ownership-Control - The data is the property of the users
  • Unstructured Sources – It isn’t a trivial task to extract social entities from unstructured sources (e.g. blogs) and might require offline semantic processing on your collected data.
  • Cross Network Relations – How to find those important hidden cross network relations e.g. between the biggest reliable network graph (e.g. FaceBook) and the richest content contributions (e.g. Blogosphere, YouTube, Flickr etc.)
  • Identify Social Signs (e.g. Social Widgets, Comments, Blogroll etc.)
  • Social Graph Update Mechanism and crawlers distribution
  • Profile canonicalization

Delver Rodents

(3) Processing the Social Graph

*** The Identity Crisis

  • Filtering impersonation, e.g. all these sites use XFN (rel="me") to “say” they are TechCrunch
  • Identify and have different modeling for non-individual identities (groups, shared authorship) e.g. Knitters Blog with 629 knitting contributors :)
  • Strive to merge identities (a.k.a. profile fusion) when possible, e.g. Moti Karmona in LinkedIn and Moti Karmona in FaceBook could be two instances (/profiles) of the same person, and merging these profiles will enable:
    • Cross network connectedness => Bridging between network richness (e.g. FaceBook) to content richness (e.g. Blogosphere)
    • Richer people representation using identities aggregation => Richer networks
  • The Fusion Challenge: You can pay a short visit to the nearest social aggregator directory, but you can’t get away from some more complex algorithms for disambiguating the web appearances of people with more common names, like James Smith, who don’t “play” in the social aggregation playground (like 98.7% of the graph).
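The easy half of profile fusion can be sketched in a few lines: if you already hold pairs of profiles believed to belong to the same person (say, from an aggregator directory), a union-find pass groups them into identities. Everything here is illustrative: the function name, the profile ids and the input format are my assumptions, and the hard part (deciding which pairs really are the same James Smith) is exactly what this sketch does not do.

```python
def fuse_profiles(profiles, same_person_links):
    """Group profile ids into identities using union-find.

    profiles: iterable of profile ids (hypothetical, e.g. "linkedin:moti").
    same_person_links: iterable of (id_a, id_b) pairs believed to be one person;
    both ids must appear in profiles.
    """
    parent = {p: p for p in profiles}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving keeps trees shallow
            p = parent[p]
        return p

    for a, b in same_person_links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for p in parent:
        groups.setdefault(find(p), set()).add(p)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical profile ids and links.
all_profiles = ["linkedin:moti", "facebook:moti", "blog:moti", "facebook:james"]
links = [("linkedin:moti", "facebook:moti"), ("facebook:moti", "blog:moti")]
identities = fuse_profiles(all_profiles, links)
print(identities)
```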

*** Graph Enrichment 

  • Implicit Relations - Enrich the network with “implicit” relationships (colleagues, graduates, neighbors), e.g. I have a LinkedIn profile and all my connections are hidden from public crawlers, but the fact that I work at Delver is public. If Delver is a startup company with fewer than ~50 people, then there is a good chance I know all the other workers at Delver => this simple heuristic rule can create an implicit relation between me and the other Delver workers without me explicitly claiming that I know them (as I did in FaceBook)
  • Generating the inverted relations when needed (e.g. Followed vs. Follower)
  • Deeper, semantic extraction of social entities from unstructured content
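The first two enrichment rules above can be sketched roughly like this. The ~50 people threshold is the heuristic from the bullet; treating "people we have seen for an employer" as the company size is my simplifying assumption, and all function names and sample data are made up.

```python
from collections import defaultdict
from itertools import combinations

SMALL_COMPANY_LIMIT = 50  # rough threshold from the heuristic above


def implicit_colleague_edges(employer_by_person, limit=SMALL_COMPANY_LIMIT):
    """Return undirected "colleague" pairs for people sharing a small employer.

    Assumption: the number of known people per employer stands in for company size.
    """
    staff = defaultdict(list)
    for person, employer in employer_by_person.items():
        staff[employer].append(person)
    edges = set()
    for people in staff.values():
        if len(people) < limit:
            edges.update(combinations(sorted(people), 2))
    return edges


def invert(follows):
    """Turn (follower, followed) pairs into (followed, follower) pairs."""
    return {(followed, follower) for follower, followed in follows}


# Hypothetical public employer data.
employers = {"moti": "Delver", "dana": "Delver", "avi": "BigCorp"}
colleagues = implicit_colleague_edges(employers)
followers = invert({("dana", "moti")})
print(colleagues, followers)
```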

Delver Faces

(4) The Social Graph Size

Let’s have some quick (and very dirty) guesstimates:

World population is approx. 6.7 Billion x 22% internet penetration => ~1.5 Billion internet users

Let’s say 65% of these users have some kind of presence in social media (~20% have more than one) => ~1 Billion Profiles x ~10 content items per profile

1 Billion Profile Nodes x ~100 network relations per profile => ~110 Billion Graph Edges + ~10 Billion Graph Nodes

It is highly dependent on the graph implementation, but with these numbers you can easily find yourself with ~1-2 Terabytes of graph metadata alone (without contents and profiles*)
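Here is one way to reproduce the guesstimate. The bytes-per-edge figure is purely my assumption (somewhere between 8 and 16 bytes per stored edge), and so is the reading that every content item adds one node plus one edge to its profile; with those assumptions the edge count lands on ~110 billion, the node count on roughly 10-11 billion, and the storage inside the 1-2 terabyte range.

```python
def estimate_graph_size(profiles=1e9, content_per_profile=10,
                        relations_per_profile=100, bytes_per_edge=16):
    """Very dirty size estimate for the social graph.

    Assumptions (not from the post): each content item is one node linked to
    its profile by one edge, and an edge costs bytes_per_edge bytes.
    """
    content_nodes = profiles * content_per_profile
    relation_edges = profiles * relations_per_profile
    content_edges = content_nodes
    edges = relation_edges + content_edges
    nodes = profiles + content_nodes
    return {
        "nodes": nodes,
        "edges": edges,
        "terabytes": edges * bytes_per_edge / 1e12,
    }


internet_users = 6.7e9 * 0.22  # roughly 1.5 billion
high = estimate_graph_size(bytes_per_edge=16)
low = estimate_graph_size(bytes_per_edge=8)
print(internet_users, high, low)
```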

Delver Diving Suite

(5) Two Cents on Social Graph Architecture

Updating and querying a gigantic, dynamic, distributed, directed, cyclic, colored, weighted graph has “some” algorithmic and computational complexity – a little more complex than a blog post could cover…;-)

You can take a quick look at the tiny 15 GB, 25 million node graph implementation at LinkedIn to get a glimpse of the technological challenge…

* Note: Indexing content and profiles data (e.g. for Building a Social Search Engine) is an architecture challenge equivalent to any modern search engine with ~10 Billion documents index

The Delver Kid

This is only the tip of the iceberg but it is more than enough for one blog post ;)

_________

Credit: All the images were taken from Tamar Hak’s amazing artwork – creating The Delver Kid image.

→ 2 Comments · Tags: Delver · Disruptive Technology · Search · Semantic Web · Social Network · Web 3.0

Yahoo Open Strategy

October 28th, 2008 by Moti Karmona | מוטי קרמונה · 1 Comment

Yahoo has released the Y!OS (Yahoo Open Strategy) 1.0 platform.

This is a cool set of simple APIs that can give you access to everything you ever wanted in Y! but were afraid to ask for…

Yahoo! Social Platform (YSP)
// The Yahoo Social Platform is a set of RESTful APIs for Profiles, Connections, Updates, Contacts and Status.

Yahoo! Query Language (YQL)
// The Yahoo Query Language is a web service that functions much like SQL (see example below)

OAuth Authentication
// OAuth is the authentication and authorization standard Yahoo has decided to use when giving third parties access to Yahoo user data.

Yahoo! Applications Platform (YAP)
// Currently very limited and in a restricted sandbox.

________________________________

Example: How to use YQL APIs to access MyBlogLog profiles?

Simply ask for all the community members of MyBlogLog community with this YQL:

select * from mybloglog.members.find where community_id in (select id from mybloglog.community.find where name="Karmona Pragmatic Blog")

And once you have the IDs you can ask for my personal profile by:

select * from mybloglog.member where member_id="2008070609482910"

Well… together with the existing BOSS API, this set of APIs is a powerful enabler for the Y! development network and I am sure some cool stuff is going to emerge from this innovative move…

Amazing!!!

________________________________

* You can have more YQL experiments using the YQL Console

** Boss Hack Day is coming to Tel-Aviv | November 6, 2008 @ Feature (!!!)

→ 1 Comment · Tags: Development · Disruptive Technology · Internet · Search · Software · Web 2.0

Solid State Drives

October 2nd, 2008 by Moti Karmona | מוטי קרמונה · No Comments

5 intriguing facts on SSD

The 1st  modern SSD was developed by StorageTek in 1978 (which was acquired 27 years later by Sun for US$4.1 billion)

Google plans to use Intel’s SSDs in production search systems in Q2 2008 – The company’s adoption of solid state drives will save energy, speed up search, and potentially lead to a shortage of 16-GB and 32-GB NAND flash chips…

Prices for SSD drives are expected to halve every ~9 months (!!!)

In the future everything will be on SSD ;)

We have bought a simple OCZ 128GB – SATA II 2.5 SSD for benchmarking and we are not too happy with it yet.

→ No Comments · Tags: Disruptive Technology · Google

MySQL Surprise

August 11th, 2007 by Moti Karmona | מוטי קרמונה · 1 Comment

“We have used MySQL far more than anyone expected. We went from experimental to mission-critical in a couple of months.” – Jeremy Zawodny, MySQL Database Expert, Yahoo! Finance

Did you know that YouTube, Flickr, Linden Labs, Technorati, Facebook, FeedBurner, StumbleUpon, Wikipedia, Digg, LiveJournal, del.icio.us, Yahoo (Finance) and many others have all selected MySQL as a database backend for their web operations?

e.g. Flickr is using MySQL to store ~2 PB (1 petabyte = 1024 terabytes), holding more than ~470M photos and serving more than 4 billion queries per day…!

Coming from an enterprise software company delivering products which only integrate with the (perceived) highest-end commercial databases, I hadn’t had the pleasure of trying MySQL until recently… I am still at the very beginning of my learning curve, and so far it has been a really pleasant surprise – mainly due to the database’s speed and its ability to easily “scale out” on low-cost hardware (sharding).

MySQL has the disruptive technology “smell” all over it, and my premonition is that it will increasingly evolve to be “good enough” for a larger and larger segment of the market…

→ 1 Comment · Tags: Disruptive Technology · MySQL