Google captures many images which contain data that can’t be easily deciphered by machines, such as scans of old manuscripts and photographs of signs. One way they’ve worked to overcome that is reCAPTCHA, where a user is asked to decipher a string of numbers or letters in an image. If several people decipher a particular image in the same way, there is a high likelihood that their interpretation is correct. It’s one of the most elegant crowdsourcing projects ever.
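The agreement idea can be sketched in a few lines of Python. This is only an illustration of majority voting, not Google’s actual method; the function name consensus and the thresholds min_votes and min_share are hypothetical choices made for the example.

```python
from collections import Counter


def consensus(answers, min_votes=3, min_share=0.6):
    """Return the transcription most people agree on, or None.

    min_votes and min_share are illustrative thresholds, not values
    taken from any real system.
    """
    # Normalize so that "Morning" and "morning " count as the same answer.
    normalized = [a.strip().lower() for a in answers if a.strip()]
    if len(normalized) < min_votes:
        return None  # too few readers to trust any answer yet
    text, count = Counter(normalized).most_common(1)[0]
    # Accept the top answer only if a clear majority chose it.
    return text if count / len(normalized) >= min_share else None


if __name__ == "__main__":
    print(consensus(["Morning", "morning ", "moming", "morning"]))
```

With three of the four readers agreeing, the sketch accepts their answer. With only two readers, or with three who all disagree, it returns None and the image would need more readers.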
Making sense of data is at the heart of Google’s endeavors. But that’s a silly statement. Data is information, and information is at the heart of all of our endeavors. As the computer age progresses, we’re creating more and more information by the second.
On LinkedIn, I can see that one friend has liked a post, read a book, and been given a new job. On Facebook, I can see that same person has listened to a piece of music and watched a movie on Netflix.
There are some check-ins posted there as well, via Foursquare or Facebook check-in. On Google Plus, I can follow the trail of that person to various blog posts they’ve written, which are then also posted in some communities that they now belong to.
The Greatest Databases Ever Imagined
Google+, Facebook, LinkedIn, and other social platforms are the input screens sitting on top of massive databases. We are freely doing all the data input. Every time a person posts something on social media, they’re adding a bit of information to the largest databases ever created.
Setting Tables
My own start in computing was in databases. I had bought a computer that came packaged with Microsoft Access, a relational database program. With a beginning tutorial and a bit of cabin fever, I created a database application for artists to track their work.
Relational databases work on a simple premise and were designed to save space. A non-relational database might look like this:
Apple, fruit
Orange, fruit
Pineapple, fruit
Peas, vegetable
Fig, fruit
Squash, vegetable
You can see how the words fruit and vegetable are repeated. In earlier computing, words being repeated added a lot of size to data files, so programmers devised relational databases. Instead of one big, bloated table, you could have two tables that were linked. In one, you could define your categories, like this:
CategoryID, Name
1, Fruit
2, Vegetable
In the next table, then, I would only have to use the CategoryID to go with the food name:
Food Name, Category
Apple, 1
Carrot, 2
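Here is a minimal sketch of those two linked tables, using Python’s built-in sqlite3 module with an in-memory database. The table and column names (category, food, category_id, food_name) and the two helper functions are assumptions made for the example; an Access database would be set up through its own interface.

```python
import sqlite3


def build_food_db():
    """Create the two linked tables in a throwaway in-memory database."""
    conn = sqlite3.connect(":memory:")
    # The category table stores each category name exactly once.
    conn.execute(
        "CREATE TABLE category ("
        "category_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    # The food table stores only the small numeric ID of its category.
    conn.execute(
        "CREATE TABLE food ("
        "food_name TEXT NOT NULL, "
        "category_id INTEGER NOT NULL REFERENCES category(category_id))"
    )
    conn.executemany(
        "INSERT INTO category VALUES (?, ?)", [(1, "Fruit"), (2, "Vegetable")]
    )
    conn.executemany(
        "INSERT INTO food VALUES (?, ?)", [("Apple", 1), ("Carrot", 2)]
    )
    conn.commit()
    return conn


def foods_with_categories(conn):
    """Join the tables so each food is shown with its category name."""
    rows = conn.execute(
        "SELECT food.food_name, category.name "
        "FROM food JOIN category ON food.category_id = category.category_id "
        "ORDER BY food.food_name"
    )
    return rows.fetchall()


if __name__ == "__main__":
    for food_name, category_name in foods_with_categories(build_food_db()):
        print(food_name, category_name)
```

The join is what makes the database relational: the category name lives in one place, and the query looks it up through the shared ID whenever it is needed.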
In many applications, the database relationships can get fairly dense. I’ve seen programmer bullpens where all the walls of the room were covered in charts showing the relationships of all of the various tables to one another.
Relational databases have been a critical part of modern computing. All of the world’s 60 million plus WordPress blogs and websites use databases, as do all of the Joomla, Drupal, Ruby, .Net websites, content management systems, e-commerce sites, and all the other myriad Web-based applications.
Google & Facebook Databases
The old world of relational databases was too limited, however, for the scheme of things.
When it comes to comprehending the vast size and distances of the universe, our brain falters. It doesn’t do well with data, either. It’s thought that the human brain is able to store about 2.5 petabytes of binary information. Google processes about 25 petabytes each day, and may store 100 petabytes (they don’t share that information). A petabyte is the equivalent of a million of those gigabyte drives you can buy at Office Depot.
With the advent of XML, and later Schema.org markup, we got a hint that there is a move afoot to put all of the world’s knowledge into a database. By asking Web page authors to use the “rel=author” and “rel=publisher” link attributes, Google is asking you to give the information some structure.
Google has moved away from old-fashioned relational databases, and now uses something called Bigtable, which is altogether different from a relational database. Bigtable, and similar database systems, allow datasets to be spread across hundreds if not thousands of servers.
Facebook still uses MySQL, the common open-source database that is also used in WordPress websites. According to Shlomo Priymak, a MySQL Database Engineer at Facebook, the organization’s MySQL database cluster comprises many thousands of servers across multiple data centers on two continents. Facebook is also experimenting with Apache Cassandra, whose data model is related to Bigtable.
Of all the creatures of the world, it is said that only humans have the capacity for what’s called “cultural ratcheting,” the ability to build upon the knowledge and culture of our predecessors. In technology, we’re able to create things that no single one of us knows how to make. Take something as simple as a computer mouse: no single person knows how to make the plastic and the transistors, and then put the whole thing together. It’s a product of collective thinking.
A Look Beneath The Hood
Without doubt, social media is a driving force behind the revolution in marketing and business communications. While the architects of those systems are introducing design patterns that both complement and shift individual and social behavior, what’s happening beneath the hood is just as critical and revolutionary.