It's Time to Think About Data Differently


Man, being a creature of habit, tends to incorporate "skeuomorphic" elements into evolutionary designs to provide a level of comfort as we climb the ladder. Think about it: why does your computer need files and folders? That paradigm was carried forward from the physical world, yet it lacks relevance to the digital one it is applied to.

Now, some may argue this is a requirement for acceptability, or better put, "understandability," by the masses, as the path from evolution to revolution is in fact sigmoidal, and we Homo sapiens tend to be a funny bunch about dragging our baggage with us. However, to make that final leap we do have to leave that baggage behind, as it is cumulative in nature, and this brings me to the idea of data.

A computer does one of two things: either it acts as a "reductionary" device, where it is provided massive amounts of data, as in "Big Data," and asked to reduce it, or a "creationary" one, where it is provided something (typically an algorithm) and creates something from it. In the latter case, generating large random number sets for Monte Carlo simulations would be an example. Now, since "man" (referred to in a phyletic sense) created the computer, he also applied skeuomorphic concepts of "data" to the model that both operations, creation and reduction, depend upon, and these are now the bottleneck.
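To make the "creationary" mode concrete, a few lines of code can expand one tiny algorithm into millions of data points. This sketch (my own illustration, not from any specific system) estimates π by Monte Carlo sampling, the very use of large random number sets mentioned above:

```python
import random

def monte_carlo_pi(samples: int) -> float:
    """Estimate pi by sampling random points in the unit square and
    counting how many land inside the quarter circle of radius 1."""
    inside = 0
    for _ in range(samples):
        x, y = random.random(), random.random()
        if x * x + y * y <= 1.0:
            inside += 1
    # The inside/total ratio approaches pi/4 as samples grow.
    return 4.0 * inside / samples

print(monte_carlo_pi(1_000_000))
```

One short generative rule stands in for a flood of data: the program is a few hundred bytes, yet it produces (and summarizes) a million points.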

As we all know, data is growing exponentially, and I will spare you a rehash of the details, yet the explanation above is important to understand, as it is why data is growing. The First Law of Thermodynamics bears this out: we can only shift the information in a system, we cannot create or destroy it, so we are left with a bit of a conundrum.

Let's step back for a moment and look at data, and at how a computer uses it today, to better understand how our applied skeuomorphism is holding us back. Data exists on a hard disk as a series of ones and zeros pressed very tightly together in a small sequential space on a round spinning platter. For the moment we will forget about SSDs, as the distinction isn't important to the concept; the point is that your data exists, for the most part, in whole. Your Miley Cyrus songs exist in a whole state on your hard drive. Now say you want to share one of those songs (legally) with a friend, so you tell your computer to copy the data to their computer across the country. What happens?

Well, first off, your computer will spend a substantial number of bytes (relative to the size of the data you wish to move) just to find and establish a connection with your friend's computer. Next, your computer will say "I have a one, please create a one on your side," and so on until the process is complete, and both computers will have spent (wasted) far more bytes than the actual file contains. Note, this discussion isn't about protocols and the like, as all forms of communication have a cost (confirmed by the Second Law of Thermodynamics), so we accept a measure of this as a given. But what if we didn't have to do this handshake for each and every byte of information?

How do we get around this, you ask? The answer is simple math. The computer is a "math" machine; in fact, that is all it knows, yet we spend large amounts of time and money forcing it to understand our perceived mental models of "data." With that said, what if we reduced, say, a terabyte of data to just one algorithm, and instead of sending those billions or trillions of bytes we sent one formula? Now, you might say we have already started down this sigmoidal road with zip files and WAN compression, yet these are only a hint of the greater idea of everything actually being a formula.
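The intuition behind "data as formula" is easy to demonstrate with ordinary compression: data generated by a simple rule shrinks to roughly the size of that rule, while data with no underlying rule barely shrinks at all. A quick sketch using Python's standard `zlib` module:

```python
import os
import zlib

# One megabyte built from a tiny generating rule (a ten-byte pattern).
regular = b"0123456789" * 100_000
# One megabyte with no underlying rule at all.
random_bytes = os.urandom(len(regular))

print(len(zlib.compress(regular)))       # collapses to a few kilobytes
print(len(zlib.compress(random_bytes)))  # stays close to a full megabyte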

In the past, the slow processing abilities of CPUs were a limiting factor, as most of the data taken in by a computer has analog origins, and the reduction to a single "formula," if you will, was not reasonably possible. Today those same chains are quickly falling away, and new abilities are being created every day.

Evidence of this can be seen in the growth of regular expressions (regex), where linguistic patterns are distilled into mathematical expressions that can be rapidly applied across vast amounts of data. This is how your spam filter works, as attempting a string-for-string match across the millions of mails passing through those servers would be impossible. It is also how the NSA looks at all the data it does; what people miss is that there is too much to read, so the goal is to mathematically mine for items of interest, and algorithms allow for this to happen.
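To see the idea in miniature: one compiled pattern can screen any number of messages, where per-message string comparisons never could. The pattern below is a toy of my own making (no real filter's rules), matching common obfuscations of "free money" such as "fr33 m0ney":

```python
import re

# Hypothetical spam rule: "free money" with digit-for-letter swaps.
spam_pattern = re.compile(r"\bfr[e3]{2}\s+m[o0]n[e3]y\b", re.IGNORECASE)

messages = [
    "Meeting moved to 3pm tomorrow",
    "FR33 M0NEY waiting for you, click now!",
]

for msg in messages:
    # One pattern covers every spelling variant in a single pass.
    print(bool(spam_pattern.search(msg)))
```

The pattern is the "formula"; the millions of matching strings it describes never need to be enumerated.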

Still not convinced this is possible? You have to look no further than yourself, as you are, in effect, nothing more than an extremely large hard drive. What do I mean? Every cell in your body was built from a single root data model named DNA, yes, deoxyribonucleic acid. Composed of only five elements (hydrogen, oxygen, nitrogen, carbon, and phosphorus), it forms just four building blocks, guanine, adenine, thymine, and cytosine, each carrying only two bits of information. These four building blocks recombine to form our complete being, consciousness included, from this one seemingly simple data model.

To extend this back to the digital world for a moment: what if we could store that terabyte of information we spoke of earlier in just one formula? The implied abilities and savings of such a capability would be enormous. Simply consider the energy (typically electrical) spent storing all this data on spinning hard disks, or refreshing SSDs on each read cycle.

Keep in mind that while we like to think our data is "unique" to us, in the pictures we take and the music we record, this is really not the case: while the possibility for data to be infinite exists, the probability that it is finite is statistically overwhelming…

Forget Silicon When You Have Bacterium…

Can't Wait for this in an iPad version...

Biological computing has been the subject of many science fiction movies and television shows; however, as all sci-fi buffs reading this know, what was once the fantasy of fiction is now the stuff of reality. It's worth noting, after all the waxing above, that I sit here in Europe having just finished FaceTiming my wife half a world away on my iPad, a device that would have made even Mr. Spock proud to call his own.

With this said, it's interesting that while some scientists and engineers toil to develop improved versions of current silicon computing technology, their colleagues on the carbon-based side of the world (biology) are looking into a drastically different approach to the same end. DNA-based computing, much like what we carbon-centered humans run on, offers the potential of performing massively parallel calculations at very low power consumption and at small sizes.

While research in this area has been limited to small systems, a group of researchers at Caltech recently constructed a series of DNA logic gates from 130 different molecules, then used the system to calculate the square roots of numbers. The same group has also published a paper in Nature describing an artificial neural network, consisting of four neurons, created using the same DNA circuits.

In short, the system works like this: the researchers preload "gates" with a molecule, add a batch of molecules as input, and simply wait for the laws of statistics to kick in and do their thing. The more of a given input molecule there is to start with, the greater the likelihood that it will displace the molecule at a gate, which then creates an output. This is basically materials science taking place at the nano-scale, leveraging the natural properties of the molecules being used, hence the low cost of applied energy within the system.
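The statistical behavior described above can be caricatured in a few lines. This is emphatically a toy model of my own (real strand-displacement kinetics are far richer), but it captures the key property: more input molecules means more displaced gates, and therefore a stronger output signal.

```python
import random

def displacement_gate(input_count, gate_count=100, rng=None):
    """Toy model of a DNA displacement gate: each gate holds a preloaded
    strand, and each input molecule may displace one, releasing an
    output strand. The chance of a displacement shrinks as fewer
    loaded gates remain."""
    rng = rng or random.Random()
    free_gates = gate_count
    outputs = 0
    for _ in range(input_count):
        if rng.random() < free_gates / gate_count:
            free_gates -= 1
            outputs += 1
    return outputs

print(displacement_gate(10, rng=random.Random(42)))    # weak input, weak output
print(displacement_gate(500, rng=random.Random(42)))   # strong input saturates the gates
```

No molecule is "computing" anything individually; the signal emerges from the statistics of many collisions, which is exactly why the energy cost per operation can be so low.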

Even in its infancy, should one apply Moore's Law in a sigmoidal fashion, or subscribe to Kurzweil's race to the singularity, this technology will be out of the lab and into commercial use within the lifetime of many reading this post. If further convincing is required, we need only look to the news that Lockheed Martin has just become the first company to purchase the once mythical quantum computer, and one has to wonder no more…