Sometimes, it already seems like the websites you visit know more about you than you know about yourself, but this is just the beginning. Every day you're helping to generate masses more data, which computers are getting better and better at crunching. How long will it be before these systems become so good at prediction that they seem almost magical?
Here are 10 ways that Big Data is already creating a shiny new science fiction future.
Top image by Sparth.
1. Dating sites that can predict when you're lying
OKCupid has 7 million active users, and can therefore figure out details like what people are likely to lie about. (Unsurprisingly, height and salary are popular.) Just imagine what'll happen when these sites start playing with unstructured data, stuff like the contents of your "about me" profile, which can't be shoehorned into a traditional relational database and therefore can't yet be easily crunched. Suddenly you're working with way more than 99 dimensions of compatibility, and digital sourcing of one's future light-o-love isn't just as good as meeting someone in a bar; it's actually easier.
2. Social networks already know who you know
LinkedIn's "People You May Know" is one of the great showcases for what companies can do with big data. If you're a user, you've probably seen names pop up that leave you wondering how LinkedIn knew you knew them: people you've only corresponded with over email, friends of friends, colleagues at the new job you haven't yet added to your profile. We're zooming past the age of "friending," where people had to actually look each other up. Now there are so many active social media users, putting out so much information, that networks are increasingly able to make the connections on your behalf. You're almost taken out of the equation, because the algorithm can figure out, without your help, that you met your best friend's cousin's cute coworker on Friday night.
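The "friends of friends" idea can be illustrated with a toy example. This is only a minimal sketch of one well-known heuristic (counting mutual connections), not LinkedIn's actual method, which isn't described here; the function name suggest_connections and the sample graph are made up for illustration.

```python
from collections import Counter


def suggest_connections(graph, user, limit=3):
    """Rank people `user` is not yet connected to by number of mutual connections.

    `graph` maps each person to the set of people they are connected to.
    This is a hypothetical helper, not any real network's API.
    """
    direct = graph.get(user, set())
    counts = Counter()
    for friend in direct:
        for candidate in graph.get(friend, set()):
            # Skip the user themselves and anyone already connected.
            if candidate != user and candidate not in direct:
                counts[candidate] += 1
    # Most mutual connections first; break ties alphabetically.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:limit]


# Invented sample data: "cy" is known to both of your connections.
graph = {
    "you": {"ana", "ben"},
    "ana": {"you", "cy", "dee"},
    "ben": {"you", "cy"},
    "cy": {"ana", "ben"},
    "dee": {"ana"},
}

print(suggest_connections(graph, "you"))  # [('cy', 2), ('dee', 1)]
```

A real system would presumably fold in many more signals (email contacts, employers, locations), but even this simple count shows how a network can surface people you never searched for.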
3. Surveillance gets really Orwellian, really fast
In August, the NYPD established a social media unit to snoop on Facebook and Twitter for mentions of criminal activity. One individual sitting in front of TweetDeck can only do so much, but new analytical tools allow law enforcement to uncover all kinds of patterns. On top of that, there are new ways of gathering information to analyze. The US military already collects mind-boggling amounts of data, and they're only sharpening their capabilities. For example, the Army is deploying a new drone helicopter to Afghanistan which is equipped with something called the Autonomous Real-time Ground Ubiquitous Surveillance Imaging System (ARGUS). Every single day, that's going to collect six petabytes — or almost 80 years' worth — of HD video. That's a lot of data to crunch.
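As a rough sanity check on those figures, the arithmetic below works out what video bitrate would make six petabytes equal to 80 years of footage. It is a back-of-the-envelope sketch that assumes decimal petabytes (10^15 bytes) and 365.25-day years; neither assumption comes from the original figures.

```python
PETABYTE = 10**15                      # bytes; decimal definition (assumption)
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # average year length (assumption)


def implied_bitrate_mbps(total_bytes, years_of_video):
    """Bitrate, in megabits per second, at which `years_of_video` fills `total_bytes`."""
    seconds = years_of_video * SECONDS_PER_YEAR
    return total_bytes * 8 / seconds / 1e6


rate = implied_bitrate_mbps(6 * PETABYTE, 80)
print(f"{rate:.1f} Mbit/s")  # 19.0 Mbit/s
```

Roughly 19 megabits per second is in the range of a typical HD video stream, so the two numbers are at least consistent with each other.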
4. Recommendation engines get much smarter
Companies like Netflix and Amazon have pioneered the use of big data in recommendation engines, and with initiatives like the Netflix Prize, they're constantly investing in ways to push what's possible. Now imagine what happens if you add social elements to the mix, and not just as a "your friend from middle school is reading The Help" add-on, either. Think more along the lines of recommending Portlandia because your girlfriend has listened to non-stop Sleater-Kinney on Spotify for three days straight. Granted, this requires services to share data, and it's unlikely a company like Amazon will ever entirely open up its valuable treasure trove of data points. But even just a couple of major partnerships could create far, far better content delivery.
5. Scientists and doctors can make sense of your genome — and so can insurers
The Human Genome Project was founded in 1990, and didn't announce a completed DNA sequence until 2003. Now it's possible to unlock an individual genome in mere weeks, and the cost is rapidly approaching the point of being within reach for the average individual. At CES, Ion Torrent released technology that brings the cost of sequencing below $1,000. This is where things get really interesting. Suddenly scientists can examine large genetic data sets for disease patterns, and doctors can start tailoring treatments to individual genetic profiles. So expect both ground-breaking research and a more personalized kind of medicine.
Of course, there's a very good chance that insurers will also have access to that data, which opens up a whole new Gattaca-flavored can of worms.
6. Journalists know when your habits are changing
Everyone wants to spot the next big thing, but all too often the result is just "three-stories-make-a-trend" reporting. Just wait until marketers and news organizations have access to a whole lot more raw data, which will give them something concrete to base their conclusions on. Nor is this limited to lifestyle pieces about the emerging popularity of something like yorkies or Greek yogurt. The Guardian, in particular, is pioneering a data-driven form of journalism, using openly available data to do more in-depth reporting. All this means eerily accurate insight into what you (and your cohort) are doing and why. Think William Gibson novels, here.
7. Intensely personal data gets crunched
Know anyone who's got trouble sleeping? The challenge in figuring out why comes from the fact that sleep is a pretty complicated process. There are so many variables involved (breathing patterns, REM patterns, body position, ambient noise) that it's hard to isolate what might be causing the trouble. Enter devices like Jawbone and Zeo, which enable individuals to keep track of their own vitals and adjust their habits accordingly. They're part of an emerging "quantified self" movement, where number-minded individuals are looking for ways to measure themselves, which gives them (and others) the insight necessary to make big changes.
8. Early detection mitigates catastrophes
Speed makes a tremendous difference when it comes to disaster response, and even a slight heads up can save countless lives. When a magnitude 9.0 earthquake struck Japan in 2011, an early warning system built in 2007 was able to offer just enough warning to shut down the nation's bullet trains and other facilities. This isn't limited to natural disasters, either: a recent study demonstrated that using data from Twitter would have enabled faster identification and more accurate tracking of the Haitian cholera epidemic that's killed 6,500 people. The implications for the containment of infectious diseases are enormous.
9. The US electrical grid finally enters the digital age
The US electrical grid is currently so analog that hacking it, the subject of many cyberwar doomsday predictions, is basically impossible. However, huge chunks of the system do go down without the assistance of cyber-attackers, in cascading failures like the 2003 Northeast blackout. And that's just the catastrophic stuff. Everyday power outages caused by errant tree branches can't be fixed until the power company can triangulate the source of the problem using customer phone calls. On top of all of that, have you looked at your meter lately? There are coffee makers that use more technology.
Right now, companies like eMeter are racing to market with new technologies to change all this. That's going to give utilities unprecedented insight into their own networks, making the whole system more efficient. But that won't just catch the industry up to the information age, either. A smart grid is absolutely vital for the integration of distributed power generation, i.e., solar and other alternative energy sources.
10. An accurate weather forecast
This one's pretty far out on the edge, just because weather systems are so very, very complex. But the company that manages to create even a halfway effective weather-prediction algorithm will be able to, as they say, make it rain.