Big Data is More Than More Data

by Brian Timoney

As a buzz-worthy term, “Big Data” has a lot going for it: easy to remember, vague enough that a shared clear meaning is always in doubt, and its own O’Reilly conference. Like programmers describing the merits of their software only in terms of the number of lines of code, talking about big-ness merely in terms of the number of records in a database seems to be missing some larger point.

Though we in the geo world have experience with bulky data (e.g. LIDAR, maybe SCADA…), what’s looming with the sensor web, ubiquitous GPS, etc. is on a different scale altogether. One would hope that, given our advantage of being schooled in the analysis and display of location, our industry would have more than a leg up on our uninitiated brethren. But then haven’t we already seen the story of companies who succeeded by understanding scale first, then figuring out the geo part later?

Paradoxically enough, success with Big Data may be as much a question of Art as of Science. The phrase “the data tells the story”, which was never true, is even more misleading in the context of Big Data because of its size and speed. A common analogy is that of sticking one’s face in front of an open fire hydrant: expect the data to tell its own story and you’d emerge a bit dazed, only able to conclude that you had experienced some sort of odorless liquid.

Context and narrative are key no matter what data you’re dealing with, but without them Big Data in particular has little value. Put another way, the value of Big Data is directly correlated to an organization’s ability to mine it for meaningful, actionable information. The decidedly mixed record of the enterprise in doing anything interesting with its data besides storage, retrieval, and elementary reporting fueled this great take: “There’s no such thing as big data.” That’s why Michael Driscoll sees “Big Analytics” as a necessary complement to Big Data. This is where geo can shine: there is no more immediate context than location context; throw in spatial analysis and now you’re cooking with Crisco®.

So Big Data requires more than adding a couple of sub-select statements to your trusty SQL queries. Parallel processing strategies, NoSQL databases, and much faster methods of moving data from server to browser (e.g. Node.js) are some of the weaponry needed to tame the beast.

