The technology already supporting big data acquisition, classification and queries should not sound remotely foreign to those who sift through procurement-specific data today (and who know how the underlying technologies work). To wit, as the NYT suggests, "At the forefront are the rapidly advancing techniques of artificial intelligence like natural-language processing, pattern recognition and machine learning...The wealth of new data, in turn, accelerates advances in computing -- a virtuous circle of Big Data. Machine-learning algorithms, for example, learn on data, and the more data, the more the machines learn." If you think this sounds like a dead ringer for how spend classification systems work, not to mention the predictive analytics capabilities of supplier risk and supply chain risk scoring and alerting tools, you're right.
In fact, one could argue that spend analysis capabilities were among the first commercially ubiquitous -- well, at least among top-performing organizations -- big data technologies to infiltrate the enterprise. And if you dissect what spend analysis systems do at the most basic level, you get a good sense of the steps that any big data system must go through to enable new analyses. First, systems must be able to acquire information from multiple sources. Historically, this was known as the ETL process (extract, transform, load), but as new technologies (e.g., data hubs) have begun to complement both real-time and batch-based integration capabilities, it's no longer a simple one-size-fits-all approach to data acquisition.
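To make the acquisition step concrete, here is a minimal sketch of a batch-style ETL pass in Python. Everything in it is an assumption made for illustration: the two feeds (an accounts payable export and a purchasing card feed), their field names, and the sample records are invented, and a real system would pull from files, APIs or a data hub rather than inline strings.

```python
import csv
import io
import json

# Hypothetical sample feeds, invented for illustration only.
AP_CSV = (
    "vendor,amount_usd,invoice_date\n"
    "IBM,1200.50,2012-01-15\n"
    "Acme Corp,300.00,2012-01-17\n"
)
PCARD_JSON = '[{"merchant": "I.B.M.", "total": "89.99", "date": "2012-01-20"}]'


def extract_ap(text):
    """Extract: parse the accounts payable CSV export into dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def extract_pcard(text):
    """Extract: parse the purchasing card JSON feed into dict rows."""
    return json.loads(text)


def transform(record, source):
    """Transform: map each source's field names onto one common schema."""
    if source == "ap":
        supplier, amount, date = record["vendor"], record["amount_usd"], record["invoice_date"]
    else:
        supplier, amount, date = record["merchant"], record["total"], record["date"]
    return {"supplier": supplier, "amount": float(amount), "date": date, "source": source}


def load(rows, store):
    """Load: append the transformed rows to the target store (a list here)."""
    store.extend(rows)
    return store


def run_etl():
    store = []
    load([transform(r, "ap") for r in extract_ap(AP_CSV)], store)
    load([transform(r, "pcard") for r in extract_pcard(PCARD_JSON)], store)
    return store


spend = run_etl()
```

Note that the combined store now holds both "IBM" and "I.B.M." as separate suppliers, which is exactly why acquisition alone is not enough.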
Second, you need to find a way to put data in context -- and to make sure it's accurate in the first place. This is where the data cleansing and classification stages of spend analysis come into play (and these same steps are essential for big data analysis in general). An organization needs to be able to make sure, for example, that I.B.M. is the same in its system as IBM, International Business Machines, Emptoris (a division of IBM -- better update that one!), and the like. And then, of course, a system must correct for misspellings, typos and even, in the most advanced of cases, transliteration challenges (Arabic to English is notoriously difficult in this case, as there are multiple correct transliterations for many names, places, etc.). Of course you then need to apply context on top of all of this to sort through the data, and this is where classification schemas or taxonomies come into play (e.g., UNSPSC, NAICS).
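As a rough illustration of the cleansing and classification steps, the Python sketch below resolves supplier name variants to one canonical name and then assigns a spend category. It is a toy under stated assumptions, not how any particular spend analysis product works: the alias table, the keyword rules, the category labels and the 0.85 similarity cutoff are all hypothetical, and the categories are plain labels rather than real UNSPSC or NAICS codes. Fuzzy matching uses the standard library's `difflib`, which handles simple typos but not transliteration.

```python
import difflib
import re

# Hypothetical alias table: cleaned name variant -> canonical supplier.
ALIASES = {
    "ibm": "IBM",
    "international business machines": "IBM",
    "emptoris": "IBM",  # treated as part of IBM, per the example in the text
}

# Hypothetical keyword rules; a real system would map to a full taxonomy.
CATEGORY_KEYWORDS = {
    "laptop": "IT Hardware",
    "software": "Software",
    "consulting": "Professional Services",
}


def clean(name):
    """Lowercase, strip punctuation and collapse whitespace."""
    name = re.sub(r"[^a-z0-9 ]", "", name.lower())
    return re.sub(r"\s+", " ", name).strip()


def canonical_supplier(name, cutoff=0.85):
    """Return the canonical supplier for a raw name, or the name unchanged."""
    key = clean(name)
    if key in ALIASES:
        return ALIASES[key]
    # Fall back to fuzzy matching to absorb misspellings and typos.
    close = difflib.get_close_matches(key, list(ALIASES), n=1, cutoff=cutoff)
    return ALIASES[close[0]] if close else name.strip()


def classify(description):
    """Assign the first category whose keyword appears in the description."""
    text = description.lower()
    for keyword, category in CATEGORY_KEYWORDS.items():
        if keyword in text:
            return category
    return "Unclassified"


for raw in ["I.B.M.", "Internatonal Busines Machines", "Emptoris", "Acme Corp"]:
    print(raw, "->", canonical_supplier(raw))
```

With this alias table, the first three names resolve to IBM and "Acme Corp" passes through unchanged. Lowering the cutoff catches more typos at the cost of more false merges, which is the same trade-off a production cleansing engine has to tune.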
And we haven't even gotten into data enrichment, let alone the analytics itself (e.g., OLAP, dashboards, alerting, predictive analytics)! But hopefully, you get the main argument here -- if you understand the basics of spend analysis, then, well, there's not a whole lot more to the challenge of big data in procurement and supply chain, aside from the fact that there's more of it, it's coming from numerous external sources, and we've got to make sense of unstructured information as well as structured (which we haven't done before in procurement and supply chain as a general rule).