Big Data is a hot topic in business and industry, and for good reason. Over the past decade, the scope, detail and amount of data captured and stored by companies has skyrocketed. For perspective, in 2005 there were an estimated 30 billion gigabytes of data; by 2013, that figure was expected to have increased twentyfold ("Is Big Data an Economic Big Dud?," New York Times, August 17, 2013).
At my first job at a large transportation company, we stored data on a centralized mainframe database. To retrieve data, you had to write queries from your computer (something akin to coding), save the results as tables on the mainframe, then use another tool to open and analyze them. And if you wanted to use Excel for analysis, the data had better not have had more than 65,000 rows.
Nowadays, pretty much all of that can be done on a powerful PC. In addition to nearly unlimited storage, most large companies have spent the past 15 years implementing ERP systems that capture this data in very rich detail. And finally, while the storage and capture of data have been developing, powerful tools for analysis have advanced alongside. Excel is the most common tool of choice among users (especially supply chain users), and it can now handle much larger datasets than before, with extensive general capabilities.
But there are many other, more sophisticated and powerful tools used for analysis, such as SAS, Stata and IBM's SPSS. These tools can handle very large datasets and run various statistical tests at the click of a button; all that is required is expertise in how to use them.
All of this is a boon for analytically minded executives and managers. Given a large dataset and basic tools such as MS Access and Excel, a skilled analyst can answer just about any question one can think of pertaining to a particular dataset. And instead of taking days to complete, the analysis can be done in a matter of hours. This can be extended relatively easily to sophisticated statistical analyses and integrated into optimization models, which parallels the "descriptive, predictive, prescriptive" process commonly accepted across analytics (for a more detailed description, search "INFORMS" and "analytics").
Where is the Value?
The New York Times piece also reasonably questioned whether the value of Big Data is all that it is cracked up to be. Will it have the economic impact of, say, past industrial revolutions, the emergence of the Internet, or the oil and gas revolution? These are valid questions, and I won't address all of them, but there is a particular quote from Josh Marks, CEO of masFlight, that succinctly captures the challenge of getting value out of Big Data:
"The promises that are made around the ability to manipulate these very large data sets in real time are overselling what they can do today."
For now, most of the raw data flowing across the Web has limited economic value. Far more useful is specialized data in the hands of analysts with a deep understanding of specific industries.
To get value out of Big Data as a supply chain organization, you should ask yourself these questions:
- Do we have the organizational skills to manipulate and transform big data sets (e.g., database skills)?
- Have we captured the relevant data to address our fundamental problem? Is the data actually what we think it is? If not, what else do we need to collect?
- Do we know the appropriate statistical tests to answer our questions? Can we characterize uncertainty in our answers (to ensure we don't come to incorrect conclusions)?
- Can we integrate insights that we may gain into actions? Can we use them to make better decisions (perhaps through optimization models)?
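On the question of characterizing uncertainty, even a basic confidence interval goes a long way toward avoiding incorrect conclusions. Here is a minimal sketch using Python's standard library; the premium figures are entirely hypothetical and chosen only to illustrate the calculation.

```python
import math
import statistics

# Hypothetical spot-rate premiums over contract rates, in $/mile.
# These are made-up illustrative numbers, not real market data.
premiums = [0.12, 0.35, 0.28, 0.05, 0.41, 0.22, 0.18, 0.30]

mean = statistics.mean(premiums)
sd = statistics.stdev(premiums)
n = len(premiums)

# Rough 95% confidence interval for the mean premium (normal
# approximation; with a sample this small, a t-based interval
# would be more appropriate in practice).
half_width = 1.96 * sd / math.sqrt(n)
low, high = mean - half_width, mean + half_width
print(f"mean premium: {mean:.3f} $/mile, 95% CI: ({low:.3f}, {high:.3f})")
```

If the interval is wide, the honest conclusion is "we don't know yet," which is itself a valuable answer before committing to a decision.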
To summarize, the skills required to get value out of Big Data include expertise in databases, spreadsheets, statistical tools and possibly optimization tools. And last but certainly not least, industry expertise.
So it is quite clear why getting value out of Big Data is a challenge in supply chain management, particularly because supply chain organizations typically cannot afford teams of data scientists the way other organizations within a company can.
A Transportation Example
I recently went to a conference on Big Data. And while it was informative, most of the talks revolved around what data is now being captured, with scant evidence or examples of where it is used to make better, more profitable decisions. This must be the next step for Big Data in supply chain management, showing actual value instead of potential value. We will see more and more examples as time goes on.
As an example, I'll draw upon my work experience with a leader in the truckload transportation industry and consulting projects/research I've done with shippers that are users of the for-hire truckload industry.
In my opinion, there is a severe gap in publicly available real-time information in the truckload industry. There are a few sources of information, such as DAT or the CASS Truckload Index, that provide some insights. But I have not come across any that I consider comprehensive, robust, accurate, or representative of real-time local, regional, or national market conditions. As many readers will know, the truckload market can swing from excess capacity to excess demand rather swiftly, and back again, with accompanying swings in prices.

Take, for example, the CASS Index. It is calculated based on data from a broad array of shippers and carriers, so in that regard it is powerful. However, it is aggregated by month and nationally. So, while it might be useful for macro-level planning, it cannot be used for tactical decision-making. Capacity in local markets can get very tight while national capacity is loose. Or, available capacity and prices can change significantly within a month, detail that is lost in a monthly index.

So, unlike a commodity such as corn or oil, one cannot access a broad public market such as the Chicago Mercantile Exchange to get the price of a truck on a particular lane at a particular time. One must use private markets for this, such as 3PLs, which hold most of the information privately and to their advantage.
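The aggregation problem described above is easy to see with a toy calculation. The lanes, dates and rates below are entirely made up for illustration: a monthly, national number can look unremarkable even while one lane tightens sharply mid-month.

```python
# Hypothetical spot rates ($/mile) on two lanes at two points in a month.
daily_rates = {
    ("Chicago-Atlanta", 1): 1.80, ("Chicago-Atlanta", 15): 2.60,  # tightens
    ("Dallas-Memphis", 1): 2.00, ("Dallas-Memphis", 15): 1.90,    # loosens
}

# A monthly national index: one number, all lanes and days pooled.
monthly_national = sum(daily_rates.values()) / len(daily_rates)

# A lane-level view recovers the intra-month swing the index averages away.
chicago = [r for (lane, _), r in daily_rates.items() if lane == "Chicago-Atlanta"]
swing = max(chicago) - min(chicago)

print(f"monthly national index: {monthly_national:.2f} $/mile")
print(f"Chicago-Atlanta intra-month swing: {swing:.2f} $/mile")
```

The pooled index sits near the middle while one lane's rate moved by a far larger amount, which is exactly the detail a tactical decision-maker needs and a monthly national figure hides.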
It is debatable whether truckload capacity is a commodity such as corn or beef, and in my opinion it is somewhere between a commodity and a specialized service (although closer to a commodity). But leaving that discussion for another day, there is clearly a gap in information. So how can a company that collects this data use it to their advantage?
When I was with the national truckload carrier, we captured and calculated massive amounts of data on our private prices, freight volumes and profitability, among many other measures. Based on this, we could answer questions such as: What freight is the most profitable, considering interconnected loads? What is the real-time status of the market? Which customers are the most important (and hence worthy of a higher service level)? From this, we could make better operational decisions as to which freight to accept or reject, which freight to solicit and how to price spot market loads.
For instance, in down markets a carrier wants to utilize contracted rates as much as possible. However, in tight markets, which for reasons mentioned above are hard to ascertain unless you carefully capture and analyze information in real-time, you want to honor loyal customers while accessing as much spot freight as possible. The reason for this is that spot loads often have a significant price premium, which also depends on exactly how tight the market is. So this is a perfect example of utilizing Big Data to make better decisions using information that others might not know. As an indication of the value of this type of analysis, the company's stock price and market capitalization have increased several-fold since I was there, while thousands of other competitors have gone out of business.
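The accept/reject logic sketched above can be expressed as a simple rule of thumb. The function, its threshold and the tightness scale below are hypothetical simplifications of my own, not the carrier's actual method, but they capture the shape of the decision: honor loyal customers, and chase spot freight as the market tightens.

```python
def accept_load(load_type: str, loyal_customer: bool, market_tightness: float) -> bool:
    """Toy accept/reject rule for a truckload carrier.

    market_tightness: 0.0 (loose, excess capacity) to 1.0 (very tight).
    The 0.5 threshold is an arbitrary illustrative cutoff.
    """
    if load_type == "contract":
        # Always honor loyal customers; take other contract freight
        # mainly when capacity is loose and trucks need filling.
        return loyal_customer or market_tightness < 0.5
    if load_type == "spot":
        # Spot premiums grow with tightness, so favor spot freight
        # when the market tightens.
        return market_tightness >= 0.5
    return False

# In a tight market, non-loyal contract freight gives way to spot freight.
print(accept_load("contract", loyal_customer=False, market_tightness=0.8))  # False
print(accept_load("spot", loyal_customer=False, market_tightness=0.8))      # True
print(accept_load("contract", loyal_customer=True, market_tightness=0.8))   # True
```

The hard part in practice is not this rule but the input: estimating `market_tightness` in real time is precisely what the captured data and analysis make possible.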
So a natural extension takes the other perspective: How can a shipper utilize their transportation information to avoid tight markets, or leverage carriers and contractual rates better when these conditions occur? Can these markets be predicted and avoided, if at all possible?
I won't go into the details, but based on some recent work I've done, the short answer is that there is likely significant value to be attained. And that, in essence, is the value of Big Data. It enables better real-time decisions that are out of the immediate grasp of the human mind.
Alex Scott is a Ph.D. student in supply chain management at Penn State University. Prior to this he held several positions at IBM related to supply chain planning and execution. Scott is also a member of MH&L's Editorial Advisory Board.