Can the Cloud Improve Data Analytics?
Risa Savold, Technical Director, Public Sector at Infor, presented “Can the Cloud Improve Data Analytics?” during the NDTA-USTRANSCOM Fall Meeting on October 7, 2019, in St. Louis, Missouri.
Data is a strategic asset. Software applications may change over time, but even when that occurs, it is imperative to maintain control of your data. While the cloud can help enable that control, you must be thoughtful about how you use it.
Utilizing the cloud to store data starts with understanding data’s unique characteristics, which will allow you to recognize the challenges and advantages it presents. There are ten common attributes of data that begin with the letter V: Volume, velocity, variety, variability, veracity, validity, vulnerability, volatility, visualization, and value.
The sheer volume of data and the velocity at which it accumulates are major challenges to managing data. These challenges are compounded by the third V, variety. Variety describes the fact that data is produced, collected, and stored in many different types and formats, including structured and unstructured data.
There is a significant demand for storing data. Companies know there is value in their data but may not always know what that value is or what data will provide that value, so they opt to save all data in vast “data lakes.” Storing data this way necessitates scalable technology to store the data, as well as to transport the data at very high speeds.
“We already talked about having a ton of bytes that are adding up really quickly in lots of different data types and formats,” said Ms. Savold, summarizing the first three Vs. “Unfortunately, that means you’re going to have a lot of inconsistencies [variability]. That means you’re going to have less confidence and less trust [veracity]. We’re certainly always worried about whether that data is even accurate [validity]. Is it worth our time to analyze the security of that data [vulnerability]?”
Volatility is a different aspect of accuracy that describes the timeliness of the information. Visualization is important in helping people comprehend the data. The final V, value, may be the most important, as it looks at whether the information is relevant to your mission.
Understanding the value first will help with other challenges of data management: “[A]sking your business-driven questions, rather than data-driven questions, that’s going to help you reduce the volume of data that you have to worry about. From there, that’s going to reduce the amount of the variety and the variability in the data [and] you will have fewer inconsistencies to worry about. And so that will help you develop requirements, get that high-value process data requirements right first. And then you can go out to a lot of your subject matter experts, your stakeholders, and get the next level of requirements for what is the raw data that you actually need to collect.”