Oracle’s Enterprise Big Data Predictions 2016
Neil Mendelson, Vice President, Big Data Product Management, Oracle Corp
Companies big and small are finding new ways to capture and use more data. The push to make big data more mainstream will get stronger in 2016. Here are Oracle’s top 10 predictions:
- Data civilians operate more and more like data scientists. While complex statistics may still be limited to data scientists, data-driven decision-making shouldn’t be. In the coming year, simpler big data discovery tools will let business analysts shop for datasets in enterprise Hadoop clusters, reshape them into new mashup combinations, and even analyze them with exploratory machine learning techniques. Extending this kind of exploration to a broader audience will both improve self-service access to big data and provide richer hypotheses and experiments that drive the next level of innovation.
- Experimental data labs take off. With more hypotheses to investigate, professional data scientists will see increasing demand for their skills from established companies. For example, banks, insurers, and credit-rating firms will turn to algorithms to price risk and guard against fraud more effectively. But many such decisions are hard to migrate from clever judgments to clear rules. Expect a proliferation of experiments in default risk, policy underwriting, and fraud detection as firms try to identify hotspots for algorithmic advantage faster than the competition.
- DIY gives way to solutions. Early big data adopters had no choice but to build their own big data clusters and environments. But building, managing, and maintaining these unique systems built on Hadoop, Spark, and other emerging technologies is costly and time-consuming; the average build takes six months. Who can wait that long? In 2016, we'll see these technologies mature and become more mainstream thanks to cloud services and appliances with pre-configured automation and standardization.
- Data virtualization becomes a reality. Companies not only capture a greater variety of data, they use it in a greater variety of algorithms, analytics, and apps. But developers and analysts shouldn't have to know which data is where or get stuck with just the access methods that repository supports. Look for a shifting focus from using any single technology, such as NoSQL, Hadoop, relational, spatial, or graph, to increasing reliance on data virtualization. Users and applications connect to virtualized data via SQL, REST, and scripting languages. Successful data virtualization technology will offer performance equal to that of native methods, complete backward compatibility, and security.
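The idea above can be sketched in a few lines: callers issue one uniform query against a virtual catalog, and the catalog routes it to whichever backend actually holds the data. This is a toy illustration of the concept, not any vendor's API; all class and method names are hypothetical.

```python
# Toy sketch of data virtualization: one query interface, many backends.
# In a real system the adapters would wrap relational, NoSQL, or HDFS
# connectors; here a single in-memory adapter stands in for all of them.

class VirtualCatalog:
    def __init__(self):
        self._sources = {}  # dataset name -> backend adapter

    def register(self, name, adapter):
        self._sources[name] = adapter

    def query(self, name, predicate):
        # The caller never needs to know which store serves the dataset.
        return self._sources[name].scan(predicate)


class InMemoryAdapter:
    """Stands in for a relational/NoSQL/Hadoop connector."""
    def __init__(self, rows):
        self._rows = rows

    def scan(self, predicate):
        return [r for r in self._rows if predicate(r)]


catalog = VirtualCatalog()
catalog.register("orders", InMemoryAdapter([{"id": 1, "total": 40},
                                            {"id": 2, "total": 120}]))

# The same query call would work unchanged if "orders" lived in Hadoop.
print(catalog.query("orders", lambda r: r["total"] > 100))
# → [{'id': 2, 'total': 120}]
```

Swapping `InMemoryAdapter` for a different backend changes nothing on the caller's side, which is the backward-compatibility property the prediction describes.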
- Dataflow programming opens the floodgates. Initial waves of big data adoption focused on hand-coded data processing. New management tools will decouple and insulate the big data foundation technologies from higher-level data processing needs. We'll also see the emergence of dataflow programming, which takes advantage of extreme parallelism, provides simpler reusability of functional operators, and gives pluggable support for statistical and machine learning functions.
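To make the contrast with hand-coded processing concrete, here is a minimal sketch of the dataflow style: the job is declared as a chain of small, reusable functional operators, and the runtime (plain Python here; an engine like Spark or Flink in practice) decides how to execute and parallelize them. All names are illustrative, not a real framework API.

```python
# Minimal dataflow sketch: compose reusable operators into one pipeline.
from functools import reduce

def pipeline(*operators):
    """Compose operators; each maps an iterable of records to another."""
    def run(records):
        return reduce(lambda recs, op: op(recs), operators, records)
    return run

# Reusable, pluggable operators.
def parse(records):
    return (r.strip().split(",") for r in records)

def keep(predicate):
    return lambda records: (r for r in records if predicate(r))

def apply(fn):  # a statistical or ML scoring function can plug in here
    return lambda records: (fn(r) for r in records)

flow = pipeline(
    parse,
    keep(lambda r: r[1] != ""),           # drop records missing a value
    apply(lambda r: (r[0], float(r[1]))),  # convert to (key, score) pairs
)

print(list(flow(["a,1.5", "b,", "c,2.0"])))
# → [('a', 1.5), ('c', 2.0)]
```

Because each operator is a self-contained function over a stream of records, the same `keep` or `apply` step can be reused across pipelines, and an engine is free to run independent stages in parallel.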
- Big data gives AI something to think about. 2016 will be the year where Artificial Intelligence (AI) technologies such as Machine Learning (ML), Natural Language Processing (NLP) and Property Graphs (PG) are applied to ordinary data processing challenges. While ML, NLP and PG have already been accessible as API libraries in big data, the new shift will include widespread applications of these technologies in IT tools that support applications, real-time analytics and data science.
- Data swamps try provenance to clear things up. Data lineage used to be a nice-to-have capability because so much of the data feeding corporate dashboards came from trusted data warehouses. But in the big data era, data lineage is a must-have because customers are mashing up company data with third-party data sets. Some of these new combinations will incorporate high-quality, vendor-verified data. But others will use data that's not officially perfect, but good enough for prototyping. When surprisingly valuable findings come from these opportunistic explorations, managers will look to the lineage to know how much work is required to raise the data to production quality.
- IoT + Cloud = Big Data Killer App. Big data cloud services are the behind-the-scenes magic of the Internet of Things (IoT). Expanding cloud services will not only capture sensor data but also feed it into big data analytics and algorithms to make use of it. Highly secure IoT cloud services will also help manufacturers create new products that safely take action on the analyzed data without human intervention.
- Data politics drives hybrid cloud. Knowing where data comes from – not just what sensor or system, but from within which nation’s borders – will make it easier for governments to enforce national data policies. Multinational corporations moving to the cloud will be caught between competing interests. Increasingly, global companies will move to hybrid cloud deployments with machines in regional data centers that act like a local wisp of a larger cloud service, honoring both the drive for cost reduction and regulatory compliance.
- New security classification systems balance protection with access. Increasing consumer awareness of the ways data can be collected, shared, stored—and stolen—will amplify calls for regulatory protections of personal information. Expect to see politicians, academics, and columnists grappling with boundaries and ethics. Companies will increase use of classification systems that categorize documents and data into groups with pre-defined policies for access, redaction, and masking. The continuous threat of ever more sophisticated hackers will prompt companies to both tighten security and audit the access and use of data.
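The classification-with-policies idea can be sketched briefly: each field carries a sensitivity class, and a pre-defined policy table decides whether a given role sees the raw value, a masked version, or nothing at all. The classes, roles, and policy table below are invented for illustration only.

```python
# Sketch of classification-driven access, masking, and redaction.
# Policy: sensitivity class -> role -> action ("show" / "mask" / "redact").
POLICY = {
    "public":       {"analyst": "show",   "auditor": "show"},
    "confidential": {"analyst": "mask",   "auditor": "show"},
    "restricted":   {"analyst": "redact", "auditor": "mask"},
}

# Each field is classified once; policies then apply automatically.
CLASSIFICATION = {"name": "public", "email": "confidential", "ssn": "restricted"}

def mask(value):
    """Keep the first two characters, star out the rest."""
    return value[:2] + "*" * (len(value) - 2)

def apply_policy(record, role):
    out = {}
    for field, value in record.items():
        action = POLICY[CLASSIFICATION[field]][role]
        if action == "show":
            out[field] = value
        elif action == "mask":
            out[field] = mask(value)
        # "redact": the field is omitted entirely
    return out

row = {"name": "Ada", "email": "ada@example.com", "ssn": "123-45-6789"}
print(apply_policy(row, "analyst"))
# → {'name': 'Ada', 'email': 'ad*************'}
```

Centralizing the policy table is what makes auditing practical: access decisions are data, not scattered application logic, so both tightening and auditing happen in one place.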