In this Q&A, Matt Thomson, Director EMEA Product Specialists at Databricks, talks about the growing value of data.
He also discusses how to allay public concerns about data sharing – especially with the rise of open source and open standards.
1. Data is predicted to be the currency of the future. Do you agree with this claim?
Absolutely. I’ve always been fascinated by data – I’ve had several jobs helping clients develop their big data skills and move towards a much more data-driven mindset. And that’s because I believe in its worth, as we all should.
These days, data really underpins everything. It’s key to digital transformation and improving business performance – enabling teams to identify trends, opportunities and problems in business operations. Crucially, it’s no longer just the domain of data science or tech teams. Now it’s being used across the enterprise, and teams like HR, sales, and marketing are using data in increasingly sophisticated ways to make critical decisions.
How companies use data today can mean the difference between success and failure, and it allows them to remain competitive. It should be the goal of every organization to become more data-driven.
2. What is the impact of the pandemic on changing attitudes towards data sharing and AI tools?
The pandemic has changed things for almost every company out there. Digital transformation accelerated worldwide as demand for digital technology surged, putting pressure on technical teams. This prompted many data teams to use AI and ML tools to automate repeatable tasks and free up time for data innovation and problem solving to drive business performance. For some, AI and ML tools have been a real lifeline – enabling companies to make important decisions with their data while transforming their entire business models.
Crucially, AI also has a role to play in enabling organizations to ask questions about future scenarios. The shock of the pandemic has highlighted the importance of this, and of being prepared for the next crisis. The long-term use of AI allows companies to make more accurate predictions and anticipate potential problems – from staffing and skills shortages in the workforce to supply chain disruptions and predicting when critical equipment could fail. So its value in a post-COVID world is really huge.
COVID-19 has also made a strong case for data sharing tools. With the pressure to deliver services quickly even with remote teams, the ability to exchange information rapidly—whether with internal teams or external partners—became truly critical. Just look at governments, which have had to share data between multiple departments and even internationally to make quick decisions on how to respond to the crisis at all levels.
3. The public has concerns about access to data. Are they right to have concerns and what can be done to overcome them? How can companies ensure the data they use is secure?
I think that as data breaches become more prevalent, it’s natural for people to have concerns about how their data is being stored, used, and handled. This is especially true as hackers become more sophisticated. Organizations that handle customer data must try to stay ahead of the curve and ensure that governance and security are built into the core of their data and analytics platform. And individuals should expect organizations to hold themselves accountable.
There are also significant ethical considerations with regard to AI – for example, avoiding bias against certain groups. Businesses need to stay on top of AI ethics to maintain customer trust and ensure the decisions made by these models are fair and representative. I strongly believe that regulators have a role to play here in ensuring society stays ahead of the curve in this very fast-moving space, to avoid unintended consequences.
4. Databricks places great value on open source and open standards. Why is that?
Open source tools have become the de facto standard for building and deploying AI and machine learning applications at scale. That’s why at Databricks we place a strong emphasis on open standards – standards driven by a combination of research, community development and technology companies.
There is, of course, the cost benefit of this approach: open-source technology typically comes at little to no cost and has usually been vetted by experts within the ecosystem. That means teams are building on reliable, proven solutions—reducing risk down the line. The open-source approach also discourages teams from developing overly complex solutions in-house, saving unnecessary resources.
But there is much more to the open approach. A modern, open, and collaborative platform—like the Lakehouse—ensures teams are collaborating from a single source of truth, allowing for faster iteration and improvement. This helps organizations foster a truly data-driven culture, eliminating siloed systems and unreliable data, and setting the stage for AI and ML adoption.
Open source technology also offers complete transparency and insight into the source code, giving data teams a connection to the broader open source community. There is much to be said for peer-to-peer technology education, which serves as a limitless source of inspiration, troubleshooting help, and technical talent. Of course there are challenges – open technologies evolve quickly and need to be carefully managed to ensure compliance with security standards, for example. But their value far outweighs the disadvantages.