    Interview: Pure Storage on the AI data challenge beyond hardware

    By TechAiVerse, June 24, 2025 (7 min read)

    Successfully tackling artificial intelligence (AI) workloads is not just about throwing compute and storage resources at them. Sure, you need enough processing power, and storage that can supply it with data at the correct rate, but before any such operations can succeed, it’s critical to ensure the quality of the data used in AI training.

    That’s the core message from Par Botes, vice-president of AI infrastructure at Pure Storage, whom we caught up with last week at the company’s Accelerate event in Las Vegas.

    Botes emphasised the need for enterprises tackling AI to capture, organise, prepare and align data. That’s because data can often be incomplete or inappropriate to the questions AI tries to answer. 

    We talked to Botes about data engineering, data management, the use of data lakehouses and making sure datasets fit the need being addressed by AI. 

    What does Pure Storage view as the key upcoming or emerging storage challenges in AI? 

    I think it’s hard to create systems that solve problems using AI without having a really good way of organising data, capturing data, then preparing it and aligning it to the processing elements, the GPUs [graphics processing units], that make them access data fast enough. 

    What in particular makes those challenges difficult? 

    I’ll start with the most obvious one: how do I get GPUs to consume the data? The GPUs are incredibly powerful, and they drive a tremendous amount of bandwidth.

    It’s hard to feed GPUs with data at the pace we consume it. That is starting to increasingly become solved, particularly at the high end. But for a regular enterprise type of company, these are new types of systems and new types of skills they have to implement. 

    It’s not a hard problem on the science side, it’s a hard problem in operations, because these are not muscles that have existed in enterprise for a long time. 

    The next part of that problem is: How do I prepare my data? How do I gather it? How do I know where I have the correct data? How do I assess it? How do I track it? How do I apply lineage to it to see that this model is trained with this set of data? How do I know that it has a complete dataset? That’s a very hard problem. 
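
    One way to approach the lineage question above is to fingerprint the exact records a model was trained on and store that fingerprint next to the model version. The following is a minimal sketch, not something described by Pure Storage: the `LineageRecord` schema, the function names and the sample values are all assumptions made for illustration.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class LineageRecord:
    """Links one model version to the data it was trained on (hypothetical schema)."""
    model_version: str
    dataset_name: str
    dataset_fingerprint: str
    record_count: int


def fingerprint(records: list[dict]) -> str:
    """Hash records in a canonical order so identical data always gives the same ID."""
    canonical = sorted(json.dumps(r, sort_keys=True) for r in records)
    digest = hashlib.sha256()
    for line in canonical:
        digest.update(line.encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


def record_lineage(model_version: str, dataset_name: str, records: list[dict]) -> LineageRecord:
    """Build the lineage entry to store alongside the trained model."""
    return LineageRecord(model_version, dataset_name, fingerprint(records), len(records))


if __name__ == "__main__":
    # Illustrative rows only; real training data would come from your own pipeline.
    training_rows = [
        {"id": 1, "label": "left_turn"},
        {"id": 2, "label": "straight"},
    ]
    entry = record_lineage("perception-v1", "drives-2025-06", training_rows)
    print(json.dumps(asdict(entry), indent=2))
```

    Because the fingerprint ignores record order but changes whenever a record is added, removed or altered, two models with the same fingerprint were trained on the same set of data.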

    Is that a problem that varies between customer and workload? I can imagine one organisation might know, just from its in-house expertise, that it has all the data it needs. In another situation, it might be unclear whether it does or not.

    It’s pretty hard to know, without reasoning about [whether] you have all the data you need. I’ll give you an example.

    I spent many years building a self-driving car – perception networks, driving systems – but frequently, we found the car didn’t perform as well in some conditions.

    The road turned left and slightly uphill, with other cars around it. We then realised we didn’t have enough training data. So, having a principled way of reasoning about the data, reasoning about completeness, reasoning about the range [of data], and to have all the data for that, and analysing it mathematically, is not a discipline that’s super common outside of high-end training companies.
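
    To make the idea of reasoning about completeness concrete, here is a small sketch that counts training samples per combination of driving conditions and reports the combinations that are under-represented. The condition axes, field names and threshold are hypothetical, chosen only to mirror the example above; a real perception dataset would define its own.

```python
from collections import Counter
from itertools import product

# Hypothetical condition axes for illustration.
TURNS = ("left", "straight", "right")
GRADIENTS = ("downhill", "flat", "uphill")
TRAFFIC = ("clear", "busy")


def coverage_gaps(samples, min_count):
    """Return {condition_combo: count} for combos with fewer than min_count samples."""
    counts = Counter((s["turn"], s["gradient"], s["traffic"]) for s in samples)
    return {
        combo: counts[combo]
        for combo in product(TURNS, GRADIENTS, TRAFFIC)
        if counts[combo] < min_count
    }


if __name__ == "__main__":
    # Synthetic samples: plenty of easy driving, very little "left, uphill, busy".
    samples = (
        [{"turn": "straight", "gradient": "flat", "traffic": "clear"}] * 50
        + [{"turn": "left", "gradient": "uphill", "traffic": "busy"}] * 2
    )
    for combo, count in coverage_gaps(samples, min_count=10).items():
        print(combo, count)
```

    Enumerating every combination, rather than only the ones present in the data, is what surfaces conditions that were never recorded at all.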

    Having looked at the difficulties that tend to arise with AI workloads, how would you say customers can begin to mitigate them? 

    The general approach I recommend is to think about your data engineering processes. So, we partner with data engineering companies that do things like lakehouses.

    Think about: How do I apply a lakehouse to my incoming data? How do I use my lakehouse to clean it and prepare it? In some cases, maybe even transform it and make it ready for the training system. I will start by thinking about the data engineering discipline in my company and how do I prepare that to be ready for AI? 

    What does data engineering consist of if you drill down into it? 

    Data engineering generally consists of how do I get access to other datasets that can exist in corporate databases, in structured systems, or in other systems we have, and how do I get access to that? How do I ingest that into an intermediate form that I lakehouse? And how do I then transform that and select data from those sets that might be across different repositories to create a dataset that represents the data I want to train against.

    That’s the discipline we typically call data engineering. And it’s becoming a very distinct skill and a very distinct discipline. 
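
    The steps described above (access, ingest into an intermediate form, then transform and select) can be sketched in a few lines. This is an illustrative toy under stated assumptions, not a lakehouse implementation: the in-memory SQLite table stands in for a corporate database, the CSV text stands in for another system, and every table, column and function name is hypothetical.

```python
import csv
import io
import sqlite3


def ingest_from_database(conn):
    """Pull rows from a structured source into plain dicts (the intermediate form)."""
    cursor = conn.execute("SELECT ticket_id, customer, body FROM tickets")
    return [
        {"source": "crm_db", "id": str(row[0]), "customer": row[1], "text": row[2]}
        for row in cursor
    ]


def ingest_from_csv(csv_text):
    """Pull rows from a file-based source into the same intermediate form."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {"source": "chat_export", "id": r["chat_id"], "customer": r["user"], "text": r["message"]}
        for r in reader
    ]


def clean(record):
    """Normalise whitespace and customer names; return None for unusable records."""
    text = " ".join(record["text"].split())
    if not text:
        return None
    return {**record, "customer": record["customer"].strip().lower(), "text": text}


def build_training_set(*batches):
    """Merge batches, clean each record, and drop records with duplicate text."""
    seen, dataset = set(), []
    for batch in batches:
        for record in batch:
            cleaned = clean(record)
            if cleaned is None or cleaned["text"] in seen:
                continue
            seen.add(cleaned["text"])
            dataset.append(cleaned)
    return dataset


def demo():
    """Run the pipeline end to end on made-up data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tickets (ticket_id INTEGER, customer TEXT, body TEXT)")
    conn.executemany(
        "INSERT INTO tickets VALUES (?, ?, ?)",
        [(1, "Acme ", "Disk   latency is high"), (2, "Globex", "   ")],
    )
    csv_text = "chat_id,user,message\nc1,acme,Disk latency is high\nc2,Initech,Need more capacity\n"
    return build_training_set(ingest_from_database(conn), ingest_from_csv(csv_text))


if __name__ == "__main__":
    for row in demo():
        print(row)
```

    Keeping a `source` field on every record means the resulting dataset can still be traced back to the repository each row came from.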

    When it comes to storage, how do customers support data lakehouses with storage? In what forms?

    Today, what’s common is you have the cloud companies, which provide the data lakehouses, and for the on-prem, we have the system houses.

    We work with several of them. We provide complete solutions that include data lakehouse vendors. And we partner with those.

    And then, of course, the underlying storage that makes it perform fast and work well. And so the key components, I’d say, are the popular data lakehouse databases and the infrastructure beneath that, and then connect those over into other storage systems for the training side. 

    Looking at data engineering, is it really a one-time, one-off challenge, or is it something that’s ongoing as organisations tackle AI? 

    Data engineering is kind of hard to disentangle from storage. They’re not exactly the same thing, but they’re closely related. 

    Once you start using AI, you want to record all new data. You want to transform it and make it part of your AI system, whether you’re using that with RAG [retrieval augmented generation] or fine-tuning, or if you are advanced, you build your own model.

    You’re constantly going to increase it and make it better. As your data improves, as your insights change, your data has to change with it. Thus, your model has to evolve with it.

    This becomes a continuous process. 

    You have to think about a few things, such as lineage. What’s the history of this data? What originated from where? What’s consumed where? You want to think about, when people use your model or when you internally use your model. What’s the question being asked? What’s the question that comes up with it? 

    And you want to store and use that for quality assurance, also for further training in the future. This becomes what we call an AI flywheel of data. The data is constantly ingested, consumed, computed, ingested, consumed, computed.

    And that circle doesn’t stop. 

    Is there anything else you think customers ought to be looking at? 

    You should also think, what is this data really, what does the data represent? If this data represents something you observe or something you do, if you have gaps in the data, the AI will fill in those gaps. When it fills in those gaps wrongly, we call it hallucination.

    The trick is to know your data well enough that you know where there are gaps. And if you have gaps, can you find ways to fill out those gaps? When you get to that level of sophistication, you’re starting to have a really impressive system to use. 

    Even if you start with the very basics of using a cloud service, start by recording what you send and what you’re getting back. Because that forms the basis for your data management discipline. And when I use the term data engineering, in between data engineering and storage is this discipline called data management.

    This is the organisation of data, which you want to start as early as you can. Because by the time you get ready to do something beyond just using the service, you now have the first body of data for your data engineers and for your storage.

    That’s a tremendous insight that I wish everyone would consider doing really quickly. 
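
    The advice to record what you send and what you get back can be sketched as a thin wrapper around whatever function calls the AI service. This is a minimal illustration under assumptions: `fake_model` is a stand-in for a real service call, and the JSON Lines log format and field names are choices made for this example.

```python
import json
import time
from pathlib import Path


def recorded(call_model, log_path):
    """Wrap a model-calling function so each prompt/response pair is appended to a JSONL file."""
    log_file = Path(log_path)

    def wrapper(prompt):
        response = call_model(prompt)
        entry = {"timestamp": time.time(), "prompt": prompt, "response": response}
        with log_file.open("a", encoding="utf-8") as handle:
            handle.write(json.dumps(entry) + "\n")
        return response

    return wrapper


def fake_model(prompt):
    """Stand-in for a real cloud AI service call (hypothetical)."""
    return f"echo: {prompt}"


if __name__ == "__main__":
    ask = recorded(fake_model, "interactions.jsonl")
    print(ask("How full is the array?"))
```

    An append-only log of this kind is one simple form the first body of data could take, ready for later quality assurance or further training.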

    Jonathan is a tech enthusiast and the mind behind Tech AI Verse. With a passion for artificial intelligence, consumer tech, and emerging innovations, he delivers clear, insightful content to keep readers informed. From cutting-edge gadgets to AI advancements and cryptocurrency trends, Jonathan breaks down complex topics to make technology accessible to all.
