    Interview: Pure Storage on the AI data challenge beyond hardware

    By TechAiVerse | June 24, 2025 | 7 min read

    Successfully tackling artificial intelligence (AI) workloads is not just about throwing compute and storage resources at them. Sure, you need enough processing power and the storage to supply it with data at the correct rate, but before any such operations can succeed, it’s critical to ensure the quality of the data used in AI training.

    That’s the core message from Par Botes, vice-president of AI infrastructure at Pure Storage, whom we caught up with last week at the company’s Accelerate event in Las Vegas.

    Botes emphasised the need for enterprises tackling AI to capture, organise, prepare and align data. That’s because data can often be incomplete or inappropriate to the questions AI tries to answer. 

    We talked to Botes about data engineering, data management, the use of data lakehouses and making sure datasets fit the need being addressed by AI. 

    What does Pure Storage view as the key upcoming or emerging storage challenges in AI? 

    I think it’s hard to create systems that solve problems using AI without having a really good way of organising data, capturing data, then preparing it and aligning it to the processing elements, the GPUs [graphics processing units], so that they can access data fast enough.

    What in particular makes those challenges difficult? 

    I’ll start with the most obvious one: how do I get GPUs to consume the data? The GPUs are incredibly powerful, and they drive a tremendous amount of bandwidth.

    It’s hard to feed GPUs with data at the pace they consume it. That is increasingly being solved, particularly at the high end. But for a regular enterprise type of company, these are new types of systems and new types of skills they have to implement.

    It’s not a hard problem on the science side, it’s a hard problem in operations, because these are not muscles that have existed in enterprise for a long time. 

    The next part of that problem is: How do I prepare my data? How do I gather it? How do I know where I have the correct data? How do I assess it? How do I track it? How do I apply lineage to it to see that this model is trained with this set of data? How do I know that it has a complete dataset? That’s a very hard problem. 
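
    One way to picture the lineage question Botes raises ("this model is trained with this set of data") is a small record that ties a model version to a fingerprint of its training set. The sketch below is illustrative only: the names `fingerprint`, `LineageEntry`, `record_lineage` and `was_trained_on`, and the sample model and dataset names, are assumptions and not anything described by Pure Storage.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


def fingerprint(records: list[dict]) -> str:
    """Return a SHA-256 digest that changes whenever the dataset content changes."""
    digest = hashlib.sha256()
    # Serialise each record canonically, then sort, so record order does not matter.
    for line in sorted(json.dumps(r, sort_keys=True) for r in records):
        digest.update(line.encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


@dataclass(frozen=True)
class LineageEntry:
    """Hypothetical lineage record linking one model version to one dataset."""
    model_name: str
    model_version: str
    dataset_name: str
    dataset_sha256: str
    record_count: int


def record_lineage(model_name: str, model_version: str,
                   dataset_name: str, records: list[dict]) -> LineageEntry:
    """Capture which data a model version was trained on."""
    return LineageEntry(model_name, model_version, dataset_name,
                        fingerprint(records), len(records))


def was_trained_on(entry: LineageEntry, records: list[dict]) -> bool:
    """Check whether a candidate dataset matches the one recorded at training time."""
    return entry.dataset_sha256 == fingerprint(records)


# Made-up sample data for illustration.
train = [{"id": 1, "label": "left_turn"}, {"id": 2, "label": "uphill"}]
entry = record_lineage("perception-net", "0.1", "drives-2025", train)
print(json.dumps(asdict(entry), indent=2))
```

    Storing such an entry next to each trained model makes it possible to check later whether a dataset has drifted from the one the model actually saw.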

    Is that a problem that varies between customer and workload? Because I can imagine one might know, just by the expertise that resides within an organisation, that you have all the data you need. Or, in another situation, it might be unclear whether you do or not.

    It’s pretty hard to know, without reasoning about [whether] you have all the data you need. I’ll give you an example.

    I spent many years building a self-driving car – perception networks, driving systems – but frequently, we found the car didn’t perform as well in some conditions.

    [For example,] the road turned left and slightly uphill, with other cars around it. We then realised we didn’t have enough training data [for that situation]. So, having a principled way of reasoning about the data, reasoning about completeness, reasoning about the range [of data], having all the data for that, and analysing it mathematically, is not a discipline that’s super common outside of high-end training companies.
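
    Reasoning about completeness can start with something as simple as counting how many training samples exist for each combination of conditions. The following is a minimal sketch under assumed condition names (`turn`, `gradient`, `traffic`) and an assumed threshold; it is not the method Botes's team used, only an illustration of the idea.

```python
from collections import Counter
from itertools import product

# Hypothetical driving conditions; a real system would define its own axes.
CONDITIONS = {
    "turn": ["left", "straight", "right"],
    "gradient": ["downhill", "flat", "uphill"],
    "traffic": ["none", "some"],
}


def coverage_gaps(samples, conditions=CONDITIONS, min_count=2):
    """Return every condition combination with fewer than min_count samples."""
    keys = list(conditions)
    counts = Counter(tuple(s[k] for k in keys) for s in samples)
    gaps = {}
    for combo in product(*(conditions[k] for k in keys)):
        if counts[combo] < min_count:
            gaps[combo] = counts[combo]
    return gaps


# Made-up sample set: plenty of easy cases, few hard ones.
samples = (
    [{"turn": "straight", "gradient": "flat", "traffic": "none"}] * 3
    + [{"turn": "left", "gradient": "uphill", "traffic": "none"}] * 2
    + [{"turn": "left", "gradient": "uphill", "traffic": "some"}]
)

gaps = coverage_gaps(samples)
for combo, count in sorted(gaps.items()):
    print(combo, "has only", count, "sample(s)")
```

    In this toy data, the "left, uphill, with other cars" case shows up as under-represented, which mirrors the kind of gap described in the interview.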

    Having looked at the issues that tend to arise, the difficulties that can arise with AI workloads, how would you say that customers can begin to mitigate those? 

    The general approach I recommend is to think about your data engineering processes. So, we partner with data engineering companies that do things like lakehouses.

    Think about: How do I apply a lakehouse to my incoming data? How do I use my lakehouse to clean it and prepare it? In some cases, maybe even transform it and make it ready for the training system. I would start by thinking about the data engineering discipline in my company: how do I prepare that to be ready for AI?

    What does data engineering consist of if you drill down into it? 

    Data engineering generally consists of: How do I get access to datasets that can exist in corporate databases, in structured systems, or in other systems we have? How do I ingest that into an intermediate form in a lakehouse? And how do I then transform that and select data from those sets, which might be across different repositories, to create a dataset that represents the data I want to train against?

    That’s the discipline we typically call data engineering. And it’s becoming a very distinct skill and a very distinct discipline. 
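
    The steps Botes lists (access sources, ingest into an intermediate form, transform, select) can be sketched in a few lines. This example uses only the Python standard library, with an in-memory SQLite table standing in for a corporate database and a CSV string standing in for a second repository. The table name, field names and cleaning rules are all assumptions made for illustration; a real lakehouse pipeline would look different.

```python
import csv
import io
import sqlite3


def ingest_from_database(conn):
    """Pull rows from a (hypothetical) tickets table into a common record shape."""
    rows = conn.execute("SELECT id, text, label FROM tickets ORDER BY id").fetchall()
    return [{"source": "db", "id": str(r[0]), "text": r[1], "label": r[2]} for r in rows]


def ingest_from_csv(csv_text):
    """Pull rows from a CSV export into the same record shape."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{"source": "csv", "id": row["id"], "text": row["text"], "label": row["label"]}
            for row in reader]


def clean(records):
    """Normalise whitespace and labels, drop empty rows and duplicate texts."""
    cleaned, seen = [], set()
    for r in records:
        text = " ".join(r["text"].split())
        label = r["label"].strip().lower()
        if not text or not label or text.lower() in seen:
            continue
        seen.add(text.lower())
        cleaned.append({**r, "text": text, "label": label})
    return cleaned


def select(records, wanted_labels):
    """Keep only the records relevant to the training task."""
    return [r for r in records if r["label"] in wanted_labels]


# Made-up sources for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (id INTEGER, text TEXT, label TEXT)")
conn.executemany("INSERT INTO tickets VALUES (?, ?, ?)",
                 [(1, "Disk  is full", "Storage"), (2, "Cannot log in", "auth")])
csv_text = "id,text,label\na1,disk is full,storage\na2,Slow reads on array,storage\na3,,storage\n"

staged = ingest_from_database(conn) + ingest_from_csv(csv_text)
training_set = select(clean(staged), {"storage"})
for record in training_set:
    print(record)
```

    The intermediate list `staged` plays the role of the common form; the duplicate and the empty row are dropped during cleaning, and only records matching the task are selected.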

    When it comes to storage, how do customers support data lakehouses with storage? In what forms?

    Today, what’s common is you have the cloud companies, which provide the data lakehouses, and for the on-prem, we have the system houses.

    We work with several of them. We provide complete solutions that include data lakehouse vendors. And we partner with those.

    And then, of course, the underlying storage that makes it perform fast and work well. And so the key components, I’d say, are the popular data lakehouse databases and the infrastructure beneath that, and then connect those over into other storage systems for the training side. 

    Looking at data engineering, is it really a one-time, one-off challenge, or is it something that’s ongoing as organisations tackle AI? 

    Data engineering is kind of hard to disentangle from storage. They’re not exactly the same thing, but they’re closely related. 

    Once you start using AI, you want to record all new data. You want to transform it and make it part of your AI system, whether you’re using that with RAG [retrieval augmented generation] or fine-tuning, or if you are advanced, you build your own model.

    You’re constantly going to increase it and make it better. As your data improves, as your insights change, your data has to change with it. Thus, your model has to evolve with it.

    This becomes a continuous process. 

    You have to think about a few things, such as lineage. What’s the history of this data? What originated from where? What’s consumed where? You also want to think about what happens when people use your model, or when you use it internally: What’s the question being asked? What’s the question that comes up with it?

    And you want to store and use that for quality assurance, also for further training in the future. This becomes what we call an AI flywheel of data. The data is constantly ingested, consumed, computed, ingested, consumed, computed.

    And that circle doesn’t stop. 

    Is there anything else you think customers ought to be looking at? 

    You should also think, what is this data really, what does the data represent? If this data represents something you observe or something you do, if you have gaps in the data, the AI will fill in those gaps. When it fills in those gaps wrongly, we call it hallucination.

    The trick is to know your data well enough that you know where there are gaps. And if you have gaps, can you find ways to fill out those gaps? When you get to that level of sophistication, you’re starting to have a really impressive system to use. 

    Even if you start with the very basics of using a cloud service, start by recording what you send and what you’re getting back. Because that forms the basis for your data management discipline. And when I use the term data engineering, in between data engineering and storage is this discipline called data management.

    This is the organisation of data, which you want to start as early as you can. Because by the time you get ready to do something beyond just using the service, you now have the first body of data for your data engineers and for your storage.

    That’s a tremendous insight that I wish everyone would consider doing really quickly. 
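
    The advice to start "by recording what you send and what you’re getting back" can be sketched as a thin wrapper around whatever call is made to the AI service. In the example below, `send` is a placeholder for any client function; `fake_service`, `logged_call`, `load_log` and the JSON Lines file layout are assumptions for illustration, not part of any particular vendor's API.

```python
import json
import tempfile
import time
import uuid
from pathlib import Path
from typing import Callable


def logged_call(send: Callable[[str], str], prompt: str, log_path: Path) -> str:
    """Call the service and append the prompt and response to a JSON Lines log."""
    response = send(prompt)
    entry = {
        "id": str(uuid.uuid4()),
        "timestamp": time.time(),
        "prompt": prompt,
        "response": response,
    }
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
    return response


def load_log(log_path: Path) -> list[dict]:
    """Read the recorded interactions back, e.g. for review or later training."""
    with log_path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def fake_service(prompt: str) -> str:
    """Stand-in for a real cloud AI call; it just upper-cases the prompt."""
    return prompt.upper()


with tempfile.TemporaryDirectory() as tmp:
    log_file = Path(tmp) / "interactions.jsonl"
    logged_call(fake_service, "how full is array 7?", log_file)
    logged_call(fake_service, "list failed drives", log_file)
    history = load_log(log_file)

print(len(history), "interactions recorded")
```

    An append-only log like this is about the simplest possible starting point; it gives data engineers a first body of prompts and responses to work with once the organisation moves beyond just consuming the service.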
