    Dutch researcher’s AI breakthrough tackles the structured data paradox

    By TechAiVerse, January 17, 2026

    Organisations sit on vast quantities of structured data in relational databases and spreadsheets. It’s organised and searchable, yet when it comes to extracting insights, we barely scratch the surface.

    “We don’t know what we don’t know,” says Madelon Hulsebos, researcher at the Dutch Centrum Wiskunde & Informatica (CWI), the national research institute for mathematics and computer science in the Netherlands.

    Hulsebos began her career as a data scientist, and noticed that highly paid specialists repeatedly performed the same manual tasks: cleaning tables, extracting features and linking datasets.

    During her PhD at the University of Amsterdam and postdoctoral research at the University of California, Berkeley, she developed “table representation learning” – enabling artificial intelligence (AI) to understand what tables mean rather than simply searching them. She now leads the Table Representation Learning Lab at CWI, working on this challenge with three PhD students, two postdocs and six master’s students.

    “As a data scientist, I experienced how incredibly difficult and frustrating it is to find relevant datasets, for instance, to train machine learning models,” says Hulsebos.

    Much of the data exists but sits scattered or buried deep in large, complex tables.

    Using funding including an NWO AiNed Fellowship Grant – a National Growth Fund programme to attract and retain top AI researchers at Dutch universities and research institutes – she established the CWI lab with the goal of democratising insights from structured data. “The aim is essentially that, based on questions people have – business users, analysts – we can automatically retrieve the relevant data across different systems and provide answers,” says Hulsebos.

    Information to insight

    The project for which Hulsebos received the grant is called DataLibra, which runs from 2024 to 2029. Over those five years, the researcher and her team aim not only to gain insights, but also to build concrete tools that organisations can use to extract more value from their data.

    “It should be as simple to query data within your organisation as it is to perform a Google search,” she says. “AI can play a major role here because it enables the use of natural language instead of requiring people to have knowledge of programming, business intelligence and relational databases.”

    That AI can play a role here seems contradictory. For years, AI has been positioned as the solution for unstructured data such as text, images and video, while structured data in tables was supposedly easy to search. But the problem isn’t the structure itself, says Hulsebos, but its diversity.

    Each system uses different column names and logic, causing traditional methods such as SQL and pattern matching to fall short. “You need to understand what columns mean, not just what they’re called,” she adds. “And that’s where machine learning excels, because it can generalise and understand context.”
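
The gap between what columns are called and what they mean can be illustrated with a small sketch. It is not the lab's method: value overlap is a simple stand-in for learned table representations, and the table names, column names and threshold are all hypothetical. It only shows that matching on names finds nothing where matching on content finds the link.

```python
def name_matches(cols_a, cols_b):
    """Pairs of columns whose names are identical (all that naive matching sees)."""
    return [(a, b) for a in cols_a for b in cols_b if a == b]


def jaccard(values_a, values_b):
    """Overlap between two columns' distinct values, from 0.0 to 1.0."""
    set_a, set_b = set(values_a), set(values_b)
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def content_matches(table_a, table_b, threshold=0.5):
    """Pairs of columns whose values overlap enough, whatever they are called."""
    return [
        (a, b)
        for a, vals_a in table_a.items()
        for b, vals_b in table_b.items()
        if jaccard(vals_a, vals_b) >= threshold
    ]


# Two hypothetical systems storing the same customers under different names.
crm = {
    "cust_id": [101, 102, 103, 104],
    "region": ["north", "south", "north", "east"],
}
billing = {
    "client_number": [102, 103, 104, 105],
    "amount_eur": [250, 90, 410, 75],
}

by_name = name_matches(crm, billing)        # no shared column names
by_content = content_matches(crm, billing)  # cust_id and client_number overlap
print(by_name, by_content)
```

A learned representation would go further than this heuristic, for instance by recognising that two columns describe the same concept even when their values do not overlap at all.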

    Retrieving the right dataset is only the beginning. “We call that information retrieval, but we want to move towards insight retrieval,” says Hulsebos. “Once you’ve found the relevant tables, you often still need to combine, link or process them before you can extract an insight.”

    That makes the challenge more complex than simple searching. At the same time, she emphasises that full automation isn’t the goal. “Nobody can simply trust an insight,” she says. “You must always be able to explain why an answer is the right answer for that specific question. Transparency and iteration are crucial in that regard.”

    Automating data science

    When asked how table representation learning differs from traditional business intelligence, Hulsebos responds: “Data scientists do more than traditional BI [business intelligence] tasks such as reports and dashboards; they also train machine learning models. Our goal is also to develop tools to automate repetitive, everyday tasks such as data cleaning, validation or data transformation.”

    It’s often said that data science is 80% data work and 20% modelling. “We want to automate that 80% as much as possible, so data scientists can focus on the other part where they think about critical aspects of problems, such as ethical questions,” she says.

    Beyond that, Hulsebos wants to give all non-data scientists more capabilities. “And this does indeed touch on business intelligence, but at present, it still takes considerable time and money to do it yourself, because you still need someone who builds dashboards and understands what the real insight need is,” she says.

    “But often the person with a problem doesn’t see which data might help. And the person who manages the data doesn’t understand the problem. That gap is the issue. By ensuring that relational databases can be queried in plain language without requiring knowledge of SQL or underlying data structures, you can already generate far more insights.”

    Many software suppliers currently claim to have such AI features in their products, but Hulsebos remains unimpressed. “It’s very easy to build something that doesn’t necessarily always work well,” she says. “There are plenty of fancy demos of agentic data scientists or analysts, but I’ve examined the benchmarks and the success rate is often zero. It all sounds wonderful, but to actually get there, we still have much work to do.”

    Hulsebos emphasises the importance of robustness and transparency in systems. “You can ask an LLM [large language model] a question and it will always provide an answer, but it must also be able to convince you that it’s the right answer,” she says. “That transparency and context are necessary for adoption.”

    Context determines data sensitivity

    Precisely that transparency and context proved crucial in a project Hulsebos recently conducted for the United Nations (UN). It illustrates not only why existing tools fall short, but also what’s needed to make table representation learning work in practice.

    The collaboration came about when Hulsebos, once on the academic path, approached the Humanitarian Data Centre. “The humanitarian aid aspect really drives me,” she says. “I saw that from my position I could achieve societal impact by collaborating with the UN on scientific research questions.”

    The first joint project focused on detecting sensitive data, a challenge that directly connects to her earlier Massachusetts Institute of Technology research into what tables mean. The Humanitarian Data Centre facilitates local organisations in providing aid during conflicts, natural disasters and other crises. Via their Humanitarian Data Exchange platform, these organisations share datasets that others can use for planning and coordination.

    “The problem is that much of that data comes from conflict zones and contains extremely sensitive information,” says Hulsebos. “But what’s sensitive here differs fundamentally from what many current systems classify as ‘sensitive’. They typically focus on personal data such as names and addresses, but here we look further, namely at data that can be dangerous in a specific context. Consider, for example, detailed coordinates of hospitals in conflict zones. Those could enable new attacks. You want to filter out such datasets before they become publicly accessible.”

    Together with master’s student Liang Telkamp, Hulsebos developed two mechanisms to tackle this. The first mechanism incorporates the full data context in its reasoning, dramatically reducing false positives. “Existing tools detect an address and conclude it’s sensitive,” she says. “But a company address may be perfectly public – not sensitive. You need to look at the context in which something is mentioned, not just the data type.”

    The second mechanism – “retrieve then detect” – links datasets to relevant policies and protocols applicable at that moment. “When a conflict breaks out somewhere, what’s sensitive changes,” says Hulsebos. “Your system must be able to retrieve that new context and incorporate it into its assessment.”

    That dynamic approach proves essential. A dataset about hospitals in the Netherlands requires a different assessment than the same data from Gaza. “It’s not only situational, but also time-dependent,” she says. “Information that wasn’t sensitive five years ago might suddenly be so now. You must be able to reason about the context in which data is used.”
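
The “retrieve then detect” idea can be sketched in a few lines. This is a toy under stated assumptions, not the system built for the UN: the article describes LLMs reasoning over long information sharing protocols, whereas here the policies are hard-coded rules and detection is plain substring matching on column names. The `Policy` class, the region labels and the column names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Policy:
    region: str              # where the rule applies; "*" means everywhere
    sensitive_terms: tuple   # column-name fragments the rule treats as sensitive
    reason: str              # explanation shown to the reviewer


# Hypothetical policy store standing in for real protocol documents.
POLICIES = [
    Policy("region_x", ("latitude", "longitude"),
           "Facility locations could enable attacks during an active conflict."),
    Policy("*", ("patient_name",),
           "Personal data is sensitive in every context."),
]


def retrieve(region):
    """Step 1: fetch the policies that currently apply to this region."""
    return [p for p in POLICIES if p.region in (region, "*")]


def detect(columns, policies):
    """Step 2: flag columns that a retrieved policy marks as sensitive."""
    findings = []
    for column in columns:
        for policy in policies:
            if any(term in column.lower() for term in policy.sensitive_terms):
                findings.append((column, policy.reason))
    return findings


def assess(columns, region):
    """Retrieve the context first, then detect against it."""
    return detect(columns, retrieve(region))


hospital_columns = ["hospital_name", "latitude", "longitude", "beds"]
peaceful = assess(hospital_columns, "region_y")  # no conflict-specific rule applies
conflict = assess(hospital_columns, "region_x")  # coordinates are flagged
print(peaceful, conflict)
```

Because the policies are looked up at assessment time, adding or changing a rule when a situation changes alters the outcome for the same dataset without touching the detection step, and each finding carries the reason it was flagged.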

    The results demonstrate that the approach works, particularly for detecting personal information, but the system also proves valuable for situationally sensitive data. “The Quality Assessment Officers at the UN found the contextualised explanations from the LLMs enormously useful,” says Hulsebos. “Those information sharing protocols are extremely long documents. That the system extracts the relevant rules and explains why something is sensitive was already highly insightful for them.”

    Telkamp’s work – she now works at the UN on the integration – was recently awarded the Amsterdam AI Thesis Award, partly due to its societal impact.

    Making data insights more broadly accessible

    The UN project illustrates a specific problem, but the underlying challenge – how to make data accessible and comprehensible – plays out in every organisation. Understanding data sensitivities in an organisation’s context is always useful, says Hulsebos. Moreover, it’s important to realise that LLMs are trained on all sorts of datasets scraped from the internet, including data sharing portals.

    “It’s so important to ensure that no sensitive data ends up on those portals, because once it’s in those models’ training data, it doesn’t come out,” she says.

    But organisations also fail to fully utilise the data they collect. “We don’t know what we don’t know,” says Hulsebos. “People ask questions about things they already know the data exists for. But how many insights are you missing because you don’t know certain data even exists? Or because you don’t know which datasets you should combine to get an answer?”

    She therefore wants to make visible what people don’t yet know about their data and make access to data and insights more broadly available in organisations. “For a CEO, it’s extremely useful when everyone within their organisation has direct access to insights that help them make important decisions,” says Hulsebos.

    She describes first having to mobilise the data science or business intelligence department as “a barrier for someone in sales, logistics or finance to quickly ask an important question”.

    “By the time a BI dashboard or SQL query is delivered, the insight is no longer relevant,” says Hulsebos.

    That requires AI-powered systems that democratise insights from structured data, enabling people to act and decide directly. “Speed to insight is the key factor,” she adds.

    Concrete solutions for business are in development. One of her PhD students is building tools to automate the retrieval aspect and support SQL generation. “We’re making all those tools available as open source,” says Hulsebos. “We’re trying to make things genuinely usable, not just publish them. Within the next two months, first versions will be available.”

    One example is DataScout, a tool she developed during her time at the University of California, Berkeley. The system helps users find datasets based on their task or problem, rather than keywords. “Task-based search with LLMs that think proactively proves enormously useful,” says Hulsebos.

    In user studies, DataScout proved faster and more effective than traditional data platforms with keyword search. “As a data scientist, it could easily take two weeks to a month before you’d gathered the right data for a machine learning model,” she says.

    That such systems aren’t yet standard in data platforms, even though they could save weeks of search work, still surprises Hulsebos. “The goal is that everyone in an organisation – from CEO to sales staff – can ask questions of their data directly,” she says. “Without intermediaries, without waiting time.”

    Jonathan is a tech enthusiast and the mind behind Tech AI Verse. With a passion for artificial intelligence, consumer tech, and emerging innovations, he delivers clear, insightful content to keep readers informed. From cutting-edge gadgets to AI advancements and cryptocurrency trends, Jonathan breaks down complex topics to make technology accessible to all.
