A lot of population numbers are fake
Here’s the story of a remarkable scandal from a few years ago.
In the South Pacific, just north of Australia, there is a small, impoverished, and remote country called Papua New Guinea. It’s a country that I’ve always found absolutely fascinating. If there’s any outpost of true remoteness in the world, I think it’s either in the outer mountains of Afghanistan, in the deepest jungles of central Africa, or in the highlands of Papua New Guinea. (PNG, we call it.) Here’s my favorite fact: Papua New Guinea, with about 0.1 percent of the world’s population, hosts more than 10 percent of the world’s languages. Two villages, separated perhaps only by a few miles, will speak languages that are not mutually intelligible. And if you go into rural PNG, far into rural PNG, you’ll find yourself in places that time forgot.
But here’s a question about Papua New Guinea: how many people live there?
The answer should be pretty simple. National governments are supposed to provide annual estimates for their populations. And the PNG government does just that. In 2022, it said that there were 9.4 million people in Papua New Guinea. So 9.4 million people was the official number.
But how did the PNG government reach that number?
The PNG government conducts a census about every ten years. When the PNG government provided its 2022 estimate, the previous census had been done in 2011. But that census was a disaster, and the PNG government didn’t consider its own findings credible. So the PNG government took the 2000 census, which found that the country had 5.5 million people, and worked off of that one. So the 2022 population estimate was an extrapolation from the 2000 census, and the number that the PNG government arrived at was 9.4 million.
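The text does not say which method the government used for this extrapolation. As a rough illustration only, assuming constant compound annual growth between the 2000 census figure and the 2022 estimate, the implied growth rate can be backed out as follows. The function names are invented for this sketch, and the alternative rates in the loop are purely illustrative.

```python
def implied_annual_growth(start_pop, end_pop, years):
    """Constant annual growth rate that carries start_pop to end_pop."""
    return (end_pop / start_pop) ** (1 / years) - 1


def extrapolate(start_pop, annual_growth, years):
    """Project a population forward at a constant annual growth rate."""
    return start_pop * (1 + annual_growth) ** years


# Figures from the text: 5.5 million in the 2000 census, 9.4 million claimed for 2022.
rate = implied_annual_growth(5.5e6, 9.4e6, 2022 - 2000)
print(f"Implied growth: {rate:.2%} per year")

# Small changes in the assumed rate compound over 22 years (illustrative rates only).
for assumed in (0.020, 0.025, 0.030):
    projected = extrapolate(5.5e6, assumed, 22)
    print(f"{assumed:.1%} per year -> {projected / 1e6:.1f} million in 2022")
```

Under that assumption the implied rate is roughly 2.5 percent a year, and a half-point change in the assumed rate moves the 2022 figure by about a million people. That sensitivity is one reason an extrapolation from a 22-year-old census is a hazy guess.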
But this, even the PNG government would admit, was a hazy guess.
About 80 percent of people in Papua New Guinea live in the countryside. And this is not a countryside of flat plains and paved roads: PNG is a country of mountain highlands and remote islands. Many places, probably most places, don’t have roads leading to them; and the roads that do exist are almost never paved. People speak different languages and have little trust in the central government, which simply isn’t a force in most of the country. So traveling across PNG is extraordinarily treacherous. It’s not a country where you can send people to survey the countryside with much ease. And so the PNG government really had no idea how many people lived in the country.
Late in 2022, word leaked of a report that the UN had commissioned. The report found that PNG’s population was not 9.4 million people, as the government maintained, but closer to 17 million people—roughly double the official number. Researchers had used satellite imagery and household surveys to find that the population in rural areas had been dramatically undercounted.
This was a huge embarrassment for the PNG government. It suggested, first of all, that they were completely incompetent and had no idea what was going on in the country that they claimed to govern. And it also meant that all the per-capita economic statistics about PNG, which presented a fairly happy picture, were badly overstated. Papua New Guinea had been ranked as a “lower-middle income” country, along with India and Egypt; but if the report was correct then it was simply a “low-income” country, like Afghanistan or Mali. Any economic progress that the government could have cited was instantly wiped away.
But it wasn’t as though the government could point to census figures of its own. So the country’s prime minister had to admit that he didn’t know what the population was: he didn’t know, he said, whether the population was “17 million, or 13 million, or 10 million.” It basically didn’t matter, he said, because no matter what the population was, “I cannot adequately educate, provide health cover, build infrastructures and create the enabling law and order environment” for the country’s people to succeed.
But in the end, the PNG government won out. To preserve its dignity, it issued a gag order on the report, which has still never been released. There was some obscure behind-the-scenes bureaucratic wrangling, and in 2023 the UN shelved the report and agreed with the PNG government’s existing estimate. And so today, PNG officially has approximately 10 million people, perfectly in line with what had been estimated before.
The truth, of course, is that we have no idea how many people live in Papua New Guinea.
Last week, someone calling themselves Bonesaw went viral on Twitter for a post that claimed that China’s population numbers were entirely fake. China, they said, had been lying about its population for decades: it actually had only about 500 million people. In fact practically every non-Western country had been lying about its population. India’s numbers were also badly exaggerated: the idea that there are 1.5 billion Indians was absurd. The true population of the world, Bonesaw said, was significantly less than 1 billion people.
This is obviously an extremely stupid idea. It’s possible that Chinese population numbers are mildly exaggerated, but the most credible estimates—the ones advanced by Yi Fuxian—are that the exaggeration is on the order of a few percentage points. (It’s also worth noting that no reputable source has yet backed Yi Fuxian’s theory.) Actually faking the existence of billions of people would require a global conspiracy orders of magnitude more complex than anything in human history. Tens or hundreds of thousands of people, spread across every country in the world, would have to be in on it. Local, regional, and national governments would all have to be involved; also the UN, the World Bank, the IMF, every satellite company, every NGO that does work in any of these places. Every election would have to be fake. Every government database would have to be full of fake names. And all for what? To get one over on the dumb Westerners?
So we can dismiss Bonesaw’s claim pretty easily. But, as much as I hate to admit it, their argument does have a kernel of truth. And that kernel of truth is this: we simply have no idea how many people live in many of the world’s countries.
This is not the case for most countries, of course. In wealthy countries, like Germany or Japan or Sweden, populations are generally trusting and bureaucracies are generally capable. Sweden, for its part, keeps such an accurate daily count of births and deaths that it no longer even needs to conduct a census. And population numbers are also not so much of a problem in countries like China, India, or Vietnam. These places might be poorer, but they have strong central governments with a strong interest in knowing what’s going on inside the country. Population counts might be slightly overstated in these places because fertility is falling faster than expected (which could be the case in a country like India where fertility rates are falling quickly), or because local officials are exaggerating the number of students in their schools to secure more education subsidies (that’s Yi Fuxian’s theory of population counts in China), or because more people have emigrated than expected (as was the case in Paraguay when a census revealed its population to be smaller than officials expected). But if the state is in full control of a country, it will want to know what’s going on inside that country; and that starts with the simple fact of knowing how many people live there.
But “the state being in full control of a country” is not a criterion that holds in much of the world. Which brings us to Nigeria.
Nigeria is a huge place. Officially, it’s a nation of 240 million people, which would make it the most populous country in Africa and the sixth most populous country in the world. And without a doubt, there are a lot of people in Nigeria. But we actually have no idea how many there are.
Like PNG, Nigeria is supposed to conduct a census every 10 years. But in Nigeria, the census is a politically fraught thing. Nigeria is not a natural polity, and its ongoing unity as a single country is fragile. And so Nigerian elites expend enormous effort to ensure that Nigeria remains one country. They have two important tools at their disposal. The first is the relative representation of different regions in the Nigerian state. And the second is the distribution of Nigeria’s vast oil revenues. Both of these—how many seats a state is given in the Nigerian parliament, and how large a share of oil revenues it receives—are determined by its share of the population.
So local elites have a strong incentive to exaggerate the number of people in their region, in order to secure more oil revenue, while national elites have a strong incentive to balance populations across states in order to maintain the precarious balance of power between different regions. And so the overwhelming bias in Nigerian population counts is toward extremely blatant fraud.
It’s long been the case that censuses in Nigeria are shoddy affairs. When Nigeria was a colony of Britain, its censuses were limited to Lagos, a few townships, and a small number of villages: so the 1931 census for Nigeria yielded numbers that were too low by as much as 75 percent. Once Nigeria became independent, in 1960, the bias swung from underestimation to overestimation. Nigeria’s first census as an independent state came in 1962, and it immediately caused a political problem: the ruling regime was dominated by northern elites, but the census found that southern Nigeria had more people. And so another census was ordered the next year, which conveniently found an extra eight million people in the north. This pattern of brazenly false numbers continued for decades. The next census, in 1973, was such an obvious fraud that the government opted not even to publish the results. For eighteen years after that there was no attempt even to conduct a census. The next census, in 1991, was by far the most credible, and it shocked many people by finding that the population was about 30 percent smaller than estimated. But even that one was riddled with fraud. Many states reported that every single household had exactly nine people.
In 2006, Nigeria tried once again to count its population. And as luck would have it, it found that since the last census each state’s proportion of the national population had remained exactly the same: so there was no need to change the composition of the Nigerian parliament or the distribution of oil revenues. But this census was an extremely rocky affair. The city of Lagos, for instance, rejected the results of the census, which it claimed undercounted its population in order to preserve northern power; so it conducted its own (technically illegal) census and found that it had eight million more people than the national census had reckoned. The census was also accompanied by a good deal of violence: about ten people were killed in clashes, usually in regions with separatist activity. The whole experience was so difficult that Nigeria has opted not to repeat it. The 2006 census was the last time Nigeria tried to count how many people live in the country.
So the Nigerian government’s figure of 240 million people is, as is the case in Papua New Guinea, an extrapolation from a long-ago census figure. Is it credible? Very few people think so. Even the head of Nigeria’s population commission doesn’t believe that the 2006 census was trustworthy, and indeed said that “no census has been credible in Nigeria since 1816.” (Nigeria’s president fired him shortly thereafter.) There are plenty of reasons to think that Nigeria’s population might be overstated. It would explain, for instance, why in so many ways there appear to be tens of millions of missing Nigerians: why so few Nigerians have registered for national identification numbers, or why Nigerian voter turnout is so much lower than voter turnout in nearby African nations (typically in the 20s or low 30s, compared to the 50s or 60s for Ghana, Cameroon, or Burkina Faso), or why SIM card registration is so low, or why Nigerian fertility rates have apparently been dropping so much faster than demographers expected.
None of this evidence is conclusive, of course. (There are credible third parties—like the Against Malaria Foundation—that believe that Nigerian population counts might actually be understated.) But the crucial thing, as in Papua New Guinea, is that we don’t know how many people live in Nigeria. It might be that there are 240 million Nigerians, as the Nigerian government claims; or that there are 260 million Nigerians; or that there are only 180 million. We don’t know. But we have plenty of reason to think that the official numbers have little relationship to reality.
What about other countries?
Nigeria is not the only poor country with an extremely patchy history of censuses. Indeed we find that countless poor nations with weak states have only the vaguest idea how many people they govern. The Democratic Republic of the Congo, which by most estimates has the fourth-largest population in Africa, has not conducted a census since 1984. Neither South Sudan nor Eritrea, two of the newest states in Africa (one created in 2011 and the other in 1991), has conducted a census in its entire history as an independent state. Afghanistan has not had one since 1979; Chad since 1991; Somalia since 1975.
The various bodies that interest themselves in national populations, from the World Bank to the CIA, reliably publish population numbers for each of these countries. But without grounding in trustworthy census data, we simply have no idea if the numbers are real or not. Estimates for Eritrea’s population vary by a factor of two. Afghanistan could have anywhere between 38 and 50 million people. Estimates for the DRC’s 2020 population range from 73 million to 104 million. How did the country reach its official number for that year, 94.9 million? We have no idea. “It is unclear how the DRC national statistical office derived its estimate,” the U.S. Census Bureau said, “as there is no information in its 2020 statistical yearbook.”
Many other countries do conduct more regular censuses, but do a terrible job of it. Enumerators are hired cheaply and do a bad job, or they quit halfway through, or they go unpaid and just refuse to submit their data. An unknowable number simply submit fake numbers. These are not, after all, technical experts or trained professionals; they are random people sent into remote places, often with extremely poor infrastructure, and charged with determining how many people live there. It is exceptionally difficult to do that and come out with an accurate answer.
So even those countries that do conduct regular or semi-regular censuses often arrive at inaccurate results. The most recent South African census, for instance, undercounted the population by as much as 31 percent—and that is one of the wealthier and better-run nations in Africa. In poorer and less functional countries, statistical capacity is often just nonexistent. Take, for instance, the testimony of the former director of Sudan’s statistical bureau, who said that the most accurate census in Sudan’s history was conducted in 1956, when the country was still under British rule.
It shouldn’t be new to anyone that population data in the poor world is bad. We’ve known about these problems for a long time. And for an equally long time, we’ve had a preferred solution in mind. Technology can compensate for the deterioration of human coordination: we have satellites.
Satellites have two great benefits for counting populations. First, satellites can see pretty much any part of the world from space, and so you entirely obviate the logistical problem of sending people into remote areas: all you need is a small count of some portion of the area under study, which you can use to ground your estimates in something like reality. And second, you don’t have to rely on local governments to obtain the data—so you can get away from the bad incentives of, say, Nigerian elites.
But satellite data can only tell us so much. A satellite can look at a house, but it can’t determine whether three people live there, or six people, or eight people. And often the problem is worse than that. Sometimes a satellite can’t tell what’s a building and what’s a feature of the landscape. Dense cities are a problem; and so, by the way, are jungles—satellites can’t penetrate thick forest cover, and there are quite a few people around the world who still live in forests. (The “forest people” of central Africa, for instance, or a few million of the Adivasi in India.)
So guessing population numbers from high-resolution satellite imagery is an extraordinarily difficult problem. The various companies that guess population numbers from satellite imagery, working with groups like the World Health Organization that might be interested in mapping, say, malaria cases, take different approaches to tackling this problem. And those approaches can lead to wildly different results. For example: Meta and WorldPop both used satellite imagery to estimate the population of the city of Bauchi, in northeastern Nigeria. Meta uses a deep learning model to detect individual buildings in images and then distributes population proportionally across those structures, while WorldPop feeds a machine-learning model with dozens of variables (land cover, elevation, road networks, and so on) and uses that to predict population. The numbers they reached were entirely different. Meta says that Bauchi has 127,000 children under the age of five; WorldPop says that it has 254,000, twice as many, so Meta’s estimate is 50 percent lower than WorldPop’s. We see similar differences in other regions. Meta says that Ganjuwa, also in northeastern Nigeria, has 76,000 children under the age of five; WorldPop says that it has 162,000, more than twice as many.
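The size of these gaps is easy to check from the four figures quoted above. This is a minimal sketch; `percent_lower` is a helper invented for the illustration and is not part of either group's tooling.

```python
def percent_lower(smaller, larger):
    """How far `smaller` falls below `larger`, as a fraction of `larger`."""
    return (larger - smaller) / larger


# Under-five population estimates quoted in the text: (Meta, WorldPop).
estimates = {
    "Bauchi": (127_000, 254_000),
    "Ganjuwa": (76_000, 162_000),
}

for place, (meta, worldpop) in estimates.items():
    gap = percent_lower(meta, worldpop)
    ratio = worldpop / meta
    print(f"{place}: Meta is {gap:.0%} below WorldPop (WorldPop is {ratio:.2f}x Meta)")
```

For Bauchi the gap is 50 percent; for Ganjuwa it is about 53 percent. Neither number says which estimate is closer to the truth, only how far apart the two methods land.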
And when we do have ground-truth data, we tend to find that satellite-based estimates hold up poorly against it. Last year, three Finnish scientists published a study in Nature Communications looking at satellite-based population estimates for rural areas that were cleared for the construction of dams. This was a useful test for the satellite data, because in resettling the people of those areas local officials were required to count the local population in a careful way (since resettlement counts determine compensation payments), and those counts could be compared to the satellite estimates. And again and again, the Finnish scientists found that the satellite data badly undercounted the number of people who lived in these areas. The European Commission’s GHS-POP satellite tool undercounted populations by 84 percent; WorldPop, the best performer, still underestimated rural populations by 53 percent. The pattern held worldwide, with particularly large discrepancies in China, Brazil, Australia, Poland, and Colombia. Nor is it just rural areas being resettled: WorldPop and Meta estimated slums in Nigeria and Kenya to be a third of their actual size.
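Undercount percentages can understate how large the miss is, so it helps to turn them into multipliers. The sketch below assumes that "undercounted by 84 percent" means the estimate captured only 16 percent of the ground-truth count; that reading is an interpretation of the figures above, and the function name is invented for the illustration.

```python
def truth_as_multiple_of_estimate(undercount):
    """If an estimate misses `undercount` (a fraction) of the true population,
    return how many times larger the truth is than the estimate."""
    if not 0 <= undercount < 1:
        raise ValueError("undercount must be a fraction in [0, 1)")
    return 1 / (1 - undercount)


# Undercount rates quoted in the text; 2/3 corresponds to "a third of actual size".
for undercount in (0.84, 0.53, 2 / 3):
    multiple = truth_as_multiple_of_estimate(undercount)
    print(f"{undercount:.0%} undercount -> truth is {multiple:.2f}x the estimate")
```

On that reading, an 84 percent undercount means the real population is more than six times the satellite figure, and even the 53 percent case means it is more than double.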
(All of this, by the way, is a good reason to think that the report that the UN commissioned on Papua New Guinea’s population is probably inaccurate. And indeed, when the PNG government conducted a new census in 2024, its results broadly supported its own numbers. But we are not out of the woods yet: that census was also riddled with accusations of severe undercounting. So again we must return to the central fact: we just don’t know how many people live in Papua New Guinea.)
So satellite data is not a panacea. It might be that in the future the tools advance to the point where they can produce reliable estimates of human populations in areas of arbitrary size. But we are not really close to that point.
Where does that leave us?
I don’t think there’s any reason to embrace the sort of idiotic conspiracism of Bonesaw. We simply have no reason to think that the number of people in the world is dramatically different from what official estimates indicate; while there are specific cases where the numbers might be dramatically off, there’s no reason to think that this is the case for every country, or that the errors all run in the same direction. There are places, like perhaps Nigeria, where population counts are probably too high, and there are many places, like perhaps Papua New Guinea, where they are probably too low. The only thing that can be said with any reliability is that we simply don’t know how many people live in these countries.
Given that we don’t have much evidence of a systematic bias in population counts—Nigeria might overcount, but Sudan might undercount, and at scale these differences should cancel out—the best we can do is assume that there is a sort of “law of large numbers” for population counts: the more units we have under consideration, the more closely the numbers should hew to reality. So population counts for individual countries, particularly in Africa, are probably badly inaccurate. It wouldn’t be surprising if the total population for Africa is off-base by some amount. But we don’t have much reason to think that the global population is very different from what we believe it to be.
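The cancelling-out intuition can be illustrated with a toy simulation. It assumes that each country's count is off by an independent error with no systematic bias, which is exactly the assumption the argument rests on and not something the data establishes. The populations and error sizes are invented for the sketch.

```python
import random


def simulate(n_countries, max_error=0.3, seed=0):
    """Compare typical per-country error with the error of the summed total.

    Each hypothetical country gets a true population and a reported count
    that is off by a random factor in [-max_error, +max_error], drawn
    independently and with no systematic bias (the key assumption).
    """
    rng = random.Random(seed)
    true_pops = [rng.uniform(1e6, 50e6) for _ in range(n_countries)]
    reported = [p * (1 + rng.uniform(-max_error, max_error)) for p in true_pops]

    per_country = [abs(r - p) / p for r, p in zip(reported, true_pops)]
    mean_country_error = sum(per_country) / n_countries
    total_error = abs(sum(reported) - sum(true_pops)) / sum(true_pops)
    return mean_country_error, total_error


for n in (5, 50, 500):
    country_err, total_err = simulate(n)
    print(f"{n:>3} countries: typical country error {country_err:.1%}, "
          f"error in total {total_err:.1%}")
```

When the errors are independent, the error in the total tends to shrink as more countries are added, even though each individual count stays badly off. If the errors mostly ran in one direction, they would not cancel, and the total would be off by roughly as much as the typical country.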
But it’s good to be reminded that we know a lot less about the world than we think. Much of our thinking about the world runs on a statistical edifice of extraordinary complexity, in which raw numbers (population counts, but also many others) are only the most basic inputs. Thinking about the actual construction of these numbers is important, because it encourages us to have a healthy degree of epistemic humility about the world.
