The Bantu Expansion is unique among ancient dispersals of peoples and languages due to its combination of high amplitude, rapid pace, and adaptation to multiple ecozones (1). The spread of Bantu-speaking people from a homeland region on the border between Nigeria and Cameroon toward eastern and southern Africa starting ~4000 years ago had a momentous impact on the continent’s linguistic, demographic, and cultural landscape (1–3). The ~550 Bantu languages spoken today constitute Africa’s largest language family (1), and the genetic diversity of Bantu-speaking communities throughout sub-Saharan Africa is still characterized by a dominant ancestral West African component (4, 5). In the Congo rainforest, the intensification of settlements from the first millennium BCE with pottery and large refuse pits, and with evidence of cultivation, husbandry, and metallurgy later on, reflects the development of a more sedentary lifestyle contrasting with that of earlier forest inhabitants. The spread of this unprecedented material culture is generally viewed as the archeological backdrop of the area’s penetration by the first Bantu speakers (6).
Despite the emergence of substantially more data pertinent to the Bantu Expansion from diverse disciplines, the debate continues on the driving forces behind the large-scale dispersal of Bantu-speaking communities and their interaction with regional landscapes, especially during the earliest phases. Strongly opposing viewpoints persist on whether the Bantu Expansion was facilitated by climate-induced vegetation change (7–12) or instead fostered deforestation through slash-and-burn farming (13, 14). The vigor of this debate [for recent developments, see, e.g., (15)] highlights the daunting challenge of integrating all linguistic, archeological, paleoecological, and genetic evidence accumulated to date (16, 17). One persistent question is to what extent present-day data, particularly linguistic evidence, still mirror the initial Bantu Expansion. It is commonly thought that genealogical relationships among present-day languages reflect the original spreading of Bantu-speaking communities across the continent (2, 3). However, studies of present-day human Y-chromosomal DNA suggest that multiple expansion phases occurred in succession, erasing any founder effect and masking the genetic signature of the initial migration (18, 19). Yet, to date, no attempt has been made to detect evidence of such “spread-over-spread” processes (17) through comprehensive, supraregional analysis of well-dated archeological records.
Summed probability distributions (SPDs) of radiocarbon (14C) dates collected in archeological contexts are increasingly used to trace changes in human activity through time (20–24). Large collections of such 14C dates have previously been used to reconstruct the precise migration routes of early Bantu speakers in sub-Saharan Africa (25) or the human impact on their ecological resource base as a potential trigger for the initial migration (14). A particularly interesting application of SPDs involves the assessment of possible demographic fluctuations through time (26). However, such analyses are vulnerable to critique that the inference of demographic booms and busts may be affected by sampling biases (27) or a complicated relationship between the 14C data and actual human population size (28). To circumvent these issues, we here use an integrated multiproxy approach that combines a comprehensive analysis of all available archeological 14C dates as a proxy for human activity as well as potential demographic fluctuation (22, 23) with a comprehensive analysis of the diversity and distribution of pottery styles as a proxy for socioeconomic development. For this analysis, we considered all archeological data associated with the spreading of pottery-producing communities (~2000 BCE to ~1900 CE) throughout the Congo rainforest, the first ecozone where ancestral Bantu-speaking migrants settled (regions A to H in Fig. 1) after leaving their homeland. We also considered similar datasets from the southern Congo basin woodlands (regions I and J) and Bioko Island in the Gulf of Guinea (region K), two areas adjacent to the present-day forest. Last, we compared the results of this combined analysis with the subset of available human genetic data (5) relating directly to the settlement history of the Congo rainforest.
We first compiled an inventory of 1444 14C dates in Central Africa (data S1) and ran them through a transparent classification system (table S1) to assess their reliability and archeological relevance. Following this classification, we retained 1149 dates considered to be reliable, found within an archeological context, and associated with pottery-producing communities (class I). The 295 other dates were excluded because they are either irrelevant to the present study (class II) (29) or unreliable (class III) (30, 31). Then, a recently refined statistical approach (23) in R was used to construct and analyze SPDs of the class I 14C dates, both for the entire Congo rainforest (regions A to H combined) and the adjacent areas (regions I to K) and for the eight defined regions within the Congo rainforest separately. We also ran a sensitivity test to evaluate the robustness of our main result against variation in the classification and selection of 14C data. We refer to Materials and Methods for details.
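The principle behind an SPD can be illustrated with a minimal Python sketch. It is not the R workflow used in this study (23): it skips calibration entirely and simply treats each uncalibrated age as a normal distribution on a common time axis. The three dates, the grid, and the function names are hypothetical and serve only as illustration.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution evaluated at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def summed_probability(dates, grid):
    """Sum the per-date densities at every point of the grid.

    dates: list of (age_bp, error) pairs (hypothetical, uncalibrated).
    grid:  list of years BP at which the sum is evaluated.
    """
    return [sum(normal_pdf(t, age, err) for age, err in dates) for t in grid]

# Hypothetical dates (age BP, 1-sigma error); not taken from data S1.
dates = [(2100, 40), (2150, 50), (900, 30)]
step = 10
grid = list(range(0, 3001, step))  # 0 to 3000 BP in 10-year steps
spd = summed_probability(dates, grid)

# Each date contributes unit probability mass, so the area under the
# unnormalized SPD approximates the number of dates.
area = sum(spd) * step
print(round(area, 2))
```

In this simplified form, two dates that overlap in time reinforce each other and produce a higher peak than an isolated date, which is the property that makes SPDs usable as a proxy for the intensity of human activity.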
Second, we compiled a dataset of all well-described styles of ancient pottery from the study area (data S2). Pottery is the most prominent and identifiable type of archeological artifact found in Central Africa. The shape of vessels and the technique and patterns of their decorations have been used to cluster pottery fragments from one or multiple sites into groups with the same set of stylistic characteristics. Irrespective of analytical and terminological differences among academics, such “pottery groups” are usually considered to comprise pottery that was produced during a specific period within a given territory by communities that shared a set of common characteristics for this aspect of material culture. Examination of 152 research publications (data S3) allowed us to discriminate 115 pottery groups within our study area. Occurrences of 72 of these could be linked directly to one or more (up to 26) 14C dates (fig. S1); the 43 others were dated indirectly by comparison with the directly dated groups (fig. S2). The interconnected datasets of screened 14C dates and pottery groups cover a total of 800 sites (Fig. 1 and data S4). R scripts to reproduce the data displayed in the figures and tables of this paper are available on Zenodo (https://zenodo.org/record/4394894).
Bimodal temporal pattern of 14C ages across the Congo rainforest
The SPD of all screened archeological 14C dates associated with pottery remains (class I dates) in the Congo rainforest displays a bimodal temporal pattern, with the large majority of 14C dates concentrated in two periods of increased human activity (Fig. 2A). To constrain the timing of these periods, we compared the observed SPD with four different models of hypothetical population growth drawn from our dataset of screened archeological 14C dates (fig. S3) (22–24). This allowed delineation of four successive periods during which the observed SPD either exceeds or falls short of the population growth trends predicted by the models (blue and red shading, respectively, in Fig. 2A and fig. S3). Overall, the timing of these periods is remarkably consistent among the four tested null models. However, comparison of our 14C data with the uniform model infers a shorter period of less intense human activity toward the end of the first millennium CE and a longer period of significantly more intense human activity in the past ~500 years (fig. S3). The logistic growth model is probably most pertinent in the context of the Bantu Expansion, which is often presented as a large-scale and exceptionally rapid process followed by a continuous presence of Bantu-speaking people after the initial expansion (2, 3, 32, 33). Our analysis (Fig. 2A) shows that the SPD of Congo rainforest 14C dates deviates from this hypothetical pathway. Instead, it shows clear periods of significantly higher (blue) and lower (red) degrees of human activity than predicted.
In addition, the strongly positive and negative rates of change in observed SPD that precede or follow the inferred periods of more intense human activity define phases of pronounced demographic expansion (“boom”) and collapse (“bust”), respectively (22, 34–38). Thus, our analysis provides a precise delineation of the periods commonly recognized as the Early Iron Age and Late Iron Age in Central Africa (27, 32, 39), each of which comprises a phase of expansion followed by a period of high human activity (Fig. 2A). The SPD of all screened archeological 14C dates from within the Congo rainforest (n = 1075) indicates that the Early Iron Age ended with a collapse between ~400 and 600 CE, followed by a period of markedly low human activity between ~600 and 1000 CE, during which the SPD falls short of all four tested hypothetical growth models (Fig. 2A and fig. S3). Only 61 sites scattered across the Congo rainforest have class I 14C dates from the period 600 to 1000 CE (fig. S4), suggesting that these represent remnant communities.
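As a rough illustration of how rates of change in an SPD can be turned into boom and bust phases, the following Python sketch applies a finite difference and a fixed threshold. This is a simplification for illustration and not the statistical procedure of the cited studies (22, 34–38); the SPD values, the 200-year step, and the threshold are all hypothetical.

```python
def rate_of_change(values, step):
    """Finite-difference rate of change between consecutive SPD values."""
    return [(b - a) / step for a, b in zip(values, values[1:])]

def label_phases(rates, threshold):
    """Label each interval as 'boom', 'bust', or 'stable'.

    threshold is an arbitrary, hypothetical cutoff on the absolute rate.
    """
    labels = []
    for r in rates:
        if r > threshold:
            labels.append("boom")
        elif r < -threshold:
            labels.append("bust")
        else:
            labels.append("stable")
    return labels

# Hypothetical SPD values sampled every 200 years.
spd_values = [0.10, 0.30, 0.32, 0.08, 0.07]
rates = rate_of_change(spd_values, step=200)
phases = label_phases(rates, threshold=0.0005)
print(phases)
```

In this toy series, the steep rise is labeled a boom, the steep fall a bust, and the plateaus before and after the fall are labeled stable.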
The combined SPD of archeological 14C dates from the three outlying regions (I to K; n = 74) contrasts with those from within the Congo rainforest by being limited to the last 2000 years and by displaying a largely unimodal temporal pattern (Fig. 2B). The inferred peak human activity in these regions seems to be situated between 900 and 1200 CE, i.e., largely coeval with the period of low human activity in the rainforest (Fig. 2A).
Repetition of a two-phase settlement pattern in pottery-style distributions
Analysis of the temporal distribution of the 115 recognized pottery groups in the Congo rainforest shows a marked repetition of the same two-phase settlement pattern during both the Early Iron Age and Late Iron Age (Fig. 3). Each expansion phase (~1000 BCE to 0 CE and ~1000 to 1400 CE) is characterized by an initially low but increasing number of pottery groups (Fig. 3A), which are stylistically homogeneous across large numbers of sites (Fig. 3B) and consequently tend to have a large distribution area (Fig. 3C). In contrast, the subsequent periods of high activity (~0 to 400 CE and ~1400 to 1900 CE) are characterized by a large number of pottery groups (Fig. 3A), of which many are found at only a few sites (Fig. 3B) and consequently have a smaller mean area of distribution (Fig. 3C), indicative of a regionalization process.
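The two quantities contrasted here, the number of contemporaneous pottery groups and the number of sites per group, can be tallied as in the following Python sketch. The group names, date ranges, and site counts are invented for illustration and do not come from data S2. Negative years stand for BCE, a simplifying convention assumed here.

```python
def active_groups(groups, year):
    """Return the groups whose date range includes the given year."""
    return [g for g in groups if g["start"] <= year < g["end"]]

def summarize(groups, years):
    """Map each year to (number of active groups, mean sites per group)."""
    summary = {}
    for year in years:
        active = active_groups(groups, year)
        n = len(active)
        mean_sites = sum(g["sites"] for g in active) / n if n else 0.0
        summary[year] = (n, mean_sites)
    return summary

# Hypothetical pottery groups: one widespread early group, three later local ones.
groups = [
    {"name": "group_a", "start": -400, "end": -100, "sites": 50},
    {"name": "group_b", "start": 0, "end": 400, "sites": 4},
    {"name": "group_c", "start": 100, "end": 400, "sites": 2},
    {"name": "group_d", "start": 50, "end": 300, "sites": 3},
]
summary = summarize(groups, years=[-200, 200])
print(summary)
```

With these invented values, the earlier time slice has a single group spread over many sites, whereas the later slice has more groups with few sites each, mirroring the expansion-then-regionalization contrast described above.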
The high number of pottery groups that characterize the first regionalization phase quickly drops during the two centuries following the Early Iron Age, eventually dwindling to an absolute minimum of only three groups around 800 CE: Ilambi, Nandá, and Spaced Curvilinear (Fig. 3A and fig. S1A). This decline in the number of pottery groups (Fig. 3A) is fully coeval with the period of significantly declining activity observed in the SPD analysis (400 to 600 CE; Fig. 2). Furthermore, the subsequent period of very low numbers of pottery groups is largely coeval with the period of low human activity, i.e., the inferred demographic bust (600 to 1000 CE). Besides the low number of pottery groups, it is characterized by relatively few sites per group (particularly 700 to 1000 CE; Fig. 3B) and smaller distribution areas (already from 500 CE; Fig. 3C).
In addition, pottery styles prevailing during the Late Iron Age are always substantially different from those of the Early Iron Age. In the inner Congo basin (region F), the first expansion period is characterized by the dominance of the well-known Imbonga group between 400 and 100 BCE (32, 40), which spread over a vast area within less than 200 years (Fig. 3D and fig. S1A). Over the subsequent 500 years, it morphed into multiple distinct styles (e.g., Monkoto, Lokondola, and Yete), whose unique characteristics in terms of vessel shapes and decorations all disappeared during the collapse period (fig. S1A). Then, from ~1000 CE, a previously unknown widespread pottery group emerged (Bondongo), which, in turn, split into regional specializations (e.g., Bolondo, Malelembe, and Besongo) within the next century. A similar pattern is found in Gabon, where the widespread Okala pottery (500 to 0 BCE) coincides with the Early Iron Age expansion period (fig. S1A). During the subsequent regionalization period (0 to 400 CE), this widespread pottery group fell apart into the Okanda, Oveng, and Otoumbi groups, each showing only very local distributions.
Regional patterns of archeological 14C ages and pottery styles
The SPDs of available class I 14C dates for each of the eight distinct regions within the Congo rainforest are displayed in fig. S5. We find that in regions with ~200 or more 14C dates (A to C: southern Cameroon, Gabon, and the Lower Congo), region-specific SPDs infer the same boom-and-bust pattern in human activity as does the cross-regional SPD (Fig. 2). In particular, the collapse (~400 to 600 CE) and subsequent period of low human activity (especially between 800 and 1000 CE) are prominent in regions A to C, although the timing of the most rapid decline varies among them. Regions A and B experienced two early peaks in human activity (~700 BCE and 400 to 200 BCE) followed by a temporary setback between ~150 BCE and 0 CE that was not as momentous as the later collapse (fig. S5). In region C, rapid expansion started only ~400 BCE, but the subsequent period of peak activity lasted longer than in regions A and B (until ~50 BCE; fig. S5).
SPDs for the northern and western Congo basin (regions D and E, both with ~100 14C dates) show marked differences with the cross-regional SPD. While they also display a bimodal temporal pattern, they lack a significant volume of dates during the expansion phase of the Early Iron Age (800 BCE to 0 CE; fig. S5). Human activity in region D started to accelerate only after 0 CE, broadly coincident with the temporary setback in region C and post-setback expansion in regions A and B. The peak Early Iron Age activity in regions C and E (250 to 450 CE) coincided with a temporary setback in region D. Together, these regional archeological records suggest that strong connectivity existed between the pottery-producing communities of different rainforest areas during the Early Iron Age (6). Specifically, the broad long-term similarity between the SPDs of regions A to C suggests long-term connectivity among the communities inhabiting Lower Guinean forests. On the other hand, major and concerted migrations may also have occurred within the rainforest during the Early Iron Age, from more densely populated regions (A and B) toward adjacent, previously unsettled regions (first C, later also D and E). These migrations may have temporarily depleted populations in regions A to C, as indicated by the SPD minima coeval with SPD maxima elsewhere (fig. S5).
Expanding this argument to the pottery evidence, the onset of activity and demographic growth in the western Congo basin (region E) may have been stimulated by the success of settlers from the inner Congo basin (region F) who, between 400 and 100 BCE, had made Imbonga one of the most ubiquitous pottery groups of the Early Iron Age, found at 58 sites so far (Fig. 3, fig. S1, and data S2 and S4). Pikunda-Munda, the earliest well-attested pottery in the western Congo basin, shows substantial technological similarities (41), and some similarities in decoration techniques and patterns, to contemporaneous pottery found in the inner Congo basin (Lusako, Lokondola, Lingonda, and Bokuma), although vessel shapes are different (32). This suggests that some communities in the former region may have emerged from the latter. In contrast, the oldest known pottery in the northern Congo basin (region D), Batalimo-Maluba, is technologically and stylistically entirely different from all pottery groups found further south (40), suggesting largely independent demographic development resulting from local prosperity instead.
The pottery evidence from the studied regions adjacent to the Congo rainforest (fig. S1B) is generally consistent with the corresponding SPD of archeological 14C dates (Fig. 2B). Particularly in the woodlands of the Upemba depression in Democratic Republic of the Congo (DRC) (region J, Katanga), the period of elevated human activity between 700 and 1300 CE is well covered by five pottery groups. However, no distinct pottery styles have thus far been found in the southern Congo basin and northern Angola (regions H and I), and the current sample size from these outlying regions is too low overall to draw final conclusions. In this context, an important final observation from the regional analysis is that none of the separate SPDs from the eight regions within the Congo rainforest infers relative population densities during the collapse period significantly higher than those inferred from the corresponding cross-regional SPD. On the contrary, the SPD for southern Cameroon (region A) infers even lower activity than for the Congo rainforest as a whole, reaching near-zero values for about a century ~900 to 1000 CE (fig. S5). This ~100% decline in summed 14C probability being recorded in the best-sampled region of our study area confirms the overall significance of the population collapse that occurred in the Congo rainforest around 1000 years ago.
Reconstructing demographic patterns with genetic evidence
To test the paleodemographic fluctuations implied by the combined archeological data, we estimated changes of effective population size (Ne) over the last 130 generations in 16 current-day agriculturalist communities from Gabon (region B), the genetically best-sampled region of our study area (with 816 individuals in total), by using the IBDNe method (42) to analyze inferred identity-by-descent (IBD) segments in available genome data (5). We find that all these communities display a remarkably similar pattern indicative of very low population size until ~35 generations ago, i.e., around 1000 CE (Fig. 4 and table S3). After 1000 CE, the population sizes of individual communities diverge, with growth in most of them becoming exponential from ~1300 CE onward. Within the uncertainty of the applied generation time [30 years; (22)], this strong demographic expansion coincides exactly with the second phase of increasing human activity shown in the combined archeological data (Figs. 2 and 3). Thus, genetic data support our archeological inferences that today’s Bantu-speaking communities in Gabon did not move into that area of the rainforest before the second expansion phase (1000 to 1500 CE) and do not descend from its Early Iron Age inhabitants. The centers of origin of the Bantu languages spoken by the Gabonese communities covered in our analysis (table S3), which belong to two distinct clades, are situated outside of their present-day distribution area. The putative homeland of the West-Western Bantu languages (table S3 and Fig. 4) in Gabon is situated in savannah woodland ~750 km to the southeast (43), and languages most closely related to North-Western Bantu languages in Gabon are located far to the north (3, 10). 
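The conversion between generations before present and calendar years is simple arithmetic, sketched below in Python. The 30-year generation time is the value applied in the text; the reference year of 2000 CE is an assumption made here for illustration, as is the alternative 25-year generation time used to show how sensitive the result is to that choice.

```python
def generations_to_year_ce(generations_ago, generation_time=30, reference_year=2000):
    """Convert generations before the reference year to a calendar year (CE).

    reference_year is an assumed sampling date, not a value from the study.
    Negative results stand for years BCE.
    """
    return reference_year - generations_ago * generation_time

# Onset of growth, ~35 generations ago, under the applied 30-year generation time.
onset = generations_to_year_ce(35)

# Same event under a hypothetical shorter generation time of 25 years.
onset_short = generations_to_year_ce(35, generation_time=25)

# Depth of the full analysis window of 130 generations.
window_start = generations_to_year_ce(130)

print(onset, onset_short, window_start)
```

Under these assumptions, 35 generations correspond to 950 CE with a 30-year generation time and to 1125 CE with a 25-year generation time, which indicates the order of magnitude of the dating uncertainty.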
Collectively, our findings indicate that the original (semi)sedentary inhabitants of region B may have either moved elsewhere during the collapse period (400 to 600 CE), ceased to exist as distinct populations, or shrank to very low numbers in the period 600 to 1000 CE and were subsequently absorbed by the newly immigrating ancestors of present-day Gabonese Bantu-speaking communities.
Population collapse ended the Early Iron Age in the Congo rainforest
The combined archeological evidence, supplemented by human genetic and linguistic data, shows that the “boom-and-bust” pattern in the SPD of archeological 14C dates does reflect past fluctuations in the overall density of the human population inhabiting the Congo rainforest. In particular, the strong correlation between inferred human activity (Fig. 2) and the signatures of socioeconomic expansion and intensification revealed by the pottery data (Fig. 3) points to the conclusion that the Early Iron Age in the Congo rainforest ended in a widespread socioeconomic decline. The decimation of rainforest communities started around 400 CE and culminated in a ~400-year period of inferred low population density before renewed demographic growth started the Late Iron Age.
The notion that the Early and Late Iron Ages in Central Africa were separated by a period of population collapse has previously been considered for southern Cameroon, Gabon, and the Lower Congo separately (regions A to C) (26, 44) and has also been documented for the inner Congo basin (region F) (20). However, the lack of a comprehensive and statistically sound assessment of 14C dates at the scale of the entire Congo rainforest, and the absence of other lines of evidence, allowed regional occurrences of a “hiatus” in observed 14C dates to be attributed to a research gap (27). Our integrated analysis of quality-screened archeological 14C dates and pottery group distribution patterns shows that this is not the case. The near-simultaneous occurrence of the boom-and-bust demographic pattern in eight distinct regions of the Congo rainforest, most coherently so in the three best-documented regions (fig. S5), is supplemented by the marked repetition of Early and Late Iron Age settlement history, both of which include distinct phases of socioeconomic expansion followed by regionalization (Fig. 3). This supports the notion that Central Africa has experienced an episode of widespread population collapse (18, 19), with important consequences for our understanding of the Bantu Expansion. While evidence for temporal fluctuation in ancient human populations and socioeconomic traditions has been documented elsewhere on the African continent, including southern Africa (45, 46), West Africa, and central North Africa (47, 48), a pronounced population collapse between 400 and 600 CE has thus far not been documented for sub-Saharan regions other than the Congo rainforest.
Spread-over-spread model of Bantu Expansion
Until now, the dispersal of Bantu-speaking communities in the Congo rainforest has tended to be seen as a single, and long-term continuous, macroevent (2, 3, 32, 33). In this traditional view, today’s Bantu speakers descended in a direct line from those who originally settled the rainforest ~2700 years ago, and current-day Bantu languages developed directly from the ancestral languages of those first settlers (3, 10, 25). However, our results show that this initial wave of putative Bantu-speaking Early Iron Age communities largely vanished from the entire Congo rainforest region by ~600 CE, with the persistence of only a few scattered populations (fig. S4). The population decline was rapid between ~400 and 600 CE and then continued at a slower pace to eventually result in about 400 years of very limited sedentary activity (~600 to 1000 CE) before a second wave of immigration and new settlements developed into the Late Iron Age (Figs. 2 and 4).
This spread-over-spread model of Bantu-speaking communities in the Congo rainforest, already hinted at by integration of genetic and linguistic evidence (18, 19), is now firmly corroborated by the archeological data. It is particularly clear in the substantial evolution of the material culture of rainforest-inhabiting communities as testified by the temporal distribution of pottery groups (Fig. 3). Among the best examples of this evolution is the marked shift in the types of pottery produced in the inner Congo basin (region F, Fig. 3D), where flat-based vessels omnipresent in the Early Iron Age were replaced by round-based vessels during the Late Iron Age (32). In addition, in the eastern Congo basin (region G), Early Iron Age pottery technology was markedly different from that of later pottery groups (49). These distinct breaks in material culture confirm the existence of two clearly distinct periods of settlement in the rainforest.
Possible drivers of population collapse
The initial migration of pottery-producing communities into the present-day Congo rainforest area is thought to have been facilitated by a reduction of forest cover during widespread drought between 500 and 0 BCE (8–12). It is not unlikely that climate change also played a role in the collapse of forest-dwelling populations between ~400 and 600 CE (Fig. 2). A trend toward wetter climatic conditions ~2000 years ago (50) may have caused substantial changes in living conditions or in the resources to which these first groups had become accustomed. Regional palynological records (11) and soil charcoal identifications (9) show that pioneer forest trees were abundant in the period ~0 to 600 CE, indicating that the rainforest was recovering from severe disturbance. Toward the end of this forest recovery period, hot and damp conditions inside an increasingly dense forest may have impeded the cultivation of some traditional crops (8, 51), while palm oil and other natural resources depending on open patches within the forest may have become scarce (52). However, it seems unlikely that deteriorating living conditions by themselves would have caused the prolonged and major reduction in human population density implied by the 14C data (a ~74% drop in SPD value between 400 and 1000 CE; Fig. 2) or impelled large numbers of people to purposefully move out of the rainforest and resettle in drier woodland areas.
Alternatively, a wetter climate may have triggered more frequent outbreaks of vector-borne diseases (53). Three well-documented pandemics in Europe and Asia within the last 2000 years were preceded by episodes of high rainfall and ended when climate returned to cooler and drier conditions (54, 55). Unspecified epidemics have also been proposed as a possible cause for the sharp decline in human activity between 550 and 1150 CE observed in the archeological record of Gabon (26). In this context, we note the broad coincidence between population collapse in the Congo rainforest (400 to 600 CE; Figs. 2 and 3) and the Justinian plague (541 to 750 CE) (34). Caused by the bacterium Yersinia pestis and transmitted by fleas, this long-lasting pandemic is generally regarded as one of many factors leading to the collapse of the Roman Empire and may have killed up to 100 million people in Asia, Europe, and Africa (56). A potential center of diffusion would be present-day Ethiopia, where it may have contributed to the collapse of the Aksumite Empire ~450 to 750 CE (57). Although its African origin is not universally accepted (54, 58), there is robust genetic evidence for the long-standing presence of Y. pestis in Central Africa (59). One particular Y. pestis strain, today found exclusively in DRC, Zambia, Kenya, and Uganda, has occurred in Central Africa for at least 300 years and is the oldest living strain closely related to the 14th century Black Death lineage (59, 60). At present, there is no firm evidence that persistent vector-borne diseases afflicted the Congo rainforest communities during the period of population collapse. However, the modern distribution of Y. pestis strains does suggest that “Africa harbors strains of plague that entered the continent in different historical periods” (59, 60).
More genetic research would be needed to corroborate our hypothesis that the strong population decline during 400 to 600 CE revealed by our data may have been caused by a vector-borne epidemic.
Our regional SPD analysis of archeological 14C dates strongly supports the notion of long-term connectivity between Early Iron Age agricultural communities inhabiting different regions of the Congo rainforest, corroborating the evidence for the introduction of iron metallurgy and cereal cultivation to previously settled Central African communities. These processes eventually linked the Congo rainforest with the Sahel region of North Africa (8, 61). Thus, in Early Iron Age Africa too, a vector-borne infectious disease may have propagated relatively effectively over large distances. At the same time, depending on the intensity of socioeconomic exchange between individual regions, such a disease may have developed asynchronously and caused different levels of mortality depending on region-specific population density and living conditions. The population decline ending the Early Iron Age in the continental interior regions of the western and northern Congo basin (regions D and E) was less pronounced than that in southern Cameroon and Gabon (regions A and B) and lagged behind it by more than a century (fig. S5); available 14C dates from the Lower Congo basin (region C) present an intermediate pattern. Moreover, the SPDs of regions D and E reach minimum values only after 1000 CE, when those of all three Lower Guinean forest regions (A to C) already reflect renewed population expansion heralding the Late Iron Age.
Late Iron Age resettlement and peak population density
Analogous to the first wave of immigration that gave rise to the Early Iron Age, the second wave in the Late Iron Age may also have been facilitated by an episode of widespread climatic drought. A review of hydroclimatic trends in Africa over the past 2000 years, focusing on well-dated records with high temporal resolution (62), indicates below-average precipitation in both western and eastern tropical Africa during a period broadly coincident with the Medieval Climate Anomaly in Europe (900 to 1250 CE). Most likely, this drought anomaly extended over the Congo basin, although very few high-quality records are available from the latter.
Perhaps the most notable feature of Congo rainforest demographic history as represented by the cross-regional SPD of archeological 14C dates (Fig. 2) is the relatively low inferred peak Late Iron Age probability reached in the 18th century compared to the high cross-regional peak inferred for the Early Iron Age reached around 350 CE. The comparatively unimpressive appearance of the Late Iron Age is even more pronounced in the SPD-based composite kernel density estimate (CKDE) trajectories (Fig. 2). This result contrasts with the strong increase in the number of pottery groups during the Late Iron Age expansion phase (1000 to 1400 CE), eventually reaching maxima (24 to 26 in the period 1300 to 1600 CE) that rivaled those of the Early Iron Age (23 to 25 in the period 0 to 400 CE; Fig. 3A). In addition, the trajectories of population density in Gabonese forest communities inferred by genetic analysis (Fig. 4) mostly show exponential growth throughout the Late Iron Age. Considered in isolation, the modest peak Late Iron Age probabilities within the archeological 14C dates could be argued to reflect depopulation of Central Africa due to the transatlantic slave trade (16th to 19th century) and its impact on Central African communities. A well-known example is the decline of the Kongo Kingdom (region C) and its societal transformation in the 17th and 18th century CE (63). A detailed comparison of the region-specific SPDs (fig. S5) with historical data on the predominant timing and volume of slave extraction from each region may be instructive in this context but is outside the scope of this paper.
However, the contrasting evidence from pottery and genetic data indicates that the modest Late Iron Age probabilities within the summed 14C ages are largely due to a general tendency of archeologists to refrain from obtaining 14C dates on Late Iron Age archeological contexts, either because the nonweathered appearance of pottery finds is already deemed sufficient to assign them to this period or because they are found in association with objects of European origin that can be dated independently (27). Other possible reasons are a lack of interest in more recent periods and uncertainty in the 14C calibration curves for this relatively recent period.
Reassessment of the Bantu Expansion in linguistics and paleoecology
Genealogical classifications of present-day Bantu languages are commonly interpreted as reflecting the initial wave of migration through sub-Saharan Africa. Our results imply that some of the ancestral languages spoken between ~1000 BCE and ~400 CE have almost certainly become extinct during the period of low human population density ~1000 years ago and can no longer be factored in to reconstruct the original routes of Bantu Expansion. Language diversity evolving during the second phase of population growth (~1000 to 1500 CE) and subsequent regionalization (1500 to 1800 CE; Fig. 3) probably involved a reduced subset of the Bantu languages previously spoken in the Congo rainforest. Although patchy persistence of remnant populations during ~600 to 1000 CE (fig. S4) suggests that not all Early Iron Age forest communities disappeared, present-day Bantu languages in the Congo rainforest may descend from languages that were (re)introduced during the second migration wave and could thus be up to 1000 years younger than previously thought.
The ancient introduction of slash-and-burn farming in the Congo rainforest is thought by some scholars (13, 14) to have had a profound impact on forest extent and composition, potentially overshadowing the influence of climate variability. Particularly, the period between 600 and 400 BCE is believed to be characterized by intense deforestation, followed by remarkably fast forest regeneration ~2000 years ago, which supposedly reflects a sudden and marked decline in human population density at that time (14). Scholars considering a larger body of archeological and paleoecological evidence (11, 12) heavily contest this interpretation, concluding instead that although humans may have changed landscapes at the local scale, they could not have been responsible for the synchronous declines in lake levels, draining of swamps, and the large-scale opening up of forest canopies (12). Instead, changing climatic conditions such as more pronounced rainfall seasonality may have caused changes in vegetation and may have eventually triggered the Bantu Expansion (7, 8, 11, 12). In addition, our results contrast strongly with the scenario of large-scale anthropogenic deforestation. We showed that population density in the Congo rainforest was regionally diverse but overall built up gradually from ~1000 BCE onward, reached a maximum only around 2000 years ago (~0 BCE/CE), and declined strongly only after 400 CE (Figs. 2 and 3). The proxy records used to support the human-disturbance scenario (13, 14) fail to show a peak in forest loss coincident with the first period of high population density (0 to 400 CE) or with the second wave of Bantu immigration, which started ~1000 CE (Figs. 2 and 3) and rapidly evolved into a phase of exponential population growth (Fig. 4). 
Regional integration of paleoecological records from the Congo rainforest must account for a settlement history characterized by two periods of high population density separated by a prominent ~400-year-long period of strongly reduced human impact.
Overall, our results significantly advance the current understanding of paleodemographic fluctuations in Central Africa over the last four millennia. Our integrated assessment of the archeological evidence reveals a supraregional decline of Early Iron Age Bantu-speaking communities across the Congo rainforest, with notable loss of both their material culture and linguistic heritage. This widespread population collapse created a nearly blank canvas for the rise of new communities, material culture, and languages during the Late Iron Age. These groundbreaking insights urge reassessment of the Bantu Expansion as a spread-over-spread process rather than a single and long-term continuous macroevent.
MATERIALS AND METHODS
Demarcation of the study area and its constituent regions
The Bantu “homeland” is generally considered to be the border region of present-day Nigeria and Cameroon in northwestern Central Africa. From there, Bantu-speaking people spread toward eastern and southern Africa, eventually occupying almost the entire area south of the equator (3). Here, we focus on demographic evolution in the Congo rainforest, the first ecozone where ancestral Bantu speakers settled. We constrained the Congo rainforest (gray area in Fig. 1) as the region composed of “tropical lowland rainforest,” “swamp forest and mangrove,” and “dry forest and thicket” in White’s vegetation map of Africa (64), which includes local “anthropic landscapes” but generally represents the natural distribution of vegetation types without 20th and 21st century land-use change. We compared demographic evolution in the Congo rainforest with that in the southern Congo basin woodland and on Bioko Island in the Gulf of Guinea, two areas in the forest’s periphery that were colonized by Bantu speakers at a later stage. The southern Congo basin woodland (white area in Fig. 1) comprises White’s “undifferentiated woodland” and “Zambezian miombo woodland” (64). In all, our datasets cover seven Central African countries: Cameroon, the Central African Republic, the DRC (or Congo-Kinshasa), the Republic of the Congo (Congo-Brazzaville), Gabon, Equatorial Guinea, and northern Angola.
This study involves both cross-regional analyses of archeological data from the entire Congo rainforest and adjacent areas as defined above and analyses of demographic evolution in 11 distinct Central African regions. Our demarcation of these regions is mainly based on strategies of archeological surveying specific to individual projects and modern-day national borders affecting those strategies. Archeological research in southern Cameroon and Gabon has mainly been conducted along modern roads (26); archeological sites in the Lower Congo are mainly situated on hilltops (27); knowledge from the inner Congo basin is mainly based on surveys along rivers (32, 40); and one team in the eastern Congo basin is experimenting with targeting young forest patches as potential indicators of past human activity within forested areas (49). Each surveying approach potentially involves sampling biases. In addition, archeological teams produced and published differing numbers of 14C dates. We, therefore, divided our study area into 11 regions (Fig. 1), each of which has been studied by a group of archeologists applying a similar surveying strategy and similar approaches to describing pottery inventories. With such specific biases confined to one or a few regions, our cross-regional analysis is expected to be relatively free of systematic sampling-related bias. The specific research history of each region is discussed in data S2.
Classification of radiocarbon (14C) dates
The earliest 14C dates from Central Africa were produced for archeological purposes in the early 1960s (50), and since then, the total number of dates has increased steadily. We compiled a dataset of 1444 14C dates (data S1) from the study area (Fig. 1) that are younger than 2000 BCE; 170 older dates were not used in the present study. Among our inventory of 14C dates, 54 dates, which were obtained from 15 sites excavated during the BantuFirst (n = 36) and AFRIFORD (n = 18) projects, had not been previously published. These previously unavailable dates are flagged as “unpublished” in the “Source” column in data S1.
Archeological assemblages relevant for our analysis consist of pottery, metallurgical objects (including iron slag), and/or edible fruit remains (typically charred endocarps of Elaeis guineensis palm nuts). Data S1 provides an overview of all archeological materials associated with each 14C date. This compilation allowed us to evaluate the reliability and relevance of individual 14C dates with regard to this study, taking into account the interpretation of the original authors and the results of subsequent analyses. All available context information was then condensed into a transparent classification system for each 14C date, as detailed below and summarized in table S1. All 14C dates were obtained on charcoal or other organic matter remains accompanied by a possible archeological assemblage that can be attributed to human activity.
Radiocarbon dates assigned to class I (n = 1149) were retained for our chronological analysis of human activity (Fig. 2) because they are reliable and unmistakably associated with archeological assemblages containing pottery or metallurgical remains and/or abundant edible fruit remains. Class I comprises four subclasses. Dates in class Ia (n = 453) are associated with finds of well-described pottery groups (see differential diagnoses in data S2) or metallurgical objects originating from clear anthropogenic features such as refuse pits or settlement horizons, hence representing a strong archeological context. Class Ib dates (n = 542) are associated with undefined pottery assemblages or metallurgical remains, hence representing a moderately strong archeological context. Class Ic dates (n = 64) originate from archeology-focused investigations and were considered by the original authors to represent human activity. However, they lack specific documentation with regard to their archeological context, and we therefore consider this association as weak. Dates assigned to class Id (n = 90) include large collections of charcoal or charred edible fruit remains that can be positively attributed to mostly sedentary pottery-producing communities but lack direct association with human artifacts. Most such dates were obtained from large accumulations of charred E. guineensis endocarps found in or along riverbeds in the Republic of Congo and the Central African Republic (52) and considered to represent ancient arboriculture.
Class II 14C dates (n = 239) are considered reliable but irrelevant for our present study and are therefore excluded from further analysis. These dates were sorted into three subclasses. Class IIa dates (n = 18) are associated with lithic artifacts only and hence do not represent pottery-producing communities. Class IIb dates (n = 55) were rejected by the original authors based on substantial discrepancies with the typology of the local archeological context. Probably, the dated charcoal was formed during wildfires and became mixed with pottery from another time period due to postdepositional processes (29). Class IIc dates (n = 166) are not associated with any archeological material; hence, this charcoal was also probably formed during wildfires. Most often, it was collected in trenches excavated in the rainforest, either randomly (9, 65) or along forest inventory transects (66), without aiming to find archeological assemblages. Wild-burning fires are an important aspect of any forest type, even rainforests (67), although the existence of natural wildfires is often overlooked when interpreting charcoal assemblages from the Congo basin (9).
Last, class III dates (n = 56) are excluded from further analysis because they are unreliable or inaccurate. Among these, class IIIa dates (n = 26) are considered unreliable due to laboratory error (30). In total, 121 14C dates in our dataset (data S1) were produced at the Niedersächsisches Landesamt für Bodenforschung at Hannover (Germany) in the 1980s. During that time, this laboratory faced a substantial problem concerning the reproducibility of 14C dates and potential errors due to the use of acetylene as counting gas (30). Doubts on the quality of Hannover dates were first raised in the late 1980s (40), but a critical analysis of their reliability in the context of the associated archeological finds has been conducted only for material from the inner Congo basin (region F) (32). We incorporated the conclusions of that study in our 14C database, by assigning these 26 dates to class IIIa (data S1 and table S1). However, 90 other archeological 14C dates produced at Hannover during the 1980s, mostly from Cameroon, Gabon, and the Lower Congo (regions A to C), lack detailed discussion of their reliability, and therefore, we have no a priori reason to exclude them from our analysis. Dates assigned to classes IIIb (n = 18) and IIIc (n = 12) were derived from freshwater shells and bulk organic matter in sediment cores, respectively, and are therefore possibly biased by an old-carbon age offset (31).
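As a quick consistency check on the classification described above, the subclass counts reported in the text can be tallied. The following Python sketch is illustrative only; the dictionary and function names are hypothetical and are not part of the published dataset or its tooling.

```python
# Subclass counts (n values) as reported in the classification text above.
# The names CLASS_COUNTS and class_total are hypothetical, for illustration.
CLASS_COUNTS = {
    "Ia": 453, "Ib": 542, "Ic": 64, "Id": 90,   # retained for SPD analysis
    "IIa": 18, "IIb": 55, "IIc": 166,           # reliable but irrelevant
    "IIIa": 26, "IIIb": 18, "IIIc": 12,         # unreliable or inaccurate
}


def class_total(main_class):
    """Sum the subclass counts of one main class ('I', 'II' or 'III')."""
    return sum(
        n for name, n in CLASS_COUNTS.items()
        if name.rstrip("abcd") == main_class
    )


total_dates = sum(CLASS_COUNTS.values())
share_class_ic = CLASS_COUNTS["Ic"] / total_dates  # weakly documented dates
```

Under these counts, the three main classes sum to the 1444 dates of the compiled dataset, and class Ic makes up roughly 4% of the total.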
SPDs of archeological 14C dates
SPDs of 14C dates are widely used as a proxy for the temporal evolution of population density (20–24). However, changes in these SPDs can be misleading, even in archeologically well-studied regions (68). Therefore, robust analytical methods are essential to interpret SPDs critically. We used a recently refined statistical approach to construct and analyze SPDs of 14C dates (23), which has so far been applied to population dynamics in Europe (22, 23) and eastern Asia (69) and has been made available in the R-package “rcarbon” (23, 35, 36, 69).
For each 14C date of class I, the posterior probabilities for all calendar years are calculated with the calibrate function in rcarbon (36), using the IntCal20 calibration curve (70). Of the 1149 (class I) dates, 18 are too young for calibration (uncalibrated 14C date < 0 BP), and these were also excluded from our main analysis. We then used the binPrep function in rcarbon (36) to group similar 14C dates from the same assemblage into “bins” of 100 calendar years to compensate for “investigator bias” (23). This bias occurs when multiple 14C dates have been obtained from a single archeological feature (e.g., a single refuse pit) and some or all of these are within the same age range. Last, the empirical summed probability function is calculated from the calibrated 14C dates using the spd function in rcarbon (36). For each bin, the spd function calculates the SPD using all dates associated with the bin and then divides the SPD by the number of contributing dates. This procedure controls for differences in sampling intensity and ensures that each assemblage contributes equally to the final SPD. To offset variability created by the calibration process, the SPDs were smoothed (21), using a moving average of 60 years because the present study covers a relatively short period of time (4000 years) and the average SE of all 14C dates in our dataset is ~60 years (71). Last, we visualized data variability by computing a CKDE from 500 sets of the calibrated ages of 675 randomly sampled 14C dates, corresponding to the number of binned dates, using the ckde command in rcarbon (36).
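The binning, per-bin normalization, and smoothing steps described above can be sketched in a few lines of Python. This is a simplified illustration, not the rcarbon implementation: a Gaussian over calendar years stands in for each calibrated posterior (no calibration curve is used), bins are fixed 100-year windows per site rather than the grouping performed by binPrep, and all function names and toy dates are hypothetical.

```python
import math
from collections import defaultdict


def toy_density(mean, sd, years):
    """Gaussian over calendar years, normalised to sum to 1.
    Assumption: a stand-in for a calibrated posterior."""
    weights = [math.exp(-0.5 * ((y - mean) / sd) ** 2) for y in years]
    total = sum(weights)
    return [w / total for w in weights]


def assign_bins(dates, width=100):
    """Group dates from the same site that fall in the same width-year window."""
    bins = defaultdict(list)
    for site, mean, sd in dates:
        bins[(site, math.floor(mean / width))].append((mean, sd))
    return bins


def binned_spd(dates, years, width=100):
    """Sum densities, dividing each bin's contribution by its number of dates
    so that every bin adds the same total probability mass."""
    spd = [0.0] * len(years)
    for members in assign_bins(dates, width).values():
        for mean, sd in members:
            for i, p in enumerate(toy_density(mean, sd, years)):
                spd[i] += p / len(members)
    return spd


def running_mean(values, window=60):
    """Centred moving average; the window is truncated at both ends."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


# Toy data: three dates from one refuse pit and one date from another site.
years = list(range(-500, 2000))  # negative values are years BCE
toy_dates = [("pit_1", 300, 30), ("pit_1", 320, 30), ("pit_1", 350, 30),
             ("site_b", 1200, 30)]
spd = binned_spd(toy_dates, years)
smoothed = running_mean(spd, window=60)
```

In this toy case, the three dates from the same pit share one bin and together carry the same weight as the single date from the other site, which is the intended effect of correcting for investigator bias.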
In addition to the cross-regional SPD analysis, we constructed separate SPDs of all available class I archeological 14C dates for eight distinct regions within the Congo rainforest as demarcated above (Fig. 1). These regional SPDs (fig. S5) were compared with the cross-regional SPD (Fig. 2) by performing a random mark permutation test, using the permTest function in rcarbon (36). For a region with n 14C dates, this function generates 100 SPDs by randomly picking the same number of 14C dates from the supraregional dataset (total n = 1149) and computing a simulated 95% uncertainty envelope (gray areas in fig. S5). Time windows during which the regional SPD exceeds or falls short of this envelope define periods when regional settlement history differs significantly from the cross-regional pattern.
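The logic of comparing a regional curve with an envelope built from random draws can be sketched as follows. This Python example is a simplified stand-in for the permTest procedure, under stated assumptions: Gaussian densities replace calibrated dates, each curve is normalised to unit sum, and the function names, the fixed standard deviation, and the toy inputs are all hypothetical.

```python
import math
import random


def summed_density(means, sd, years):
    """Sum of Gaussian stand-ins for calibrated dates, normalised to sum to 1.
    Assumes every mean lies within the range covered by `years`."""
    total = [0.0] * len(years)
    for m in means:
        weights = [math.exp(-0.5 * ((y - m) / sd) ** 2) for y in years]
        s = sum(weights)
        for i, w in enumerate(weights):
            total[i] += w / s
    norm = sum(total)
    return [v / norm for v in total]


def permutation_envelope(pooled, n, years, nsim=100, sd=40.0, seed=1):
    """Draw n dates at random from the pooled set nsim times and return the
    2.5th and 97.5th percentile of the simulated curves for each year."""
    rng = random.Random(seed)
    sims = [summed_density(rng.sample(pooled, n), sd, years)
            for _ in range(nsim)]
    lower, upper = [], []
    for i in range(len(years)):
        column = sorted(sim[i] for sim in sims)
        lower.append(column[int(0.025 * (nsim - 1))])
        upper.append(column[math.ceil(0.975 * (nsim - 1))])
    return lower, upper


def flag_deviations(regional, lower, upper):
    """Label each time step relative to the simulated envelope."""
    return ["above" if r > hi else "below" if r < lo else "within"
            for r, lo, hi in zip(regional, lower, upper)]
```

Time steps flagged "above" or "below" correspond to the windows in which a regional curve departs from what random draws from the pooled dataset would produce.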
Demarcation of periods with high and low human activity
We compared the observed SPDs (Fig. 2) with hypothetical models of uniform, linear, exponential, and logistic population growth, each averaged over 1000 Monte Carlo simulations drawn from our database of class I 14C dates, using the modelTest function in rcarbon (36). This allows us to identify periods during which the SPD exceeds or falls short of the theoretical models. Such periods are considered as periods of more and less intense human activity, respectively, and visualized by blue and red shading in all figures presenting data along a time axis (Figs. 2 and 4 and figs. S1 to S3 and S5). To better characterize the magnitude of population change during these phases, we calculated the rate of change in SPD for each 100-year interval between 2000 BCE and 1900 CE, using the spd2rc function in rcarbon (36), and defined increases as “population expansion” in 100-year intervals where the rate of change exceeds the first quartile of all positive growth rates. Similarly, we defined an SPD decline as “population collapse” in 100-year intervals where the rate of change is lower than the first quartile of all negative growth rates.
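The quartile-based labeling of expansion and collapse intervals can be illustrated with a short Python sketch. It is a sketch under assumptions rather than the spd2rc implementation: the rate of change is assumed to be a geometric growth rate between consecutive 100-year block means, the label "no marked change" is introduced here for the remaining intervals, and the function names are hypothetical.

```python
import statistics


def rates_of_change(blocks, width=100):
    """Geometric growth rate between consecutive blocks.
    `blocks` is a list of (block_start_year, mean_spd) with mean_spd > 0.
    Assumption: rate = (v1 / v0) ** (1 / width) - 1."""
    rates = []
    for (_, v0), (t1, v1) in zip(blocks, blocks[1:]):
        rates.append((t1, (v1 / v0) ** (1.0 / width) - 1.0))
    return rates


def classify_intervals(rates):
    """Label intervals using the first quartile of the positive rates and the
    first quartile of the negative rates, as described in the text.
    Requires at least two positive and two negative rates."""
    positive = [r for _, r in rates if r > 0]
    negative = [r for _, r in rates if r < 0]
    q1_positive = statistics.quantiles(positive, n=4)[0]
    q1_negative = statistics.quantiles(negative, n=4)[0]
    labels = {}
    for t, r in rates:
        if r > q1_positive:
            labels[t] = "population expansion"
        elif r < q1_negative:
            labels[t] = "population collapse"
        else:
            labels[t] = "no marked change"  # hypothetical label
    return labels
```

Note that the two thresholds are not symmetric: the first quartile of the positive rates excludes only the weakest increases, whereas the first quartile of the negative rates singles out the steepest declines.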
Sensitivity of the Congo rainforest SPD to variable selection of 14C dates
Our classification of available 14C dates from the study region (table S1 and data S1) is intended to limit SPD analysis to those dates that actually contain trustworthy information on human activity (20, 29, 32, 40). Besides known biases, such as the freshwater reservoir effect (31) and documented laboratory errors (30), studies of Congo rainforest settlement history often fail to separate the dates associated with archeological finds (class I) from those that are not (class II). Fossil charcoal assemblages found in randomly excavated test pits are often assigned to human activity by default, even if they lack archeological context [e.g., (13, 66)]. If such wrongly assigned dates represent a substantial fraction of the dataset, for example, in a study covering a relatively small area of forest, they evidently risk undermining the principal aim of finding a meaningful temporal pattern. Moreover, this approach excludes the possibility of natural fire in the forest, thereby impeding the use of fossil charcoal assemblages to study the ecosystem effects of past climate variations (9, 67). Although classification may be hampered by the absence of pertinent metadata on some published 14C dates, at least in this study these represent only 4% of the total dataset (class Ic in table S1 and data S1). To assess the sensitivity of our main result (Fig. 2) to a failure of excluding irrelevant (class II) and unreliable (class III) 14C dates, we repeated the SPD analysis with all three classes included (n = 1444) and found no significant difference (compare fig. S6 with Fig. 2). Nonetheless, we encourage authors to critically assess the origin of fossil charcoal assemblages and to clearly describe their archeological context.
Pottery group distribution in space and time
Pottery is the most prominent type of artifact found in archeological surveys and excavations in Central Africa. Changes in vessel shape and decoration have been traced through time by sorting vessels and sets of sherds with shared characteristics into so-called pottery groups. Each pottery group often comprises large numbers of sherds, allowing diagnostic features for each pottery group to be defined based on vessel shape and style of decoration. Although archeological teams have not always used the same standards for pottery description, descriptions are easy to compare because most teams use recognizable nomenclature and provide detailed drawings of vessels and sherds. Screening of 152 publications (data S3) allowed us to distinguish 115 distinct pottery groups in our study area (figs. S1 and S2), each of them described in one or several publications as summarized in data S2. The name of a pottery group typically relates to the location from where it was first described or to a characteristic type of decoration. Each pottery group has been found in one or multiple sites, with a maximum of 101. In total, 472 sites have yielded sherds belonging to well-defined pottery groups (data S4). An additional 223 sites yielded pottery sherds that are either too small for description or are large enough to distinguish decorations or shapes but do not correspond to well-described pottery groups or have not been discussed comparatively. Such pottery assemblages are referred to as “unclassified pottery” in Fig. 1 and throughout the paper.
To a large extent, pottery is dated by 14C dating charcoal fragments found in association with the archeological assemblage. Dating performed on other organic materials such as freshwater shells or bulk organic matter in lake-sediment cores is rare and was excluded as potentially unreliable (see above). The temporal distribution of 72 pottery groups considered in this study (out of 115) is constrained by one or multiple (up to 26) 14C dates. Figure S1 visualizes the calibrated age range for each pottery group, using all 14C dates considered relevant and reliable (i.e., class I dates). The chronological position of the 43 pottery groups that could not be dated directly using 14C dates was estimated on the basis of their resemblance with dated pottery groups. Figure S2 visualizes the estimated age range for each of these indirectly dated pottery groups.
The most important criteria to distinguish pottery groups are the shape and decoration of vessels. These aspects, unlike pottery technology, are subject to substantial change over time. Using our comprehensive dataset of dated pottery group occurrences, we untangled cross-regional temporal patterns in the abundance and distribution of distinct pottery groups in the Congo rainforest as proxies for the intensity and spread of human activity and the evolution of material culture over time. For this purpose, we divided the time scale into 100-year bins (Fig. 3) and, for each bin, counted the total number of pottery groups encountered (Fig. 3A) and the number of sites per pottery group (Fig. 3B), and we calculated the distribution area of each group during this 100-year time window using a concave hull (72) around all sites at which the group was found (Fig. 3C). Examples of distribution areas for specific pottery groups are visualized in Fig. 3D. These analyses were performed using the statistical platform R, version 3.6.3 (73).
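The per-bin counting described above can be sketched in Python as follows. This is an illustration only: the data structure, function names, and group names are hypothetical, each group is assumed to be present at all of its sites throughout its age range, and the concave-hull area calculation is not reproduced.

```python
def active_groups(groups, bin_start, width=100):
    """Names of pottery groups whose age range overlaps the given bin.
    `groups` maps a name to {"start": year, "end": year, "sites": [...]},
    with negative years for BCE."""
    bin_end = bin_start + width
    return sorted(name for name, g in groups.items()
                  if g["start"] < bin_end and g["end"] > bin_start)


def summarise_bins(groups, start, end, width=100):
    """For each bin, count the active groups and the mean sites per group."""
    summary = {}
    for bin_start in range(start, end, width):
        names = active_groups(groups, bin_start, width)
        site_counts = [len(groups[name]["sites"]) for name in names]
        summary[bin_start] = {
            "n_groups": len(names),
            "mean_sites_per_group": (sum(site_counts) / len(site_counts)
                                     if site_counts else 0.0),
        }
    return summary
```

A group whose age range straddles a bin boundary is counted in both bins, so the counts describe how many groups were in use during each 100-year window rather than how many first appeared in it.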
Temporal variation of population size inferred from genetic data
Paleodemographic fluctuations in the study area were also assessed by tracing changes of effective population size (Ne) in Bantu-speaking agriculturalist communities from West-Central Africa over the last 130 generations, using publicly available genome-wide single-nucleotide polymorphism (SNP) data generated on the Human OmniExpress array (~700,000 SNPs) (5). This analysis was confined to 16 communities in Gabon (our region B), which to our knowledge is the genetically best-sampled region within our study area. We selected communities with large sample sizes of 51 individuals on average (ranging from 38 to 69) (table S3). Ne was estimated for each community using IBDNe software (Fig. 4A). We performed standard quality control (QC) steps using PLINK v1.9 (--mind 0.15; --geno 0.1; and --hwe 0.0000001) (74). To identify first- or second-degree relatives, we ran KING (75) on all pairs of individuals included in the dataset and removed one of the related individuals in each pair. After QC, we ended up with 689,310 autosomal SNPs from a total of 816 unrelated individuals (table S3). To detect shared IBD segments between individuals of each community included in the dataset, we used the haplotype-based “Refined IBD” detection tool (76) implemented in Beagle v4.1 (77), with default settings except for the IBD segment-length threshold that was set to 2 cM. To avoid the conflation effect of short IBD segments (42), only IBD segments longer than 2 cM were retained. The tool “merge-ibd-segments” was used to filter the IBD segments by removing breaks and short gaps (<0.6 cM in length). Mean Ne values were calculated for each generation in each community, and generations were converted to calendar years BCE/CE by assuming a generation time of 30 years, the recommended value for preindustrial societies in genetics-based studies of population divergence (78).
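The conversion from generations to calendar years amounts to simple arithmetic, sketched here in Python. The 30-year generation time comes from the text; the reference year from which generations are counted back is not stated in the text, so the default of 2000 CE used below is a hypothetical assumption, as are the function names. The sketch also ignores the absence of a year zero in the BCE/CE system.

```python
GENERATION_YEARS = 30  # generation time stated in the text


def generation_to_year(generations_ago, reference_year=2000):
    """Signed calendar year (negative = BCE) for a number of generations
    before the reference year. reference_year is an assumed value."""
    return reference_year - generations_ago * GENERATION_YEARS


def format_year(year):
    """Render a signed year as a BCE/CE label (no year-zero correction)."""
    return f"{-year} BCE" if year <= 0 else f"{year} CE"
```

With these assumptions, 130 generations span 3900 years, so the oldest generation considered would fall around 1900 BCE, which is consistent with the roughly four millennia covered by the archeological analysis.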