A Comparison of Statistical and Geometric Reconstruction Techniques: Guidelines for Correcting Fossil Hominin Crania

My MSc looked at the reconstruction of fossils that have been damaged during their fossilisation. For example, much of the hominin (essentially bipedal apes — humans and our ancestors) fossil material from South Africa has been found in infilled and collapsed caves. The weight of all these sediments, and the collapse of various cavities associated with the caves, has compressed the fossils, damaging many of them. If we want to study them, we need to somehow account for this damage. My thesis is an application of computer science (especially computer graphics) to palaeoanthropology, with a good deal of mathematics and statistics thrown in on the side. A large portion of this work was implemented in R and C++.

The degree was awarded with distinction, and was supervised by James Gain and Rebecca Rogers Ackermann.


You can download the full version of my thesis here, or from UCT’s Computer Science department’s digital library. There is also a list of errata available here, or from the departmental digital library.


Mrs Ples — STS 5

My data came from measurements made using contact digitisers, CT scans, and photogrammetry. The above is a surface model I constructed using CT data.

The study of human evolution centres, to a large extent, around the study of fossil morphology, including the comparison and interpretation of these remains within the context of what is known about morphological variation within living species. However, many fossils suffer from environmentally caused damage (taphonomic distortion) which hinders any such interpretation: fossil material may be broken and fragmented, while the weight and motion of overlying sediments can cause plastic distortion. To date, a number of studies have focused on the reconstruction of such taphonomically damaged specimens. These studies have used myriad approaches to reconstruction, including thin plate spline methods, mirroring, and regression-based approaches. The efficacy of these techniques remains to be demonstrated, and it is not clear how different parameters (e.g., sample sizes and landmark density) might affect their accuracy.

In order to partly address this issue, this thesis examines three techniques used in the virtual reconstruction of fossil remains by statistical or geometrical means: mean substitution, thin plate spline warping (TPS), and multiple linear regression. These methods are compared by reconstructing the same sample of individuals using each technique. Samples drawn from Homo sapiens, Pan troglodytes, Gorilla gorilla, and various hominin fossils are reconstructed by iteratively removing then estimating the landmarks. The testing determines the methods’ behaviour in relation to the extent of landmark loss (i.e., amount of damage), reference sample sizes (this being the data used to guide the reconstructions), and the species of the population from which the reference samples are drawn (which may be different to the species of the damaged fossil).
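To make the remove-then-estimate test concrete, here is a minimal sketch of the simplest of the three methods, mean substitution, together with a knock-out error measurement. It is written in Python purely for illustration and is not the thesis's R or C++ code. It assumes all specimens have already been aligned to a common coordinate frame, and it assumes a root-mean-square distance as the error measure; the function names and the toy coordinates are hypothetical.

```python
import math


def mean_substitute(damaged, reference):
    """Fill each missing landmark (None) with the reference-sample mean.

    Assumes every specimen lists the same landmarks in the same order and
    that all specimens are already aligned to a common coordinate frame.
    """
    repaired = []
    count = len(reference)
    for index, point in enumerate(damaged):
        if point is not None:
            repaired.append(tuple(point))
            continue
        mean_point = tuple(
            sum(specimen[index][axis] for specimen in reference) / count
            for axis in range(3)
        )
        repaired.append(mean_point)
    return repaired


def rms_error(estimate, truth, removed):
    """Root-mean-square distance over the removed landmarks (assumed metric)."""
    squared = [
        sum((a - b) ** 2 for a, b in zip(estimate[i], truth[i]))
        for i in removed
    ]
    return math.sqrt(sum(squared) / len(squared))


def knock_out_test(specimen, reference, removed):
    """Remove the given landmarks from a complete specimen, estimate them, score."""
    damaged = [None if i in removed else p for i, p in enumerate(specimen)]
    return rms_error(mean_substitute(damaged, reference), specimen, removed)


# Hypothetical toy data: two reference specimens, three landmarks each.
reference = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 3.0, 0.0)],
]
specimen = [(0.0, 0.0, 0.0), (2.0, 0.0, 1.0), (0.0, 2.0, 0.0)]

error = knock_out_test(specimen, reference, removed={1})
print(error)
```

Repeating the knock-out over growing sets of removed landmarks, and over reference samples of different sizes and species, gives the kind of error curves that the comparison relies on.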

Thesis image

The bulk of the correction techniques were written in R, as was the comparison and visualisation of the results.

Given a large enough reference sample, the regression-based method is shown to produce the most accurate reconstructions. Various parameters affect this: when using small reference samples drawn from a population of the same species as the damaged specimen, thin plate spline warping is the better method, but only as long as there is little damage. As the damage becomes severe (missing 30% of the landmarks, or more), mean substitution should be used instead: thin plate splines show rapid error growth as the amount of damage increases. When the species of the damaged specimen is unknown, or it is the only known individual of its species, the smallest reconstruction errors are obtained with a regression-based approach using a large reference sample drawn from a living species. Testing shows that reference sample size (combined with the use of multiple linear regression) is more important than morphological similarity between the reference individuals and the damaged specimen.
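These recommendations can be read as a small decision rule. The sketch below is one possible encoding in Python, not something taken from the thesis. The 30% damage threshold comes from the summary above, but the summary does not say how many individuals make a reference sample "large enough", so that cut-off is left as a required, hypothetical parameter (large_sample_threshold). The function name and return strings are also illustrative.

```python
def recommend_method(missing_fraction, reference_size, same_species,
                     large_sample_threshold):
    """Suggest a reconstruction method following the summary above.

    missing_fraction: share of landmarks missing, between 0.0 and 1.0.
    reference_size: number of individuals in the reference sample.
    same_species: True if the reference sample is drawn from the same
        species as the damaged specimen.
    large_sample_threshold: hypothetical cut-off for a "large enough"
        reference sample; the summary gives no figure for it.
    """
    if not 0.0 <= missing_fraction <= 1.0:
        raise ValueError("missing_fraction must lie between 0 and 1")

    # A large reference sample favours regression, whatever the species.
    if reference_size >= large_sample_threshold:
        return "multiple linear regression"

    # Small samples: the summary only covers same-species references.
    if not same_species:
        return None  # no recommendation stated for this case

    # Small same-species sample: TPS for light damage, mean substitution
    # once 30% or more of the landmarks are missing.
    if missing_fraction >= 0.30:
        return "mean substitution"
    return "thin plate spline warping"


# Hypothetical example: 10 conspecific references, 15% of landmarks missing,
# with an assumed cut-off of 100 individuals for a "large" sample.
print(recommend_method(0.15, 10, True, large_sample_threshold=100))
```

Returning None for a small reference sample from a different species is a deliberate choice: the summary only recommends regression with a large sample in that situation, so the sketch does not guess at an alternative.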

The main contribution of this work is a set of recommendations to the researcher on which of the three methods to use, based on the amount of damage, the number of reference individuals, and the species of the reference individuals.