An MRI study of the articulatory properties of Italian consonants

This is an extract from Romano, A. & Badin, P. (2009).

The articulatory properties of Italian consonants (both those peculiar to Italian and those shared with other languages) are well known thanks to various systematic instrumental surveys (such as the radiographic one by L. Croatto in Tagliavini, 1965) and to scattered studies that appeared in different places and at different times (cf. Magno Caldognetto, 1988; among the numerous palatographic works we mention Farnetani, 1986).

At present, however, among the various techniques used to investigate the articulatory characteristics of Italian sounds, MRI (Magnetic Resonance Imaging) does not yet seem to have attracted much interest. The technique uses magnetic fields and radio waves to represent anatomical and physiological characteristics of either the vocal tract or the brain during the production of sounds. Despite initial scepticism towards its use in phonetics (see the list of advantages and drawbacks compiled by Stone, 1997), it is now gaining international acceptance in descriptive and experimental linguistic research.

The distrust of this type of survey in articulatory phonetics, partly related to the extreme difficulty of obtaining the data and to doubts about their representativeness, seems at last to be over, thanks to recent progress that has made it possible to overcome certain procedural impasses and to improve the acquisition and presentation of the results.

These improvements have enhanced the use of MRI in experimental linguistics. The technique has been common since the early 1990s for measuring the volumes and shapes of the resonators, for observing the position of the mobile articulators in the static configurations they assume during the articulation of a sound, and for reconstructing the movements from one configuration to another.

One drawback still at issue, though often overcome with the use of mobile magnets, is that the informant has to assume either a prone or a supine position. However, these objections have been answered satisfactorily in dynamic terms or in terms of compensation (on these aspects cf. Tiede et al., 2000; Kitamura et al., 2005; Kedrova et al., 2006). Considering that MRI is the only technique capable of providing images in transversal (coronal) section, and keeping in mind the increasing acquisition speed and the higher spatial resolution achieved in recent years, it is easy to understand why MRI is increasingly used in place of traditional radiographic images to acquire information on articulatory characteristics during the production of sounds.

Even though in the international literature this technique is often used to investigate specific phenomena (by applying sophisticated computational and/or volumetric techniques, e.g. for research on coarticulation, vocal tract estimation and inversion), here the available scans are examined with the aim of offering an objective evaluation of the manner and place of articulation for purely descriptive purposes.

Data

The Magnetic Resonance (MR) images discussed in this study were acquired at the Regional University Hospital (CHRU) of Grenoble, France, in 2001 and at the Radiodiagnostic Service of the "Molinette" Hospital of Turin, Italy, in 2004 and in 2008.

Even though data were collected for three speakers, the main database discussed here is based on productions by the author AR. Relying on a limited number of subjects usually simplifies data collection and analysis, but may result in the description of non-representative, speaker-specific conditions. The advantages are nevertheless considerable, since the corpus size can be increased significantly and data quality verified more thoroughly. Moreover, with a known and easily available speaker, natural speech can also be obtained, providing a reference against which vocal tract shapes can be verified. These conditions also offer greater opportunities for successfully combining data from different acquisition methods.

Acquisition procedure

The scans were carried out in three sessions of one to two hours each with a 1 Tesla MRI scanner (Philips GyroScan T10-NT). The midsagittal images had a size of 256×256 pixels and a final resolution of 1 mm/pixel. They were acquired so as to cover the largest possible vocal tract length, with maximally lowered larynx and maximally protruded lips.
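As a quick check of the geometry implied by these figures, the following minimal sketch converts pixel positions to millimetres. The function name and the choice of the top-left image corner as origin are assumptions made for illustration; they are not part of the acquisition protocol described here.

```python
# Image geometry stated in the text: 256x256 pixels at 1 mm per pixel.
IMAGE_SIZE_PX = 256
RESOLUTION_MM_PER_PX = 1.0


def pixel_to_mm(col, row, resolution=RESOLUTION_MM_PER_PX):
    """Convert a (column, row) pixel position to millimetres.

    Assumption: the origin is the top-left corner of the image.
    """
    return (col * resolution, row * resolution)


# Side length of the imaged square, in millimetres.
field_of_view_mm = IMAGE_SIZE_PX * RESOLUTION_MM_PER_PX

if __name__ == "__main__":
    print(field_of_view_mm)
    print(pixel_to_mm(128, 64))
```

With these values each image covers a square of 256 mm per side, and one pixel step corresponds to one millimetre in the midsagittal plane.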

In all three cases, the subjects lay supine in the MR machine with their head inside a Radio Frequency (RF) coil. A padded cranial support was used in the RF coil to minimise head movements. The subjects’ heads were thus not fixed, but movements were limited by the support, and the coil provided a reference frame for keeping the head in position (see Engwall & Badin, 2000).

The acquisition time was about 10 seconds for the first set and approximately 4 seconds for the subsets. During this time, the subjects held the articulation either in full apnoea or, for fricatives, while breathing out very slowly. For stops and affricates (and for other sounds involving full contact) the scans were carried out during the contact phase.

Corpus

As stated above, we analysed three static MRI sets from native Italian male speakers: a full set of 70 midsagittal scans for AR (aged 33), a speaker from a South-Eastern region (though without any specific diatopic markers); a partial corpus of 6 scans (midsagittal and coronal) for GM (aged 65), a speaker from a Northern region (also investigated for his dialectal palatal articulation); and a partial corpus of 39 scans (midsagittal and coronal) for FG (aged 24), a speaker from Southern Italy (also investigated for his dialectal cacuminal articulation). All the corpora contain images of sustained sounds.

The first corpus is intended to allow for a complete and fine-grained description of the place and manner of /p/, /f/, /t͡s/, /t/, /s/, /ʃ/, /t͡ʃ/, /k/, /ʎ/, /l/, /r/, /m/, /n/ and /ɲ/, but it also includes vowels and jaw and teeth references, as well as 9 scans of dialect-specific cacuminal and prepalatal sounds and 12 scans of combinatory variants of nasals.

The second corpus contains one midsagittal and one coronal MRI slice for each of three places of articulation (postalveolar, palatal, velar). The third corpus contains 7 sagittal and 6 coronal MRI slices for each of three affricates (dental, cacuminal, postalveolar). Whenever possible, voiceless articulations were observed for each place of articulation. Furthermore, the scans for each consonant (C) were acquired during its production in the utterances /'aC:#'Ca/, /'iC:#'Ci/ and /'uC:#'Cu/.
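The carrier pattern just described can be written out mechanically. The sketch below is only an illustration: the function name `carrier_utterances` is hypothetical, and the notation simply copies the transcription convention used in the text (stress mark, length mark and word boundary).

```python
# The three vowel contexts named in the text.
VOWELS = ["a", "i", "u"]


def carrier_utterances(consonant, vowels=VOWELS):
    """Return the /'VC:#'CV/ carrier utterances for one consonant.

    Hypothetical helper: it reproduces the pattern given in the text,
    with the same consonant closing the first word (long) and opening
    the second one.
    """
    return [f"/'{v}{consonant}:#'{consonant}{v}/" for v in vowels]


if __name__ == "__main__":
    for utterance in carrier_utterances("t"):
        print(utterance)
```

For /t/, for example, this yields /'at:#'ta/, /'it:#'ti/ and /'ut:#'tu/, one utterance per vowel context.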

Original scans for the /'aC:#'Ca/ sequence for speaker AR (with /a/ > [a]) are shown here.

The midsagittal airway boundaries of all the scans were hand-traced on the computer. Unfortunately, the manual tracing was carried out without any edge detector, which required careful evaluation and lengthened the processing time. The different profiles for the same consonant were superimposed as shown here.
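To compare superimposed profiles numerically, hand-traced contours first need a common number of points. The following is a minimal sketch under stated assumptions: each contour is a list of (x, y) points in millimetres with at least two points, and points at the same relative arc length are treated as corresponding, which is a simplification. The pointwise mean is an illustrative addition and is not a step reported in this study, which only overlays the profiles.

```python
import math


def resample(contour, n):
    """Resample a polyline to n points equally spaced along its arc length.

    Assumes len(contour) >= 2 and n >= 2.
    """
    # Cumulative arc length at each original vertex.
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(contour, contour[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]

    out = []
    seg = 0  # index of the segment containing the current target length
    for k in range(n):
        target = total * k / (n - 1)
        while seg < len(contour) - 2 and cum[seg + 1] < target:
            seg += 1
        span = cum[seg + 1] - cum[seg]
        t = 0.0 if span == 0 else (target - cum[seg]) / span
        (x0, y0), (x1, y1) = contour[seg], contour[seg + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out


def mean_contour(contours, n=50):
    """Pointwise mean of several contours after arc-length resampling."""
    resampled = [resample(c, n) for c in contours]
    count = len(resampled)
    return [
        (sum(c[i][0] for c in resampled) / count,
         sum(c[i][1] for c in resampled) / count)
        for i in range(n)
    ]
```

For instance, the three profiles traced for one consonant in the /a/, /i/ and /u/ contexts could be passed to `mean_contour` to obtain a single reference outline, with the individual resampled profiles plotted on top of it to show the variation across contexts.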

Acknowledgments

We are indebted to Christophe Segebarth for the images acquired at the CHRU of the Michallon Hospital of Grenoble and to Laura Rizzo and Alessandra Graziano for the images acquired at the Radiodiagnostic Service of the San Giovanni Battista (Molinette) Hospital of Turin. Further thanks go to Gianni Molino and Francesco Gambino, for their willingness and for the time they devoted to this research, and to Paolo Mairano, for his linguistic help.

MRI outline tracings