1 The size and structure of phonological inventories

1.1 Introduction

The database is known formally as the UCLA Phonological Segment Inventory Database, and its acronym is UPSID.

1.2 Design of the database

The database includes the inventories of 317 languages. For 192 of the 317 languages, UPSID has profited from the work of the Stanford Phonology Archive (SPA).

In determining the segment inventories, there are two especially problematic areas.

(i) choosing between a unit and a sequence interpretation, e.g. for affricates, prenasalized stops, diphthongs…

(ii) choosing between a segmental and a suprasegmental analysis of certain properties

1.3 Variation in inventory size

The number of segments in a language may vary widely. The smallest inventories included in UPSID have only 11 segments (Rotokas, Mura) and the largest has 141 (!Xu).

Inventory size	Number of segments
Smallest	11
Largest	141
Typical	20-37
Mean	>31
Median	28-29

Upper and lower limits on inventory size will tend to be rather flexible, and areal-genetic deviations from the central tendency should be expected.

1.4 Relationship between size and structure

Number of consonants and vowels in UPSID

Statistic	Consonants	Vowels
Range	6-95	3-46
Mean	22.8	8.7
Modal	21

Ratio of vowels to consonants (number of vowels divided by number of consonants)

Range	0.065-1.308
Mean	0.402
Median	0.36
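
The ratio figures can be read as the number of vowels divided by the number of consonants, computed per language and then summarized. Note that dividing the mean vowel count by the mean consonant count from the table above gives 8.7 / 22.8 ≈ 0.38, not 0.402, because a mean of per-language ratios is not the same as a ratio of means. The sketch below illustrates the difference. The language names and counts are hypothetical and are not UPSID data.

```python
from statistics import mean, median

# Hypothetical (consonants, vowels) counts, for illustration only.
# These are NOT taken from UPSID.
inventories = {
    "lang_a": (20, 5),
    "lang_b": (10, 6),
    "lang_c": (40, 8),
}


def vowel_ratio(consonants: int, vowels: int) -> float:
    """Number of vowels divided by number of consonants."""
    return vowels / consonants


# One ratio per language, then summarized across languages.
ratios = [vowel_ratio(c, v) for c, v in inventories.values()]
mean_of_ratios = mean(ratios)
median_of_ratios = median(ratios)

# Ratio of the mean vowel count to the mean consonant count.
ratio_of_means = mean(v for _, v in inventories.values()) / mean(
    c for c, _ in inventories.values()
)

print(f"mean of ratios: {mean_of_ratios:.3f}")
print(f"ratio of means: {ratio_of_means:.3f}")
```

With these toy counts the two summaries differ, which is the same effect seen between 0.402 and 8.7 / 22.8.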

Hypothesis

A smaller inventory has a greater probability of including a given common segment than a larger one, and a larger inventory has greater probability of including an unusual segment type than a smaller one.

Test 1

Examine given segment types and see how they are distributed across inventories of different sizes.
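
A test of this kind could be sketched as follows: group languages by inventory size and compute, for each group, the share of languages that contain a given segment. The inventories, the size bins, and the function name `share_with_segment` are hypothetical choices made for illustration; they are not the procedure or the data used for UPSID.

```python
from typing import Dict, Optional, Set

# Hypothetical inventories, for illustration only (not UPSID data).
inventories: Dict[str, Set[str]] = {
    "lang_a": {"p", "t", "k", "m", "n", "a", "i", "u"},
    "lang_b": {"t", "k", "m", "n", "a", "i", "u"},
    "lang_c": {"p", "t", "k", "b", "d", "g", "m", "n", "s", "a", "i", "u"},
}


def share_with_segment(
    data: Dict[str, Set[str]], segment: str, min_size: int, max_size: int
) -> Optional[float]:
    """Fraction of languages with min_size <= inventory size <= max_size
    that contain `segment`; None if no language falls in the bin."""
    in_bin = [inv for inv in data.values() if min_size <= len(inv) <= max_size]
    if not in_bin:
        return None
    return sum(segment in inv for inv in in_bin) / len(in_bin)


# Arbitrary size bins chosen for the toy data.
small_share = share_with_segment(inventories, "p", 1, 9)
large_share = share_with_segment(inventories, "p", 10, 99)
print(small_share, large_share)
```

Comparing the shares across bins for one segment at a time matches the spirit of the test, which looks at individual segment types rather than at inventories as wholes.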

Conclusion

The relationship between the size and the content of an inventory is a matter that concerns individual types of segments, rather than being accountable to broad generalizations.

Test 2

Consider what kind of consonant inventory would be formed if only the most frequent segments were included.

Possible consonant inventory size:

Consonants	Possible number	Percentage of languages
Stops	5-11 (minimum 3, maximum 36, mean 10.5)	63%
Fricatives	1-4	58%
Nasals	2-4	91%
Liquids	2	41%
Voiced approximants	2	72%
/h/		63%

Conclusion

At the modal inventory size, the more frequent segments tend to occur more often than in the UPSID data file as a whole.

1.5 Phonetic salience and the structure of inventories

Examples of strong implicational hierarchies between particular types of segments:

(i) /k/ does not occur without /t*/. (One exception in UPSID: Hawaiian.)

(ii) /p/ does not occur without /k/. (There are 24 languages with /k/ but no /p/)

(iii) Nasal consonants do not occur unless stops occur at the same place of articulation. (There are 5 exceptions in UPSID: Ewe, Efik, Auca, Hupa, and Igbo. There are numerous examples of languages with stops at particular places of articulation with no corresponding nasal consonants.)

(iv) Voiceless nasals and approximants do not occur unless the language has the voiced counterparts. (No exceptions in UPSID.)

(v) Mid vowels do not occur unless high and low vowels occur. (Two exceptions in UPSID. Cheremis and Tagalog are reported to lack low vowels.)

(vi) Rounded front vowels do not occur unless unrounded front vowels of the same height occur. (Two exceptions in UPSID, Bashkir and Khalaj)

(vii) /ø/ and /œ/ do not occur unless /y/ also occurs. (Hopi, Wolof and Akan are exceptions.)
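
Each statement of the form "X does not occur without Y" in the list above can be checked mechanically against a set of inventories by listing the languages that have X but lack Y. The sketch below is a minimal illustration. The inventories and the function name `exceptions_to_implication` are hypothetical, and real inventories would need a consistent transcription before such a check means anything.

```python
from typing import Dict, List, Set

# Hypothetical inventories, for illustration only (not UPSID data).
inventories: Dict[str, Set[str]] = {
    "lang_x": {"p", "k", "m", "n", "a", "i", "u"},       # has /k/, lacks /t/
    "lang_y": {"p", "t", "k", "m", "n", "a", "i", "u"},  # has /k/ and /t/
    "lang_z": {"t", "k", "n", "a", "i", "u"},            # has /k/, lacks /p/
}


def exceptions_to_implication(
    data: Dict[str, Set[str]], antecedent: str, consequent: str
) -> List[str]:
    """Languages that contain `antecedent` but not `consequent`,
    i.e. exceptions to 'antecedent does not occur without consequent'."""
    return sorted(
        name
        for name, inv in data.items()
        if antecedent in inv and consequent not in inv
    )


# "/k/ does not occur without /t/": list languages with /k/ but no /t/.
print(exceptions_to_implication(inventories, "k", "t"))
# "/p/ does not occur without /k/": list languages with /p/ but no /k/.
print(exceptions_to_implication(inventories, "p", "k"))
# The reverse direction counts languages with /k/ but no /p/.
print(exceptions_to_implication(inventories, "k", "p"))
```

Running the check in both directions separates true exceptions to a hierarchy from the expected asymmetry, such as languages with /k/ but no /p/.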

Equally valid general prohibitions on the co-occurrence of segments within an inventory can also be found:

(i) A language does not contain both voiced implosives and laryngealized plosives. (No counterexamples in UPSID.)

(ii) A language does not contain both a voiceless lateral fricative and a voiceless lateral approximant. (No counterexamples in UPSID.)

(iii) A language does not contain both /ɸ/ and /f/ or both /β/ and /v/. (2 counterexamples in UPSID: Tarascan and Ewe.)

(iv) A language does not include a dental stop, fricative, nasal or lateral and an alveolar stop, fricative, nasal or lateral of the same type. (There are 22 exceptions)

Hypothesis

If we can explain why certain kinds of segments never occur together in an inventory on the grounds that the distinctions between them are not salient enough, perhaps the favoring of certain segments can be explained on the grounds that they are the most salient, and an appropriate selection of such sounds maintains generous phonetic distance between the segments of the language involved.

1.6 Compensation in inventory structure

Hypothesis

Martinet (1955) suggests that a historical change which simplifies an inventory in one area is counterbalanced by a compensating elaboration elsewhere.

1.7 Segments and suprasegmentals

Hypothesis

Firchow and Firchow (1969) remarked that “as the Rotokas segmental phonemes are simple, the suprasegmentals are complicated.”

Test

Examine the languages in UPSID which have fewer than 20 or more than 45 segments to determine whether the former have obviously more complex patterns of stress and tone than the latter.

Conclusion

Suprasegmental features tend to be more elaborate in the languages with larger inventories.

1.8 Segment inventories and syllable inventories

Hypothesis

Languages might have approximately equal numbers of syllables even though they differ substantially in the number of segments.

Test

Calculate the number of possible syllables in 9 languages.

Syllable inventory size of 9 selected languages

Language	Total possible syllables
Hawaiian	162
Rotokas	350
Yoruba	582
Tsou	968
Ga	2331
Cantonese	3456
Quechua	4068
Vietnamese	14430
Thai	23638
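
The spread in the table above can be quantified by comparing the largest and smallest syllable inventories. The sketch below uses only the nine totals listed in the table; the variable names are illustrative.

```python
# Total possible syllables per language, copied from the table above.
syllable_counts = {
    "Hawaiian": 162,
    "Rotokas": 350,
    "Yoruba": 582,
    "Tsou": 968,
    "Ga": 2331,
    "Cantonese": 3456,
    "Quechua": 4068,
    "Vietnamese": 14430,
    "Thai": 23638,
}

# Languages with the fewest and the most possible syllables.
smallest = min(syllable_counts, key=syllable_counts.get)
largest = max(syllable_counts, key=syllable_counts.get)

# How many times larger the biggest syllable inventory is than the smallest.
spread = syllable_counts[largest] / syllable_counts[smallest]

print(f"{largest} / {smallest} = {spread:.1f}")
```

The largest total is roughly 146 times the smallest, so these nine languages do not have approximately equal numbers of syllables.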

Conclusion

Syllable inventory size does not depend heavily on segment inventory size.