Version of 2008-11-13


Grzegorz Jagodziński

Etymology of numerals

This article treats the etymology of Polish numerals against a Borean background.

Numerals are a conservative layer of vocabulary. Certain forms can persist in a language for thousands, or even tens of thousands, of years. It is therefore no surprise that, for example, the IE languages share what is basically a common set of numerals. However, it would be a simplification to assert that all the forms come from one source. On the other hand, some of them can even be compared to forms found in languages of other families.

In the article, reconstructed IE forms are marked with the asterisk (*), while forms which belong to protolanguages of particular groups (PG, PS) are marked with “†”.


Zero (0)

In Polish only one lexeme, zero, is in use. It is a word borrowed from Medieval Latin, where it had the form zephirum and meant ‘digit’. The words cyfra ‘digit’ and szyfr ‘code, cipher’ come from the same source, but through Germ. and Fr. respectively. The Lat. word is taken from Arab. ṣifr ‘null, emptiness, vacuum’.

When counting on the medieval abacus, counters called apices were used for the digits 1–9. An apex for ‘0’ was not in use, and the absence of a counter was called sipos. The name may be connected to the term for naught described above.

Jeden (1)

OPol. jedzin, jedzien (gen. jednego) had i originally. That vowel shortened to a mobile e due to frequency, and it had d (instead of ) which was taken from oblique cases. The older form without reduction developed into jedyny ‘the only one’ with the hard d under the influence of jeden (cf. also the substantive jedynka ‘the figure one’). The WSl. form was †jedinъ, while the ESl. one was †odinъ, cf. the borrowing odyniec (from Ukr.) with the original meaning ‘old male wild boar which lives alone’ (now just ‘male wild boar’).

The element jed- ~ od- seems to be the remnant of the old formant *ed with the meaning ‘only, just’. Perhaps it was the neuter form of the pronoun with the stem *e which is attested in Anatol. (Hitt. locative edi ‘in this’), in Lat. (ecce < *ed-ke ‘here’, mēd ‘me, acc.’ < *me-ed), in Gmc. (Germ. etwas ‘something’), in Skr. (asya ‘of this (genitive)’ < *e-sjo). The same formant can be found in Pol. ledwie ‘hardly, scarcely’, earlier only ledwo, which is an abbreviation of le jedwo (known in OPol.); jedwo contains the adverbial ending -o and replaces CSl. jedva < *ed-wōs, cf. Lith. võs ‘hardly’.

The element jed- ~ od- is connected to the numeral ‘one’ in many languages: Fin. yksi, Mansi ük < *ükte, perh. Selkup ukkyr, Sirenik atə́ʁəsə́ẋ, Greenl. atāsiq < *ataʁu-ci-, Arab. wāḥidun, ˀaḥadun, Hbr. ˀeḥāḏ, Chad. *ʔʷaḥad-, *ʔʷaḥid-. Cf. also Turkm. öjkün- ‘imitate’, Yakut ütügün- ‘t.s.’, OJap. ojazi < *əja(n)si ‘equal’, Georg. oden ‘yet, only’, Tamil utti ‘match, rival in game’, Telugu uddi ‘pair, equal’.


The particle inъ can also be seen in Pol. inny < iny ‘another’, OCS inorogъ ‘unicorn’. The vowel i is the result of development of the IE reduced *oi (perhaps through the stage of *ui), which was the most primitive form of the numeral ‘one’ that can be reconstructed. It may be preserved in putative Hittite *ās, Luw. a < *ojos, as well as in Gr. dial. íos ‘one’ in another apophonic shape (however, this word can be the result of analogy). In particular IE languages it was widened with various consonantal elements:

Two elements are seen in Pahlavi ēvak, perhaps also in Kurd., Pers. yak.

The stem *oi (*Hʷoi) might have expressed not ‘1’ but ‘2’ originally: *oinos, *oikos, *oiwos meant ‘this one from the two things’, and also ‘that one from the two things’, hence Pol. inny today. Cf. also possible *d(e)-Hʷoi > †dъvě > dwie ‘two’ (feminine).


Another IE stem with the meaning ‘1’ was *sem- ~ *sm̥-, cf. Gr. heĩs < *sems ‘one’ (m), mía < *smijə ‘one’ (f), hen < *sem ‘one’ (neutr.), haploũs < haplóos < *sm̥- ‘single, simple’, Myc. eme ‘1’; Lat. sem-el ‘once, one time’, sim-plex ‘single’, Lyc. sñta ‘1’, Toch. sas, Arm. mi < *smi ‘one’. It is also present in Pol. but in another apophonic shape, namely in sam ‘alone; -self; same’ < PS †samъ < *sōmos. The Slavic form probably preserved the original meaning of the stem. Yet another variant is present in Germanic, cf. Eng. same < *somos. In composite words *se- also occurs, cf. Skr. sa- (if not from *sm̥-, see ‘1000’), Gr. hekatón < *se-ḱm̥tom, literally ‘one hundred’.


Yet another old root with the meaning ‘1’ has perhaps been preserved in Greek, where it appears in the words monás ‘the number one’ and mónos ‘the only one, single’. Its cognates are thought to be the reconstructed Drav. *mēn̠i ‘body’, Jap. mono ‘thing, object’, Tung.-Manch. *mēn ‘oneself, own’, Mong. mön ‘he, the same’, Korean mom ‘body’, Kush. *mVn ‘one’. However, these Greek words also have an alternative etymology: they have been linked to the root *sem- ~ *sm̥-, e.g. monás < †sm-on-ad-s.


The similarity of the suggested IE *se- to Indonesian se- (< satu ‘one’, cf. se-ratus ‘100’, literally ‘one hundred’) is surely accidental, however see below.


The origin of ‘1’ in some languages is unclear, e.g. Hitt. ant, maybe from *anter- ~ *alter- ‘one of two’, cf. Lat. alter.


In Etruscan (considered the closest relative of Indo-European) ‘1’ was tu, tun, θu, θun. This name may be connected to the IE numeral ‘2’. Like the Romans, the Etruscans expressed the numeral ‘19’ by subtracting one from 20: θunem zaθrum.


The apex for ‘1’ was called igin. The name remains unexplained, although Akkadian or Hungarian origins are sometimes considered (in Akkadian, ‘1’ = išten; in modern Hungarian, ‘1’ = egy, itself unclear, perhaps from the pronoun *e ‘this’). In various languages we can find numerous unclear forms, like Khanty it (related to Hung.?), Nenets ŋobʔ.


When enumerating things, the form raz is used in Polish (†razъ ‘stroke, shock, blow, bump’ < *wrōǵ- ‘incision, notch’, cf. Lith. rúožas ‘bolt, stroke, line’, Gr. rhõks, rhõg- ‘notch, crevice’, Lesb. wrẽksis), instead of jeden raz.


The ordinal numeral pierwszy ‘first’ is exclusively Polish, formed as a comparative / superlative of the Slavic adjective †pŕ̥vъ ‘first, original’, cf. Russ. пе́рвый ‘first’, Pol. pierwotny ‘original, primitive’, pierworodny ‘first-born’, ‘original’ (sin), dopiero ‘only’ (referring to time), ‘not before’. The word seems to be based on the original IE stem *pirvo- < *perHʷ ~ *proHʷ (cf. Gr. prõtos < *proHʷto-; the circumflex intonation is forced by Greek stress laws even if there was a group with a laryngeal in this word). In other languages forms with *-omo- are in use. This suffix is used to form the superlative, cf. Gr. Hom. prómos < *pr̥Hʷomo-, Lat. prīmus, Lith. pìrmas, OE forma < *perH̥ʷmo-, perh. Arm. aṙaǰin ‘first’. Next, Gr. próteros ‘frontal’ has the comparative suffix *-ter-. Another apophonic degree can be observed in Pol. przód ‘front’ < †perdъ, przedni ‘frontal’ < †perdъnьjь < *perHʷ-. Skr. prathama may come from the form with inverted consonants *protHʷomo- < *proHʷtomo-, and it is originally a superlative (with the normal suffix -tama- < *-t-omo-). Modern Eng. first is also originally a superlative, with another suffix (OE fyrmest). An exact counterpart of the Slavic form seems to be Skr. pūrvas < *pr̥Hʷo- ‘first, oldest’; it is possible that a better reconstruction would be *pr̥Hʷ-wo- with an additional suffix. Cf. also Skr. purā < *pr̥Hʷ- ‘in the old times’, Pol. pra- < *proHʷ- (pradziadek ‘great grandfather’, pradawny ‘ancient, very old’, prajęzyk ‘protolanguage’).

Dwa (2)

Pol. dwa and dwie are well preserved dual forms (originally masculine and feminine, resp.) which are made from the root which is present in all IE languages: Skr. dvāu, dvē or duvāu, duvē, Av. duwa, Oss. dıwæ, Pers. do, Gr. Hom. Aeol. dýō < *duwō, Gr. dýo, Lat. duo (with secondary shortening due to frequency), also Gr. dṓdeka ‘12’ < *dwō-, Eng. two, Goth. twai < PG †twai, Hitt. dā-, Luw. duwa-, Toch. wu, Alb. dy, OIr. dá > dó, Welsh dau (in the Cardiganshire / Ceredigion dialect the form with a voiceless stop was recorded, perhaps it is the result of analogy to Germanic forms or to the next numeral târ ‘3’), Lith. dù < dvúo, PS †dъva, †dъvě < *d(u)wō(u), *d(u)wai. The hesitation *ōu ~ *ō in the masculine form is a trace of a laryngeal: *-ō(u) < *-oHʷ. The Polish collective form dwoje is also IE. Previously it was used for all the genders (OPol. dwój, dwoja, dwoje), cf. PS †dъvojъ ‘double’, Skr. dvaya-, Gr. doiós < †dwoios, Lith. dvejì. In oblique cases an additional -g- appears (dwojga, dwojgu, etc.). Its origin is unclear, see czworo.

The modern masculine-personal form dwaj contains an unclear -j. This element became a masculine-personal dual marker in some Slavic languages. In Pol. it survives only residually.

The original form of the IE numeral ‘2’ can be reconstructed as *du. The form *dwi- which was used in composite forms (cf. Skr. dvi-, Gr. di-, Lat. bi-) is not preserved in Slavic. Its relation to Gr. diá-, Aeol. ‘through; still; very’, Lat. dis- ‘dis-’, OS te-, ti-, OE te-, OHG zi-, ze-, Germ. zer-, Alb. tsh- is unclear.

The similarity between PIE *du and Indonesian dua ‘two’ is probably due to chance. On the other hand, Arm. erku, which looks completely dissimilar, is a regular IE form (in this language old *dw developed into rk).

Gr. deúteros ‘second, nearest, secondary’ is said to be related to deúomai ‘I am worse, weaker’. However, we cannot rule out its more distant relation to *de-wo-, the same root which is present in PIE *duwō(u) ‘2’; possibly it was the verb that was formed secondarily. From the same root *deu-, with another apophonic degree, Pol. dawno < *dōu-ino ‘in the old times, long ago’ developed.


Perhaps the notion of ‘two’ was originally expressed by the element *wo ~ *wi ~ *woi (*Hʷo ~ *Hʷi ~ *Hʷoi), which was then added to the root *d(e)- (originally meaning ‘1’) that can be seen in *deḱm̥t ‘ten’. An argument for this hypothesis is the old numeral ‘20’, which is not preserved in Slavic, cf. Skr. viṁśati, Av. visaiti, Arm. kʰsan, Gr. eikosi, Aeol. wīkatī, Lat. vīgintī with *Hʷī- (in PGr. it developed into †ewī-, and the vocalic prothesis e-, a trace of a laryngeal, was not present in some dialects). Cf. also Skr. vi- ‘apart’.


Another root is present in Pol. oba ‘both’ < *obhō(u), obydwa (composed of oba and dwa), oboje. Lith. forms are similar: m. abù, f abì (also abùdu, abìdvi), collective abejì, ãbejos. In Gr. Hom. we can find ámphō but in Class. only amphóteros. Other examples of this root are Lat. ambō, Toch. A āmpi, Toch. B āntpi, antapi, Skr. ubhā(u), ubhē, Goth. bai, ba, OE , Eng. both.

Fin. kumpi ‘which of the two’, kumpikin ‘both’ are probably connected to the IE form. Hung. mindkét ‘both’, lit. ‘every two’, resembles the compound BS forms in its structure (though not in its form).


Most probably there existed another IE root with the meaning ‘2’, which can be seen in Skr. yamá- ‘twin’, Av. yə̄ma-, OIr. emuin ‘twins’, emnatar ‘they double’. Because of Latv. jumis, juma, jume ‘two joint things’, Lat. geminus, -a, -um ‘twin, double’, Lith. kemerĩs ‘fruit or nut accreted of two’, the PIE form is hard to reconstruct (*jemH-, *jumH-, *gem-, *kem-).


In the history of the Polish word bliźniak, notable semantic changes have taken place. The PIE root *bhlēiǵ- ~ *bhlīǵ- meant ‘to beat, to cast’. In Slavic, an adjective based on this root, *blizъ ‘beaten’, acquired the meaning ‘compact, dense’, then ‘adjacent’, and finally ‘close’. The adjective *bližьnjь or *blizъnь comes from it (hence Polish bliźni ‘fellow creature’), originally a description of a neighbour, then of a relative. New words were formed from this adjective: bliźniec, then bliźnię, today rather bliźniak. According to the norm, the term bliźnięta denotes children born from the same delivery, not necessarily dwojaczki ‘twins’. The connection of bliźnięta with the idea of the number 2 weakened further when the term bliźni began to be applied to every person, in accordance with the rules of Christian ethics.


The Etruscan numeral ‘2’ was zal; we also know the form for ‘20’, zaθrum. The form eslem zaθrum meant ‘18’, literally ‘two less than twenty’ (notice zal : esl-). According to the newest hypothesis, śar or zar meant ‘12’ (and not ‘10’, as was supposed before). These forms show no connection to IE; śar may even mean 2 × 6.


The ordinal numeral drugi, as in numerous IE languages, is not related to the cardinal numeral. Its original meaning was ‘one of the squad’ (cf. the seemingly improbable etymology of cztery), later ‘another of many’, ‘the other of two’, and finally ‘the second’. The old meaning is preserved in the OPol. substantive drug, now druh (of Ukrainian origin) < *dhroughos. The same word is known from Lith., where draũgas means ‘husband; comrade’; cf. also Lith. draũg ‘together’, Goth. drauhtinon ‘to belong to one squad’, driugan (*dhreugh-) ‘to unite into a squad’, OE dryht ‘companion, comrade in arms’ (*dhrugh-). Lat. drungus ‘squad’ and OIr. drong ‘mob, squad’ are more distantly related.


The numeral drugi supplanted the OPol. and OCS form wtóry < vьtorъ (or vъtorъ), possibly from *witero-, cf. Skr. vitaras ‘leading farther / away’, vi- ‘apart’. However, if the form with ъ is more primitive, then †vъtorъ < †ъtorъ < †ǫtorъ < *antoro-, cf. Lith. añtaras, Skr. antaras, anyas, Eng. other, Goth. anþar ‘other’, cf. also Cz. úterý ‘Tuesday’. Now only the derivative wtórny ‘secondary’ is in use. Lat. alter, alius are probably based on another root (*al-), unless *alteros < *anteros, which is possible in a frequently used form; the Lat. interrogative uter, which is very similar to the Slavic form, is probably the result of an irregular development of *kʷoteros ‘which of two’ (cf. Pol. który ‘which’ < *kʷotoros).

The apex for ‘2’ was called andras. This word shows similarity to some IE ordinal numerals with the meaning ‘second’. A possible connection to Syriac təren is less convincing.


Yet another root was present in OE æfterra < *apo-ter-. The part *apo- expresses sequence in this word, just like Polish po or Latin post.


The Lat. form secundus (and the borrowed Eng. second) is originally a future participle (‘which is to happen’) from the verb sequor ‘I follow’. The same root can be seen in Eng. see < *sekʷ-, originally ‘I follow something with my eyes’.


The fractional numeral pół ‘half’ (and the secondary form połowa) is a u-stem substantive: †polus, cf. Alb. palë ‘side, party, part’. The root *pelə- ~ *polə- originally meant ‘divide in two’, cf. pleć < †pelti, now plewić ‘to weed’, originally ‘separate chaff from grains’.

As early as in PS other fractional numerals existed: półtora < †polъ vъtora ‘one and a half’, półtrzecia < †polъ tretьja ‘two and a half’, etc. Those forms, except półtora, are not in use now.

The PG form, still seen in Eng. half, probably shows metathesis and an irregular development of a laryngeal, which is sometimes found in Gmc.: †xalb- < *kolp- < (?) *polk- < (?) *polH-.

Uralic forms, like Fin. puoli ‘side, half’, Hung. fél ‘t.s.’, are genetically related to the IE form.


Another root can be seen in Lith. pùsė, pùs- (? < *pl̥H-s-) ‘1/2’. This form seems to be connected to Toch. poṣi ‘side, wall, rib’.

Trzy (3)

The numeral trzy is originally neuter (< PS †tri, cf. Hitt. tri-, Luw. *tarri-, Vedic Skr. trī, Lith. trỹs, OIr., Welsh tri, Gr., Lat. tria < *trij-H, Goth. þrija). In the masculine form a secondary j was added (now masculine-personal trzej). PS †trьje < *trej-es, cf. Skr. traya-, Av. þrayō, Kurd. , Pers. se, Toch. A tre, Toch. B trai, Lat. trēs, Gr. treĩs, Goth. þreis, Alb. tre, Oss. ærtæ with metathesis and prothesis, Arm. erekʰ ‘3’ and eṙapatik ‘triple’. The stem was *trej- ~ *tri- (apophony). Feminine forms contained the suffix *sor ~ *sr ‘woman, feminine being’, cf. Skr. tisras, OIr. teoir < *t(r)i-s(o)r-. They had already disappeared before the Proto-Slavic epoch.

Trzeci ‘third’ is probably based on the simple stem form (*tre-ti-), cf. OE þridda < *tret-jo-. In Gr. dialects, alongside trí-tos, the most primitive (?) tértos < *tr̥-to- occurred; similarly Lat. tertius < *tr̥-tio-.

The etymology of *trejes is unclear. It seems that the stem *tr(e)- has something in common with the morpheme -ter- ~ -tor-, which occurs in many words for one element of a pair, e.g. IE *pəter – *māter (‘father’ – ‘mother’), Pol. który < *kʷotoros ‘which’ (originally ‘which of two’), Skr. prataras ‘front’, literally ‘this one of two who is at the front’. So perhaps ‘three’ originally meant ‘a pair and one’, or ‘third’ meant ‘none of the pair’.

It has also been suggested that this numeral is connected to the stem *ter- meaning ‘border, end’, e.g. Skr. tarati ‘it penetrates’, tarman ‘point of a sacrificial pole’, Hitt. tarma- ‘peg, nail’, Gr. térthron ‘end, peak’, térma ‘aim, final point’, térmōn ‘border’, Lat. termen, termo, terminus ‘landmark, border stone’, trans ‘beyond something’, Eng. through < OE þurh, Germ. durch, Skr. tiraś-ca, Pol. Tatry ‘Tatras’ from Illyrian Tr̥tri (stem with reduplication of the shape *tr̥-tr-), Gr. Tártaros ‘Tartarus, the end of life, the seat of the dead’. The semantic shift must have happened a long time ago, in the epoch when only ‘one’, ‘two’ and ‘many’ were distinguished, and ‘three’ meant ‘beyond two’, ‘more than two’. This hypothesis is supported by Eng. throng (< IE *tr-onk-, cf. however OIr. drong ‘throng, squad’ < *dhr-) or Franconian throp ‘heap’, from which Fr. trop, Italian troppo ‘too, too many’, and also Lat. troppus, Eng. troop or Pol. trupa.

In Indonesian we find the form tiga; however, for the Austronesian protolanguage the form *telu is reconstructed. In other Austronesian languages the form teru occurs, which is even closer to IE (cf. also Sem. ṯalaṯ-). For comparison, Arm. er(e)kʰ, which looks completely different from the reconstructed IE form, developed from *trejes (-s is said to have developed into Arm. -kʰ, t is said to have disappeared in the consonant cluster, and e- is a prothesis added to make pronunciation simpler).


The Etruscan numeral ‘3’ was ci, ki. We also know cealχ ‘30’. Accordingly, ciem zaθrum meant ‘17’, i.e. ‘three less than twenty’.
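
The subtractive formations quoted so far (θunem zaθrum ‘19’, eslem zaθrum ‘18’, ciem zaθrum ‘17’) follow one arithmetic pattern, which can be illustrated with a minimal Python sketch. The unit values and the segmentation of each form into a unit stem plus -em are taken from the glosses given in this article; the dictionary and the function name are hypothetical helpers introduced only for illustration, and the sketch makes no claim about Etruscan grammar beyond these three forms.

```python
# Unit stems and their values as glossed in this article (illustrative only).
UNIT_STEMS = {"θun": 1, "esl": 2, "ci": 3}
TWENTY = "zaθrum"  # glossed as '20' in the article


def subtractive_value(phrase: str) -> int:
    """Return 20 minus the unit for a phrase such as 'ciem zaθrum'."""
    unit_word, base_word = phrase.split()
    if base_word != TWENTY or not unit_word.endswith("em"):
        raise ValueError(f"not a recognised subtractive form: {phrase!r}")
    stem = unit_word[:-2]  # strip the assumed -em element
    return 20 - UNIT_STEMS[stem]


for phrase in ("θunem zaθrum", "eslem zaθrum", "ciem zaθrum"):
    print(phrase, "=", subtractive_value(phrase))
```

The lookup table keeps the attested stems separate from the arithmetic, so any form not listed here raises an error rather than producing a guessed value.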


The apex for ‘3’ was called ormis. This name is obscure; compare, however, Hungarian három (with a Finno-Ugrian etymology: cf. Finnish kolme).

Cztery (4)

The Pol. forms czterej, cztery show irregular loss of the vowel due to frequency, and depalatalization (hardening) of r: OPol. had the forms czterzy, czetyrzy. The CSl. forms (masculine and neuter) were †četyre, †četyri. The numeral ‘four’ is an old compound whose structure is not sufficiently clear; the IE counterparts of the PS forms could have been *kʷetūres, *kʷetūrī. Besides the stem form *kʷetūr-, other apophonic variants also existed:

The collective numeral czworo (OPol. czwór, czwora, czworo ‘quadruple’, also czwiór, czetwiór) comes from †četverъ (cf. Lith. ketverì). It shows irregular hardening of -w- due to frequency; however, PS also knew †četvorъ. The formant -er- ~ -or- forms collective numerals starting from czworo; perhaps it was abstracted from this numeral and transferred to the others (pięcioro, sześcioro, …). In the oblique cases the stem extension -g- appears; originally it probably formed another type of collective numerals. Lith. ketvérgis ‘four years old’ is not the strict counterpart of the Slavic form †četverъgo, as the latter contains a yer after r. One may consider a possible connection of this stem extension with the genitive ending of pronouns of the type jego, and further with the postulated IE emphatic particle *ghe ~ *gho (cf. Pol. że, Lat. mihi ‘me’, Skr. aham ‘I’) or *ge ~ *go (Goth. mik < *mege ‘me’, Gr. égōge, maybe also Gr. égōge, egṓ, Lat. ego ‘I’).

The stem *kʷet- had probably the original meaning ‘a pair (of people)’, the element *-wo- < *Hʷo would express the ‘duality’ of the pair in this case. OCS has preserved the word četa ‘troop, squad, mob’, cf. Lat. caterva (? < queterva), OIr. cethern with the same meaning (perhaps ‘mob’ < ‘group’, ‘squad’ < ‘collective’ < ‘two people together’). The possibility to connect the word meaning ‘squad’ with the numeral ‘2’ is confirmed by the doubtless etymology of the ordinal numeral drugi, a word of much more modern origin.

Longer forms were changed irregularly, shortened and reduced due to their frequency, e.g. Gr. trápeza instead of †tetrápeza ‘a table (standing on four legs)’ or Skr. turīyas instead of †caturīyas ‘fourth’, Pol. cztery, czworo, czwarty instead of **czetery, **czetwioro, **czetwarty. In Germanic also *kʷ > *p > f irregularly, and OE fēower ‘four’, fēorþa ‘fourth’ with additional loss of d, cf. Goth. fidwōr. Other irregularities: Alb. katër without the expected palatalization, Toch. śtwar, Arm. čʰorkʰ, čʰors ‘4’ but kʰaṙasun ‘40’. Lat. and Gr. forms have irregular gemination of t. The ordinal numerals Lat. quārtus, Skr. caturtha-, and Lith. ketvérgis show traces of a laryngeal (in Lat. also irregular loss of *-tw-).

The original IE form may have been the compound *kʷet-twoHr-. Etruscan huθ may have been connected to the first part, *kʷet- (cf. Gr. Hyttenía = Tetráptolis, Four Cities). In that language a duodecimal system was probably in use; hence huθzar is now interpreted as ‘16’, literally ‘four twelve’.

The second part, *tworH- (perhaps continued in Skr. turīyas ‘fourth’), seems to have connections to Altaic forms: Chuv. tăvat, Yakut tüört, Turk. dört, OMong. dörben ‘4’, döčin ‘40’, Manchu dujn ‘4’, Jap. yo- < *də-.


Some old IE languages, especially Anatolian, have preserved other forms with the meaning ‘4’, cf. Hitt. meiu-, Luw. mawi- (Lyc., however, had teteri, perhaps borrowed from Gr.). A hypothesis connecting this form with Etruscan maχ ‘5’ suggests itself, but the details are unclear.


The dual form of the numeral *oḱtō(u) ‘eight’ argues for the existence of another IE word with the meaning ‘4’ (cf. also the postulated *oketā, *okʷetā ‘four-toothed harrow’). This hypothesis is supported by the South Caucasian numeral ‘four’ (Georg. otxi, Svan vōštxv), which looks like either a borrowing from IE, or a word inherited from a common, very distant protolanguage. Avar ašti ‘width equal to four fingers’ may also be an early Indo-Iranian borrowing.


Probably no traces in IE can be found for the form which is seen in Fin. neljä ‘4’, Khanty ńĕlə, Hung. négy < *ńeljä, Tamil nāl. A further related form is Sin.-Cauc. *-V́nŁe seen in Burush. wálto, Kachin məli1, Burm. lijh, Chin. sì < *slhijs, Tibetan bźi, Basque lau. Related forms may also denote ‘2’ or ‘8’.


The apex for ‘4’ was called arbas. Its name has an obvious Semitic etymology: Arab. arbaˁu.


Although Indonesian empat seems to be very distant from the IE forms, we can see similarities between PIE *kʷetw- and Proto-Austronesian *xepate (if we compare kʷe : x(e)pa).

Pięć (5)

PS †pętь, originally a numeral substantive *penkʷtis (Skr. paŋktiṣ ‘the number five’) from the proper numeral *penkʷe. Slavic languages have preserved only the proper numerals 1–4, cf. Lith. penkì (with another suffix), Gr. pénte, dial. pémpe, Skr. pañca, Av. panča, Oss. fondz, Kurd. pēnj, Pers. panj, Toch. A päñ, Toch. B piś, Alb. pesë, Arm. hing < *penkʷe. Arm. yisun ‘50’ is unclear. Lat. quīnque ‘5’ as well as Celtic forms (Welsh pump, OIr. cóic) show assimilation or come from the form *kʷenkʷe, which was older than *penkʷe. Another type of assimilation can be observed in Germanic, for example Goth. fimf < *pempe, and similarly in Osk. pompe; Gr. pémptos ‘5th’ < *penkʷtos would likewise be irregular, because *kʷ before a consonant developed into k in this language under normal conditions. In the collective Polish form pięcioro the formant -er- is present. If it was transferred from czworo, this must have happened as early as in Balto-Slavic, cf. Lith. penkerì.

The numeral pięć is connected to the substantive pięść < †pęstь, cf. Germ. Faust, Eng. fist < †funxsti- < *pn̥kʷ-sti- (originally ‘hand’; the Slavic form can, but need not, come from the root with full vocalism), cf. also Eng. finger < *pn̥kʷ-r-. From the same stem, piądź, piędź < *penkʷ-dhi- ‘span, inch of ground’ seems to originate, or we may have the related stem *pendh- here. Connections with Gr. pygmḗ and Lat. pugnus ‘fist’ (*pug- < (?) *pogʷ-) would also be possible, at least in the distant past.

An interesting problem is posed by Lith. kùmštis, Pruss. kuntis ‘fist’ < *kumpstis < *punkstis, *kunkstis (metathesis or dissimilation) < *pn̥kʷ-sti- or *kʷn̥kʷ-sti-. However, we can also see a further connection to Latv. kàmpt ‘grab, catch’, and yet further to Lat. capere ‘catch’ and PG †xabē- (cf. Eng. have). Perhaps the same stem, but with irregular phonetic changes, is present in Lat. habēre ‘have’ < *ghəbh- ~ *kəp-, cf. also modern Pol. nagabywać ‘to ply, to molest, to importune’ and OPol. gabać ‘to attack’, Lith. góbti ‘to take possession of sth.’ < *ghābh- (or *ghab-). Yet another form is present in Eng. keep. An obstacle to reconstructing the Proto-IE stems of the different words meaning ‘5’, ‘hand’, ‘catch’, ‘take’ and ‘have’ is the difference between the velars kʷ ~ k (gh). We must not forget, however, that we can speak only of a very distant relationship, and that over thousands of years many irregular changes may have occurred.


Etruscan maχ has a completely different origin from the IE word (because of the similarity of this form to Luw. mawi-, it was thought to mean ‘4’, but this hypothesis has not been proved). We also know muvalχ ‘50’.


The apex for ‘5’ was called quimas. Its name shows similarity to Semitic forms (maybe influenced by Latin quinque).

Sześć (6)

PS †šestь was originally a numeral substantive. Despite OIr. , Lat. sex or Gmc. forms (Goth. sáihs, Eng. six), the initial part of the numeral was not s alone, but a consonantal cluster, probably *ksw-, cf. Av. xšvaš ‘6’, xštva- ‘sixth’ (with reduction and transposition of consonants), Oss. æxsæz ‘6’. The PS form, Lith. šešì (with another suffix), Kurd. šaš, Pers. šeš also indirectly point to PIE *ks-. The Skr. form ṣaṣ is also exceptional: in this language the cerebral ṣ can normally appear only after i, u, r, k.

The final cluster was probably -ks; the aspiration in Skr. ṣaṣṭha- ‘sixth’ remains unclear. The cardinal numeral can be reconstructed as *ksweks or *ksweḱs, and its further etymology belongs to a circle of hypotheses with little probability (e.g. connection to the postulated *ks ~ *kos ~ *kes ‘three’ or to the stem *skh(e)id- ‘split, cut’, cf. PS †šьstъ ‘perch, log, bough, branch’, Gr. skhízō < *skhid-jō, Lat. scindō ‘split’, Skr. chinatti ‘he cuts off’).

On the presence of *w, see Gr. Myc. we-, weks, Dor. weks (Class. heks can also come from PGr. †sweks), Gaul suex, Welsh chwech, Arm. vecʰ ‘6’, vatʰsun ‘60’, Pruss. uschts ‘sixth’. Alb. gjashtë is the strict counterpart of the Slavic form; the initial gj- may be the result of the development of a consonantal group. Saka ksäta ‘6’ and Gr. kséstriks ‘a six-row barley’ prove the presence of k-. Toch. B skas may also contain sk- < *ksw- (cf. Toch. A säk).

A striking similarity to the IE form can be observed among the South Caucasian numerals for ‘6’ (Georg. ekvsi, Svan usgva), as well as among the Sem. ones (Hbr. šēš, Akkad. šediš, Arab. sittun < *šidθ-, cf. especially the postulated IE connection to the stem *skh(e)id-). A connection between IE *ksweks and FU ‘6’ is also probable: Hung. hat, Fin. kuusi < *kutte ~ *kūte (< *kukste ?).

Etruscan śa, sa may also be connected to the Indo-European form. The numeral ‘60’ was śealχ.


The apex for ‘6’ had an interesting name, caltis or calctis. We can see here a possible connection both to the above-mentioned Nostratic forms for ‘6’ and to the obscure Turkish form altı.

Siedem (7)

PS †sedmь (OPol. siedm; the fleeting e in modern siedem is secondarily inserted), originally a numeral substantive, cf. Alb. shtatë, Lith. septynì with various suffixes. The proper numeral was *septm̥, cf. Hitt. šipta-, Skr. sapta-, Av. hapta, Kurd., Pers. haft, Gr. heptá, Lat. septem, OIr. secht, Welsh saith, Gaul sextan, Arm. ewtʰn (now yotʰ), Toch. A spät, Toch. B sukt. The Gmc. form (Goth. sibun, Eng. seven) shows irregular loss of t. The irregular voicing in Slavic (d < t) under the influence of the following m has parallels in other languages, e.g. in Gr. hébdomos < *septomos ‘seventh’, Oss. avd ‘7’.

The IE, Sem. (Hbr. ševaˁ, Arab. sabˁun), South Caucasian (Georg. švidi, Svan išgvid) and FU forms (Hung. hét, Fin. seitsemän) may have a common origin (or a common source of borrowing). A connection to Etruscan semφ also seems obvious (cf. semφalχ ‘70’). The apex for ‘7’ was called zenis, an unclear form, but surely connected to the previous ones.

Turkic words for ‘7’ pose an interesting problem: Proto-Bulgar *ǯiati, Chuv. śičĕ, Yakut sette, Azer yeddi, Turk. yedi, with traces of gemination (*jetti < *šepti?), compared with other Altaic forms without such traces. Forms like Manchu nadan, Korean ilkop (MKorean nìr-kúp), Jap. nana- would correspond to Turk. *jeti, giving a basis for reconstructing PA *nadi. Mong. *dal- in turn would suggest earlier *ĺadi- (with metathesis). Because of the irregularities of the Altaic forms, they can all ultimately be related to the Indo-European, Finno-Ugric and Semitic ones.

Osiem (8)

PS †osmь (OPol. ośm; the fleeting e in modern osiem is secondarily inserted), originally a numeral substantive made by analogy to †sedmь. Alb. tetë contains another suffix and shows strong reduction, cf. also Lith. aštuonì. The proper numeral was *oḱtō(u) < *oḱtoHʷ, cf. Toch. okät, Skr. aṣṭāu, Av. ašta, Oss. ast, Goth. ahtau, Lat. octō, OIr. ocht, Welsh wyth, Cardiganshire nîch, assimilated to noch ‘9’, Kurd., Pers. hašt assimilated to haft ‘7’, also Gr. ógdoos ‘eighth’ < †ogdowos (with voicing like in hébdomos ‘seventh’); hence it had the form of a dual. The stem *oḱt must originally have meant ‘four’. The structure of this numeral is an argument that the ancestors of the Indo-Europeans counted in a four-based system.

Even though the forms of the numeral ‘8’ in satəm languages prove the presence of the palatal *ḱ, we find optṓ in Gr. (Eleian dialect) with p < *kʷ, alongside Class. Gr. oktṓ. Arm. utʰ also proves the presence of *kʷ. So *oḱtoHʷ might have been a secondary form (with dissimilation of two labialized consonants), used together with the more primitive *okʷtoHʷ. If this was really so, the stem *okʷt- (cf. the postulated *okʷetā ‘four-toothed harrow’) may contain the same element *kʷt which is seen in *kʷet ‘pair’ (see four), preceded by *Hʷo- ‘two’. If the hypothesis is correct, *Hʷo-kʷtoHʷ is a dual of the expression ‘two pairs’. Evidence for the presence of a laryngeal in the initial part of the word may be the reconstructed Luw. haktau.

The form *okʷetā, *oḱetā, *oḱwetā ‘harrow’ is attested by Lat. occa < *otika < *okitā, Gr. oksínē < *oktinā, Welsh and Cornish ocet, Breton oguet, Germ. Egge < OHG egida, OE egede, egde, Lith. akė́čios, ekė́čios, Pruss. aketes ‘harrows’. There also exists a hypothesis connecting the *okʷ- ~ *oḱ- here with the stem *aḱ- meaning ‘sharp’.


Etruscan cezp has no connection to the IE form (cf. cezpalχ ‘80’ and Latin Cespius, Cispius ‘The Eighth Hill’).


The apex for ‘8’ was called temenias. Its name shows similarity to Semitic forms.

Dziewięć (9)

PS †devętь was originally a numeral substantive (cf. Alb. nëntë ‘9’ and Skr. navatiṣ ‘the number 9’). As early as in the Balto-Slavic period, d- replaced older n- because of the dissimilation d-n < n-n (cf. Lith. devynì), perhaps also under influence of the similar form of the next numeral †desętь. Some traces of the proper numeral (PS †devę, and also primitive †nevę with the stem †nevęt-) are preserved in the name of the plant dziewięćsił (Carlina or Inula, OPol. dziewięsił, Russ. devesíl, devjasíl, Cz. nevěsil), literally ‘nine powers’ or ‘nine forces’. The IE numeral can be reconstructed as *newm̥ or *newn̥, cf. Skr., Av. nava, Kurd. , Pers. nah, Lat. novem, Goth. niun, OIr. noi, Welsh naw, Toch. ñu. It can show its original meaning ‘new (over 8)’, which is another argument that the Proto-Indo-Europeans counted in a four-based system.

Gr. ennéa, Myc. enewo < *-newm̥ and Arm. inn, modern inə contain an additional initial element, perhaps *ed-, cf. Slavic jed-inъ. In that case the Balto-Slavic form could be a shortening of *ed-newn̥, cf. Gr. tra- instead of tetra-, or Skr. turīyas instead of †caturīyas (see four). However, if the stem of the numeral ‘nine’ was *H́newn̥, Gr. en- could be the result of the development of the palatalized laryngeal *H́-, and the etymology of this word would be unclear. The picture is further complicated by Gr. enenḗkonta ‘90’ (†enwen-ḗkonta instead of †ennea-ḗkonta); Homer knew ennḗkonta. The Anatolian evidence speaks against the supposition of a laryngeal (Luw. nu).

In the form of the Gr. ordinal numeral énatos < PGr. †enwatos (cf. Ionian eínatos), we can see the stem in another apophonic form: *-nwn̥-to-.

Etruscan nurφ may be related to the IE form.


The original meaning of the PIE term seems to be supported by Oss. farast, an innovation which means ‘next after eight’ (ast).


The apex for ‘9’ was called celentis (cf. Hung. kilenc).

Dziesięć (10)

PS †desętь was originally a numeral substantive, cf. Alb. dhjetë. However, a number of forms of the earlier numeral †desę have been preserved in its declension (the stem †desęt-; e.g. ORuss. singular accusative desja < †desę, plural nominative desęte < *deḱm̥t-es instead of the expected form ˚desęti). The IE numeral can be reconstructed as *deḱm̥, cf. Skr. daśa, Av. dasa, Oss. dæs, Kurd. da, Pers. dah, Arm. tasn, Goth. taíhun, Gr. deka, Lat. decem, Gaul. decam, OIr. deich, Welsh deg, Toch. śäk. The final t is probably secondary, e.g. Lith. dẽšimt < Old Lith. dešim-tis. In some languages forms without the final *m exist as well, e.g. Lat. decu-ria, Goth. fidwor tigjus ‘40’. There is an obvious connection to the numeral ‘100’; the element *de- probably had the meaning ‘whole’, ‘this’ or ‘one’. A connection between *ḱm̥t and PG †handus ‘hand’ (*ḱomt-u-) is also possible: in that case *deḱm̥t means ‘the whole of hands’, i.e. all ten fingers.

The numerals 11–19 were built on Slavic ground after the archaic model jedenaście < jeden na dziesięcie ‘one on ten’, where dziesięcie is the old locative of dziesięć, reduced due to frequency. In the Middle Ages, -naście was indeclinable and only the first part was inflected, so there existed wtóry naście ‘twelfth’ (now dwunasty), jedno naście ‘eleven’ etc. Now the only trace of the declension of the first part is the alternation dwanaście : dwunastu. Other languages show traces of an analogous counting system, especially Welsh un ar ddeg (ar ‘on’), Alb. njëmbëdhjetë (mbë ‘on’). Somewhat more distant are Eng. eleven < †ajna-lif- ‘one left (over 10)’, Goth. áinlif, Lith. vienuolika < *-likʷ- ‘leave, stay’. Skr. ēkadaśa, Gr. héndeka, Lat. ūndecim, Goth. fidwōrtaíhun without the preposition are probably newer (?).
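The ‘unit on ten’ model can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the names UNITS, MODERN_TEENS and teen_phrase are hypothetical, the units are given in their modern Polish spelling rather than in attested Old Polish forms, and the modern contracted numerals are listed by hand because the reduction was not regular.

```python
# Illustrative sketch only: the names below are hypothetical helpers.
# Units 1-9 in modern Polish spelling (stand-ins for the older forms).
UNITS = {
    1: "jeden", 2: "dwa", 3: "trzy", 4: "cztery", 5: "pięć",
    6: "sześć", 7: "siedem", 8: "osiem", 9: "dziewięć",
}

# Modern contracted numerals, listed by hand: the reduction
# "na dziesięcie" > "-naście" also altered some of the unit stems.
MODERN_TEENS = {
    11: "jedenaście", 12: "dwanaście", 13: "trzynaście",
    14: "czternaście", 15: "piętnaście", 16: "szesnaście",
    17: "siedemnaście", 18: "osiemnaście", 19: "dziewiętnaście",
}


def teen_phrase(n):
    """Return the uncontracted 'unit on ten' phrase for 11 <= n <= 19."""
    if not 11 <= n <= 19:
        raise ValueError("the model covers 11-19 only")
    return f"{UNITS[n - 10]} na dziesięcie"


if __name__ == "__main__":
    for n in range(11, 20):
        print(n, teen_phrase(n), ">", MODERN_TEENS[n])
```

Every modern form in the table ends in -naście, the reduced remnant of na dziesięcie, while the first part sometimes differs from the free-standing unit (e.g. pięć but piętnaście).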

The numeral dwadzieścia ‘20’ (with the masculine dual ending) has replaced the older form dwadzieście from †dъva desętě, where the second part was a regular dual form of the consonantal stem †desęt- (OCS had dъva desęti, from the stem *-ti- of the old numeral substantive †desętь). The names of the next two tens, trzydzieści and czterdzieści, contain the plural form †desęti of the derivative stem (unlike OCS, where the plural of the consonantal stem, desęte, was in use). Finally, the numerals 50–90 contain the element -dziesiąt < *desętъ, originally the genitive plural of the primitive numeral with the stem †desęt-. In some Slavic languages forms with a different structure are in use, e.g. Russ. sórok ‘40’ (from a Turkic dialect, cf. modern Turk. kırk), devjanósto ‘90’ (probably an archaism: †devęnosъto < *newenəḱomtə ~ *newenāḱomtə, cf. Gr. enenḗkonta, Lat. nōnāgintā).
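The three-way distribution of the forms of ‘ten’ in the Polish names of tens can be summarized in a small Python sketch. The names ten_element and MODERN_TENS are hypothetical, and the grammatical labels simply restate the description above (dual for 20, plural for 30–40, genitive plural for 50–90); the sketch classifies the modern forms and is not a reconstruction procedure.

```python
# Illustrative sketch only: ten_element and MODERN_TENS are hypothetical names.
MODERN_TENS = {
    2: "dwadzieścia", 3: "trzydzieści", 4: "czterdzieści",
    5: "pięćdziesiąt", 6: "sześćdziesiąt", 7: "siedemdziesiąt",
    8: "osiemdziesiąt", 9: "dziewięćdziesiąt",
}


def ten_element(tens):
    """Return (final element, original grammatical form) for a tens digit 2-9."""
    if tens == 2:
        return ("-dzieścia", "dual")
    if tens in (3, 4):
        return ("-dzieści", "plural")
    if 5 <= tens <= 9:
        return ("-dziesiąt", "genitive plural")
    raise ValueError("tens digit must be between 2 and 9")


if __name__ == "__main__":
    for tens, word in MODERN_TENS.items():
        element, form = ten_element(tens)
        print(tens * 10, word, element, form)
```

The split follows the old syntax of counting: ‘two’ took the dual, ‘three’ and ‘four’ the plural, and the numerals from ‘five’ upwards the genitive plural of the counted noun, here the numeral ‘ten’ itself.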


For a long time the Etruscan numeral ‘10’ was thought to be śar. The most recent interpretation, however, takes this form to mean ‘12’, with halχ as ‘10’ instead. Neither of these forms has any connection to IE.


Hung. -van, -ven, Mansi -man, -mən, -pən, Komi, Udm. -mi̮n in the names of tens are related to Fin. moni < *mone ‘many’ (and likely further to IE forms like Pol. mnogi ‘numerous’ or Eng. many).

Sto (100)

The PS form †sъto had the hard yer ъ in place of a nasal sonant (as the result of irregular development due to frequency). The Slavic form †sъto < *suto < *sumto can be derived from *ḱomto-, while the Baltic form represents IE *ḱemto- (under the influence of ‘10’?). It was probably a substantive with the meaning ‘a hundred’, which had originated from ‘hands’, ‘many hands’, i.e. as many as there are fingers on many hands altogether. A direct connection with ‘10’ is less probable: *ḱm̥to- < *dḱm̥to-. The development of this numeral in various languages has served as the basis for dividing all IE languages into two groups: kentum (centum) and satəm. In the first group the soft *ḱ hardened and merged with *k; in the second *ḱ developed into an affricate, and finally often into a fricative. The Slavic (satəm) †sъto, Lith. šim̃tas, Alb. nji-qind, Skr. śatam or Av. satəm correspond to Gr. he-katón, Lat. centum (pronounced kentum in the Classical language), OIr. cét (pronounced két), Goth. hund. The initial elements like Alb. nji- or Gr. he- should be compared to Eng. a hundred ‘one hundred’ (instead of just hundred).


Arm. hariwr, now haryur, is unclear.

Tysiąc (1000)

The etymology of this word has not been explained completely. Even for the CSl. epoch we must assume the existence of at least two forms: †tysǫštjь and †tysęštjь. The Polish word represents the second one. The Gmc., Baltic and Slavic forms are based on the stem *tū- ‘strong, powerful’, cf. Pol. tyć ‘to grow stouter, fat’, tuczyć ‘to fatten’, Lith. tùkti ‘to fatten’, Skr. tavas ‘strength’. The most probable reconstruction is *tūs-dḱm̥t-i-, with the meaning ‘a powerful hundred’, cf. Pruss. tūsimtons (acc.pl.), Lith. tū́kstantis, Latv. tũkstuotis, Goth. þūshundī. The strange contrast Gmc. -s- : Slavic -s- (original -s- after *ū had to develop into †-x-) and Lith. -kst- can be explained as the result of the regular development of the rare consonantal cluster *tḱ. According to another hypothesis, PG *þūs-xundi and PIE *tū-s-ḱm̥t-i- are to be reconstructed, but such a suggestion does not explain the Baltic forms. Moreover, *sk should not have changed in Proto-Germanic, so this view is likely false.


The view, presented in the literature, that the IE language had no concept of ‘1000’ is wrong. However, more than one word with this meaning has been preserved (compare ‘1’ or ‘4’). The forms found in other languages are based on the stem *ǵhesl-: Skr. sahasra-, Av. hazaŋra, Arm. (from Pers.) hazar < *se-ǵhesl-o- (*se- had the meaning ‘one’, cf. Gr. he-katon), Gr. khīlioi < *ǵhesl-i-o- (cf. dial. khellioi), Lat. mīlle < *smī-ǵhsl-i (cf. *smī and Gr. mia < †smia < *smijə < *smiH ‘one (feminine)’).


In Toch. yet another word was in use: wälts < *weldhom ‘large number’.

Dziesięć tysięcy (10 000)

A simple word exists for this number in certain languages. Such forms as Arm. bewr or biwr, Gr. myriás belong here.