Welsh Acquisition Database

    Database of the Welsh of Children 3-7 Years

    Lexicon

     Format of the lexicon

    A lexicon has been created which lists the word forms that the children use, together with their categories (parts of speech) and lexemes (lemmas, or dictionary entry forms). The conventions of CHILDES are used:

    "word-form" {[scat "category"]} "lexeme"
         
    afal {[scat en]} "afal"
    fale {[scat en]} "afal"
    wedi {[scat ag]} "wedi" \
    {[scat ar]} "wedi"

    In the case of nouns, number and gender are also indicated, as follows:

     
    "word-form" {[scat "category"] [rhif "number"] [cen "gender"]} "lexeme"
         
    afal {[scat en] [rhif un] [cen g]} "afal"
    fale {[scat en] [rhif ll]} "afal"

    Details of the coding are given under en in the description of the Categories below.
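    The entry format above can be read mechanically. The following is a minimal sketch, not the project's own tooling: it assumes that each entry starts with the word form, that every analysis has the shape {[feature value] ...} "lexeme", and that a trailing backslash continues the entry on the next line, as in the wedi example. The names parse_entry, ANALYSIS and FEATURE are hypothetical.

```python
import re

# Assumed shape of one analysis: {[scat en] [rhif un] [cen g]} "afal"
ANALYSIS = re.compile(r'\{(?P<features>[^}]*)\}\s*"(?P<lexeme>[^"]*)"')
# Assumed shape of one feature: [name value], e.g. [scat en]
FEATURE = re.compile(r'\[(\w+)\s+([^\]\s]+)\]')


def parse_entry(text):
    """Parse one lexicon entry into (word_form, list of analyses)."""
    # A trailing backslash is taken to continue the entry on the next line.
    joined = re.sub(r'\\\s*\n\s*', ' ', text.strip())
    word_form, _, rest = joined.partition(' ')
    analyses = []
    for match in ANALYSIS.finditer(rest):
        features = dict(FEATURE.findall(match.group('features')))
        analyses.append({'features': features, 'lexeme': match.group('lexeme')})
    return word_form, analyses


# The two-category entry for wedi from the examples above.
entry = 'wedi {[scat ag]} "wedi" \\\n    {[scat ar]} "wedi"'
word, analyses = parse_entry(entry)
print(word, [a['features']['scat'] for a in analyses])  # wedi ['ag', 'ar']
```

    Each analysis is kept as a separate dictionary so that a word form with more than one category, such as wedi, keeps all of its readings.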

    Categories in the lexicon

    The following categories and codes are used:

    ?? the category is unclear
    a1 pro-form place adjuncts like there, fama 'here', 'yonder'
    ab conjuncts and disjuncts like hefyd 'also', felly 'therefore'
    ad other adjuncts
    ag aspect markers yn 'progressive', wedi 'perfect'
    an adjectives
    ar prepositions
    as adverbs allan 'out', ymlaen 'onwards', i+ffwrdd 'away', i+lawr 'down', etc.
    at adverbs beginning with tu - tu+allan 'outside', tu+ol 'behind', etc.
    b4 Welsh finite verb with English inflection
    bd English verbs in -ed, -en or equivalent e.g. crashed, drunk
    be verbnoun forms (compare English plain infinitive) including auxiliaries but not bod 'be'
    bf finite-verb forms (including the imperative forms) except bod 'be'
    bf1 lexical finite verb (in the syntactic glosses in data files but not in the lexicon)
    bf2 auxiliary finite verb (in the syntactic glosses in data files but not in the lexicon)
    bf3 imperative finite verb (in the syntactic glosses in data files but not in the lexicon)
    bg English verbs in -ing
    bp English plain infinitive forms
    ca singing or song
    cd co-ordinating conjunctions
    ce verbnoun (compare English plain infinitive) of bod 'be'
    cf finite forms of bod 'be'
    cm mwy 'more' as a comparative particle before adjectives
    cn greetings and farewells
    cy subordinating conjunctions like achos 'because'
    d1 preverbal particle, tag: oni namely 't, 'n', yn', ynd, etc.
    d2 preverbal particle, positive: y1, fe1, mi1
    d3 preverbal particle, negative: nid namely d, t
    d4 preverbal particle, negative, in answers and tags: na
    d5 preverbal particle, negative, subordinate clauses: na5
    d6 preverbal particle, interrogative, subordinate clauses: a1
    eb standard exclamations like aa 'ah', oo 'oh'
    en nouns
    features on nouns are: rhif (number) = un(igol) (singular) or ll(uosog) (plural);
    cen(edl) (gender) = g(wrywaidd) (masculine) or b(enywaidd) (feminine) or gb
    er the post-modifying words arall 'other' and eraill 'others'
    es eisiau 'wants, needs' - a nominal form
    f1 answer word, positive: ie, ia, etc.; do
    f2 answer word, negative: nage, nace, naci etc.; naddo
    g1 nominal wh- words - beth 'what', pwy 'who'
    g2 adverbial wh- words - pryd 'when', pam 'why', sut 'how'
    g3 the wh- word pa 'which'
    g4 compounds involving wh- words like beth+bynnag 'whatever', pryd+bynnag 'whenever'
    g5 the wh- word faint 'how much/many'
    ga grammatically invariant answer words ie 'yes', nage 'no', do 'yes' and naddo 'no'
    gc the comparative particle na 'than'
    gd demonstrative words dyna 'there/that is', dyma 'here/this is', dacw 'yonder is'
    gg intensifiers like rhy 'too', go 'fairly', mor 'so'
    gm quantifiers like digon 'enough', llawer 'much/many', mwy 'more'
    gt the predicatival particle yn
    gy particle onid e: yntefe, tefe, and also 'de, 'te, ynte, etc.; the latter may be ynteu sometimes
    ll pro-form adjuncts yna 'there', yma 'here' and acw 'yonder'
    ly letters of the alphabet
    mo words indicating epistemic modality efallai 'perhaps', hwyrach 'perhaps'
    ne the negator dim 'no/not' both as quantifier and adverb
    on onomatopoeic words (but these should be analysed as nouns if they occur as nouns, e.g. 'beepbeep' = 'car')
    pa politeness expressions
    pe determiners
    pi forms of piau, used to indicate ownership
    qq for obscure forms
    r1 personal pronouns
    r2 demonstrative pronouns
    r3 indefinite pronouns like rhywun 'someone'
    r4 negative pronouns
    r5 reflexive pronouns
    r6 reciprocal pronouns
    r7 conjunctive pronouns like finnau 'me too'
    r8 prefixed (possessive) pronouns
    r9 the 'alternative' pronoun llall 'other', lleill 'others'
    rd rhaid 'must, necessity'
    ri numbers
    rp universal pronouns like pawb 'everyone'
    rq indefinite phrases like beth+'na 'thingie', lle+'na, be+ti'+'n+galw 'what do you call it'
    sg standard verbal pauses like ymm 'uhm'
    sy standard paralinguistic forms like hy+hy 'uh-uh', mm+mm 'uhm-uhm'
    ya manner-adverbial particle yn e.g. yn gyflym 'quickly'
    ys English word e.g. naughty
    z1 fronting particle, interrogative: efe
    z2 fronting particle, declarative: na3, mai, taw
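    The noun features listed under en can be expanded into readable labels. This is a small sketch under stated assumptions: the input is a feature dictionary such as {'scat': 'en', 'rhif': 'un', 'cen': 'g'}, and the names NUMBER, GENDER and describe_noun are hypothetical. The value gb is passed through unchanged because the description above does not gloss it.

```python
# Expansions taken from the description of the en category.
NUMBER = {'un': 'singular', 'll': 'plural'}
GENDER = {'g': 'masculine', 'b': 'feminine'}  # 'gb' is not glossed in the source


def describe_noun(features):
    """Return a readable description of a noun's feature dictionary."""
    if features.get('scat') != 'en':
        raise ValueError('not a noun entry')
    parts = ['noun']
    if 'rhif' in features:
        # Unknown codes are passed through unchanged rather than guessed.
        parts.append(NUMBER.get(features['rhif'], features['rhif']))
    if 'cen' in features:
        parts.append(GENDER.get(features['cen'], features['cen']))
    return ', '.join(parts)


print(describe_noun({'scat': 'en', 'rhif': 'un', 'cen': 'g'}))  # noun, singular, masculine
print(describe_noun({'scat': 'en', 'rhif': 'll'}))              # noun, plural
```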

    Forms not in the lexicon

    Nonsense words. Suffixed with @gl in the data files. Example: chic+chics+tics@gl
    Noises. Suffixed with @sn in the data files. Example: iii@sn
    English words. Single English words in Welsh sentences and in isolation are included. Strings of English words forming phrases or sentences are excluded. In the data files, they are enclosed in <...> followed by [% Saesneg]. Words from other languages are treated in the same way.
    Example: welish i <big christmas tree> [% Saesneg].
    Words in songs etc. In the data files, they are enclosed in <...> followed by [% ca:n].
    Example: <dau gi bach yn mynd i 'r coed> [% ca:n].
    Proper names. Begin with a capital letter in the data files.
    Unfinished words. Begin with & in the data files. Example: &ffl
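    The marking conventions above are enough to separate the tokens of an utterance that should be found in the main lexicon from those that should not. The sketch below is illustrative only: it assumes whitespace-separated tokens on a single utterance line, treats any token with an initial capital as a proper name, and accepts either <...> or ‹...› around bracketed stretches. The names BRACKETED and lexicon_candidates are hypothetical.

```python
import re

# Bracketed stretches: <...> [% Saesneg] (English) or <...> [% ca:n] (song).
BRACKETED = re.compile(r'[<‹][^<>‹›]*[>›]\s*\[%\s*(?:Saesneg|ca:n)\]')


def lexicon_candidates(utterance):
    """Return the tokens of an utterance that should be looked up in the main lexicon."""
    text = BRACKETED.sub(' ', utterance)
    kept = []
    for token in text.split():
        token = token.strip('.,?!')  # drop utterance punctuation
        if not token:
            continue
        if token.endswith('@gl') or token.endswith('@sn'):
            continue  # nonsense word or noise
        if token.startswith('&'):
            continue  # unfinished word
        if token[0].isupper():
            continue  # proper name (assumed: any initial capital)
        kept.append(token)
    return kept


print(lexicon_candidates('welish i <big christmas tree> [% Saesneg].'))  # ['welish', 'i']
```

    Single English words are deliberately not filtered here, since they are included in the lexicon.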

    Lexicon files

    • Main Lexicon
    • Nonsense Words
    • Sounds