I have circulated this plain English description of the Pama-Nyungan (now Comparative Australian*) lexical database to various language centres in Australia, but I’m posting it here too in case it’s useful to others writing such descriptions, and in case others would like to know about the database in broad terms. I am in the process of writing a more detailed paper that describes the database.

Database description in plain English

I am making a database with words from many different Australian Aboriginal and Torres Strait Islander languages. I have put down here some information about what I am doing, why I am doing it, who is involved, some of the risks and benefits for a project like this, and the possible outcomes. This was mostly written for regional language centres but probably answers most of the questions that others would have too.


I am a linguist. Part of the work I do involves working with communities in Australia on documenting languages. I have worked in the Kimberley, Queensland and Arnhem Land, mostly with elders. I’ve done ‘theoretical’ linguistic work, but also learner’s guides, a dictionary (in progress), school materials, and a lot of oral history and place name work.

I am also interested in language history – that is, how words and languages change over time, and how we can use information from languages to find out about the history of people. We can look at similarities and differences between words to see how languages have changed, which people have been in contact with one another, and so on. People borrow words from one another and often we can tell that the word was borrowed. For example, we know that speakers of English borrowed the word kangaroo from Guugu Yimidhirr people from Cooktown in Queensland. We know this because English speakers didn’t have a word for kangaroo before they came to Australia, but the language records from Cooktown record this word.

I work at Yale University in the USA, but I’m from Canberra originally and I often go home to visit my family. In Arnhem Land, I’m Wamuttjan.

The Database

As part of this work on language history, I am making a database of lots of words from lots of different languages all over Australia.

What is going into the database? I am including published language materials, and other materials which are freely available (for example, through the archive at AIATSIS in Canberra). As part of doing this, I have been looking at old sources – that is, materials published in the 19th century – other published materials, and fieldnotes. We (my students and I) have been typing these materials into the computer and making a database program.

What type of words are in the database? It is mostly common words, parts of the body, and other items like this. Where I have been given electronic dictionaries, it includes everything in the dictionary. I am not including any secret words or words that refer to men’s business or women’s business. Some words may be tabooed in some languages but not others. It’s also possible that some words are secret in some languages but not in others. We have not knowingly included such words.

Some of the words are marked as restricted: these are restricted because the person who gave me the information did not want it more widely circulated. Some people have let us use data on the condition that it is not given to others. That is fine with us. I have a list of languages that need to be restricted and when I export the database (that is, when I make a copy of it for others) I take out those items.
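
For readers who think in code, the export step described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual database program: the `Entry` class, the `export_entries` function and the placeholder language names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Entry:
    """One word record (hypothetical, simplified layout)."""
    language: str
    form: str
    gloss: str


def export_entries(entries: Iterable[Entry],
                   restricted_languages: Iterable[str]) -> List[Entry]:
    """Return a copy of the entries with every restricted language left out."""
    restricted = set(restricted_languages)
    return [e for e in entries if e.language not in restricted]


# Placeholder data only; these are not real database records.
entries = [
    Entry("Language A", "form-1", "hand"),
    Entry("Language B", "form-2", "hand"),
]

# "Language B" stands in for a language whose data must not be passed on.
exported = export_entries(entries, ["Language B"])
```

The point of keeping a list of restricted languages, rather than deleting their words, is that the full database stays intact while every copy made for others leaves those items out.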

Who has access to the database? Currently, the only people who have access to the database are me, my research students, and a few of my colleagues. I have no plans to make the whole database freely available. I would eventually like to publish reconstructions of the history of some of the words, and that will involve quoting from individual languages, but I will not quote anything which I have been asked not to. I can send copies of drafts to interested people if they would like to see what I plan to include.

How will this database be useful to Aboriginal people?

There are a couple of ways that this project might be useful to you. If you are working on reviving your language and are looking for materials, there might be some words from your language in the database. If your language was mostly written down a while ago, sometimes it can be hard to know exactly how the word should be said. Sometimes we can work out what the right pronunciation is by looking at the words in other languages. The database is digital, which means it is easy to produce word lists for individual languages. My team and I can help in making language materials like wordlist books, which could be the start of a dictionary.

What does the database look like?

The database has a lot of information in it and it is a bit complicated, but we can break it down into some different sections:

There is a part that has language names, and some information about each language. This includes the different spellings of the language name, some information about where it’s spoken, and what other languages it is closely related to.

There is a part that has information about the sources, such as the books and articles and fieldnotes that the language information came from.

The main part of the database has the words in it. Each word is listed with its original spelling, a guide for how to say the word, what it means in English, what language and variety it comes from, and the source of the word (that is, the book, article or notes it came from and who wrote it down). There is also information about the grammar of the word, and information relating to the project, such as related words and the word’s meaning class (whether it is a fish term, or a body part, or something relating to the weather, and so on).

Finally, there is a set of reconstructions. These are guesses as to what earlier stages of the languages looked like. There’s also information about which languages have the same word for an item and which have different words (e.g. which languages use the word karli for boomerang).
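
The four parts described above could be pictured as four kinds of record. The sketch below is a simplified illustration only: every class and field name is an assumption made for this example, and the real database may be organised quite differently. The language names are placeholders, not real data.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Language:
    name: str
    alternate_spellings: List[str] = field(default_factory=list)
    region: str = ""
    related_languages: List[str] = field(default_factory=list)


@dataclass
class Source:
    title: str
    author: str
    kind: str  # for example "book", "article" or "fieldnotes"


@dataclass
class WordEntry:
    original_spelling: str
    pronunciation: str            # guide for how to say the word
    gloss: str                    # what it means in English
    language: str
    source: Source
    variety: Optional[str] = None
    grammar: str = ""
    meaning_class: str = ""       # e.g. "fish", "body part", "weather"
    related_words: List[str] = field(default_factory=list)


@dataclass
class Reconstruction:
    form: str                     # a guess at an earlier form
    gloss: str
    supporting_languages: List[str] = field(default_factory=list)


def languages_sharing_form(entries: List[WordEntry],
                           gloss: str, form: str) -> List[str]:
    """List the languages that use the given form for the given meaning."""
    return sorted({e.language for e in entries
                   if e.gloss == gloss and e.pronunciation == form})


# Placeholder records; the language names and source are invented.
notes = Source(title="Example wordlist", author="Unknown", kind="fieldnotes")
entries = [
    WordEntry("karli", "karli", "boomerang", "Language A", notes),
    WordEntry("karli", "karli", "boomerang", "Language B", notes),
    WordEntry("other-form", "other-form", "boomerang", "Language C", notes),
]

shared = languages_sharing_form(entries, "boomerang", "karli")
```

With records laid out like this, a question such as "which languages use karli for boomerang" becomes a simple filter over the word entries.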

Does this database have anything to do with Native Title?

No. I have never worked on a Native Title claim. This database may be useful for claimants in some areas, and I could make the information for claimants available on the condition that it not be held sub judice. The historical reconstruction aspects of the project are far further in the past than is relevant to Native Title claims.

