Sample completed term paper in the subject: Languages (translation)
Contents
ABSTRACT
CONTENTS
INTRODUCTION
1 THE MAIN FEATURES OF CORPUS LINGUISTICS AND TEXT CORPORA
1.1 Approaches to and Definitions of Text Corpora and Corpus Linguistics
1.2 Types of Corpora
1.3 Corpus Software and Corpus Data Input
1.4 Architecture and Operation of the Global Web-Based English Corpus
1.5 Usage of British and American Vocabulary in GloWbE
2 ANALYSIS OF BRITISH AND AMERICAN LEXEMES INCLUDED IN THE STRUCTURE OF THE GLOBAL WEB-BASED ENGLISH CORPUS
CONCLUSION
THESES
REFERENCES
Excerpt from the text
This term paper deals with one of the most important challenges facing researchers of World Englishes: the question of collecting primary material from speakers of dialects. Potential data sources encompass collections of newspapers, blogs, emails, SMS texts, transcripts of recorded conversation, and fictional literature. Studies based on each of these data types have appeared in English World-Wide over the past several years. Another possibility is to use “structured corpora”. An important set of corpora for the study of World Englishes has been the extended Brown family of corpora, which includes the Brown Corpus of 1960s American English and parallel corpora for other varieties and time points, such as 1990s American English, 1960s and 1990s British English, as well as Australian English, New Zealand English and Indian English. Each of these individual corpora contains about one million words of text. However, the most widely used corpus for research on World Englishes may be the International Corpus of English (ICE) (Davies, 2013: 9).
The study of the Corpus of Global Web-Based English, or GloWbE, is topical because the corpus allows specialists to carry out comparisons between different varieties of English. Moreover, GloWbE provides data on differences between dialects of English in ways that are not possible with any other corpus. GloWbE is related to many other corpora of English that have already been created and that, in their turn, offer unparalleled insight into variation in English.
There were three goals in the creation of GloWbE: size, genre balance (including informal language), and accuracy in identifying the dialect that each text represents. In terms of size, the goal was to have a corpus large enough to permit research on a wide range of phenomena in World Englishes. To this end, there was really only one possible source for the texts: web pages. Virtually all corpora larger than about 500 million words are based largely (or exclusively) on web pages.
As for genres, the GloWbE corpus provides very interesting data on the distribution of constructions in blogs and other web pages from the different dialects. Written texts within the corpus were selected using three criteria: “domain”, “time” and “medium”. The first considers the content (the subject field) of the text; the second refers to the period when the text was produced; and the third, medium, considers the type of publication (books, periodicals or other).
The spoken data in GloWbE were collected on the basis of two attributes: “demographic” and “context-governed”. The demographic component is composed of informal encounters recorded by 124 volunteer respondents selected by age group, sex, social class and geographical region, while the context-governed component consists of more formal encounters, such as meetings, lectures and radio broadcasts, recorded in four broad context categories. The two components of spoken data complement each other, as many types of spoken text would not have been covered if demographic sampling techniques alone had been used in data collection.
As for balance, the aim is to develop a large, fully annotated reference corpus mirroring GloWbE in terms of genres and its coverage of written and spoken language. Running texts have been collected, and part of the data (30 million words) has been compiled into a balanced corpus.
Bearing all this in mind, it was decided to restrict the present work to those aspects that help define and understand the features of the GloWbE corpus.
The key goal of our research is to investigate the balance in the functioning of British and American lexemes within the GloWbE corpus.
To achieve this goal, the following objectives have been set:
- to consider the concept of a corpus and its main features;
- to analyse approaches to the classification of text corpora;
- to consider the GloWbE corpus;
- to compare the use of British and American lexical units within the GloWbE corpus;
- to analyse words with reference to their frequency of use in American and British blogs and to compare their meanings.
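Comparing how often a lexeme is used in American and British sources usually relies on normalised frequencies, because subcorpora differ in size. The following is a minimal sketch of that calculation, not the procedure used in this paper: the function names `per_million` and `compare_lexeme` are hypothetical, and the counts and subcorpus sizes are invented for illustration and are not taken from GloWbE.

```python
def per_million(raw_count: int, corpus_size: int) -> float:
    """Frequency of a lexeme per one million words of a (sub)corpus."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return raw_count * 1_000_000 / corpus_size


def compare_lexeme(counts: dict, sizes: dict) -> dict:
    """Normalise the raw counts of one lexeme for each variety."""
    return {variety: per_million(counts[variety], sizes[variety])
            for variety in counts}


# Invented figures for illustration only; they are NOT GloWbE data.
sizes = {"US": 2_000_000, "GB": 4_000_000}  # hypothetical subcorpus sizes (words)
counts = {"US": 50, "GB": 300}              # hypothetical raw hits for one lexeme

normalised = compare_lexeme(counts, sizes)
print(normalised)  # {'US': 25.0, 'GB': 75.0}
```

With these invented figures the lexeme is three times as frequent per million words in the British subcorpus, even though the British subcorpus is also twice as large; comparing raw counts alone would overstate the difference.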
This research paper uses literature review and analysis as its empirical and theoretical research methods, respectively.
This paper is structured as follows. After the introduction, Chapter 1 focuses on the theory of text corpora developed by different linguists and reviews their most important features and properties. Chapter 2, the empirical part of the research, deals with the analysis of British and American lexemes included in the GloWbE corpus, with reference to their frequency of use in American and British blogs and a comparison of their meanings. Once the analysis has been conducted, some conclusions are drawn.
References
1. Biber, D., Conrad, S., Reppen, R. (2004) Corpus Linguistics: Investigating Language Structure and Use. Cambridge: Cambridge University Press.
2. Crystal, D. (2003) English as a Global Language. Cambridge: Cambridge University Press.
3. Dash, N. S. (2010) Corpus Linguistics: A General Introduction. Mysore: CIIL.
4. Davies, M. Corpus of Global Web-Based English: 1.9 billion words from speakers in 20 countries. Available from http://corpus.byu.edu/glowbe/ [Accessed 16 November 2016].
5. Diemer, S., Brunner, M.-L., Collet, C., Schmidt, S. (forthcoming) CASE: Corpus of Academic Spoken English. Saarbrücken: Saarland University (coordination) / Sofia: St Kliment Ohridski University / Forlì: University of Bologna-Forlì / Santiago: University of Santiago de Compostela. Available from http://www.unisaarland.de/index.php?id=36728 [Accessed 18 November 2016].
6. Fillmore, C. J., Ruppenhofer, J., Baker, C. F. (2004) “FrameNet and Representing the Link between Semantic and Syntactic Relations”, in Computational Linguistics and Beyond. Taipei: Institute of Linguistics, Academia Sinica.
7. Firth, J. R. (1957) Papers in Linguistics 1934–1951. London: Oxford University Press.
8. Hunston, S. (2002) Corpora in Applied Linguistics. Cambridge: Cambridge University Press.
9. Kachru, B. B., Kachru, Y., Nelson, C. L. (eds.) (2006) The Handbook of World Englishes. Blackwell Publishing.
10. Kübler, S., Zinsmeister, H. (2015) Corpus Linguistics and Linguistically Annotated Corpora. Bloomsbury.
11. McEnery, T., Wilson, A. (2001) Corpus Linguistics: An Introduction. Edinburgh: Edinburgh University Press.
12. Meyer, C. F. (2004) English Corpus Linguistics: An Introduction. Cambridge: Cambridge University Press.
13. Leńko-Szymańska, A., Boulton, A. (2015) Multiple Affordances of Language Corpora for Data-driven Learning. Amsterdam: John Benjamins.
14. Lindquist, H., Mair, C. (eds.) (2004) Corpus Approaches to Grammaticalization in English. John Benjamins.
15. Sinclair, J. (2004) Trust the Text: Language, Corpus and Discourse. Routledge.
16. Smith, N. (2005) Language, Frogs and Savants: More Linguistic Problems, Puzzles and Polemics. Blackwell.
17. Tognini-Bonelli, E. (2001) Corpus Linguistics at Work. Amsterdam: John Benjamins.
18. Tottie, G. (1991) Negation in English Speech and Writing: A Study in Variation. London: Academic Press.