Text Mining
This specialisation aims to deepen students' understanding of textual data analysis. Textual data analysis is the process of acquiring information from semi-structured or unstructured data. The information is generally obtained by discovering patterns and trends, either through statistical pattern learning or through logic-based pattern recognition. Applications of text mining include machine translation, image captioning, opinion mining, information extraction, text segmentation, sentiment analysis, text summarisation, text clustering, text categorisation, hoax analysis and identification, spam analysis, automated question answering, set expansion, concept expansion, truth discovery, topic labelling, natural language sentence parsing, etc.
Specialised courses for the text mining specialisation include:
- IF1801013 - Text Mining and Natural Language Processing
This course discusses textual data processing on Web documents. Topics covered include Dimensionality Reduction, Use of Links and Structure, Relations and Passage Retrieval, Question Answering, Machine Learning and Text Classification, Partitional and Hierarchical Text Clustering, Sequence Labelling and Named Entity Recognition, and Information Extraction.
- IF1801023 - Text Retrieval
This course covers the methods and algorithms of modern text information retrieval, as well as the design and implementation of retrieval systems. Topics covered include the design and implementation of a retrieval system, textual analysis techniques, retrieval models (for example Boolean, vector space, probabilistic, and learning-based models), search evaluation, and retrieval feedback. The course is delivered through face-to-face meetings, discussions, and presentations in which individual students or groups solve a problem or case.
- IF1801033 - Machine Learning for Text
This course discusses machine learning methods for textual data. Machine learning is an application of artificial intelligence (AI) that gives systems the ability to learn automatically from experience without being explicitly programmed. It focuses on developing computer programs that can access data and use it to learn for themselves. Topics covered include ranking and machine learning for text information retrieval and Web search, statistical language modelling, kernel methods, the Word-Context Matrix Factorisation Model, word vector representation, and the Neural Language Model.
- IF1801043 - Semantic Analysis
This course discusses semantic analysis of textual data. Semantic analysis is the process of relating syntactic structures, from the level of phrases, clauses, sentences, and paragraphs up to the level of the text as a whole, to their language-independent meanings. Topics covered include language understanding, semantic analysis approaches and methods (predicate logic and statistical approaches), Logical Form Language, Logical Graph, Dependency Structure to Logical Forms for Semantic Parsing, Domain Ontology, and Resources for Semantic Analysis.
- IF1801053 - Text Processing on the Web
This course discusses textual data processing on Web documents. Topics covered include Dimensionality Reduction, Use of Links and Structure, Relations and Passage Retrieval, Question Answering, Machine Learning and Text Classification, Partitional and Hierarchical Text Clustering, Sequence Labelling and Named Entity Recognition, and Information Extraction.
- IF1801063 - Text Knowledge Representation
This course discusses methods for representing knowledge from textual data. The approaches used to represent knowledge usually involve artificial intelligence methods, and the knowledge represented usually comes from a specific or closed domain. Topics covered include methods for extracting relations and entities from text documents, representations with graphical models (discriminative and generative models), Distant Supervision for Relation Extraction, and Unsupervised Relation Extraction using Generative Models.
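As an illustration of one of the retrieval models named under Text Retrieval, the following is a minimal sketch of vector space retrieval: documents and a query are turned into TF-IDF weighted term vectors and ranked by cosine similarity. It is not taken from the course materials. The function names, the whitespace tokeniser, the smoothed IDF formula, and the three sample documents are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical tokeniser: lowercase and split on whitespace only.
    return text.lower().split()


def build_idf(docs_tokens):
    # Inverse document frequency with add-one smoothing (an assumed variant).
    n_docs = len(docs_tokens)
    doc_freq = Counter()
    for tokens in docs_tokens:
        doc_freq.update(set(tokens))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def tfidf_vector(tokens, idf):
    # Sparse vector as a dict; terms unseen in the collection are ignored.
    term_freq = Counter(tokens)
    return {t: f * idf[t] for t, f in term_freq.items() if t in idf}


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, documents):
    # Returns (score, document index) pairs, best match first;
    # ties are broken by document index.
    docs_tokens = [tokenize(d) for d in documents]
    idf = build_idf(docs_tokens)
    doc_vectors = [tfidf_vector(tokens, idf) for tokens in docs_tokens]
    query_vector = tfidf_vector(tokenize(query), idf)
    scores = [(cosine(query_vector, v), i) for i, v in enumerate(doc_vectors)]
    return sorted(scores, key=lambda s: (-s[0], s[1]))


# Made-up sample collection, for illustration only.
documents = [
    "spam filtering with text classification",
    "sentiment analysis of product reviews",
    "text retrieval with the vector space model",
]
ranking = rank("vector space retrieval", documents)
```

Here `ranking` lists the third document first, because it is the only one that shares terms with the query. A Boolean model would instead return the set of documents that satisfy the query expression without ranking them, while a probabilistic model would replace the cosine score with an estimate of relevance probability.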
Informatics students in the text mining specialisation are expected to be able to apply and analyse textual data mining methods, including machine translation, image captioning, opinion mining, information extraction, text segmentation, sentiment analysis, text summarisation, text clustering, text categorisation, hoax analysis and identification, spam analysis, automated question answering, set expansion, concept expansion, truth discovery, topic labelling, natural language sentence parsing, etc.
For a complete list of students in the text mining specialisation, see the List of Text Mining Students.
FACULTY OF MATH AND SCIENCE