Advanced Programming Techniques

Information and Communication Technologies
5 ECTS; 2nd Year, 2nd Semester, 30.0 PL + 30.0 TP

- Ricardo Nuno Taborda Campos

C# Computer Skills
The courses "Programming and Algorithms" and "Programming Languages" are recommended.

Students should be able to design the data structures of a search engine, explore crawling tools, understand the different stages of natural language processing, and implement an inverted index, search (retrieval) models, and a Cranfield-style evaluation.

1. Information Retrieval and Search Engines
1.1. Objectives
1.2. Search Engines
1.3. Applications
1.4. Difficulties and Challenges
1.5. IR architecture

2. Crawling
2.1. Definition
2.2. Performance
2.3. Implementation

3. Text Processing
3.1. Sentence splitting
3.2. Tokenization
3.3. Part-of-speech tagging
3.4. Named entity recognition
3.5. Stopwords
3.6. Stemming

4. Text Representation
4.1. Types of evidence
4.2. Bag-of-words

5. Indexing
5.1. Inverted Files
5.2. Posting Lists

6. IR Models
6.1. Boolean
6.2. Vector Space Model
6.3. Other models

7. Evaluation
7.1. Relevance
7.2. Methods (lab, user-centered, online)
7.3. Cranfield
7.4. Metrics
7.5. Tests

Evaluation Methodology
- Midterm assessment: Midterm test (40%) + Project I (60%)
- Final assessment: Exam (40%) + Project I (60%)

- Liu, B. (2007). Web Data Mining. Ams: Springer.
- Croft, B., Metzler, D., and Strohman, T. Search Engines: Information Retrieval in Practice. Accessed 24 November 2015.
- Manning, C., Raghavan, P., and Schütze, H. An Introduction to Information Retrieval. Accessed 24 November 2015.
- Van Rijsbergen, C. Information Retrieval. Accessed 24 November 2015.

Method of interaction
- Theoretical-practical sessions: presentation of the topics under study using expository and demonstrative methods.
- Practical sessions: analysis and resolution of case studies.

Software used in class
Microsoft Visual Studio