Język wykładowy 
Angielski 
Semestr 
Zimowy 
Status 
W ofercie 
Opiekun 
Przemysław Uznański

Liczba godzin 
30 (wyk.)
30 (ćw.)

Rodzaj 
I2.Z  zastosowania inf. 
ECTS 
6

Polecany dla I roku 
No 
Egzamin 
Yes 
Tagi 
DS (Data Science)

Opis przedmiotu:
The course presents the mathematical and algorithmical ways of dealing with very large data sets. The need to process such large datasets is ubiquitous now, and arises in Internet search, machine learning, network traffic monitoring, scientific computing, signal processing, and other areas.
In such datasets, even linear complexity is simply not feasible. This invalidates many classical approaches to algorithm and data structure design, and creates the need to represent the data in a fashion that preserves some selected structural properties, but otherwise is highly compressed, and the need for new algorithmmic approaches.
__Example 1:__
Highdimensional bitvectors (think of: for each user of a social network, we store which items they liked/disliked). Surprisingly, instead of storing each bitvector explicitly (which would take gigabytes _per user_), we can design relevant sketches. Each sketch takes logarithmic space, and we can query bitvectors for similarity in logarithmic time!
__Example 2:__
Approximate matrix multiplication: we can multiply matrices much faster, if we are fine with output being a close approximation (in some sense) of the actual product.
__Note:__
This will be a rather *math heavy* course. However, the mathematical tools used will be rather basic (basic probability and linear algebra).
This course is should be interesting enough to be for PhD students as well.