27-29 November, Vilnius

Conference about Big Data, High Load, Data Science, Machine Learning & AI

Andrew Svetlov

Ocean SA, Ukraine

Biography

Andrew Svetlov specialises in network programming, the Python core, and data structures.

Talk

Fetching Data from the Web

The talk covers trivial but important steps in data pre-processing:
1. Build a crawler to fetch data from the Web.
2. Convert the binary data into a text representation.
3. Apply Unicode normalization and cleanup.
4. Strip meaningless characters.
Bonus: several data structures that every data scientist should know and use.
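
As a rough illustration of the four steps listed above, here is a minimal Python sketch using only the standard library. It is not taken from the talk: the function names (`fetch`, `decode`, `normalize`, `strip_meaningless`, `clean`) are hypothetical, the choice of NFKC normalization is an assumption, and "meaningless characters" is interpreted here as control and invisible format characters plus redundant whitespace.

```python
import re
import unicodedata
from typing import Optional, Tuple
from urllib.request import urlopen


def fetch(url: str, timeout: float = 10.0) -> Tuple[bytes, Optional[str]]:
    """Step 1: download raw bytes and the charset declared by the server, if any."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read(), resp.headers.get_content_charset()


def decode(raw: bytes, charset: Optional[str] = None) -> str:
    """Step 2: convert bytes to text (assumption: fall back to UTF-8 with replacement)."""
    try:
        return raw.decode(charset or "utf-8")
    except (LookupError, UnicodeDecodeError):
        # Unknown codec name or invalid bytes: keep going rather than fail.
        return raw.decode("utf-8", errors="replace")


def normalize(text: str) -> str:
    """Step 3: Unicode normalization (assumption: NFKC, which folds compatibility forms)."""
    return unicodedata.normalize("NFKC", text)


def strip_meaningless(text: str) -> str:
    """Step 4: drop control (Cc) and format (Cf) characters, then collapse whitespace."""
    kept = "".join(
        ch for ch in text
        if ch in "\n\t" or unicodedata.category(ch) not in ("Cc", "Cf")
    )
    return re.sub(r"\s+", " ", kept).strip()


def clean(raw: bytes, charset: Optional[str] = None) -> str:
    """Run steps 2 to 4 on bytes that have already been fetched."""
    return strip_meaningless(normalize(decode(raw, charset)))
```

A typical call would be `raw, charset = fetch(url)` followed by `clean(raw, charset)`. Whether NFKC or the less aggressive NFC is appropriate depends on the data: NFKC rewrites ligatures and full-width forms, which helps search and matching but loses the original glyph distinctions.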