With the development of the internet, the diversity and volume of data circulating on the web have grown considerably.
One of the priorities of modern analysis is to automate tedious processes, data collection in particular, including collection from websites. Freed of this work, the analyst can focus on the more creative and strategically important parts of the job: interpreting the data, drawing conclusions and making important business decisions on its basis.
Let us consider a few hypothetical scenarios involving data from the web.
1. You are interested in a certain currency exchange rate and its historical changes. Bank Y provides daily data in an Excel file which you can download; however, each such file sits on a separate page. To prepare the data for 2 years, you would have to visit more than 700 pages. For 10 years this means more than 3,500 clicks.
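Such a download reduces to a single loop once the file addresses follow a pattern. The sketch below is purely illustrative: the URL pattern and the site `bank-y.example` are hypothetical, as the real address scheme depends on Bank Y's website.

```python
from datetime import date, timedelta

def daily_urls(start, end, pattern):
    """Build one download URL per calendar day between start and end (inclusive)."""
    urls = []
    d = start
    while d <= end:
        urls.append(pattern.format(d=d))
        d += timedelta(days=1)
    return urls

# Hypothetical URL pattern -- the real one depends on how Bank Y names its files.
PATTERN = "https://bank-y.example/rates/{d:%Y-%m-%d}.xlsx"
urls = daily_urls(date(2022, 1, 1), date(2023, 12, 31), PATTERN)
print(len(urls))  # 730 daily files: one loop instead of 700+ manual visits

# Fetching them is then another short loop, e.g. with the requests library:
# import requests
# for url in urls:
#     response = requests.get(url, timeout=10)
#     ...  # save response.content to disk
```

The two years of daily files that would cost 700+ page visits by hand become a list comprehension's worth of work.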
2. For your social campaign, you plan to analyse the headlines of articles published over the last 2 years on major industry portals. Each headline must be selected with the mouse, copied and pasted into a local file.
3. You plan to present some data at an upcoming meeting to support your arguments for several decisions in the area in question. On one of the websites the data you are interested in is displayed as a table, but for some reason an attempt to copy it into Excel fails. Moreover, even if it succeeded, the table is split across more than 100 pages, so you would have to repeat the operation a hundred times, and time does not allow for this.
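The third scenario can likewise be automated: a program fetches each page of the table and extracts its rows. A minimal sketch, using only Python's standard-library HTML parser; the paginated URL `example.com/data?page=...` is a hypothetical placeholder for the real site.

```python
from html.parser import HTMLParser

class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""
    def __init__(self):
        super().__init__()
        self.rows, self._row, self._in_cell = [], None, False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th"):
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
        elif tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self._row.append(data.strip())

def parse_table(html):
    """Return the table rows found in an HTML fragment as lists of cell strings."""
    parser = TableParser()
    parser.feed(html)
    return parser.rows

# The 100+ pages would then be collected in one loop (URL is hypothetical), e.g.:
# import requests
# rows = []
# for page in range(1, 101):
#     html = requests.get(f"https://example.com/data?page={page}").text
#     rows += parse_table(html)[1:]  # skip the repeated header row

# Local demonstration on a small fragment:
html = ("<table><tr><th>Date</th><th>Rate</th></tr>"
        "<tr><td>2024-01-02</td><td>4.35</td></tr></table>")
print(parse_table(html))  # [['Date', 'Rate'], ['2024-01-02', '4.35']]
```

Copying a 100-page table by hand thus collapses into a loop that runs in seconds, and the result lands directly in a structure ready for Excel export or further analysis.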