We met for the first time on Monday. By the end of the week we had produced a beautiful presentation that sums up what we achieved in 5 days:
On Monday morning we learned that our data would be the Finnish newspaper corpus, which contains all the newspapers from 1820 to 1910. Tuula Pääkkönen from the National Library brought this treasure to us. Tuula, Timo Honkela and Eetu Mäkelä helped our group during the whole week. Student Maria von Hertzen also brainstormed with us on Monday. We suggested several research topics, and ‘socialism’ was chosen for further elaboration. In the end, we decided to study the key concepts of Finnish socialism.
We restricted our research to the newspapers from 1895 to 1910 since the first socialist newspaper was published in 1895. We wanted to create two data sets: 1) all the newspapers of 1895-1910 and 2) the socialist newspapers of 1895-1910. Elizaveta used previous research on the Finnish newspapers and made a list comprising 20 socialist newspapers of the time.
We started with a statistical keyword analysis. Our computer master Eric ran it quickly, and we could see in numbers the top list of words that are more frequent in the sample of socialist newspapers than in the big sample (frequency measured as a percentage of all the words in each sample). At the same time we got a negative keyword list, i.e. the words which are the least frequent in socialist newspapers compared to all the newspapers. We started to play with these two top lists in various ways, analysing why exactly these words stood out and visualising them. Eric generated word clouds automatically, in which the font size of each word corresponded to its relative frequency, for both the most frequent and the least frequent words.
We went deeper and studied some socialist key words chronologically. Eric produced graphs of year-by-year word frequency statistics.
Susanna made a wave picture: each wave was as wide as the relative frequency of the word it carried, and its width varied from one year to the next along the x-axis.
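For readers who want to try something similar, here is a minimal sketch of this kind of keyword comparison. It is not Eric's actual code: the function names, the smoothing constant and the toy word lists are our own assumptions for illustration, and a simple frequency ratio stands in for whatever keyness measure was really used.

```python
from collections import Counter

def relative_freqs(tokens):
    """Share of each word among all the words of a sample."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

def keyword_ratios(target_tokens, reference_tokens, smoothing=1e-6):
    """Ratio of relative frequencies, target sample vs. reference sample.

    The smoothing value is an assumption; it only avoids division by zero
    for words that are missing from one of the samples.
    """
    target = relative_freqs(target_tokens)
    reference = relative_freqs(reference_tokens)
    vocabulary = set(target) | set(reference)
    return {
        word: (target.get(word, 0.0) + smoothing)
        / (reference.get(word, 0.0) + smoothing)
        for word in vocabulary
    }

def top_keywords(ratios, n=10, negative=False):
    """Highest-ratio words, or lowest-ratio words for the negative list."""
    return sorted(ratios, key=ratios.get, reverse=not negative)[:n]

# Toy data, invented for illustration only (not from the real corpus).
socialist = "työväki sosialismi työväki kirkko".split()
all_papers = "työväki sosialismi työväki kirkko kirkko kirkko kuningas kirkko".split()

ratios = keyword_ratios(socialist, all_papers)
positive = top_keywords(ratios, n=2)
negative = top_keywords(ratios, n=1, negative=True)
print(positive, negative)
```

The same ratios could be used to set the font sizes in a word cloud. Note that this sketch compares exact word forms, so the many inflected forms of a Finnish word are counted separately.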
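A year-by-year statistic of this kind can be sketched roughly as follows. Again, this is only an illustration under our own assumptions: the function name and the `(year, tokens)` input format are hypothetical, and the toy documents are invented.

```python
from collections import Counter

def yearly_relative_frequency(docs, word):
    """Relative frequency of `word` per year.

    `docs` is an iterable of (year, tokens) pairs; this input format is
    an assumption made for the sketch.
    """
    hits = Counter()
    totals = Counter()
    for year, tokens in docs:
        totals[year] += len(tokens)
        hits[year] += sum(1 for token in tokens if token == word)
    return {year: hits[year] / totals[year] for year in sorted(totals) if totals[year]}

# Toy documents, invented for illustration only.
docs = [
    (1895, ["sosialismi", "työ", "työ", "työ"]),
    (1895, ["työ", "työ", "työ", "työ"]),
    (1905, ["sosialismi", "sosialismi", "työ", "työ"]),
]

series = yearly_relative_frequency(docs, "sosialismi")
print(series)
```

Dividing by the total number of words per year matters here, because the number of newspapers, and therefore the amount of text, is not the same from year to year.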
Risto and Elizaveta studied personal names often mentioned in the socialist newspapers; they also noticed differences in how specific persons were mentioned in socialist vs. non-socialist newspapers. They also found some interesting imperatives that the socialist newspapers were giving to the readers (“join!”, “avoid!”, “smoke!”).
Larisa looked closely at the text in the Korp concordance tool (a search program within Kielipankki, the Finnish ‘language bank’). In particular, she collected statistics on the use of the word sosialismi in various case forms and explored how and in what kind of context this word was used in socialist vs. non-socialist newspapers.
Almost every day we gave a presentation to the other groups and got many useful comments. Every day we also had ‘guests’ at some point. Normally hosts entertain their guests, but during the DHH week it was the guests who entertained us! They told and showed us fascinating things that can be done with the help of computer programs, things which we may someday be able to do ourselves! Almost at the very end of DHH, all of us in the four groups entertained each other and the guests by telling about our adventure and what we would do together if we had three more years 🙂 DHH culminated in a poster session with wine and informal interaction on Friday afternoon. The week was interesting and hard, and a great experience for us all. We are very grateful to the organizers, and in particular to Mikko Tolonen, who took good care of us all.
Cheers from Finland,
Larisa Leisiö, Elizaveta Arzamastseva, Risto Turunen, Eric Malmi & Susanna Ånäs.