DHH 15 English language variation

INTRODUCTION to our research

“The deeper political integration of her kingdoms was a key policy of Queen Anne, the last Stuart monarch of England and Scotland and the first monarch of Great Britain.”
Wikipedia, “Parliament of Scotland”

How do historical events show up in personal letters? The VARIENG group at Helsinki University’s Digital Humanities Hackathon set out to look at whether and how the Act of Union between England and Scotland manifests itself in the Corpus of Early English Correspondence (CEEC). The Act, which was finalized and put into effect in 1707, turned the kingdoms of England and Scotland into one country. This felt like a relevant topic, since the future of the union has been on the line recently.

about the CORPUS

The CEEC (Corpus of Early English Correspondence) contains letters from 1400 to 1800. It was designed for sociolinguistic studies, so it has a reasonably good balance of different kinds of writers from different strata of English society.

RESULTS we attained

We found a rise in mentions of Scotland in the letters around the time of the Act of Union. (We first narrowed the period down to 1680–1800. Then, after some research by Olli, we relaxed our search criteria to include any word starting with “scot”, so as not to miss letters containing words related to Scotland. In addition, Olli put forward a research plan that would help us pinpoint what we are interested in more precisely in the future.)
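As a rough illustration of the relaxed search described above, here is a minimal Python sketch. It is not the code the group actually ran: the letter format (a list of year and text pairs), the function names and the sample letters are all made up for the example. It keeps letters dated 1680–1800 that contain any word starting with “scot” and counts them per decade.

```python
import re
from collections import Counter

# Any word beginning with "scot" (Scotland, Scottish, Scots, ...), ignoring case.
# Note that this simple prefix match can also catch unrelated words.
SCOT_PATTERN = re.compile(r"\bscot\w*", re.IGNORECASE)


def letters_mentioning_scot(letters, start=1680, end=1800):
    """Keep the (year, text) pairs dated within [start, end] that contain a 'scot' word."""
    return [
        (year, text)
        for year, text in letters
        if start <= year <= end and SCOT_PATTERN.search(text)
    ]


def count_by_decade(letters):
    """Count letters per decade, e.g. 1707 is counted under 1700."""
    counts = Counter((year // 10) * 10 for year, _ in letters)
    return dict(sorted(counts.items()))


# Invented sample data, only to show the shape of the input (not taken from the CEEC).
sample_letters = [
    (1675, "News from Scotland reached us late."),      # outside the period
    (1706, "The Scots commissioners are come to town."),
    (1707, "The union with SCOTLAND is now concluded."),
    (1707, "My mother is well and sends her love."),     # no 'scot' word
    (1712, "A Scottish gentleman dined with us."),
]

hits = letters_mentioning_scot(sample_letters)
decade_counts = count_by_decade(hits)
print(decade_counts)
```

Plotting counts like these against the total number of letters per decade would show whether a rise reflects more talk about Scotland or simply more surviving letters.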

All letters that contain the word Scotland/Scottish/Scot…

We also have an interactive Word Cloud over different time periods online, courtesy of Joe.

Go and check it out yourself!



Challenges we came across:

  • communication: every member of our team has a different background, so it took us some time to get used to each other’s “working corpus”.
  • collaboration: each of us has different research interests and his or her own repertoire.
  • tight schedule: we needed to come up with a feasible plan that could get us some results within a short time span.
  • how to process data: at first sight, having the corpus readily available seemed to be in our favor, but we found that getting the data into a form we could use was not such an easy task.


EXPERIENCES from our team members



I enrolled in the Hackathon thinking it would be mostly about generating ideas for future research. What I got was a lot more: we did throw around ideas, of course, but then we had to plan what to do with those ideas and how. The schedule was pretty tight but we were able to come up with a tentative research plan by Tuesday afternoon. Monday had been spent almost entirely getting to know each other, the data, and what abilities our group had that we could use to analyse the data.

Between Monday and Tuesday we brainstormed research topics and tried out some tentative ideas ourselves. I read randomly selected letters from different writer age groups to see if there would be something to study. The only variation in the letters was in the use of has/hath, and we decided not to focus on that, since VARIENG has already studied it and age did not seem to affect it much, at least within the small sample I had read. We decided that during the week we would focus on how Scotland was depicted in the letters around the time of the Act of Union. While I was close-reading letters that contained both of the keywords “England” and “Scotland”, the rest of our group focused on other things (which they will probably specify themselves).

I have always been an advocate of qualitative study, but having focused on that during the Hackathon week and seeing what type of results others have gotten, I have seen the undeniable advantages of quantitative analysis and methods. At least in a short span of time, a lot can be done with numbers, for example all kinds of plots, and through those, interesting things can be seen. Our plot-maker Jukka played around with the data and found, for instance, that the word “hospital” only began to appear in our data sometime after the year 1720.

I got to try a bit of coding as well. I have to say that at the beginning I had no “eye” for reading code (for example, I was completely lost as to what directory, or folder, I was in), but I would like to think that I can now at least see the logic in simple code. I really want to play around with it, for example by changing my “location” (which folder I’m in) back and forth, to get a feel for what I am doing, and how.

Besides the tight schedule, I think the most difficult (and at the same time rewarding) thing in DHH was the communication. I noticed how much I really use my discipline’s “lingo” and how that can hinder effective communication. Having to explain myself differently, and not understanding what the people on the computer science side were trying to say, I realised that more exchange needs to happen between humanists and coders. We need to be able to understand each other’s languages and world views, especially since it seems that digital and computing skills will become more and more necessary for all humanists in the future.



DHH was an exhilarating, if taxing, experience. Trial and error were the themes for our group, and for the first three days we mainly concentrated on throwing various things at our data to see what sticks and what doesn’t. For the most part, this was very rewarding. I also learned to like the chaos that is a necessary part of a project as short and ambitious as this. Above all, the hackathon taught me to better explain myself not only to computer scientists but also to humanists in fields other than mine, such as historians. DH is still a relatively young concept that has not yet fully split into sub-disciplines: digital history, digital linguistics and digital research on visual communication all shelter under the same umbrella. Thus, DH brings together people with very different persuasions. It is surprisingly rare to have this kind of collaboration even among humanists in today’s increasingly specialized academia.



I enrolled in the Hackathon to see how people from different backgrounds could collaborate on a project such as this in a relatively short amount of time. I was a bit skeptical as to whether it would work, given that much academic work is sequential in nature but the shortness of time meant that we needed to work in parallel. It also wasn’t clear to me how well collaboration would work without a common programming language. While I don’t think we completely overcame those issues, I think we did a lot more than I thought we would be able to do. I think our group really came together at the end, and I was impressed by the heroic efforts of members of the team to do so much so quickly. Overall, I think the Hackathon gave us a lot of ideas about how we can collaborate on such projects in the future.



I am a student in computer science. I took part in the Hackathon out of an interest in natural language processing. I was also very curious about where and how machine learning could be put to practical use, and even curious to see whether I myself could work well in a multilingual team with people from completely different backgrounds. In fact, I had no confidence at all, because recently I have focused mainly on theory and had never dipped my toes into anything concrete. To make matters worse, I am not experienced in coding. To my great luck, my team and our team leaders were very patient with me; they endured my incompetence and helped me a lot. It turned out that with a team whose members had completely different backgrounds, research interests, working languages and work ethics, collaboration was not an easy task. Although, in my view, working in parallel without a clear plan for allocating effort cost us some efficiency along the way, it was still a feat to be able to set a goal and achieve it in such a short time. Everyone on our team worked ardently. The whole Hackathon experience was an adventure for me. Everybody here is an expert in his or her own field, and our team leaders know how to get things done in this setting. I learned a lot from getting out of my comfort zone. I worked happily in my team, and had a good time talking with people from other teams at the end of the event. To tell the truth, I am planning to build up the skills that seem to be necessary for DHH and to enroll again next year. Hopefully, I can contribute more to my team next time. From my experience, it seems that Python, Matlab or R, and knowing how to process data are necessary skills for anyone who wants to take part in DHH as a prolific coder.



I joined DHH’15 for a new experience and to meet new people. The word that caught my attention was humanity. I am a Master’s student of Computer Science. I have to keep up with the pace of technological change. I am frustrated when I see a new technology hit the market and become obsolete in a few months. Whenever we get some time to think, we think about making money, new ideas that will help make more money, new career plans that bring more money and fame, position and so on. It never ends. Maybe this is just an illusion given by the system that we have unknowingly built over time. Maybe life is beyond money, fame and the machines surrounding us that always command us and never even say “please.” So I was looking for something that would get me out of this frustration. I found this event. I joined, and I must say life is beyond Python, R, Git, algorithms and some weird terms that we feel very proud to talk about because we are familiar with them. Life is simply not about those to me. Yes, we may need those, but we also need to have a conscience and a brave heart that can do justice, distinguish between justice and injustice, and stand up against injustice.

Humanists talk about those things. Therefore, that was reason enough for me to join DHH.

So I joined. I joined a group of people who were working on a data set (actually letters) about three centuries old and, to be more specific, on the event through which today’s UK was formed. The event was the “Act of Union” of 1707. Our task was to find out how that event influenced people and the English language.

There, I had the chance to widen my mind a bit when I worked with my group. Everything was new to me, including the terms (such as corpus and Act of Union): not the words themselves, but the context. I learned by listening to the discussions of our group members. Then I was amazed by gathering and playing with the information. I contributed a bit but received a lot as a gift from my group members.

Did I choose this group consciously? No, not at all. I joined two days late for a number of reasons. I was looking for a suitable team and found my current group. They were kind enough to share their hard work with me and make me feel like I was one of them. When I started, I felt like a newborn whose task was to learn “Mandarin Chinese and communicate with the world overnight.” But my group members helped me to adjust to the environment, and I managed to understand the goal and started working accordingly. I missed my first assignment deadline but eventually managed to do it on the presentation day.

Our achievements were tremendous: making the data machine-readable and processing it, close reading, statistical data analysis and developing apps, all within 5 days. Amazing!!!

But I must mention that we had great mentors who continuously helped us and did the hard work while staying in the shadows.

At this point I would conclude with the hope that we will meet again and continue this journey of humanity, which has been digitalized with the greatest technological inventions over time but has still remained organic. One of our mentors told me during our discussion that humans are afraid of change. Maybe I am also afraid of change. I never want to witness the day when the same event is named “Digital Mechatronity.”