Michael Granitzer (michael.granitzer@uni-passau.de)
License
This work is licensed under a Creative Commons Attribution 3.0 Unported License
We will cover the basics of scientific writing by trying to understand the reader and the reader's problems. The majority of the ideas here come from one great source, which I recommend to everyone:
Scientific Writing 2.0 - A Reader and Writer's Guide by Jean-Luc Lebrun
So go and grab the book. Take a look at his website.
There is also software accompanying that book called SWAN - the Scientific Writing Assistant. When writing papers/homework, use that software to improve your writing.
My lecture notes are just the kick-off to get you started. To really master the subject you need to work through Lebrun's book.
Knowing the reader is critical to writing a good paper.
But what are the skills/properties of a reader?
(Source Lebrun, http://www.scientific-writing.com/)
Short Term Memory
Your Goal:
According to [Lebrun 2007]:
<img src="media/img/webrat-granitzer.png">
Example - Pronouns
BPMN (Business Process Modelling Notation) [35] is a standardised graphical process notation that is experiencing a rapid adoption among BPM tools vendors. There is no need; however, to have a BPMN model of a conference review process to know that paper selection can start after all the reviews have been received. This is because the status of a process is inherent to the information associated to artefacts that are part of the process.
What are the problems? How do you resolve pronouns?
Example - Diverting Synonyms
"The angry dog hunted for the cat, but the cat escaped. The Canis Lupus Familiaris was happy, but the Felis Catus wasn't."
Who was happy? The dog or the cat?
"The angry dog rushed for the cat, but the cat escaped. The dog was happy to have had fun, but the cat was unhappy because it had been hunted."
Novels should spark the reader's imagination. Scientific writing must not.
Example - The Distant Background
WebRat - a new, interactive search engine interface - lets users perform web searches easily.
WebRat is defined immediately after it is named, so the background knowledge is immediately available to the reader.
But beware! Do not separate subject and verb too far from each other by inserting too much complex background knowledge.
Example - The Broken Couple
Experiment A shows that under the assumption of a uniform distribution of the input data and a normally distributed noise pattern (which is similar to cosmic noise when restricted to a certain bandwidth) no improvement can be achieved.
What's wrong?
Experiment A has been conducted under the assumption of a uniform distribution of the input data and a normally distributed noise pattern. Results show that no improvement could be achieved.
OK? What could we do better?
Experiment A assumed a uniform distribution of the input data and a normally distributed noise pattern. Results show no significant improvement.
Use strong verbs, i.e. verbs with high expressiveness; "is", "are" and their inflections are not strong verbs.
Always keep together the following couples:
Attention is reduced by
When you lose attention, your mind drifts off and reading becomes looking.
Keeping attention is hard, because it costs energy. Keeping attention vs. everyday activities is comparable to sprinting vs. strolling.
To keep attention high, the story must progress.
Most common errors that indicate non-progression:
These problems are often nested/combined.
Example of redundancy/non-progression
"Hence we propose a new algorithm that eliminates unwanted attributes in order to increase the categorization performance and to avoid the curse of dimensionality. The new algorithm of attribute selection eliminates attributes in order to increase the categorization performance and to avoid the curse of dimensionality. The new approach is based on the idea of transforming the value of the correlated attributes into new instance for the retained attributes. We aim to reduce the attributes space by performing attributes selection and increasing the learning space by creating new instances using the redundant attributes."
The second sentence repeats the first almost word for word ("eliminates attributes in order to increase the categorization performance and to avoid the curse of dimensionality"), and the last sentence restates the attribute selection once more. Readers get bored because the authors do not move the story forward. This is an extreme, but real-world example. Most often redundancy is less obvious, or, if you have gone through the text a thousand times, you simply do not recognize it any more.
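Because repetition is hard to spot in your own text, a small script can help as a first pass. The following is only a rough sketch, not how SWAN or any other tool works: the function names and the choice of five-word phrases are assumptions made for illustration. It flags pairs of sentences that share the same run of five words, applied here to the first two sentences of the example above.

```python
import re
from itertools import combinations


def sentences(text):
    """Split text into sentences on ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def ngrams(sentence, n):
    """Return the set of lower-cased word n-grams in a sentence."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def repeated_phrases(text, n=5):
    """Yield (i, j, shared) for sentence pairs that share word n-grams."""
    sents = sentences(text)
    grams = [ngrams(s, n) for s in sents]
    for i, j in combinations(range(len(sents)), 2):
        shared = grams[i] & grams[j]
        if shared:
            yield i, j, shared


# First two sentences of the redundant example paragraph.
paragraph = (
    "Hence we propose a new algorithm that eliminates unwanted attributes "
    "in order to increase the categorization performance and to avoid the "
    "curse of dimensionality. The new algorithm of attribute selection "
    "eliminates attributes in order to increase the categorization "
    "performance and to avoid the curse of dimensionality."
)

for i, j, shared in repeated_phrases(paragraph):
    print(f"Sentences {i + 1} and {j + 1} share {len(shared)} five-word phrases")
```

Such a check only catches literal repetition. Redundancy that is paraphrased still needs a careful human reader.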
"A web crawler gathers information from the Web based on the Web's Hyperlink structure. Web pages are mostly written in the HTML format. A web crawler consists of a frontier, a resolver, a fetcher and a extractor. The extractor extracts "a" elements from an HTML page for identifying links to other web pages. Elements define the structure and layout of a web page. The fetcher downloads Web pages, while the resolver resolves a link. Finally, the frontier maintains the set of unprocessed links"
Which problematic patterns do you detect?
"A web crawler gathers information from the Web based on the Web's Hyperlink structure. The Web's Hyperlink structure consists of Web pages written in HTML - the Hypertext Markup Language. HTML defines the hyperlink element 'a', which allows a page to include links pointing to other web pages.
A web crawler consists of the following components: a frontier, a resolver, a fetcher and an extractor. The frontier maintains the set of unprocessed links. For every unprocessed link the frontier calls the resolver. The resolver resolves a link and passes the resolved address to the fetcher. The fetcher downloads the Web page, which is then processed by the extractor. The extractor extracts all "a" elements - the hyperlink elements - and adds them to the frontier."
Not perfect, but better. Every paragraph covers one topic, which is defined with the first sentence. Every paragraph progresses by (i) deepening the explanation of the topic or (ii) by outlining a process.
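A text that outlines a process well can be translated into code almost sentence by sentence. As an illustration, here is a minimal sketch of the crawler process described in the rewritten paragraph. Everything beyond the four component names is an assumption: the FAKE_WEB dictionary stands in for the real Web so that the sketch runs without network access, and the function names are hypothetical.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical in-memory "Web" so the sketch runs without network access.
FAKE_WEB = {
    "http://example.org/": '<a href="a.html">A</a> <a href="/b.html">B</a>',
    "http://example.org/a.html": '<a href="/b.html">B</a>',
    "http://example.org/b.html": "<p>No links here.</p>",
}


class Extractor(HTMLParser):
    """Extractor: collects the href attribute of every "a" element."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def resolve(base_url, link):
    """Resolver: turn a possibly relative link into an absolute address."""
    return urljoin(base_url, link)


def fetch(url):
    """Fetcher: return the page content (here from FAKE_WEB, not the network)."""
    return FAKE_WEB.get(url, "")


def crawl(seed):
    """Process links from the frontier until no unprocessed link is left."""
    frontier = deque([(seed, seed)])  # unprocessed (base URL, link) pairs
    visited = []
    while frontier:
        base, link = frontier.popleft()
        url = resolve(base, link)
        if url in visited:
            continue  # skip pages that were already downloaded
        visited.append(url)
        extractor = Extractor()
        extractor.feed(fetch(url))
        # The extracted links go back into the frontier.
        frontier.extend((url, found) for found in extractor.links)
    return visited


print(crawl("http://example.org/"))
```

Note how the loop follows the order of the sentences: frontier, resolver, fetcher, extractor, and back to the frontier.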
Paragraphs form the core element for continuous reading. If you structure your paragraphs well, you have already got 50% of your article right.
Some rules:
Note: In a good paper, you just need to read the first line of every paragraph to get an intuitive idea of what the paper is about.
Example
Browsing large-scale document collections usually requires a structural organization form like topic hierarchies. Unsupervised machine learning techniques, foremost document clustering, overcome the labor intensive, manual creation of such topic hierarchies by automatic partitioning of unstructured document collections into browse-able cluster hierarchies. This cluster based browsing approach has been shown to successfully improve access to unstructured document collections.
Topic based Progression
Non Topic based Progression
Example for Progression around a constant topic
Information Retrieval through browsing remains a core concept for reading online content. Browsing large-scale document collections usually requires a structural organization form like topic hierarchies. In contrast, browsing hypermedia is based on the concept of hyperlinks. Both browsing types have their pros and cons.
The topic in the paragraph remains constant. Every sentence reveals more details about browsing.
Progression through partial aspects or subclasses of the main topic
Hypermedia consists of content and a markup language. The content is not limited to text alone, but can be any multimedia format. However, it does not define any additional information. The markup language, on the other hand, marks parts of the content in order to define their semantics, presentation or function.
The topic progresses like in a tree:
Example for Chain Progression
Information Retrieval can be divided into retrieving and browsing large document collections. Browsing large-scale document collections usually requires a structural organization form like topic hierarchies. Topic hierarchies can be either created manually or automatically through a clustering process. Clustering ....
The topic of the paragraph progresses like a chain. The ending of one sentence (the so-called stress or the new knowledge) becomes the beginning of the next sentence. This structure guarantees that you explain the unknown with the known and that there are no topic jumps.
Example for progression through explanation or illustration
Information Retrieval refers to the process of finding information in a large document collection based on a fuzzy, user-defined information need. The process starts by identifying a user's information need and translating it into a query understandable by the information retrieval system. Often not only the process itself is considered as information retrieval, but ....
We explain what information retrieval is, which becomes the theme of the paragraph. We could also use an example.
Information Retrieval refers to the process of finding information in a large document collection based on a fuzzy, user-defined information need. For example, a query sent to Google for finding web pages is such a retrieval process.
Example for logical sequential progression
Retrieving information from the Web takes three steps: First, an information need is entered by the user. Second, a query is created from the information need and interpreted by the retrieval system. Third, a matching and ranking mechanism fits queries to documents.
Note: First, Second etc. are strong indicators for a logical (or enumerated) progression. They clearly separate the individual steps.
Example for Progression through transition words or enumeration
In this paper we make the following contributions: (i) We show that scientific writing is easy. (ii) We give concrete examples of common mistakes and (iii) we illustrate remedies for improving writing skills.
Progression through enumeration is again a very strong, explicit way of formulating things. If overdone, it easily becomes boring. Similarly, transition words can strengthen transitions, but they risk suggesting a connection where there is none.
Jim and Bob played in the garden. In addition, cat Felix ran over the street.
If you write everything in the same way, readers will lose attention due to monotonous input. Compare this with a speaker who never varies his voice while talking.
You can use the following methods to change the presentation of your content:
Change format, style and structure:
Change the syntax and style
Use attention grabbing phrases to
ATTENTION
Complexity cannot be avoided in a research article. You need to go into details.
However, after going into details, give the reader a pause through
Example
The example in blue provides clarity and gives the reader time to check whether she understands the equation. Further, the summary in green concludes the paragraph and summarizes the essential points of the equation (that is, what remains in the reader's memory).
Shorter reading time means less time needed for maintaining attention and focus.
Some hints to reduce reading time [1]:
[1]: I know my own text still contains many of the errors listed here. Removing such errors costs time, so don't underestimate it.
Motivation is what makes a reader search for papers, open your paper and start reading it. Reading scientific papers is hard, labour-intensive work. Motivation is the main fuel for this work, and you as a writer should support the reader in doing it.
Some reasons why people read scientific papers:
Think about your own situations, why you read scientific literature. If you abort reading a paper, ask why and what the writer could have done better.
Since it is hard to know a reader's motivation a priori, a good research paper has a structure that allows the reader to pick the parts that help him/her most. So the structure of a paper is the key to keeping a reader motivated and drawing him/her to your research.
Often readers skim through articles trying to find the interesting pieces for their own work. A good structure, visuals and different styles support skimming.
Learn to manage the
You can also watch "6 ideas to write a bad paper" for a somewhat funny summary.
Jean-Luc Lebrun, (2007), Scientific Writing - A Reader and Writer's Guide, World Scientific, ISBN: 978-981-270-144-2 (highly recommended)
Jean-Luc Lebrun, (2010), Scientific Writing 2.0 - A Reader and Writer's Guide, World Scientific, ISBN: 978-981-270-144-2 (highly recommended)