About swk

I am a computational linguist, a teacher of computer science and, above all, a huge fan of LaTeX. I use LaTeX for everything, including things you never wanted to do with LaTeX. My latest love is LilyPond, aka LaTeX for music. I'll post at irregular intervals about cool stuff, stupid hacks and annoying settings I want to remember for the future.

Other settings for “non-scientific” texts in LaTeX

Figures that have no numbers, but only text in the captions:
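
One way to get this, as a minimal sketch that assumes the caption package (the black \rule only stands in for a real picture):

```latex
\documentclass{article}
\usepackage{caption}
% Assumption: caption package; labelformat=empty suppresses "Figure N:"
% so that only the caption text is printed.
\captionsetup[figure]{labelformat=empty}
\begin{document}
\begin{figure}
  \centering
  \rule{4cm}{3cm} % placeholder for \includegraphics{...}
  \caption{Only this text appears under the picture.}
\end{figure}
\end{document}
```

Note that the figure counter is still stepped in this sketch; the caption package also offers \caption*{...} for a caption without a label.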


No indentation at the beginning of a paragraph, but a bigger separation space between paragraphs:
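
A minimal sketch with plain LaTeX lengths (the concrete values are arbitrary examples; loading the parskip package is a common alternative):

```latex
\documentclass{article}
% No indentation at the start of a paragraph ...
\setlength{\parindent}{0pt}
% ... but a stretchable vertical gap between paragraphs instead.
\setlength{\parskip}{1ex plus 0.5ex minus 0.2ex}
\begin{document}
First paragraph, not indented.

Second paragraph, separated from the first by \verb|\parskip|.
\end{document}
```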


Header lines that contain only page number and chapter, not the section:
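
A possible setup, sketched under the assumption that the document uses the book class together with the fancyhdr package:

```latex
\documentclass{book}
\usepackage{fancyhdr}
\pagestyle{fancy}
\fancyhf{}                     % clear all header and footer fields
\fancyhead[LE,RO]{\thepage}    % page number on the outer edge
\fancyhead[RE,LO]{\leftmark}   % chapter mark only; \rightmark (the section) is not used
\begin{document}
\chapter{First chapter}
\section{A section}
Some text.
\newpage
More text on a page whose header shows the chapter and the page number.
\end{document}
```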


LaTeX package “wrapfig”

LaTeX is all nice and fancy if you write technical texts, where the pictures are floating in the text (mostly at the top and/or bottom of pages) and you reference them with numbers. But as I do all sorts of things with LaTeX, sometimes I want more “fun” texts which have pictures somewhere in the pages and text flowing around them.

For this purpose, I have now discovered the package wrapfig:


You can include a picture like this (this one floats left of the text with a width of 7em):
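
A minimal sketch of such a document (the black \rule stands in for a real \includegraphics, and lipsum only provides filler text):

```latex
\documentclass{article}
\usepackage{wrapfig}
\usepackage{lipsum} % filler text for the example
\begin{document}
% {l} puts the picture at the left of the text, {7em} is the width
% reserved for it; an uppercase L would allow the figure to float.
\begin{wrapfigure}{l}{7em}
  \centering
  \rule{6em}{6em} % placeholder for \includegraphics[width=6em]{...}
  \caption{A picture}
\end{wrapfigure}
\lipsum[1]
\end{document}
```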


You can control some of the appearance with different settings in the preamble (see the documentation at CTAN), e.g.,
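
For instance, wrapfig reads the following standard lengths. This is a preamble fragment, and the values are arbitrary examples rather than recommendations:

```latex
\usepackage{wrapfig}
% Vertical space above and below the wrapped figure
% (also used by ordinary in-text floats).
\setlength{\intextsep}{6pt}
% Horizontal gap between the figure and the surrounding text
% (also the column gap in two-column layouts).
\setlength{\columnsep}{12pt}
```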


Docker cleanup

Delete all containers

docker rm $(docker container ls -a -q)

Delete all unnamed images

docker rmi $(docker image ls | grep "<none>" | tr -s ' ' | cut -d ' '  -f 3)

docker image ls lists all images,
grep "<none>" selects every line that contains “<none>” anywhere (as the tag, as the repository or as part of a name!),
tr -s ' ' merges sequences of spaces to a single space,
cut -d ' ' -f 3 splits each line at the space and gives the third column (the image id).
The list of ids is then passed on to docker rmi to be deleted.

Use at your own risk!

How to get WiFi running on Suse Leap 42.3 (Broadcom driver)

After the update from Suse Leap 42.2 to Suse Leap 42.3, my WiFi stopped working, which is kind of bad, because I need the internet to figure out what is wrong…

This was the situation right after the update, when it was not working:

> lspci -nnk | grep -A 3 "Network"
04:00.0 Network controller [0280]: Broadcom Corporation BCM43142 802.11b/g/n [14e4:4365] (rev 01)
        Subsystem: Hewlett-Packard Company Device [103c:804a]
        Kernel driver in use: bcma-pci-bridge
        Kernel modules: bcma
> hwinfo --short
  eth0                 Realtek RTL8101/2/6E PCI Express Fast/Gigabit Ethernet controller
                       Broadcom BCM43142 802.11b/g/n
network interface:
  eth0                 Ethernet network interface
  lo                   Loopback network interface
> iwconfig
lo        no wireless extensions.
eth0      no wireless extensions.
> lsmod | grep "wl"

No WiFi to be seen!

So now this is what I did:

  1. Remove the old driver:
    > rpm -e broadcom-wl broadcom-wl-kmp-default 
  2. Find out my exact kernel version (the last part is the part we need, i.e., “default”):
    > uname -r
  3. Add the Packman repository to my repositories:
    > zypper addrepo http://packman.inode.at/suse/openSUSE_Leap_42.3/ packman
  4. Install the drivers, paying attention to my kernel type (…-“default”):
    > zypper install broadcom-wl-kmp-default broadcom-wl

    You can also download the rpm by hand and install it. In that case, you need to pay attention to the full kernel number. That is, for my kernel 4.4.104-39, I should install the driver from the broadcom-wl-kmp-default- package in which the numbers after the k match the kernel version exactly. Using Packman does that for you.

    Another issue I had with manual installation was missing keys. At least my configuration insists on a valid PGP key and aborts if the key is not in the key list, and I didn’t have a key for the downloaded rpms. It is possible to tell rpm to install the packages without checking the key (option --nosignature), but that did not install the package properly (without any error message, of course). zypper looks for the key itself during installation, so you don’t have to worry about it.

  5. I rebuilt the initrd (the initial ramdisk with the modules loaded at boot) and then restarted, but I am not sure this is necessary:
    > mkinitrd

Finally, the outputs of the above commands are (for reference, the next time it breaks):

> lspci -nnk | grep -A 3 "Network"
04:00.0 Network controller [0280]: Broadcom Corporation BCM43142 802.11b/g/n [14e4:4365] (rev 01)
        Subsystem: Hewlett-Packard Company Device [103c:804a]
        Kernel driver in use: wl
        Kernel modules: bcma, wl
> hwinfo --short
  eth0                 Realtek RTL8101/2/6E PCI Express Fast/Gigabit Ethernet controller
  wlan0                Broadcom BCM43142 802.11b/g/n

network interface:
  wlan0                WLAN network interface
  eth0                 Ethernet network interface
  lo                   Loopback network interface
> iwconfig
lo        no wireless extensions.
wlan0     IEEE 802.11abg  ESSID:"..."  
          Mode:Managed  Frequency:2.412 GHz  Access Point: ...   
          Bit Rate=65 Mb/s   Tx-Power=200 dBm   
          Retry short limit:7   RTS thr:off   Fragment thr:off
          Encryption key:off
          Power Management:off
          Link Quality=70/70  Signal level=-39 dBm  
          Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
          Tx excessive retries:0  Invalid misc:0   Missed beacon:0
eth0      no wireless extensions.
> lsmod | grep "wl"
wl                   6451200  0 
cfg80211              610304  1 wl

And it only took all afternoon … sometimes I hate Linux 🙁

Overlays with Code Listings

You cannot put a lstlisting environment (package listings) inside an \only or \visible command in LaTeX beamer. BUT you can define the listing beforehand and then use that definition inside the \only or \visible!

Example (from slides about recursion in Java):

public int fakultaet(int n) {
   if ( n == 1 ) {
      return 1;
   } else {
      return n * fakultaet( n-1 ) ;
   }
}

\frametitle{Exercise: Factorial of $n$}

   \item \lstinline{fakultaet( 1 ) = 1}
   \item \lstinline{fakultaet( n ) = n * fakultaet( n-1 ) }


Java Code: 
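
A self-contained sketch of the technique, assuming beamer's \defverbatim is used to define the listing beforehand. The macro name \fakultaetcode is made up for illustration, and the itemize uses plain math instead of \lstinline so that the frame does not need the fragile option:

```latex
\documentclass{beamer}
\usepackage{listings}
\lstset{language=Java, basicstyle=\ttfamily\small}

% Define the listing once, outside of any frame.
% \fakultaetcode is a hypothetical macro name.
\defverbatim[colored]\fakultaetcode{
\begin{lstlisting}
public int fakultaet(int n) {
   if ( n == 1 ) {
      return 1;
   } else {
      return n * fakultaet( n-1 ) ;
   }
}
\end{lstlisting}
}

\begin{document}
\begin{frame}
  \frametitle{Exercise: Factorial of $n$}
  \begin{itemize}
    \item $1! = 1$
    \item $n! = n \cdot (n-1)!$
  \end{itemize}
  % The macro, unlike the lstlisting environment itself, may appear inside \only.
  \only<2->{Java code: \fakultaetcode}
\end{frame}
\end{document}
```

The listing then appears from the second overlay on, while the two items are visible from the start.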


Setting computer time from the internet [hacky way]

Most of my pool computers show the wrong time, and most of them show a different wrong time. Just for fun, here are the times shown by those that were running at the moment of the poll:

8:36 (2x), 8:40, 9:36 (2x), 10:35, 10:36 (3x), 10:39 (6x), 11:36 (2x), 11:40

I assume it is the result of setting the time wrong during installation, followed by a few semesters of trying to fix some of them (those running at the moment, the first three rows until the admin was bored, a single one now and then, …), adjusting to daylight saving time or forgetting to, and so on.

So this is what I tried to get them back on track (courtesy of AskUbuntu.com):

sudo date -s "$(wget -qSO- --max-redirect=0 google.com 2>&1 | grep Date: | cut -d' ' -f5-8)Z"

The line first requests some web page (any will do, here google.com) and prints the headers of the HTTP response, e.g.:

  HTTP/1.1 302 Found
  Cache-Control: private
  Content-Type: text/html; charset=UTF-8
  Referrer-Policy: no-referrer
  Location: http://www.google.de/?gfe_rd=cr&dcr=0&ei=mVYyWqvyKNHPXuKYpeAP
  Content-Length: 266
  Date: Thu, 14 Dec 2017 10:46:49 GMT

The line then uses grep to pick out the header line with the date. cut -d' ' splits that line at spaces and keeps parts 5 to 8. In this line, parts 1-2 are empty because of the two leading spaces, part 3 is the text Date: and part 4 is the day of the week. Parts 5 to 8 therefore give a date and time in a format that the tool date understands. Before the time is passed on to date, the letter Z is appended. The Z marks the time as UTC, so date converts it to the time zone set on the computer.

So the line after evaluating wget, grep and cut for the example page we got will be:

sudo date -s "14 Dec 2017 10:46:49Z"

The option -s sets the date to the specified value. So if the request ran in a reasonable time, we should have a reasonably accurate time set for the computer.

PS: Yes, I know that there is such a thing as NTP and I know that time synchronization is not a problem that you need to hack on your own. But this version is much more freaky and cool!! [Also NTP and the university firewall don’t seem to be friends]

Discontinuous x axis with pgfplots

Having a discontinuous y axis is common and Stack Overflow has a few solutions for that. I wanted an x axis with a gap (values 0-10 plus value 20). So this is what I did.

I create an axis from 0 to 12 and give 12 the label “20”. I add an extra tick on the x axis about halfway between 10 and “12”, where I want the gap, and make it thick and white: basically I want a break in the axis. Then, over that break, I draw the “label” of this tick, which is two parallel lines at an angle, symbolizing the discontinuity. The relevant part of the style:

xticklabels={0, 2, 4, 6, 8, 10, 20},
extra x ticks={11.1},
extra x tick style={grid=none, tick style={white, very thick}, tick label style={xshift=0cm,yshift=.50cm, rotate=-20}},
extra x tick label={\color{black}{/\!\!/}},

And then I add the data with x-values 20 at x-coordinate “12”:

\addplot coordinates {
(0, 43.3) (1, 43.2) (2, 43.3) (3, 42.9) (4, 42.1) (5, 41.4) 
(6, 41.2) (7, 41.7) (8, 41.7) (9, 42.1) (10, 42.1) }; 
\pgfplotsset{cycle list shift=-1}
\addplot coordinates { (12, 43.8) };
\draw[dotted] (axis cs:10, 42.1) -- (axis cs:12, 43.8);

Adding the last point separately from the rest of the data lets me draw the dotted line by hand. cycle list shift=-1 causes the new “plot” to have the same style as the previous one. There might be a better way of doing this, but this works.
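
To see how the fragments fit together, here is a minimal complete document as a sketch. The xmin, xmax and xtick settings and the compat level are assumptions about the surrounding axis options, and the tick label is wrapped in math mode here so that \! is safe without further packages:

```latex
\documentclass{standalone}
\usepackage{pgfplots}
\pgfplotsset{compat=1.12} % assumption: any reasonably recent compat level
\begin{document}
\begin{tikzpicture}
\begin{axis}[
  xmin=0, xmax=12,                      % assumed: the axis really runs from 0 to 12
  xtick={0, 2, 4, 6, 8, 10, 12},        % assumed: one tick per label below
  xticklabels={0, 2, 4, 6, 8, 10, 20},  % position 12 carries the label "20"
  extra x ticks={11.1},
  extra x tick style={grid=none, tick style={white, very thick},
    tick label style={xshift=0cm, yshift=.50cm, rotate=-20}},
  extra x tick label={\color{black}{$/\!\!/$}},
]
\addplot coordinates {
  (0, 43.3) (1, 43.2) (2, 43.3) (3, 42.9) (4, 42.1) (5, 41.4)
  (6, 41.2) (7, 41.7) (8, 41.7) (9, 42.1) (10, 42.1) };
% Reuse the style of the previous plot for the isolated point.
\pgfplotsset{cycle list shift=-1}
\addplot coordinates { (12, 43.8) };
% Connect the last regular point with the point labelled "20".
\draw[dotted] (axis cs:10, 42.1) -- (axis cs:12, 43.8);
\end{axis}
\end{tikzpicture}
\end{document}
```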

Hat tip: Stack Overflow, but I currently cannot find the question(s) and answer(s) that helped me solve this. Still, thank you, anonymous people.

Learning to learn – supervised versus unsupervised machine learning

In this blog post, I would like to introduce the two main forms of machine learning, supervised and unsupervised machine learning. The two differ quite a lot in the task they address, in the data that is necessary and in the algorithms that are used.

Supervised learning starts out from a set of data where each item is associated with a label that indicates a category. One example data set could be a collection of e-mails where each one is labeled as “spam” or “non-spam”. Another example data set could be a photo collection with categories such as “shows a mountain”, “is a portrait” or “taken at night”. These labels have usually been assigned by a human. The task for the machine learning algorithm is now to learn how to assign these labels. To this end, it is shown a large number of items with labels and it tries to learn how to distinguish one category from the other. The process is similar to a human who tries to learn something new. A child might first call everything with four legs a cat, but after seeing enough animals and the accompanying comment “no, that’s not a cat, that’s an X”, she will over time come to distinguish actual cats from dogs, cows or horses. Supervised machine learning algorithms do basically the same thing. Given a large number of examples and their categories, they try to find features that separate one class from the others. Coming back to the example of e-mails, the algorithm may find that e-mails that contain the phrases “earn a lot of money” or “prince from Nigeria” are likely spam. Or in the case of photos, it may learn that when a picture is dark, it has been taken at night. There are two main differences to the learning process of us humans. The disadvantage is that the algorithm cannot generalize as well as we do. But this is offset by the advantage that it is much faster than we are and can look at a much larger data set than we ever could. Supervised learning with categories as labels is also called classification, and there are many machine learning algorithms available for it. Examples include decision trees, Naive Bayes, logistic regression and neural networks.

Let us now turn to unsupervised learning. Just like with supervised learning, we start with a large data set to show the computer. But in contrast to supervised learning, there are no labels. No one is telling the algorithm what to learn. The task is rather to use the internal characteristics of the data to come up with groups inside the data. For example, we could try to find groups of users with similar shopping habits among all the online customers of a company. Or we could look for products that are similar to each other among the items sold at a web shop. Or we could group the web pages in the result of a web search, e.g., the pages discussing jaguar the car versus those about the cat. The resulting division in the data is not based on outside input, like it is for classification, where a human has to define the categories for the data beforehand. The division is only based on the similarity of the items in the data set to each other. No human has defined that for the search “jaguar” there are results for a cat and a car, but just by looking at the pages it turns out that there are two groups of pages that use a very different vocabulary. Algorithms for unsupervised learning include clustering algorithms and methods for covariance analysis like principal component analysis/singular value decomposition.

For the sake of completeness, let me mention that supervised and unsupervised learning are the two poles of machine learning methods, but not everything falls clearly into one camp or the other. Several semi-supervised approaches exist that fall somewhere in between. Some of these approaches use partial labels or external information to create the data set from which supervised learning can then start. Other methods use supervised learning to incrementally increase the data set on which the learning algorithm itself is trained. And of course there is no limit to creativity in this area.

To summarize, supervised and unsupervised learning differ in the task they want to solve (supervised learning assigns human-defined categories while unsupervised learning tries to find inherent groups in the data), the data that is necessary (supervised learning needs a set of items with associated categories, unsupervised learning needs only the items) and in the algorithms that are used (classification algorithms for supervised learning versus clustering algorithms for unsupervised learning).

This post first appeared at 5analytics.com