Created: Wed Mar 20 19:29:25 CET 2019

Last modiﬁed: Fri Mar 22 00:22:11 CET 2019

*“Only simplify after the function is correct.”*

Source code for this posts’ data structures is found here.

It took me a while to realize that most of the code you see in books is
already simpliﬁed. Sometimes, it can be hard to understand **how** the
author *got there*.

Today’s instance will be about doubly-linked lists.

Day 1 of (seriously) learning data structures and I want to make sure I know how to implement them before moving on.

When programming recursive functions in Scheme (or in any language by
the way), you’re often told to divide your computation into *cases*. In
the context of recursive functions, that would be the *base case* and
the *natural recursion*.

It never occured to me that I could do the same thing when writing in imperative style.

Basically writing functions matching the few borderline cases and another one for the “general case”; then mixing them all.

That’s the base idea for this post,

I’m merely checking that I can successfully reimplement singly and doubly-linked lists by heart by dividing the complicated stuff into simple cases.

We will look at four ways to insert something in a doubly-linked list and we’ll show that they all have two subcases.

*You might want to take a glimpse at the conclusion at this point.*

As you will see in the conclusion, you don’t need every functions there
if you just want to deal with doubly-linked lists. The reason I added
them is that they also work with singly-linked lists **and** circular
lists; which is like killing two birds with one stone.

**Doubly-linked lists**

Here is the structure for a node:

```
typedef struct ListElmt_ {
struct ListElmt_ *next;
struct ListElmt_ *prev;
} ListElmt;
```

And here is the structure for holding a list’s metadatas such as its head, tail and size,

```
typedef struct List_ {
int size;
ListElmt *head;
ListElmt *tail;
} List;
```

We basically removed ﬁelds because we’re only interested in the links between nodes, not the data they hold.

Now onto what matters: how do I insert a node, where ?

**Case n°1: inserting at the head**

This is equivalent to pushing a value onto a stack.

The algorithm should be really simple; assuming every pointers in above structures never contain garbage and are set to NULL when they do.

Case n°1 has subcases: the one where the list is empty, and the one when there is at least one element in the list. Since we’re inserting at the head, we don’t care about an eventual second or third element in the list.

Assuming our new element is named `new_element`

, and we have write
access to a list `list`

, the following algorithm will work just ﬁne:

```
new_element->next = list->head;
new_element->prev = NULL;
list->head = new_element;
```

We forget some important details here: what about the tail ?

What if the list was empty and the new element becomes the only element
in the list ? There are two ways to test for this, (1) either by
verifying that `list->size == 0`

or (2) by verifying that
`new_element->next`

, `list->head`

or `list->tail`

is NULL.

No matter which one you choose, you’re basically inserting this at the end of the algorithm:

```
if (list->size == 0)
list->tail = new_element
```

But what if **there was** a head, and `new_element->next`

is hence not
NULL ? Shouldn’t its `prev`

ﬁeld be set ? Just add an `else`

branch to
the above `if`

.

`else new_element->next->prev = new_element;`

And don’t forget to increment the list’s size after the operation succeeds.

`++(l->size);`

We now have a function that inserts whatever data we want right at the head of our list.

**Case n°2: inserting at the tail**

Surprisingly easy: take case n°1, replace `head`

with `tail`

and `tail`

with `head`

, replace `next`

with `prev`

and `prev`

with `next`

and
you’re done. No kidding.

A doubly-linked list can be traversed from left to right, just like a singly-linked list, but it can also be traversed from right to left..

**Case n°3: inserting between two elements**

Let’s refer to them as `element1`

and `element2`

.

Case n°3 translates as “inserting **after** some element”.

Since we’re inserting **between** two items, we can’t possibly have to
set `list->head`

.

There are two subcases: the general case and the one when `element2`

is
NULL, meaning we’re inserting after the tail.

For our function to handle both cases, we’ll need the following block:

```
if (element2 != NULL)
element2->prev = new_element;
```

which translates to “only tell `element2`

about `new_element`

if
`element2`

exist!” Which kind of make sense.

Now the easy part, common to both case is to tell `element1`

that
`new_element`

is his new right neighbor:

`element1->next = new_element;`

You’ll also want to tell `new_element`

about both its new neighbors:

```
new_element->next = element1;
new_element->prev = element2;
```

And now, for the ﬁnish, we’ll set `list->tail`

at the condition that
`new_element->next`

(which is a pointer to `element2`

) __is__ a NULL
pointer.

```
if (new_element->next == NULL)
list->tail = new_element;
```

And don’t forget to increment `list->size`

after the operation succeeds.

**Case n°4: inserting before some element**

It’s basically the same as with case n°2, just take case n°3 and invert everything. Simple.

**Conclusion**

So far we have an idea of an implementation for basically every cases of inserting one new element.

There is a thing tho,… Why would I need a function that inserts at the head of my list, when I also have a function that can insert before any element, head included ?

Those functions indeed compete with each others. You can get by with just n°1 and n°3, n°2 and n°4, or n°3 and n°4, which I believe really handle every instances you could have to face.

I care about n°1 because, just like pretty much everyone, I learned about doubly-linked lists after learning about singly-linked lists. Long story short, it is easier to remember how to insert before head and how to insert after any element since it also works for singly-linked lists.

SIDE NOTE: “inserting before head” is but a special case of “inserting before any element”, in fact, I implemented case n°4 by “extending” case n°1.

I wrote this whole post by heart (that was the goal for today), now every time I’m lost on an island and I ﬁnd a piece of paper and a piece of wood, I’ll, at the very least, be able to implement linked lists from scratch.

BONUS: After reading a little part of the source code of csh(1), it seems that tokens are stored in a doubly-linked list, maybe I’ll add something like that to my lsh as some kind of joke of a parser.