From singly to dou­bly-linked lists

Created: Wed Mar 20 19:29:25 CET 2019

Last mod­i­fied: Fri Mar 22 00:22:11 CET 2019


Only sim­plify af­ter the func­tion is cor­rect.”

Source code for this posts’ data struc­tures is found here.

It took me a while to re­al­ize that most of the code you see in books is already sim­pli­fied. Sometimes, it can be hard to un­der­stand how the author got there.

Today’s in­stance will be about dou­bly-linked lists.

Day 1 of (seriously) learn­ing data struc­tures and I want to make sure I know how to im­ple­ment them be­fore mov­ing on.

When pro­gram­ming re­cur­sive func­tions in Scheme (or in any lan­guage by the way), you’re of­ten told to di­vide your com­pu­ta­tion into cases. In the con­text of re­cur­sive func­tions, that would be the base case and the nat­ural re­cur­sion.

It never oc­cured to me that I could do the same thing when writ­ing in imperative style.

Basically writ­ing func­tions match­ing the few bor­der­line cases and another one for the general case”; then mix­ing them all.

That’s the base idea for this post,

I’m merely check­ing that I can suc­cess­fully reim­ple­ment singly and doubly-linked lists by heart by di­vid­ing the com­pli­cated stuff into simple cases.

We will look at four ways to in­sert some­thing in a dou­bly-linked list and we’ll show that they all have two sub­cases.

You might want to take a glimpse at the con­clu­sion at this point.

As you will see in the con­clu­sion, you don’t need every func­tions there if you just want to deal with dou­bly-linked lists. The rea­son I added them is that they also work with singly-linked lists and cir­cu­lar lists; which is like killing two birds with one stone.

Doubly-linked lists

Here is the struc­ture for a node:

typedef struct ListElmt_ {
        struct ListElmt_ *next;
        struct ListElmt_ *prev;
}               ListElmt;

And here is the struc­ture for hold­ing a list’s meta­datas such as its head, tail and size,

typedef struct List_ {
        int             size;
        ListElmt       *head;
        ListElmt       *tail;
}               List;

We ba­si­cally re­moved fields be­cause we’re only in­ter­ested in the links between nodes, not the data they hold.

Now onto what mat­ters: how do I in­sert a node, where ?

Case n°1: in­sert­ing at the head

This is equiv­a­lent to push­ing a value onto a stack.

The al­go­rithm should be re­ally sim­ple; as­sum­ing every point­ers in above structures never con­tain garbage and are set to NULL when they do.

Case n°1 has sub­cases: the one where the list is empty, and the one when there is at least one el­e­ment in the list. Since we’re in­sert­ing at the head, we don’t care about an even­tual sec­ond or third el­e­ment in the list.

Assuming our new el­e­ment is named new_element, and we have write access to a list list, the fol­low­ing al­go­rithm will work just fine:

new_element->next = list->head;
new_element->prev = NULL;

list->head = new_element;

We for­get some im­por­tant de­tails here: what about the tail ?

What if the list was empty and the new el­e­ment be­comes the only el­e­ment in the list ? There are two ways to test for this, (1) ei­ther by verifying that list->size == 0 or (2) by ver­i­fy­ing that new_element->next, list->head or list->tail is NULL.

No mat­ter which one you choose, you’re ba­si­cally in­sert­ing this at the end of the al­go­rithm:

if (list->size == 0)
    list->tail = new_element

But what if there was a head, and new_element->next is hence not NULL ? Shouldn’t its prev field be set ? Just add an else branch to the above if.

else new_element->next->prev = new_element;

And don’t for­get to in­cre­ment the list’s size af­ter the op­er­a­tion succeeds.

++(l->size);

We now have a func­tion that in­serts what­ever data we want right at the head of our list.

Case n°2: in­sert­ing at the tail

Surprisingly easy: take case n°1, re­place head with tail and tail with head, re­place next with prev and prev with next and you’re done. No kid­ding.

A dou­bly-linked list can be tra­versed from left to right, just like a singly-linked list, but it can also be tra­versed from right to left..

Case n°3: in­sert­ing be­tween two el­e­ments

Let’s re­fer to them as element1 and element2.

Case n°3 trans­lates as inserting af­ter some el­e­ment”.

Since we’re in­sert­ing be­tween two items, we can’t pos­si­bly have to set list->head.

There are two sub­cases: the gen­eral case and the one when element2 is NULL, mean­ing we’re in­sert­ing af­ter the tail.

For our func­tion to han­dle both cases, we’ll need the fol­low­ing block:

if (element2 != NULL)
        element2->prev = new_element;

which trans­lates to only tell element2 about new_element if element2 ex­ist!” Which kind of make sense.

Now the easy part, com­mon to both case is to tell element1 that new_element is his new right neigh­bor:

element1->next = new_element;

You’ll also want to tell new_element about both its new neigh­bors:

new_element->next = element1;
new_element->prev = element2;

And now, for the fin­ish, we’ll set list->tail at the con­di­tion that new_element->next (which is a pointer to element2) is a NULL pointer.

if (new_element->next == NULL)
        list->tail = new_element;

And don’t for­get to in­cre­ment list->size af­ter the op­er­a­tion suc­ceeds.

Case n°4: in­sert­ing be­fore some el­e­ment

It’s ba­si­cally the same as with case n°2, just take case n°3 and in­vert everything. Simple.

Conclusion

So far we have an idea of an im­ple­men­ta­tion for ba­si­cally every cases of inserting one new el­e­ment.

There is a thing tho,… Why would I need a func­tion that in­serts at the head of my list, when I also have a func­tion that can in­sert be­fore any element, head in­cluded ?

Those func­tions in­deed com­pete with each oth­ers. You can get by with just n°1 and n°3, n°2 and n°4, or n°3 and n°4, which I be­lieve re­ally handle every in­stances you could have to face.

I care about n°1 be­cause, just like pretty much every­one, I learned about dou­bly-linked lists af­ter learn­ing about singly-linked lists. Long story short, it is eas­ier to re­mem­ber how to in­sert be­fore head and how to in­sert af­ter any el­e­ment since it also works for singly-linked lists.

SIDE NOTE: inserting be­fore head” is but a spe­cial case of inserting before any el­e­ment”, in fact, I im­ple­mented case n°4 by extending” case n°1.

I wrote this whole post by heart (that was the goal for to­day), now every time I’m lost on an is­land and I find a piece of pa­per and a piece of wood, I’ll, at the very least, be able to im­ple­ment linked lists from scratch.

BONUS: After read­ing a lit­tle part of the source code of csh(1), it seems that to­kens are stored in a dou­bly-linked list, maybe I’ll add something like that to my lsh as some kind of joke of a parser.

source code