Monthly Archives: August 2014

The end of the internship

The official internship period ended last week. It was a challenging and very interesting experience for me.

A short overview of the internship:

The beginning was the hardest part of the internship. I had to get used to both RCU concepts and Coccinelle syntax, and on top of this I had to take my exams at the university. During the first weeks I struggled to figure out the Coccinelle syntax and to use it for my tasks. I was also reading about RCU and discussing it with my mentor, to learn its API and how it works so that I could apply this understanding to solving problems.

Once I made friends with Coccinelle, everything became a little easier.

The main steps I followed in solving the tasks were:
– think about the task and discuss it with my mentor
– write a simple Coccinelle script and make it more efficient step by step, where the task permitted (some of the scripts made the changes directly when they were applied to the files; others only highlighted cases, and I would then analyze the output and decide whether to make a change)
– discuss my observations and possible fixes for the bugs with my mentor

Unfortunately (for me), for some tasks I didn’t find as many bugs as I wanted (a chance to send more patches and contribute more to the Linux kernel 🙂 ). Anyway, I learned a lot by analyzing all those cases, even though only a few of them turned into patches. Many of those cases were tricky, and working out whether there really was a problem was a good learning experience for me.

I still have work in progress and patches to send. I am also waiting for replies to many patches I have already sent. I will write a post with a summary of every task once I have all the answers.

Some lessons learned:
– good understanding of RCU
– intermediate user of Coccinelle
– analyze code written by other people
– you don’t have to automate something if there is nothing to automate
There are surely more, but these are the ones on my mind right now.

Many thanks to my mentor, Paul E. McKenney, for all his advice, on the project and in general, for the quick answers and for all the help he offered me. Thanks to Julia Lawall, who made time to help me every time I had a Coccinelle problem. Also, I want to thank the people from OPW. All of them are very nice and eager to help anybody!

In the end, I can say that my work on the Linux kernel will continue for sure.

I can’t wait for LinuxCon Europe in October where I will talk about my project and meet people from OPW.
See you around!

Searching for bugs

In this post, I will talk about another task I’ve been working on:
“Making use of an RCU-protected pointer after passing it to call_rcu() or a similar function (call_rcu_bh(), call_rcu_sched(), call_srcu())”.

First, let’s see what “call_rcu()” does.
The update-side RCU primitives allow the caller to defer an action, such as freeing the data a pointer refers to, until all pre-existing RCU read-side critical sections have finished (meaning that no reader can still be using the old data, so it is safe to free it).
RCU’s API provides two ways to do this:
– “synchronize_rcu()” (called “synchronize_kernel()” in old kernels)
– “call_rcu()”

When “synchronize_rcu()” is called, it blocks until all pre-existing RCU read-side critical sections have ended.
In some cases, though, you don’t want to wait, because blocking would be inefficient. Then you use “call_rcu()” instead of “synchronize_rcu()”.
“call_rcu()” returns immediately and arranges for the callback function (its second parameter) to be invoked after all pre-existing RCU read-side critical sections have completed.
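
To make the deferred-callback idea concrete, here is a minimal userspace sketch. It is not how the kernel implements RCU: “toy_rcu_head”, “toy_call_rcu()”, “toy_grace_period_end()” and “struct item” are hypothetical names invented for this illustration, and the end of the grace period is simply a function that the caller invokes by hand. What it does show is that the queuing call returns immediately, and that the callback recovers the enclosing structure from the embedded head and frees it only later.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's struct rcu_head. */
struct toy_rcu_head {
    struct toy_rcu_head *next;
    void (*func)(struct toy_rcu_head *head);
};

/* Recover the enclosing structure from a pointer to its embedded head. */
#define toy_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct item {
    int a;
    struct toy_rcu_head head;
};

static struct toy_rcu_head *pending; /* callbacks waiting for the grace period */
static int items_freed;              /* number of items freed so far */

/* Like call_rcu(): queue the callback and return immediately. */
static void toy_call_rcu(struct toy_rcu_head *head,
                         void (*func)(struct toy_rcu_head *head))
{
    assert(func != NULL);
    head->func = func;
    head->next = pending;
    pending = head;
}

/* Pretend all pre-existing readers are done: run every queued callback. */
static void toy_grace_period_end(void)
{
    while (pending) {
        struct toy_rcu_head *head = pending;

        pending = head->next; /* unlink first: the callback may free head */
        head->func(head);
    }
}

/* The callback: find the item that contains this head and free it. */
static void item_free_cb(struct toy_rcu_head *head)
{
    struct item *it = toy_container_of(head, struct item, head);

    free(it);
    items_freed++;
}

/* Returns how many items had been freed right after the queuing call. */
static int demo_deferred_free(void)
{
    struct item *p = malloc(sizeof(*p));
    int freed_before;

    if (!p)
        return -1;
    p->a = 1;
    toy_call_rcu(&p->head, item_free_cb);
    /* From here on p must be treated as dead, although it is not freed yet. */
    freed_before = items_freed;
    toy_grace_period_end();
    return freed_before;
}
```

Calling “demo_deferred_free()” from a small main() is enough to try it out. In this toy the free happens exactly when “toy_grace_period_end()” runs; in the kernel the caller has no control over that moment, which is why the pointer has to be considered dead as soon as it has been passed to “call_rcu()”.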

Why do we need “call_rcu_bh()”, “call_rcu_sched()” and “call_srcu()”?
That is because of RCU’s flavors:

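RCU-sched is needed when updaters must wait for hardware interrupt handlers (and other code running with preemption disabled) to finish.
RCU-bh is needed in cases related to denial-of-service attacks; its read-side critical sections run with bottom halves (softirqs) disabled.
SRCU permits sleeping in RCU read-side critical sections.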

The main thing to do in order to solve this task is to identify the point at which the pointer held in a variable goes dead.
Examples:

/* BUG */
call_rcu(&p->head, func);
/* the structure p points to may be freed at any time after this call */
p->a = 1;

/* OK */
call_rcu(&p->head, func);
/* the same pointer variable now holds a different, newly allocated pointer */
p = kmalloc(sizeof(*p), GFP_KERNEL);
p->a = 1;

I started with this Coccinelle semantic patch, which looks for a “call_rcu()” call and checks whether the pointer is used after that call.

@@
identifier f, p;
@@

f(...) {
... when any
* call_rcu((<+...p...+>), ...);
... when any
* (<+...p...+>)
... when any
}

For the similar functions I replaced “call_rcu()” with the function of interest.
I didn’t find bugs using this approach, but I will show you some cases that might look like bugs at first sight.

– file “kernel/rcu/rcutorture.c”:

rcu_read_lock(); /* Make it impossible to finish a grace period. */
call_rcu(&rh1, rcu_torture_leak_cb); /* Start grace period. */
local_irq_disable(); /* Make it harder to start a new grace period. */
call_rcu(&rh2, rcu_torture_leak_cb);
call_rcu(&rh2, rcu_torture_err_cb); /* Duplicate callback. */
local_irq_enable();
rcu_read_unlock();

This case (the two calls with rh2) is OK because the duplicate callback is posted on purpose: it is test code, there for debugging purposes.

– file “net/bridge/br_multicast.c”:

if (!old)
    goto out;

call_rcu_bh(&mdb->rcu, br_mdb_free);

out:
    rcu_assign_pointer(*mdbp, mdb);

A few lines before the “call_rcu_bh()” call there is a “goto out”, so the call is not reached when the if condition is true.

– file “arch/powerpc/mm/hugetlbpage.c”:

call_rcu_sched(&(*batchp)->rcu, hugepd_free_rcu_callback);
*batchp = NULL;

Everything is OK: it just assigns a new value to the same pointer variable.

The first approach was the obvious and simpler one to start with, but every case that it found turned out to be OK.

The second approach is the following:
first step: find calls like “call_rcu(&g->head, ...)”, where g is a global variable
second step: find the functions that contain these calls
third step: find the functions that call the functions found in the second step
fourth step: check whether the functions found in the third step use the protected pointer in a bad way.
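
The reason for looking across functions can be sketched with a small userspace example. It assumes that the pattern being hunted is a global pointer that one function hands over for deferred freeing and that a caller of that function keeps using. All names here (“struct conf”, “toy_defer_free()”, “retire_conf()”, “replace_conf()”) are hypothetical, and the deferred free is reduced to a single slot that is emptied by hand.

```c
#include <assert.h>
#include <stdlib.h>

struct conf {
    int a;
};

static struct conf *g;        /* hypothetical global pointer */
static struct conf *deferred; /* object waiting for its deferred free */

/* Stand-in for call_rcu(): remember the object, free it later. */
static void toy_defer_free(struct conf *c)
{
    deferred = c;
}

/* Stand-in for the end of the grace period. */
static void toy_grace_period_end(void)
{
    free(deferred);
    deferred = NULL;
}

/* Second step: the function that contains the call ("fn" in the script). */
static void retire_conf(void)
{
    toy_defer_free(g);
}

/* Third and fourth steps: a caller ("ff") and what it does with g afterwards. */
static int replace_conf(int a)
{
    struct conf *fresh = malloc(sizeof(*fresh));

    if (!fresh)
        return -1;
    fresh->a = a;
    retire_conf();
    /* g still holds the retired address here, so "g->a = a;" would be the bug. */
    g = fresh; /* OK: the same variable now holds a different pointer */
    toy_grace_period_end();
    return g->a;
}
```

Nothing in the body of “replace_conf()” mentions the deferred free, so a script that only looks inside one function cannot see that g went dead in the middle of it. That is why the caller has to be found first and then inspected.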

The Coccinelle script used for this:

@ locally @
identifier l;
type t;
position p;
@@

t l;
... when any
call_rcu@p((<+...l...+>), ...);

@ globally @
identifier fn;
identifier g != locally.l;
position p2 != locally.p;
@@

fn(...) {
... when any
call_rcu@p2((<+...g...+>), ...);
... when any
}

@ other_func @
identifier globally.fn, ff;
@@

ff(...) {
... when any
* fn(...)
... when any
}

In the first rule, I look for calls to “call_rcu()” whose argument involves a local variable. In the second rule, I use positions to exclude those calls, so that only the calls where the variable is not local remain. In the third rule, I search for calls to the functions found in the second rule.

In a few days I will tell you about two interesting cases that I found using this script, and about their solutions (or whether they need one).