Segmentation and Paging in Linux

The goal of both paging and segmentation is ultimately the same – to provide a way to translate virtual or logical addresses into physical addresses, that actually exist in the RAM.

Segmentation

Segmentation in Linux is not really used as the main virtualization technique anymore, and is kept mainly as a relic from the past.

There is a segment register for the code and the data segment in the program, as well as a bunch of other segments, which we won’t discuss. But each of these segments are described by a segment descriptor which is an 8 byte  entity.  There is obviously a bunch of stuff in each segment descriptor, but it mainly has the base address for each segment, its bounds and so on. The list of all the segment descriptors is maintained by the kernel in the Global Descriptor Table (GDT).  There is 1 GDT per every processor.  The kernel can also use a per-process descriptor table also called the Local Descriptor Table (LDT),  but this is rarely used.

So how do we get to the segment descriptors? Each segment register stores a segment selector.  A segment selector is a 16 bit entity. The 2 LSBs stand for the Requested Privilege Level (RPL).  Clearly it can represent 4 different privilege levels. The privilege levels 0-2 are all considered to be kernel mode whereas  level 3 is considered to be user mode. Thus this provides an easy way to protect the user mode from performing kernel-mode operations. The next bit is either a 0 or 1 indicating whether we should be using a GDT or LDT. Finally the remaining 13 MSBs are an index into the GDT(LDT) for the segment descriptor corresponding to the segment. In practice though, it isn’t necessary to access the GDT each time we need to look up a segment descriptor. Whenever there is a context switch, the kernel will initialize the segment registers with the corresponding segment selectors for the new process. At the same time, there are also corresponding non-programmable registers that will store the segment descriptor for the segment. Thus the  GDT needs to be looked up just once, during the context switch.

 

Paging

Paging in linux just provides a means to translate the virtual page numbers (VPN) to the corresponding physical page numbers. The typical way to do this is to maintain a page table. A page table simply provides a mapping from each VPN to the corresponding PPN.  Unfortunately, even for 32 bit architectures, this would mean having 2^32 entries, in the main memory.

The CR3 register stores the base address of the page table. Obviously, since each process will have a different VPN -> PPN mapping, there is a separate page table for each process. Whenever there is a context switch, the kernel saves the CR3 for the process in its task_struct and restores the CR3 from the task_struct for the new process.

Thus in linux, a page table is usually multi-level. First let’s talk about 32 bit architectures.  In this case the CR3 register points to page directory, the page directory in turn points to the page table. Each VPN is thus split into 3 parts. The first 10 MS bits are the index into the page directory and the next 10 bits are an index into the page table. Finally the last 12 bits are the actual offset in the page.  In case of 64 bit architectures, there is even more level of indirection.

 

Static vs Shared libraries and Dynamic Linking

What is a (static) library?

A library is just an object file, that is obtained by pre-compiling several source files together. These source files offer common functionality that may be re-used in a lot of other source files. Thus instead of compiling them each time, with each other source file that requires them, it is more efficient to just pre-compile it, ready to be linked with whichever program needs to use it. Static libraries end with a “.a” extension.

How does the compiler know that a particular static library must be linked to some program?

Well, this is specified by using the -l option with gcc. For instance, if your hello.c program needs to use the libmath.a library, then it would be compiled as 

gcc  -c hello.c

gcc -o hello hello.o -lmath

Also if the libmath.a library isn’t located in a standard path like say /usr/lib, then it is also necessary to specify the exact path at which it may be found by using the -L switch with gcc

Ok, so what are shared libraries then?

So static libraries are not bad. They keep pre-compiled the commonly used code, so it is efficient and ready to use, we save up on the compile time. But there may be more than 1 program that is using the libmath library. And each such separate program that needs the libmath library simply just links in its own copy. This means that multiple programs running in the memory may have multiple copies of the same library. Clearly this wastes space. What we want to have is, a way of sharing the same library amongst several programs, that may be resident in the memory at the same time. Shared libraries offer this solution. Shared libraries typically end with the “.so” extension

Well that, how do shared libraries work?

First of all, shared libraries are usually compiled with -fPIC option (positionally independent code) – this makes it possible to place it at any virtual memory location, during runtime. 

Executables ( which are usually in the ELF format btw) that need to use shared libraries, aren’t directly linked to them, instead the ELF header, contains information about which shared libraries the executable will require.

Now it is the job of the dynamic linker, to locate these libraries at run time ( if some other program is already using these libraries, they might already be in the memory, otherwise they will need to be loaded into the memory) and perform all the symbol resolution. Again, if the libraries aren’t in some standard location like /usr/lib, then the path needs to be specified by setting the LD_LIBRARY_PATH variable. Shared libraries are pretty much the norm in most of the use cases today, with static libraries being used rarely.

Some useful tools while working with shared libraries?

One very useful tool is ldd.  Should be run as ldd <name of your executable>. This will tell you all the shared libraries that are used in the executable, and also the path to which they resolve. Often useful in figuring out some weird compilation error. 

ldconfig is another tool. This will keep track of all the possible directories in which the shared libraries may be located. The paths can be specified in the /etc/ld.so.conf and it will also search standard paths like /usr/lib. It then creates a list of all the different libraries in /etc/ld.so.cache. This helps the dynamic linker to resolve the paths quickly without wasting too much time in searching all the possible directories.

And finally what is dynamic linking?

Not going into it much here, but essentially instead of loading all the shared libraries that a program depends on upfront, this is a way of lazily loading a library, only when it is required. The dlopen API is provided and used for achieving this.

Dynamic programming via memoization

I personally find it much easier to make a recursive function efficient by caching previous solutions a.k.a a top down approach vs a bottom up approach.

There are so many interesting resources that go through some cool problems to which DP can be applied. The most interesting ones I found are the MIT 6.006 series on youtube by Erik Demaine. There, Erik discusses some nice basic properties of DP, such as it being equivalent to a DAG traversal etc, etc and then goes on to apply it to some pretty unconventional problems like playing the guitar or blackjack. Here’s the link : Video Lectures. They focus mainly on the bottom up approach.

Another compendium of  some classic DP problems is DP-Practice. Here again the flash animations mainly focus on the bottom-up approach.

In this blog post I’ll focus on some of the main techniques/ gotchas that I’ve found useful while using the top-down / memoization approach on recursive solutions.

 

  • If you are planning to apply memoization on your recursive function make sure that it returns a value ( the answer to the subproblem). Often recursive functions are void, with the state being stored internally, but it will take much more effort to convert such recursive functions to memoized DP’s
  • Make sure that your recursive problem is solving sub-problems and reducing the state-space for all the parameters. While writing recursive solutions, just because it is already very “brute-forcey”,  it is easy to forget that certain parts are unnecessary. Make sure to consistently prune the state at every call, and pass strictly the state that is required for that particular call. If you skip doing this, it will be impossible to memoize your DP solution.
  • Remember to pass in only the “independent” states and not the “dependent” state as parameters to your recursive function. For instance, let’s say you have a problem where you want to check if a string s1 is formed by some interleaving of strings s2 and s3. One possible state you could pass in the recursive function is the set{i, j, k} the indices for s1, s2 and s3 where you are currently positioned. However notice, that in this case, the variable i is not really required, it is dependent on j and k, basically i = j+k in any valid solution to the problem. Thus you really need to pass in just 2 variables  {j, k} in the state space.
  • Once you have decided the minimum set of parameters that make up your state space, the memoization is just about making either an array/ hashmap , that stores for each valid combination of the parameters, the solution that was computed by the function. Here’s why the previous step is important, the space complexity of your solution will grow as the number of parameters grow.
  • Now that you have a cache in place for storing solutions, you just need to modify your functions (at the logical level, there may of course be some other simple code changes you’ll need to make) so that :
    • on entering the function you check whether the particular combination of parameters already has a solution in the cache, if so simply return with the solution.
    • if not, then just continue the rest of the recursive function as before, but just before returning the result, make sure to store it in the cache, against the current set of parameter values.

And that should be it. This should usually be enough to quickly convert that cpu-hogging sloth of a recursive function to its much slicker DP cousin!

Of course a lot of caveats apply, and this is just the tip of the iceberg. Often top down approaches may still not be slick enough to pass the timing limits in some fancy competitive programming competition. But if you just want to be happy and satisfied about owning a toolkit to cook up some efficient solution to a DP problem, this is definitely not bad.

Great Engineers

Here is my  (possibly wrong and  definitely still a WIP) opinion on what makes engineers, and in particular programmers really really good.

  1. They are infinitely curious.

    • Can we do better than this? (They are always asking this)
    • What was the reason for a particular design decision?
    • Why did changing the code in this way, make the bug disappear? (In particular they are never satisfied that a fix makes the bug go away, until they have really reached the root cause)
    • Under what conditions would my code break or behave unexpectedly?
    • Most importantly, they never make any assumptions about there code or believe in any assumptions, but verify them before proceeding.
  2. They are fearless:

    • They are never to afraid or lazy of diving deep into any complex code to make sure they understand what it does (see point 1)
    • They aren’t afraid of making potentially large impact changes on a code base, if it will make it better in the long run.