Application Development: Pointers and unsafe code in C#

Introduction

We will see that C# allows developers to suspend the CLR's verification of code in order to access memory directly through pointers. Hence, with C#, you can carry out, in a standard way, certain optimizations which were previously possible only in unmanaged development environments such as C++. These optimizations concern, for example, the processing of large amounts of in-memory data such as bitmaps.

Pointers and unsafe code

C++ has no notion of managed code. This is one of its advantages, as it allows the use of pointers and thus lets developers write optimized code that is closer to the target machine.

This is also a disadvantage of C++ since the use of pointers is cumbersome and potentially dangerous, significantly increasing the development effort and maintenance required.

Before the .NET platform, virtually all the code executed on the Windows operating system was unmanaged. This means that the executable contains the code directly as machine instructions compatible with the type of processor (i.e. machine language code). The introduction of the managed execution mode with the .NET platform is revolutionary. The main sources of hard-to-track bugs are detected and resolved by the CLR. Among these:

Array access overflows (Now dynamically managed by the CLR).

Memory leaks (Now mostly managed by the garbage collector).

The use of an invalid pointer. This problem is solved in a radical way, as the manipulation of pointers is forbidden in managed mode.

The CLR knows how to manipulate three kinds of pointers:

Managed pointers. These pointers can point to data contained in the object heap managed by the garbage collector. They are not used explicitly in C# code. Instead, the C# compiler uses them implicitly when it compiles methods with out and ref arguments.

Unmanaged function pointers. These pointers are conceptually close to the notion of delegate. We will discuss them at the end of this article.

Unmanaged pointers. These pointers can point to any data contained in the user address space of the process. The C# language allows the use of this type of pointer in zones of code considered unsafe. The IL code emitted by the C# compiler for the zones of code that use unmanaged pointers makes use of specialized IL instructions. Their effect on the memory of the process cannot be verified by the JIT compiler of the CLR. Consequently, a malicious user can take advantage of unsafe code regions to accomplish malicious actions. To counter this weakness, the CLR will only allow the execution of this code at run-time if the code has the SkipVerification CAS meta-permission.

Because it allows the memory of a process to be manipulated directly through unmanaged pointers, unsafe code is particularly useful for optimizing certain kinds of processing on large amounts of data stored in structures.
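Of the three kinds of pointers listed above, managed pointers are the only ones that appear in ordinary, verifiable C# code, through the ref and out keywords. The following is a minimal sketch of that implicit use; the class and method names (RefOutSketch, Swap, TryHalve) are illustrative and do not come from the article.

```csharp
using System;

public static class RefOutSketch
{
    // 'ref' passes the address of the caller's variable; the compiler
    // emits a managed pointer for each parameter.
    public static void Swap(ref int a, ref int b)
    {
        int temp = a;
        a = b;
        b = temp;
    }

    // 'out' also passes an address, but the callee must assign
    // the parameter before returning.
    public static bool TryHalve(int value, out int half)
    {
        half = value / 2;
        return value % 2 == 0;
    }

    public static void Main()
    {
        int x = 10, y = 20;
        Swap(ref x, ref y);
        Console.WriteLine("x = " + x + " y = " + y);   // x = 20 y = 10

        int half;
        bool even = TryHalve(8, out half);
        Console.WriteLine(even + " " + half);          // True 4
    }
}
```

No unsafe keyword or compiler option is needed here, because the CLR can verify every use of a managed pointer.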

Compilation options to allow unsafe code

Unsafe code must be used deliberately: you must pass the /unsafe option to the csc.exe compiler to tell it that you are aware that the code you wish to compile contains zones which the JIT compiler will see as unverifiable. Visual Studio offers the Build > Allow unsafe code project property to set this compiler option.

Declaring unsafe code in C#

In C#, the unsafe keyword lets the compiler know when you will use unsafe code. It can be used in three situations:

Before the declaration of a class or structure. In this case, all the methods of the type can use pointers.

Before the declaration of a method. In this case, the pointers can be used within the body of this method and in its signature.

Within the body of a method (static or not). In this case, pointers are only allowed within the marked block of code. For example:

unsafe
{
...
}

Note that if a method accepts at least one pointer as an argument or returns one, the method (or its class) must be marked as unsafe, and every region of code that calls this method must be marked as unsafe as well.
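The three placements of the unsafe keyword can be seen side by side in the sketch below. The type and method names (Node, UnsafePlacementSketch, ReadThrough, ReadLocal) are made up for illustration, and the code must be compiled with the /unsafe option.

```csharp
using System;

// 1. unsafe on a type: every member of the type may use pointers.
public unsafe struct Node
{
    public int Value;
    public Node* Next;
}

public static class UnsafePlacementSketch
{
    // 2. unsafe on a method: pointers are allowed in the signature and the body.
    public static unsafe int ReadThrough(int* p)
    {
        return *p;
    }

    // 3. unsafe block inside an otherwise safe method. The call to
    //    ReadThrough must sit inside the block because it passes a pointer.
    public static int ReadLocal()
    {
        int local = 42;
        unsafe
        {
            return ReadThrough(&local);
        }
    }

    public static void Main()
    {
        Console.WriteLine(ReadLocal());   // prints 42
    }
}
```

Main itself needs no unsafe marker, since the signature of ReadLocal contains no pointer types.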

Using pointers in C#

Each object, whether it is a value or reference type instance, has a memory address at which it is physically located in the process. This address is not necessarily constant during the lifetime of the object, as the garbage collector can physically move objects stored in the heap.

.NET types that support pointers

For certain types, there is a dual type: the unmanaged pointer type which corresponds to the managed type. A pointer variable holds the address of an instance of the type concerned. Pointers are allowed only to value types, with the exception of structures that have at least one reference type field. Consequently, only instances of the following types can be used through pointers: primitive types; enumerations; structures with no reference type fields; pointers.
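As a rough illustration of the four categories just listed, the sketch below takes a pointer to a primitive, an enumeration, a structure without reference type fields, and another pointer. Color, Pair and PointerTypesSketch are hypothetical names chosen for this example; compile with /unsafe.

```csharp
using System;

public enum Color { Red, Green }

public struct Pair
{
    public int A;
    public int B;
}

public static class PointerTypesSketch
{
    public static unsafe int Demo()
    {
        int i = 1;
        Color c = Color.Green;
        Pair pair = new Pair();
        pair.A = 3;
        pair.B = 4;

        int* pi = &i;        // pointer to a primitive type
        Color* pc = &c;      // pointer to an enumeration
        Pair* pp = &pair;    // pointer to a structure with no reference type fields
        int** ppi = &pi;     // pointer to a pointer

        // string* ps;       // would not compile: string is a reference type

        // 1 (via ppi) + 1 (Color.Green) + 3 + 4
        return **ppi + (int)(*pc) + pp->A + pp->B;
    }

    public static void Main()
    {
        Console.WriteLine(Demo());   // prints 9
    }
}
```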

Declaring pointers

A pointer might point to nothing. In this case, it is extremely important that its value be set to null (0). In fact, the majority of bugs due to pointers come from pointers which are not null but which point to invalid data. A pointer to the type FooType is declared as follows:

FooType * pointeur;

For example:

long * pAnInteger = null;

Note that the declaration...

int * p1,p2;

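... makes both p1 and p2 pointers to an integer. This differs from C and C++, where the same declaration would make p2 a plain integer: in C#, the * belongs to the type, not to the individual variable name.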
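The declaration rules can be checked with a short sketch such as the following. DeclarationSketch and its variable names are invented for this example, and the code needs the /unsafe option.

```csharp
using System;

public static class DeclarationSketch
{
    public static unsafe bool Demo()
    {
        // A pointer that points to nothing is initialized with null.
        long* pNothing = null;

        int a = 1, b = 2;

        // In C#, both p1 and p2 have the type int*.
        int* p1, p2;
        p1 = &a;
        p2 = &b;   // compiles only because p2 is an int*, not an int

        return pNothing == null && *p1 + *p2 == 3;
    }

    public static void Main()
    {
        Console.WriteLine(Demo());   // prints True
    }
}
```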

Now let's see the first program

Program 1

using System;

class MyClass {
 public static void Main() {
  int iData = 10;
  int* pData = &iData;
  Console.WriteLine("Data is " + iData);
  Console.WriteLine("Address is " + (int)pData );
 }
}

This program uses a pointer without marking any code as unsafe. When we compile it, the compiler rejects it with error CS0214, because pointers may only be used in an unsafe context.

Now let's change the program a little and add the unsafe modifier to the method.

Program 2

using System;

class MyClass {
 public unsafe static void Main() {
  int iData = 10;
  int* pData = &iData;
  Console.WriteLine("Data is " + iData);
  Console.WriteLine("Address is " + (int)pData );
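 }
}

The output of the program is:

Data is 10
Address is 1244316

The address shown is only an example; it varies between machines and runs.

It is not necessary to apply the unsafe modifier to the whole method. We can instead define a block of unsafe code. Let's change the program a little more.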

Program 3

using System;

class MyClass {
 public static void Main() {
  unsafe {
   int iData = 10;
   int* pData = &iData;
   Console.WriteLine("Data is " + iData);
   Console.WriteLine("Address is " + (int)pData );
  }
 }
}

In this program a block is marked with the unsafe modifier, so we can use pointers inside that block. The output of this program is the same as that of the previous one.

Now let's change the program a little bit to get a value from the pointer.

Program 4

using System;

class MyClass {
 public static void Main() {
  unsafe {
   int iData = 10;
   int* pData = &iData;
   Console.WriteLine("Data is " + iData);
   Console.WriteLine("Data is " + pData->ToString() );
   Console.WriteLine("Address is " + (int)pData );
  }
 }
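}

Here pData->ToString() reads the value through the pointer, so the second line of output also shows 10.

Next, let's move the pointer code into a separate unsafe method and call it from Main.

Program 5

using System;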

class MyClass {
 public static void Main() {
  testFun();
 }

 public static unsafe void testFun() {
  int iData = 10;
  int* pData = &iData;
  Console.WriteLine("Data is " + iData);
  Console.WriteLine("Address is " + (int)pData );
 }
}

In this program a method with the unsafe modifier is called from an ordinary method. This shows that safe code can call an unsafe method, provided the method's signature contains no pointer types. Note that the unsafe method is still managed code; it is simply not verifiable. The output of the program is the same as that of Program 3.

Now let's change the program a little and put the unsafe method in another class.

Program 6

using System;

class MyClass {
 public static void Main() {
  TestClass Obj = new TestClass();
  Obj.testFun();
 }
}

class TestClass {
 public unsafe void testFun() {
  int iData = 10;
  int* pData = &iData;
  Console.WriteLine("Data is " + iData);
  Console.WriteLine("Address is " + (int)pData );
 }
}

The output of the program is the same as that of the previous one.

Now let's pass a pointer as a parameter.

Program 7

using System;

class MyClass {
 public static void Main() {
  TestClass Obj = new TestClass();
  Obj.testFun();
 }
}

class TestClass {
 public unsafe void testFun() {
  int x = 10;
  int y = 20;
  Console.WriteLine("Before swap x = " + x + " y= " + y);
  swap(&x, &y);
  Console.WriteLine("After swap x = " + x + " y= " + y);
 }

 public unsafe void swap(int* p_x, int *p_y) {
  int temp = *p_x;
  *p_x = *p_y;
  *p_y = temp;
 }
}

In this program the unsafe method testFun() calls the classic swap() function, which interchanges the values of two variables whose addresses are passed as pointers. Now let's change the program a little.

Program 8

using System;

class MyClass {
 public static void Main() {
  TestClass Obj = new TestClass();
  unsafe {
   int x = 10;
   int y = 20;
   Console.WriteLine("Before swap x = " + x + " y= " + y);
   Obj.swap(&x, &y);
   Console.WriteLine("After swap x = " + x + " y= " + y);
  }
 }
}

class TestClass {
 public unsafe void swap(int* p_x, int* p_y) {
  int temp = *p_x;
  *p_x = *p_y;
  *p_y = temp;
 }
}

This program does the same job as the previous one, but here we write only one unsafe method and call it from an unsafe block in Main.

Now let's look at another program, which shows the use of arrays in C#.

Program 9

using System;

class MyClass {
 public static void Main() {
  TestClass Obj = new TestClass();
  Obj.fun();
 }
}

class TestClass {
 public unsafe void fun() {
  int [] iArray = new int[10];

  // store value in array
  for (int iIndex = 0; iIndex < 10; iIndex++) {
   iArray[iIndex] = iIndex * iIndex;
  }

  // get value from array
  for (int iIndex = 0; iIndex < 10; iIndex++) {
   Console.WriteLine(iArray[iIndex]);
  }
 }
}

This program displays the squares of the numbers from 0 to 9.

Let's change the program a little bit and pass the array as a parameter to a function.

Program 10

using System;

class MyClass {
 public static void Main() {
  TestClass Obj = new TestClass();
  Obj.fun();
 }
}

class TestClass {
 public unsafe void fun() {
  int [] iArray = new int[10];

  // store value in array
  for (int iIndex = 0; iIndex < 10; iIndex++) {
   iArray[iIndex] = iIndex * iIndex;
  }

  testFun(iArray);
 }

 public unsafe void testFun(int [] p_iArray) {

  // get value from array
  for (int iIndex = 0; iIndex < 10; iIndex++) {
   Console.WriteLine(p_iArray[iIndex]);
  }
 }
}

The output of the program is the same as that of the previous one.

Now let's change the program a little bit and try to get the value of the array from a pointer rather than an index.

Program 11

using System;

class MyClass {
 public static void Main() {
  TestClass Obj = new TestClass();
  Obj.fun();
 }
}

class TestClass {
 public unsafe void fun() {
  int [] iArray = new int[10];

  // store value in array
  for (int iIndex = 0; iIndex < 10; iIndex++) {
   iArray[iIndex] = iIndex * iIndex;
  }

  // get value from array
  for (int iIndex = 0; iIndex < 10; iIndex++) {
   Console.WriteLine(*(iArray + iIndex) );
  }
 }
}

In this program we try to access the values of the array through *(iArray + iIndex) rather than iArray[iIndex]. But the compiler gives the following error.

Microsoft (R) Visual C# Compiler Version 7.00.9030 [CLR version 1.00.2204.21]
Copyright (C) Microsoft Corp 2000. All rights reserved.

um11.cs(21,24): error CS0019: Operator '+' cannot be applied to operands of type 'int[]' and 'int'

In C#, int* and int[] are not treated the same. To understand this better, let's look at one more program.

Program 12

using System;

class MyClass {
 public static void Main() {
  TestClass Obj = new TestClass();
  Obj.fun();
 }
}

class TestClass {
 public unsafe void fun() {
  int [] iArray = new int[10];
  iArray++;

  int* iPointer = (int*)0;
  iPointer++;

 }
}

There are two different kinds of variable in this program: iArray is declared as an array and iPointer as a pointer. The program tries to increment both. Incrementing the pointer is allowed, because pointer arithmetic is defined for pointer types. Incrementing iArray is not: in C# an array variable is a reference to a managed object rather than a raw address, and no arithmetic operators are defined on it.

Compiling the program therefore produces an error.

Microsoft (R) Visual C# Compiler Version 7.00.9030 [CLR version 1.00.2204.21]
Copyright (C) Microsoft Corp 2000. All rights reserved.

um12.cs(13,3): error CS0187: No such operator '++' defined for type 'int[]'

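To access the elements of an array through a pointer, we must first pin the array in memory so that the garbage collector cannot move it while we hold its address. C# provides the fixed keyword for this. The pointer declared by a fixed statement cannot itself be modified, but it can be used in pointer arithmetic.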

Program 13

using System;

class MyClass {
 public static void Main() {
  TestClass Obj = new TestClass();
  Obj.fun();
 }
}

class TestClass {
 public unsafe void fun() {
  int [] iArray = new int[10];

  // store value in array
  for (int iIndex = 0; iIndex < 10; iIndex++) {
   iArray[iIndex] = iIndex * iIndex;
  }

  // get value from array
  fixed(int* pInt = iArray)
  for (int iIndex = 0; iIndex < 10; iIndex++) {
   Console.WriteLine(*(pInt + iIndex) );
  }
 }
}

We can use the same technique to pass the array to a function which receives the pointer as a parameter.

Program 14

using System;

class MyClass {
 public static void Main() {
  TestClass Obj = new TestClass();
  Obj.fun();
 }
}

class TestClass {
 public unsafe void fun() {
  int [] iArray = new int[10];

  // store value in array
  for (int iIndex = 0; iIndex < 10; iIndex++) {
   iArray[iIndex] = iIndex * iIndex;
  }

  // get value from array
  fixed(int* pInt = iArray)
  testFun(pInt);
 }

 public unsafe void testFun(int* p_pInt) {

  for (int iIndex = 0; iIndex < 10; iIndex++) {
   Console.WriteLine(*(p_pInt + iIndex) );
  }
 }
}

The output of the program is the same as that of the previous one. Pointer access is not bounds-checked, so if we read beyond the end of the array, the program prints whatever happens to be in the adjacent memory.

Program 15

using System;

class MyClass {
 public static void Main() {
  TestClass Obj = new TestClass();
  Obj.fun();
 }
}

class TestClass {
 public unsafe void fun() {
  int [] iArray = new int[10];

  // store value in array
  for (int iIndex = 0; iIndex < 10; iIndex++) {
   iArray[iIndex] = iIndex * iIndex;
  }

  // get value from array
  fixed(int* pInt = iArray)
  testFun(pInt);
 }

 public unsafe void testFun(int* p_pInt) {

  for (int iIndex = 0; iIndex < 20; iIndex++) {
   Console.WriteLine(*(p_pInt + iIndex) );
  }
 }
}

Here we try to read 20 elements, but the array holds only 10. After printing the 10 elements of the array, the program prints garbage (or may even crash), because the remaining reads fall outside the array.
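One common way to avoid reading past the end of an array is to pass the element count along with the pointer, so that the callee never has to guess the length. The sketch below assumes this convention; BoundedReadSketch, Sum and SumArray are illustrative names that do not appear in the programs above. Compile with /unsafe.

```csharp
using System;

public static class BoundedReadSketch
{
    // Sums 'count' integers starting at 'p'. The caller is responsible
    // for making sure that 'count' does not exceed the real buffer size.
    public static unsafe int Sum(int* p, int count)
    {
        int total = 0;
        for (int* end = p + count; p < end; p++)
        {
            total += *p;
        }
        return total;
    }

    // Safe wrapper: pins the array and passes its true length.
    public static unsafe int SumArray(int[] data)
    {
        fixed (int* p = data)
        {
            return Sum(p, data.Length);
        }
    }

    public static void Main()
    {
        int[] squares = new int[10];
        for (int i = 0; i < squares.Length; i++)
        {
            squares[i] = i * i;
        }
        Console.WriteLine(SumArray(squares));   // prints 285
    }
}
```

Keeping the fixed statement inside a small wrapper such as SumArray also limits how long the array stays pinned.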

Pointers can also point to structures, as the next program shows.

Program 16

using System;

struct Point {
 public int iX;
 public int iY;
}

class MyClass {
 public unsafe static void Main() {

  // a Point value stored on the stack
  Point refPoint = new Point();
  refPoint.iX = 10;
  refPoint.iY = 20;

  // pointer to the Point value
  Point* pPoint = &refPoint;

  Console.WriteLine("X = " + pPoint->iX);
  Console.WriteLine("Y = " + pPoint->iY);

  Console.WriteLine("X = " + (*pPoint).iX);
  Console.WriteLine("Y = " + (*pPoint).iY);

 }
}

Here pPoint is a pointer to an instance of the Point structure. We can access its members with the -> operator, or by dereferencing the pointer and using the . operator.

Change in Beta 2

To compile a program from the command line, you type the program name after the compiler name. For example, if your program is named prog1.cs, you compile it like this:

csc prog1.cs

This works fine for unsafe code in beta 1. In beta 2, Microsoft added one more switch to the C# command line compiler for unsafe code. If you want to write unsafe code, you now have to pass the /unsafe switch to the command line compiler; otherwise the compiler gives an error. In beta 2, a program containing unsafe code is compiled as follows:

csc /unsafe prog1.cs
