
System.Int32 contains... another System.Int32

I used reflection to inspect the contents of System.Int32 and found that it contains another System.Int32.


System.Int32 m_value;

I don't see how that's possible.

This int really is the "backing integer" of the one you have: if you box an int and use reflection to change the value of its m_value field, you effectively change the value of the integer:

using System;
using System.Reflection;

object testInt = 4;
Console.WriteLine(testInt); // yields 4

typeof(System.Int32)
    .GetField("m_value", BindingFlags.NonPublic | BindingFlags.Instance)
    .SetValue(testInt, 5);
Console.WriteLine(testInt); // yields 5

There's gotta be a rational explanation behind this singularity. How can a value type contain itself? What magic does the CLR use to make it work?


As noted, a 32-bit integer can exist in two varieties. The fast version is just four bytes anywhere in memory or in a CPU register (not only on the stack). The boxed version is embedded in a System.Object. The declaration of System.Int32 is compatible with the latter: when boxed, it has the typical object header, followed by 4 bytes that store the value, and those 4 bytes map exactly to the m_value member. Maybe you see why there's no conflict here: m_value is always the fast, non-boxed version, because there is no such thing as a boxed boxed integer.
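In C# terms, a small sketch of the two varieties (the variable names are just illustrative):

using System;

int fast = 4;            // the fast version: 4 bytes in a register or wherever the variable lives, no header
object boxed = fast;     // the boxed version: object header followed by 4 bytes (the m_value slot) on the GC heap

fast = 10;                // boxing copied the value, so the box keeps its own 4 bytes
Console.WriteLine(fast);  // 10
Console.WriteLine(boxed); // 4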

Both the language compiler and the JIT compiler are keenly aware of the properties of an Int32. The language compiler is responsible for deciding when the integer needs to be boxed or unboxed and generates the corresponding IL instructions to do so. It also knows which IL instructions allow the integer to be operated on without boxing it first. That is readily evident from the methods implemented by System.Int32: it doesn't have an override for operator==(), for example; that comparison is done by the CEQ opcode. But it does have an override for Equals(), required to override the Object.Equals() method for when the integer is boxed. Your compiler needs to have that same kind of awareness.
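A short C# illustration of that division of labour (illustrative only, not part of the original answer):

using System;

int a = 5, b = 5;

bool viaOperator = a == b;      // no operator method involved: the compiler emits the CEQ opcode
bool viaEquals = a.Equals(b);   // calls the Equals overload that System.Int32 defines for unboxed comparison

object boxed = a;               // once boxed, callers go through the virtual Object.Equals override
bool viaObject = boxed.Equals(b);

Console.WriteLine($"{viaOperator} {viaEquals} {viaObject}"); // True True True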


Check out this thread for a laborious discussion of this mystery.


The magic is actually in the boxing/unboxing.

System.Int32 (and its alias int) is a value type, which means that it's normally allocated on the stack. The CLR takes your System.Int32 declaration and simply turns it into 32 bits of stack space.

However, when you write object testInt = 4;, the compiler automatically boxes your value 4 into a reference, since object is a reference type. What you have is a reference that points to a System.Int32, which is now 32 bits of space on the heap somewhere. But the auto-boxed reference to a System.Int32 is called (...wait for it...) System.Int32.

What your code sample does is create a boxed System.Int32 and then change the value-type System.Int32 stored inside that box. This explains the seemingly bizarre behavior.
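To see the same thing from the other side, unboxing copies the mutated m_value back out. A sketch that reuses the question's reflection trick (it relies on the internal field name m_value from the question):

using System;
using System.Reflection;

object testInt = 4;                        // boxed System.Int32 on the heap
typeof(int)
    .GetField("m_value", BindingFlags.NonPublic | BindingFlags.Instance)
    .SetValue(testInt, 5);                 // mutate the payload stored inside the box

int unboxed = (int)testInt;                // unboxing copies the box's 4-byte payload into a fast, unboxed int
Console.WriteLine(unboxed);                // 5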


Rationale

The other answers are ignorant and/or misleading.

 

Prologue

It may help to first read my answer to How do ValueTypes derive from Object (ReferenceType) and still be ValueTypes?

 

So what's going on?

System.Int32 is a struct that contains a 32-bit signed integer. It does not contain itself. To reference a value type in IL, the syntax is valuetype [assembly]Namespace.TypeName.

ECMA-335, II.7.2 Built-in types:

The CLI built-in types have corresponding value types defined in the Base Class Library. They shall be referenced in signatures only using their special encodings (i.e., not using the general purpose valuetype TypeReference syntax). Partition I specifies the built-in types.

This means that if you have a method that takes a 32-bit integer, you must not use the general-purpose valuetype [mscorlib]System.Int32 syntax, but rather the special encoding for the built-in 32-bit signed integer, int32.

In C#, this means that whether you write System.Int32 or int, both compile to int32, not to valuetype [mscorlib]System.Int32.

You may have heard that int is an alias for System.Int32 in C#, but in reality both are aliases for the built-in CLI value type int32.
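A quick way to convince yourself that the two C# spellings denote one and the same type (illustrative snippet):

using System;

System.Int32 a = 1;
int b = a;   // no conversion involved: both spellings name the very same struct

Console.WriteLine(typeof(int) == typeof(System.Int32)); // True
Console.WriteLine(typeof(int).FullName);                // System.Int32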

So while a struct like

public struct MyStruct
{
    internal MyStruct m_value;
}

indeed would compile to the following (and thus be invalid):

.class public sequential ansi sealed beforefieldinit MyStruct extends [mscorlib]System.ValueType
{
    .field assembly valuetype MyStruct m_value;
}

the declaration of System.Int32,

namespace System
{
    public struct Int32
    {
        internal int m_value;
    }
}

instead compiles to (ignoring interfaces):

.class public sequential ansi sealed beforefieldinit System.Int32 extends [mscorlib]System.ValueType
{
    .field assembly int32 m_value;
}

The C# compiler does not need a special case to compile System.Int32, because the CLI specification stipulates that all references to System.Int32 in signatures are replaced with the special encoding for the built-in value type int32. Ergo, System.Int32 is a struct that doesn't contain another System.Int32 but an int32. In IL, you can even have two method overloads, one taking a System.Int32 and the other taking an int32, and have them co-exist:

.assembly extern mscorlib
{
  .publickeytoken = (B7 7A 5C 56 19 34 E0 89)
  .ver 2:0:0:0
}
.assembly test {}
.module test.dll
.imagebase 0x00400000
.file alignment 0x00000200
.stackreserve 0x00100000
.subsystem 0x0003
.corflags 0x00000001
.class MyNamespace.Program
{
    .method static void Main() cil managed
    {
        .entrypoint
        ldc.i4.5
        call int32 MyNamespace.Program::Lol(valuetype [mscorlib]System.Int32) // Call the one taking the System.Int32 type.
        call int32 MyNamespace.Program::Lol(int32) // Call the overload taking the built in int32 type.
        call void [mscorlib]System.Console::Write(int32)
        call valuetype [mscorlib]System.ConsoleKeyInfo [mscorlib]System.Console::ReadKey()
        pop
        ret
    }

    .method static int32 Lol(valuetype [mscorlib]System.Int32 x) cil managed
    {
        ldarg.0
        ldc.i4.1
        add
        ret
    }

    .method static int32 Lol(int32 x) cil managed
    {
        ldarg.0
        ldc.i4.1
        add
        ret
    }
}

Decompilers like ILSpy, dnSpy, .NET Reflector, etc. can be misleading. They (at the time of writing) will decompile both int32 and System.Int32 as either the C# keyword int or the type System.Int32, because that's how we define integers in C#.

But int32 is the built-in value type for 32-bit signed integers (i.e., the VES has direct support for it, with instructions like add, sub, ldc.i4.x, etc.), while System.Int32 is the corresponding value type defined in the class library; that corresponding type is used for boxing and for methods like ToString(), CompareTo(), etc.
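Roughly, in C# terms (a small sketch; the comments describe the IL the compiler typically emits):

using System;

int n = 40;
int sum = n + 2;             // pure int32 work: ldc.i4 / add instructions, no System.Int32 member involved
string text = n.ToString();  // resolved against the corresponding System.Int32 struct in the class library
int cmp = n.CompareTo(sum);  // likewise a System.Int32 method
object box = n;              // boxing produces a System.Int32 object on the heap

Console.WriteLine($"{sum} {text} {cmp} {box}"); // 42 40 -1 40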

 

If you write a program in pure IL, you can make your own value type that contains an int32 in exactly the same way: you keep working with plain int32 values but call methods on your custom "corresponding" value type.

.class MyNamespace.Program
{
    .method hidebysig static void  Main(string[] args) cil managed
    {
        .entrypoint
        .maxstack 8
        ldc.i4.0
        call void MyNamespace.Program::PrintWhetherGreaterThanZero(int32)
        ldc.i4.m1 // -1
        call void MyNamespace.Program::PrintWhetherGreaterThanZero(int32)
        ldc.i4.3
        call void MyNamespace.Program::PrintWhetherGreaterThanZero(int32)
        ret
    }

    .method private hidebysig static void PrintWhetherGreaterThanZero(int32 'value') cil managed noinlining 
    {
        .maxstack 8
        ldarga 0
        call instance bool MyCoolInt32::IsGreaterThanZero()
        brfalse.s printOtherMessage

        ldstr "Value is greater than zero"
        call void [mscorlib]System.Console::WriteLine(string)
        ret
    printOtherMessage:
        ldstr "Value is not greater than zero"
        call void [mscorlib]System.Console::WriteLine(string)
        ret
    }
}

.class public sequential ansi sealed MyCoolInt32 extends [mscorlib]System.ValueType
{
    .field assembly int32 myCoolIntsValue;

    .method public hidebysig instance bool IsGreaterThanZero() cil managed
    {
        .maxstack 8

        ldarg.0
        ldind.i4
        ldc.i4.0
        bgt.s isNonZero
        ldc.i4.0
        ret
    isNonZero:
        ldc.i4.1
        ret
    }
}

This is no different from the System.Int32 type, except that the C# compiler doesn't consider MyCoolInt32 the corresponding type for int32; to the CLR it doesn't matter. The assembly will fail PEVerify.exe, but it runs just fine. Decompilers will show casts and apparent pointer dereferences when decompiling the above, because they don't consider MyCoolInt32 and int32 related either.

But functionally, there's no difference, and there's no magic going on behind the scenes in the CLR.
