05. Exploring the Structure of Class in Depth

1. Memory offsetting

Let's start with an example that shows how memory offsetting works.

int nums[4] = {11, 12, 13, 14};

NSLog(@"Address of array nums: %p\n", &nums);

for (int i = 0; i < 4; i++) {
    NSLog(@"Element %d is %d, address is %p", i + 1, nums[i], &nums[i]);
}

NSLog(@"\n");

// The array name decays to a pointer to its first element.
int *numsPointer = nums;
NSLog(@"Pointer to array nums: %p\n", numsPointer);

for (int i = 0; i < 4; i++) {
    // numsPointer + i advances by i * sizeof(int) bytes.
    NSLog(@"Offset %d, value %d, address is %p", i, *(numsPointer + i), (numsPointer + i));
}

Output:

2021-06-22 23:27:02.344397+0800 Class[15488:5517067] Address of array nums: 0x16f00d3a0

2021-06-22 23:27:02.344734+0800 Class[15488:5517067] Element 1 is 11, address is 0x16f00d3a0
2021-06-22 23:27:02.344910+0800 Class[15488:5517067] Element 2 is 12, address is 0x16f00d3a4
2021-06-22 23:27:02.345776+0800 Class[15488:5517067] Element 3 is 13, address is 0x16f00d3a8
2021-06-22 23:27:02.346025+0800 Class[15488:5517067] Element 4 is 14, address is 0x16f00d3ac
2021-06-22 23:27:02.346173+0800 Class[15488:5517067] 
2021-06-22 23:27:02.346328+0800 Class[15488:5517067] Pointer to array nums: 0x16f00d3a0

2021-06-22 23:27:02.347933+0800 Class[15488:5517067] Offset 0, value 11, address is 0x16f00d3a0
2021-06-22 23:27:02.348115+0800 Class[15488:5517067] Offset 1, value 12, address is 0x16f00d3a4 // 4 bytes
2021-06-22 23:27:02.348425+0800 Class[15488:5517067] Offset 2, value 13, address is 0x16f00d3a8 // 4 bytes
2021-06-22 23:27:02.349788+0800 Class[15488:5517067] Offset 3, value 14, address is 0x16f00d3ac // 4 bytes

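Each element of nums is 4 bytes, and each pointer increment moves 4 bytes. That is why we can read every element by offsetting from the array's first address.

Can we read values out of the memory layout of Class the same way, by offsetting from its address?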

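2. Computing the memory layout of Class

In the source code of Class, the core data structure looks like this. We ignore the methods for now because they do not affect the memory layout of Class.

Methods are stored in the code area, not inside the struct itself.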

struct objc_class : objc_object {
    ...

    // Class ISA; inherited from objc_object
    Class superclass;
    cache_t cache;             // formerly cache pointer and vtable
    class_data_bits_t bits;    // class_rw_t * plus custom rr/alloc flags

    ...
}

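| Field | ISA | superclass | cache | bits |
| --- | --- | --- | --- | --- |
| Description | ISA (struct pointer) | Superclass (struct pointer) | Method cache | Detailed class information |
| Size (bytes) | 8 | 8 | 16 | 8 |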

3. objc_class: memory layout of cache_t

Ignoring the methods, the remaining core data structure is as follows.

struct cache_t {
private:
    explicit_atomic<uintptr_t> _bucketsAndMaybeMask;
    union {
        struct {
            explicit_atomic<mask_t>    _maybeMask;
#if __LP64__
            uint16_t                   _flags;
#endif
            uint16_t                   _occupied;
        };
        explicit_atomic<preopt_cache_t *> _originalPreoptCache;
    };
}

3.1 Side note: the LP64 data model

The code above contains __LP64__, which stands for the LP64 data model. Today, 64-bit Unix-like platforms generally use the LP64 data model.

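| Type (size in bits) | LP32 | ILP32 | LP64 | ILP64 | LLP64 |
| --- | --- | --- | --- | --- | --- |
| Meaning | long and pointer are 32-bit | int, long, and pointer are 32-bit | long and pointer are 64-bit | int, long, and pointer are 64-bit | long long and pointer are 64-bit |
| CHAR | 8 | 8 | 8 | 8 | 8 |
| SHORT | 16 | 16 | 16 | 16 | 16 |
| INT | 16 | 32 | 32 | 64 | 32 |
| LONG | 32 | 32 | 64 | 64 | 32 |
| LONG LONG | 64 | 64 | 64 | 64 | 64 |
| POINTER | 32 | 32 | 64 | 64 | 64 |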

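3.2 Memory layout of the inner union

The union has two members: a struct and a pointer. We already know that a pointer takes 8 bytes on arm64. Let's see how much the struct takes.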

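| Field | _maybeMask | _flags | _occupied |
| --- | --- | --- | --- |
| Type | uint32_t | uint16_t | uint16_t |
| Size (bytes) | 4 | 2 | 2 |

The struct is 4 + 2 + 2 = 8 bytes and the pointer is also 8 bytes, so the union is 8 bytes.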

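3.3 Layout summary

| Field | _bucketsAndMaybeMask | union |
| --- | --- | --- |
| Description | Pointer-sized value (its purpose still needs investigation) | Union |
| Size (bytes) | 8 (uintptr_t, the same width as long) | 8 |

cache_t is 16 bytes in total.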

4. objc_class: memory layout of class_data_bits_t

The core data structure is as follows.

struct class_data_bits_t {
  ...
  // Values are the FAST_ flags above.
  uintptr_t bits;
  ...
}

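It contains a single pointer-sized member, bits. So how do we get the detailed class information from it?

We find the function below: ANDing bits with FAST_DATA_MASK yields a class_rw_t pointer, much like the masking step we used to get the class out of ISA.

What class_rw_t itself holds is examined in section 5.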

class_rw_t* data() const {
    return (class_rw_t *)(bits & FAST_DATA_MASK);
}

Definition of FAST_DATA_MASK under LP64:

#if __LP64__
...
// data pointer
#define FAST_DATA_MASK          0x00007ffffffffff8UL

5. class_data_bits_t: class_rw_t

rw stands for read-write.

5.1 Memory layout of class_rw_t

struct class_rw_t {
  // Be warned that Symbolication knows the layout of this structure.
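  // Flag bits, e.g. whether this is a metaclass, whether the class is realized, etc.
  uint32_t flags;
  uint16_t witness;

  explicit_atomic<uintptr_t> ro_or_rw_ext; // holds either a class_ro_t * or a class_rw_ext_t *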

  Class firstSubclass;
  Class nextSiblingClass;

}

5.2 Some important functions

Some familiar functions show up here. In methods(), properties(), and protocols() we find another important type, class_rw_ext_t.

class_rw_ext_t *deepCopy(const class_ro_t *ro) {
    return extAlloc(ro, true);
}

const method_array_t methods() const {
    auto v = get_ro_or_rwe();
    if (v.is<class_rw_ext_t *>()) {
        return v.get<class_rw_ext_t *>(&ro_or_rw_ext)->methods;
    } else {
        return method_array_t{v.get<const class_ro_t *>(&ro_or_rw_ext)->baseMethods()};
    }
}

const property_array_t properties() const {
    ...
}

const protocol_array_t protocols() const {
    ...
}

6. class_data_bits_t: class_rw_t: class_rw_ext_t

struct class_rw_ext_t {
    DECLARE_AUTHED_PTR_TEMPLATE(class_ro_t)
    class_ro_t_authed_ptr<const class_ro_t> ro; // read-only class data; holds the ivar list (see 6.2)
    method_array_t methods;
    property_array_t properties;
    protocol_array_t protocols;
    char *demangledName;
    uint32_t version;
};

6.1 list_array_tt

method_array_t, property_array_t, and protocol_array_t all inherit from list_array_tt and differ only in their template arguments.

Core part of the source:

template <typename Element, typename List, template<typename> class Ptr>
class list_array_tt {
  ...
  private:
    union {
        Ptr<List> list;
        uintptr_t arrayAndFlag;
    };
  ...
}

1. method_array_t

class method_array_t :
    public list_array_tt<method_t, method_list_t, method_list_t_authed_ptr> {...}

2. property_array_t

class property_array_t :
    public list_array_tt<property_t, property_list_t, RawPtr> {...}

3. protocol_array_t

class protocol_array_t :
    public list_array_tt<protocol_ref_t, protocol_list_t, RawPtr> {...}

6.2 class_ro_t

Core part of the source:

struct class_ro_t {
  ...
  const ivar_list_t * ivars; // instance variable list
  ...
}

7. Debugging the Class memory layout with LLDB

[Figure: structure diagram]

Suppose we have the following class:

@interface RYModel : NSObject

@property (nonatomic, copy) NSString *name;
@property (nonatomic, assign) NSInteger age;
@property (nonatomic, assign) float height;
@property (nonatomic, strong) RYModel *father;
@property (nonatomic, assign) BOOL isBoy;

- (void)dosomething;

- (void)dosomethingWith:(NSString *)title;

+ (void)classDoSomething;

- (NSString *)sayMyName;

@end

7.1 Property list

Getting class_data_bits_t

We locate class_data_bits_t by memory offset. As described earlier, bits sits at an offset of 8 + 8 + 16 = 32 bytes.

So in theory its address is the class address + 0x20.

(lldb) p/x RYModel.class
(Class) $0 = 0x00000001000083e0 RYModel

(lldb) x/4gx 0x00000001000083e0
0x1000083e0: 0x00000001000083b8 0x000000010036a140
0x1000083f0: 0x000000010073c2d0 0x0004803400000007

// Compute the address of bits
(lldb) p/x 0x1000083e0 + 0x20
(long) $1 = 0x0000000100008400

// Cast it to the right type to make the following steps easier
(lldb) p (class_data_bits_t *)0x0000000100008400 
(class_data_bits_t *) $2 = 0x0000000100008400

Getting class_rw_t through data()

(lldb) p (class_data_bits_t *)0x0000000100008400
(class_data_bits_t *) $4 = 0x0000000100008400

(lldb) p $4->data()
(class_rw_t *) $5 = 0x000000010073c040

Getting the property list container through properties()

(lldb) p $5->properties
(const property_array_t) $6 = {
  list_array_tt<property_t, property_list_t, RawPtr> = {
     = {
      list = {
        ptr = 0x0000000100008280
      }
      arrayAndFlag = 4295000704
    }
  }
}
  Fix-it applied, fixed expression was: 
    $5->properties()

Extract list

(lldb) p $6.list
(const RawPtr<property_list_t>) $7 = {
  ptr = 0x0000000100008280
}

Extract ptr

(lldb) p $7.ptr
(property_list_t *const) $8 = 0x0000000100008280

Dereference ptr

(lldb) p *$8
(property_list_t) $9 = {
  entsize_list_tt<property_t, property_list_t, 0, PointerModifierNop> = (entsizeAndFlags = 16, count = 5)
}

Fetch each property by index

(lldb) p $9.get(0)
(property_t) $10 = (name = "name", attributes = "T@\"NSString\",C,N,V_name")
(lldb) p $9.get(1)
(property_t) $11 = (name = "age", attributes = "Tq,N,V_age")
(lldb) p $9.get(2)
(property_t) $12 = (name = "height", attributes = "Tf,N,V_height")
(lldb) p $9.get(3)
(property_t) $13 = (name = "father", attributes = "T@\"RYModel\",&,N,V_father")
(lldb) p $9.get(4)
(property_t) $14 = (name = "isBoy", attributes = "Tc,N,V_isBoy")

7.2 Instance method list

We skip the steps that are the same as before.

class_rw_t -> methods() returns the method list container

(lldb) p $5->methods
(const method_array_t) $6 = {
  list_array_tt<method_t, method_list_t, method_list_t_authed_ptr> = {
     = {
      list = {
        ptr = 0x00000001000080e8
      }
      arrayAndFlag = 4295000296
    }
  }
}
  Fix-it applied, fixed expression was: 
    $5->methods()

Extract list

(lldb) p $6.list
(const method_list_t_authed_ptr<method_list_t>) $14 = {
  ptr = 0x00000001000080e8
}

Extract ptr

(lldb) p $14.ptr
(method_list_t *const) $15 = 0x00000001000080e8

Dereference ptr to get the method_list_t

(lldb) p *$15
(method_list_t) $16 = {
  entsize_list_tt<method_t, method_list_t, 4294901763, method_t::pointer_modifier> = (entsizeAndFlags = 27, count = 14)
}

Get the number of methods

(lldb) p $16.count
(uint32_t) $29 = 14

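Fetch the instance methods one by one with the C++ functions get() and big()

Besides the instance methods we declared ourselves, the list also contains the setters and getters generated for the properties, as well as the .cxx_destruct destructor.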

(lldb) p $16.get(0).big()
(method_t::big) $19 = {
  name = "setFather:"
  types = 0x0000000100003f60 "v24@0:8@16"
  imp = 0x0000000100003cc0 (ObjcBuild`-[RYModel setFather:])
}
(lldb) p $16.get(1).big()
(method_t::big) $20 = {
  name = "father"
  types = 0x0000000100003f6b "@16@0:8"
  imp = 0x0000000100003ca0 (ObjcBuild`-[RYModel father])
}

... // some setters and getters omitted

(lldb) p $16.get(3).big()
(method_t::big) $22 = {
  name = "dosomething"
  types = 0x0000000100003f3b "v16@0:8"
  imp = 0x0000000100003b30 (ObjcBuild`-[RYModel dosomething])
}
(lldb) p $16.get(4).big()
(method_t::big) $23 = {
  name = "dosomethingWith:"
  types = 0x0000000100003f60 "v24@0:8@16"
  imp = 0x0000000100003b40 (ObjcBuild`-[RYModel dosomethingWith:])
}

...

(lldb) p $16.get(8).big()
(method_t::big) $27 = {
  name = ".cxx_destruct"
  types = 0x0000000100003f3b "v16@0:8"
  imp = 0x0000000100003d30 (ObjcBuild`-[RYModel .cxx_destruct])
}

... // some setters and getters omitted

7.3 Protocol list

We skip the steps that are the same as before.

class_rw_t -> protocols() returns the protocol list container

(lldb) p $2->protocols()
(const protocol_array_t) $3 = {
  list_array_tt<unsigned long, protocol_list_t, RawPtr> = {
     = {
      list = {
        ptr = 0x00000001000083b8
      }
      arrayAndFlag = 4295001016
    }
  }
}

Extract list

(lldb) p $3.list
(const RawPtr<protocol_list_t>) $4 = {
  ptr = 0x00000001000083b8
}

Extract ptr

(lldb) p $4.ptr
(protocol_list_t *const) $5 = 0x00000001000083b8

Dereference ptr to get the protocol_list_t

(lldb) p *$5
(protocol_list_t) $6 = (count = 1, list = protocol_ref_t [] @ 0x00007f9a62cb8778)

Here the list stores only a pointer-sized value, not a complete data structure. We find this definition:

typedef uintptr_t protocol_ref_t; // protocol_t *, but unremapped

Get the protocol_t *

(lldb) p (protocol_t *)$6.list[0]
(protocol_t *) $7 = 0x00000001000088a0

Read the data

(lldb) p *$7
(protocol_t) $8 = {
  objc_object = {
    isa = {
      bits = 4298547400
      cls = Protocol
       = {
        nonpointer = 0
        has_assoc = 0
        has_cxx_dtor = 0
        shiftcls = 537318425
        magic = 0
        weakly_referenced = 0
        unused = 0
        has_sidetable_rc = 0
        extra_rc = 0
      }
    }
  }
  mangledName = 0x0000000100003e71 "RYProtocol"
  protocols = 0x0000000100008310
  instanceMethods = 0x0000000100008328
  classMethods = nil
  optionalInstanceMethods = nil
  optionalClassMethods = nil
  instanceProperties = nil
  size = 96
  flags = 0
  _extendedMethodTypes = 0x0000000100008348
  _demangledName = 0x0000000000000000
  _classProperties = nil
}

7.4 Instance variables

We skip the steps that are the same as before.

class_rw_t -> ro() returns the class_ro_t that holds the ivar list

(lldb) p $4->ro()
(const class_ro_t *) $18 = 0x00000001000083d0

class_ro_t -> ivars returns the ivar list

(lldb) p $18->ivars
(const ivar_list_t *const) $19 = 0x0000000100008570

(lldb) p *$19
(const ivar_list_t) $20 = {
  entsize_list_tt<ivar_t, ivar_list_t, 0, PointerModifierNop> = (entsizeAndFlags = 32, count = 5)
}

Fetch the ivars from the list

(lldb) p $20.count
(const uint32_t) $21 = 5

(lldb) p $20.get(0)
(ivar_t) $22 = {
  offset = 0x0000000100008770
  name = 0x0000000100003de7 "_isBoy"
  type = 0x0000000100003f47 "c"
  alignment_raw = 0
  size = 1
}
(lldb) p $20.get(1)
(ivar_t) $23 = {
  offset = 0x0000000100008778
  name = 0x0000000100003dee "_height"
  type = 0x0000000100003f49 "f"
  alignment_raw = 2
  size = 4
}
(lldb) p $20.get(2)
(ivar_t) $24 = {
  offset = 0x0000000100008780
  name = 0x0000000100003df6 "_name"
  type = 0x0000000100003f4b "@\"NSString\""
  alignment_raw = 3
  size = 8
}
(lldb) p $20.get(3)
(ivar_t) $25 = {
  offset = 0x0000000100008788
  name = 0x0000000100003dfc "_age"
  type = 0x0000000100003f57 "q"
  alignment_raw = 3
  size = 8
}
(lldb) p $20.get(4)
(ivar_t) $26 = {
  offset = 0x0000000100008790
  name = 0x0000000100003e01 "_father"
  type = 0x0000000100003f59 "@\"RYModel\""
  alignment_raw = 3
  size = 8
}

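7.5 Class methods

Class methods have to be looked up in the metaclass.

Getting the metaclass: the first 8 bytes of the class are its ISA, masking that value gives the metaclass address, and bits again sits 0x20 bytes past it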

(lldb) p/x RYModel.class
(Class) $67 = 0x00000001000087c0 RYModel
(lldb) x/4gx 0x00000001000087c0
0x1000087c0: 0x0000000100008798 0x000000010036a140
0x1000087d0: 0x00000001010695c0 0x0004803400000007
(lldb) p/x 0x0000000100008798 & 0x00007ffffffffff8ULL
(unsigned long long) $68 = 0x0000000100008798
(lldb) p 0x0000000100008798 + 0x20
(long) $70 = 4295002040
(lldb) p/x 0x0000000100008798 + 0x20
(long) $71 = 0x00000001000087b8
(lldb) p (class_data_bits_t *)0x00000001000087b8
(class_data_bits_t *) $72 = 0x00000001000087b8

(lldb) p $72->data()
(class_rw_t *) $73 = 0x0000000101069510
(lldb) p $73->methods()
(const method_array_t) $74 = {
  list_array_tt<method_t, method_list_t, method_list_t_authed_ptr> = {
     = {
      list = {
        ptr = 0x0000000100008398
      }
      arrayAndFlag = 4295000984
    }
  }
}

Get the metaclass method list (the class methods)

(lldb) p $74.list
(const method_list_t_authed_ptr<method_list_t>) $75 = {
  ptr = 0x0000000100008398
}
(lldb) p $75.ptr
(method_list_t *const) $76 = 0x0000000100008398
(lldb) p *$76
(method_list_t) $77 = {
  entsize_list_tt<method_t, method_list_t, 4294901763, method_t::pointer_modifier> = (entsizeAndFlags = 27, count = 1)
}

Print the method information

(lldb) p $77.get(0).big()
(method_t::big) $78 = {
  name = "classDoSomething"
  types = 0x0000000100003f3f "v16@0:8"
  imp = 0x0000000100003920 (ObjcBuild`+[RYModel classDoSomething])
}
