05. A Deep Dive into the Structure of Class

1. Memory Offsets

Let's start with an example that walks through memory by pointer offset.

int nums[4] = {11, 12, 13, 14};

NSLog(@"Address of array nums: %p\n", &nums);

for (int i = 0; i < 4; i++) {
    NSLog(@"Element %d is %d, address %p", i + 1, nums[i], &nums[i]);
}

NSLog(@"\n");

int *numsPointer = nums;
NSLog(@"Pointer to array nums: %p\n", numsPointer);

for (int i = 0; i < 4; i++) {
    NSLog(@"Offset %d, value %d, address %p", i, *(numsPointer + i), (numsPointer + i));
}

Output:

2021-06-22 23:27:02.344397+0800 Class[15488:5517067] Address of array nums: 0x16f00d3a0

2021-06-22 23:27:02.344734+0800 Class[15488:5517067] Element 1 is 11, address 0x16f00d3a0
2021-06-22 23:27:02.344910+0800 Class[15488:5517067] Element 2 is 12, address 0x16f00d3a4
2021-06-22 23:27:02.345776+0800 Class[15488:5517067] Element 3 is 13, address 0x16f00d3a8
2021-06-22 23:27:02.346025+0800 Class[15488:5517067] Element 4 is 14, address 0x16f00d3ac
2021-06-22 23:27:02.346173+0800 Class[15488:5517067] 
2021-06-22 23:27:02.346328+0800 Class[15488:5517067] Pointer to array nums: 0x16f00d3a0

2021-06-22 23:27:02.347933+0800 Class[15488:5517067] Offset 0, value 11, address 0x16f00d3a0
2021-06-22 23:27:02.348115+0800 Class[15488:5517067] Offset 1, value 12, address 0x16f00d3a4 // +4 bytes
2021-06-22 23:27:02.348425+0800 Class[15488:5517067] Offset 2, value 13, address 0x16f00d3a8 // +4 bytes
2021-06-22 23:27:02.349788+0800 Class[15488:5517067] Offset 3, value 14, address 0x16f00d3ac // +4 bytes

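Each element of nums takes 4 bytes, and each pointer increment advances the address by 4 bytes. So every value can be read by offsetting from the array's first address.

Can we read the memory of a Class by offset in the same way?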

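2. Computing the Memory Layout of Class

In the Class source, the core data structure looks like this. We ignore its methods for now, because they do not affect the memory layout of Class.

Member functions are stored with the program's code, not inside each struct instance.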

struct objc_class : objc_object {
    ...

    // Class ISA; inherited from objc_object
    Class superclass;
    cache_t cache;             // formerly cache pointer and vtable
    class_data_bits_t bits;    // class_rw_t * plus custom rr/alloc flags

    ...
}

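| Field | ISA | superclass | cache | bits |
| --- | --- | --- | --- | --- |
| Description | ISA (struct pointer) | Superclass (struct pointer) | Method cache | Class details |
| Size (bytes) | 8 | 8 | 16 | 8 |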

3. objc_class: Memory Layout of cache_t

Ignoring the methods again, the remaining core data structure is:

struct cache_t {
private:
    explicit_atomic<uintptr_t> _bucketsAndMaybeMask;
    union {
        struct {
            explicit_atomic<mask_t>    _maybeMask;
#if __LP64__
            uint16_t                   _flags;
#endif
            uint16_t                   _occupied;
        };
        explicit_atomic<preopt_cache_t *> _originalPreoptCache;
    };
}

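3.1 Aside: the LP64 Data Model

The listing contains __LP64__, which stands for the LP64 data model. Current 64-bit Unix-like platforms, macOS and iOS included, generally use LP64.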

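| Type | LP32 | ILP32 | LP64 | ILP64 | LLP64 |
| --- | --- | --- | --- | --- | --- |
| Meaning | long and pointer are 32-bit | int, long and pointer are 32-bit | long and pointer are 64-bit | int, long and pointer are 64-bit | long long and pointer are 64-bit |
| char | 8 | 8 | 8 | 8 | 8 |
| short | 16 | 16 | 16 | 16 | 16 |
| int | 16 | 32 | 32 | 64 | 32 |
| long | 32 | 32 | 64 | 64 | 32 |
| long long | 64 | 64 | 64 | 64 | 64 |
| pointer | 32 | 32 | 64 | 64 | 64 |

Sizes are in bits.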

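3.2 Memory Layout of the Inner Union

The union has two members: a struct and a pointer. We know a pointer takes 8 bytes on arm64. Let's see how many bytes the struct needs.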

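| Field | _maybeMask | _flags | _occupied |
| --- | --- | --- | --- |
| Type | uint32_t | uint16_t | uint16_t |
| Size (bytes) | 4 | 2 | 2 |

The struct is 4 + 2 + 2 = 8 bytes, so the union is 8 bytes.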

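3.3 Layout Summary

| Field | _bucketsAndMaybeMask | union |
| --- | --- | --- |
| Description | Pointer (its exact purpose needs further study) | Union |
| Size (bytes) | 8 (long-sized) | 8 |

cache_t is 16 bytes in total.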

4. objc_class: Memory Layout of class_data_bits_t

The core data structure is:

struct class_data_bits_t {
  ...
  // Values are the FAST_ flags above.
  uintptr_t bits;
  ...
}

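Internally there is only one pointer-sized field, bits. How do we get from it to the details of the class?

The function below shows that ANDing bits with FAST_DATA_MASK yields a class_rw_t pointer, much like the masking we used to find the class from ISA. What class_rw_t is for is covered in the next section.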

class_rw_t* data() const {
    return (class_rw_t *)(bits & FAST_DATA_MASK);
}

Definition of FAST_DATA_MASK under LP64:

#if __LP64__
...
// data pointer
#define FAST_DATA_MASK          0x00007ffffffffff8UL

5. class_data_bits_t: class_rw_t

rw: read-write

5.1 Memory Layout of class_rw_t

struct class_rw_t {
  // Be warned that Symbolication knows the layout of this structure.
  // Flag bits, e.g. whether the class is a metaclass or has been realized
  uint32_t flags;
  uint16_t witness;

  explicit_atomic<uintptr_t> ro_or_rw_ext;

  Class firstSubclass;
  Class nextSiblingClass;

}

5.2 Some Important Functions

Here we see some familiar functions. Inside methods(), properties() and protocols() we find another important type, class_rw_ext_t.

class_rw_ext_t *deepCopy(const class_ro_t *ro) {
    return extAlloc(ro, true);
}

const method_array_t methods() const {
    auto v = get_ro_or_rwe();
    if (v.is<class_rw_ext_t *>()) {
        return v.get<class_rw_ext_t *>(&ro_or_rw_ext)->methods;
    } else {
        return method_array_t{v.get<const class_ro_t *>(&ro_or_rw_ext)->baseMethods()};
    }
}

const property_array_t properties() const {
    ...
}

const protocol_array_t protocols() const {
    ...
}

6. class_data_bits_t: class_rw_t: class_rw_ext_t

struct class_rw_ext_t {
    DECLARE_AUTHED_PTR_TEMPLATE(class_ro_t)
    class_ro_t_authed_ptr<const class_ro_t> ro; // holds the member variables (ivars)
    method_array_t methods;
    property_array_t properties;
    protocol_array_t protocols;
    char *demangledName;
    uint32_t version;
};

6.1 list_array_tt

method_array_t, property_array_t and protocol_array_t all inherit from list_array_tt. The template parameters specialize it for each element type.

Core part of the source:

template <typename Element, typename List, template<typename> class Ptr>
class list_array_tt {
  ...
  private:
    union {
        Ptr<List> list;
        uintptr_t arrayAndFlag;
    };
  ...
}

method_array_t

class method_array_t : 
    public list_array_tt<method_t, method_list_t, method_list_t_authed_ptr> {...}

property_array_t

class property_array_t : 
    public list_array_tt<property_t, property_list_t, RawPtr> {...}

protocol_array_t

class protocol_array_t : 
    public list_array_tt<protocol_ref_t, protocol_list_t, RawPtr> {...}

6.2 class_ro_t

Core part of the source:

struct class_ro_t {
  ...
  const ivar_list_t * ivars; // member variables (ivars)
  ...
}

7. Inspecting the Class Memory Layout with LLDB

[Figure: structure diagram]

Suppose we have this class:

@interface RYModel : NSObject

@property (nonatomic, copy) NSString *name;
@property (nonatomic, assign) NSInteger age;
@property (nonatomic, assign) float height;
@property (nonatomic, strong) RYModel *father;
@property (nonatomic, assign) BOOL isBoy;

- (void)dosomething;

- (void)dosomethingWith:(NSString *)title;

+ (void)classDoSomething;

- (NSString *)sayMyName;

@end

7.1 Properties

Get class_data_bits_t

We locate class_data_bits_t by memory offset. As computed earlier, bits sits at an offset of 8 + 8 + 16 = 32 bytes.

So in theory we can find it at the class address + 0x20.

(lldb) p/x RYModel.class
(Class) $0 = 0x00000001000083e0 RYModel

(lldb) x/4gx 0x00000001000083e0
0x1000083e0: 0x00000001000083b8 0x000000010036a140
0x1000083f0: 0x000000010073c2d0 0x0004803400000007

// compute the address of bits
(lldb) p/x 0x1000083e0 + 0x20 
(long) $1 = 0x0000000100008400

// cast to the right type to make the following steps easier
(lldb) p (class_data_bits_t *)0x0000000100008400 
(class_data_bits_t *) $2 = 0x0000000100008400

Get class_rw_t via data()

(lldb) p (class_data_bits_t *)0x0000000100008400
(class_data_bits_t *) $4 = 0x0000000100008400

(lldb) p $4->data()
(class_rw_t *) $5 = 0x000000010073c040

Get the property list container via properties()

(lldb) p $5->properties
(const property_array_t) $6 = {
  list_array_tt<property_t, property_list_t, RawPtr> = {
     = {
      list = {
        ptr = 0x0000000100008280
      }
      arrayAndFlag = 4295000704
    }
  }
}
  Fix-it applied, fixed expression was: 
    $5->properties()

Take out list

(lldb) p $6.list
(const RawPtr<property_list_t>) $7 = {
  ptr = 0x0000000100008280
}

Take out ptr

(lldb) p $7.ptr
(property_list_t *const) $8 = 0x0000000100008280

Dereference ptr

(lldb) p *$8
(property_list_t) $9 = {
  entsize_list_tt<property_t, property_list_t, 0, PointerModifierNop> = (entsizeAndFlags = 16, count = 5)
}

Fetch each property by index

(lldb) p $9.get(0)
(property_t) $10 = (name = "name", attributes = "T@\"NSString\",C,N,V_name")
(lldb) p $9.get(1)
(property_t) $11 = (name = "age", attributes = "Tq,N,V_age")
(lldb) p $9.get(2)
(property_t) $12 = (name = "height", attributes = "Tf,N,V_height")
(lldb) p $9.get(3)
(property_t) $13 = (name = "father", attributes = "T@\"RYModel\",&,N,V_father")
(lldb) p $9.get(4)
(property_t) $14 = (name = "isBoy", attributes = "Tc,N,V_isBoy")

7.2 Instance Method List

We skip the steps that repeat the ones above.

Get the method list container from class_rw_t -> methods()

(lldb) p $5->methods
(const method_array_t) $6 = {
  list_array_tt<method_t, method_list_t, method_list_t_authed_ptr> = {
     = {
      list = {
        ptr = 0x00000001000080e8
      }
      arrayAndFlag = 4295000296
    }
  }
}
  Fix-it applied, fixed expression was: 
    $5->methods()

Take out list

(lldb) p $6.list
(const method_list_t_authed_ptr<method_list_t>) $14 = {
  ptr = 0x00000001000080e8
}

Take out ptr

(lldb) p $14.ptr
(method_list_t *const) $15 = 0x00000001000080e8

Dereference ptr to get the method_list_t

(lldb) p *$15
(method_list_t) $16 = {
  entsize_list_tt<method_t, method_list_t, 4294901763, method_t::pointer_modifier> = (entsizeAndFlags = 27, count = 14)
}

Get the method count

(lldb) p $16.count
(uint32_t) $29 = 14

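Use the C++ functions get() and big() to read the class's instance methods one at a time.
Besides the instance methods we defined ourselves, we also see the setters and getters generated for the properties, and the destructor (.cxx_destruct).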

(lldb) p $16.get(0).big()
(method_t::big) $19 = {
  name = "setFather:"
  types = 0x0000000100003f60 "v24@0:8@16"
  imp = 0x0000000100003cc0 (ObjcBuild`-[RYModel setFather:])
}
(lldb) p $16.get(1).big()
(method_t::big) $20 = {
  name = "father"
  types = 0x0000000100003f6b "@16@0:8"
  imp = 0x0000000100003ca0 (ObjcBuild`-[RYModel father])
}

... // some setters and getters omitted

(lldb) p $16.get(3).big()
(method_t::big) $22 = {
  name = "dosomething"
  types = 0x0000000100003f3b "v16@0:8"
  imp = 0x0000000100003b30 (ObjcBuild`-[RYModel dosomething])
}
(lldb) p $16.get(4).big()
(method_t::big) $23 = {
  name = "dosomethingWith:"
  types = 0x0000000100003f60 "v24@0:8@16"
  imp = 0x0000000100003b40 (ObjcBuild`-[RYModel dosomethingWith:])
}

...

(lldb) p $16.get(8).big()
(method_t::big) $27 = {
  name = ".cxx_destruct"
  types = 0x0000000100003f3b "v16@0:8"
  imp = 0x0000000100003d30 (ObjcBuild`-[RYModel .cxx_destruct])
}

... // some setters and getters omitted

7.3 Protocol List

We skip the steps that repeat the ones above.

Get the protocol list container from class_rw_t -> protocols()

(lldb) p $2->protocols()
(const protocol_array_t) $3 = {
  list_array_tt<unsigned long, protocol_list_t, RawPtr> = {
     = {
      list = {
        ptr = 0x00000001000083b8
      }
      arrayAndFlag = 4295001016
    }
  }
}

Take out list

(lldb) p $3.list
(const RawPtr<protocol_list_t>) $4 = {
  ptr = 0x00000001000083b8
}

Take out ptr

(lldb) p $4.ptr
(protocol_list_t *const) $5 = 0x00000001000083b8

Dereference ptr to get the protocol_list_t

(lldb) p *$5
(protocol_list_t) $6 = (count = 1, list = protocol_ref_t [] @ 0x00007f9a62cb8778)

Here the list stores a pointer-sized value rather than a complete data structure. We find this definition:

typedef uintptr_t protocol_ref_t; // protocol_t *, but unremapped

Get the protocol_t *

(lldb) p (protocol_t *)$6.list[0]
(protocol_t *) $7 = 0x00000001000088a0

Read the data

(lldb) p *$7
(protocol_t) $8 = {
  objc_object = {
    isa = {
      bits = 4298547400
      cls = Protocol
       = {
        nonpointer = 0
        has_assoc = 0
        has_cxx_dtor = 0
        shiftcls = 537318425
        magic = 0
        weakly_referenced = 0
        unused = 0
        has_sidetable_rc = 0
        extra_rc = 0
      }
    }
  }
  mangledName = 0x0000000100003e71 "RYProtocol"
  protocols = 0x0000000100008310
  instanceMethods = 0x0000000100008328
  classMethods = nil
  optionalInstanceMethods = nil
  optionalClassMethods = nil
  instanceProperties = nil
  size = 96
  flags = 0
  _extendedMethodTypes = 0x0000000100008348
  _demangledName = 0x0000000000000000
  _classProperties = nil
}

7.4 Member Variables

We skip the steps that repeat the ones above.

Get class_ro_t, which holds the ivar list, from class_rw_t -> ro()

(lldb) p $4->ro()
(const class_ro_t *) $18 = 0x00000001000083d0

Get the ivar list from class_ro_t -> ivars

(lldb) p $18->ivars
(const ivar_list_t *const) $19 = 0x0000000100008570

(lldb) p *$19
(const ivar_list_t) $20 = {
  entsize_list_tt<ivar_t, ivar_list_t, 0, PointerModifierNop> = (entsizeAndFlags = 32, count = 5)
}

Read the ivars from the ivar list

(lldb) p $20.count
(const uint32_t) $21 = 5

(lldb) p $20.get(0)
(ivar_t) $22 = {
  offset = 0x0000000100008770
  name = 0x0000000100003de7 "_isBoy"
  type = 0x0000000100003f47 "c"
  alignment_raw = 0
  size = 1
}
(lldb) p $20.get(1)
(ivar_t) $23 = {
  offset = 0x0000000100008778
  name = 0x0000000100003dee "_height"
  type = 0x0000000100003f49 "f"
  alignment_raw = 2
  size = 4
}
(lldb) p $20.get(2)
(ivar_t) $24 = {
  offset = 0x0000000100008780
  name = 0x0000000100003df6 "_name"
  type = 0x0000000100003f4b "@\"NSString\""
  alignment_raw = 3
  size = 8
}
(lldb) p $20.get(3)
(ivar_t) $25 = {
  offset = 0x0000000100008788
  name = 0x0000000100003dfc "_age"
  type = 0x0000000100003f57 "q"
  alignment_raw = 3
  size = 8
}
(lldb) p $20.get(4)
(ivar_t) $26 = {
  offset = 0x0000000100008790
  name = 0x0000000100003e01 "_father"
  type = 0x0000000100003f59 "@\"RYModel\""
  alignment_raw = 3
  size = 8
}

7.5 Class Methods

Class methods have to be looked up in the metaclass.

Get the metaclass information

(lldb) p/x RYModel.class
(Class) $67 = 0x00000001000087c0 RYModel
(lldb) x/4gx 0x00000001000087c0
0x1000087c0: 0x0000000100008798 0x000000010036a140
0x1000087d0: 0x00000001010695c0 0x0004803400000007
(lldb) p/x 0x0000000100008798 & 0x00007ffffffffff8ULL
(unsigned long long) $68 = 0x0000000100008798
(lldb) p 0x0000000100008798 + 0x20
(long) $70 = 4295002040
(lldb) p/x 0x0000000100008798 + 0x20
(long) $71 = 0x00000001000087b8
(lldb) p (class_data_bits_t *)0x00000001000087b8
(class_data_bits_t *) $72 = 0x00000001000087b8

(lldb) p $72->data()
(class_rw_t *) $73 = 0x0000000101069510
(lldb) p $73->methods()
(const method_array_t) $74 = {
  list_array_tt<method_t, method_list_t, method_list_t_authed_ptr> = {
     = {
      list = {
        ptr = 0x0000000100008398
      }
      arrayAndFlag = 4295000984
    }
  }
}

Get the metaclass's method list (the class methods)

(lldb) p $74.list
(const method_list_t_authed_ptr<method_list_t>) $75 = {
  ptr = 0x0000000100008398
}
(lldb) p $75.ptr
(method_list_t *const) $76 = 0x0000000100008398
(lldb) p *$76
(method_list_t) $77 = {
  entsize_list_tt<method_t, method_list_t, 4294901763, method_t::pointer_modifier> = (entsizeAndFlags = 27, count = 1)
}

Print the method information

(lldb) p $77.get(0).big()
(method_t::big) $78 = {
  name = "classDoSomething"
  types = 0x0000000100003f3f "v16@0:8"
  imp = 0x0000000100003920 (ObjcBuild`+[RYModel classDoSomething])
}
