# 05. Exploring the Structure of Class in Depth

## 1. Memory offsetting

Let's start with an example that shows how `memory offsetting` works.

```cpp
int nums[4] = {11, 12, 13, 14};

NSLog(@"Address of array nums: %p\n", &nums);

for (int i = 0; i < 4; i++) {
    NSLog(@"Element %d is %d, address %p", i + 1, nums[i], &nums[i]);
}

NSLog(@"\n");

int *numsPointer = nums;
NSLog(@"Pointer to array nums: %p\n", numsPointer);

for (int i = 0; i < 4; i++) {
    NSLog(@"Offset %d, value %d, address %p", i, *(numsPointer + i), (numsPointer + i));
}
```

Output:

```cpp
2021-06-22 23:27:02.344397+0800 Class[15488:5517067] Address of array nums: 0x16f00d3a0

2021-06-22 23:27:02.344734+0800 Class[15488:5517067] Element 1 is 11, address 0x16f00d3a0
2021-06-22 23:27:02.344910+0800 Class[15488:5517067] Element 2 is 12, address 0x16f00d3a4
2021-06-22 23:27:02.345776+0800 Class[15488:5517067] Element 3 is 13, address 0x16f00d3a8
2021-06-22 23:27:02.346025+0800 Class[15488:5517067] Element 4 is 14, address 0x16f00d3ac
2021-06-22 23:27:02.346173+0800 Class[15488:5517067] 
2021-06-22 23:27:02.346328+0800 Class[15488:5517067] Pointer to array nums: 0x16f00d3a0

2021-06-22 23:27:02.347933+0800 Class[15488:5517067] Offset 0, value 11, address 0x16f00d3a0
2021-06-22 23:27:02.348115+0800 Class[15488:5517067] Offset 1, value 12, address 0x16f00d3a4 // 4 bytes
2021-06-22 23:27:02.348425+0800 Class[15488:5517067] Offset 2, value 13, address 0x16f00d3a8 // 4 bytes
2021-06-22 23:27:02.349788+0800 Class[15488:5517067] Offset 3, value 14, address 0x16f00d3ac // 4 bytes
```

![Memory offsetting](https://4193904735-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MI8JgbGh3U6X_oedqkm%2Fsync%2Fabcb62d4f212999f835e420cb0e9995755058c79.png?generation=1624445641813171\&alt=media)

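Each element of `nums` is 4 bytes, so every increment of the pointer moves the address forward by 4 bytes. We can therefore read any element by offsetting from the array's first address.

Can we read the contents of a `Class` the same way, by applying `memory offsetting` to its memory layout?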

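## 2. Computing the memory layout of Class

In the objc source, the core data members of `Class` look like this. We ignore the member functions for now because they do not affect the memory layout of `Class`.

> Member functions are stored with the program's code rather than inside each instance, so they take no space in the struct.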

```cpp
struct objc_class : objc_object {
    ...

    // Class ISA; inherited from objc_object
    Class superclass;
    cache_t cache;             // formerly cache pointer and vtable
    class_data_bits_t bits;    // class_rw_t * plus custom rr/alloc flags

    ...
}
```

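|       -      |          ISA          |          superclass          |     cache    |        bits        |
| :----------: | :-------------------: | :--------------------------: | :----------: | :----------------: |
|  Description | ISA (pointer to struct) | superclass (pointer to struct) | method cache | detailed class data |
| Size (bytes) |           8           |               8              |      16      |          8         |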
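
The sizes in the table can be checked with a stand-in struct. This is only a sketch, not the runtime's definition: `MockClass`, `MockCache`, and `MockBits` are hypothetical placeholders that reproduce the member sizes, and an LP64 target (8-byte pointers) is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins: only the sizes match the table (LP64 assumed).
struct MockCache { std::uint64_t words[2]; };  // 16 bytes
struct MockBits  { std::uint64_t bits; };      // 8 bytes

struct MockClass {
    void     *isa;         // 8 bytes, offset 0
    void     *superclass;  // 8 bytes, offset 8
    MockCache cache;       // 16 bytes, offset 16
    MockBits  bits;        // 8 bytes, offset 32
};

// Offset of `bits` from the start of the class: 8 + 8 + 16 = 32 = 0x20.
constexpr std::size_t bitsOffset() { return offsetof(MockClass, bits); }
```

On a 64-bit target `bitsOffset()` evaluates to 32, the same figure obtained by adding the sizes of the first three columns.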

## 3. objc\_class: memory layout of cache\_t

Leaving out the member functions, the core data members are:

```cpp
struct cache_t {
private:
    explicit_atomic<uintptr_t> _bucketsAndMaybeMask;
    union {
        struct {
            explicit_atomic<mask_t>    _maybeMask;
#if __LP64__
            uint16_t                   _flags;
#endif
            uint16_t                   _occupied;
        };
        explicit_atomic<preopt_cache_t *> _originalPreoptCache;
    };
}
```

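### 3.1 A short aside: the LP64 data model

Inside the struct we see `__LP64__`, which indicates the `LP64 data model`. Virtually all 64-bit Unix-like platforms today use LP64 (64-bit Windows uses LLP64 instead).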

| Type (width in bits) |             LP32            |               ILP32              |             LP64            |               ILP64              |               LLP64              |
| :------------------: | :-------------------------: | :------------------------------: | :-------------------------: | :------------------------------: | :------------------------------: |
|        Meaning       | long and pointer are 32-bit | int, long and pointer are 32-bit | long and pointer are 64-bit | int, long and pointer are 64-bit | long long and pointer are 64-bit |
|         CHAR         |              8              |                 8                |              8              |                 8                |                 8                |
|         SHORT        |              16             |                16                |              16             |                16                |                16                |
|          INT         |              16             |                32                |              32             |                64                |                32                |
|         LONG         |              32             |                32                |              64             |                64                |                32                |
|       LONG LONG      |              64             |                64                |              64             |                64                |                64                |
|        POINTER       |              32             |                32                |              64             |                64                |                64                |
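
To see which column a given toolchain falls into, the widths can be queried directly. A minimal sketch; the helper names `bitsOf` and `isLP64` are made up for this example.

```cpp
#include <cassert>
#include <climits>

// Width in bits of a type on the current target.
template <typename T>
constexpr int bitsOf() { return static_cast<int>(sizeof(T) * CHAR_BIT); }

// True when the target matches the LP64 column:
// int is 32 bits, long and pointers are 64 bits.
constexpr bool isLP64() {
    return bitsOf<int>() == 32 && bitsOf<long>() == 64 && bitsOf<void *>() == 64;
}
```

Compilers that target LP64 platforms normally predefine the `__LP64__` macro, which is what the `#if __LP64__` line in `cache_t` tests.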

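### 3.2 Memory layout of the inner union

The union has two members: a struct and a pointer. We already know that a pointer takes 8 bytes on arm64, so let's see how much the struct needs.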

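|       -      | \_maybeMask |  \_flags  | \_occupied |
| :----------: | :---------: | :-------: | :--------: |
|     Type     |  uint32\_t  | uint16\_t |  uint16\_t |
| Size (bytes) |      4      |     2     |      2     |

> The struct is 4 + 2 + 2 = 8 bytes and the pointer is 8 bytes, so the union is 8 bytes.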

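### 3.3 Layout summary

|       -      |                 \_bucketsAndMaybeMask                 | union |
| :----------: | :---------------------------------------------------: | :---: |
|  Description | pointer-sized value (its purpose still needs investigation) | union |
| Size (bytes) |                    8 (`uintptr_t`)                    |   8   |

> `cache_t` is 16 bytes in total.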
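
The 16-byte total can be reproduced with a mirror of the data members. This is a sketch under assumptions: `MockCacheT` is a hypothetical copy with the atomics and member functions dropped, the inner struct and union are given names (`fields`, `u`) to stay within standard C++, and an LP64 target is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical mirror of cache_t's data members (LP64 assumed).
struct MockCacheT {
    std::uintptr_t bucketsAndMaybeMask;   // 8 bytes
    union {
        struct {
            std::uint32_t maybeMask;      // 4 bytes
            std::uint16_t flags;          // 2 bytes
            std::uint16_t occupied;       // 2 bytes
        } fields;                         // 4 + 2 + 2 = 8 bytes
        void *originalPreoptCache;        // 8 bytes
    } u;                                  // size of the largest member: 8 bytes
};

constexpr std::size_t cacheSize() { return sizeof(MockCacheT); }
```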

## 4. objc\_class: memory layout of class\_data\_bits\_t

The core data members are:

```cpp
struct class_data_bits_t {
  ...
  // Values are the FAST_ flags above.
  uintptr_t bits;
  ...
}
```

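It holds a single pointer-sized member, `bits`. So how do we get from it to the detailed class information?

The source has a function that ANDs `bits` with `FAST_DATA_MASK` to obtain a `class_rw_t` pointer, much like the masking we used to find the class from ISA.

So what does it do?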

```cpp
class_rw_t* data() const {
    return (class_rw_t *)(bits & FAST_DATA_MASK);
}
```

> Definition of FAST\_DATA\_MASK under `LP64`

```cpp
#if __LP64__
...
// data pointer
#define FAST_DATA_MASK          0x00007ffffffffff8UL
```

## 5. class\_data\_bits\_t : class\_rw\_t

> `rw`: read-write

### 5.1 Memory layout of class\_rw\_t

```cpp
struct class_rw_t {
  // Be warned that Symbolication knows the layout of this structure.
  // Flag bits, e.g. whether this is a metaclass, whether the class has been realized
  uint32_t flags;
  uint16_t witness;

  explicit_atomic<uintptr_t> ro_or_rw_ext;

  Class firstSubclass;
  Class nextSiblingClass;

}
```

### 5.2 Some important functions

Some of these functions look familiar. Inside `methods()`, `properties()`, and `protocols()` we find another important type, `class_rw_ext_t`.

```cpp
class_rw_ext_t *deepCopy(const class_ro_t *ro) {
    return extAlloc(ro, true);
}

const method_array_t methods() const {
    auto v = get_ro_or_rwe();
    if (v.is<class_rw_ext_t *>()) {
        return v.get<class_rw_ext_t *>(&ro_or_rw_ext)->methods;
    } else {
        return method_array_t{v.get<const class_ro_t *>(&ro_or_rw_ext)->baseMethods()};
    }
}

const property_array_t properties() const {
    ...
}

const protocol_array_t protocols() const {
    ...
}
```
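
The `is<...>()` / `get<...>()` calls in `methods()` suggest that `ro_or_rw_ext` is one word that can hold either of two pointer types. A simplified sketch of that idea follows. Everything in it is an assumption made for illustration: `RoData`, `RweData`, and `RoOrRwe` are hypothetical names, and using bit 0 as the tag is a guess at the general technique, not a statement about the runtime's actual encoding.

```cpp
#include <cassert>
#include <cstdint>

struct RoData  { int baseMethodCount; };  // hypothetical stand-in for class_ro_t
struct RweData { int methodCount; };      // hypothetical stand-in for class_rw_ext_t

// One pointer-sized word holding either a RoData* or a RweData*.
// Assumption for this sketch: bit 0 is the tag (0 = RoData, 1 = RweData).
class RoOrRwe {
    std::uintptr_t word_;
public:
    explicit RoOrRwe(const RoData *ro)
        : word_(reinterpret_cast<std::uintptr_t>(ro)) {}
    explicit RoOrRwe(RweData *rwe)
        : word_(reinterpret_cast<std::uintptr_t>(rwe) | 1u) {}

    bool isRwe() const { return (word_ & 1u) != 0; }
    const RoData *ro() const { return reinterpret_cast<const RoData *>(word_); }
    RweData *rwe() const {
        return reinterpret_cast<RweData *>(word_ & ~std::uintptr_t{1});
    }

    // Same branch shape as methods(): use the rw-ext data if present,
    // otherwise fall back to the read-only data.
    int methodCount() const {
        return isRwe() ? rwe()->methodCount : ro()->baseMethodCount;
    }
};
```

The tag bit is free because both structs are at least 4-byte aligned, so a valid pointer to either always has bit 0 clear.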

## 6. class\_data\_bits\_t : class\_rw\_t : class\_rw\_ext\_t

```cpp
struct class_rw_ext_t {
    DECLARE_AUTHED_PTR_TEMPLATE(class_ro_t)
    class_ro_t_authed_ptr<const class_ro_t> ro; // holds the member variables (ivars)
    method_array_t methods;
    property_array_t properties;
    protocol_array_t protocols;
    char *demangledName;
    uint32_t version;
};
```

### 6.1 list\_array\_tt

`method_array_t`, `property_array_t`, and `protocol_array_t` all inherit from `list_array_tt`, which uses template parameters to produce the different definitions.

Core of the source:

```cpp
template <typename Element, typename List, template<typename> class Ptr>
class list_array_tt {
  ...
  private:
    union {
        Ptr<List> list;
        uintptr_t arrayAndFlag;
    };
  ...
}
```

1. method\_array\_t

```cpp
class method_array_t : 
    public list_array_tt<method_t, method_list_t, method_list_t_authed_ptr> {...}
```

2. property\_array\_t

```cpp
class property_array_t : 
    public list_array_tt<property_t, property_list_t, RawPtr> {...}
```

3. protocol\_array\_t

```cpp
class protocol_array_t : 
    public list_array_tt<protocol_ref_t, protocol_list_t, RawPtr> {...}
```
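
The union in `list_array_tt` lets the same word be read as a list pointer or as `arrayAndFlag`. The sketch below models only the integer view. `ListOrArray` is a hypothetical name, and treating bit 0 as the "this is an array of lists" flag is an assumption drawn from the member's name.

```cpp
#include <cassert>
#include <cstdint>

// Simplified model of the word shared by `list` and `arrayAndFlag`.
// Assumption: bit 0 set means "array of lists", clear means a single list.
struct ListOrArray {
    std::uint64_t arrayAndFlag;

    bool hasArray() const { return (arrayAndFlag & 1u) != 0; }
    std::uint64_t pointerBits() const { return arrayAndFlag & ~std::uint64_t{1}; }
};
```

In the LLDB session in section 7, `arrayAndFlag = 4295000704` is printed next to `ptr = 0x0000000100008280`. They are the same value in decimal and hexadecimal, with bit 0 clear.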

### 6.2 class\_ro\_t

Core of the source:

```cpp
struct class_ro_t {
  ...
  const ivar_list_t * ivars; // instance variables (ivars)
  ...
}
```

## 7. Inspecting the Class memory layout with LLDB

### Structure diagram

![Memory layout of Class](https://4193904735-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MI8JgbGh3U6X_oedqkm%2Fsync%2Fa71a50fedb46cbb137c62d39f25b164d0721fdaf.png?generation=1624711149547329\&alt=media)

Consider the following class:

```cpp
@interface RYModel : NSObject

@property (nonatomic, copy) NSString *name;
@property (nonatomic, assign) NSInteger age;
@property (nonatomic, assign) float height;
@property (nonatomic, strong) RYModel *father;
@property (nonatomic, assign) BOOL isBoy;

- (void)dosomething;

- (void)dosomethingWith:(NSString *)title;

+ (void)classDoSomething;

- (NSString *)sayMyName;

@end
```

### 7.1 Properties

> Get `class_data_bits_t`

We locate `class_data_bits_t` by memory offsetting. As computed earlier, `bits` sits at an offset of 8 + 8 + 16 = 32 bytes (`0x20`).

So in theory its location is `class address + 0x20`.

```cpp
(lldb) p/x RYModel.class
(Class) $0 = 0x00000001000083e0 RYModel

(lldb) x/4gx 0x00000001000083e0
0x1000083e0: 0x00000001000083b8 0x000000010036a140
0x1000083f0: 0x000000010073c2d0 0x0004803400000007

// Compute the address of bits
(lldb) p/x 0x1000083e0 + 0x20 
(long) $1 = 0x0000000100008400

// Cast to the right type so the following steps are easier
(lldb) p (class_data_bits_t *)0x0000000100008400 
(class_data_bits_t *) $2 = 0x0000000100008400
```
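
The same "base address plus byte offset" read can be imitated outside the debugger. This is a sketch on a hypothetical `MockClass` whose members only match the sizes discussed earlier (LP64 assumed); `readWordAt` is an illustrative helper, not a runtime function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in with the same member sizes as objc_class on LP64.
struct MockClass {
    void         *isa;
    void         *superclass;
    std::uint64_t cache[2];
    std::uint64_t bits;
};

// Read the 8-byte word located `byteOffset` bytes after `base`,
// similar in spirit to `x/gx base+offset` in LLDB.
inline std::uint64_t readWordAt(const void *base, std::size_t byteOffset) {
    std::uint64_t word;
    std::memcpy(&word, static_cast<const unsigned char *>(base) + byteOffset,
                sizeof word);
    return word;
}
```

Calling `readWordAt(&cls, 0x20)` on a `MockClass cls` returns `cls.bits`, just as `0x1000083e0 + 0x20 = 0x100008400` points at `bits` in the session above.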

> Get `class_rw_t` through the `data()` function

```cpp
(lldb) p (class_data_bits_t *)0x0000000100008400
(class_data_bits_t *) $4 = 0x0000000100008400

(lldb) p $4->data()
(class_rw_t *) $5 = 0x000000010073c040
```

> Get the `property list container` through `properties`

```cpp
(lldb) p $5->properties
(const property_array_t) $6 = {
  list_array_tt<property_t, property_list_t, RawPtr> = {
     = {
      list = {
        ptr = 0x0000000100008280
      }
      arrayAndFlag = 4295000704
    }
  }
}
  Fix-it applied, fixed expression was: 
    $5->properties()
```

> Take out `list`

```cpp
(lldb) p $6.list
(const RawPtr<property_list_t>) $7 = {
  ptr = 0x0000000100008280
}
```

> Take out `ptr`

```cpp
(lldb) p $7.ptr
(property_list_t *const) $8 = 0x0000000100008280
```

> Dereference `ptr`

```cpp
(lldb) p *$8
(property_list_t) $9 = {
  entsize_list_tt<property_t, property_list_t, 0, PointerModifierNop> = (entsizeAndFlags = 16, count = 5)
}
```

> Fetch each `property` by index

```cpp
(lldb) p $9.get(0)
(property_t) $10 = (name = "name", attributes = "T@\"NSString\",C,N,V_name")
(lldb) p $9.get(1)
(property_t) $11 = (name = "age", attributes = "Tq,N,V_age")
(lldb) p $9.get(2)
(property_t) $12 = (name = "height", attributes = "Tf,N,V_height")
(lldb) p $9.get(3)
(property_t) $13 = (name = "father", attributes = "T@\"RYModel\",&,N,V_father")
(lldb) p $9.get(4)
(property_t) $14 = (name = "isBoy", attributes = "Tc,N,V_isBoy")
```
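
The `attributes` strings above are comma-separated fields: `T` introduces the type encoding, `V` the backing ivar name, and single letters such as `C` (copy), `&` (retain/strong), and `N` (nonatomic) are flags. A minimal decoder is sketched below; `PropertyInfo` and `parseAttributes` are hypothetical names, and the sketch does not handle type encodings that contain commas themselves.

```cpp
#include <cassert>
#include <sstream>
#include <string>

struct PropertyInfo {
    std::string type;        // text after 'T', e.g. @"NSString" or q
    std::string ivar;        // text after 'V', e.g. _name
    bool copy = false;       // 'C'
    bool retain = false;     // '&'
    bool nonatomic = false;  // 'N'
};

// Split on commas and dispatch on the first character of each field.
inline PropertyInfo parseAttributes(const std::string &attrs) {
    PropertyInfo info;
    std::stringstream stream(attrs);
    std::string field;
    while (std::getline(stream, field, ',')) {
        if (field.empty()) continue;
        switch (field[0]) {
            case 'T': info.type = field.substr(1); break;
            case 'V': info.ivar = field.substr(1); break;
            case 'C': info.copy = true; break;
            case '&': info.retain = true; break;
            case 'N': info.nonatomic = true; break;
            default: break;  // other attribute codes are ignored in this sketch
        }
    }
    return info;
}
```

Fed the string for `name`, the decoder reports type `@"NSString"`, ivar `_name`, and the copy and nonatomic flags, which matches the `@property (nonatomic, copy)` declaration.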

### 7.2 Instance method list

We skip the steps that repeat the previous section.

> Get the `method list container` from `class_rw_t -> methods`

```cpp
(lldb) p $5->methods
(const method_array_t) $6 = {
  list_array_tt<method_t, method_list_t, method_list_t_authed_ptr> = {
     = {
      list = {
        ptr = 0x00000001000080e8
      }
      arrayAndFlag = 4295000296
    }
  }
}
  Fix-it applied, fixed expression was: 
    $5->methods()
```

> Take out `list`

```cpp
(lldb) p $6.list
(const method_list_t_authed_ptr<method_list_t>) $14 = {
  ptr = 0x00000001000080e8
}
```

> Take out `ptr`

```cpp
(lldb) p $14.ptr
(method_list_t *const) $15 = 0x00000001000080e8
```

> Dereference `ptr` to get the `method_list_t`

```cpp
(lldb) p *$15
(method_list_t) $16 = {
  entsize_list_tt<method_t, method_list_t, 4294901763, method_t::pointer_modifier> = (entsizeAndFlags = 27, count = 14)
}
```
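
The printed `entsizeAndFlags = 27` packs two things into one number. The sketch below assumes that the third template argument shown by LLDB, `4294901763` (`0xFFFF0003`), is the mask that separates flag bits from the entry size; the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Third template argument printed for method_list_t: 4294901763 == 0xFFFF0003.
// Assumption: it marks which bits of entsizeAndFlags are flags.
constexpr std::uint32_t kMethodFlagMask = 0xFFFF0003u;

// Size in bytes of one list entry.
constexpr std::uint32_t entsize(std::uint32_t entsizeAndFlags, std::uint32_t flagMask) {
    return entsizeAndFlags & ~flagMask;
}

// Flag bits stored alongside the entry size.
constexpr std::uint32_t flagBits(std::uint32_t entsizeAndFlags, std::uint32_t flagMask) {
    return entsizeAndFlags & flagMask;
}
```

Under that assumption 27 splits into an entry size of 24 bytes and flag bits 3. Three 8-byte fields would account for the 24 bytes. For the property list, where the mask argument is 0, `entsizeAndFlags = 16` is simply the entry size.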

> Get the number of methods

```cpp
(lldb) p $16.count
(uint32_t) $29 = 14
```

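> Use the C++ functions `get()` and `big()` to fetch the class's instance methods one at a time
>
> > Besides the instance methods we declared ourselves, the list also contains the setters and getters generated for the properties, as well as the destructor (`.cxx_destruct`)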

```cpp
(lldb) p $16.get(0).big()
(method_t::big) $19 = {
  name = "setFather:"
  types = 0x0000000100003f60 "v24@0:8@16"
  imp = 0x0000000100003cc0 (ObjcBuild`-[RYModel setFather:])
}
(lldb) p $16.get(1).big()
(method_t::big) $20 = {
  name = "father"
  types = 0x0000000100003f6b "@16@0:8"
  imp = 0x0000000100003ca0 (ObjcBuild`-[RYModel father])
}

... // some setters and getters omitted

(lldb) p $16.get(3).big()
(method_t::big) $22 = {
  name = "dosomething"
  types = 0x0000000100003f3b "v16@0:8"
  imp = 0x0000000100003b30 (ObjcBuild`-[RYModel dosomething])
}
(lldb) p $16.get(4).big()
(method_t::big) $23 = {
  name = "dosomethingWith:"
  types = 0x0000000100003f60 "v24@0:8@16"
  imp = 0x0000000100003b40 (ObjcBuild`-[RYModel dosomethingWith:])
}

...

(lldb) p $16.get(8).big()
(method_t::big) $27 = {
  name = ".cxx_destruct"
  types = 0x0000000100003f3b "v16@0:8"
  imp = 0x0000000100003d30 (ObjcBuild`-[RYModel .cxx_destruct])
}

... // some setters and getters omitted
```
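
The `types` strings are Objective-C type encodings. In `v24@0:8@16`, `v` is the return type (void) and 24 the total size of the arguments in bytes, followed by each argument's type and byte offset: `@` (self) at 0, `:` (\_cmd) at 8, `@` (the object argument) at 16. A small splitter is sketched below; `Slot` and `parseTypes` are hypothetical names, and the sketch assumes every type is a single character, which holds for the strings shown here but not for structs or quoted class names.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// One type code and the number that follows it.
struct Slot { char type; int number; };

// Split an encoding such as "v24@0:8@16" into (type, number) pairs.
// The first pair is (return type, total argument size); the remaining
// pairs are (argument type, byte offset).
inline std::vector<Slot> parseTypes(const std::string &encoding) {
    std::vector<Slot> slots;
    std::size_t i = 0;
    while (i < encoding.size()) {
        char type = encoding[i++];
        int number = 0;
        while (i < encoding.size() &&
               std::isdigit(static_cast<unsigned char>(encoding[i]))) {
            number = number * 10 + (encoding[i++] - '0');
        }
        slots.push_back({type, number});
    }
    return slots;
}
```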

### 7.3 Protocol list

We skip the steps that repeat the previous sections.

> Get the `protocol list container` from `class_rw_t -> protocols`

```cpp
(lldb) p $2->protocols()
(const protocol_array_t) $3 = {
  list_array_tt<unsigned long, protocol_list_t, RawPtr> = {
     = {
      list = {
        ptr = 0x00000001000083b8
      }
      arrayAndFlag = 4295001016
    }
  }
}
```

> Take out `list`

```cpp
(lldb) p $3.list
(const RawPtr<protocol_list_t>) $4 = {
  ptr = 0x00000001000083b8
}
```

> Take out `ptr`

```cpp
(lldb) p $4.ptr
(protocol_list_t *const) $5 = 0x00000001000083b8
```

> Dereference `ptr` to get the `protocol_list_t`

```cpp
(lldb) p *$5
(protocol_list_t) $6 = (count = 1, list = protocol_ref_t [] @ 0x00007f9a62cb8778)
```

Here we can see that `list` stores pointer-sized values rather than complete data structures. The source contains this definition:

`typedef uintptr_t protocol_ref_t; // protocol_t *, but unremapped`

> Get the `protocol_t *`

```cpp
(lldb) p (protocol_t *)$6.list[0]
(protocol_t *) $7 = 0x00000001000088a0
```

> Read the data

```cpp
(lldb) p *$7
(protocol_t) $8 = {
  objc_object = {
    isa = {
      bits = 4298547400
      cls = Protocol
       = {
        nonpointer = 0
        has_assoc = 0
        has_cxx_dtor = 0
        shiftcls = 537318425
        magic = 0
        weakly_referenced = 0
        unused = 0
        has_sidetable_rc = 0
        extra_rc = 0
      }
    }
  }
  mangledName = 0x0000000100003e71 "RYProtocol"
  protocols = 0x0000000100008310
  instanceMethods = 0x0000000100008328
  classMethods = nil
  optionalInstanceMethods = nil
  optionalClassMethods = nil
  instanceProperties = nil
  size = 96
  flags = 0
  _extendedMethodTypes = 0x0000000100008348
  _demangledName = 0x0000000000000000
  _classProperties = nil
}
```

### 7.4 Instance variables

We skip the steps that repeat the previous sections.

> Get `class_ro_t`, which holds the ivar list, from `class_rw_t -> ro()`

```cpp
(lldb) p $4->ro()
(const class_ro_t *) $18 = 0x00000001000083d0
```

> Get the ivar list from `class_ro_t -> ivars`

```cpp
(lldb) p $18->ivars
(const ivar_list_t *const) $19 = 0x0000000100008570

(lldb) p *$19
(const ivar_list_t) $20 = {
  entsize_list_tt<ivar_t, ivar_list_t, 0, PointerModifierNop> = (entsizeAndFlags = 32, count = 5)
}
```

> Fetch each ivar from the ivar list

```cpp
(lldb) p $20.count
(const uint32_t) $21 = 5

(lldb) p $20.get(0)
(ivar_t) $22 = {
  offset = 0x0000000100008770
  name = 0x0000000100003de7 "_isBoy"
  type = 0x0000000100003f47 "c"
  alignment_raw = 0
  size = 1
}
(lldb) p $20.get(1)
(ivar_t) $23 = {
  offset = 0x0000000100008778
  name = 0x0000000100003dee "_height"
  type = 0x0000000100003f49 "f"
  alignment_raw = 2
  size = 4
}
(lldb) p $20.get(2)
(ivar_t) $24 = {
  offset = 0x0000000100008780
  name = 0x0000000100003df6 "_name"
  type = 0x0000000100003f4b "@\"NSString\""
  alignment_raw = 3
  size = 8
}
(lldb) p $20.get(3)
(ivar_t) $25 = {
  offset = 0x0000000100008788
  name = 0x0000000100003dfc "_age"
  type = 0x0000000100003f57 "q"
  alignment_raw = 3
  size = 8
}
(lldb) p $20.get(4)
(ivar_t) $26 = {
  offset = 0x0000000100008790
  name = 0x0000000100003e01 "_father"
  type = 0x0000000100003f59 "@\"RYModel\""
  alignment_raw = 3
  size = 8
}
```
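
The `alignment_raw` values line up with the `size` values if `alignment_raw` is read as a power of two. The sketch below makes that assumption explicit; `alignmentBytes` and `alignUp` are hypothetical helpers for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: alignment_raw is log2 of the alignment, so bytes = 1 << raw.
constexpr std::uint32_t alignmentBytes(std::uint32_t alignmentRaw) {
    return 1u << alignmentRaw;
}

// Round `offset` up to the next multiple of `alignment` (a power of two).
constexpr std::uint32_t alignUp(std::uint32_t offset, std::uint32_t alignment) {
    return (offset + alignment - 1) & ~(alignment - 1);
}
```

For the five ivars above this gives alignments of 1, 4, 8, 8, and 8 bytes, equal to their printed sizes. `alignUp` shows how a running offset would be padded for the next ivar. The real offsets are stored behind each `offset` pointer and are not shown in this session.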

### 7.5 Class methods

Class methods have to be looked up in the metaclass.

> Get the metaclass's class data

```cpp
(lldb) p/x RYModel.class
(Class) $67 = 0x00000001000087c0 RYModel
(lldb) x/4gx 0x00000001000087c0
0x1000087c0: 0x0000000100008798 0x000000010036a140
0x1000087d0: 0x00000001010695c0 0x0004803400000007
(lldb) p/x 0x0000000100008798 & 0x00007ffffffffff8ULL
(unsigned long long) $68 = 0x0000000100008798
(lldb) p 0x0000000100008798 + 0x20
(long) $70 = 4295002040
(lldb) p/x 0x0000000100008798 + 0x20
(long) $71 = 0x00000001000087b8
(lldb) p (class_data_bits_t *)0x00000001000087b8
(class_data_bits_t *) $72 = 0x00000001000087b8

(lldb) p $72->data()
(class_rw_t *) $73 = 0x0000000101069510
(lldb) p $73->methods()
(const method_array_t) $74 = {
  list_array_tt<method_t, method_list_t, method_list_t_authed_ptr> = {
     = {
      list = {
        ptr = 0x0000000100008398
      }
      arrayAndFlag = 4295000984
    }
  }
}
```
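
The two address computations typed into LLDB above (mask the isa word, then add `0x20`) can be written as small functions. The constants are taken from the session; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Mask typed into LLDB in the session above.
constexpr std::uint64_t kIsaMask = 0x00007ffffffffff8ULL;
// Offset of bits inside the class: 8 (isa) + 8 (superclass) + 16 (cache).
constexpr std::uint64_t kBitsOffset = 0x20;

// Class (here: metaclass) address stored in an isa word.
constexpr std::uint64_t classFromIsa(std::uint64_t isaWord) {
    return isaWord & kIsaMask;
}

// Address of class_data_bits_t inside the class at `classAddress`.
constexpr std::uint64_t bitsAddress(std::uint64_t classAddress) {
    return classAddress + kBitsOffset;
}
```

With the isa word `0x0000000100008798` read from the first line of the memory dump, `bitsAddress(classFromIsa(...))` yields `0x00000001000087b8`, the address that is cast to `class_data_bits_t *` in the session.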

> Get the metaclass's method list (the class methods)

```cpp
(lldb) p $74.list
(const method_list_t_authed_ptr<method_list_t>) $75 = {
  ptr = 0x0000000100008398
}
(lldb) p $75.ptr
(method_list_t *const) $76 = 0x0000000100008398
(lldb) p *$76
(method_list_t) $77 = {
  entsize_list_tt<method_t, method_list_t, 4294901763, method_t::pointer_modifier> = (entsizeAndFlags = 27, count = 1)
}
```

> Print the method information

```cpp
(lldb) p $77.get(0).big()
(method_t::big) $78 = {
  name = "classDoSomething"
  types = 0x0000000100003f3f "v16@0:8"
  imp = 0x0000000100003920 (ObjcBuild`+[RYModel classDoSomething])
}
```
