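# 05. A Deep Dive into the Structure of Class

## 1. Memory Offsetting

Let's first get a feel for `memory offsetting` (reading values by shifting a pointer) through the following example.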

```cpp
// Values chosen to match the log output below
int nums[4] = {11, 12, 13, 14};

NSLog(@"Address of array nums: %p\n", &nums);

for (int i = 0; i < 4; i++) {
    NSLog(@"Element %d is %d, address is %p", i + 1, nums[i], &nums[i]);
}

NSLog(@"\n");

// The array name decays to a pointer to its first element
int *numsPointer = nums;
NSLog(@"Pointer to array nums: %p\n", numsPointer);

for (int i = 0; i < 4; i++) {
    NSLog(@"Offset %d, value %d, address is %p", i, *(numsPointer + i), (numsPointer + i));
}
```

Output:

```cpp
2021-06-22 23:27:02.344397+0800 Class[15488:5517067] Address of array nums: 0x16f00d3a0

2021-06-22 23:27:02.344734+0800 Class[15488:5517067] Element 1 is 11, address is 0x16f00d3a0
2021-06-22 23:27:02.344910+0800 Class[15488:5517067] Element 2 is 12, address is 0x16f00d3a4
2021-06-22 23:27:02.345776+0800 Class[15488:5517067] Element 3 is 13, address is 0x16f00d3a8
2021-06-22 23:27:02.346025+0800 Class[15488:5517067] Element 4 is 14, address is 0x16f00d3ac
2021-06-22 23:27:02.346173+0800 Class[15488:5517067] 
2021-06-22 23:27:02.346328+0800 Class[15488:5517067] Pointer to array nums: 0x16f00d3a0

2021-06-22 23:27:02.347933+0800 Class[15488:5517067] Offset 0, value 11, address is 0x16f00d3a0
2021-06-22 23:27:02.348115+0800 Class[15488:5517067] Offset 1, value 12, address is 0x16f00d3a4 // +4 bytes
2021-06-22 23:27:02.348425+0800 Class[15488:5517067] Offset 2, value 13, address is 0x16f00d3a8 // +4 bytes
2021-06-22 23:27:02.349788+0800 Class[15488:5517067] Offset 3, value 14, address is 0x16f00d3ac // +4 bytes
```

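![Memory offsetting](/files/-McsVJoATMa2f4jkLf9P)

Each element of `nums` is an `int` and takes 4 bytes, so offsetting the pointer by 1 moves the address forward by 4 bytes. This is why we can read any element by offsetting from the array's first address.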
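The same relationship can be checked at the byte level. The following is a minimal sketch in plain C++; the helper names `byteDistance` and `valueAtOffset` are hypothetical and not part of the original example. It shows that advancing an `int *` by `steps` moves the address by `steps * sizeof(int)` bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: number of bytes between the array base and the
// address reached by offsetting an int pointer `steps` times.
inline std::size_t byteDistance(const int *base, std::size_t steps) {
    assert(base != nullptr);
    const int *moved = base + steps;  // pointer arithmetic in units of int
    return static_cast<std::size_t>(
        reinterpret_cast<std::uintptr_t>(moved) -
        reinterpret_cast<std::uintptr_t>(base));
}

// Reads the element `steps` positions past `base`, equivalent to base[steps].
inline int valueAtOffset(const int *base, std::size_t steps) {
    assert(base != nullptr);
    return *(base + steps);
}
```

For an array like `nums`, `byteDistance(nums, 1)` equals `sizeof(int)`, which is 4 on the platforms discussed here.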

By the same logic, can we also read values out of the memory layout of `Class` by `memory offsetting`?

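## 2. Computing the Memory Layout of Class

In the source of Class, the core data structure looks like this. We can ignore its methods for now, because they do not affect the memory layout of Class.

> Member functions live in the code segment rather than inside each struct, so they do not add to its size.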

```cpp
struct objc_class : objc_object {
    ...

    // Class ISA; inherited from objc_object
    Class superclass;
    cache_t cache;             // formerly cache pointer and vtable
    class_data_bits_t bits;    // class_rw_t * plus custom rr/alloc flags

    ...
}
```

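|      -       |         ISA          |         superclass          |    cache     |        bits         |
| :----------: | :------------------: | :-------------------------: | :----------: | :-----------------: |
| Description  | ISA (struct pointer) | Superclass (struct pointer) | Method cache | Detailed class info |
| Size (bytes) |          8           |              8              |      16      |          8          |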
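To make the arithmetic concrete, here is a minimal sketch that mirrors the table with stand-in types. `MockObjcClass`, `MockCache`, `MockBits`, and `bitsByOffset` are hypothetical names, not the runtime's real definitions; only the field sizes are taken from the table. The sketch assumes a 64-bit target.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in types with the sizes listed in the table above.
struct MockCache { std::uint64_t words[2]; };  // 16 bytes, like cache_t
struct MockBits  { std::uint64_t bits; };      // 8 bytes, like class_data_bits_t

struct MockObjcClass {
    void     *isa;         // 8 bytes on a 64-bit target
    void     *superclass;  // 8 bytes on a 64-bit target
    MockCache cache;       // 16 bytes
    MockBits  bits;        // 8 bytes
};

static_assert(sizeof(MockCache) == 16, "mock cache must be 16 bytes");
static_assert(sizeof(MockBits) == 8, "mock bits must be 8 bytes");

// Byte offset of `bits` from the start of the mock class structure.
inline std::size_t mockBitsOffset() {
    return offsetof(MockObjcClass, bits);
}

// Reaches the `bits` field purely by offsetting from the struct's address.
inline const MockBits *bitsByOffset(const MockObjcClass *cls) {
    assert(cls != nullptr);
    const unsigned char *base = reinterpret_cast<const unsigned char *>(cls);
    return reinterpret_cast<const MockBits *>(base + mockBitsOffset());
}
```

On a 64-bit target `mockBitsOffset()` evaluates to 8 + 8 + 16 = 32 (0x20), and `bitsByOffset` returns the same address as `&cls->bits`.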

## 3. objc\_class: Memory Layout of cache\_t

Ignoring the methods again, the remaining core data structure is as follows:

```cpp
struct cache_t {
private:
    explicit_atomic<uintptr_t> _bucketsAndMaybeMask;
    union {
        struct {
            explicit_atomic<mask_t>    _maybeMask;
#if __LP64__
            uint16_t                   _flags;
#endif
            uint16_t                   _occupied;
        };
        explicit_atomic<preopt_cache_t *> _originalPreoptCache;
    };
}
```

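### 3.1 Side Note: The LP64 Data Model

The `__LP64__` macro seen above refers to the `LP64 data model`. Mainstream 64-bit Unix-like platforms, including macOS and iOS, use the LP64 data model.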

| TYPE (size in bits) |            LP32             |               ILP32               |            LP64             |               ILP64               |              LLP64               |
| :-----------------: | :-------------------------: | :-------------------------------: | :-------------------------: | :-------------------------------: | :------------------------------: |
|       Meaning       | long and pointer are 32-bit | int, long, and pointer are 32-bit | long and pointer are 64-bit | int, long, and pointer are 64-bit | long long and pointer are 64-bit |
|        CHAR         |              8              |                 8                 |              8              |                 8                 |                8                 |
|        SHORT        |             16              |                16                 |             16              |                16                 |                16                |
|         INT         |             16              |                32                 |             32              |                64                 |                32                |
|        LONG         |             32              |                32                 |             64              |                64                 |                32                |
|      LONG LONG      |             64              |                64                 |             64              |                64                 |                64                |
|       POINTER       |             32              |                32                 |             64              |                64                 |                64                |
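The table can be turned into a small lookup. This is a sketch only; `classifyDataModel` and `currentDataModel` are hypothetical helper names. Sizes are passed in bytes, so 32 bits in the table corresponds to 4 here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Names the data model from the sizes (in bytes) of int, long, and a
// pointer, following the rows of the table above.
inline std::string classifyDataModel(std::size_t intBytes,
                                     std::size_t longBytes,
                                     std::size_t pointerBytes) {
    assert(intBytes > 0 && longBytes > 0 && pointerBytes > 0);
    if (intBytes == 2 && longBytes == 4 && pointerBytes == 4) return "LP32";
    if (intBytes == 4 && longBytes == 4 && pointerBytes == 4) return "ILP32";
    if (intBytes == 4 && longBytes == 8 && pointerBytes == 8) return "LP64";
    if (intBytes == 8 && longBytes == 8 && pointerBytes == 8) return "ILP64";
    if (intBytes == 4 && longBytes == 4 && pointerBytes == 8) return "LLP64";
    return "unknown";
}

// Data model of the platform this code is compiled for.
inline std::string currentDataModel() {
    return classifyDataModel(sizeof(int), sizeof(long), sizeof(void *));
}
```

On a platform where `__LP64__` is defined, `currentDataModel()` would be expected to report `LP64`.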

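### 3.2 Memory Layout of the Inner Union

The union has two members: a struct and a pointer. We already know that a pointer takes 8 bytes on arm64, so let's work out how much the struct needs.

|      -       | \_maybeMask |  \_flags  | \_occupied |
| :----------: | :---------: | :-------: | :--------: |
|     Type     |  uint32\_t  | uint16\_t | uint16\_t  |
| Size (bytes) |      4      |     2     |     2      |

> The struct is 4 + 2 + 2 = 8 bytes and the pointer is 8 bytes, so the union is 8 bytes.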

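### 3.3 Layout Summary

|      -       |                 \_bucketsAndMaybeMask                 | union |
| :----------: | :---------------------------------------------------: | :---: |
| Description  | Pointer-sized value (its exact role needs more study) | Union |
| Size (bytes) |    8 (`uintptr_t`, an unsigned long under LP64)     |   8   |

> `cache_t` is 16 bytes in total.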
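The size calculation can be checked with a simplified stand-in. In this sketch, plain integers replace `explicit_atomic<>` and a `void *` replaces `preopt_cache_t *`; all `Mock...` names are hypothetical. It assumes an LP64 target, where `_flags` is present.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// The struct half of the union: 4 + 2 + 2 bytes.
struct MockMaskFields {
    std::uint32_t maybeMask;  // 4 bytes
    std::uint16_t flags;      // 2 bytes (only present under __LP64__)
    std::uint16_t occupied;   // 2 bytes
};

// A union is as large as its largest member.
union MockCacheUnion {
    MockMaskFields fields;         // 8 bytes
    void *originalPreoptCache;     // 8 bytes on a 64-bit target
};

struct MockCacheT {
    std::uintptr_t bucketsAndMaybeMask;  // 8 bytes on a 64-bit target
    MockCacheUnion u;                    // 8 bytes
};

static_assert(sizeof(MockMaskFields) == 8, "4 + 2 + 2 bytes");

// Total size of the mock cache structure.
inline std::size_t mockCacheSize() {
    assert(sizeof(MockCacheUnion) == 8);
    return sizeof(MockCacheT);
}
```

On a 64-bit target `mockCacheSize()` evaluates to 16, matching the `cache` column in the `objc_class` table.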

## 4. objc\_class : Memory Layout of class\_data\_bits\_t

The core data structure is as follows:

```cpp
struct class_data_bits_t {
  ...
  // Values are the FAST_ flags above.
  uintptr_t bits;
  ...
}
```

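Internally there is only one pointer-sized field, `bits`. So how do we get the detailed class information out of it?

We find the function below: ANDing `bits` with `FAST_DATA_MASK` yields a `class_rw_t` pointer. This is much like the way we recovered the class from the ISA.

What `class_rw_t` itself is for is covered in the next section.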

```cpp
class_rw_t* data() const {
    return (class_rw_t *)(bits & FAST_DATA_MASK);
}
```

> Definition of `FAST_DATA_MASK` under `LP64`

```cpp
#if __LP64__
...
// data pointer
#define FAST_DATA_MASK          0x00007ffffffffff8UL
```
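The masking step can be sketched on its own. The constant below copies the `FAST_DATA_MASK` value quoted above; the function names and any sample input values are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Same value as FAST_DATA_MASK in the LP64 branch quoted above.
constexpr std::uint64_t kFastDataMask = 0x00007ffffffffff8ULL;

// Keeps only the pointer bits of a class_data_bits_t-style word.
inline std::uint64_t dataPointerBits(std::uint64_t bits) {
    const std::uint64_t pointer = bits & kFastDataMask;
    assert((pointer & 0x7) == 0);  // the masked result is always 8-byte aligned
    return pointer;
}

// Returns the bits that the mask discards (the low 3 and the high 17 bits),
// which is where flag values can be packed alongside the pointer.
inline std::uint64_t flagBits(std::uint64_t bits) {
    return bits & ~kFastDataMask;
}
```

For a made-up word such as `0x000000010073c043`, `dataPointerBits` gives `0x000000010073c040` and `flagBits` gives `0x3`.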

## 5. class\_data\_bits\_t : class\_rw\_t

> `rw`: read-write

### 5.1 Memory Layout of class\_rw\_t

```cpp
struct class_rw_t {
  // Be warned that Symbolication knows the layout of this structure.
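  // Flag bits, e.g. whether this is a metaclass, whether the class is realized, etc.
  uint32_t flags;
  uint16_t witness;

  // Pointer-sized word that refers to either a class_ro_t or a class_rw_ext_t
  explicit_atomic<uintptr_t> ro_or_rw_ext;

  Class firstSubclass;
  Class nextSiblingClass;
};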
```

### 5.2 Some Important Functions

Here we see some familiar-looking functions. Inside `methods()`, `properties()`, and `protocols()` we come across another important type, `class_rw_ext_t`.

```cpp
class_rw_ext_t *deepCopy(const class_ro_t *ro) {
    return extAlloc(ro, true);
}

const method_array_t methods() const {
    auto v = get_ro_or_rwe();
    if (v.is<class_rw_ext_t *>()) {
        return v.get<class_rw_ext_t *>(&ro_or_rw_ext)->methods;
    } else {
        return method_array_t{v.get<const class_ro_t *>(&ro_or_rw_ext)->baseMethods()};
    }
}

const property_array_t properties() const {
    ...
}

const protocol_array_t protocols() const {
    ...
}
```
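The `is<...>()` / `get<...>()` branch in `methods()` suggests that `ro_or_rw_ext` stores one of two pointer types in a single word. Below is a simplified sketch of that idea using low-bit tagging. It is an illustration under that assumption, not the runtime's actual implementation, and every `Mock...` name is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

struct MockRO  { int baseMethodCount; };  // stand-in for class_ro_t
struct MockRWE { int methodCount; };      // stand-in for class_rw_ext_t

// One pointer-sized word that stores either a MockRO* or a MockRWE*.
// The lowest bit is the tag: 0 means MockRO*, 1 means MockRWE*.
class MockRoOrRwExt {
public:
    explicit MockRoOrRwExt(const MockRO *ro)
        : value_(reinterpret_cast<std::uintptr_t>(ro)) {
        assert((value_ & 1) == 0);  // needs at least 2-byte alignment
    }
    explicit MockRoOrRwExt(MockRWE *rwe)
        : value_(reinterpret_cast<std::uintptr_t>(rwe)) {
        assert((value_ & 1) == 0);
        value_ |= 1;  // set the tag bit
    }

    bool isRWE() const { return (value_ & 1) != 0; }

    const MockRO *ro() const {
        assert(!isRWE());
        return reinterpret_cast<const MockRO *>(value_);
    }
    MockRWE *rwe() const {
        assert(isRWE());
        return reinterpret_cast<MockRWE *>(value_ & ~static_cast<std::uintptr_t>(1));
    }

    // Mirrors the branch in methods(): use the extension if present,
    // otherwise fall back to the read-only data.
    int methodCount() const {
        return isRWE() ? rwe()->methodCount : ro()->baseMethodCount;
    }

private:
    std::uintptr_t value_;
};
```

The design keeps the common case (only read-only data) down to a single pointer, and the extension is reached through the same word once it exists.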

## 6. class\_data\_bits\_t : class\_rw\_t : class\_rw\_ext\_t

```cpp
struct class_rw_ext_t {
    DECLARE_AUTHED_PTR_TEMPLATE(class_ro_t)
    class_ro_t_authed_ptr<const class_ro_t> ro; // read-only class data, including the ivar list
    method_array_t methods;
    property_array_t properties;
    protocol_array_t protocols;
    char *demangledName;
    uint32_t version;
};
```

### 6.1 list\_array\_tt

`method_array_t`, `property_array_t`, and `protocol_array_t` all inherit from `list_array_tt`, which uses template parameters to produce the different definitions.

Core part of the source:

```cpp
template <typename Element, typename List, template<typename> class Ptr>
class list_array_tt {
  ...
  private:
    union {
        Ptr<List> list;
        uintptr_t arrayAndFlag;
    };
  ...
}
```

1. method\_array\_t

```cpp
class method_array_t : 
    public list_array_tt<method_t, method_list_t, method_list_t_authed_ptr> {...}
```

2. property\_array\_t

```cpp
class property_array_t : 
    public list_array_tt<property_t, property_list_t, RawPtr> {...}
```

3. protocol\_array\_t

```cpp
class protocol_array_t : 
    public list_array_tt<protocol_ref_t, protocol_list_t, RawPtr> {...}
```

### 6.2 class\_ro\_t

Core part of the source:

```cpp
struct class_ro_t {
  ...
  const ivar_list_t * ivars; // instance variable list
  ...
}
```

## 7. Inspecting the Class Memory Layout with LLDB

### Structure Diagram

![Memory layout of Class](/files/-Mcsh84pg5Gx80CiAWgO)

Suppose we have the following class:

```cpp
@interface RYModel : NSObject

@property (nonatomic, copy) NSString *name;
@property (nonatomic, assign) NSInteger age;
@property (nonatomic, assign) float height;
@property (nonatomic, strong) RYModel *father;
@property (nonatomic, assign) BOOL isBoy;

- (void)dosomething;

- (void)dosomethingWith:(NSString *)title;

+ (void)classDoSomething;

- (NSString *)sayMyName;

@end
```

### 7.1 Properties

> Get `class_data_bits_t`

We locate `class_data_bits_t` by memory offsetting. As computed earlier, `bits` sits at an offset of 8 + 8 + 16 = 32 bytes.

So in theory we can find it at `class address + 0x20`.

```cpp
(lldb) p/x RYModel.class
(Class) $0 = 0x00000001000083e0 RYModel

(lldb) x/4gx 0x00000001000083e0
0x1000083e0: 0x00000001000083b8 0x000000010036a140
0x1000083f0: 0x000000010073c2d0 0x0004803400000007

// Compute the address of bits
(lldb) p/x 0x1000083e0 + 0x20 
(long) $1 = 0x0000000100008400

// Cast the address to a typed pointer so the following steps are easier
(lldb) p (class_data_bits_t *)0x0000000100008400 
(class_data_bits_t *) $2 = 0x0000000100008400
```

> Get `class_rw_t` through the `data()` method

```cpp
(lldb) p (class_data_bits_t *)0x0000000100008400
(class_data_bits_t *) $4 = 0x0000000100008400

(lldb) p $4->data()
(class_rw_t *) $5 = 0x000000010073c040
```

> Get the property list container through `properties()`

```cpp
(lldb) p $5->properties
(const property_array_t) $6 = {
  list_array_tt<property_t, property_list_t, RawPtr> = {
     = {
      list = {
        ptr = 0x0000000100008280
      }
      arrayAndFlag = 4295000704
    }
  }
}
  Fix-it applied, fixed expression was: 
    $5->properties()
```

> Extract `list`

```cpp
(lldb) p $6.list
(const RawPtr<property_list_t>) $7 = {
  ptr = 0x0000000100008280
}
```

> Extract `ptr`

```cpp
(lldb) p $7.ptr
(property_list_t *const) $8 = 0x0000000100008280
```

> Dereference `ptr` to read the value it points to

```cpp
(lldb) p *$8
(property_list_t) $9 = {
  entsize_list_tt<property_t, property_list_t, 0, PointerModifierNop> = (entsizeAndFlags = 16, count = 5)
}
```
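The output above reports `entsizeAndFlags = 16` and `count = 5`, which reads like a small header followed by fixed-size entries. The sketch below illustrates how an indexed lookup can work on such a layout: entry `i` is found `i * entsize` bytes past the first entry. `MockProperty` and `MockPropertyList` are hypothetical stand-ins with a fixed capacity, and the real runtime types are more involved.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Same shape as the printed property_t values: two C strings.
struct MockProperty {
    const char *name;
    const char *attributes;
};

// Header followed directly by the entries.
struct MockPropertyList {
    std::uint32_t entsize;      // bytes per entry (no flag bits in this mock)
    std::uint32_t count;        // number of valid entries
    MockProperty  entries[5];   // fixed capacity, for this sketch only

    // Locate entry i by byte arithmetic from the first entry.
    const MockProperty &get(std::uint32_t i) const {
        assert(i < count);
        const std::uint8_t *first =
            reinterpret_cast<const std::uint8_t *>(entries);
        return *reinterpret_cast<const MockProperty *>(first + i * entsize);
    }
};
```

With `entsize` set to `sizeof(MockProperty)` (16 on a 64-bit target), `get(i)` lands on the same address as `entries[i]`.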

> Fetch each property by index

```cpp
(lldb) p $9.get(0)
(property_t) $10 = (name = "name", attributes = "T@\"NSString\",C,N,V_name")
(lldb) p $9.get(1)
(property_t) $11 = (name = "age", attributes = "Tq,N,V_age")
(lldb) p $9.get(2)
(property_t) $12 = (name = "height", attributes = "Tf,N,V_height")
(lldb) p $9.get(3)
(property_t) $13 = (name = "father", attributes = "T@\"RYModel\",&,N,V_father")
(lldb) p $9.get(4)
(property_t) $14 = (name = "isBoy", attributes = "Tc,N,V_isBoy")
```
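The `attributes` strings printed above are comma-separated: the part starting with `T` carries the type encoding, single letters such as `C`, `N`, and `&` mark copy, nonatomic, and retain (strong), and the part starting with `V` names the backing ivar. As a rough sketch, they can be split like this. The helper names are hypothetical, and the code assumes that no comma appears inside the type encoding.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Splits a property attribute string such as "Tq,N,V_age" on commas.
inline std::vector<std::string> splitAttributes(const std::string &attributes) {
    std::vector<std::string> parts;
    std::stringstream stream(attributes);
    std::string part;
    while (std::getline(stream, part, ',')) {
        parts.push_back(part);
    }
    return parts;
}

// Returns the backing ivar name (the text after 'V'), or "" if there is none.
inline std::string backingIvar(const std::string &attributes) {
    for (const std::string &part : splitAttributes(attributes)) {
        if (!part.empty() && part[0] == 'V') return part.substr(1);
    }
    return "";
}

// True if the single-letter attribute code (e.g. 'N', 'C', '&') is present.
inline bool hasAttribute(const std::string &attributes, char code) {
    assert(code != ',');
    for (const std::string &part : splitAttributes(attributes)) {
        if (part.size() == 1 && part[0] == code) return true;
    }
    return false;
}
```

Applied to the `name` property, `backingIvar` yields `_name` and `hasAttribute(..., 'C')` is true, which lines up with the `copy` attribute in the declaration.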

### 7.2 Instance Method List

We skip the steps that are the same as before.

> Get the method list container from `class_rw_t -> methods()`

```cpp
(lldb) p $5->methods
(const method_array_t) $6 = {
  list_array_tt<method_t, method_list_t, method_list_t_authed_ptr> = {
     = {
      list = {
        ptr = 0x00000001000080e8
      }
      arrayAndFlag = 4295000296
    }
  }
}
  Fix-it applied, fixed expression was: 
    $5->methods()
```

> Extract `list`

```cpp
(lldb) p $6.list
(const method_list_t_authed_ptr<method_list_t>) $14 = {
  ptr = 0x00000001000080e8
}
```

> Extract `ptr`

```cpp
(lldb) p $14.ptr
(method_list_t *const) $15 = 0x00000001000080e8
```

> Dereference `ptr` to get the `method_list_t` data

```cpp
(lldb) p *$15
(method_list_t) $16 = {
  entsize_list_tt<method_t, method_list_t, 4294901763, method_t::pointer_modifier> = (entsizeAndFlags = 27, count = 14)
}
```

> Get the number of methods

```cpp
(lldb) p $16.count
(uint32_t) $29 = 14
```
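The `method_list_t` output above shows `entsizeAndFlags = 27` together with a template argument of `4294901763`, which is `0xFFFF0003` in hex. Assuming that argument is the mask separating flag bits from the entry size, the two parts can be pulled apart as in this sketch. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Third template argument of entsize_list_tt in the method_list_t output
// above (4294901763), assumed here to be the flag mask.
constexpr std::uint32_t kMethodListFlagMask = 0xFFFF0003u;

// Entry size in bytes: everything outside the flag mask.
inline std::uint32_t entrySize(std::uint32_t entsizeAndFlags,
                               std::uint32_t flagMask) {
    const std::uint32_t size = entsizeAndFlags & ~flagMask;
    assert((size & flagMask) == 0);  // size and flags occupy disjoint bits
    return size;
}

// Flag bits: everything inside the flag mask.
inline std::uint32_t listFlags(std::uint32_t entsizeAndFlags,
                               std::uint32_t flagMask) {
    return entsizeAndFlags & flagMask;
}
```

Under that assumption, 27 splits into an entry size of 24 and flag bits of 3. An entry size of 24 is consistent with the three pointer-sized fields (`name`, `types`, `imp`) printed by `big()`. For the property list, whose mask is shown as 0, `entsizeAndFlags = 16` is the entry size itself.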

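> Fetch the class's instance methods one at a time with the C++ functions `get()` and `big()`
>
> > Besides the instance methods we declared ourselves, we also see the setter and getter methods generated automatically for the properties, plus the `.cxx_destruct` destructor method.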

```cpp
(lldb) p $16.get(0).big()
(method_t::big) $19 = {
  name = "setFather:"
  types = 0x0000000100003f60 "v24@0:8@16"
  imp = 0x0000000100003cc0 (ObjcBuild`-[RYModel setFather:])
}
(lldb) p $16.get(1).big()
(method_t::big) $20 = {
  name = "father"
  types = 0x0000000100003f6b "@16@0:8"
  imp = 0x0000000100003ca0 (ObjcBuild`-[RYModel father])
}

... // some setters and getters omitted

(lldb) p $16.get(3).big()
(method_t::big) $22 = {
  name = "dosomething"
  types = 0x0000000100003f3b "v16@0:8"
  imp = 0x0000000100003b30 (ObjcBuild`-[RYModel dosomething])
}
(lldb) p $16.get(4).big()
(method_t::big) $23 = {
  name = "dosomethingWith:"
  types = 0x0000000100003f60 "v24@0:8@16"
  imp = 0x0000000100003b40 (ObjcBuild`-[RYModel dosomethingWith:])
}

...

(lldb) p $16.get(8).big()
(method_t::big) $27 = {
  name = ".cxx_destruct"
  types = 0x0000000100003f3b "v16@0:8"
  imp = 0x0000000100003d30 (ObjcBuild`-[RYModel .cxx_destruct])
}

... // some setters and getters omitted
```

### 7.3 Protocol List

We skip the steps that are the same as before.

> Get the protocol list container from `class_rw_t -> protocols()`

```cpp
(lldb) p $2->protocols()
(const protocol_array_t) $3 = {
  list_array_tt<unsigned long, protocol_list_t, RawPtr> = {
     = {
      list = {
        ptr = 0x00000001000083b8
      }
      arrayAndFlag = 4295001016
    }
  }
}
```

> Extract `list`

```cpp
(lldb) p $3.list
(const RawPtr<protocol_list_t>) $4 = {
  ptr = 0x00000001000083b8
}
```

> Extract `ptr`

```cpp
(lldb) p $4.ptr
(protocol_list_t *const) $5 = 0x00000001000083b8
```

> Dereference `ptr` to get the `protocol_list_t` data

```cpp
(lldb) p *$5
(protocol_list_t) $6 = (count = 1, list = protocol_ref_t [] @ 0x00007f9a62cb8778)
```

Here we can see that `list` stores a pointer-sized value rather than a complete data structure. In the source we find this definition:

`typedef uintptr_t protocol_ref_t; // protocol_t *, but unremapped`

> Get the `protocol_t *`

```cpp
(lldb) p (protocol_t *)$6.list[0]
(protocol_t *) $7 = 0x00000001000088a0
```

> Read the data

```cpp
(lldb) p *$7
(protocol_t) $8 = {
  objc_object = {
    isa = {
      bits = 4298547400
      cls = Protocol
       = {
        nonpointer = 0
        has_assoc = 0
        has_cxx_dtor = 0
        shiftcls = 537318425
        magic = 0
        weakly_referenced = 0
        unused = 0
        has_sidetable_rc = 0
        extra_rc = 0
      }
    }
  }
  mangledName = 0x0000000100003e71 "RYProtocol"
  protocols = 0x0000000100008310
  instanceMethods = 0x0000000100008328
  classMethods = nil
  optionalInstanceMethods = nil
  optionalClassMethods = nil
  instanceProperties = nil
  size = 96
  flags = 0
  _extendedMethodTypes = 0x0000000100008348
  _demangledName = 0x0000000000000000
  _classProperties = nil
}
```

### 7.4 Instance Variables (ivars)

We skip the steps that are the same as before.

> Get `class_ro_t`, which holds the ivar list, from `class_rw_t -> ro()`

```cpp
(lldb) p $4->ro()
(const class_ro_t *) $18 = 0x00000001000083d0
```

> Get the ivar list from `class_ro_t -> ivars`

```cpp
(lldb) p $18->ivars
(const ivar_list_t *const) $19 = 0x0000000100008570

(lldb) p *$19
(const ivar_list_t) $20 = {
  entsize_list_tt<ivar_t, ivar_list_t, 0, PointerModifierNop> = (entsizeAndFlags = 32, count = 5)
}
```

> Fetch each ivar from the ivar list

```cpp
(lldb) p $20.count
(const uint32_t) $21 = 5

(lldb) p $20.get(0)
(ivar_t) $22 = {
  offset = 0x0000000100008770
  name = 0x0000000100003de7 "_isBoy"
  type = 0x0000000100003f47 "c"
  alignment_raw = 0
  size = 1
}
(lldb) p $20.get(1)
(ivar_t) $23 = {
  offset = 0x0000000100008778
  name = 0x0000000100003dee "_height"
  type = 0x0000000100003f49 "f"
  alignment_raw = 2
  size = 4
}
(lldb) p $20.get(2)
(ivar_t) $24 = {
  offset = 0x0000000100008780
  name = 0x0000000100003df6 "_name"
  type = 0x0000000100003f4b "@\"NSString\""
  alignment_raw = 3
  size = 8
}
(lldb) p $20.get(3)
(ivar_t) $25 = {
  offset = 0x0000000100008788
  name = 0x0000000100003dfc "_age"
  type = 0x0000000100003f57 "q"
  alignment_raw = 3
  size = 8
}
(lldb) p $20.get(4)
(ivar_t) $26 = {
  offset = 0x0000000100008790
  name = 0x0000000100003e01 "_father"
  type = 0x0000000100003f59 "@\"RYModel\""
  alignment_raw = 3
  size = 8
}
```
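In the output above, `alignment_raw` is 0, 2, and 3 for ivars of size 1, 4, and 8. That is consistent with `alignment_raw` being the base-2 logarithm of the alignment in bytes. The sketch below works under that assumption. Both helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Alignment in bytes for a given alignment_raw exponent (assumed log2).
inline std::uint32_t alignmentFromRaw(std::uint32_t alignmentRaw) {
    assert(alignmentRaw < 32);
    return 1u << alignmentRaw;
}

// Rounds offset up to the next multiple of alignment (a power of two).
inline std::uint32_t alignUp(std::uint32_t offset, std::uint32_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (offset + alignment - 1) & ~(alignment - 1);
}
```

For example, if a 1-byte ivar ended at byte 9, a following ivar with `alignment_raw = 2` would start at `alignUp(9, alignmentFromRaw(2))`, which is 12. The actual ivar offsets are not printed in the session above, so these numbers are purely illustrative.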

### 7.5 Class Methods

Class methods have to be looked up in the metaclass.

> Get the metaclass information

```cpp
(lldb) p/x RYModel.class
(Class) $67 = 0x00000001000087c0 RYModel
(lldb) x/4gx 0x00000001000087c0
0x1000087c0: 0x0000000100008798 0x000000010036a140
0x1000087d0: 0x00000001010695c0 0x0004803400000007
(lldb) p/x 0x0000000100008798 & 0x00007ffffffffff8ULL
(unsigned long long) $68 = 0x0000000100008798
(lldb) p 0x0000000100008798 + 0x20
(long) $70 = 4295002040
(lldb) p/x 0x0000000100008798 + 0x20
(long) $71 = 0x00000001000087b8
(lldb) p (class_data_bits_t *)0x00000001000087b8
(class_data_bits_t *) $72 = 0x00000001000087b8

(lldb) p $72->data()
(class_rw_t *) $73 = 0x0000000101069510
(lldb) p $73->methods()
(const method_array_t) $74 = {
  list_array_tt<method_t, method_list_t, method_list_t_authed_ptr> = {
     = {
      list = {
        ptr = 0x0000000100008398
      }
      arrayAndFlag = 4295000984
    }
  }
}
```

> Get the metaclass's method list (the class methods)

```cpp
(lldb) p $74.list
(const method_list_t_authed_ptr<method_list_t>) $75 = {
  ptr = 0x0000000100008398
}
(lldb) p $75.ptr
(method_list_t *const) $76 = 0x0000000100008398
(lldb) p *$76
(method_list_t) $77 = {
  entsize_list_tt<method_t, method_list_t, 4294901763, method_t::pointer_modifier> = (entsizeAndFlags = 27, count = 1)
}
```

> Print the method information

```cpp
(lldb) p $77.get(0).big()
(method_t::big) $78 = {
  name = "classDoSomething"
  types = 0x0000000100003f3f "v16@0:8"
  imp = 0x0000000100003920 (ObjcBuild`+[RYModel classDoSomething])
}
```
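The address arithmetic used to reach the metaclass's `bits` in this section can be summarized in a short sketch. The mask and the `0x20` offset are the values typed in the LLDB session above; the function and constant names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Mask applied to the isa word in the LLDB session above.
constexpr std::uint64_t kIsaMask = 0x00007ffffffffff8ULL;
// Offset of `bits` inside objc_class: isa (8) + superclass (8) + cache (16).
constexpr std::uint64_t kBitsOffset = 0x20;

// The first 8 bytes of a class hold its isa; masking yields the metaclass.
inline std::uint64_t metaclassAddress(std::uint64_t isaWord) {
    return isaWord & kIsaMask;
}

// Address of the class_data_bits_t field for a class or metaclass.
inline std::uint64_t bitsAddress(std::uint64_t classAddress) {
    assert(classAddress != 0);
    return classAddress + kBitsOffset;
}
```

Feeding in the isa word read from `x/4gx` (`0x0000000100008798`) gives the metaclass address, and adding `0x20` gives `0x00000001000087b8`, the value that was cast to `class_data_bits_t *` in the session.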


