Summary of clang9 adaptation, phase 1

1. Overview

As of November 25, 2021, the sdk/gtest/dsopt modules compile with clang9.

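Using the script described on the page linked below, I downloaded all changes related to [TR-16607] Building and verifying the clang9 cross-compilation toolchain - Enflame Company JIRA, including both merged changes and changes that are still open:

How to batch-export detailed patches from gerrit - 周荣华_Ronghua - enflame wiki

One caveat: a gerrit query cannot contain parentheses. When several conditions are combined they are ANDed by default; to get OR semantics, join the alternative expressions explicitly with OR.

A quick count shows 3924 lines added and 4164 lines deleted: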

PS D:\code> grep "^+[^+]" .\diffrecord.txt |wc
   3924   24785  152346
PS D:\code> grep "^-[^-]" .\diffrecord.txt |wc
   4164   23159  147430

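Early in the work -Werror was enabled, so some of the fixes address fairly unimportant warnings. Because there were simply too many warnings, -Werror was later turned off temporarily and only a set of specific -Werror=... options was kept.

In addition, the code under tops contains a great many implicit conversions from larger to smaller integer types, so -Wno-c++11-narrowing was later added to silence the related diagnostics temporarily.

2. Methods for finding and fixing problems

Running a full build after every single fix is usually very time-consuming. The methods below extract an individual compile or link command so that a fix can be verified in isolation.

2.1. Getting cmake compile commands

cmake keeps a compilation database: with CMAKE_EXPORT_COMPILE_COMMANDS enabled, a "compile_commands.json" file is generated in the cmake_build directory (the directory where the cmake command was run, which may be named differently). It records the working directory and the full command used to build every .c/.cc/.cpp file into a .o. For example, the compile command for hlir_utils_test.cc can be obtained like this: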

grep hlir_utils_test.cc compile_commands.json
  "command": "/opt/efb/clang9/bin/clang++  -DLLVM_DISABLE_ABI_BREAKING_CHECKS_ENFORCING -D_GLIBCXX_USE_CXX11_ABI=0 -I/home/ronghua.zhou/clang1_build/tops/../test_sdk_build/execroot/dtu_sdk/bazel-out/k8-opt/bin/external/sdksrc/include/_virtual_includes/include -I/home/ronghua.zhou/clang1_build/tops/../test_sdk_build/execroot/dtu_sdk/bazel-out/k8-opt/bin/external/sdksrc/include/_virtual_includes/include/dtu -I/home/ronghua.zhou/clang1_build/tops/../test_sdk_build/execroot/dtu_sdk/bazel-out/k8-opt/bin/lib/umd/include/_virtual_includes/include -I/home/ronghua.zhou/clang1_build/tops/../test_sdk_build/execroot/dtu_sdk/bazel-out/k8-opt/bin/external/ef_log/include/_virtual_includes/include -I/home/ronghua.zhou/clang1_build/tops/sdk -I/home/ronghua.zhou/clang1_build/tops/sdk/lib -I/home/ronghua.zhou/clang1_build/tops/sdk/lib/cpu_ops -I/home/ronghua.zhou/clang1_build/tops/../test_sdk_build/execroot/dtu_sdk/external/llvm-project/llvm/include -I/home/ronghua.zhou/clang1_build/tops/../test_sdk_build/execroot/dtu_sdk/external/llvm-project/mlir/include -I/home/ronghua.zhou/clang1_build/tops/../test_sdk_build/execroot/dtu_sdk/external/org_tensorflow -I/home/ronghua.zhou/clang1_build/tops/../test_sdk_build/execroot/dtu_sdk/external/eigen_archive -I/home/ronghua.zhou/clang1_build/tops/../test_sdk_build/execroot/dtu_sdk/external/com_google_absl -I/home/ronghua.zhou/clang1_build/tops/../test_sdk_build/execroot/dtu_sdk/external/com_google_protobuf/src -I/home/ronghua.zhou/clang1_build/tops/../test_sdk_build/external/dtu_sdk/bazel-bin -I/home/ronghua.zhou/clang1_build/tops/../test_sdk_build/external/llvm-project/llvm/utils/unittest/googlemock/include -I/home/ronghua.zhou/clang1_build/tops/../test_sdk_build/external/com_googlesource_code_re2 -I/home/ronghua.zhou/clang1_build/tops/../test_sdk_build/execroot/dtu_sdk/bazel-out/k8-opt/bin -I/home/ronghua.zhou/clang1_build/tops/../test_sdk_build/execroot/dtu_sdk/bazel-out/k8-opt/bin/lib 
-I/home/ronghua.zhou/clang1_build/tops/../test_sdk_build/execroot/dtu_sdk/bazel-out/k8-opt/bin/external/org_tensorflow -I/home/ronghua.zhou/clang1_build/tops/../test_sdk_build/execroot/dtu_sdk/bazel-out/k8-opt/bin/external/llvm-project/llvm/include -I/home/ronghua.zhou/clang1_build/tops/../test_sdk_build/execroot/dtu_sdk/bazel-out/k8-opt/bin/external/llvm-project/mlir/include -isystem /home/ronghua.zhou/clang1_build/tops/3rdparty/googletest/include -isystem /home/ronghua.zhou/clang1_build/tops/3rdparty/googletest  -O3 -g0 -DNDEBUG -fPIE   -m64 -march=x86-64 -mtune=generic -Werror=array-bounds -Werror=empty-body -Werror=format-extra-args -Werror=incompatible-pointer-types -Werror=array-bounds-pointer-arithmetic -Werror=c++-compat -Werror=shift-count-overflow -Werror=sizeof-pointer-memaccess -Werror=for-loop-analysis -Werror=unused-label -Werror=delete-incomplete -Werror=empty-translation-unit -Werror=unused-local-typedef -Werror=gnu-case-range -Werror=mismatched-new-delete -Werror=infinite-recursion -Werror=unreachable-code -Werror=sometimes-uninitialized -Werror=c++14-binary-literal -Werror=implicit-fallthrough -Werror=constant-logical-operand -Werror=exceptions -fcxx-exceptions -Werror=extra-tokens -Werror=format -Werror=format-security -Werror=header-guard -Werror=literal-conversion -Werror=null-conversion -Werror=pointer-bool-conversion -Werror=shift-overflow -Werror=tautological-constant-out-of-range-compare -Werror=tautological-pointer-compare -Werror=varargs -Wdouble-promotion -Wno-error=extern-c-compat -Wall -Wno-c++11-narrowing -Wextra -fsanitize=address -fno-omit-frame-pointer -std=gnu++14 -std=gnu++14 -o sdk/tests/hlir/cc_tests/CMakeFiles/hlir_utils_test.dir
hlir_utils_test.cc.o -c /home/ronghua.zhou/clang1_build/tops/sdk/tests/hlir/cc_tests/hlir_utils_test.cc",
  "file": "/home/ronghua.zhou/clang1_build/tops/sdk/tests/hlir/cc_tests/hlir_utils_test.cc"

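2.2. Getting bazel compile commands

https://github.com/vincent-picaud/Bazel_and_CompileCommands

The open-source project above suggests adding --experimental_action_listener=//tools/actions:generate_compile_commands_listener to the bazel command to capture compile commands, but I tried it several times without success. I ended up capturing them with plain ps while the build was running. For example, the compile command for hlir_utils_test.cc can be captured with: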

ps -elf |grep hlir_utils_test.cc

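Alternatively, adding the -s option to the bazel command also prints the compile commands it subsequently runs.

2.3. Getting link commands

If the exact link target is known, the ps approach from 2.2 works here too. For example, the link command for libdtu_sdk.so can be captured with: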

ps -elf |grep libdtu_sdk.so

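If the exact link target is not known and there are not too many link jobs, "ps -elf" gives the full process list, which will contain many "ld @/tmp/response-xxx.txt" processes. Copy all current /tmp/response* files to another directory and check which target each one links. These files contain the complete link command and its arguments, so the link command can be recovered from them.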

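3. Categories of actual changes

3.1. Compiler option changes

3.1.1. Options added

-fcxx-exceptions : dsopt uses exceptions, and exception handling was off by default with clang here, so it has to be enabled.

-Wno-c++11-narrowing : the code under tops contains a great many implicit conversions from larger to smaller integer types, and clang treats such narrowing conversions as errors. The diagnostic is disabled temporarily and should be re-enabled once each component has fixed these conversions.

3.1.2. Options removed

-Werror : there are far too many warnings to eliminate them all for now, so this option is removed temporarily.

3.1.3. Options changed

set (CMAKE_CXX_STANDARD 14) : the previous default standard was 17, which conflicts with TensorFlow's default of 14 and with gcc's default of 14, so it is changed to C++14.

-fno-canonical-system-headers : only gcc supports this option and clang does not, so it is now enabled for gcc only instead of for all compilers.

3.1.4. Notes on bazel options

bazel compile options come in three kinds: copt, cxxopt and conlyopt. copt applies to both C and C++, cxxopt applies to C++ only, and conlyopt applies to C only. Using the wrong one produces many warnings.

3.1.5. When cmake reruns, CMAKE_TOOLCHAIN_FILE sometimes gets the full path prepended to the toolchain file found on the search path, so a direct STREQUAL comparison fails

The fix is to use MATCHES instead of STREQUAL, so the comparison succeeds whether or not the full path has been prepended: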

CMakeLists.txt (the collapsed source block is not included here)

3.2. Template-related errors

3.2.1. use 'template' keyword to treat 'cast' as a dependent template name

When a member function template is called on an object whose type depends on a template parameter, clang requires the template keyword to mark the member name as a template, and reports an error otherwise. See ISO C++03 14.2/4:

When the name of a member template specialization appears after . or -> in a postfix-expression, or after nested-name-specifier in a qualified-id, and the postfix-expression or qualified-id explicitly depends on a template-parameter (14.6.2), the member template name must be prefixed by the keyword template. Otherwise the name is assumed to name a non-template.

For example, the SinkTransposeWithScalarBroadcast class in hlir calls the cast method for mlir::RankedTensorType and mlir::ShapedType:

diff --git a/sdk/lib/hlir/transforms/TopsInferenceHlirPass/HlirTransposeMoverExt.cc b/sdk/lib/hlir/transforms/TopsInferenceHlirPass/HlirTransposeMoverExt.cc
index c82fa217a21..9952ddbc470 100644
--- a/sdk/lib/hlir/transforms/TopsInferenceHlirPass/HlirTransposeMoverExt.cc
+++ b/sdk/lib/hlir/transforms/TopsInferenceHlirPass/HlirTransposeMoverExt.cc
@@ -237,11 +237,14 @@ struct SinkTransposeWithScalarBroadcast : public mlir::OpRewritePattern {
     }
     llvm::SmallVector4> new_operands(root->getNumOperands(), {});
     for (auto& it : broadcast_ops) {
-      auto transposedTy = getTransposedType(std::get<1>(it)
-                                                ->getResult(0)
-                                                .getType()
-                                                .cast(),
-                                            prePermutation);
+      // fix error:
+      // use 'template' keyword to treat 'cast' as a dependent template name
+      auto transposedTy =
+          getTransposedType(std::get<1>(it)
+                                ->getResult(0)
+                                .getType()
+                                .template cast(),
+                            prePermutation);
       auto new_attr = llvm::cast(std::get<1>(it))
                           .broadcast_dimensionsAttr();
       if (new_attr) {
@@ -251,7 +254,7 @@ struct SinkTransposeWithScalarBroadcast : public mlir::OpRewritePattern {
           new_data[i] = layout[data[i]];
         }
         new_attr = mlir::DenseIntElementsAttr::get(
-            new_attr.getType().cast(),
+            new_attr.getType().template cast(),
             llvm::makeArrayRef(new_data));
       }
       mlir::Operation* transpose_bs_op =
@@ -274,7 +277,7 @@ struct SinkTransposeWithScalarBroadcast : public mlir::OpRewritePattern {
     mlir::Operation* ret_transpose = rewriter.create(
         root->getLoc(), root->getResult(0).getType(), new_root->getResult(0),
         mlir::DenseIntElementsAttr::get(
-            permutation.getType().cast(), layout));
+            permutation.getType().template cast(), layout));
     root->replaceAllUsesWith(ret_transpose);
   }

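Note that the template keyword is only needed when the object's type depends on a template parameter; calls in the same class on non-dependent objects need no change. For example, the ss object in the same file has the fixed, non-template type mlir::Operation*, so when it calls the cast function template no template keyword is needed: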

mlir::Operation* ss = op.getOperation();
auto new_operand_ty = getTransposedType(operand_ty, prePermutation);
auto new_source_ty = getTransposedType(source_ty, prePermutation);
auto new_result_ty = getTransposedType(
    ss->getResult(0).getType().cast(),
    prePermutation);
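
The distinction can be reproduced in isolation. The following is a minimal sketch, not the MLIR code: Type, RankedType, RankedId, KindOf and RankedIdFixed are hypothetical stand-ins, with Type::cast playing the role of the cast member template. The checks are static_assert, so they run at compile time.

```cpp
// Hypothetical stand-in for mlir::Type: cast<U>() is a member function template.
struct RankedType {
  int id;
};

struct Type {
  int id;

  template <typename U>
  constexpr U cast() const {
    return U{id};
  }

  // Plain (non-template) member function.
  constexpr int kind() const { return id; }
};

// `value` has the dependent type T, so while parsing this template the
// compiler cannot know that `cast` names a template. The `template` keyword
// states it explicitly.
template <typename T>
constexpr int RankedId(const T& value) {
  return value.template cast<RankedType>().id;
}

// A non-template member is called without the keyword, even on a dependent object.
template <typename T>
constexpr int KindOf(const T& value) {
  return value.kind();
}

// Here `value` has the concrete type Type, so plain `.cast<...>()` is enough.
constexpr int RankedIdFixed(const Type& value) {
  return value.cast<RankedType>().id;
}

static_assert(RankedId(Type{7}) == 7, "dependent call needs 'template'");
static_assert(KindOf(Type{3}) == 3, "non-template member needs no keyword");
static_assert(RankedIdFixed(Type{5}) == 5, "non-dependent call needs no keyword");
```

Dropping the template keyword from RankedId is what produces the error quoted in the heading under clang.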

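The factor module's factor_profiler_pass.cc has the reverse form of this problem: the template keyword was written in front of ordinary, non-template member functions, which clang rejects, so the keyword is removed: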

diff --git a/sdk/lib/factor/codegen/passes/factor_profiler_pass.cc b/sdk/lib/factor/codegen/passes/factor_profiler_pass.cc
index 43419fd305a..ad23a709f20 100644
--- a/sdk/lib/factor/codegen/passes/factor_profiler_pass.cc
+++ b/sdk/lib/factor/codegen/passes/factor_profiler_pass.cc
@@ -55,11 +55,11 @@ mlir::Value getFirstOperand(mlir::Value op) {
  
 template 
 int getSrcCompressed(T op) {
-  return op.template dma_src_compressedAttr().getInt();
+  return op.dma_src_compressedAttr().getInt();
 }
 template 
 int getDstDecompressed(T op) {
-  return op.template dma_dst_decompressAttr().getInt();
+  return op.dma_dst_decompressAttr().getInt();
 }
  
 #define DISABLE_DMA_COMPRESS_ATTR_GETTER(OP) \
@@ -84,11 +84,11 @@ DISABLE_DMA_COMPRESS_ATTR_GETTER(mlir::factor::FactorDeSliceOp)
  
 template 
 int getReverseLr(T op) {
-  return op.template dma_reverse_lrAttr().getInt();
+  return op.dma_reverse_lrAttr().getInt();
 }
 template 
 int getReverseTb(T op) {
-  return op.template dma_reverse_tbAttr().getInt();
+  return op.dma_reverse_tbAttr().getInt();
 }
  
 #define DISABLE_REVERSE_ATTR_GETTER(OP) \
@@ -114,7 +114,7 @@ DISABLE_REVERSE_ATTR_GETTER(mlir::factor::FactorDeSliceOp)
  
 template 
 int getDmaType(T op) {
-  return op.template dma_typeAttr().getInt();
+  return op.dma_typeAttr().getInt();
 }
  
 #define DISABLE_DMA_TYPE_GETTER(OP) \
@@ -142,8 +142,8 @@ std::string formatDmaAttrs(int direction, int src_compressed,
 template 
 void extractDmaMetaInfoTo(T op, dtu_activity_data &data) {
   auto &args = data.args;
-  mlir::Value from = getFirstOperand(op.template from());
-  mlir::Value to = getFirstOperand(op.template to());
+  mlir::Value from = getFirstOperand(op.from());
+  mlir::Value to = getFirstOperand(op.to());
   auto engine_type = getDmaType(op);
   auto direction = op.dma_directionAttr().getInt();

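3.2.2. Ambiguity

If, during template instantiation, the same call matches both function template A and function template B equally well, clang reports an ambiguity error while gcc does not.

Take EraseHelp below. The original version declared two prototypes. When several template type lists have to be expressed with TypeSequence, the compiler has no way of knowing whether to peel off Last first or Inner first. Surprisingly, gcc did not report an error even though the two functions could have been implemented differently; it is unclear whether it picks an arbitrary matching prototype or always uses the first or the last one.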

constexpr static auto EraseHelp(TypeSequence, TypeSequence);

constexpr static auto EraseHelp(TypeSequence, TypeSequence);

diff --git a/sdk/lib/hlir/ir/type_utils.h b/sdk/lib/hlir/ir/type_utils.h
index 3cf2bc7994a..0e645fd1e7e 100644
--- a/sdk/lib/hlir/ir/type_utils.h
+++ b/sdk/lib/hlir/ir/type_utils.h
@@ -157,12 +157,9 @@ struct EraseSeqIf {
     using type = decltype(EraseHelp(LeftSeq(), TypeSequence()));
     return type();
   }
-  template 
-  constexpr static auto EraseHelp(TypeSequence, TypeSequence) {
-    using type = typename std::conditional::value,
-                                           TypeSequence,
-                                           TypeSequence>::type;
-    return type();
+  template 
+  constexpr static auto EraseHelp(TypeSequence, TypeSequence<>) {
+    return TypeSequence();
   }
   using type = decltype(EraseHelp(TypeSequence<>(), TypeSequence()));
 };
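
One way to keep such overloads unambiguous is to make sure every call can match exactly one of them. The sketch below assumes a simplified TypeSequence and a hypothetical EraseIntegral helper (neither is the code from type_utils.h). It drops integral types from a list using one recursive overload plus one terminal overload that only accepts an empty second sequence.

```cpp
#include <type_traits>

// Simplified stand-in for the TypeSequence used in type_utils.h.
template <typename... Ts>
struct TypeSequence {};

// Terminal overload: nothing is left to inspect, return what was kept.
// It can only match when the second sequence is empty.
template <typename... Kept>
constexpr auto EraseIntegral(TypeSequence<Kept...>, TypeSequence<>) {
  return TypeSequence<Kept...>();
}

// Recursive overload: requires at least one pending type (Head), so it can
// never match the same call as the terminal overload.
template <typename... Kept, typename Head, typename... Tail>
constexpr auto EraseIntegral(TypeSequence<Kept...>,
                             TypeSequence<Head, Tail...>) {
  // Keep Head unless it is an integral type.
  using Next = typename std::conditional<std::is_integral<Head>::value,
                                         TypeSequence<Kept...>,
                                         TypeSequence<Kept..., Head>>::type;
  return EraseIntegral(Next(), TypeSequence<Tail...>());
}

using Filtered = decltype(EraseIntegral(
    TypeSequence<>(), TypeSequence<int, float, char, double>()));

static_assert(std::is_same<Filtered, TypeSequence<float, double>>::value,
              "integral types are removed, order is preserved");
```

Because the two parameter patterns are mutually exclusive, overload resolution never has to rank them against each other, so the result does not depend on how a particular compiler breaks ties.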

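3.3. Type mismatches

3.3.1. Implicit conversion from a larger integer type to a smaller one

For example, func_entry in sdk/tests/llir/dataflow1_pingpang_buffer_test.cc was declared as int64_t, but the function it is passed to takes a uint32_t parameter, which triggers an implicit int64_t → uint32_t conversion: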

diff --git a/sdk/tests/llir/dataflow1_pingpang_buffer_test.cc b/sdk/tests/llir/dataflow1_pingpang_buffer_test.cc
index fa824f03d9a..70298b1fb59 100644
--- a/sdk/tests/llir/dataflow1_pingpang_buffer_test.cc
+++ b/sdk/tests/llir/dataflow1_pingpang_buffer_test.cc
@@ -522,7 +522,7 @@ TEST(Pavo2xCDMAPattern1Test, Pavo2xCDMAPattern1WithPingpangTest) {
                              {{0}, {1}, {2}, {3}, {4}, {5}}, 1, 1, 1, -1, -1,
                              output_queues_l1);
  
-    int64_t func_entry = 0;
+    uint32_t func_entry = 0;
     // trigger sip
     for (uint64_t idx = 0; idx < SIP_COUNT; ++idx) {
       std::string sip_name = std::string("sip") + std::to_string(idx);

Similar changes were made in:

sdk/tests/llir/dataflow1_test.cc

sdk/tests/llir/dataflow2_test.cc

sdk/tests/llir/dataflow3_test.cc

sdk/tests/llir/dataflow5_test.cc

sdk/tests/llir/dataflow5_test_1xcdma.cc

sdk/tests/llir/dataflow7_test.cc

sdk/tests/llir/llir2assembler_leo_test.cc

sdk/tests/llir/utils/llir_test_util.cc

sdk/tests/llir/utils/llir_test_util.h
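
When the source variable cannot simply be redeclared with the narrower type, an explicit checked conversion is another option. The sketch below assumes the narrowing happens inside a braced initializer, which is the case -Wc++11-narrowing covers; SipLaunch, ToU32 and MakeLaunch are hypothetical names, not taken from the tests above.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical aggregate that receives the entry point as uint32_t.
struct SipLaunch {
  std::uint32_t func_entry;
};

// Explicit, range-checked conversion from int64_t to uint32_t.
constexpr std::uint32_t ToU32(std::int64_t value) {
  if (value < 0 || value > std::numeric_limits<std::uint32_t>::max()) {
    throw std::out_of_range("value does not fit in uint32_t");
  }
  return static_cast<std::uint32_t>(value);
}

constexpr SipLaunch MakeLaunch(std::int64_t entry) {
  // Writing `SipLaunch{entry}` here would be a narrowing conversion inside a
  // braced initializer: clang rejects it under -Wc++11-narrowing, while GCC
  // commonly reports only a warning.
  return SipLaunch{ToU32(entry)};
}

static_assert(MakeLaunch(42).func_entry == 42u, "in-range value is preserved");
```

Changing the variable's declared type, as the diff above does, is the simpler fix when every use of the variable expects the narrower type.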

3.3.2. Implicit conversion from signed to unsigned

-1 converted to an unsigned integer type:

diff --git a/sdk/lib/hlir/ir/type_utils.h b/sdk/lib/hlir/ir/type_utils.h
index 0e645fd1e7e..f84360269f3 100644
--- a/sdk/lib/hlir/ir/type_utils.h
+++ b/sdk/lib/hlir/ir/type_utils.h
@@ -122,10 +122,9 @@ struct FindIf {
  
 template
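
The diff above is cut off, so the following is only a sketch of the general pattern, assuming -1 was being used as a "not found" marker for an unsigned index. kNotFound, NotFound, FindFirst and kSample are hypothetical names. Spelling the sentinel as the maximum value of the unsigned type avoids the signed-to-unsigned conversion altogether.

```cpp
#include <cstddef>
#include <limits>
#include <type_traits>

// Sentinel expressed directly in the unsigned type instead of writing -1.
constexpr std::size_t kNotFound = std::numeric_limits<std::size_t>::max();

// As a non-type template argument, -1 would be a narrowing conversion to
// std::size_t; the unsigned constant is accepted as-is.
using NotFound = std::integral_constant<std::size_t, kNotFound>;

// Returns the index of the first element equal to `wanted`, or kNotFound.
constexpr std::size_t FindFirst(const int* data, std::size_t size, int wanted) {
  for (std::size_t i = 0; i < size; ++i) {
    if (data[i] == wanted) {
      return i;
    }
  }
  return kNotFound;
}

constexpr int kSample[] = {3, 5, 8};

static_assert(FindFirst(kSample, 3, 5) == 1, "5 is at index 1");
static_assert(FindFirst(kSample, 3, 9) == NotFound::value, "9 is absent");
```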

