Lua中的代码效率注意事项

在项目开发的实际过程中,可能由于各种原因而导致lua代码在执行效率上出现了瓶颈。那么就必须要在代码书写上就要注意代码的实际执行效率了!

luajit的效率

luajit 在不同平台下的表现

传送门

luajit 的jit 模式

JIT模式在iOS以及其他有权限管制的平台(例如PS4,XBox)都不能使用。

luajit 的interpreter模式

interpreter模式就和原生的lua没啥区别了。

luajit 的编译方式(trace compiler)

trace compiler的不稳定性,还有jit在平台上编译转换的时候,很多时候都是失败的


luajit高效代码注意事项

官方优化指导link 以下是个人译文(错误指出(Conerlius))

This is a mailing list post by Mike Pall.

从Mike Pall发来的邮件

The following things are needed to get the best speed out for numerical computations with LuaJIT (in order of importance): 需要以下几点可以让LuaJIT的数值计算获得最佳的执行速度(按重要性排序):

Reduce number of unbiased/unpredictable branches. 减少不确定/不可预测的分支代码

Heavily biased branches (>95% in one direction) are fine.

选择确定性分支(在一个分支树上大于95%偏向)优先判断 Prefer branch-free algorithms. 优先选择无分支算法

Use math.min() and math.max().

使用math.min() 和 math.max()

Use bit.*, e.g. for conditional index computations. 尽可能使用bitop,如:使用数组索引

      分支代码就是根据条件会跳转的代码(最典型就是if..else),简单说:
      if 条件1 then
      elseif 条件2 
      then
      假如条件1或者条件2其中一方达成的概率非常高(>95%),
      那我们认为这是可预测的分支代码。

Use FFI data structures. 使用FFI数据结构(项目目前没有用,以后再说吧,没时间)

Use int32_t, avoid uint32_t data types.

使用int32_t,避免使用 uint32_t data

Use double, avoid float data types. 使用double, 避免使用 float

Metamethods are fine, but don’t overdose them. 元方法不要过量地使用

Call C functions only via the FFI. 尽可能用ffi来调用c函数

Avoid calling trivial functions, better rewrite them in Lua.

避免跨语言调用一些简单的功能方法,最好是能在lua里重写它

Avoid callbacks – use pull-style APIs (read/write) and iterators instead. 避免使用有返参数的调用。建议使用拉取式api或迭代器

Use plain ‘for i=start,stop,step do … end’ loops. 实现循环时,最好使用简单的for i = start, stop, step do这样的写法,或者使用ipairs,而尽量避免使用for k,v in pairs(x) do

Prefer plain array indexing, e.g. ‘a[i+2]’.

建议使用数组型下下标访问

Avoid pointer arithmetic. 避免指针运算

Find the right balance for unrolling. 找到展开式的合适平衡点

Avoid inner loops with low iteration count (< 10).

避免出现内循环次数少于10次的循环体

Only unroll loops if the loop body has not too many instructions. 只展开那些循环体内结构不是太复杂的函数

Consider using templates instead of hand-unrolling (see GSL Shell). 考虑使用模板而不是手动展开(请参阅GSL Shell)。

You may have to experiment a bit. 你可能需要尝试一下。(官方这说法也是够操蛋的了) Define and call only ‘local’ (!) functions within a module. local function 最好只在模块内定义和使用

Cache often-used functions from other modules in upvalues. 缓存经常使用的、来自其他模块的方法

E.g. local sin = math.sin … local function foo() return 2*sin(x) end

比如 local sin = math.sin … local function foo() return 2*sin(x) end

Don’t do this for FFI C functions, cache the namespace instead, e.g. local lib = ffi.load(“lib”). FFI的C方法就不要如此做了,缓存其名称空间就可以了

Avoid inventing your own dispatch mechanisms. 避免使用你自己实现的分发调用机制

Prefer to use built-in mechanisms, e.g. metamethods.

尽量使用內建的例如metatable这样的机制

Do not try to second-guess the JIT compiler. 无需过多去帮jit编译器做手工优化

It’s perfectly ok to write ‘z = x[a+b] + y[a+b]’.

‘z = x[a+b] + y[a+b]’这种写法是完全ok的

Do not try CSE (Common Subexpression Elimination) by hand, e.g. ‘local c = a+b’.

不要需要手动执行子表达式消除

It may become detrimental if the lifetime of the temporary is longer than needed. If the compiler cannot deduce that it’s dead, then the useless temporary will block a register or stack slot and/or it needs to be stored to the Lua stack. 如果临时变量的生命周期超出作用域,则可能会变得更坏。如果编译器不能推断变量已经无用了,那么无用的临时变量将阻塞寄存器或堆栈或把临时变量存储到Lua堆栈。

Duplicate expression involving basic arithmetic operators that are relatively close to each other (and likely in the same trace) should not be manually CSEd. Loads only need to be manually hoisted, if alias analysis is likely to fail. 涉及相似的基本算术运算符的重复表达式尽可能不要手动去消除子表达式。当分析可能失败了,才需手动提升负载。

It’s perfectly ok to write ‘a[i][j] = a[i][j] * a[i][j+1]’. 比较好的表达写法是 a[i][j] = a[i][j] * a[i][j+1]

Do not try to cache partial FFI struct/array references (e.g. a[i]) unless they are long-lived (e.g. in a big loop).

不要试图缓存部分FFI结构体和数组引用(例如a[i]),除非他们会长期地使用(例如:在一个打循坏中使用)

There are quite a few “easy” optimizations where the compiler is in a better position to perform them. Better focus on the difficult things, like algorithmic improvements. 有很多“简单”的优化,编译器可以更好地执行。最好是多关注复杂的事情,如算法改进。

Be careful with aliasing, esp. when using multiple arrays. 变量的别名可能会阻止jit优化掉子表达式,尤其是在使用多个数组的时候

LuaJIT uses strict type-based disambiguation, but there are limits to this due to C99 conformance.

LuaJIT使用严格的基类的消歧,但由于C99的一致性,这一点受到了限制。

E.g. in ‘x[i] = a[i] + c[i]; y[i] = a[i] + d[i]’ the load of a[i] needs to be done twice, because x could alias a. It does make sense to use ‘do local t = a[i] … end’ here. 比如在x[i] = a[i] + c[i]; y[i] = a[i] + d[i]a[i]执行了两次,因为x可能是a的别名。在这里使用do local t = a[i] ... end是有一定道理的

Reduce the number of live temporary variables. 减少存活着的临时变量的数量

Best to initialize on definition, e.g. ‘local y = f(x)’

最好在初始化的时候去定义。比如local y = f(x)

Yes, this means you should interleave this with other code.

是的,这意味着你可能存在和其他代码交错使用

Do not hoist variable definitions to the start of a function – Lua is not JavaScript nor K&R C.

不要将变量定义在函数的开头。

Use ‘do local y = f(x) … end’ to bound variable lifetimes. 使用 do local y = f(x) ... end 去控制变量的生命周期

Do not intersperse expensive or uncompiled operations. 减少使用高消耗或者不支持jit的操作

print() is not compiled, use io.write().

print() 还没有完成,建议使用io.write。

E.g. avoid assert(type(x) == “number”, “x is a “..mytype(x)”) 比如。避免assert(type(x) == "number", "x is a "..mytype(x)")

The problem is not the assert() or the condition (basically free). The problem is the string concatenation, which has to be executed every time, even if the assertion never fails!

问题不在于assert()或条件(基本上是空闲的)。问题在于字符串永远都会并联,即是assert永远不会失败。

Watch the output of -jv and -jdump. 关注-jv和-jdump的输出

You need to take all of these factors into account before deciding on a certain algorithm. An advanced algorithm, that’s fast in theory, may be slower than a simpler algorithm, if the simpler algorithm has much fewer unbiased branches. 在决定算法之前,需要考虑所有的因素。理论上高级的算法速度更快,但实际上有可能币简单算法更慢,如果简单算法具有较少的分歧分支的话。

–Mike