C++ – Why is the recursive version of this function faster?

Here is a simple class for iterating over a multidimensional numeric range:

#include <array>
#include <iostream>
#include <limits>

template <int N>
class NumericRange
{
public:
  //  typedef std::vector<double>::const_iterator const_iterator;
  NumericRange() {
    _lower.fill(std::numeric_limits<double>::quiet_NaN());
    _upper.fill(std::numeric_limits<double>::quiet_NaN());
    _delta.fill(std::numeric_limits<double>::quiet_NaN());
  }
  NumericRange(const std::array<double, N> & lower, const std::array<double, N> & upper, const std::array<double, N> & delta):
    _lower(lower), _upper(upper), _delta(delta) {
    _state.fill(std::numeric_limits<double>::quiet_NaN());
    _next_index_to_advance = 0;
  }

  const std::array<double, N> & get_state() const {
    return _state;
  }

  void start() {
    _state = _lower;
  }

  bool in_range(int index_to_advance = N-1) const {
    return ( _state[ index_to_advance ] - _upper[ index_to_advance ] ) < _delta[ index_to_advance ];
  }

  void advance(int index_to_advance = 0) {
    _state[ index_to_advance ] += _delta[ index_to_advance ];
    if ( ! in_range(index_to_advance) ) {
      if (index_to_advance < N-1) {
        // restart index_to_advance
        _state[index_to_advance] = _lower[index_to_advance];

        // carry
        advance(index_to_advance+1);
      }
    }
  }

private:
  std::array<double, N> _lower, _upper, _delta, _state;
  int _next_index_to_advance;
};

int main() {
  std::array<double, 7> lower{0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0};
  std::array<double, 7> upper{1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0};
  std::array<double, 7> delta{0.03, 0.06, 0.03, 0.06, 0.03, 0.06, 0.03};

  NumericRange<7> nr(lower, upper, delta);
  int c = 0;
  for (nr.start(); nr.in_range(); nr.advance()) {
    const std::array<double, 7> & st = nr.get_state();
    ++c;
  }
  std::cout << "took " << c << " steps" << std::endl;

  return 0;
}

When I replace the advance function with a non-recursive variant, the run time increases:

void advance(int index_to_advance = 0) {
  bool carry;
  do {
    carry = false;
    _state[ index_to_advance ] += _delta[ index_to_advance ];
    if ( ! in_range(index_to_advance) ) {
      if (index_to_advance < N-1) {
        // restart index_to_advance
        _state[index_to_advance] = _lower[index_to_advance];

        // carry
        ++index_to_advance;
        carry = true;
        //    advance(index_to_advance);
      }
    }
  } while (carry);
}

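The run times are Unix user times as reported by the time command. The code was compiled with gcc-4.7 using the options -std=c++11 -O3 (though I think it should also work with -std=c++0x on gcc-4.6). The recursive version takes 13 seconds and the iterative version takes 30 seconds. Both need the same number of advance calls to terminate (and if you print the nr.get_state() array inside the for(nr.start()…) loop, both produce the same output).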

This is an interesting riddle! Help me figure out why the recursive version would be more efficient or easier to optimize.

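Best answer
The recursive version is an example of tail recursion, which means the compiler can transform the recursion into iteration. Once that transformation has been performed, the recursive function looks something like this: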

void advance(int index_to_advance = 0) {
    _state[ index_to_advance ] += _delta[ index_to_advance ];
    while ( !in_range(index_to_advance) && index_to_advance < N-1 ) {
        // restart index_to_advance
        _state[index_to_advance] = _lower[index_to_advance];

        // carry
        ++index_to_advance;
        _state[ index_to_advance ] += _delta[ index_to_advance ];
    }
}
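To check that the transformed loop really visits the same states as the recursive original, the two forms can be run side by side. The following is a minimal sketch, not the code from the question: the names `Grid`, `example_grid`, `advance_recursive`, `advance_loop`, and `compare_walks` are hypothetical, and the 3-dimensional grid with steps 0.25/0.5/0.25 is an assumption chosen so that every value is exactly representable as a double.

```cpp
#include <array>
#include <cassert>

// Hypothetical 3-D setup for illustration only; not the values from the question.
constexpr int N = 3;
using State = std::array<double, N>;

struct Grid {
    State lower, upper, delta;
};

// Assumed example grid: [0, 1] in each dimension, steps 0.25 / 0.5 / 0.25.
Grid example_grid() {
    Grid g;
    g.lower.fill(0.0);
    g.upper.fill(1.0);
    g.delta = {{0.25, 0.5, 0.25}};
    return g;
}

// Same range test as NumericRange::in_range in the question.
bool in_range(const Grid& g, const State& s, int i) {
    assert(i >= 0 && i < N);
    return (s[i] - g.upper[i]) < g.delta[i];
}

// Recursive form: the recursive call is the last action (a tail call).
void advance_recursive(const Grid& g, State& s, int i = 0) {
    s[i] += g.delta[i];
    if (!in_range(g, s, i) && i < N - 1) {
        s[i] = g.lower[i];               // restart this index
        advance_recursive(g, s, i + 1);  // carry into the next index
    }
}

// Loop form: what the tail call turns into once it is rewritten as iteration.
void advance_loop(const Grid& g, State& s, int i = 0) {
    s[i] += g.delta[i];
    while (!in_range(g, s, i) && i < N - 1) {
        s[i] = g.lower[i];               // restart this index
        ++i;                             // carry into the next index
        s[i] += g.delta[i];
    }
}

// Walks the grid with both forms in lockstep.
// Returns the number of steps taken, or -1 if the two states ever differ.
long compare_walks(const Grid& g) {
    State a = g.lower, b = g.lower;
    long steps = 0;
    while (in_range(g, a, N - 1)) {
        if (a != b) return -1;
        advance_recursive(g, a);
        advance_loop(g, b);
        ++steps;
    }
    return (a == b) ? steps : -1;
}
```

With the assumed grid, `compare_walks(example_grid())` returns 75 (5 x 3 x 5 grid points), which means both forms visited identical states at every step. This only shows that the two forms behave the same; it says nothing about which one gcc compiles into faster code.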

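Compared with the transformed function, your version contains an extra test and an extra condition variable. If you look closely, the loop is equivalent to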

for( ; index_to_advance < N-1 && !in_range(index_to_advance);++index_to_advance)

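(with the ++index_to_advance moved out of the loop body and into the loop header), and the optimizer probably has a better chance of unrolling that.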
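Written out in full, the for form needs one small adjustment: because the increment now happens in the loop header, the update inside the body has to address index + 1. The sketch below compares it against the while form on a small grid. The names `advance_while`, `advance_for`, and `count_matching_steps`, and the 2-dimensional grid values, are assumptions made for illustration and do not come from the question.

```cpp
#include <array>
#include <cassert>

// Hypothetical 2-D setup for illustration only; not the values from the question.
constexpr int N = 2;
using State = std::array<double, N>;

const State kLower = {{0.0, 0.0}};
const State kUpper = {{1.0, 1.0}};
const State kDelta = {{0.5, 0.25}};

// Same range test as NumericRange::in_range in the question.
bool in_range(const State& s, int i) {
    assert(i >= 0 && i < N);
    return (s[i] - kUpper[i]) < kDelta[i];
}

// while form, as in the transformed function above.
void advance_while(State& s) {
    int i = 0;
    s[i] += kDelta[i];
    while (!in_range(s, i) && i < N - 1) {
        s[i] = kLower[i];  // restart this index
        ++i;               // carry
        s[i] += kDelta[i];
    }
}

// for form: the increment lives in the loop header,
// so the body bumps the next index (i + 1) instead.
void advance_for(State& s) {
    int i = 0;
    s[i] += kDelta[i];
    for (; i < N - 1 && !in_range(s, i); ++i) {
        s[i] = kLower[i];            // restart this index
        s[i + 1] += kDelta[i + 1];   // carry into the next index
    }
}

// Runs both forms in lockstep over the whole grid.
// Returns the number of steps taken, or -1 if the states ever differ.
int count_matching_steps() {
    State a = kLower, b = kLower;
    int steps = 0;
    while (in_range(a, N - 1)) {
        if (a != b) return -1;
        advance_while(a);
        advance_for(b);
        ++steps;
    }
    return (a == b) ? steps : -1;
}
```

For this assumed grid, `count_matching_steps()` returns 15 (3 x 5 grid points). Whether the optimizer actually unrolls the for form depends on the compiler and its flags.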

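That said, I don't think this accounts for the huge difference in run time, although it does explain why the recursive version is not slower than the iterative one. Check the generated assembly to see what the compiler actually does.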
